Commit graph

76 commits

SHA1 Message Date
87ec30a375 Explicit constant visibility
Also partially revert change to encoder determination
2020-10-24 22:53:12 -04:00
c234702cce Speed up encoding; make ISO 2022-JP more consistent
- The ISO 2022-JP encoder is now static as with all others; this is
  slightly slower, but localises the encoder logic to its class
- Indexed encoders now cache pointer tables on first use, yielding
  significant performance benefits
- Encoding multiple characters now uses fewer function calls, yielding
  moderate performance benefits at the expense of slight complication
2020-10-19 23:12:45 -04:00
be2134cc71 API re-organization 2020-10-18 15:32:49 -04:00
808b4128dd Tests for replacement encoding; readme correction 2020-10-17 13:52:32 -04:00
ffa3f431d6 Coverage fixes 2020-10-16 20:35:27 -04:00
d580e93e52 ISO 2022-JP encoder tests and fixes 2020-10-16 20:13:53 -04:00
10328b6806 Tests for general encoder 2020-10-15 16:19:57 -04:00
db738bba99 Encoder for x-user-defined 2020-10-15 12:34:03 -04:00
a57dde6dbd Style fixes 2020-10-15 10:39:44 -04:00
cdd1c0182b Corrected ISO 2022-JP decoder and seeker 2020-10-14 12:29:19 -04:00
86c2b0d628 Fix coverage 2020-10-11 18:36:49 -04:00
53b27d1a55 Correct buggy Shift_JIS tests 2020-10-09 09:47:28 -04:00
96846d061c Complete Shift_JIS testing 2020-10-08 19:22:32 -04:00
d45e0be7c3 Typo 2020-10-08 17:18:48 -04:00
915aa7ca93 Finally fix Shift_JIS seeker 2020-10-07 22:48:50 -04:00
d9b8cd8dd1 Fixes for multi-byte index-based encoders
- array_flip() retains the last duplicate, when we need the first
- Indexes are now prepared with a list of first-duplicate code points
  to search before flipping
- This affected only U+3000 in GBK
- Big5 did not use array_flip(), but its list of override code points
  did not include U+2561; Big5 now flips like the others
- EUC-JP had a long list of errors, but this encoding was not
  previously released
- Shift_JIS' indexes are probably still not correct
2020-10-07 11:28:21 -04:00
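
The duplicate-pointer problem described in the commit above can be sketched as follows. This is a Python analogue, not the project's PHP code: inverting a dict keeps the last duplicate, much as PHP's array_flip() does. The index values and the names flip_last, first_duplicates and pointer_for are hypothetical and chosen only for illustration.

```python
# Hypothetical pointer -> code point index in which one code point occurs twice.
# The values are illustrative only and are not taken from any real encoding index.
INDEX = {0: 0x3000, 1: 0x4E00, 2: 0x3000}


def flip_last(index):
    """Invert the mapping; a later duplicate overwrites an earlier one."""
    return {code_point: pointer for pointer, code_point in index.items()}


def first_duplicates(index):
    """Map each duplicated code point to the first pointer that yields it."""
    first = {}
    dupes = {}
    for pointer, code_point in index.items():
        if code_point in first:
            dupes[code_point] = first[code_point]
        else:
            first[code_point] = pointer
    return dupes


def pointer_for(code_point, overrides, flipped):
    """Search the first-duplicate overrides before the flipped table."""
    if code_point in overrides:
        return overrides[code_point]
    return flipped.get(code_point)


FLIPPED = flip_last(INDEX)
OVERRIDES = first_duplicates(INDEX)
```

With this toy index, the plain flip alone would map 0x3000 to pointer 2, whereas looking in the override list first returns pointer 0, the first occurrence.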
9e812ffdf8 Second stab at Shift_JIS
- Decoder implemented, with correct table
- Modernized decoder; may have bugs
- Backwards seeker hopefully works, though it does not yet pass the fuzzer
2020-10-06 16:12:57 -04:00
b284056644 Encode correct duplicate pointers in EUC-JP 2020-10-06 15:39:33 -04:00
46b6ac3c44 Complete and correct EUC-JP implementation 2020-10-06 11:47:22 -04:00
f7246ccc34 Fix gb18030 seeking; tidy up 2020-10-06 11:42:32 -04:00
14d67ad49f Add fuzz test for backwards seeking
Test data is 1025 random bytes; gb18030 still fails
2020-10-06 11:42:32 -04:00
6417e8f0be Start overhauling error handling; adjust coverage annotations 2020-10-06 11:42:32 -04:00
befd1feb3a Apply stricter house style where possible 2020-10-06 11:42:30 -04:00
85f06186f2 Partial Shift_JIS implementation 2020-01-07 11:43:45 -05:00
f49d632642 Merge branch 'master' into multi-byte 2019-12-28 23:31:10 -05:00
c4a2ae1714 Tests for new features 2019-12-20 20:56:59 -05:00
200a310f72 Optionally allow surrogates
Also removed unnecessary docblocks
2019-12-18 14:57:54 -05:00
2e47fde774 Upgrade to PHPUnit 8 2019-12-13 11:05:01 -05:00
eae901a9e2 Add new methods 2019-12-13 11:00:25 -05:00
74d8e07a65 Fully corrected WPT test data for EUC-JP 2018-09-18 09:08:51 -04:00
8dfb1ba984 Initial implementation of EUC-JP 2018-09-17 19:33:46 -04:00
2810ed9b2a Full tests for EUC-KR 2018-09-15 19:46:42 -04:00
1121f32e96 Minor Big5 corrections 2018-09-15 13:53:21 -04:00
c4cdbdd5c8 Initial implementation of EUC-KR 2018-09-15 13:30:30 -04:00
c2a8b1ba52 Style fixes 2018-09-15 11:39:48 -04:00
bfc6c677c5 Complete Big5 tests, with numerous fixes 2018-09-15 11:38:51 -04:00
5217a6c0bc Tidying 2018-09-15 09:10:36 -04:00
32d7fc47b0 Fix HTML test generator; clean up 2018-09-05 11:44:54 -04:00
55cbc915c3 Refactor HTML-based test generators 2018-09-05 09:46:11 -04:00
63fccc3c3a Test UTF-16 EOF handling better 2018-09-04 09:41:05 -04:00
4a091610e9 Initial implementation of Big5 encoding
Only the decoder is tested, and even that requires more thorough testing.

Testing of seeking and encoding still to come
2018-08-30 23:27:29 -04:00
d5327a3b83 Implement x-user-defined decoder
Also further refactored tests to better account for one-way encodings
2018-08-30 12:26:50 -04:00
dd9bed2e84 Implement UTF-16 2018-08-30 11:06:15 -04:00
e683167905 Style fixes
Because of the large arrays in the GBCommon class and its test suite,
memory limits had to be disabled in php-cs-fixer
2018-08-29 23:47:42 -04:00
1449fae908 Refactor UTF-8 seeking 2018-08-29 23:39:56 -04:00
e4b6acb24a Refactor tests 2018-08-29 23:32:36 -04:00
4c686aa8a1 Complete battery of tests for gb18030 2018-08-29 17:16:16 -04:00
1b9889914a Fix numerous bugs with gb18030 2018-08-29 15:58:53 -04:00
467c565e8c Implement gb18030 seeking
Also fix some bugs in EOF handling
2018-08-28 15:31:51 -04:00
40d0054bd1 Implement gb18030 and GBK encoders 2018-08-28 11:48:25 -04:00