87ec30a375
Explicit constant visibility
...
Also partially revert change to encoder determination
2020-10-24 22:53:12 -04:00
c234702cce
Speed up encoding; make ISO 2022-JP more consistent
...
- The ISO 2022-JP encoder is now static, as with all others; this is
slightly slower, but localises the encoder logic to its class
- Indexed encoders now cache pointer tables on first use, yielding
significant performance benefits
- Encoding multiple characters now uses fewer function calls, yielding
moderate performance benefits at the expense of slight complication
2020-10-19 23:12:45 -04:00
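
The "cache pointer tables on first use" change in the commit above can be pictured with a minimal sketch. The class name, the toy index and the `$builds` counter are hypothetical, introduced only for illustration; this is not the library's actual code.

```php
<?php
// Hypothetical encoder class; all names here are illustrative assumptions.
final class ToyEncoder {
    // Toy index mapping pointer => code point.
    private const INDEX = [0 => 0x3042, 1 => 0x3044];

    /** @var array|null Flipped table (code point => pointer), built lazily. */
    private static $pointers = null;

    /** @var int Counts how often the table is built (for demonstration only). */
    public static $builds = 0;

    public static function pointerFor(int $codePoint): ?int {
        // Build the reverse table only on the first call, then reuse it.
        if (self::$pointers === null) {
            self::$pointers = array_flip(self::INDEX);
            self::$builds++;
        }
        return self::$pointers[$codePoint] ?? null;
    }
}

$a = ToyEncoder::pointerFor(0x3044);
$b = ToyEncoder::pointerFor(0x3042);
$c = ToyEncoder::pointerFor(0x41); // not in the toy index
```

Repeated calls reuse the same static table, so the cost of flipping the index is paid once per process rather than once per call.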
be2134cc71
API re-organization
2020-10-18 15:32:49 -04:00
808b4128dd
Tests for replacement encoding; readme correction
2020-10-17 13:52:32 -04:00
ffa3f431d6
Coverage fixes
2020-10-16 20:35:27 -04:00
d580e93e52
ISO 2022-JP encoder tests and fixes
2020-10-16 20:13:53 -04:00
10328b6806
Tests for general encoder
2020-10-15 16:19:57 -04:00
db738bba99
Encoder for x-user-defined
2020-10-15 12:34:03 -04:00
a57dde6dbd
Style fixes
2020-10-15 10:39:44 -04:00
cdd1c0182b
Corrected ISO 2022-JP decoder and seeker
2020-10-14 12:29:19 -04:00
86c2b0d628
Fix coverage
2020-10-11 18:36:49 -04:00
53b27d1a55
Correct buggy Shift_JIS tests
2020-10-09 09:47:28 -04:00
96846d061c
Complete Shift_JIS testing
2020-10-08 19:22:32 -04:00
d45e0be7c3
Typo
2020-10-08 17:18:48 -04:00
915aa7ca93
Finally fix Shift_JIS seeker
2020-10-07 22:48:50 -04:00
d9b8cd8dd1
Fixes for multi-byte index-based encoders
...
- array_flip() retains the last duplicate, when we need the first
- Indexes are now prepared with a list of first-duplicate code points
to search before flipping
- This affected only U+3000 in GBK
- Big5 did not use array_flip(), but its list of override code points
did not include U+2561; Big5 now flips like the others
- EUC-JP had a long list of errors, but this encoding was not
previously released
- Shift_JIS indexes are probably still incorrect
2020-10-07 11:28:21 -04:00
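
The array_flip() issue described in the commit above can be reproduced with a small sketch. The toy index and the override list are hypothetical; the real indexes are far larger, and the library's actual preparation step may differ from this.

```php
<?php
// Hypothetical toy index: pointer => code point.
// Two different pointers map to U+3000.
$index = [
    10 => 0x3000,
    20 => 0x4E00,
    30 => 0x3000, // duplicate code point at a later pointer
];

// A plain flip keeps the LAST pointer seen for a duplicated code point.
$naive = array_flip($index);

// Assumed fix: for a known list of duplicated code points, look up the
// FIRST pointer (array_search() returns the first matching key) and
// override the flipped entry with it.
$firstDuplicates = [0x3000];
$fixed = array_flip($index);
foreach ($firstDuplicates as $codePoint) {
    $fixed[$codePoint] = array_search($codePoint, $index, true);
}
```

Here `$naive[0x3000]` is 30, while `$fixed[0x3000]` is 10, the first pointer, which is what the encoder needs.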
9e812ffdf8
Second stab at Shift_JIS
...
- Decoder implemented, with correct table
- Modernized decoder; may have bugs
- Backwards seeker tentatively implemented, though it does not yet pass the fuzzer
2020-10-06 16:12:57 -04:00
b284056644
Encode correct duplicate pointers in EUC-JP
2020-10-06 15:39:33 -04:00
46b6ac3c44
Complete and correct EUC-JP implementation
2020-10-06 11:47:22 -04:00
f7246ccc34
Fix gb18030 seeking; tidy up
2020-10-06 11:42:32 -04:00
14d67ad49f
Add fuzz test for backwards seeking
...
Test data is 1025 random bytes; gb18030 still fails
2020-10-06 11:42:32 -04:00
6417e8f0be
Start overhauling error handling; adjust coverage annotations
2020-10-06 11:42:32 -04:00
befd1feb3a
Apply stricter house style where possible
2020-10-06 11:42:30 -04:00
85f06186f2
Partial Shift_JIS implementation
2020-01-07 11:43:45 -05:00
f49d632642
Merge branch 'master' into multi-byte
2019-12-28 23:31:10 -05:00
c4a2ae1714
Tests for new features
2019-12-20 20:56:59 -05:00
200a310f72
Optionally allow surrogates
...
Also removed unnecessary docblocks
2019-12-18 14:57:54 -05:00
2e47fde774
Upgrade to PHPUnit 8
2019-12-13 11:05:01 -05:00
eae901a9e2
Add new methods
2019-12-13 11:00:25 -05:00
74d8e07a65
Fully corrected WPT test data for EUC-JP
2018-09-18 09:08:51 -04:00
8dfb1ba984
Initial implementation of EUC-JP
2018-09-17 19:33:46 -04:00
2810ed9b2a
Full tests for EUC-KR
2018-09-15 19:46:42 -04:00
1121f32e96
Minor Big5 corrections
2018-09-15 13:53:21 -04:00
c4cdbdd5c8
Initial implementation of EUC-KR
2018-09-15 13:30:30 -04:00
c2a8b1ba52
Style fixes
2018-09-15 11:39:48 -04:00
bfc6c677c5
Complete Big5 tests, with numerous fixes
2018-09-15 11:38:51 -04:00
5217a6c0bc
Tidying
2018-09-15 09:10:36 -04:00
32d7fc47b0
Fix HTML test generator; clean up
2018-09-05 11:44:54 -04:00
55cbc915c3
Refactor HTML-based test generators
2018-09-05 09:46:11 -04:00
63fccc3c3a
Test UTF-16 EOF handling better
2018-09-04 09:41:05 -04:00
4a091610e9
Initial implementation of Big5 encoding
...
Only the decoder is tested, and even that requires more thorough testing.
Testing of seeking and encoding still to come
2018-08-30 23:27:29 -04:00
d5327a3b83
Implement x-user-defined decoder
...
Also further refactored tests to better account for one-way encodings
2018-08-30 12:26:50 -04:00
dd9bed2e84
Implement UTF-16
2018-08-30 11:06:15 -04:00
e683167905
Style fixes
...
Because of the large arrays in the GBCommon class and its test suite,
memory limits had to be disabled in php-cs-fixer
2018-08-29 23:47:42 -04:00
1449fae908
Refactor UTF-8 seeking
2018-08-29 23:39:56 -04:00
e4b6acb24a
Refactor tests
2018-08-29 23:32:36 -04:00
4c686aa8a1
Complete battery of tests for gb18030
2018-08-29 17:16:16 -04:00
1b9889914a
Fix numerous bugs with gb18030
2018-08-29 15:58:53 -04:00
467c565e8c
Implement gb18030 seeking
...
Also fix some bugs in EOF handling
2018-08-28 15:31:51 -04:00
40d0054bd1
Implement gb18030 and GBK encoders
2018-08-28 11:48:25 -04:00