Commit graph

21 commits

Author SHA1 Message Date
87ec30a375 Explicit constant visibility
Also partially revert change to encoder determination
2020-10-24 22:53:12 -04:00
600379a4dd Fill out API documentation 2020-10-24 14:24:23 -04:00
c234702cce Speed up encoding; make ISO 2022-JP more consistent
- The ISO 2022-JP encoder is now static as with all others; this is
slightly slower, but localises the encoder logic to its class
- Indexed encoders now cache pointer tables on first use, yielding
significant performance benefits
- Encoding multiple characters now uses fewer function calls, yielding
moderate performance benefits at the expense of slight complication
2020-10-19 23:12:45 -04:00
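
The "cache pointer tables on first use" change in the commit above can be illustrated with a minimal sketch. The class name, the two-entry index, and the `$builds` counter are hypothetical and chosen only for illustration; the project's actual encoder classes are not shown in this log and may be organised differently.

```php
<?php
// Hypothetical encoder: builds its code point => pointer table lazily,
// the first time a character is encoded, then reuses it on every later call.
final class ExampleIndexedEncoder {
    // Illustrative index (pointer => code point); not real encoding data.
    private const INDEX = [0 => 0x4E00, 1 => 0x4E01];

    // Cached reverse table; null until first use.
    private static $pointers = null;

    // Counts how many times the table was built (for demonstration only).
    public static $builds = 0;

    public static function pointerFor(int $codePoint): ?int {
        if (self::$pointers === null) {
            // The costly step happens once per process, not once per call.
            self::$pointers = array_flip(self::INDEX);
            self::$builds++;
        }
        return self::$pointers[$codePoint] ?? null;
    }
}

$first  = ExampleIndexedEncoder::pointerFor(0x4E01);
$second = ExampleIndexedEncoder::pointerFor(0x4E00);
$miss   = ExampleIndexedEncoder::pointerFor(0x1234);
```

The trade-off is a slower first call in exchange for plain array lookups afterwards.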
be2134cc71 API re-organization 2020-10-18 15:32:49 -04:00
d9b8cd8dd1 Fixes for multi-byte index-based encoders
- array_flip() retains the last duplicate, when we need the first
- Indexes are now prepared with a list of first-duplicate code points
to search before flipping
- This affected only U+3000 in GBK
- Big5 did not use array_flip(), but its list of override code points
did not include U+2561; Big5 now flips like the others
- EUC-JP had a long list of errors, but this encoding was not
previously released
- Shift_JIS' indexes are probably not correct, still
2020-10-07 11:28:21 -04:00
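
The array_flip() behaviour described in the commit above can be reproduced with a small sketch. The pointer values below are made up for illustration and are not the real GBK index. The helper `flipKeepingFirst()` is a hypothetical stand-in: the commit itself describes preparing a list of first-duplicate code points before flipping, which is a different mechanism with the same goal.

```php
<?php
// Hypothetical index (pointer => code point) in which U+3000 appears twice.
$index = [
    0 => 0x3000,
    1 => 0x3001,
    2 => 0x3000, // duplicate of pointer 0
];

// array_flip() keeps the LAST key for a duplicated value.
$naive = array_flip($index);

// Stand-in fix: keep the FIRST pointer seen for each code point.
function flipKeepingFirst(array $index): array {
    $out = [];
    foreach ($index as $pointer => $codePoint) {
        if (!array_key_exists($codePoint, $out)) {
            $out[$codePoint] = $pointer;
        }
    }
    return $out;
}

$fixed = flipKeepingFirst($index);
// $naive[0x3000] is 2 (last duplicate); $fixed[0x3000] is 0 (first duplicate).
```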
f7246ccc34 Fix gb18030 seeking; tidy up 2020-10-06 11:42:32 -04:00
0eb2a8ac24 Fix bugs in gb18030 and UTF-16
- UTF-16 needs to restore dirtyEOF after seeking
- gb18030 now tracks errors like other non-synchronizing encodings
- gb18030 could produce null when asked for a character
2020-10-06 11:42:32 -04:00
61a77086bb Make GenericEncoding trait an abstract class 2020-10-06 11:42:32 -04:00
f69cd98b4c Make posErr fully generic 2020-10-06 11:42:32 -04:00
7339176e3e Split error handlers 2020-10-06 11:42:32 -04:00
fc44bb1415 Generalize handling of dirty EOF 2019-12-23 15:36:38 -05:00
200a310f72 Optionally allow surrogates
Also removed unnecessary docblocks
2019-12-18 14:57:54 -05:00
fb70543c0f Change gb18030 loop to be consistent with Big5 and EUC-KR 2018-09-15 14:02:17 -04:00
4a091610e9 Initial implementation of Big5 encoding
Only the decoder is tested, and even that requires more thorough testing.

Testing of seeking and encoding still to come
2018-08-30 23:27:29 -04:00
a0bf8a9b05 Don't check for dirty EOF on every iteration 2018-08-30 08:55:33 -04:00
e683167905 Style fixes
Because of the large arrays in the GBCommon class and its test suite,
memory limits had to be disabled in php-cs-fixer
2018-08-29 23:47:42 -04:00
4c686aa8a1 Complete battery of tests for gb18030 2018-08-29 17:16:16 -04:00
1b9889914a Fix numerous bugs with gb18030 2018-08-29 15:58:53 -04:00
467c565e8c Implement gb18030 seeking
Also fix some bugs in EOF handling
2018-08-28 15:31:51 -04:00
40d0054bd1 Implement gb18030 and GBK encoders 2018-08-28 11:48:25 -04:00
766643aa37 Common infrastructure for gb18030 and GBK 2018-08-28 08:37:32 -04:00