If byte is not in the range 0x30 to 0x39, inclusive, then:
Prepend gb18030 second, gb18030 third, and byte to stream.
Set gb18030 first, gb18030 second, and gb18030 third to 0x00.
Return error.
Let code point be the index gb18030 ranges code point for ((gb18030 first − 0x81) × (10 × 126 × 10)) + ((gb18030 second − 0x30) × (10 × 126)) + ((gb18030 third − 0x81) × 10) + byte − 0x30.
If code point is null, return error.
Return a code point whose value is code point.
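To make the pointer arithmetic in the steps above concrete, here is a small sketch that evaluates the formula for the four bytes 84 31 83 30 used in the example below. The function name `gb18030_four_byte_pointer` is made up for illustration; it only computes the pointer and does not perform the index gb18030 ranges lookup.

```python
def gb18030_four_byte_pointer(first: int, second: int, third: int, byte: int) -> int:
    """Evaluate the pointer formula quoted in the steps above.

    The result is the value handed to the index gb18030 ranges lookup;
    the lookup itself is not modelled here.
    """
    return (
        (first - 0x81) * (10 * 126 * 10)
        + (second - 0x30) * (10 * 126)
        + (third - 0x81) * 10
        + byte
        - 0x30
    )


# 3 * 12600 + 1 * 1260 + 2 * 10 + 0
print(gb18030_four_byte_pointer(0x84, 0x31, 0x83, 0x30))  # 39080
```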
I'm having trouble understanding how, after the last step above, the decoder will accept the next byte correctly. Because gb18030 first, gb18030 second, and gb18030 third are not reset to 0x00 after this last step, the decoder seems to enter the wrong steps for subsequent bytes.
For example, decoding the hex byte sequence 20 81 40 84 31 83 30 results in 丂︔� (an error at the end), but the expected result is 丂︔.
I think a "set gb18030 first, gb18030 second, and gb18030 third to 0x00" step is missing before returning error or a code point?
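The behaviour described above can be reproduced with a deliberately simplified model of the handler. This is only a sketch under stated assumptions, not the spec's algorithm: it covers just the four-byte path, omits the ASCII and two-byte branches and the "prepend to stream" step, and returns the computed pointer instead of looking it up in index gb18030 ranges. The class `FourByteModel`, the `reset_after_four_bytes` flag, and the `run` helper are hypothetical names.

```python
END_OF_STREAM = None  # stand-in for the spec's end-of-stream marker


class FourByteModel:
    """Simplified model of the four-byte path only (assumption, not the full decoder)."""

    def __init__(self, reset_after_four_bytes: bool):
        self.first = self.second = self.third = 0x00
        self.reset_after_four_bytes = reset_after_four_bytes

    def _clear(self):
        self.first = self.second = self.third = 0x00

    def handler(self, byte):
        if byte is END_OF_STREAM:
            if self.first == self.second == self.third == 0x00:
                return "finished"
            self._clear()
            return "error"
        if self.third != 0x00:
            if not 0x30 <= byte <= 0x39:
                self._clear()
                return "error"
            pointer = (
                (self.first - 0x81) * (10 * 126 * 10)
                + (self.second - 0x30) * (10 * 126)
                + (self.third - 0x81) * 10
                + byte
                - 0x30
            )
            if self.reset_after_four_bytes:
                self._clear()  # the step the issue suggests is missing
            return pointer
        if self.second != 0x00:
            if 0x81 <= byte <= 0xFE:
                self.third = byte
                return "continue"
            self._clear()
            return "error"
        if self.first != 0x00:
            if 0x30 <= byte <= 0x39:
                self.second = byte
                return "continue"
            self._clear()  # two-byte path not modelled
            return "error"
        if 0x81 <= byte <= 0xFE:
            self.first = byte
            return "continue"
        return "error"  # ASCII and other lead bytes not modelled


def run(data, reset: bool):
    """Feed the bytes, then end-of-stream once; collect every non-"continue" result."""
    model = FourByteModel(reset)
    results = []
    for item in list(data) + [END_OF_STREAM]:
        result = model.handler(item)
        if result != "continue":
            results.append(result)
    return results


print(run([0x84, 0x31, 0x83, 0x30], reset=False))  # [39080, 'error']
print(run([0x84, 0x31, 0x83, 0x30], reset=True))   # [39080, 'finished']
```

Without the reset, the stale state variables make the end-of-stream check report an error even though the four-byte sequence was complete. With the reset, the model finishes cleanly.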
Thank you! This was introduced in #111. @hsivonen should I go back to my original proposed wording or duplicate the setting to 0x00 of these state variables?
(An alternative that avoids duplication would be some kind of "return and unset" routine, but that doesn't really fit in the current setup.)
@chfoo forgot to ask, ok to acknowledge you as Christopher Foo? Or would you prefer a different name? (Also, if you'd like, you'd be welcome to submit a PR to fix this after @hsivonen confirms an approach.)
https://encoding.spec.whatwg.org/commit-snapshots/b04091a5f079a7bdcab5aa8c7adead554326a96c/#gb18030-decoder