Big5 encoding/decoding support #58

Closed
r12a opened this issue Jun 20, 2016 · 7 comments
Comments

@r12a
Collaborator

r12a commented Jun 20, 2016

Results for a series of tests for Big5 encoding/decoding can be found at
https://www.w3.org/International/tests/repo/results/encoding-dbl-byte.en#big5

The tests can be run from that page (select the link in the left-most column) or obtained from the WPT repo. There is a PR at
web-platform-tests/wpt#3197

The tests check whether:

  1. the browser produces the expected byte sequences for all characters in the big5 encoding after 0x9F when encoding bytes for a URL produced by a form, using the encoder steps in the specification.
  2. the browser produces percent-escaped character references for a URL produced by a form when encoding miscellaneous characters that are not in the big5 encoding. (tests for several ranges)
  3. the same two checks hold when writing characters to an href value.
  4. the browser decodes all characters as expected from a file generated by encoding all pointers in the big5 encoding per the encoder steps in the specification.
  5. the browser decodes all characters as expected from a file generated by encoding all pointers less than 5024 in the big5 encoding per the encoder steps in the specification.
  6. the browser decodes characters that are not recognised from the big5 encoding as replacement characters.
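
For anyone investigating individual failures, here is a minimal sketch of how a Big5 pointer maps to a two-byte sequence and back, based on my reading of the encoder and decoder steps in the Encoding specification. The function and constant names are hypothetical, and the sketch leaves out the index lookup (pointer to code point) entirely. It also shows where the 5024 boundary in item 5 comes from: as I understand it, the encoder ignores index entries whose pointer is below (0xA1 - 0x81) × 157.

```python
from typing import Optional, Tuple

# Each lead byte covers 157 trail-byte slots (0x40-0x7E and 0xA1-0xFE).
TRAILS_PER_LEAD = 157

# Assumption from the spec's encoder steps: index entries with a pointer
# below this value are skipped when encoding.
ENCODER_MIN_POINTER = (0xA1 - 0x81) * TRAILS_PER_LEAD  # 5024


def pointer_to_bytes(pointer: int) -> Tuple[int, int]:
    """Map a Big5 index pointer to its (lead, trail) byte pair."""
    lead = pointer // TRAILS_PER_LEAD + 0x81
    trail = pointer % TRAILS_PER_LEAD
    offset = 0x40 if trail < 0x3F else 0x62
    return lead, trail + offset


def bytes_to_pointer(lead: int, byte: int) -> Optional[int]:
    """Map a (lead, trail) byte pair back to a pointer, or None if invalid."""
    if not 0x81 <= lead <= 0xFE:
        return None
    if not (0x40 <= byte <= 0x7E or 0xA1 <= byte <= 0xFE):
        return None
    offset = 0x40 if byte < 0x7F else 0x62
    return (lead - 0x81) * TRAILS_PER_LEAD + (byte - offset)
```

With these helpers, `pointer_to_bytes(ENCODER_MIN_POINTER)` gives the first byte pair the encoder can produce, which is a convenient starting point when comparing a browser's output against the expected byte sequences.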

The following summarises the current situation according to my testing, for major desktop browsers. (I will be adding nightly results and perhaps other browsers in time.) The table lists the number of characters that were NOT successfully converted by the test.

[Screenshot of the results summary table, taken 2016-06-20]

Notes:

  • Edge fails all href encode tests because characters are not converted to percent-escapes in the href attribute.
  • Firefox fails all href encode tests for characters not in the encoding because it converts characters to percent-escaped Unicode values instead.
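
To make the two outcomes in these notes concrete, the sketch below contrasts what the tests expect for a character outside the encoding with one plausible reading of the Firefox behaviour. The expected form is a decimal character reference (`&#N;`) that is then percent-escaped; treating "percent-escaped Unicode values" as percent-escaped UTF-8 bytes is my assumption, and both function names are hypothetical.

```python
from urllib.parse import quote


def expected_escape(ch: str) -> str:
    """Percent-escaped decimal character reference, e.g. %26%23N%3B."""
    return quote(f"&#{ord(ch)};", safe="")


def utf8_percent_escape(ch: str) -> str:
    """Assumed alternative behaviour: percent-escape the UTF-8 bytes."""
    return quote(ch, safe="")


# U+0E01 (THAI CHARACTER KO KAI) is used here as a character
# that Big5 does not cover.
sample = "\u0e01"
print(expected_escape(sample))      # %26%233585%3B
print(utf8_percent_escape(sample))  # %E0%B8%81
```

Comparing the two strings for a failing character should show quickly which of the behaviours a given browser is exhibiting.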

Can we please investigate the failures to ascertain whether:

  1. the browser needs to be changed
  2. the spec needs to be changed
  3. the test is at fault

The following tool may be helpful for investigating issues. It converts between byte sequences and characters for all encodings in the Encoding spec. http://r12a.github.io/apps/encodings/

@r12a
Collaborator Author

r12a commented Sep 15, 2016

@jungshik

Again, Chromium's form(misc) failures have the same root cause as #62, #59, and #61. See
https://bugs.chromium.org/p/chromium/issues/detail?id=647568

@r12a
Copy link
Collaborator Author

r12a commented Jun 15, 2017

Today and yesterday I updated the results at https://www.w3.org/International/tests/repo/results/encoding-dbl-byte.en#big5 for Firefox, FNightly, Chrome, and Canary. The latest summary is:

[Screenshot of the updated results summary table, taken 2017-06-15]

@hsivonen
Member

Thank you. The Big5 tests LGTM for merging into WPT. /cc @domenic

@domenic
Member

domenic commented Jun 15, 2017

Let's close this as web-platform-tests/wpt#6254 is ready to merge.

@domenic domenic closed this as completed Jun 15, 2017
@domenic
Member

domenic commented Jun 16, 2017

Reopening per #61 (comment)

@domenic domenic reopened this Jun 16, 2017
@annevk
Member

annevk commented Oct 17, 2018

Now that Firefox passes all these tests and a year has passed, I'm happy to consider this done. A new issue would also be less noisy at this point, were one warranted.

@annevk annevk closed this as completed Oct 17, 2018