Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add pct encoder #147

Merged
merged 8 commits into from
Oct 8, 2020
Merged

Add pct encoder #147

merged 8 commits into from
Oct 8, 2020

Conversation

orbitz
Copy link

@orbitz orbitz commented Jul 30, 2020

@avsm So this implements the customization we talked about.

I think it works well for to_string but I feel it clutters up the individual string accessors. It also elucidates that not all component accessors perform pct encoding, I'm not sure if that is deliberate or not.

What do you think?

anmonteiro and others added 8 commits December 22, 2019 15:00
This diff reworks the current parser based on regexes in favor of one
that uses Angstrom.

It also removes the dependency on `ocaml-re`, moving the regular
expressions to a new package `uri-re`.

This happens to fix mirage#21.
This is useful for reusing the pct encoding function but with custom
character sets.  For example, RFC 6570 uses some non-standard pct
encodings in some template expansions.
@orbitz orbitz force-pushed the add-pct-encoder branch 2 times, most recently from 99665bc to 750c3f1 Compare August 5, 2020 11:20
@orbitz
Copy link
Author

orbitz commented Aug 5, 2020

@avsm I got this to pass tests. I don't know if what I did is correct, and I did not clean up all existing work (specifically there seems to be some Uri_re files left. I don't know enough about @anmonteiro change to fix those, but if someone wants to direct me I'm happy to look at it.

But, this is at least mergable in the "build is green" sense of the word.

@anmonteiro
Copy link
Contributor

anmonteiro commented Aug 5, 2020

I don't think deleting the legacy tests is accurate. I think the problem here is that the tests for the uri package depend on the uri-re package, which is not a dependency that we want to publish, and CI doesn't find them for some reason.

@orbitz
Copy link
Author

orbitz commented Aug 5, 2020

I deleted the legacy test because the Uri_legacy model couldn't be found on testing. Missing something else?

Is it correct that the legacy tests are verifying that this new implementation matches the old implementation? If so, is that necessary?

@anmonteiro
Copy link
Contributor

I deleted the legacy test because the Uri_legacy model couldn't be found on testing. Missing something else?

This is what I was trying to express above:

the tests for the uri package depend on the uri-re package

the uri-re package is where the legacy implementation now lives.

@orbitz
Copy link
Author

orbitz commented Aug 5, 2020

Ah ok. Is that something you want to fix in your branch and I can rebase my changes on it? Or do you can just tell me what to do to fix it.

@anmonteiro
Copy link
Contributor

I'm not sure how to fix it. I think it's a problem of how CI runs the tests, something along the lines of opam install -t .... I'm not an OPAM user so I'd appreciate more guidance here.

@orbitz
Copy link
Author

orbitz commented Aug 5, 2020

@avsm can you lend us any insight?

@tmcgilchrist
Copy link
Member

This looks like some internal timeout in the build infrastructure, not sure @avsm

#1 sha256:a1d94883c64a2cbdf64f0985d646ac381e7827f60258d78cb25670fd48ca96d8
#1 ...

#2 [internal] load .dockerignore
#2 sha256:5828acd1839bccbb9930d2cf0923fd4e7a3d81fd915f7aa74a63bd2dd028897f
#2 ERROR: no active session for jzb7zv9j8b3tuubj4a8siukmx: context deadline exceeded

#1 [internal] load build definition from Dockerfile
#1 sha256:a1d94883c64a2cbdf64f0985d646ac381e7827f60258d78cb25670fd48ca96d8
#1 ERROR: no active session for jzb7zv9j8b3tuubj4a8siukmx: context deadline exceeded
------
 > [internal] load .dockerignore:
------
------
 > [internal] load build definition from Dockerfile:
------
failed to solve with frontend dockerfile.v0: failed to resolve dockerfile: failed to build LLB: no active session for jzb7zv9j8b3tuubj4a8siukmx: context deadline exceeded
docker-build failed with exit-code 1
2020-08-28 12:59.56: Job failed: Failed: Build failed```

@tmcgilchrist
Copy link
Member

Rebuild the failing parts and they pass now. Strongly suspect there was a CI timeout environment issue.

@orbitz
Copy link
Author

orbitz commented Sep 1, 2020

@tmcgilchrist Nice! Thank you. @avsm How's this change looking to you?

@orbitz
Copy link
Author

orbitz commented Sep 27, 2020

@avsm @tmcgilchrist @anmonteiro One thing I noticed is the new parser means uri does not work in jsoo. Is this desired?

@anmonteiro
Copy link
Contributor

@orbitz Could you clarify why that is the case? The only change that I could imagine would cause that is the switch to Angstrom, though that has JSOO support (see inhabitedtype/angstrom#187)

@orbitz
Copy link
Author

orbitz commented Sep 28, 2020

@anmonteiro ahhhh you're right, I had some old dependencies. Upgraded and everything works.

Any idea how to get this version of uri released?

@dinosaure
Copy link
Member

So, this PR has what @anmonteiro did on #142 if I understand correctly? If it's the case, can we close #142?

Then, I can take my time to:

  • see if the PR does not change a lot all of the ecosystem
  • merge it
  • make a release

@orbitz
Copy link
Author

orbitz commented Sep 29, 2020

So it looks like I got bit by opam caching some results and it does not work. I get the following error:

bigstringaf_blit_from_bytes not implemented

I created the following topic to see if anyone has suggestions.

https://discuss.ocaml.org/t/bigstringaf-blit-from-bytes-not-implemented/6524

@dinosaure
Copy link
Member

I tried your PR (with opam pin) and did this little executable:

let v = Uri.of_string "http://localhost/" in
Format.printf "%a\n%!" Uri.pp v

Which is compiled with js_of_ocaml:

(executable
 (name main)
 (modes js)
 (libraries uri))

It seems that all works for me when I do:

$ nodejs _build/default/main.bs.js
https://localhost/

My version is angstrom.0.14.1 with bistringaf.0.6.1.

@orbitz
Copy link
Author

orbitz commented Oct 1, 2020

All is good. With some help from @dinosaure I figured out the issue was on my end based on how I'm compiling my program.

@dinosaure for your original points above: the interface to Uri should not have changed so I don't think releasing this will impact the ecosystem.

@dinosaure
Copy link
Member

@dinosaure for your original points above: the interface to Uri should not have changed so I don't think releasing this will impact the ecosystem.

Yes it's mostly about behavior when uri moved to angstrom - and we should take care about some details according to MirageOS (but from what I know, all should be good). I tested your branch on my universe and my unikernels and I did not get any weird behaviors (however, it's not a truly check). For my perspective, it's ready to merge, however I would like to have the opinion of @mirage/core before - so just waiting one week and we will be ready to make a release 👍.

@dinosaure
Copy link
Member

Let's start for a release 👍

@dinosaure dinosaure merged commit 9025a79 into mirage:master Oct 8, 2020
@avsm
Copy link
Member

avsm commented Oct 9, 2020

Only one way to find out :-) It's all good to me by inspection. This is not a gater to the release, but I'd like to compare the old and new implementation using memtrace to check that the new one is more constant-memory than the old regex based one.

dinosaure added a commit to dinosaure/opam-repository that referenced this pull request Oct 14, 2020
CHANGES:

* sexp: use the sexplib v0.13 ppx directives (@avsm, mirage/ocaml-uri#143).
* rework the URI parser with `angstrom` (@anmonteiro, review @avsm & @dinosaure, mirage/ocaml-uri#142).
* add simple fuzzer tests between `angstrom` parser and _legacy_ parser (with `re.posix`, mirage/ocaml-uri#142)
* add support of modifying pct encoding (with a custom one) (@orbitz, review @anmonteiro, @tmcgilchrist, @avsm & @dinosaure, mirage/ocaml-uri#147)
* allow the selection of generic set of safe characters (with `Generic`) (@madroach, review @dinosaure, mirage/ocaml-uri#141)
dinosaure added a commit to dinosaure/opam-repository that referenced this pull request Oct 14, 2020
CHANGES:

* sexp: use the sexplib v0.13 ppx directives (@avsm, mirage/ocaml-uri#143).
* rework the URI parser with `angstrom` (@anmonteiro, review @avsm & @dinosaure, mirage/ocaml-uri#142).
* add simple fuzzer tests between `angstrom` parser and _legacy_ parser (with `re.posix`, mirage/ocaml-uri#142)
* add support of modifying pct encoding (with a custom one) (@orbitz, review @anmonteiro, @tmcgilchrist, @avsm & @dinosaure, mirage/ocaml-uri#147)
* allow the selection of generic set of safe characters (with `Generic`) (@madroach, review @dinosaure, mirage/ocaml-uri#141)
avsm added a commit to avsm/opam-repository that referenced this pull request Apr 26, 2021
CHANGES:

* Do not mutate the base encoder when using custom percent encoders.
  This was a bug introduced in mirage/ocaml-uri#147. (mirage/ocaml-uri#156 @aantron)
* Disable Travis CI tests and switch Win/Mac tests to GitHub Actions and
  Linux ones to ocaml-ci (@avsm).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants