IO bottleneck: repository decompression #4586
Comments
I guess OBuilder could add a
Dev meeting: @AltGr has a patch which can disable the
The patch is here: OCamlPro@567cb24
ironic that these
Just one note:
this is only true if you don't have a cache: normally the untarring is only needed on
what would be the possible reasons for not having a cache?
Switching opam binary would invalidate it - aren't those opam roots made with opam 2 and you're then switching to 2.1?
mmh, it seems that However I'm a bit surprised that
The cache isn't out of date at that point?
How do I know if it's out of date? The sequence of operations I had was roughly:
[*]: does it update the cache or is it just untarring every time because it is in an "invalid" state?
@AltGr after trying again with a clean opamroot (
Notes from today: if we add a global option to allow this to be turned on or off, Windows can benefit, as it can be turned off by default there (the caching mechanism is much slower than the already slow 2.0 mechanism!), and the option would of course have a corresponding environment variable allowing it to be set in CI systems.
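To make the proposed precedence concrete, here is a minimal sketch in Python (not opam code) of how a global option with an environment override could be resolved. The variable name `EXAMPLE_REPO_TARRING` and the function `option_enabled` are hypothetical, invented for illustration; they are not actual opam names.

```python
import os


def option_enabled(cli_value, env, env_name="EXAMPLE_REPO_TARRING", default=True):
    """Resolve a boolean option: explicit CLI value, then environment, then default.

    `env_name` is a hypothetical variable name used only for this sketch.
    """
    # An explicit command-line choice always wins.
    if cli_value is not None:
        return cli_value
    # Otherwise fall back to the environment variable, if it is set.
    raw = env.get(env_name)
    if raw is None:
        return default
    return raw.strip().lower() in ("1", "true", "yes", "on")
```

A platform-dependent default, as suggested for Windows, could then be expressed as `option_enabled(None, os.environ, default=(os.name != "nt"))`, while a CI system would only need to export the variable.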
I was able to reproduce this behaviour quite easily with opam master (41f3684) in a docker container:
From there run
EDIT: Actually, the tar call appears a bit after the first archive retrievals, if that helps.
Cool! That's good timing for testing the PR :D
Ok I got it. So the issue is that we’re not storing the entirety of the repository inside of the state-cache, so every time opam installs something it first needs to pull the
https://github.com/ocaml/opam/blob/master/src/client/opamAction.ml#L441
To me there doesn’t seem to be any upside in keeping that around, and I would suggest reverting #3752, as opam-update is now slower than in opam 2.0:
For extra files, opam doesn't need to extract the repository when it looks for them; it is extracted when the repository state is loaded. Then, when looking for extra files, it checks for the files' existence in the repo (in the new tmp path for a tarred repo, and locally in
On the times, I got similar values on Debian: opam 2.0.10 < opam 2.2 with repo optim < opam 2.2 without repo optim. I haven't tested the revert times yet. @AltGr, do you remember, or did you note, your times when you did the optim?
When calling
opam install <pkg>
among the first processes opam calls are:
These two commands seem to be the main IO bottleneck in CI where
/tmp
is not tmpfs for technical reasons (cached containerization). I feel like we might see a good performance improvement by decompressing the repository directly in OCaml, probably in some kind of stream mode, if whatever needs the repository information can consume it this way.
Partially related to the later discussion in #3050
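As a rough illustration of the stream-mode idea (in Python rather than OCaml, and not a proposal for opam's actual implementation), the sketch below reads package definitions straight out of a compressed archive without writing anything to /tmp. The archive layout and the helper names `build_sample_archive` and `stream_opam_files` are assumptions made for this example.

```python
import io
import tarfile


def build_sample_archive() -> bytes:
    """Build a tiny in-memory .tar.gz standing in for a compressed repository.

    The layout (packages/<name>/<name>.<version>/opam) is assumed for the example.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, body in [
            ("packages/foo/foo.1.0/opam", b'opam-version: "2.0"\n'),
            ("packages/bar/bar.2.1/opam", b'opam-version: "2.0"\n'),
            ("repo", b'opam-version: "2.0"\n'),
        ]:
            info = tarfile.TarInfo(name)
            info.size = len(body)
            tar.addfile(info, io.BytesIO(body))
    return buf.getvalue()


def stream_opam_files(fileobj):
    """Yield (path, contents) for every file named 'opam', without extracting to disk.

    Mode "r|gz" reads the archive as a forward-only stream, so each member
    must be consumed before moving on to the next one.
    """
    with tarfile.open(fileobj=fileobj, mode="r|gz") as tar:
        for member in tar:
            if member.isfile() and member.name.endswith("/opam"):
                yield member.name, tar.extractfile(member).read()
```

For example, `dict(stream_opam_files(io.BytesIO(build_sample_archive())))` collects the two package files in a single pass. The trade-off is that a forward-only stream cannot seek, so a consumer that needs random access to the repository would still have to buffer or index what it reads.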