Skip to content

Speed up nearest_common_ancestor. #50106

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Apr 25, 2018

Conversation

nnethercote
Copy link
Contributor

nearest_common_ancestor can be made faster.

Here are all the benchmarks where one of the measurements improved by at least 1%.

clap-rs-check
	avg: -4.5%	min: -8.8%	max: -0.3%
clap-rs
	avg: -2.6%	min: -4.5%	max: 0.5%
script-servo
	avg: -1.7%	min: -3.6%	max: 0.0%
regression-31157
	avg: -1.5%	min: -2.6%	max: -0.4%
hyper
	avg: -1.2%	min: -2.5%	max: -0.0%
piston-image
	avg: -1.6%	min: -2.5%	max: 0.1%
regex
	avg: -1.2%	min: -2.2%	max: 0.0%
issue-46449
	avg: -1.8%	min: -2.1%	max: -0.7%
crates.io
	avg: -1.2%	min: -2.1%	max: 0.0%
hyper-check
	avg: -1.0%	min: -2.1%	max: -0.1%
clap-rs-opt
	avg: -1.4%	min: -2.0%	max: -0.3%
piston-image-check
	avg: -1.2%	min: -1.9%	max: -0.1%
regex-check
	avg: -0.5%	min: -1.8%	max: -0.1%
syn
	avg: -1.1%	min: -1.7%	max: -0.1%
tokio-webpush-simple-check
	avg: -1.1%	min: -1.6%	max: -0.3%
tokio-webpush-simple
	avg: -1.2%	min: -1.6%	max: -0.0%
helloworld-check
	avg: -1.4%	min: -1.6%	max: -1.2%
deeply-nested
	avg: -1.2%	min: -1.4%	max: -0.8%
encoding-check
	avg: -0.8%	min: -1.3%	max: -0.3%
unify-linearly-check
	avg: -1.0%	min: -1.3%	max: -0.8%
script-servo-check
	avg: -0.6%	min: -1.3%	max: 0.0%
regression-31157-check
	avg: -0.9%	min: -1.2%	max: -0.7%
script-servo-opt
	avg: -0.5%	min: -1.2%	max: 0.1%
deeply-nested-check
	avg: -0.8%	min: -1.2%	max: -0.7%
encoding
	avg: -0.7%	min: -1.1%	max: -0.3%
issue-46449-check
	avg: -0.9%	min: -1.1%	max: -0.6%
parser-check
	avg: -0.9%	min: -1.1%	max: -0.8%
html5ever
	avg: -0.5%	min: -1.0%	max: -0.0%

This code path is rarely hit, which likely explains why this bug hasn't
been detected before now. (I only noticed it via code inspection.)
`nearest_common_ancestor()` uses an algorithm that requires computing
the full scope chain for both scopes, which is expensive because each
element involves a hash table lookup, and then looking for a common
tail.

This patch changes `nearest_common_ancestor()` to use a different
algorithm, which starts at the given scopes and works outwards (i.e. up
the scope tree) until a common ancestor is found. This is much faster
because in most cases the common ancestor is found well before the end
of the scope chains. Also, the use of a SmallVec avoids the need for any
allocation most of the time.
@@ -706,7 +706,7 @@ impl<'tcx> ScopeTree {
// "Modeling closures" section of the README in
// infer::region_constraints for more details.
let a_root_scope = a_ancestors[a_index];
let b_root_scope = a_ancestors[a_index];
let b_root_scope = b_ancestors[b_index];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch!

@nnethercote nnethercote force-pushed the nearest_common_ancestor branch from 210de11 to cccd51c Compare April 20, 2018 10:29
@michaelwoerister
Copy link
Member

Thanks, @nnethercote! I think @nikomatsakis had some ideas on how to exploit some domain knowledge to speed this. He probably should take a look at this.

r? @nikomatsakis

@pietroalbini pietroalbini added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 20, 2018
@nnethercote
Copy link
Contributor Author

Here are things I've found from my measurements:

  • Scope chains can be long, usually at least dozens of entries, and often 100+ entries.
  • The vast majority of the time, the the nearest common ancestor is much closer to the front of the scope chain than the back. I.e. the two scope chains have a handful of elements that differ at the front and then a long common tail. This is why the new algorithm is a much better fit.

I'm not satisfied with the representation of parent_map, which is a HashMap mapping child scopes to parent scopes. It's hot and it feels to me like we should be able to use a different representation that doesn't require a hash table lookup for every child-to-parent lookup. I have some half-baked ideas that I am investigating, but they might be more suitable for a follow-up PR. Even if those ideas work out, I'm confident that this new algorithm will continue to be a better fit than the old one.

@nnethercote
Copy link
Contributor Author

I'm not satisfied with the representation of parent_map, which is a HashMap mapping child scopes to parent scopes. It's hot and it feels to me like we should be able to use a different representation that doesn't require a hash table lookup for every child-to-parent lookup. I have some half-baked ideas that I am investigating, but they might be more suitable for a follow-up PR.

I've done some work on this, and I haven't yet managed to make things any faster with the new representation. So I think it's definitely worth evaluating this PR as is.

Copy link
Contributor

@nikomatsakis nikomatsakis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like the algorithm I wanted to do too -- thanks @nnethercote !

@nikomatsakis
Copy link
Contributor

@bors r+

@bors
Copy link
Collaborator

bors commented Apr 24, 2018

📌 Commit cccd51c has been approved by nikomatsakis

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 24, 2018
@bors
Copy link
Collaborator

bors commented Apr 25, 2018

⌛ Testing commit cccd51c with merge 38eb9cd0bc367e95e227fa8bafec95940acf8823...

@bors
Copy link
Collaborator

bors commented Apr 25, 2018

💔 Test failed - status-appveyor

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Apr 25, 2018
@rust-highfive
Copy link
Contributor

The job x86_64-gnu-llvm-3.9 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
[00:01:39] warning: spurious network error (1 tries remaining): failed to get 200 response from `https://crates.io/api/v1/crates/filetime/0.1.15/download`, got 500
[00:01:39] error: unable to get packages from source
[00:01:39] 
[00:01:39] Caused by:
[00:01:39]   failed to get 200 response from `https://crates.io/api/v1/crates/filetime/0.1.15/download`, got 500
[00:01:39] failed to run: /checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo build --manifest-path /checkout/src/bootstrap/Cargo.toml --locked
[00:01:39] Build completed unsuccessfully in 0:00:51
[00:01:39] Makefile:81: recipe for target 'prepare' failed
[00:01:39] make: *** [prepare] Error 1
[00:01:39]  Downloading serde_json v1.0.15
[00:01:40] error: unable to get packages from source
[00:01:40] 
[00:01:40] Caused by:
[00:01:40] Caused by:
[00:01:40]   failed to get 200 response from `https://crates.io/api/v1/crates/serde_json/1.0.15/download`, got 404
[00:01:40] failed to run: /checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo build --manifest-path /checkout/src/bootstrap/Cargo.toml --locked
[00:01:40] Build completed unsuccessfully in 0:00:00
[00:01:40] Makefile:81: recipe for target 'prepare' failed
[00:01:40] make: *** [prepare] Error 1
[00:01:40]  Downloading getopts v0.2.17
[00:01:40] error: unable to get packages from source
[00:01:40] 
[00:01:40] Caused by:
[00:01:40] Caused by:
[00:01:40]   failed to get 200 response from `https://crates.io/api/v1/crates/getopts/0.2.17/download`, got 404
[00:01:40] failed to run: /checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo build --manifest-path /checkout/src/bootstrap/Cargo.toml --locked
[00:01:40] Build completed unsuccessfully in 0:00:00
[00:01:40] Makefile:81: recipe for target 'prepare' failed
[00:01:40] make: *** [prepare] Error 1
[00:01:40]  Downloading serde_json v1.0.15
[00:01:41] error: unable to get packages from source
[00:01:41] 
[00:01:41] Caused by:
[00:01:41] Caused by:
[00:01:41]   failed to get 200 response from `https://crates.io/api/v1/crates/serde_json/1.0.15/download`, got 404
[00:01:41] failed to run: /checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo build --manifest-path /checkout/src/bootstrap/Cargo.toml --locked
[00:01:41] Build completed unsuccessfully in 0:00:00
[00:01:41] Makefile:81: recipe for target 'prepare' failed
[00:01:41] make: *** [prepare] Error 1
[00:01:41]  Downloading filetime v0.1.15
[00:01:41] warning: spurious network error (2 tries remaining): failed to get 200 response from `https://crates.io/api/v1/crates/filetime/0.1.15/download`, got 500
[00:01:41] warning: spurious network error (1 tries remaining): failed to get 200 response from `https://crates.io/api/v1/crates/filetime/0.1.15/download`, got 500
[00:01:41] error: unable to get packages from source
[00:01:41] error: unable to get packages from source
[00:01:41] 
[00:01:41] Caused by:
[00:01:41]   failed to get 200 response from `https://crates.io/api/v1/crates/filetime/0.1.15/download`, got 500
[00:01:41] failed to run: /checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo build --manifest-path /checkout/src/bootstrap/Cargo.toml --locked
[00:01:41] Build completed unsuccessfully in 0:00:00
[00:01:41] make: *** [prepare] Error 1
[00:01:41] Makefile:81: recipe for target 'prepare' failed
[00:01:41] The command has failed after 5 attempts.

The command "stamp sh -x -c "$RUN_SCRIPT"" exited with 1.
travis_time:start:1914da01
$ date && (curl -fs --head https://google.com | grep ^Date: | sed 's/Date: //g' || true)

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@nnethercote
Copy link
Contributor Author

This looks like an infrastructure failure. Can someone please retry?

@nagbot-rs
Copy link

nagbot-rs commented Apr 25, 2018 via email

@bors
Copy link
Collaborator

bors commented Apr 25, 2018

@nagbot-rs: 🔑 Insufficient privileges: not in try users

@aturon
Copy link
Member

aturon commented Apr 25, 2018

@bors: retry

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 25, 2018
@bors
Copy link
Collaborator

bors commented Apr 25, 2018

⌛ Testing commit cccd51c with merge cc79420...

bors added a commit that referenced this pull request Apr 25, 2018
…sakis

Speed up `nearest_common_ancestor`.

`nearest_common_ancestor` can be made faster.

Here are all the benchmarks where one of the measurements improved by at least 1%.
```
clap-rs-check
	avg: -4.5%	min: -8.8%	max: -0.3%
clap-rs
	avg: -2.6%	min: -4.5%	max: 0.5%
script-servo
	avg: -1.7%	min: -3.6%	max: 0.0%
regression-31157
	avg: -1.5%	min: -2.6%	max: -0.4%
hyper
	avg: -1.2%	min: -2.5%	max: -0.0%
piston-image
	avg: -1.6%	min: -2.5%	max: 0.1%
regex
	avg: -1.2%	min: -2.2%	max: 0.0%
issue-46449
	avg: -1.8%	min: -2.1%	max: -0.7%
crates.io
	avg: -1.2%	min: -2.1%	max: 0.0%
hyper-check
	avg: -1.0%	min: -2.1%	max: -0.1%
clap-rs-opt
	avg: -1.4%	min: -2.0%	max: -0.3%
piston-image-check
	avg: -1.2%	min: -1.9%	max: -0.1%
regex-check
	avg: -0.5%	min: -1.8%	max: -0.1%
syn
	avg: -1.1%	min: -1.7%	max: -0.1%
tokio-webpush-simple-check
	avg: -1.1%	min: -1.6%	max: -0.3%
tokio-webpush-simple
	avg: -1.2%	min: -1.6%	max: -0.0%
helloworld-check
	avg: -1.4%	min: -1.6%	max: -1.2%
deeply-nested
	avg: -1.2%	min: -1.4%	max: -0.8%
encoding-check
	avg: -0.8%	min: -1.3%	max: -0.3%
unify-linearly-check
	avg: -1.0%	min: -1.3%	max: -0.8%
script-servo-check
	avg: -0.6%	min: -1.3%	max: 0.0%
regression-31157-check
	avg: -0.9%	min: -1.2%	max: -0.7%
script-servo-opt
	avg: -0.5%	min: -1.2%	max: 0.1%
deeply-nested-check
	avg: -0.8%	min: -1.2%	max: -0.7%
encoding
	avg: -0.7%	min: -1.1%	max: -0.3%
issue-46449-check
	avg: -0.9%	min: -1.1%	max: -0.6%
parser-check
	avg: -0.9%	min: -1.1%	max: -0.8%
html5ever
	avg: -0.5%	min: -1.0%	max: -0.0%
```
@bors
Copy link
Collaborator

bors commented Apr 25, 2018

☀️ Test successful - status-appveyor, status-travis
Approved by: nikomatsakis
Pushing cc79420 to master...

@bors bors merged commit cccd51c into rust-lang:master Apr 25, 2018
@nnethercote nnethercote deleted the nearest_common_ancestor branch April 26, 2018 05:57
@kirillkh
Copy link

Two remarks :

  1. In the worst case, the new algorithm is quadratic in the combined length of the two stacks (seen). Did you try to use a hash table instead of SmallVec?
  2. It is possible to ~halve the number of comparisons being done in each iteration of the outer loop. Maintain two seen vecs instead of one. Add items from stack a into seena and items from stack b into seenb. Look for items from stack a in seenb and vice versa.

@nnethercote
Copy link
Contributor Author

I did try a hash table. SmallVec was slightly faster for the rustc-perf suite.

Two Vecs would halve the number of comparisons, but it would require an extra heap allocation. I don't know which effect would be greater.

@kirillkh
Copy link

Another option is to maintain a single seen vec and use stride 2 for search.

@kirillkh
Copy link

And actually I expect 2 SmallVecs to be a win, considering they are stack-allocated most of the time.

@nnethercote
Copy link
Contributor Author

I filed #50365 for the two vectors change. Thank you for the suggestion, @kirillkh.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants