Speed up `nearest_common_ancestor`. #50106

nnethercote · 2018-04-20T10:25:46Z

nearest_common_ancestor can be made faster.

Here are all the benchmarks where one of the measurements improved by at least 1%.

clap-rs-check
	avg: -4.5%	min: -8.8%	max: -0.3%
clap-rs
	avg: -2.6%	min: -4.5%	max: 0.5%
script-servo
	avg: -1.7%	min: -3.6%	max: 0.0%
regression-31157
	avg: -1.5%	min: -2.6%	max: -0.4%
hyper
	avg: -1.2%	min: -2.5%	max: -0.0%
piston-image
	avg: -1.6%	min: -2.5%	max: 0.1%
regex
	avg: -1.2%	min: -2.2%	max: 0.0%
issue-46449
	avg: -1.8%	min: -2.1%	max: -0.7%
crates.io
	avg: -1.2%	min: -2.1%	max: 0.0%
hyper-check
	avg: -1.0%	min: -2.1%	max: -0.1%
clap-rs-opt
	avg: -1.4%	min: -2.0%	max: -0.3%
piston-image-check
	avg: -1.2%	min: -1.9%	max: -0.1%
regex-check
	avg: -0.5%	min: -1.8%	max: -0.1%
syn
	avg: -1.1%	min: -1.7%	max: -0.1%
tokio-webpush-simple-check
	avg: -1.1%	min: -1.6%	max: -0.3%
tokio-webpush-simple
	avg: -1.2%	min: -1.6%	max: -0.0%
helloworld-check
	avg: -1.4%	min: -1.6%	max: -1.2%
deeply-nested
	avg: -1.2%	min: -1.4%	max: -0.8%
encoding-check
	avg: -0.8%	min: -1.3%	max: -0.3%
unify-linearly-check
	avg: -1.0%	min: -1.3%	max: -0.8%
script-servo-check
	avg: -0.6%	min: -1.3%	max: 0.0%
regression-31157-check
	avg: -0.9%	min: -1.2%	max: -0.7%
script-servo-opt
	avg: -0.5%	min: -1.2%	max: 0.1%
deeply-nested-check
	avg: -0.8%	min: -1.2%	max: -0.7%
encoding
	avg: -0.7%	min: -1.1%	max: -0.3%
issue-46449-check
	avg: -0.9%	min: -1.1%	max: -0.6%
parser-check
	avg: -0.9%	min: -1.1%	max: -0.8%
html5ever
	avg: -0.5%	min: -1.0%	max: -0.0%

This code path is rarely hit, which likely explains why this bug hasn't been detected before now. (I only noticed it via code inspection.)

`nearest_common_ancestor()` uses an algorithm that requires computing the full scope chain for both scopes, which is expensive because each element involves a hash table lookup, and then looking for a common tail. This patch changes `nearest_common_ancestor()` to use a different algorithm, which starts at the given scopes and works outwards (i.e. up the scope tree) until a common ancestor is found. This is much faster because in most cases the common ancestor is found well before the end of the scope chains. Also, the use of a SmallVec avoids the need for any allocation most of the time.

michaelwoerister · 2018-04-20T10:28:46Z

src/librustc/middle/region.rs

@@ -706,7 +706,7 @@ impl<'tcx> ScopeTree {
            // "Modeling closures" section of the README in
            // infer::region_constraints for more details.
            let a_root_scope = a_ancestors[a_index];
-            let b_root_scope = a_ancestors[a_index];
+            let b_root_scope = b_ancestors[b_index];


Nice catch!

michaelwoerister · 2018-04-20T10:30:10Z

Thanks, @nnethercote! I think @nikomatsakis had some ideas on how to exploit some domain knowledge to speed this. He probably should take a look at this.

r? @nikomatsakis

nnethercote · 2018-04-23T00:36:52Z

Here are things I've found from my measurements:

Scope chains can be long, usually at least dozens of entries, and often 100+ entries.
The vast majority of the time, the the nearest common ancestor is much closer to the front of the scope chain than the back. I.e. the two scope chains have a handful of elements that differ at the front and then a long common tail. This is why the new algorithm is a much better fit.

I'm not satisfied with the representation of parent_map, which is a HashMap mapping child scopes to parent scopes. It's hot and it feels to me like we should be able to use a different representation that doesn't require a hash table lookup for every child-to-parent lookup. I have some half-baked ideas that I am investigating, but they might be more suitable for a follow-up PR. Even if those ideas work out, I'm confident that this new algorithm will continue to be a better fit than the old one.

nnethercote · 2018-04-24T07:28:14Z

I'm not satisfied with the representation of parent_map, which is a HashMap mapping child scopes to parent scopes. It's hot and it feels to me like we should be able to use a different representation that doesn't require a hash table lookup for every child-to-parent lookup. I have some half-baked ideas that I am investigating, but they might be more suitable for a follow-up PR.

I've done some work on this, and I haven't yet managed to make things any faster with the new representation. So I think it's definitely worth evaluating this PR as is.

nikomatsakis

This looks like the algorithm I wanted to do too -- thanks @nnethercote !

nikomatsakis · 2018-04-24T18:53:50Z

@bors r+

bors · 2018-04-24T18:53:51Z

📌 Commit cccd51c has been approved by nikomatsakis

bors · 2018-04-25T00:48:16Z

⌛ Testing commit cccd51c with merge 38eb9cd0bc367e95e227fa8bafec95940acf8823...

bors · 2018-04-25T00:52:09Z

💔 Test failed - status-appveyor

rust-highfive · 2018-04-25T00:52:18Z

The job x86_64-gnu-llvm-3.9 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.

[00:01:39] warning: spurious network error (1 tries remaining): failed to get 200 response from `https://crates.io/api/v1/crates/filetime/0.1.15/download`, got 500
[00:01:39] error: unable to get packages from source
[00:01:39] 
[00:01:39] Caused by:
[00:01:39]   failed to get 200 response from `https://crates.io/api/v1/crates/filetime/0.1.15/download`, got 500
[00:01:39] failed to run: /checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo build --manifest-path /checkout/src/bootstrap/Cargo.toml --locked
[00:01:39] Build completed unsuccessfully in 0:00:51
[00:01:39] Makefile:81: recipe for target 'prepare' failed
[00:01:39] make: *** [prepare] Error 1
[00:01:39]  Downloading serde_json v1.0.15
[00:01:40] error: unable to get packages from source
[00:01:40] 
[00:01:40] Caused by:
[00:01:40] Caused by:
[00:01:40]   failed to get 200 response from `https://crates.io/api/v1/crates/serde_json/1.0.15/download`, got 404
[00:01:40] failed to run: /checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo build --manifest-path /checkout/src/bootstrap/Cargo.toml --locked
[00:01:40] Build completed unsuccessfully in 0:00:00
[00:01:40] Makefile:81: recipe for target 'prepare' failed
[00:01:40] make: *** [prepare] Error 1
[00:01:40]  Downloading getopts v0.2.17
[00:01:40] error: unable to get packages from source
[00:01:40] 
[00:01:40] Caused by:
[00:01:40] Caused by:
[00:01:40]   failed to get 200 response from `https://crates.io/api/v1/crates/getopts/0.2.17/download`, got 404
[00:01:40] failed to run: /checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo build --manifest-path /checkout/src/bootstrap/Cargo.toml --locked
[00:01:40] Build completed unsuccessfully in 0:00:00
[00:01:40] Makefile:81: recipe for target 'prepare' failed
[00:01:40] make: *** [prepare] Error 1
[00:01:40]  Downloading serde_json v1.0.15
[00:01:41] error: unable to get packages from source
[00:01:41] 
[00:01:41] Caused by:
[00:01:41] Caused by:
[00:01:41]   failed to get 200 response from `https://crates.io/api/v1/crates/serde_json/1.0.15/download`, got 404
[00:01:41] failed to run: /checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo build --manifest-path /checkout/src/bootstrap/Cargo.toml --locked
[00:01:41] Build completed unsuccessfully in 0:00:00
[00:01:41] Makefile:81: recipe for target 'prepare' failed
[00:01:41] make: *** [prepare] Error 1
[00:01:41]  Downloading filetime v0.1.15
[00:01:41] warning: spurious network error (2 tries remaining): failed to get 200 response from `https://crates.io/api/v1/crates/filetime/0.1.15/download`, got 500
[00:01:41] warning: spurious network error (1 tries remaining): failed to get 200 response from `https://crates.io/api/v1/crates/filetime/0.1.15/download`, got 500
[00:01:41] error: unable to get packages from source
[00:01:41] error: unable to get packages from source
[00:01:41] 
[00:01:41] Caused by:
[00:01:41]   failed to get 200 response from `https://crates.io/api/v1/crates/filetime/0.1.15/download`, got 500
[00:01:41] failed to run: /checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo build --manifest-path /checkout/src/bootstrap/Cargo.toml --locked
[00:01:41] Build completed unsuccessfully in 0:00:00
[00:01:41] make: *** [prepare] Error 1
[00:01:41] Makefile:81: recipe for target 'prepare' failed
[00:01:41] The command has failed after 5 attempts.

The command "stamp sh -x -c "$RUN_SCRIPT"" exited with 1.
travis_time:start:1914da01
$ date && (curl -fs --head https://google.com | grep ^Date: | sed 's/Date: //g' || true)

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

nnethercote · 2018-04-25T01:06:23Z

This looks like an infrastructure failure. Can someone please retry?

nagbot-rs · 2018-04-25T01:49:20Z

@bors: retry * crates.io was down

…

On Tue, Apr 24, 2018 at 7:52 PM, bors ***@***.***> wrote: 💔 Test failed - status-appveyor <https://ci.appveyor.com/project/rust-lang/rust/build/1.0.7131> — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#50106 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AaPN0C5s5zxNU9kkd7L0ES0DWk2vGJZaks5tr8jOgaJpZM4TdO2p> . -- You received this message because you are subscribed to the Google Groups "rust-ops" group. To unsubscribe from this group and stop receiving emails from it, send an email to ***@***.*** To post to this group, send email to ***@***.*** To view this discussion on the web visit https://groups.google.com/d/ msgid/rust-ops/rust-lang/rust/pull/50106/c384125552%40github.com <https://groups.google.com/d/msgid/rust-ops/rust-lang/rust/pull/50106/c384125552%40github.com?utm_medium=email&utm_source=footer> . For more options, visit https://groups.google.com/d/optout.

bors · 2018-04-25T01:49:21Z

@nagbot-rs: 🔑 Insufficient privileges: not in try users

aturon · 2018-04-25T02:04:09Z

@bors: retry

bors · 2018-04-25T06:48:46Z

⌛ Testing commit cccd51c with merge cc79420...

…sakis Speed up `nearest_common_ancestor`. `nearest_common_ancestor` can be made faster. Here are all the benchmarks where one of the measurements improved by at least 1%. ``` clap-rs-check avg: -4.5% min: -8.8% max: -0.3% clap-rs avg: -2.6% min: -4.5% max: 0.5% script-servo avg: -1.7% min: -3.6% max: 0.0% regression-31157 avg: -1.5% min: -2.6% max: -0.4% hyper avg: -1.2% min: -2.5% max: -0.0% piston-image avg: -1.6% min: -2.5% max: 0.1% regex avg: -1.2% min: -2.2% max: 0.0% issue-46449 avg: -1.8% min: -2.1% max: -0.7% crates.io avg: -1.2% min: -2.1% max: 0.0% hyper-check avg: -1.0% min: -2.1% max: -0.1% clap-rs-opt avg: -1.4% min: -2.0% max: -0.3% piston-image-check avg: -1.2% min: -1.9% max: -0.1% regex-check avg: -0.5% min: -1.8% max: -0.1% syn avg: -1.1% min: -1.7% max: -0.1% tokio-webpush-simple-check avg: -1.1% min: -1.6% max: -0.3% tokio-webpush-simple avg: -1.2% min: -1.6% max: -0.0% helloworld-check avg: -1.4% min: -1.6% max: -1.2% deeply-nested avg: -1.2% min: -1.4% max: -0.8% encoding-check avg: -0.8% min: -1.3% max: -0.3% unify-linearly-check avg: -1.0% min: -1.3% max: -0.8% script-servo-check avg: -0.6% min: -1.3% max: 0.0% regression-31157-check avg: -0.9% min: -1.2% max: -0.7% script-servo-opt avg: -0.5% min: -1.2% max: 0.1% deeply-nested-check avg: -0.8% min: -1.2% max: -0.7% encoding avg: -0.7% min: -1.1% max: -0.3% issue-46449-check avg: -0.9% min: -1.1% max: -0.6% parser-check avg: -0.9% min: -1.1% max: -0.8% html5ever avg: -0.5% min: -1.0% max: -0.0% ```

bors · 2018-04-25T09:19:04Z

☀️ Test successful - status-appveyor, status-travis
Approved by: nikomatsakis
Pushing cc79420 to master...

kirillkh · 2018-04-30T09:13:23Z

Two remarks :

In the worst case, the new algorithm is quadratic in the combined length of the two stacks (seen). Did you try to use a hash table instead of SmallVec?
It is possible to ~halve the number of comparisons being done in each iteration of the outer loop. Maintain two seen vecs instead of one. Add items from stack a into seena and items from stack b into seenb. Look for items from stack a in seenb and vice versa.

nnethercote · 2018-04-30T09:17:34Z

I did try a hash table. SmallVec was slightly faster for the rustc-perf suite.

Two Vecs would halve the number of comparisons, but it would require an extra heap allocation. I don't know which effect would be greater.

kirillkh · 2018-04-30T09:24:20Z

Another option is to maintain a single seen vec and use stride 2 for search.

kirillkh · 2018-04-30T09:27:21Z

And actually I expect 2 SmallVecs to be a win, considering they are stack-allocated most of the time.

nnethercote · 2018-05-02T03:42:09Z

I filed #50365 for the two vectors change. Thank you for the suggestion, @kirillkh.

nnethercote added 2 commits April 20, 2018 20:27

Fix a copy-and-paste bug in nearest_common_ancestor.

144c0d5

This code path is rarely hit, which likely explains why this bug hasn't been detected before now. (I only noticed it via code inspection.)

michaelwoerister reviewed Apr 20, 2018

View reviewed changes

nnethercote force-pushed the nearest_common_ancestor branch from 210de11 to cccd51c Compare April 20, 2018 10:29

pietroalbini assigned nikomatsakis Apr 20, 2018

pietroalbini added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 20, 2018

Marwes mentioned this pull request Apr 20, 2018

lalrpop v0.13.1 is very slow to compile lalrpop/lalrpop#260

Open

nikomatsakis approved these changes Apr 24, 2018

View reviewed changes

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 24, 2018

bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Apr 25, 2018

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 25, 2018

bors merged commit cccd51c into rust-lang:master Apr 25, 2018

nnethercote deleted the nearest_common_ancestor branch April 26, 2018 05:57

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Speed up `nearest_common_ancestor`. #50106

Speed up `nearest_common_ancestor`. #50106

nnethercote commented Apr 20, 2018

michaelwoerister Apr 20, 2018

michaelwoerister commented Apr 20, 2018

nnethercote commented Apr 23, 2018

nnethercote commented Apr 24, 2018

nikomatsakis left a comment

nikomatsakis commented Apr 24, 2018

bors commented Apr 24, 2018

bors commented Apr 25, 2018

bors commented Apr 25, 2018

rust-highfive commented Apr 25, 2018

nnethercote commented Apr 25, 2018

nagbot-rs commented Apr 25, 2018 via email

bors commented Apr 25, 2018

aturon commented Apr 25, 2018

bors commented Apr 25, 2018

bors commented Apr 25, 2018

kirillkh commented Apr 30, 2018

nnethercote commented Apr 30, 2018

kirillkh commented Apr 30, 2018

kirillkh commented Apr 30, 2018

nnethercote commented May 2, 2018

Speed up nearest_common_ancestor. #50106

Speed up nearest_common_ancestor. #50106

Conversation

nnethercote commented Apr 20, 2018

michaelwoerister Apr 20, 2018

Choose a reason for hiding this comment

michaelwoerister commented Apr 20, 2018

nnethercote commented Apr 23, 2018

nnethercote commented Apr 24, 2018

nikomatsakis left a comment

Choose a reason for hiding this comment

nikomatsakis commented Apr 24, 2018

bors commented Apr 24, 2018

bors commented Apr 25, 2018

bors commented Apr 25, 2018

rust-highfive commented Apr 25, 2018

nnethercote commented Apr 25, 2018

nagbot-rs commented Apr 25, 2018 via email

bors commented Apr 25, 2018

aturon commented Apr 25, 2018

bors commented Apr 25, 2018

bors commented Apr 25, 2018

kirillkh commented Apr 30, 2018

nnethercote commented Apr 30, 2018

kirillkh commented Apr 30, 2018

kirillkh commented Apr 30, 2018

nnethercote commented May 2, 2018

Speed up `nearest_common_ancestor`. #50106

Speed up `nearest_common_ancestor`. #50106