
Inline smaller callees first #130696


Closed
wants to merge 1 commit

Conversation

scottmcm
Member

Then limit the total size and number of inlined things, to allow more top-down inlining (particularly important after calling something generic that couldn't inline internally) without getting exponential blowup.

Fixes #130590
r? saethlin
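The strategy described above — try smaller callees first, then stop once either a total-cost budget or a count limit runs out — can be sketched roughly like this. All names here (`Callsite`, `pick_inline_order`) are illustrative, not rustc's actual inliner code:

```rust
// Hypothetical sketch of budget-limited, smallest-first inlining.
// The 300-point default budget mirrors the threshold in this PR's diff.
struct Callsite {
    name: &'static str,
    cost: u32,
}

fn pick_inline_order(
    mut candidates: Vec<Callsite>,
    mut budget: u32,
    mut count: u32,
) -> Vec<&'static str> {
    // Smaller callees first, so many cheap calls aren't starved by one big one.
    candidates.sort_by_key(|c| c.cost);
    let mut inlined = Vec::new();
    for c in candidates {
        if count == 0 {
            break; // numeric limit, guards against recursive blowup
        }
        if c.cost > budget {
            break; // sorted ascending, so nothing later fits either
        }
        budget -= c.cost;
        count -= 1;
        inlined.push(c.name);
    }
    inlined
}

fn main() {
    let sites = vec![
        Callsite { name: "big", cost: 250 },
        Callsite { name: "tiny", cost: 5 },
        Callsite { name: "mid", cost: 60 },
    ];
    // With a 300-point budget, the tiny and mid callees are taken before
    // the big one exhausts the budget.
    println!("{:?}", pick_inline_order(sites, 300, 10)); // ["tiny", "mid"]
}
```

The key design point, per the PR description, is that sorting by cost leaves budget for more top-down inlining instead of letting one large callee consume it all.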

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Sep 22, 2024
@rustbot
Collaborator

rustbot commented Sep 22, 2024

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

@scottmcm
Member Author

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 22, 2024
bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 22, 2024
Inline smaller callees first

Then limit the total size and number of inlined things, to allow more top-down inlining (particularly important after calling something generic that couldn't inline internally) without getting exponential blowup.

Fixes rust-lang#130590
r? saethlin
@bors
Collaborator

bors commented Sep 22, 2024

⌛ Trying commit 51efba2 with merge 018727d...

@rust-log-analyzer
Collaborator

The job x86_64-gnu-llvm-18 failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
------
 > importing cache manifest from ghcr.io/rust-lang/rust-ci-cache:30ca74372d8b771363f68f939a58b017a592fae4f69398600dc51145997160f03e9652051f957840c41898984a88855e9757fa23464703a5a4ba21316ddebb04:
------
##[endgroup]
Setting extra environment values for docker:  --env ENABLE_GCC_CODEGEN=1 --env GCC_EXEC_PREFIX=/usr/lib/gcc/
[CI_JOB_NAME=x86_64-gnu-llvm-18]
---
sccache: Starting the server...
##[group]Configure the build
configure: processing command line
configure: 
configure: build.configure-args := ['--build=x86_64-unknown-linux-gnu', '--llvm-root=/usr/lib/llvm-18', '--enable-llvm-link-shared', '--set', 'rust.randomize-layout=true', '--set', 'rust.thin-lto-import-instr-limit=10', '--set', 'change-id=99999999', '--enable-verbose-configure', '--enable-sccache', '--disable-manage-submodules', '--enable-locked-deps', '--enable-cargo-native-static', '--set', 'rust.codegen-units-std=1', '--set', 'dist.compression-profile=balanced', '--dist-compression-formats=xz', '--set', 'rust.lld=false', '--disable-dist-src', '--release-channel=nightly', '--enable-debug-assertions', '--enable-overflow-checks', '--enable-llvm-assertions', '--set', 'rust.verify-llvm-ir', '--set', 'rust.codegen-backends=llvm,cranelift,gcc', '--set', 'llvm.static-libstdcpp', '--enable-new-symbol-mangling']
configure: target.x86_64-unknown-linux-gnu.llvm-config := /usr/lib/llvm-18/bin/llvm-config
configure: llvm.link-shared     := True
configure: rust.randomize-layout := True
configure: rust.thin-lto-import-instr-limit := 10

@bors
Collaborator

bors commented Sep 22, 2024

☀️ Try build successful - checks-actions
Build commit: 018727d (018727d766cb382a9e117d9d50a6abc0f5bd7ef0)

@rust-timer
Collaborator

Queued 018727d with parent 80aa6fa, future comparison URL.
There is currently 1 preceding artifact in the queue.
It will probably take at least ~2.3 hours until the benchmark run finishes.


let mut changed = false;
let mut remaining_cost = tcx.sess.opts.unstable_opts.inline_mir_total_threshold.unwrap_or(300);
let mut remaining_count = MAX_INLINED_CALLEES_PER_BODY;
Member

Shouldn't this go by some size heuristic? If the callee is absolutely tiny (e.g. just another call), then I'd expect the cost of inlining to be ~zero (replacing a call with another call), which means we could inline an unlimited number of those cases.

@the8472
Member

the8472 Sep 22, 2024

Ah, looking at the testcases I guess the scopes make this non-zero cost?
Are those debug-only?

Member Author

There's a heterogeneous recursion example in the tests that will recurse ≈infinitely if there's no numeric limit, so there needs to be a limit beyond just the total cost to keep that from exploding.

I can definitely experiment with upping the count limit and have the normal case be hitting the cost limit, though, since as you say we almost always want to inline the ≈free things.

Member Author

I could also try some tricks like adding synthetic cost the deeper the inlining goes, so that it ends up hitting the cost limit anyway. That might be better than the count limit since it'll encourage more breadth instead of depth...
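The depth-penalty idea floated above could look roughly like this. This is purely illustrative; `effective_cost` and the penalty value are made up for the sketch, not taken from the compiler:

```rust
// Illustrative only: charge a synthetic penalty per level of inlining depth,
// so deeper inlining eats the cost budget faster and breadth is favored.
fn effective_cost(base_cost: u32, depth: u32, penalty_per_level: u32) -> u32 {
    base_cost + depth * penalty_per_level
}

fn main() {
    // A 10-point callee at the top level vs. three levels deep,
    // with a hypothetical 15-point penalty per level.
    println!("{}", effective_cost(10, 0, 15)); // 10
    println!("{}", effective_cost(10, 3, 15)); // 55
}
```

With such a penalty, a cost budget alone can bound recursive inlining, since each extra level costs strictly more than the last.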

@saethlin

This comment was marked as outdated.

@scottmcm scottmcm marked this pull request as draft September 27, 2024 06:46
@scottmcm
Member Author

Some impressive size results here:

  • diesel opt-full 13,372,460 → 11,831,084
  • hyper opt-full 5,393,450 → 4,634,454

But it's almost all red for compile-time, so I've moved this to draft until I can rework it to hopefully not have that impact.

@oli-obk
Contributor

oli-obk commented Sep 27, 2024

We could limit it to non-incremental release builds.

It may also be of interest to embedded folks when optimizing for size.

@scottmcm
Member Author

@oli isn't the inliner already off for incremental?

match sess.mir_opt_level() {
    0 | 1 => false,
    2 => {
        (sess.opts.optimize == OptLevel::Default
            || sess.opts.optimize == OptLevel::Aggressive)
            && sess.opts.incremental == None
    }
    _ => true,
}

Or is mir-opt-level set to 3 with opt-level=3?

@oli-obk
Contributor

oli-obk commented Sep 27, 2024

Considering incremental regressed, I'd say we're indeed running on level 3

@bjorn3
Member

bjorn3 commented Sep 27, 2024

When optimizations are enabled, -Zmir-opt-level=2 is used by default, and -Zmir-opt-level=1 when optimizations are disabled. Incr check and debug builds also regress, so it probably has more to do with this optimization running on the standard library.

@saethlin
Member

saethlin commented Sep 27, 2024

Considering incremental regressed

Any improvement to the inliner tends to regress incremental. I think this happens because the standard library is not built with incremental, so better inlining means more of the standard library gets transitively dragged into third party CGUs.

@the8472
Member

the8472 commented Sep 27, 2024

Considering incremental regressed

Any improvement to the inliner tends to regress incremental. I think this happens because the standard library is not built with incremental, so better inlining means more of the standard library gets transitively dragged into third party CGUs.

This is for generic MIR exported from the library or for prebuilt monomorphic code? Would it make sense to distinguish between MIR for export (and maybe defer optimization to the consuming crate?) and MIR generated for same-crate codegen?

@saethlin
Member

saethlin commented Sep 27, 2024

This is for generic MIR exported from the library or for prebuilt monomorphic code?

It's for MIR exported from the library; not all of that MIR is generic.

and maybe defer optimization to the consuming crate?

I suspect this would have similarly problematic consequences for incremental builds.

What might help is having two versions of optimized_mir: one with all interprocedural optimizations disabled, used for incremental builds. It's not clear to me whether the compiler complexity and metadata size increase would be worth it.

@Dylan-DPC Dylan-DPC added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 20, 2024
@scottmcm
Member Author

I still want to do something like this, but closing for now because it probably needs some other reworks first.

Note to self: perhaps #126640 can help.

@scottmcm scottmcm closed this Dec 20, 2024
Successfully merging this pull request may close these issues.

Range<usize>::next should fully MIR-inline