
Correctly calculate access chains for compound types #233


Merged: 9 commits, Apr 21, 2025

Conversation

@LegNeato (Collaborator) commented Apr 12, 2025

Fixes #46.

This was extremely painful and I am still not sure I understand the code completely and all the various cases.

@schell , @Firestar99 , please try it in your projects and see if I broke anything!

@Firestar99 (Member)

My stuff still works with this branch (+ nonuniform patches) 🎉

@schell (Contributor) commented Apr 18, 2025

My project compiled just fine and passes tests 👍 . I also successfully built a shader with [Default::default(); N] in it.

Awesome work!

@schell (Contributor) left a comment

LGTM, but take my review with a grain of salt 🧂: I never got a chance to dive in very deep, though I have tested the results.

@LegNeato (Collaborator, Author)

@eddyb said he wanted to take a look at this so I'll wait a bit before merging.

@LegNeato (Collaborator, Author)

Ok, I am just going to land it and we can fix forward or revert if it isn't correct.

@LegNeato merged commit 2f915a0 into Rust-GPU:main Apr 21, 2025
7 checks passed
@LegNeato deleted the fixregression branch April 21, 2025 02:45
@eddyb (Collaborator) left a comment

I was late on this and I'm not even sure I should leave this review.

I've got mixed feelings about it - I don't mind all the tracing usage and the comments, but it turns out that there are fewer functional changes than I expected - in fact, just one, and it doesn't make sense to me.

There's a non-trivial chance this is a case of "two wrongs make a right", or more specifically something else ends up reacting better to what seems to be invalid SPIR-V (AFAICT).

Now, to be fair, I finally looked at this because of:

And the simplest workaround for that seems to be "run the GEP merge logic twice, using a loop, and the second time use pointercast first which can generate its own GEP via recover_access_chain_from_offset" - so there's a chance the "byte GEP" stuff hits a similar weird combo.

In fact, funnily enough, I think we could've always done that pointercast before the merge, because of the if ty == original_pointee_ty that currently gates merging: if that condition holds, the pointercast is a noop anyway.
(EDIT: yupp, that works, and it amounts to moving just two lets slightly earlier in the function)

I should try to figure out the overall "decision trees" for maybe_inbounds_gep and other recover_access_chain_from_offset callers, then reduce duplicated work and handle all possible cases that can be statically resolved.
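The two-pass idea above ("run the GEP merge logic twice, using a loop, and the second time use pointercast first") can be sketched abstractly. This is a hypothetical helper, not rust-gpu code: `try_merge` and `pointercast` here are toy stand-ins for the real merge logic and the cast that can recover its own access chain.

```rust
// Abstract sketch of the retry loop: attempt the merge directly; if it does
// not apply, normalize the pointer with a pointercast and attempt once more.
fn merge_twice<P>(
    mut ptr: P,
    try_merge: impl Fn(&P) -> Option<P>,
    pointercast: impl Fn(&P) -> P,
) -> Option<P> {
    for attempt in 0..2 {
        if let Some(merged) = try_merge(&ptr) {
            return Some(merged);
        }
        if attempt == 0 {
            // Second chance: cast first, then retry the merge.
            ptr = pointercast(&ptr);
        }
    }
    None
}

fn main() {
    // Toy stand-in: the merge only "applies" to even values, and the cast
    // normalizes by adding 1, so 3 fails, becomes 4, then merges to 40.
    let merged = merge_twice(3, |p| (p % 2 == 0).then(|| p * 10), |p| p + 1);
    assert_eq!(merged, Some(40));
}
```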

Comment on lines -440 to -441
// FIXME(eddyb) this isn't efficient, `recover_access_chain_from_offset`
// could instead be doing all the extra digging itself.
@eddyb (Collaborator):

I am not sure what this comment meant exactly, but I suspect it still applies to the loop being a loop, and not a single call to recover_access_chain_from_offset.

return Some((indices, ty));
}
}
}

#[instrument(level = "trace", skip(self), fields(ty = ?self.debug_type(ty), ptr, combined_indices = ?combined_indices.iter().map(|x| (self.debug_type(x.ty), x.kind)).collect::<Vec<_>>(), is_inbounds))]
@eddyb (Collaborator):

Surprised rustfmt didn't make this multi-line; I feel like it would be more readable if each entry in fields was on its own line.

Comment on lines +693 to +699
// Special case: If we started with a byte GEP (`is_byte_gep` is true) and
// we are currently indexing into a byte type, the result is still a byte type.
// This prevents errors if `indices` contains non-zero values when `ty` is u8/i8.
SpirvType::Integer(8, signedness) if is_byte_gep => {
// Define the resulting byte type (as it might not exist yet).
SpirvType::Integer(8, signedness).def(self.span(), self)
}
@eddyb (Collaborator):

I don't see how this could be reached - indices can only dig into structs/arrays, and only a bug elsewhere would allow indices.len() > 0 when doing byte offsetting.

Locally removing this case does not fail any tests, so at the very least none were added that could hit it.

@LegNeato (Collaborator, Author):

Oh weird, I added it specifically because byte indexing was broken in tests. Perhaps a later change fixed this and I didn't notice.

// Search the current function's instructions...
// FIXME(eddyb) this could get ridiculously expensive, at the very least
// it could use `.rev()`, hoping the base pointer was recently defined?
let search_result = {
@eddyb (Collaborator):

Why the extra variable? It just inflates the diff needlessly.

@LegNeato (Collaborator, Author):

I probably did this for logging, which looks like it was removed.

Comment on lines +780 to +804
// Determine the result type for the `AccessChain` instruction we might
// emit. By default, use the `final_spirv_ptr_type` strictly calculated
// earlier from `ty` and `indices`.
//
// If this is a byte GEP *and* the recovered type happens to be a byte
// type, we can use the pointer type derived from the *recovered* type
// (`base_pointee_ty`). This helps preserve type information when
// recovery works for byte addressing.
let result_wrapper_type = if !is_byte_gep
|| matches!(self.lookup_type(base_pointee_ty), SpirvType::Integer(8, _))
{
trace!(
"Using strictly calculated type for wrapper: {}",
// Use type based on input `ty` + `indices`
self.debug_type(calculated_pointee_type)
);
final_spirv_ptr_type
} else {
trace!(
"Byte GEP allowing recovered type for wrapper: {}",
// Use type based on recovery result
self.debug_type(base_pointee_ty)
);
self.type_ptr_to(base_pointee_ty)
};
@eddyb (Collaborator):

This looks like the only addition to the "constant offset" path and I don't understand it.

You have to use final_spirv_ptr_type (i.e. pointer-to-calculated_pointee_type) because that's what the type has to be.

When indices is empty, that does mean that pointer-to-base_pointee_ty can be obtained and it has the right offset, even if it's not pointing to a ty/calculated_pointee_type (which are equal in the empty indices case) - but there are two issues there:

  • "ty is NOT byte, OR base_pointee_ty is byte" doesn't seem like it matches the comment, nor can it relate to "do we have the desired pointer, even if of the wrong type?"
    • maybe the actual issue here is that the "then" and "else" sides got flipped for no good reason, so the condition ends up negated, and should be is_byte_gep && ...?
  • the caller definitely expects a value of type final_spirv_ptr_type, so if the access chain inherently has a different type, it must be cast to the right type (which would create a LogicalPtrCast, and hopefully not create downstream issues)
    • to explain "inherently" a bit further: an access chain always has only one valid output type, given the input pointer type and the number of indices (and value, when dealing with struct fields), and we can't just put in a different type and have valid SPIR-V

My bad, I kept missing the let result_pointee_type = if indices.is_empty() { in the old code, now it makes sense that there was something there (but I've kept my original comment for posterity).

That's exactly the condition I was describing earlier, where indices being empty determines what the correct type is. However, I suppose the old code was lacking the pointercast I mentioned above.
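The "one valid output type" point above can be illustrated with a simplified, hypothetical model of composite types (not rust-gpu's actual `SpirvType`): given the input pointee type and the indices, walking the type tree fully determines the result type, and a different number of indices yields a different (possibly invalid) result.

```rust
// Minimal sketch: the result type of an access chain is obtained by digging
// into the pointee type once per index.
#[derive(Debug, PartialEq)]
enum Ty {
    Int,             // a scalar: no inner structure to index into
    Array(Box<Ty>),  // homogeneous element type (length elided)
    Struct(Vec<Ty>), // per-field types: the index selects a field
}

fn access_chain_result<'a>(mut ty: &'a Ty, indices: &[usize]) -> Option<&'a Ty> {
    for &i in indices {
        ty = match ty {
            Ty::Array(elem) => &**elem,
            Ty::Struct(fields) => fields.get(i)?,
            Ty::Int => return None, // indexing past a scalar is invalid SPIR-V
        };
    }
    Some(ty)
}

fn main() {
    let s = Ty::Struct(vec![Ty::Int, Ty::Array(Box::new(Ty::Int))]);
    assert_eq!(access_chain_result(&s, &[1, 0]), Some(&Ty::Int));
    // One extra index would have to dig into a scalar: no valid result type.
    assert_eq!(access_chain_result(&s, &[1, 0, 0]), None);
}
```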

@eddyb (Collaborator):

This keeps issue-46 passing, but I will admit I don't know why it's valid:

@@ -785,8 +778,9 @@
                 // type, we can use the pointer type derived from the *recovered* type
                 // (`base_pointee_ty`). This helps preserve type information when
                 // recovery works for byte addressing.
-                let result_wrapper_type = if !is_byte_gep
-                    || matches!(self.lookup_type(base_pointee_ty), SpirvType::Integer(8, _))
+                let result_wrapper_type = if !(is_byte_gep
+                    && !matches!(self.lookup_type(base_pointee_ty), SpirvType::Integer(8, _))
+                    && indices.is_empty())
                 {
                     trace!(
                         "Using strictly calculated type for wrapper: {}",
@@ -828,13 +822,17 @@
                     );
                     // Emit a single AccessChain using the original pointer `ptr_id` and the fully combined index list.
                     // Note: We don't pass `ptr_base_index` here because its effect is incorporated into `base_indices`.
-                    return self.emit_access_chain(
+                    let mut final_ptr = self.emit_access_chain(
                         result_wrapper_type, // The chosen result pointer type
                         ptr_id,              // The original base pointer ID
                         None,                // No separate base index needed
                         combined_indices,    // The combined structured + original indices
                         is_inbounds,         // Preserve original inbounds request
                     );
+                    if result_wrapper_type != final_spirv_ptr_type {
+                        final_ptr = self.pointercast(final_ptr, final_spirv_ptr_type);
+                    }
+                    return final_ptr;
                 } else {
                     // Recovery happened, but the recovered type `base_pointee_ty` doesn't match the input `ty`,
                     // AND there are more `indices` to process. Using the `base_indices` derived from

I suppose the trick is that is_byte_gep requires indices.is_empty() (as bytes have no inner structure to further index), so:

  • my && indices.is_empty() is redundant (explains why nothing broke with your version)
  • you strengthened the condition for using base_pointee_ty, meaning final_spirv_ptr_type is used even in cases where it's incorrect to do so (and a cast should have been used instead)
    • generating the cast to final_spirv_ptr_type doesn't help, which means... what exactly? that two wrongs make a right?

Comment on lines +969 to +970
// HACK(eddyb): Workaround for potential upstream issues where pointers might lack precise type info.
// FIXME(eddyb): Ideally, this should use untyped memory features if available/necessary.
@eddyb (Collaborator):

These should've stayed next to the pointercast - the new comments might not even make sense, because the cast is only due to the mismatch - anyway, it's all a historical mess, and lots of concepts don't fit the new (much more untyped) approach, sadly.

let search_result = {
let emit = self.emit();
let module = emit.module_ref();
emit.selected_function().and_then(|func_idx| {
@eddyb (Collaborator):

This accounts for selected_function() being None - that should never be possible when you are in a method of a builder that will emit instructions into a selected block, not just function.

Comment on lines 903 to +910
if let Some((original_ptr, mut original_indices)) = maybe_original_access_chain {
// Transform the following:
// OpAccessChain original_ptr [a, b, c]
// OpPtrAccessChain ptr base [d, e, f]
// into
// OpAccessChain original_ptr [a, b, c + base, d, e, f]
// to remove the need for OpPtrAccessChain
let last = original_indices.last_mut().unwrap();
*last = self
.add(last.with_type(ptr_base_index.ty), ptr_base_index)
.def(self);
original_indices.extend(indices);
return self.emit_access_chain(
result_type,
original_ptr,
None,
original_indices,
is_inbounds,
);
trace!("has original access chain, attempting to merge GEPs");

// Check if merging is possible. Requires:
// 1. The original AccessChain had at least one index.
// 2. The *last* index of the original AccessChain is a constant.
// 3. The *first* index (`ptr_base_index`) of the *current* GEP is a constant.
// Merging usually involves adding these two constant indices.
@eddyb (Collaborator):

Wait, you replaced a potentially-runtime addition which allows array indexing GEPs to merge, with a constant-only addition?

How could that help? If they were constants, the outcome is unchanged, you're still using add, and if they were not constants, not merging now means it hits the worst path and I can't think of any reason why that would result in better recovery downstream.


Changing this back to what it was before doesn't impact tests, btw, so that's another instance of that.

I am almost certain now that the weird !is_byte_gep || ... condition is the only actual change in this PR that impacted tests, and that's the one I don't understand yet.
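The merge being discussed (fold the OpPtrAccessChain's base into the last index of the preceding OpAccessChain) can be sketched with plain integers. This is a hypothetical helper, not the builder code: in the real code the indices are SPIR-V value ids and the addition is an emitted OpIAdd, which is exactly why it also works for runtime (non-constant) values.

```rust
// Sketch of rewriting:
//   OpAccessChain original_ptr [a, b, c]
//   OpPtrAccessChain ptr base [d, e, f]
// into:
//   OpAccessChain original_ptr [a, b, c + base, d, e, f]
fn merge_access_chains(original: &[u32], base: u32, new: &[u32]) -> Vec<u32> {
    let (&last, init) = original.split_last().expect("original chain is non-empty");
    let mut merged = init.to_vec();
    merged.push(last + base);      // c + base
    merged.extend_from_slice(new); // d, e, f
    merged
}

fn main() {
    // [a=0, b=1, c=2] merged with base=3 and [d=4, e=5]:
    assert_eq!(merge_access_chains(&[0, 1, 2], 3, &[4, 5]), vec![0, 1, 5, 4, 5]);
}
```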

@LegNeato (Collaborator, Author) commented May 7, 2025

@eddyb feel free to fix or back out whatever you want!

The new code flow made more sense to me, and I confirmed a bunch of stuff with traces in various contexts. But I am still grappling with the underlying model and how best to apply it. I've paged this diff out of my brain (finally!), so I'd have to play around with it again to give reasons for the choices I made. I will say this was something like the 5th revision/attempt, so it wasn't as if I had an overarching strategy... I just tried to understand the code's overall goal and make it work without thinking about the best way to do things. It also wasn't clear where best to make changes... there are a lot of moving pieces in this code that all rely on each other. (rakes, as you say!)

Feel free to just ignore this and focus on qptr if you want, as it is a bit of a side quest that will be obsoleted by your work (AFAIK).

Successfully merging this pull request may close these issues.

Regression on main when allocating arrays of arrays