-
Notifications
You must be signed in to change notification settings - Fork 53
Correctly calculate access chains for compound types #233
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
My stuff still works with this branch (+ nonuniform patches) 🎉 |
My project compiled just fine and passes tests 👍 . I also successfully built a shader with Awesome work! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM but take my review with a grain of salt 🧂 as I never got a chance to dive in very deep, but I have tested the results.
@eddyb said he wanted to take a look at this so I'll wait a bit before merging. |
Ok, I am just going to land it and we can fix forward or revert if it isn't correct. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was late on this and I'm not even sure I should leave this review.
I've got mixed feelings about it - don't mind all the tracing
usage and the comments, but it turns out that there are fewer functional changes than I expected - in fact, just one, and it doesn't make sense to me.
There's a non-trivial chance this is a case of "two wrongs make a right", or more specifically something else ends up reacting better to what seems to be invalid SPIR-V (AFAICT).
Now, too be fair, I finally looked at this because of:
And the simplest workaround for that seems to be "run the GEP merge logic twice, using a loop, and the second time use pointercast
first which can generate its own GEP via recover_access_chain_from_offset
" - so there's a chance the "byte GEP" stuff hits a similar weird combo.
In fact, funnily enough, I think we could've always done that pointercast
before the merge, because of the if ty == original_pointee_ty
that currently gates merging: if that condition holds, the pointercast
is a noop anyway.
(EDIT: yupp, that works, and it amounts to moving just two let
s slightly earlier in the function)
I should try to figure out the overall "decision trees" for maybe_inbounds_gep
and other recover_access_chain_from_offset
callers, then reduce duplicated work and handle all possible cases that can be statically resolved.
// FIXME(eddyb) this isn't efficient, `recover_access_chain_from_offset` | ||
// could instead be doing all the extra digging itself. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure what this comment meant exactly, but I suspect it still applies to the loop being a loop, and not a single call to recover_access_chain_from_offset
.
return Some((indices, ty)); | ||
} | ||
} | ||
} | ||
|
||
#[instrument(level = "trace", skip(self), fields(ty = ?self.debug_type(ty), ptr, combined_indices = ?combined_indices.iter().map(|x| (self.debug_type(x.ty), x.kind)).collect::<Vec<_>>(), is_inbounds))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Surprised rustfmt
didn't make this multi-line, I feel like it could be more readable if each entry in fields
was on its own line.
// Special case: If we started with a byte GEP (`is_byte_gep` is true) and | ||
// we are currently indexing into a byte type, the result is still a byte type. | ||
// This prevents errors if `indices` contains non-zero values when `ty` is u8/i8. | ||
SpirvType::Integer(8, signedness) if is_byte_gep => { | ||
// Define the resulting byte type as it might not exist yet). | ||
SpirvType::Integer(8, signedness).def(self.span(), self) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see how this could be reached - indices
can only dig into structs/arrays, and only a bug elsewhere would allow indices.len() > 0
when doing byte offsetting.
Locally removing this case does not fail any tests, so at the very least none were added that could hit it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh weird, I added it specifically because byte indexing was broken in tests. Perhaps a later change fixed this and I didn't notice.
// Search the current function's instructions... | ||
// FIXME(eddyb) this could get ridiculously expensive, at the very least | ||
// it could use `.rev()`, hoping the base pointer was recently defined? | ||
let search_result = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the extra variable? It just inflates the diff needlessly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I probably did this for logging which looks like it was removed
// Determine the result type for the `AccessChain` instruction we might | ||
// emit. By default, use the `final_spirv_ptr_type` strictly calculated | ||
// earlier from `ty` and `indices`. | ||
// | ||
// If this is a byte GEP *and* the recovered type happens to be a byte | ||
// type, we can use the pointer type derived from the *recovered* type | ||
// (`base_pointee_ty`). This helps preserve type information when | ||
// recovery works for byte addressing. | ||
let result_wrapper_type = if !is_byte_gep | ||
|| matches!(self.lookup_type(base_pointee_ty), SpirvType::Integer(8, _)) | ||
{ | ||
trace!( | ||
"Using strictly calculated type for wrapper: {}", | ||
// Use type based on input `ty` + `indices` | ||
self.debug_type(calculated_pointee_type) | ||
); | ||
final_spirv_ptr_type | ||
} else { | ||
trace!( | ||
"Byte GEP allowing recovered type for wrapper: {}", | ||
// Use type based on recovery result | ||
self.debug_type(base_pointee_ty) | ||
); | ||
self.type_ptr_to(base_pointee_ty) | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like the only addition to the "constant offset" path and I don't understand it.
You have to use final_spirv_ptr_type
(i.e. pointer-to-calculated_pointee_type
) because that's what the type has to be.
When indices
is empty, that does mean that pointer-to-base_pointee_ty
can be obtained and it has the right offset, even if it's not pointing to a ty
/calculated_pointee_type
(which are equal in the empty indices
case) - but there are two issues there:
- "
ty
is NOT byte, ORbase_pointee_ty
is byte" doesn't seem like it matches the comment, nor can it relate to "do we have the desired pointer, even if of the wrong type?"- maybe the actual issue here is that the "then" and "else" sides got flipped for no good reason, so the condition ends up negated, and should be
is_byte_gep && ...
?
- maybe the actual issue here is that the "then" and "else" sides got flipped for no good reason, so the condition ends up negated, and should be
- the caller definitely expects a value of type
final_spirv_ptr_type
, so if the access chain inherently has a different type, it must be cast to the right type (which would create aLogicalPtrCast
, and hopefully not create downstream issues)- to explain "inherently" a bit further: an access chain always has only one valid output type, given the input pointer type and the number of indices (and value, when dealing with struct fields), and we can't just put in a different type and have valid SPIR-V
My bad, I kept missing the let result_pointee_type = if indices.is_empty() {
in the old code, now it makes sense that there was something there (but I've kept my original comment for posterity).
That's exactly the condition I was describing earlier, where indices
being empty determines what the correct type is. However, I suppose the old code was lacking the pointercast
I mentioned above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This keeps issue-46
passing, but I will admit I don't know why it's valid:
@@ -785,8 +778,9 @@
// type, we can use the pointer type derived from the *recovered* type
// (`base_pointee_ty`). This helps preserve type information when
// recovery works for byte addressing.
- let result_wrapper_type = if !is_byte_gep
- || matches!(self.lookup_type(base_pointee_ty), SpirvType::Integer(8, _))
+ let result_wrapper_type = if !(is_byte_gep
+ && !matches!(self.lookup_type(base_pointee_ty), SpirvType::Integer(8, _))
+ && indices.is_empty())
{
trace!(
"Using strictly calculated type for wrapper: {}",
@@ -828,13 +822,17 @@
);
// Emit a single AccessChain using the original pointer `ptr_id` and the fully combined index list.
// Note: We don't pass `ptr_base_index` here because its effect is incorporated into `base_indices`.
- return self.emit_access_chain(
+ let mut final_ptr = self.emit_access_chain(
result_wrapper_type, // The chosen result pointer type
ptr_id, // The original base pointer ID
None, // No separate base index needed
combined_indices, // The combined structured + original indices
is_inbounds, // Preserve original inbounds request
);
+ if result_wrapper_type != final_spirv_ptr_type {
+ final_ptr = self.pointercast(final_ptr, final_spirv_ptr_type);
+ }
+ return final_ptr;
} else {
// Recovery happened, but the recovered type `base_pointee_ty` doesn't match the input `ty`,
// AND there are more `indices` to process. Using the `base_indices` derived from
I suppose the trick is that is_byte_gep
requires indices.is_empty()
(as bytes have no inner structure to further index), so:
- my
&& indices.is_empty()
is redundant (explains why nothing broke with your version) - you strengthened the condition for using
base_pointee_ty
, meaningfinal_spirv_ptr_type
is used even in cases where it's incorrect to do so (and a cast should have been used instead)- generating the cast to
final_spirv_ptr_type
doesn't help, which means... what exactly? that two wrongs make a right?
- generating the cast to
// HACK(eddyb): Workaround for potential upstream issues where pointers might lack precise type info. | ||
// FIXME(eddyb): Ideally, this should use untyped memory features if available/necessary. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These should've stayed next to the pointercast
- the new comments might not even make sense, because the cast is only due to the mismatch - anyway, it's all a historical mess, and lots of concepts don't fit the new (much more untyped) approach, sadly.
let search_result = { | ||
let emit = self.emit(); | ||
let module = emit.module_ref(); | ||
emit.selected_function().and_then(|func_idx| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This accounts for selected_function()
being None
- that should never be possible when you are in a method of a builder that will emit instructions into a selected block, not just function.
if let Some((original_ptr, mut original_indices)) = maybe_original_access_chain { | ||
// Transform the following: | ||
// OpAccessChain original_ptr [a, b, c] | ||
// OpPtrAccessChain ptr base [d, e, f] | ||
// into | ||
// OpAccessChain original_ptr [a, b, c + base, d, e, f] | ||
// to remove the need for OpPtrAccessChain | ||
let last = original_indices.last_mut().unwrap(); | ||
*last = self | ||
.add(last.with_type(ptr_base_index.ty), ptr_base_index) | ||
.def(self); | ||
original_indices.extend(indices); | ||
return self.emit_access_chain( | ||
result_type, | ||
original_ptr, | ||
None, | ||
original_indices, | ||
is_inbounds, | ||
); | ||
trace!("has original access chain, attempting to merge GEPs"); | ||
|
||
// Check if merging is possible. Requires: | ||
// 1. The original AccessChain had at least one index. | ||
// 2. The *last* index of the original AccessChain is a constant. | ||
// 3. The *first* index (`ptr_base_index`) of the *current* GEP is a constant. | ||
// Merging usually involves adding these two constant indices. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, you replaced a potentially-runtime addition which allows array indexing GEPs to merge, with a constant-only addition?
How could that help? If they were constants, the outcome is unchanged, you're still using add
, and if they were not constants, not merging now means it hits the worst path and I can't think of any reason why that would result in better recovery downstream.
Changing this back to what it was before doesn't impact tests, btw, so that's another instance of that.
I am almost certain now that the weird !is_byte_gep || ...
condition is the only actual change in this PR that impacted tests, and that's the one I don't understand yet.
@eddyb feel free to fix or back out whatever you want! The new code flow made more sense to me and I confirmed a bunch of stuff with traces in various contexts. But I am still grappling with the underlying model and how best to apply it. I've paged this diff out of my brain (finally!) so I'd have to play around with it again to give reasons why I made the choices I did. I will say this was like the 5th revision / attempt, so it wasn't like I had an overarching strategy...I just tried to understand the code's overall goal and make it work w/o thinking of the best way to do things. It also wasn't clear where best to make changes...there are a lot of moving pieces in this code that all rely on each other. (rakes as you say!) Feel free to just ignore this and focus on qptr if you want, as it is a bit of a side quest that will be obsoleted by your work (AFAIK). |
Fixes #46.
This was extremely painful and I am still not sure I understand the code completely and all the various cases.
@schell , @Firestar99 , please try it in your projects and see if I broke anything!