forked from llvm/llvm-project
stable/20240723: [IR][AArch64] Support swiftcoro CC and param attr, and ret.popless intrinsic. #10155
Merged: ahmedbougacha merged 9 commits into stable/20240723 from eng/PR-135984630-popless-ret-swiftcorocc-stable20240723 on Mar 28, 2025
+727 −14
@swift-ci test
74c1c1e to 3974124
@swift-ci test
@swift-ci test
This reverts commit 6c453dd.
…e SP." This reverts commit 39998ff.
3974124 to 72aa7ba
@swift-ci test
@swift-ci test macos
This implements the "popless" return, enabling its use in conjunction with the new coroutine accessor ABI. At a high level, that's done with the swiftcorocc calling convention, the swiftcoro parameter attribute, and most importantly the llvm.ret.popless intrinsic (which is currently only allowed on AArch64 for swiftcorocc functions).

[IR] Define 'swiftcorocc' calling convention.
The 'swiftcorocc' calling convention is a variant of 'swiftcc', but
additionally allows the 'swiftcorocc' function to have popless returns.
"popless" returns don't fully restore the stack, thereby allowing the
caller to access some stack allocations made in the 'swiftcorocc'
callee.
Calls to these functions don't restore SP (but do restore FP).
So the most important characteristic of a 'swiftcorocc' call is that it
forces the caller function to access its stack through FP, like it does
with e.g., variable-size allocas.
This patch only implements the 'swiftcorocc' keyword and CallingConv,
but doesn't implement its support on any target yet.
[AArch64] Support 'swiftcorocc' "popless" calls.
'swiftcorocc' calls are allowed to have "popless" returns, which don't
fully restore the stack, thereby allowing the caller to access some
stack allocations made in the 'swiftcorocc' callee.
Concretely, calls to these functions don't restore SP (but do restore FP).
So the most important characteristic of a 'swiftcorocc' call is that it
forces the caller function to access its stack through FP, like it does
with e.g., variable-size allocas.
Support this on AArch64 by marking the frame as having a popless
call, which we generally honor when we decide whether the frame needs
FP and FP-based addressing, as we do today for variably-sized allocas.
rdar://135984630
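
To make the FP-versus-SP point concrete, here is a minimal toy model in C++. It is not LLVM or AArch64 code: `ToyMachine`, `poplessCallee`, and `runCaller` are hypothetical names invented for illustration, and the "stack" is just a vector of slots. It only sketches why a caller that keeps using a precomputed SP offset after a popless call would read the wrong slot, while an FP offset stays valid.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy machine: a downward-growing stack addressed by slot index.
// Illustrative model only; all names here are hypothetical.
struct ToyMachine {
    std::vector<std::int64_t> stack;
    std::size_t sp;  // stack pointer: index of the lowest allocated slot
    std::size_t fp;  // frame pointer: fixed for the lifetime of a frame

    explicit ToyMachine(std::size_t slots)
        : stack(slots, 0), sp(slots), fp(slots) {}

    // Allocate n slots by moving SP down; returns the new SP.
    std::size_t alloc(std::size_t n) {
        assert(sp >= n && "toy stack overflow");
        sp -= n;
        return sp;
    }
};

// "Popless" callee: allocates one slot, writes to it, and returns
// WITHOUT restoring SP, so the slot stays live for the caller.
// FP is restored to the caller's value.
std::size_t poplessCallee(ToyMachine &m, std::int64_t value) {
    std::size_t callerFP = m.fp;
    m.fp = m.sp;                    // establish the callee frame
    std::size_t slot = m.alloc(1);
    m.stack[slot] = value;
    m.fp = callerFP;                // restore FP, leave SP where it is
    return slot;
}

struct CallerResult {
    std::int64_t viaFP;             // caller local read through FP
    std::int64_t viaStaleSPOffset;  // same local read through the old SP offset
    std::int64_t calleeValue;       // callee allocation, still accessible
};

CallerResult runCaller() {
    ToyMachine m(16);
    m.fp = m.sp;                          // caller frame starts here
    std::size_t local = m.alloc(2);       // two caller locals
    m.stack[local] = 111;
    std::size_t fpOffset = m.fp - local;  // offset of the local below FP
    std::size_t spOffset = local - m.sp;  // offset of the local above SP

    std::size_t calleeSlot = poplessCallee(m, 999);  // SP has moved down

    return CallerResult{m.stack[m.fp - fpOffset],
                        m.stack[m.sp + spOffset],
                        m.stack[calleeSlot]};
}
```

Calling `runCaller()` in this model yields the caller's own value through the FP-relative access, while the stale SP-relative access lands on the slot the callee left allocated, which is the situation the frame-lowering change is meant to rule out.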
[IR][AArch64] Add 'swiftcoro' parameter attribute.
It doesn't have any really interesting treatment, other than
being passed in a fixed register.
In most of our AArch64 calling conventions, that's X23.
In effect, this is mostly similar to swiftself.
rdar://135984630
[IR] Define @llvm.ret.popless intrinsic, a ret that doesn't restore SP.
Marks the following ret instruction as a "popless" return, one that does
not restore SP to its function-entry value (i.e., does not
deallocate the stack frame), allowing allocations made in the function
to be accessible by the caller.
The function must be annotated with an appropriate target-specific
calling convention, so the caller can generate stack accesses
accordingly, generally by treating the call as a variably-sized alloca,
so using FP-based addressing for its own frame rather than relying on
statically known SP offsets.
The single argument is forwarded as a return value, that must then be
used as the operand to the following ret instruction.
Calls to this intrinsic need to be musttail, but don't follow the other
ABI requirements for musttail calls, since this is really annotating the
ret.
This doesn't implement any lowering, but only adds the intrinsic
definition, basic verifier checks, and an inliner opt-out.
rdar://135984630
[IR] Don't DCE llvm.ret.popless.
We originally had the intrinsic forward its return value to the ret to
have musttail-like behavior, which ensured it was always preserved.
Now that the intrinsic call is musttail but doesn't have any forwarded
operands, it needs to be kept alive through other means.
It might make sense to mark it as having side effects, and not
duplicable, but that shouldn't be necessary, and it's as duplicable
as any musttail call+ret sequence would be.
Because of this, we can't rely on it being DCE'd in ISel either, so drop
it explicitly in IRTranslator for GISel. We already had to do it in
SDISel anyway. While there, explicitly reject it in FastISel.
rdar://147236255
[AArch64] Lower @llvm.ret.popless in swiftcorocc functions.
On AArch64, swiftcorocc functions are the only functions yet that can
support popless returns.
In the backend, that's done by recognizing the musttail call to
llvm.ret.popless preceding a ret instruction, and asking the target to
adjust that ret to be popless.
Throughout most of the backend, that's not an interesting difference.
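In frame lowering, these popless rets now induce several special
behaviors in their (never shrink-wrapped) epilogues, all consequences
of not restoring SP:
- they of course don't do the SP adjustment or restore itself.
- most importantly, they force the epilogue callee-save restores
  to be FP-based rather than SP-based.
- they restore FP/LR last, as we still need the old FP, pointing
  at the frame being destroyed, to do the CSR restoring.
- with ptrauth-returns, they first derive the entry SP from
  FP, into X16, to use as a discriminator for a standalone AUTIB.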
rdar://135984630
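
The ordering constraint in the list above (callee-saved registers first, FP/LR last) can be sketched with a toy register file. This is a model under stated assumptions, not the AArch64 frame lowering code: `ToyCPU`, `buildFrame`, `poplessEpilogue`, `wrongOrderEpilogue`, and `runEpilogue` are hypothetical names, and the frame layout (x19/x20 saved just below the frame record) is assumed purely for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Toy register file and memory; addresses are plain integers.
struct ToyCPU {
    std::int64_t fp = 0, lr = 0, x19 = 0, x20 = 0;
    std::map<std::int64_t, std::int64_t> mem;
};

// Assumed layout, for illustration only:
//   [fp + 0]   saved caller FP  \ frame record
//   [fp + 8]   saved LR         /
//   [fp - 8]   saved x19
//   [fp - 16]  saved x20
void buildFrame(ToyCPU &cpu, std::int64_t frameFP) {
    assert(frameFP != 0 && "toy model uses a non-null frame address");
    cpu.mem[frameFP + 0] = 0x9000;  // caller's FP
    cpu.mem[frameFP + 8] = 0x4242;  // return address
    cpu.mem[frameFP - 8] = 19;      // saved x19
    cpu.mem[frameFP - 16] = 20;     // saved x20
    cpu.fp = frameFP;
}

// Popless-style epilogue: restore CSRs through FP, then FP/LR last.
// SP is never read or written.
void poplessEpilogue(ToyCPU &cpu) {
    cpu.x19 = cpu.mem[cpu.fp - 8];
    cpu.x20 = cpu.mem[cpu.fp - 16];
    std::int64_t base = cpu.fp;     // frame record lives at FP+0
    cpu.fp = cpu.mem[base + 0];
    cpu.lr = cpu.mem[base + 8];
}

// For contrast: reloading FP first loses the only base address that
// still points at the frame being destroyed.
void wrongOrderEpilogue(ToyCPU &cpu) {
    std::int64_t base = cpu.fp;
    cpu.fp = cpu.mem[base + 0];
    cpu.lr = cpu.mem[base + 8];
    cpu.x19 = cpu.mem[cpu.fp - 8];   // now addresses the caller's frame
    cpu.x20 = cpu.mem[cpu.fp - 16];
}

ToyCPU runEpilogue(bool restoreFrameRecordLast) {
    ToyCPU cpu;
    buildFrame(cpu, 0x8000);
    if (restoreFrameRecordLast)
        poplessEpilogue(cpu);
    else
        wrongOrderEpilogue(cpu);
    return cpu;
}
```

In this model `runEpilogue(true)` recovers both saved registers and the caller's FP/LR, whereas `runEpilogue(false)` reads the callee-saved slots relative to the wrong frame. With SP unavailable as a base, FP has to stay intact until every other restore is done.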
[AArch64] Fix offset in FP-based epilogue restore for popless ret.
In a swiftcorocc function, on the restoreless epilogue path (using
llvm.ret.popless), we're using FP-based addressing to restore
callee-saved registers, as we can't rely on SP having been restored to
its initial value, since we're not restoring it at all.
FP-based CSR restore is novel and bound to find interesting divergence
from all of our existing epilogues.
In this case, at least the problem is pretty simple, and was even
visible in one of the original test cases: we were missing the
statically-sized locals. I haven't gotten to the point of convincing
myself this is sufficient yet, and I'm confident I'm missing some other
convoluted PEI-ism, but with this we can actually successfully run
a bunch of end-to-end Swift tests!
While there, add an assert that checks that the FP/LR frame record
itself is only ever loaded from FP+0, without an offset. If there's an
offset from FP, we must have goofed somewhere, since that breaks the
frame record linked list.
rdar://147838968
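
As a rough illustration of why the frame record has to be reloaded from FP+0, the sketch below walks a chain of frame records in a toy memory. It assumes the usual AArch64 frame-record shape (previous FP at offset 0, return address at offset 8); `Memory`, `pushFrameRecord`, `sampleChain`, and `walkReturnAddresses` are hypothetical names, and the `loadOffset` parameter exists only to model a mistaken nonzero offset.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

using Addr = std::uint64_t;
using Memory = std::map<Addr, std::uint64_t>;

// Write one frame record: [at + 0] = previous FP, [at + 8] = return address.
void pushFrameRecord(Memory &mem, Addr at, Addr prevFP, std::uint64_t lr) {
    assert(at != 0 && "address 0 terminates the chain in this model");
    mem[at] = prevFP;
    mem[at + 8] = lr;
}

// Three nested frames: 0x1000 -> 0x2000 -> 0x3000 -> end of chain.
Memory sampleChain() {
    Memory mem;
    pushFrameRecord(mem, 0x1000, 0x2000, 0xA);
    pushFrameRecord(mem, 0x2000, 0x3000, 0xB);
    pushFrameRecord(mem, 0x3000, 0, 0xC);
    return mem;
}

// Follow the linked list of frame records starting at `fp`, collecting
// return addresses. `loadOffset` models reading the record from
// FP + loadOffset instead of FP + 0.
std::vector<std::uint64_t> walkReturnAddresses(const Memory &mem, Addr fp,
                                               Addr loadOffset) {
    std::vector<std::uint64_t> out;
    const std::size_t maxDepth = 16;  // guard against cycles in bad input
    while (fp != 0 && out.size() < maxDepth) {
        auto prev = mem.find(fp + loadOffset);
        auto ret = mem.find(fp + loadOffset + 8);
        if (prev == mem.end() || ret == mem.end())
            break;                    // no record here: the chain is broken
        out.push_back(ret->second);
        fp = prev->second;
    }
    return out;
}
```

With `loadOffset` of 0 the walk over `sampleChain()` visits all three records; with any offset that does not land on a record, it stops immediately. That is the toy analogue of the broken linked list the new assert guards against.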