stable/20240723: [IR][AArch64] Support swiftcoro CC and param attr, and ret.popless intrinsic. #10155

@ahmedbougacha ahmedbougacha commented Mar 4, 2025

This implements the "popless" return, enabling its use with the new coroutine accessor ABI. At a high level, that's done with the swiftcorocc calling convention, the swiftcoro parameter attribute, and most importantly the llvm.ret.popless intrinsic (which is currently only allowed on AArch64, in swiftcorocc functions).

[IR] Define 'swiftcorocc' calling convention.

The 'swiftcorocc' calling convention is a variant of 'swiftcc', but
additionally allows the 'swiftcorocc' function to have popless returns.

"popless" returns don't fully restore the stack, thereby allowing the
caller to access some stack allocations made in the 'swiftcorocc'
callee.

Calls to these functions don't restore SP (but do restore FP).

So the most important characteristic of a 'swiftcorocc' call is that it
forces the caller function to access its stack through FP, like it does
with e.g., variable-size allocas.
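
To illustrate why the caller has to address its frame through FP, here is a minimal sketch using a toy stack model. Addresses are plain integers, the frame layout and the 32-byte callee allocation are made up for the example, and none of the names (`caller`, `popless_callee`, `normal_callee`) correspond to LLVM APIs.

```python
# Toy model: the stack grows downward and addresses are plain integers.
# The layout and all names are illustrative assumptions, not LLVM code.

def normal_callee(sp):
    """Ordinary return: SP comes back at its pre-call value."""
    return sp

def popless_callee(sp):
    """Popless return: the callee leaves 32 bytes allocated below the caller's SP."""
    return sp - 32

def caller(callee):
    entry_sp = 0x1000
    fp = entry_sp - 16           # frame record (FP/LR) just below the entry SP
    sp = fp - 16                 # 16 bytes of locals below the frame record
    local_addr = fp - 8          # true address of one local

    sp_offset = local_addr - sp  # offset a compiler would bake in for SP-based access
    fp_offset = local_addr - fp  # offset a compiler would bake in for FP-based access

    sp = callee(sp)              # FP is restored across the call; SP may not be

    return local_addr, sp + sp_offset, fp + fp_offset
```

With `normal_callee`, both the SP-based and FP-based addresses match `local_addr`. With `popless_callee`, only the FP-based address is still correct, since the SP-based one is off by the callee's leftover allocation.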

This patch only implements the 'swiftcorocc' keyword and CallingConv,
but doesn't implement its support on any target yet.

[AArch64] Support 'swiftcorocc' "popless" calls.

'swiftcorocc' calls are allowed to have "popless" returns, which don't
fully restore the stack, thereby allowing the caller to access some
stack allocations made in the 'swiftcorocc' callee.

Concretely, calls to these functions don't restore SP (but do restore FP).

So the most important characteristic of a 'swiftcorocc' call is that it
forces the caller function to access its stack through FP, like it does
with e.g., variable-size allocas.

Support this on AArch64 by marking the frame as having a popless
call, which we generally honor when we decide whether the frame needs
FP and FP-based addressing, as we do today for variably-sized allocas.

rdar://135984630

[IR][AArch64] Add 'swiftcoro' parameter attribute.

It doesn't have any really interesting treatment, other than
being passed in a fixed register.

In most of our AArch64 calling conventions, that's X23.

In effect, this is mostly similar to swiftself.

rdar://135984630

[IR] Define @llvm.ret.popless intrinsic, a ret that doesn't restore SP.

Marks the following ret instruction as a "popless" return, one that does
not restore SP to its function-entry value (i.e., does not
deallocate the stack frame), allowing allocations made in the function
to be accessible by the caller.

The function must be annotated with an appropriate target-specific
calling convention, so the caller can generate stack accesses
accordingly, generally by treating the call as a variably-sized alloca,
so using FP-based addressing for its own frame rather than relying on
statically known SP offsets.

The single argument is forwarded as a return value, which must then be
used as the operand to the following ret instruction.

Calls to this intrinsic need to be musttail, but don't follow the other
ABI requirements for musttail calls, since this is really annotating the
ret.

This doesn't implement any lowering, but only adds the intrinsic
definition, basic verifier checks, and an inliner opt-out.
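
As a rough sketch of the kind of structural rule described here (the call is musttail and annotates the ret that follows it), the check below runs over a toy instruction list. The description doesn't spell out the real verifier checks, so the dict-based instruction format and the `check_popless` helper are assumptions made for illustration only.

```python
# Toy instruction stream: each instruction is a dict. This is not LLVM's IR
# data model, and check_popless is a hypothetical helper.

def check_popless(block):
    """Return a list of error strings for one basic block."""
    errors = []
    for i, inst in enumerate(block):
        if inst.get("callee") != "llvm.ret.popless":
            continue
        if not inst.get("musttail", False):
            errors.append("llvm.ret.popless call must be musttail")
        following = block[i + 1] if i + 1 < len(block) else None
        if following is None or following.get("op") != "ret":
            errors.append("llvm.ret.popless call must be followed by a ret")
    return errors
```

A block ending in a musttail call to `llvm.ret.popless` followed directly by a `ret` produces no errors. Dropping the musttail marker, or placing another instruction between the call and the `ret`, each produces one.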

rdar://135984630

[IR] Don't DCE llvm.ret.popless.

We originally had the intrinsic forward its return value to the ret to
have musttail-like behavior, which ensured it was always preserved.
Now that the intrinsic call is musttail but doesn't have any forwarded
operands, it needs to be kept alive through other means.

It might make sense to mark it as having side effects, and not
duplicable, but that shouldn't be necessary, and it's as duplicable
as any musttail call+ret sequence would be.

Because of this, we can't rely on it being DCE'd in ISel either, so drop
it explicitly in IRTranslator for GISel. We already had to do it in
SDISel anyway. While there, explicitly reject it in FastISel.

rdar://147236255

[AArch64] Lower @llvm.ret.popless in swiftcorocc functions.

On AArch64, swiftcorocc functions are so far the only functions that can
support popless returns.

In the backend, that's done by recognizing the musttail call to
llvm.ret.popless preceding a ret instruction, and asking the target to
adjust that ret to be popless.

Throughout most of the backend, that's not an interesting difference.

In frame lowering, these popless rets now induce several special
behaviors in their (never shrink-wrapped) epilogues, all consequences
of not restoring SP:

  • they of course don't adjust or restore SP itself.
  • most importantly, they force the epilogue callee-save restores
    to be FP-based rather than SP-based.
  • they restore FP/LR last, as we still need the old FP, pointing
    at the frame being destroyed, to do the CSR restoring.
  • with ptrauth-returns, they first derive the entry SP from
    FP, into X16, to use as a discriminator for a standalone AUTIB.

rdar://135984630

[AArch64] Fix offset in FP-based epilogue restore for popless ret.

In a swiftcorocc function, on the restoreless epilogue path (using
llvm.ret.popless), we're using FP-based addressing to restore
callee-saved registers, as we can't rely on SP having been restored to
its initial value, since we're not restoring it at all.

FP-based CSR restore is novel and bound to find interesting divergence
from all of our existing epilogues.

In this case, at least the problem is pretty simple, and was even
visible in one of the original test cases: we were missing the
statically-sized locals. I haven't gotten to the point of convincing
myself this is sufficient yet, and I'm confident I'm missing some other
convoluted PEI-ism, but with this we can actually successfully run
a bunch of end-to-end swift tests!

While there, add an assert that checks that the FP/LR frame record
itself is only ever loaded from FP+0, without an offset. If there's an
offset from FP, we must have goofed somewhere, since that breaks the
frame record linked list.
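
To show why the frame record has to be read at FP+0, here is a small sketch of the frame record chain, with memory modeled as a dict from address to value. The AArch64 frame record holds the previous FP followed by the saved LR. The helper names and addresses are made up for the example.

```python
# Toy memory: a dict from address to value. Each frame record is two 8-byte
# slots: [fp] holds the caller's FP, [fp + 8] holds the saved LR.
# walk_frames and restore_frame_record are hypothetical helpers.

def walk_frames(memory, fp):
    """Follow the frame-record linked list and collect return addresses."""
    return_addresses = []
    while fp != 0:
        return_addresses.append(memory[fp + 8])
        fp = memory[fp]
    return return_addresses

def restore_frame_record(memory, fp, offset=0):
    """Reload FP/LR; any nonzero offset would read something other than the record."""
    assert offset == 0, "frame record must be loaded from FP+0"
    return memory[fp + offset], memory[fp + offset + 8]

memory = {0x100: 0x200, 0x108: "ret_a", 0x200: 0, 0x208: "ret_b"}
```

Here `walk_frames(memory, 0x100)` visits both frames, and `restore_frame_record(memory, 0x100)` returns the caller's FP and the saved return address. Passing a nonzero `offset` trips the assert, mirroring the check described above.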

rdar://147838968

@ahmedbougacha ahmedbougacha requested a review from a team as a code owner March 4, 2025 03:35
@ahmedbougacha
Author

@swift-ci test

@ahmedbougacha ahmedbougacha force-pushed the eng/PR-135984630-popless-ret-swiftcorocc-stable20240723 branch from 74c1c1e to 3974124 Compare March 5, 2025 17:21
@ahmedbougacha
Author

@swift-ci test

@nate-chandler
@swift-ci test

@ahmedbougacha ahmedbougacha force-pushed the eng/PR-135984630-popless-ret-swiftcorocc-stable20240723 branch from 3974124 to 72aa7ba Compare March 27, 2025 01:37
@ahmedbougacha
Author

@swift-ci test

@ahmedbougacha
Author

@swift-ci test macos

@ahmedbougacha ahmedbougacha changed the title stable/20240723: [IR][AArch64] Define 'swiftcorocc' calling convention and 'swiftcoro' parameter attribute. stable/20240723: [IR][AArch64] Support 'swiftcoro' calling convention and parameter attribute, and llvm.ret.popless intrinsic. Mar 27, 2025
@ahmedbougacha ahmedbougacha changed the title stable/20240723: [IR][AArch64] Support 'swiftcoro' calling convention and parameter attribute, and llvm.ret.popless intrinsic. stable/20240723: [IR][AArch64] Support swiftcoro calling convention and parameter attribute, and ret.popless intrinsic. Mar 27, 2025
@ahmedbougacha ahmedbougacha changed the title stable/20240723: [IR][AArch64] Support swiftcoro calling convention and parameter attribute, and ret.popless intrinsic. stable/20240723: [IR][AArch64] Support swiftcoro CC and param attr, and ret.popless intrinsic. Mar 27, 2025
@ahmedbougacha ahmedbougacha merged commit 8fc6907 into stable/20240723 Mar 28, 2025
3 checks passed
@ahmedbougacha ahmedbougacha deleted the eng/PR-135984630-popless-ret-swiftcorocc-stable20240723 branch March 28, 2025 20:31