stable/20240723: [IR][AArch64] Support `swiftcoro` CC and param attr, and `ret.popless` intrinsic. #10155

ahmedbougacha · 2025-03-04T03:35:20Z

This implements the "popless" return, for use with the new coroutine accessor ABI. At a high level, that's done with the swiftcorocc calling convention, the swiftcoro parameter attribute, and, most importantly, the llvm.ret.popless intrinsic (currently only allowed on AArch64, in swiftcorocc functions).

[IR] Define 'swiftcorocc' calling convention.

The 'swiftcorocc' calling convention is a variant of 'swiftcc', but
additionally allows the 'swiftcorocc' function to have popless returns.

"popless" returns don't fully restore the stack, thereby allowing the
caller to access some stack allocations made in the 'swiftcorocc'
callee.

Calls to these functions don't restore SP (but do restore FP).

So the most important characteristic of a 'swiftcorocc' call is that it
forces the caller function to access its stack through FP, like it does
with e.g., variable-size allocas.

This patch only implements the 'swiftcorocc' keyword and CallingConv,
but doesn't implement its support on any target yet.
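
As a rough illustration of why a popless return pushes the caller towards FP-based addressing, here is a toy model of a downward-growing stack. It is not LLVM code: `Machine`, `normal_callee`, `popless_callee` and `caller` are hypothetical names made up for this sketch, and the frame sizes are arbitrary.

```python
# Toy model only; none of these names exist in LLVM.
class Machine:
    def __init__(self, sp=0x1000):
        self.sp = sp      # stack pointer
        self.fp = sp      # frame pointer
        self.mem = {}     # address -> value


def normal_callee(m, size):
    entry_sp = m.sp
    m.sp -= size          # allocate the callee's locals
    m.sp = entry_sp       # ordinary return: SP goes back to its entry value


def popless_callee(m, size):
    m.sp -= size          # allocate the callee's locals
    m.mem[m.sp] = "callee data"
    # popless return: SP is deliberately left where it is


def caller(m, callee, size):
    m.fp = m.sp                       # FP marks the caller's frame
    m.sp -= 16                        # one caller local below FP
    local_addr = m.fp - 16
    m.mem[local_addr] = "caller local"
    sp_offset = local_addr - m.sp     # SP-relative offset, fixed before the call
    callee(m, size)
    via_sp = m.mem.get(m.sp + sp_offset)   # stale if SP moved
    via_fp = m.mem.get(m.fp - 16)          # unaffected by SP
    return via_sp, via_fp
```

With `normal_callee`, both accesses find the caller's local. With `popless_callee`, the SP-relative access lands in the callee's leftover allocation, while the FP-relative one still finds the caller's local.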

[AArch64] Support 'swiftcorocc' "popless" calls.

'swiftcorocc' calls are allowed to have "popless" returns, which don't
fully restore the stack, thereby allowing the caller to access some
stack allocations made in the 'swiftcorocc' callee.

Concretely, calls to these functions don't restore SP (but do restore FP).

So the most important characteristic of a 'swiftcorocc' call is that it
forces the caller function to access its stack through FP, like it does
with e.g., variable-size allocas.

Support this on AArch64 by marking the frame as having a popless
call, which we generally honor when we decide whether the frame needs
FP and FP-based addressing, as we do today for variably-sized allocas.
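
The decision described above can be pictured as one more input to a "does this frame need FP" predicate. This is only a sketch under assumptions: the function name and its parameters are hypothetical and do not correspond to the actual AArch64 frame lowering code, which takes more conditions into account.

```python
# Hypothetical predicate; the real backend considers more inputs than these.
def needs_frame_pointer(has_var_sized_objects=False,
                        has_popless_call=False,
                        fp_forced=False):
    # A popless call is treated like a variably-sized alloca:
    # SP is no longer at a statically known distance from the frame.
    return fp_forced or has_var_sized_objects or has_popless_call
```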

rdar://135984630

[IR][AArch64] Add 'swiftcoro' parameter attribute.

It doesn't have any really interesting treatment, other than
being passed in a fixed register.

In most of our AArch64 calling conventions, that's X23.

In effect, this is mostly similar to swiftself.

rdar://135984630

[IR] Define @llvm.ret.popless intrinsic, a ret that doesn't restore SP.

Marks the following ret instruction as a "popless" return, one that does
not restore SP to its function-entry value (i.e., does not
deallocate the stack frame), allowing allocations made in the function
to be accessible by the caller.

The function must be annotated with an appropriate target-specific
calling convention, so the caller can generate stack accesses
accordingly, generally by treating the call as a variably-sized alloca,
so using FP-based addressing for its own frame rather than relying on
statically known SP offsets.

The single argument is forwarded as a return value, which must then be
used as the operand to the following ret instruction.

Calls to this intrinsic need to be musttail, but don't follow the other
ABI requirements for musttail calls, since this is really annotating the
ret.

This doesn't implement any lowering, but only adds the intrinsic
definition, basic verifier checks, and an inliner opt-out.
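
The constraints listed in this commit message can be sketched as a small checker over a toy instruction list. This is an illustration only: the dictionary-based instruction format and `check_popless` are hypothetical, the exact checks in the real verifier are not spelled out here, and "immediately followed by a ret" is an assumption about how "the following ret" is enforced.

```python
POPLESS = "llvm.ret.popless"


def check_popless(block, calling_conv):
    """Return error strings for one basic block.

    `block` is a list of dicts such as
    {"op": "call", "callee": "llvm.ret.popless", "musttail": True}
    or {"op": "ret"}. This format is made up for the sketch.
    """
    errors = []
    for i, inst in enumerate(block):
        if inst.get("op") != "call" or inst.get("callee") != POPLESS:
            continue
        if not inst.get("musttail", False):
            errors.append("llvm.ret.popless call must be musttail")
        nxt = block[i + 1] if i + 1 < len(block) else None
        if nxt is None or nxt.get("op") != "ret":
            errors.append("llvm.ret.popless must be followed by a ret")
        # Assumption: only swiftcorocc counts as an appropriate convention.
        if calling_conv != "swiftcorocc":
            errors.append("function needs a calling convention that allows popless returns")
    return errors
```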

rdar://135984630

[IR] Don't DCE llvm.ret.popless.

We originally had the intrinsic forward its return value to the ret to
have musttail-like behavior, which ensured it was always preserved.
Now that the intrinsic call is musttail but doesn't have any forwarded
operands, it needs to be kept alive through other means.

It might make sense to mark it as having side effects, and not
duplicable, but that shouldn't be necessary, and it's as duplicable
as any musttail call+ret sequence would be.

Because of this, we can't rely on it being DCE'd in ISel either, so drop
it explicitly in IRTranslator for GISel. We already had to do it in
SDISel anyway. While there, explicitly reject it in FastISel.

rdar://147236255

[AArch64] Lower @llvm.ret.popless in swiftcorocc functions.

On AArch64, swiftcorocc functions are so far the only functions that can
support popless returns.

In the backend, that's done by recognizing the musttail call to
llvm.ret.popless preceding a ret instruction, and asking the target to
adjust that ret to be popless.

Throughout most of the backend, that's not an interesting difference.

In frame lowering, these popless rets now induce several special
behaviors in their (never shrink-wrapped) epilogues, all consequences
of not restoring SP:

- they of course don't do the SP adjustment or restore itself.
- most importantly, they force the epilogue callee-save restores
  to be FP-based rather than SP-based.
- they restore FP/LR last, as we still need the old FP, pointing
  at the frame being destroyed, to do the CSR restoring.
- with ptrauth-returns, they first derive the entry SP from
  FP, into X16, to use as a discriminator for a standalone AUTIB.
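
The ordering constraints in the list above can be summarized with a toy step generator. It emits symbolic step names, not real assembly; `epilogue` and its parameters are hypothetical, and the non-popless branch is a deliberately simplified stand-in for an ordinary epilogue (ptrauth handling is left out of it).

```python
def epilogue(csrs, popless, ptrauth_returns=False):
    """Return an ordered list of symbolic epilogue steps (not real assembly)."""
    steps = []
    if not popless:
        # Simplified ordinary shape: SP-based restores, then pop the frame.
        steps += [f"restore {r} via SP" for r in csrs]
        steps += ["restore FP/LR via SP", "restore SP", "ret"]
        return steps
    if ptrauth_returns:
        # The entry SP is no longer in SP, so recompute it from FP first.
        steps.append("X16 = entry SP derived from FP")
    # Callee-saved registers are reloaded relative to FP...
    steps += [f"restore {r} via FP" for r in csrs]
    # ...so FP/LR themselves must be restored last.
    steps.append("restore FP/LR via FP")
    if ptrauth_returns:
        steps.append("AUTIB on LR with X16 as discriminator")
    steps.append("ret")   # no SP adjustment anywhere on this path
    return steps
```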

rdar://135984630

[AArch64] Fix offset in FP-based epilogue restore for popless ret.

In a swiftcorocc function, on the restoreless epilogue path (using
llvm.ret.popless), we're using FP-based addressing to restore
callee-saved registers, as we can't rely on SP having been restored to
its initial value, since we're not restoring it at all.

FP-based CSR restore is novel and bound to find interesting divergence
from all of our existing epilogues.

In this case, at least, the problem is pretty simple, and was even
visible in one of the original test cases: the offset was missing the
statically-sized locals. I haven't gotten to the point of convincing
myself this is sufficient yet, and I'm confident I'm missing some other
convoluted PEI-ism, but with this we can actually successfully run
a bunch of end-to-end swift tests!

While there, add an assert that checks that the FP/LR frame record
itself is only ever loaded from FP+0, without an offset. If there's an
offset from FP, we must have goofed somewhere, since that breaks the
frame record linked list.
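
To see why the frame record has to sit at FP+0, consider a toy stack walker that follows the FP chain. This is a sketch, not the backend's assert: `load_frame_record` and `walk_frame_records` are hypothetical helpers, and memory is modeled as a dictionary from an address to a (previous FP, return address) pair.

```python
def load_frame_record(mem, fp, offset=0):
    # Mirrors the idea of the assert: the record is only ever read at FP+0.
    assert offset == 0, "frame record must be loaded from FP+0"
    return mem[fp + offset]


def walk_frame_records(mem, fp):
    """Follow the linked list of frame records, collecting return addresses."""
    return_addrs = []
    while fp != 0:
        prev_fp, lr = load_frame_record(mem, fp)
        return_addrs.append(lr)
        fp = prev_fp
    return return_addrs
```

A walker like this only works if every FP points exactly at a record; a record stored at some offset from FP would break the chain for any tool that follows it.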

rdar://147838968

ahmedbougacha · 2025-03-04T03:37:39Z

@swift-ci test

ahmedbougacha · 2025-03-05T17:21:53Z

@swift-ci test

nate-chandler · 2025-03-11T18:51:03Z

@swift-ci test

ahmedbougacha · 2025-03-27T01:42:18Z

@swift-ci test

ahmedbougacha · 2025-03-27T22:51:39Z

@swift-ci test macos

ahmedbougacha requested a review from a team as a code owner March 4, 2025 03:35

ahmedbougacha force-pushed the eng/PR-135984630-popless-ret-swiftcorocc-stable20240723 branch from 74c1c1e to 3974124 Compare March 5, 2025 17:21

nate-chandler mentioned this pull request Mar 6, 2025

[CoroutineAccessors] Adopt swiftcoro param attr. swiftlang/swift#79824

Merged

ahmedbougacha added 9 commits March 26, 2025 17:39

Revert "Define 'swiftcorocc' calling convention."

25ac3eb

This reverts commit 6c453dd.

Revert "Define @llvm.ret.popless intrinsic, a ret that doesn't restore SP."

6b96ba5

This reverts commit 39998ff.

[IR][AArch64] Add 'swiftcoro' parameter attribute.

9a3013e

It doesn't have any really interesting treatment, other than being passed in a fixed register. In most of our AArch64 calling conventions, that's X23. In effect, this is mostly similar to swiftself. rdar://135984630

ahmedbougacha force-pushed the eng/PR-135984630-popless-ret-swiftcorocc-stable20240723 branch from 3974124 to 72aa7ba Compare March 27, 2025 01:37

ahmedbougacha changed the title ~~stable/20240723: [IR][AArch64] Define 'swiftcorocc' calling convention and 'swiftcoro' parameter attribute.~~ stable/20240723: [IR][AArch64] Support 'swiftcoro' calling convention and parameter attribute, and llvm.ret.popless intrinsic. Mar 27, 2025

ahmedbougacha changed the title ~~stable/20240723: [IR][AArch64] Support swiftcoro calling convention and parameter attribute, and ret.popless intrinsic.~~ stable/20240723: [IR][AArch64] Support swiftcoro CC and param attr, and ret.popless intrinsic. Mar 27, 2025

ahmedbougacha merged commit 8fc6907 into stable/20240723 Mar 28, 2025
3 checks passed

ahmedbougacha deleted the eng/PR-135984630-popless-ret-swiftcorocc-stable20240723 branch March 28, 2025 20:31
