Skip to content

Add #[inline] to memfd_create to delay codegen and prevent undefined symbols in dylibs #2049

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 11, 2023

Conversation

saethlin
Copy link
Contributor

@saethlin saethlin commented Jun 2, 2023

I think this is related to #1972

I want this because I'm trying to use ctrlc in the compiler (rust-lang/rust#111769), which uses this crate, and the feature set that ctrlc uses pulls in this function.

The compiler is primarily built as a shared object, and it is linked against an old version of glibc (2.17) so there is no memfd_create symbol available when it is built. Adding #[inline] causes this function to be codegenned differently; it is kept as MIR then since it is in fact dead, we never actually codegen it. Without #[inline] this function is eagerly codegenned then since dylibs have poor/no dead code elimination, we end up with an undefined symbol.

@saethlin saethlin changed the title Add #[inline] to memfd_create Add #[inline] to memfd_create (so that this can be used in the compiler) Jun 2, 2023
@saethlin saethlin changed the title Add #[inline] to memfd_create (so that this can be used in the compiler) Always use a raw syscall for memfd Jun 3, 2023
@saethlin saethlin changed the title Always use a raw syscall for memfd Use a raw syscall for memfd_create on gnu Jun 3, 2023
@saxon-parker
Copy link

Are there plans to merge this MR? I've also hit this issue and would appreciate this change being merged.

@asomers
Copy link
Member

asomers commented Aug 2, 2023

No, because I still think it's the wrong solution. As described in the issue, I think it's a linker problem, and we need somebody to explore the linker more.

@saethlin
Copy link
Contributor Author

saethlin commented Aug 2, 2023

I agree with you, when nix is being used in a binary. But rustc is compiled as a shared library so linker dead code elimination is significantly less effective.

@asomers
Copy link
Member

asomers commented Aug 2, 2023

Wait, rustc itself is consuming Nix? That's surprising. It seems like a dependency loop situation. I thought that rustc had very few external dependencies.

@saethlin
Copy link
Contributor Author

saethlin commented Aug 2, 2023

I am trying to add this crate to the compiler's list of third-party dependencies. This crate is a dependency of ctrlc which is a very nice way to set up the signal handling that I need for the rustc PR that I linked in the description of this PR: rust-lang/rust#111769

I don't think it's quite accurate to say that the compiler has very few external dependencies. It certainly does not have a lot and dependencies are added rather carefully, but everyone who works on the compiler and the core tooling have done a lot of work to enable people to build a robust package ecosystem, so it doesn't really make sense to deprive ourselves of the fruits of our labor.

There's no dependency loop, for the same reason that we can write the Rust compiler in Rust. The new compiler will depend on nix but the old one will not, which we will use to build the new one.

@saxon-parker
Copy link

My use case is similar to the one mentioned in the ticket. I'm building a static library in Rust that exposes C bindings for downstream C/C++ libraries to use. The consumers of my library build on CentOS7 with glibc 2.17, which causes a linking failure when using my library due to the missing memfd_create symbol.

I would say this is more of a portability issue than a linker issue. Using the memfd_create function forces a minimum libc version of 2.27 on libraries that use this crate.

@asomers
Copy link
Member

asomers commented Aug 2, 2023

My use case is similar to the one mentioned in the ticket. I'm building a static library in Rust that exposes C bindings for downstream C/C++ libraries to use. The consumers of my library build on CentOS7 with glibc 2.17, which causes a linking failure when using my library due to the missing memfd_create symbol.

This bug makes a little more sense if it only affects cdylib projects.

I would say this is more of a portability issue than a linker issue. Using the memfd_create function forces a minimum libc version of 2.27 on libraries that use this crate.

Do you mean glibc 2.27?

@asomers
Copy link
Member

asomers commented Aug 2, 2023

I still don't like hacking Nix to workaround a glibc bug. But looking closer, I realize that memfd_create probably shouldn't be gated by the fs feature. I'm not sure why it is. It's really closer to the mman feature. Not that it's a 100% solution, but would changing it's feature gate fix your problem, @saxon-parker ? I've already verified that it would fix the problem for the ctrlc crate.

@saethlin
Copy link
Contributor Author

saethlin commented Aug 3, 2023

This is not a glibc bug. We are both targeting CentOS 7, which predates memfd_create. That system call was added in Linux 3.17/glibc 2.27, and CentOS 7 is Linux 3.10/glibc 2.17.

@saethlin
Copy link
Contributor Author

saethlin commented Aug 3, 2023

I agree that memfd_create is probably under the wrong feature, but I didn't initially propose changing that because I figured it would be a breaking change. One possible way around that is to give the fs feature a sub-feature that turns on everything except memfd_create, then have ctrlc depend only on that new sub-feature (I was thinking fs-minimal or fs-compat or something like that).

(but if you're willing to do a little compatability break to fix a mistake, don't let me of all people discourage you)

@asomers
Copy link
Member

asomers commented Aug 3, 2023

This is not a glibc bug. We are both targeting CentOS 7, which predates memfd_create. That system call was added in Linux 3.17/glibc 2.27, and CentOS 7 is Linux 3.10/glibc 2.17.

Yeah, but new syscalls get added all the time. Why is this the only one that causes linker failures?

@saxon-parker
Copy link

I still don't like hacking Nix to workaround a glibc bug. But looking closer, I realize that memfd_create probably shouldn't be gated by the fs feature. I'm not sure why it is. It's really closer to the mman feature. Not that it's a 100% solution, but would changing it's feature gate fix your problem, @saxon-parker ? I've already verified that it would fix the problem for the ctrlc crate.

Unfortunately moving memfd_create to the mman feature wouldn't solve my problem. nix is brought into my library as a dependency of the mmap-rs crate, which uses the mman feature.

I can get around this issue in other ways, so if this MR doesn't get merged it's not a huge deal for me.

@saethlin
Copy link
Contributor Author

saethlin commented Aug 3, 2023

Yeah, but new syscalls get added all the time. Why is this the only one that causes linker failures?

The Rust standard library handles use of new system calls by using a combination of weak linkage and syscall: https://github.com/rust-lang/rust/blob/fcf3006e0133365ecd26894689c086387edcbecb/library/std/src/sys/unix/weak.rs

@asomers
Copy link
Member

asomers commented Aug 6, 2023

Yeah, but new syscalls get added all the time. Why is this the only one that causes linker failures?

The Rust standard library handles use of new system calls by using a combination of weak linkage and syscall: https://github.com/rust-lang/rust/blob/fcf3006e0133365ecd26894689c086387edcbecb/library/std/src/sys/unix/weak.rs

I mean that new syscalls get added to Nix all the time. For example, I can build a cdylib that uses Nix with the "fs" feature, which enables the fspacectl function, on FreeBSD13. That works even though the fspacectl wasn't added until FreeBSD 14. And I can link that cdylib into a running program without problems. That's why I think this bug involves linker settings on your Linux distro. Or else why is memfd_create the only syscall that causes this problem?

@saethlin
Copy link
Contributor Author

saethlin commented Aug 6, 2023

You must pass the linker flag --no-allow-shlib-undefined in order to get this linker error. I don't know the first thing about FreeBSD so I can't comment there, but the only syscall wrapper I can find that this crate has added since glibc 2.17 that is actually used in all the Rust programs installed on my machine (I just ran nm | grep over all of them) is copy_file_range, and this crate implements that by calling libc::syscall.

@asomers
Copy link
Member

asomers commented Aug 6, 2023

How do you pass that linker flag? For me, the following program builds fine as a cdylib on both FreeBSD and Linux. Can you cause it to fail with that flag?

mod ffi {
    extern "C" {
        pub fn does_not_exist() -> libc::c_int;
    }
}

pub extern "C" fn foo() -> i32 {
    unsafe { ffi::does_not_exist() }
}

@Noratrieb
Copy link

Noratrieb commented Aug 8, 2023

You must pass the linker flag --no-allow-shlib-undefined in order to get this linker error. I don't know the first thing about FreeBSD so I can't comment there, but the only syscall wrapper I can find that this crate has added since glibc 2.17 that is actually used in all the Rust programs installed on my machine (I just ran nm | grep over all of them) is copy_file_range, and this crate implements that by calling libc::syscall.

This does not appear to be true when actually linking the final binary using that shared library with missing symbols (on both my nixos and the debian rust:latest container). While the dylib builds fine unless I manually pass --no-undefined (with RUSTFLAGS=-Clink-arg=-Wl,--no-undefined), the final binary fails by default with both binutils ld and ld.lld.

So "undefined symbol in shared library" is an error by default with no linker configs involved.

Here's a repro: rust-lang/rust#111769 (comment)

@asomers
Copy link
Member

asomers commented Aug 11, 2023

Oooh, it's "dylib" crates that you're worried about. I thought it was "cdylib". Now that I try using a dylib I can reproduce the problem. But I like the #[inline] solution better than the syscall one, because it preserves type safety. It also has the advantage that it will rightly fail at link time on systems that lack memfd_create, if the crate actually tries to use that function. Could you restore the PR to that version please?

@asomers asomers added this to the 0.26.3 milestone Aug 11, 2023
@saethlin saethlin changed the title Use a raw syscall for memfd_create on gnu Add #[inline] to memfd_create to delay codegen and prevent undefined symbols in dylibs Aug 11, 2023
Copy link
Member

@asomers asomers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, assuming CI passes.

@asomers asomers added this pull request to the merge queue Aug 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants