rustup-init.sh fails to detect platform correctly under docker buildx which lacks /proc #2700

Open
miigotu opened this issue Mar 26, 2021 · 18 comments · May be fixed by #3800
Labels
E-mentor enhancement help wanted O-bsd *BSD related O-containers Not really an OS, but container-specific
Milestone

Comments

@miigotu

miigotu commented Mar 26, 2021

rustup-init.sh installs the incorrect rustc and other binaries because of a failure to detect arch.
Problems: /proc/self/exe does not exist during docker build, so i686/386 etc. is incorrectly detected as x86_64 because the check on line 153 fails. mips64 likely suffers from the same issue because it also uses get_bitness.

grep '^Features' /proc/cpuinfo | grep -q -v neon fails, so ARMv6 is incorrectly detected as armv7 on line 367.
Running the downloaded binary then fails with /lib/ld-linux-armhf.so.3: No such file or directory (because the userland isn't armhf; it is armel, with its loader at /lib/ld-linux.so.3).
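
One way to look at the neon check above is that it cannot tell "no neon" apart from "could not read /proc/cpuinfo". The following is a minimal sketch of a three-way check, not rustup's code: `neon_status` is a hypothetical helper name, and the cpuinfo path is taken as an argument so the function can be tried against a sample file.

```shell
#!/bin/sh
# neon_status FILE (hypothetical helper): classify a cpuinfo-style file.
# Prints "present" if a Features line lists neon, "absent" if Features
# lines exist but none lists neon, and "unknown" if FILE cannot be read
# or has no Features line at all (the situation described in this issue).
neon_status() {
    if [ ! -r "$1" ]; then
        echo unknown
    elif ! grep -q '^Features' "$1"; then
        echo unknown
    elif grep '^Features' "$1" | grep -Eq '[[:space:]]neon([[:space:]]|$)'; then
        echo present
    else
        echo absent
    fi
}
```

Calling `neon_status /proc/cpuinfo` inside the build container would then print `unknown` rather than silently taking the armv7 path, which leaves room for a warning or a different fallback.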

Logs and example code to produce the container and error:
https://gist.github.com/miigotu/2a0b80677420d806c96d8e792ae6652e

Note: gcc inside the container reports the correct info, kernel reports x86_64

root@386c88edfbc5:/# uname -m
x86_64
root@386c88edfbc5:/# gcc -dumpmachine | sed "s/-/-$(uname -p)-/"
i686-unknown-linux-gnu

and

root@dac44b74e4c0:/# uname -m
armv7l
root@dac44b74e4c0:/# gcc -dumpmachine | sed "s/-/-$(uname -p)-/"
arm-unknown-linux-gnueabi
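
The transcripts above turn the compiler's triple into a Rust-style host triple by inserting a vendor field after the first dash; judging by the output, `uname -p` printed `unknown` in those containers. A minimal sketch that hardcodes the vendor instead is shown below. `gcc_triple_to_rust_host` is a hypothetical helper name, and the sketch assumes a Debian-style `arch-os-abi` triple with no vendor field.

```shell
#!/bin/sh
# gcc_triple_to_rust_host TRIPLE (hypothetical helper): insert the
# "unknown" vendor field after the architecture, e.g.
# i686-linux-gnu -> i686-unknown-linux-gnu.
# Assumption: TRIPLE has the arch-os-abi form with no vendor field;
# triples that already carry a vendor are not handled.
gcc_triple_to_rust_host() {
    # Only the first dash is replaced, so the os-abi part is left alone.
    printf '%s\n' "$1" | sed 's/-/-unknown-/'
}
```

With gcc installed, `gcc_triple_to_rust_host "$(gcc -dumpmachine)"` yields a value in the same shape as the triples shown in the transcripts.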
@kinnison
Contributor

We use /proc/self/exe because that tends to tell us the userland host type rather than uname -m which would tell us the kernel architecture. Yes it's possible that's correct, but it's also possible for it to be wrong. E.g. some aarch64 systems can run 32-bit userlands, some armhf kernels can run armel userlands, etc.
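
To illustrate why a userland binary reveals the userland type: an ELF file starts with the magic bytes 0x7f 'E' 'L' 'F', and the fifth byte (EI_CLASS) is 1 for 32-bit and 2 for 64-bit objects. The sketch below reads that byte from any file path. `elf_bitness` is a hypothetical name chosen for this example, and this is only an approximation of the kind of check discussed here, not a copy of rustup-init.sh.

```shell
#!/bin/sh
# elf_bitness FILE (hypothetical helper): print 32 or 64 according to the
# EI_CLASS byte of FILE's ELF header; return 1 if FILE is unreadable or
# does not look like an ELF file.
elf_bitness() {
    _head=$(head -c 5 "$1" 2>/dev/null) || return 1
    if [ "$_head" = "$(printf '\177ELF\001')" ]; then
        echo 32
    elif [ "$_head" = "$(printf '\177ELF\002')" ]; then
        echo 64
    else
        return 1
    fi
}
```

Running `elf_bitness /proc/self/exe` inspects the `head` binary itself, so the answer reflects the userland; the same call on `/bin/sh` gives the same information whenever the shell was built for the same userland.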

It sounds like this is a limitation of docker buildx somehow not providing /proc which is unfortunate.

Any work to correct this would need to be fallback code in rustup-init.sh where if it cannot use /proc/self for some reason it looks at alternatives with suitable warnings.
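
A fallback of that kind could be sketched roughly as follows. The helper names `first_readable_exe` and `pick_probe_exe`, the candidate order, and the warning texts are all assumptions made for illustration; nothing here is taken from rustup-init.sh.

```shell
#!/bin/sh
# first_readable_exe CANDIDATE... (hypothetical helper): print the first
# candidate whose leading bytes can actually be read. Existence alone is
# not enough, because a path can be present and still fail on read.
first_readable_exe() {
    for _cand in "$@"; do
        [ -n "$_cand" ] || continue   # skip empty values, e.g. an unset $SHELL
        if head -c 5 "$_cand" >/dev/null 2>&1; then
            printf '%s\n' "$_cand"
            return 0
        fi
    done
    return 1
}

# pick_probe_exe (hypothetical helper): choose the file to probe and warn
# on stderr whenever the preferred /proc/self/exe could not be used.
pick_probe_exe() {
    _exe=$(first_readable_exe /proc/self/exe "${SHELL:-}" /bin/sh) || {
        echo "warning: no readable executable found for platform detection" >&2
        return 1
    }
    if [ "$_exe" != /proc/self/exe ]; then
        echo "warning: /proc/self/exe is unreadable; probing $_exe instead" >&2
    fi
    printf '%s\n' "$_exe"
}
```

The caveat is that any fallback file only describes the userland if it was built for that userland, which is why the sketch warns instead of switching silently.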

If someone wants to work on this, please talk to us on the Rust discord in #wg-rustup because it will need some careful discussion.

@kinnison kinnison changed the title from "rustup-init.sh fails to detect correct arch/os when building with docker, and installed incorrect binaries" to "rustup-init.sh fails to detect platform correctly under docker buildx which lacks /proc" on Mar 26, 2021
@miigotu
Author

miigotu commented Mar 26, 2021

This is my temporary solution that works (building python cryptography):

# rust installer needs to be patched to get the correct binaries for ARMv6 and i686
RUN sed -i -e's/ main/ main contrib non-free/gm' /etc/apt/sources.list
RUN apt-get update -q && \
 apt-get install -yq build-essential curl git libssl-dev libffi-dev libxml2 libxml2-dev libxslt1.1 libxslt-dev libz-dev mediainfo python3-dev unrar nano && \
 pip install -U pip wheel && \
 curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs > rustup-init.sh && \
 sed -i 's#/proc/self/exe#$(which head)#g' rustup-init.sh && \
 sed -i 's#/proc/cpuinfo#/proc/cpuinfo 2> /dev/null || echo ''#g' rustup-init.sh && \
 sed -i 's#get_architecture || return 1#RETVAL=$(gcc -dumpmachine | sed "s/-/-unknown-/") #g' rustup-init.sh && \
 sh -x rustup-init.sh -y --default-host=$(gcc -dumpmachine | sed 's/-/-unknown-/') && \
 rm rustup-init.sh && \
 PATH=$PATH:$HOME/.cargo/bin pip install --no-cache-dir --no-input -Ur requirements.txt && \
 PATH=$PATH:$HOME/.cargo/bin rustup self uninstall -y && \
 apt-get purge -yq --autoremove build-essential libssl-dev libffi-dev libxml2-dev libxslt-dev libz-dev python3-dev && \
 apt-get clean -yq && rm -rf /var/lib/apt/lists/*

@workingjubilee
Contributor

I do not believe all BSDs support /proc, so I am labeling this as a BSD issue until it is confirmed otherwise.
@rustbot label:+O-bsd

@rustbot rustbot added the O-bsd *BSD related label Apr 29, 2021
@miigotu
Author

miigotu commented Apr 30, 2021

> I do not believe all BSDs support /proc, so I am labeling this as a BSD issue until it is confirmed otherwise.
> @rustbot label:+O-bsd

This is debian buster

@workingjubilee
Contributor

True, but the relevant high-order bit there seemed to be
@rustbot label: +O-containers

@rustbot rustbot added the O-containers Not really an OS, but container-specific label May 21, 2021
@kinnison kinnison added this to the 1.25.0 milestone Jun 8, 2021
@vadixidav

I am getting this on macOS Monterey:

vscode ➜ ~ $ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
/usr/bin/head: error reading '/proc/self/exe': Bad file descriptor
/usr/bin/head: failed to close '/proc/self/exe': Bad file descriptor
rustup: unknown platform bitness
/bin/sh: 358: [: Illegal number: 
info: downloading installer

I am inside of a container built on: FROM --platform=linux/amd64 mcr.microsoft.com/vscode/devcontainers/cpp:0-debian-11. I am running linux/amd64 under Docker's built-in QEMU emulation. The container build process is fairly involved, so it definitely functions in general. I also can run other x86_64 applications in general in the container.

Cargo does install correctly though, so it doesn't hurt anything. Originally I thought this was a problem, but it still installs as expected. I figured this was still worth reporting here as another example of this occurring.

@workingjubilee
Contributor

Maybe I am misreading, but I can't quite tell: What CPU architecture is the host? AArch64? AMD64? PowerPC?

@vadixidav

> Maybe I am misreading, but I can't quite tell: What CPU architecture is the host? AArch64? AMD64? PowerPC?

aarch64-apple-darwin

@kinnison
Contributor

Perhaps we could switch from reading /proc/self/exe to reading $SHELL - would there be any situations we can think of where that wouldn't work?

@miigotu
Author

miigotu commented Sep 28, 2022

> I am getting this on macOS Monterey:
>
> vscode ➜ ~ $ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
> /usr/bin/head: error reading '/proc/self/exe': Bad file descriptor
> /usr/bin/head: failed to close '/proc/self/exe': Bad file descriptor
> rustup: unknown platform bitness
> /bin/sh: 358: [: Illegal number: 
> info: downloading installer
>
> I am inside of a container built on: FROM --platform=linux/amd64 mcr.microsoft.com/vscode/devcontainers/cpp:0-debian-11. I am running linux/amd64 under Docker's built-in QEMU emulation. The container build process is fairly involved, so it definitely functions in general. I also can run other x86_64 applications in general in the container.
>
> Cargo does install correctly though, so it doesn't hurt anything. Originally I thought this was a problem, but it still installs as expected. I figured this was still worth reporting here as another example of this occurring.

This is how I have been understanding it, I could be wrong entirely.

If you put FROM --platform=$TARGETPLATFORM image:tag in the Dockerfile, you aren't cross compiling inside the container, and the default target for cargo/rust is all you need. Docker downloads the image from the manifest that matches the target arch and runs that image with qemu.

In the docker buildx build --platform linux/amd64,linux/arm64 ... case, you are using qemu to boot and build the Dockerfile AS that target platform. --platform=$*PLATFORM shouldn't need to be added at all, since buildx has already booted the correct image with qemu. As far as I'm concerned, the result should be exactly the same as if you had put --platform=$TARGETPLATFORM in the Dockerfile.

You need to add a target when your Dockerfile has FROM --platform=$BUILDPLATFORM image:tag (your host arch) and you build targets inside it that are not for your host arch with a cross compiler (that is, when the image is different from the one buildx would otherwise have booted).

Cross compiling means building a binary for a different architecture than that of the OS currently running. But here there are two systems entirely: the docker image and the host OS. With --platform=$TARGETPLATFORM, you are just compiling (not cross) inside the container, while outside you are building a Dockerfile for a foreign platform, not cross compiling.
With --platform=$BUILDPLATFORM, you are building a Dockerfile natively, and inside it you are cross compiling.

Inside vs outside cross is a confusing situation right now with buildx.
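
For the cross compiling case (a stage built FROM --platform=$BUILDPLATFORM), the Rust target has to be derived from the platform string by hand. A minimal sketch of such a mapping follows. `platform_to_rust_target` is a hypothetical helper, the table assumes a glibc-based image, and linux/arm/v6 is left unmapped on purpose because the armel vs armhf choice is exactly what goes wrong in this issue.

```shell
#!/bin/sh
# platform_to_rust_target PLATFORM (hypothetical helper): map a Docker
# platform string to a Rust target triple.
# Assumption: the image is glibc-based (not musl).
platform_to_rust_target() {
    case "$1" in
        linux/amd64)                echo x86_64-unknown-linux-gnu ;;
        linux/arm64|linux/arm64/v8) echo aarch64-unknown-linux-gnu ;;
        linux/arm/v7)               echo armv7-unknown-linux-gnueabihf ;;
        linux/386)                  echo i686-unknown-linux-gnu ;;
        *)
            # linux/arm/v6 and anything else: refuse rather than guess the ABI.
            echo "unsupported platform: $1" >&2
            return 1
            ;;
    esac
}
```

In a Dockerfile that declares ARG TARGETPLATFORM, a RUN step could then call `rustup target add "$(platform_to_rust_target "$TARGETPLATFORM")"`. In the emulated case (FROM --platform=$TARGETPLATFORM), no such mapping should be needed.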

This has been some help understanding the confusion, but not enough:
https://github.com/BretFisher/multi-platform-docker-build

I'm currently having this exact issue again, without --platform in my dockerfile, using a base image of python3.10-slim (debian bullseye base)

I'm going to try my previous hack sometime later, but maybe a bit differently. Super annoying.

@miigotu
Author

miigotu commented Oct 3, 2022

So, I have made some progress on this. Buildkit runs in a security context that prevents the build from accessing /proc and other mounts.

I am testing right now by providing these changes to my workflow:

You have to pass this setting to buildkit and use a setting in the dockerfile.

Pass allow-insecure-entitlement security.insecure to buildkitd in one of three ways:

  • docker buildx --buildkitd-flags '--allow-insecure-entitlement security.insecure'
  • When creating the builder, with: --allow-insecure-entitlement security.insecure
  • Or in the buildkit.toml config file:

insecure-entitlements = [ "security.insecure" ]

Then in your dockerfile:
RUN --security=insecure curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y

Test in buildkit shows exactly this:
https://github.com/moby/buildkit/pull/1081/files#diff-d7f92add99ec729fffc073a432807fecbabd9fe2bb0dc35608b1eeef1fba69dbR29

Now, since we know we can't read /proc/self/exe out of the box when using DOCKER_BUILDKIT=1 docker ..., the question is: should this be documented, or should we build in a fallback, or a test for whether we are running under buildkit that then uses a different method? It will require a few changes to configuration for people who don't know what's happening.

@miigotu
Author

miigotu commented Oct 4, 2022

I found the solution! Demo and explanation incoming. Just need some documentation I think, there is nothing broken in rustup.

I spoke too soon. I successfully got security.insecure to work on GitHub Actions, and the script downloads the correct binaries for host/target, but the error still happens when trying to read /proc/self/exe. IIRC from when I opened this issue, it is a specially protected file in buildkit/docker, to prevent exploiting any vulnerabilities and escaping the container.

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sed 's#/proc/self/exe#$SHELL#g' | sh -s -- -y still gets around this problem. The armv6 vs armv7 detection is more of a problem if someone needs support for armv6, but since armv6 is not supported by most official images anymore, I guess that's not such a big deal.

I'll continue to look at it.

Here was my POC
https://github.com/miigotu/actions-security-insecure-demo
https://github.com/miigotu/actions-security-insecure-demo/actions

@miigotu
Author

miigotu commented Oct 4, 2022

POC seems to work with /proc/self/exe, but not rustup... Same command lol

@kerberjg

@miigotu I actually managed to get it running! I tried your command from the previous comment, but it seems that the SHELL env var was undeclared, so I just replaced it with "/bin/sh" and voilà!

RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sed 's#/proc/self/exe#\/bin\/sh#g' | sh -s -- -y

For transparency: I was running the build through compose, with the following adjustments:

  • Command COMPOSE_DOCKER_CLI_BUILD=1 DOCKER_BUILDKIT=1 docker-compose build
  • Dockerfile has FROM --platform=linux/amd64 ubuntu:20.04 at the beginning
  • docker-compose has platform: linux/amd64 on the container (probably superfluous)
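
The sed step in that workaround can be isolated into a small filter so that the patched installer can be inspected before it is run. This is a minimal sketch: `patch_exe_probe` is a hypothetical helper name, and defaulting to /bin/sh is an assumption carried over from the comment above.

```shell
#!/bin/sh
# patch_exe_probe [PATH] (hypothetical helper): read an installer script on
# stdin and write it to stdout with every /proc/self/exe reference replaced
# by PATH (default: /bin/sh).
# Assumption: PATH contains neither '#' nor '&', which sed would treat
# specially in this substitution.
patch_exe_probe() {
    _replacement=${1:-/bin/sh}
    sed "s#/proc/self/exe#${_replacement}#g"
}
```

A cautious way to use it is to save the download first, for example `curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | patch_exe_probe /bin/sh > rustup-init.sh`, read the result, and only then run `sh rustup-init.sh -y`.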

ctron added a commit to ctron/trunk-container that referenced this issue Mar 6, 2023
@bmarwell

It's been a year. Running the current rustup with FROM ubuntu:16.04 results in:

error: could not read metadata for file: '/tmp/tmp.JlK1p9xHCy/rustup-init': Function not implemented (os error 38)

@rami3l rami3l removed this from the 1.25.0 milestone Jan 17, 2024
@rami3l rami3l added this to the 1.28.0 milestone Jan 17, 2024
@cschwan

cschwan commented Apr 27, 2024

@bmarwell do you still see the problem? I've also encountered it with the manylinux2014_x86_64 container (Containerfile). I followed it with strace:

[pid  9489] statx(24, "usr/local/cargo/bin", AT_STATX_DONT_SYNC|AT_SYMLINK_NOFOLLOW, STATX_TYPE|STATX_MODE, {stx_mask=STATX_BASIC_STATS|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFDIR|0755, stx_size=4096, ...}) = 0
[pid  9784] statx(AT_FDCWD, "/usr/local/cargo/bin/rustup", AT_STATX_SYNC_AS_STAT, STATX_ALL,  <unfinished ...>
[pid  9489] statx(24, "usr/local/cargo", AT_STATX_DONT_SYNC|AT_SYMLINK_NOFOLLOW, STATX_BASIC_STATS, {stx_mask=STATX_BASIC_STATS|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFDIR|0755, stx_size=4096, ...}) = 0
[pid  9784] <... statx resumed>0x7ffcfcd8b040) = -1 ENOENT (No such file or directory)
[pid  9784] statx(AT_FDCWD, "/tmp/tmp.3p2dR3GpcZ/rustup-init", AT_STATX_SYNC_AS_STAT|AT_SYMLINK_NOFOLLOW, STATX_ALL, 0x7ffcfcd8ad50) = -1 ENOSYS (Function not implemented)
error: could not read metadata for file: '/tmp/tmp.3p2dR3GpcZ/rustup-init': Function not implemented (os error 38)

This problem seems unrelated to the original issue, but I'm writing here because this is the only place I found a report of the same behaviour. Could this be a problem of a missing kernel configuration? I found a patch for a kernel bug: https://lore.kernel.org/lkml/20240414003434.2659-1-danny@orbstack.dev/, but the described scenario doesn't quite fit the description; in the output above you see that the failing statx syscall isn't the first statx call.

@bmarwell

Uh I gave up on compiling on 16.04, sorry

@cschwan

cschwan commented Apr 28, 2024

In my case it turned out that a podman system reset (deletes all containers, images, etc.) did the trick; possibly the container was bugged and a new version fixed the problem.

9 participants