[CD] Fix slim-wheel cuda_nvrtc import problem #145614


Merged: 1 commit merged into release/2.6 from cherry-pick-145582-by-pytorch_bot_bot_ on Jan 24, 2025

Conversation

pytorchbot
Collaborator

Similar fix as: #144816

Fixes: #145580

Found during testing of #138340

Please note that both nvrtc and nvjitlink exist for CUDA 11.8, 12.4, and 12.6, hence we can safely remove the if statement. Preloading can apply to all supported CUDA versions.

CUDA 11.8 path:

```
(.venv) root@b4ffe5c8ac8c:/pytorch/.ci/pytorch/smoke_test# ls /.venv/lib/python3.12/site-packages/torch/lib/../../nvidia/cuda_nvrtc/lib
__init__.py  __pycache__  libnvrtc-builtins.so.11.8  libnvrtc-builtins.so.12.4  libnvrtc.so.11.2  libnvrtc.so.12
(.venv) root@b4ffe5c8ac8c:/pytorch/.ci/pytorch/smoke_test# ls /.venv/lib/python3.12/site-packages/torch/lib/../../nvidia/nvjitlink/lib
__init__.py  __pycache__  libnvJitLink.so.12
```

Test with rc 2.6 and CUDA 11.8:

```
python cudnn_test.py
2.6.0+cu118
---------------------------------------------SDPA-Flash---------------------------------------------
ALL GOOD
---------------------------------------------SDPA-CuDNN---------------------------------------------
ALL GOOD
```
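The cudnn_test.py script itself is not included in this thread. For readers who want to reproduce the check, a minimal smoke test along the same lines (running scaled_dot_product_attention under the Flash and cuDNN backends, which is what triggers the nvrtc load in the failures discussed below) might look like the sketch here; the tensor shapes and dtype are illustrative assumptions, not taken from the actual script.

```
# Hypothetical stand-in for cudnn_test.py (the real script is not shown in this thread).
import torch
from torch.nn.attention import SDPBackend, sdpa_kernel

print(torch.__version__)

# Illustrative shapes/dtype; any CUDA-capable GPU will do.
q, k, v = (torch.randn(2, 8, 128, 64, device="cuda", dtype=torch.float16) for _ in range(3))

for name, backend in (("SDPA-Flash", SDPBackend.FLASH_ATTENTION),
                      ("SDPA-CuDNN", SDPBackend.CUDNN_ATTENTION)):
    with sdpa_kernel(backend):
        torch.nn.functional.scaled_dot_product_attention(q, k, v)
    print(f"{name}: ALL GOOD")
```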

Thank you @nWEIdia for discovering this issue

cc @seemethere @malfet @osalpekar


Pull Request resolved: #145582
Approved by: https://github.com/nWEIdia, https://github.com/eqy, https://github.com/kit1980, https://github.com/malfet

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
(cherry picked from commit 9752c7c)

pytorch-bot bot commented Jan 24, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/145614

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 9 Pending

As of commit 34c3e25 with merge base f7e621c:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

malfet merged commit 3207040 into release/2.6 on Jan 24, 2025
111 of 120 checks passed
@nWEIdia
Collaborator

nWEIdia commented Jan 24, 2025

@atalman I am noticing that you might have tested cu124 first and then cu118; note that your test directory contains both libnvrtc.so.11.2 and libnvrtc.so.12.
So I went ahead and tested the vanilla cu118 binary (standalone, not alongside cu124). I have the impression that this line may have prevented things from working on cu118 (i.e. the cu118 binary seems to still be breaking):

```
if "nvidia/cuda_runtime/lib/libcudart.so" not in _maps:
    return
```

The libcudart.so check above might be too strict; I guess the existence of libcudart.so.* should be fine?
Please see below what cuda_runtime/lib contains for cu118:

```
/usr/local/lib/python3.12/dist-packages/torch# ls ../nvidia/cuda_runtime/lib/
__init__.py  __pycache__  libOpenCL.so.1  libcudart.so.11.0
```
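A minimal sketch of what a guard like the quoted one boils down to is shown below; this is an illustration under the assumption that _maps holds the content of /proc/self/maps, not the actual torch/__init__.py code. Note that a plain substring test on the maps also matches versioned names such as libcudart.so.11.0, which is relevant to whether the check really is too strict.

```
# Illustrative sketch only (not the actual torch code): check whether cudart
# from the pip-installed nvidia wheel is already mapped into this process
# before attempting any further preloading.
with open("/proc/self/maps") as f:
    _maps = f.read()

# A substring test like this also matches "libcudart.so.11.0", because the
# versioned file name starts with "libcudart.so".
if "nvidia/cuda_runtime/lib/libcudart.so" not in _maps:
    print("cudart from the pip wheel is not mapped; preload would be skipped")
else:
    print("cudart from the pip wheel is mapped; preload would proceed")
```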

@nWEIdia
Collaborator

nWEIdia commented Jan 24, 2025

Only if the /usr/local/lib/python3.12/dist-packages/nvidia/cuda_nvrtc/lib/libnvrtc.so symlink is created, or cu124 installed it.

With a default installation there is only libnvrtc.so.11.2, and the code only checks for libnvrtc.so; otherwise it returns and never executes the preload.
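To make that concrete, here is a small, hypothetical check (the path is an example, and this is not the actual torch logic) contrasting a lookup of the exact unversioned file name with a glob over versioned names; on a vanilla cu118 install only libnvrtc.so.11.2 is present, so the exact-name lookup fails while the glob still finds the library.

```
# Hypothetical illustration, not the actual torch code.
import glob
import os

lib_dir = "/usr/local/lib/python3.12/dist-packages/nvidia/cuda_nvrtc/lib"  # example path

exact = os.path.join(lib_dir, "libnvrtc.so")
print(os.path.exists(exact))  # False on a default cu118 install: only libnvrtc.so.11.2 ships

versioned = sorted(glob.glob(os.path.join(lib_dir, "libnvrtc.so*")))
print(versioned)              # e.g. ['.../libnvrtc.so.11.2'] -- a glob still finds it
```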

@atalman
Contributor

atalman commented Jan 24, 2025

Looks like you are right; standalone cu118 is not loading:

```
---------------------------------------------SDPA-Flash---------------------------------------------
ALL GOOD
---------------------------------------------SDPA-CuDNN---------------------------------------------
Could not load library libnvrtc.so.11.2. Error: libnvrtc.so.11.2: cannot open shared object file: No such file or directory
Could not load library libnvrtc.so. Error: libnvrtc.so: cannot open shared object file: No such file or directory
Could not load library libnvrtc.so.11.2. Error: libnvrtc.so.11.2: cannot open shared object file: No such file or directory
Could not load library libnvrtc.so. Error: libnvrtc.so: cannot open shared object file: No such file or directory
Could not load library libnvrtc.so.11.2. Error: libnvrtc.so.11.2: cannot open shared object file: No such file or directory
Could not load library libnvrtc.so. Error: libnvrtc.so: cannot open shared object file: No such file or directory
```

However the file is there:

```
ldd /venv/lib/python3.12/site-packages/torch/lib/../../nvidia/cuda_nvrtc/lib/libnvrtc.so.11.2
	linux-vdso.so.1 (0x00007fff3d1eb000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x000079dff81d0000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x000079dff81cb000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x000079dff81c6000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x000079dff80dd000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x000079dff45ee000)
	/lib64/ld-linux-x86-64.so.2 (0x000079dff81da000)
```

FYI, the statement if "nvidia/cuda_runtime/lib/libcudart.so" not in _maps: is not an issue.

As per @nWEIdia, the workaround is: ln -s libnvrtc.so.11.2 libnvrtc.so
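One way to see why the file is present yet the load still fails: loading by bare soname only searches the standard loader locations (the ld.so cache, LD_LIBRARY_PATH, rpath entries), which do not include the pip wheel directory, whereas loading by full path works. A small, hypothetical demonstration (the wheel path is an example and depends on the local install):

```
# Hypothetical demonstration of soname vs. full-path loading.
import ctypes

wheel_lib = ("/venv/lib/python3.12/site-packages/nvidia/cuda_nvrtc/lib/"
             "libnvrtc.so.11.2")  # example path

try:
    ctypes.CDLL("libnvrtc.so.11.2")  # bare soname: resolved via ld.so cache / LD_LIBRARY_PATH only
    print("loaded by soname")
except OSError as err:
    print(f"soname lookup failed: {err}")

try:
    ctypes.CDLL(wheel_lib)           # full path: loads regardless of the loader search path
    print("loaded by full path")
except OSError as err:
    print(f"full-path load failed: {err}")
```

This also fits the two workarounds: adding the directory to LD_LIBRARY_PATH makes the soname lookup succeed directly, while creating the libnvrtc.so symlink lets the preload logic (which looks for that exact name, as noted above) find and load the library up front.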

@nWEIdia
Collaborator

nWEIdia commented Jan 24, 2025

Yeah, not sure why, but two workarounds have been identified so far (either of them works):

```
export LD_LIBRARY_PATH=/usr/local/lib/python3.12/dist-packages/nvidia/cuda_nvrtc/lib/:$LD_LIBRARY_PATH
ln -s libnvrtc.so.11.2 libnvrtc.so
```

@nWEIdia
Collaborator

nWEIdia commented Jan 24, 2025

I am going to switch the preload order, but I need the test case for the first issue. I do not want to fix one issue but regress the other.

Would the two be incompatible (both want to be preloaded first)?

Update: it seems the libnvjitlink test would just be python -c "import torch", so if the libnvrtc test case works, the libnvjitlink test must also have worked fine.
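The follow-up fix referenced in the commits below (#145638) changes the load order because CUDA 11.x wheels ship no nvjitlink, and a failure on that first load prevented nvrtc from being preloaded. A rough sketch of the idea, assuming each library is preloaded independently from the wheel directories (this is not the literal patch):

```
# Rough sketch of the ordering/robustness idea behind #145638, not the literal patch:
# preload nvrtc before nvjitlink, and do not let one missing library abort the
# remaining preloads (CUDA 11.x wheels ship no nvjitlink at all).
import ctypes
import glob
import os

NVIDIA_ROOT = "/usr/local/lib/python3.12/dist-packages/nvidia"  # example path

def _try_preload(pkg: str, pattern: str) -> None:
    for path in sorted(glob.glob(os.path.join(NVIDIA_ROOT, pkg, "lib", pattern))):
        try:
            ctypes.CDLL(path, mode=ctypes.RTLD_GLOBAL)
            return  # first loadable match is enough
        except OSError:
            continue  # tolerate absent/unloadable libraries for this CUDA version

_try_preload("cuda_nvrtc", "libnvrtc.so*")     # nvrtc first
_try_preload("nvjitlink", "libnvJitLink.so*")  # nvjitlink second; simply absent on CUDA 11.x
```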

pytorchmergebot pushed a commit that referenced this pull request Jan 24, 2025
There is no libnvjitlink in CUDA-11.x, so attempts to load it first will abort the execution and prevent the script from preloading nvrtc

Fixes issues reported in #145614 (comment)

Pull Request resolved: #145638
Approved by: https://github.com/atalman, https://github.com/kit1980, https://github.com/malfet

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
pytorchbot pushed a commit that referenced this pull request Jan 24, 2025
There is no libnvjitlink in CUDA-11.x, so attempts to load it first will abort the execution and prevent the script from preloading nvrtc

Fixes issues reported in #145614 (comment)

Pull Request resolved: #145638
Approved by: https://github.com/atalman, https://github.com/kit1980, https://github.com/malfet

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
(cherry picked from commit 2a70de7)
malfet pushed a commit that referenced this pull request Jan 24, 2025

[CUDA] Change slim-wheel libraries load order (#145638)

There is no libnvjitlink in CUDA-11.x, so attempts to load it first will abort the execution and prevent the script from preloading nvrtc

Fixes issues reported in #145614 (comment)

Pull Request resolved: #145638
Approved by: https://github.com/atalman, https://github.com/kit1980, https://github.com/malfet

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
(cherry picked from commit 2a70de7)

Co-authored-by: Wei Wang <weiwan@nvidia.com>
nWEIdia added a commit to nWEIdia/pytorch that referenced this pull request Jan 27, 2025
There is no libnvjitlink in CUDA-11.x, so attempts to load it first will abort the execution and prevent the script from preloading nvrtc

Fixes issues reported in pytorch#145614 (comment)

Pull Request resolved: pytorch#145638
Approved by: https://github.com/atalman, https://github.com/kit1980, https://github.com/malfet

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
github-actions bot deleted the cherry-pick-145582-by-pytorch_bot_bot_ branch on February 24, 2025 02:07