Posted by 蓝娅萍 on 2025-5-31 23:54:09

Resolving CuPy-related errors

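Background

This post is simply a record of errors you may run into when installing and using CuPy on its own, together with their fixes. Some of them are environment problems encountered during configuration.

libnvrtc issues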

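Error message:
RuntimeError: CuPy failed to load libnvrtc.so.11.2: OSError: libnvrtc.so.11.2: cannot open shared object file: No such file or directory
Solution:
$ sudo find / -name "libnvrtc.so.11.2"
$ export LD_LIBRARY_PATH=xxx/lib/:$LD_LIBRARY_PATH
Here xxx/lib/ stands for the directory reported by find. Once LD_LIBRARY_PATH points to the correct libnvrtc location, the error goes away.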
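To confirm that the export actually took effect, the directories on LD_LIBRARY_PATH can be scanned for the file from Python. This is only a minimal sketch: the helper names `ld_library_dirs` and `find_library` are made up for illustration, and the check looks only at LD_LIBRARY_PATH, not at the other places the dynamic loader searches.

```python
import os
from pathlib import Path


def ld_library_dirs(env=None):
    """Split LD_LIBRARY_PATH into a list of directories, skipping empty entries."""
    env = os.environ if env is None else env
    return [d for d in env.get("LD_LIBRARY_PATH", "").split(":") if d]


def find_library(name, search_dirs):
    """Return the first path in search_dirs that holds a file called name, else None."""
    for directory in search_dirs:
        candidate = Path(directory) / name
        if candidate.is_file():
            return str(candidate)
    return None


# Hypothetical usage: look for the file named in the error message.
hit = find_library("libnvrtc.so.11.2", ld_library_dirs())
print(hit or "libnvrtc.so.11.2 was not found on LD_LIBRARY_PATH")
```

If the script prints a path, the shell that launches Python sees the library; if not, the export was probably made in a different shell session.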

cuVS issues

Error message:
RuntimeError: cuVS >= 24.12 or pylibraft < 24.12 should be installed to use this feature
Solution:
$ python3 -m pip install --upgrade "cuvs-cu11==24.12.*"

cdist computation error
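The cdist function in cupyx.scipy.spatial.distance is one feature that depends on the cuVS / pylibraft requirement quoted in the version error (cuVS >= 24.12 or pylibraft < 24.12), so it can be worth checking which versions are installed before calling it. The following is a rough sketch, not CuPy's own check: the helper names are hypothetical, the distribution names (`cuvs-cu11`, `cuvs-cu12`, `cuvs`, `pylibraft-cu11`, `pylibraft-cu12`, `pylibraft`) are assumptions about how the packages were installed, and the version parsing only looks at the leading `major.minor` numbers.

```python
from importlib import metadata


def installed_version(candidates):
    """Return (name, version) for the first installed distribution, else None."""
    for name in candidates:
        try:
            return name, metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
    return None


def release_tuple(version):
    """Turn a string such as '24.12.00' into (24, 12); the rest is ignored."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def satisfies_requirement(cuvs_version, pylibraft_version):
    """Mirror the wording of the error message: cuVS >= 24.12 or pylibraft < 24.12."""
    if cuvs_version is not None and release_tuple(cuvs_version) >= (24, 12):
        return True
    if pylibraft_version is not None and release_tuple(pylibraft_version) < (24, 12):
        return True
    return False


# Hypothetical usage: adjust the candidate names to match your CUDA version.
cuvs = installed_version(["cuvs-cu11", "cuvs-cu12", "cuvs"])
raft = installed_version(["pylibraft-cu11", "pylibraft-cu12", "pylibraft"])
print("cuVS:", cuvs, "pylibraft:", raft)
print("requirement met:", satisfies_requirement(
    cuvs[1] if cuvs else None,
    raft[1] if raft else None,
))
```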

Error message:
Traceback (most recent call last):
    dis = cdist(middle, grid)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/lib/python3.12/site-packages/cupyx/scipy/spatial/distance.py", line 630, in cdist
    pairwise_distance(XA, XB, output_arr, metric, p)
File "resources.pyx", line 110, in cuvs.common.resources.auto_sync_resources.wrapper
File "/root/miniconda3/lib/python3.12/site-packages/pylibraft/common/outputs.py", line 83, in wrapper
    ret_value = f(*args, **kwargs)
                ^^^^^^^^^^^^^^^^^^
File "distance.pyx", line 140, in cuvs.distance.distance.pairwise_distance
File "exceptions.pyx", line 37, in cuvs.common.exceptions.check_cuvs
cuvs.common.exceptions.CuvsException: CUDA error encountered at: file=/pyenv/versions/3.12.9/lib/python3.12/site-packages/libraft/include/raft/linalg/detail/coalesced_reduction-inl.cuh line=271: call='cudaPeekAtLastError()', Reason=cudaErrorInvalidConfiguration:invalid configuration argument
Obtained 26 stack frames
#1 in /root/miniconda3/lib/python3.12/site-packages/libcuvs/lib64/libcuvs.so: raft::cuda_error::cuda_error(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) +0xbd
#2 in /root/miniconda3/lib/python3.12/site-packages/libcuvs/lib64/libcuvs.so(+0x4310f5)
#3 in /root/miniconda3/lib/python3.12/site-packages/libcuvs/lib64/libcuvs.so(+0xf383ac)
#4 in /root/miniconda3/lib/python3.12/site-packages/libcuvs/lib64/libcuvs.so(+0xf3865f)
#5 in /root/miniconda3/lib/python3.12/site-packages/libcuvs/lib64/libcuvs.so: void cuvs::distance::pairwise_distance<float, int, float>(raft::resources const&, float const*, float const*, float*, int, int, int, cuvsDistanceType, bool, float) +0x3b6
#6 in /root/miniconda3/lib/python3.12/site-packages/libcuvs/lib64/libcuvs.so: void cuvs::distance::pairwise_distance<float, std::experimental::layout_right, int, float>(raft::resources const&, std::experimental::mdspan<float const, std::experimental::extents<int, 18446744073709551615ul, 18446744073709551615ul>, std::experimental::layout_right, raft::host_device_accessor<std::experimental::default_accessor<float const>, (raft::memory_type)2> >, std::experimental::mdspan<float const, std::experimental::extents<int, 18446744073709551615ul, 18446744073709551615ul>, std::experimental::layout_right, raft::host_device_accessor<std::experimental::default_accessor<float const>, (raft::memory_type)2> >, std::experimental::mdspan<float, std::experimental::extents<int, 18446744073709551615ul, 18446744073709551615ul>, std::experimental::layout_right, raft::host_device_accessor<std::experimental::default_accessor<float>, (raft::memory_type)2> >, cuvsDistanceType, float) +0x3c7
#7 in /root/miniconda3/lib/python3.12/site-packages/libcuvs/lib64/libcuvs_c.so(+0x7a1a3)
#8 in /root/miniconda3/lib/python3.12/site-packages/libcuvs/lib64/libcuvs_c.so: cuvsPairwiseDistance +0x37
#9 in /root/miniconda3/lib/python3.12/site-packages/cuvs/distance/distance.cpython-312-x86_64-linux-gnu.so(+0x48237)
#10 in python3: _PyObject_Call +0x122
#11 in python3: _PyEval_EvalFrameDefault +0x503a
#12 in python3: PyVectorcall_Call +0xe1
#13 in /root/miniconda3/lib/python3.12/site-packages/cuvs/common/resources.cpython-312-x86_64-linux-gnu.so(+0x4850e)
#14 in python3: _PyObject_MakeTpCall +0x2fb
#15 in python3: _PyEval_EvalFrameDefault +0x6ce
#16 in python3: PyEval_EvalCode +0xae
#17 in python3()
#18 in python3()
#19 in python3()
#20 in python3: _PyRun_SimpleFileObject +0x1b0
#21 in python3: _PyRun_AnyFileObject +0x43
#22 in python3: Py_RunMain +0x303
#23 in python3: Py_BytesMain +0x39
#24 in /lib/x86_64-linux-gnu/libc.so.6(+0x29d90)
#25 in /lib/x86_64-linux-gnu/libc.so.6: __libc_start_main +0x80
#26 in python3()
Cause: a 0-dimensional array was passed in. Check that the inputs to cdist are correct.
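A small guard in front of the cdist call turns this opaque CUDA error into a readable Python exception. The sketch below is an illustration only: `check_cdist_input` is a hypothetical helper, and it assumes that cdist expects 2-D arrays of shape (n_points, n_features), as SciPy's cdist does. It rejects both 0-dimensional arrays and arrays with an empty dimension, and it relies only on the `ndim` and `shape` attributes, so it works for NumPy and CuPy arrays alike.

```python
def check_cdist_input(arr, name="input"):
    """Raise a clear error if arr is not a non-empty 2-D array; return arr otherwise."""
    ndim = getattr(arr, "ndim", None)
    if ndim is None:
        raise TypeError(f"{name} has no ndim attribute; expected an array")
    shape = tuple(arr.shape)
    if ndim != 2:
        raise ValueError(
            f"{name} must be 2-D (n_points, n_features), got ndim={ndim}, shape={shape}"
        )
    if 0 in shape:
        raise ValueError(f"{name} has an empty dimension: shape={shape}")
    return arr
```

With the variable names from the traceback, the call would then read `cdist(check_cdist_input(middle, "middle"), check_cdist_input(grid, "grid"))`.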

libffi issues

Error message:
$ git pull
/usr/lib/git-core/git-remote-https: symbol lookup error: /lib/x86_64-linux-gnu/libp11-kit.so.0: undefined symbol: ffi_type_pointer, version LIBFFI_BASE_7.0
Solution:
$ sudo find / -name "libffi*"
xxx/lib/libffi.so.7
After locating the shared library, back it up first:
$ mv xxx/lib/libffi.so.7 xxx/lib/libffi.so.7.bak
Then create a symbolic link:
$ sudo ln -s /usr/lib/x86_64-linux-gnu/libffi.so.7.1.0 xxx/lib/libffi.so.7
Make sure the directory containing the library is on the system library path:
$ export LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH

Summary

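This post records some problems you may encounter when using CuPy from Python. Some are environment configuration problems, and others are caused by bad inputs at run time.

Copyright notice

Original post: https://www.cnblogs.com/dechinphy/p/cupy-error.html
Author ID: DechinPhy
More original articles: https://www.cnblogs.com/dechinphy/
Buy the author a coffee: https://www.cnblogs.com/dechinphy/gallery/image/379634.html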
