#tensorflow #gpu #nvidia
Question:
I set up my second computer (Manjaro Linux) to run TensorFlow so that long computations don't tie up my main system. I installed everything from the repositories with
sudo pacman -S cuda cudnn python-tensorflow-opt-cuda
which is exactly what I did on my other system, which runs a GTX 1060. I updated all my header files and the Linux kernel, but when I try to run my code TensorFlow aborts with the following error:
2021-06-24 12:17:16.984616: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
WARNING:root:Limited tf.compat.v2.summary API due to missing TensorBoard installation.
WARNING:root:Limited tf.compat.v2.summary API due to missing TensorBoard installation.
WARNING:root:Limited tf.compat.v2.summary API due to missing TensorBoard installation.
WARNING:root:Limited tf.summary API due to missing TensorBoard installation.
2021-06-24 12:17:25.310893: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2021-06-24 12:17:25.530511: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-24 12:17:25.531007: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: NVIDIA Quadro K2200 computeCapability: 5.0
coreClock: 1.124GHz coreCount: 5 deviceMemorySize: 3.94GiB deviceMemoryBandwidth: 74.65GiB/s
2021-06-24 12:17:25.541584: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-06-24 12:17:25.662406: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2021-06-24 12:17:25.662508: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2021-06-24 12:17:25.712766: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2021-06-24 12:17:25.744087: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2021-06-24 12:17:25.783408: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11
2021-06-24 12:17:25.829728: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11
2021-06-24 12:17:25.847623: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-06-24 12:17:25.847767: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-24 12:17:25.848155: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-24 12:17:25.848463: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
Found 20580 images belonging to 120 classes.
2021-06-24 12:17:26.485288: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-06-24 12:17:26.515400: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-24 12:17:26.516586: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: NVIDIA Quadro K2200 computeCapability: 5.0
coreClock: 1.124GHz coreCount: 5 deviceMemorySize: 3.94GiB deviceMemoryBandwidth: 74.65GiB/s
2021-06-24 12:17:26.516645: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-24 12:17:26.517055: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-24 12:17:26.517521: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-06-24 12:17:26.536155: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-06-24 12:17:31.411471: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-06-24 12:17:31.411503: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264] 0
2021-06-24 12:17:31.411509: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0: N
2021-06-24 12:17:31.411682: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-24 12:17:31.412043: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-24 12:17:31.412421: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-24 12:17:31.412844: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3398 MB memory) -> physical GPU (device: 0, name: NVIDIA Quadro K2200, pci bus id: 0000:01:00.0, compute capability: 5.0)
2021-06-24 12:17:31.430573: I tensorflow/core/common_runtime/process_util.cc:146] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
2021-06-24 12:17:31.556036: F ./tensorflow/core/kernels/random_op_gpu.h:244] Non-OK-status: GpuLaunchKernel(FillPhiloxRandomKernelLaunch<Distribution>, num_blocks, block_size, 0, d.stream(), key, counter, gen, data, size, dist) status: Internal: no kernel image is available for execution on the device
Aborted (core dumped)
Can anyone help me solve this problem?
EDIT: Here is the output of nvidia-smi
and nvcc --version
Both were installed with pacman.
Comments:
1. Your Quadro K2200 is a compute capability (CC) 5.0 device. Your TF build supports the 1060 (CC 6.1), but it does not support CC 5.0 devices.
2. @RobertCrovella That is exactly what I was looking for, thank you. I tried to find out which TensorFlow versions support CC 5.0 but couldn't find anything specific. So, given this, what would you recommend I install? And how do I find out which TF builds support which CCs?
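One way to check this locally, as a rough sketch rather than an official procedure: recent TensorFlow 2.x releases report the CUDA architectures a binary was compiled for through tf.sysconfig.get_build_info(), and the GPU's own compute capability through tf.config.experimental.get_device_details(). The helper names below (parse_arch, build_supports, report) are made up for illustration. The matching rule is the usual CUDA one as I understand it: an sm_XY entry runs on devices with the same major version and an equal or higher minor version, while a compute_XY entry (PTX) can be JIT-compiled for any device at or above XY. The sketch also assumes plain entries such as sm_61 with no letter suffix, and that the build info contains a "cuda_compute_capabilities" key.

```python
def parse_arch(entry):
    """Split an entry like 'sm_61' or 'compute_75' into (kind, (major, minor))."""
    kind, _, digits = entry.partition("_")
    # Assumption: the last digit is the minor version, the rest is the major.
    return kind, (int(digits[:-1]), int(digits[-1]))


def build_supports(device_cc, build_archs):
    """Return True if a build with these architectures can run on device_cc."""
    for entry in build_archs:
        kind, cc = parse_arch(entry)
        # Precompiled code: same major version, minor not newer than the device.
        if kind == "sm" and cc[0] == device_cc[0] and cc[1] <= device_cc[1]:
            return True
        # PTX: can be JIT-compiled for any device at or above that version.
        if kind == "compute" and cc <= device_cc:
            return True
    return False


def report():
    """Print, for each visible GPU, whether the installed TF build covers it."""
    import tensorflow as tf  # imported here so the helpers work without TF

    info = tf.sysconfig.get_build_info()
    # Assumption: this key is present on CUDA builds; default to an empty list.
    archs = list(info.get("cuda_compute_capabilities", []))
    print("Build architectures:", archs)

    for gpu in tf.config.list_physical_devices("GPU"):
        details = tf.config.experimental.get_device_details(gpu)
        cc = details.get("compute_capability")  # e.g. (5, 0)
        if cc is None:
            print(gpu.name, "compute capability unknown")
            continue
        verdict = "supported" if build_supports(tuple(cc), archs) else "NOT supported"
        print(gpu.name, details.get("device_name"), cc, verdict)
```

Calling report() in a Python shell on the affected machine should show whether the installed package was built with anything that covers CC 5.0. If it was not, the options are a build that includes sm_50 (or older PTX), or compiling TensorFlow yourself with that architecture enabled.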