TensorFlow prints repeated success messages and a NUMA node read warning

#python #tensorflow

Question:

I have just installed CUDA 11.2 from the runfile and TensorFlow via pip install tensorflow on Ubuntu 20.04 with Python 3.8. When I create a tensor I get strange output, and memory usage on my RTX 3090 jumps to 95%:

 Python 3.8.5 (default, May 27 2021, 13:30:53) 
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2021-06-25 10:42:08.881025: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
>>> a = tf.zeros(1)
2021-06-25 10:42:16.739723: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2021-06-25 10:42:16.775681: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-25 10:42:16.776749: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:09:00.0 name: GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.695GHz coreCount: 82 deviceMemorySize: 23.69GiB deviceMemoryBandwidth: 871.81GiB/s
2021-06-25 10:42:16.776781: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-06-25 10:42:16.779208: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2021-06-25 10:42:16.779252: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2021-06-25 10:42:16.780078: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2021-06-25 10:42:16.780261: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2021-06-25 10:42:16.780973: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11
2021-06-25 10:42:16.781346: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11
2021-06-25 10:42:16.781423: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-06-25 10:42:16.781476: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-25 10:42:16.782026: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-25 10:42:16.782642: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-06-25 10:42:16.782902: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-06-25 10:42:16.783252: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-25 10:42:16.783876: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:09:00.0 name: GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.695GHz coreCount: 82 deviceMemorySize: 23.69GiB deviceMemoryBandwidth: 871.81GiB/s
2021-06-25 10:42:16.783920: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-25 10:42:16.784470: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-25 10:42:16.785250: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-06-25 10:42:16.785276: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-06-25 10:42:17.059795: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-06-25 10:42:17.059823: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0 
2021-06-25 10:42:17.059828: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N 
2021-06-25 10:42:17.059954: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-25 10:42:17.060459: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-25 10:42:17.061025: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-25 10:42:17.061578: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 21542 MB memory) -> physical GPU (device: 0, name: GeForce RTX 3090, pci bus id: 0000:09:00.0, compute capability: 8.6)
 

Any ideas?

Comments:

1. This is the default behavior, nothing to worry about. These are just informational messages, as shown by the I prefix; an error message would have an E prefix and a warning a W prefix. Thanks!

2. Thanks @TFer2. In the end I just accepted it, but it is still quite annoying, because it buries other useful warnings.

3. I have posted a solution for suppressing the logging. I hope it helps you. Thanks!

4. @TFer2 I see the same behavior. What does this informational message mean? I can't find anything about the NUMA node on Google.

Answer #1:

I was able to reproduce your issue:

 (base) XXXXXX@XXXXX-Xlaptop:~$ python
Python 3.7.7 (default, Mar 26 2020, 15:48:22) 
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow
2021-07-08 16:22:12.609456: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
>>> import tensorflow as tf
>>> a = tf.zeros(1)
2021-07-08 16:23:07.820538: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2021-07-08 16:23:07.896686: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (mothukuru-glaptop): /proc/driver/nvidia/version does not exist
2021-07-08 16:23:07.897348: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
 

These are just informational messages, as shown by the I prefix; an error message would have an E prefix and a warning a W prefix.

To turn the logging off, you can try the following:
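To make the prefix convention concrete, here is a minimal sketch that reads the severity letter out of a log line in the format shown above. The names `SEVERITY`, `LOG_LINE`, and `severity_of` are hypothetical helpers made up for this illustration, not part of TensorFlow, and the regular expression assumes the exact timestamp layout seen in these logs.

```python
import re

# Severity letters as they appear in TensorFlow's C++ log lines.
SEVERITY = {"I": "INFO", "W": "WARNING", "E": "ERROR", "F": "FATAL"}

# Assumed layout: "<date> <time>: <letter> <file>:<line>] <message>".
LOG_LINE = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+: ([IWEF]) ")


def severity_of(line):
    """Return the severity name of a log line, or None if it does not match."""
    match = LOG_LINE.match(line)
    return SEVERITY[match.group(1)] if match else None


# One of the NUMA lines from the question, split only for readability.
sample = (
    "2021-06-25 10:42:16.775681: I tensorflow/stream_executor/cuda/"
    "cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had "
    "negative value (-1), but there must be at least one NUMA node, so "
    "returning NUMA node zero"
)

print(severity_of(sample))  # INFO
```

Continuation lines such as the `pciBusID: ...` device description carry no timestamp or prefix, so the helper returns None for them.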

 >>> import tensorflow.compat.v1 as tf
>>> # Raise the threshold of TensorFlow's Python logger to ERROR
>>> tf.logging.set_verbosity(tf.logging.ERROR)
>>> a = tf.zeros(1)
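Note that `set_verbosity` adjusts TensorFlow's Python logger, while the lines in the question come from the C++ runtime (the `.cc` file names give this away), so they may still appear. A commonly used alternative is the `TF_CPP_MIN_LOG_LEVEL` environment variable, sketched below. The choice of level `"1"` is an assumption made here so that only the `I` lines are hidden and warnings and errors stay visible. The `try`/`except` around the import is only there so the snippet runs on a machine without TensorFlow.

```python
import os

# Must be set before TensorFlow is imported for the first time.
# "0" shows everything, "1" hides INFO, "2" also hides WARNING,
# "3" also hides ERROR. Level "1" is an assumption chosen for this sketch.
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "1"

try:
    import tensorflow as tf
except ImportError:
    tf = None  # TensorFlow is not installed; nothing further to demonstrate

if tf is not None:
    # With level "1" the "I ..." lines should no longer be printed here.
    a = tf.zeros(1)
    print(a)
```

The same effect can be had without touching the code by exporting the variable in the shell before starting Python.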