I am running the DIGITS Docker container, but it fails to recognize the host's GPU. I expect one GPU to be reported, yet the upper right corner of the DIGITS home page shows no GPUs, and during training DIGITS uses only the CPU.

I have a GeForce GT 640 graphics card:

$ nvidia-smi -L
GPU 0: GeForce GT 640 (UUID: GPU-f2583df9-404d-2564-d332-e7878a94d087)

$ lspci
...
VGA compatible controller: NVIDIA Corporation GK107 [GeForce GT 640 OEM] (rev a1)
...

GK107 is the code name of the GeForce GT 640 (GDDR5) (source: https://en.wikipedia.org/wiki/GeForce_600_series), which, according to https://developer.nvidia.com/cuda-gpus, has compute capability 3.5. That should be supported, since the compute capability has to be >2.1 according to https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#installing-on-ubuntu-and-debian.

This is my docker run command:

$ docker run --gpus all -d --name digits --rm -p 8888:5000 -v /home/userx/data:/data -v /home/userx/jobs:/workspace/jobs nvcr.io/nvidia/digits:20.12-tensorflow-py3

When I run nvidia-smi inside the Docker container, it does see the graphics card:

$ docker exec -it digits bash
root@e58b860504a9:/workspace# nvidia-smi
Fri Feb 12 23:33:17 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GT 640      Off  | 00000000:01:00.0 N/A |                  N/A |
| 40%   32C    P8    N/A /  N/A |    260MiB /  1992MiB |     N/A      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

I am using the latest versions of Docker and NVIDIA Docker:

$ docker --version
Docker version 20.10.3, build 48d30b5

$ nvidia-docker version
NVIDIA Docker: 2.5.0
Client: Docker Engine - Community
 Version:           20.10.3
 API version:       1.41
 Go version:        go1.13.15
 Git commit:        48d30b5
 Built:             Fri Jan 29 14:33:21 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.3
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.13.15
  Git commit:       46229ca
  Built:            Fri Jan 29 14:31:32 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.3
  GitCommit:        269548fa27e0089a8b8278fc4fc781d7f65a939b
 runc:
  Version:          1.0.0-rc92
  GitCommit:        ff819c7e9184c13b7c2607fe6c30ae19403a7aff
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

I am running Ubuntu 20.04:

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.2 LTS
Release:    20.04
Codename:   focal

I installed the most recent version of the NVIDIA driver for Ubuntu:

$ modinfo nvidia
filename:       /lib/modules/5.4.0-65-generic/updates/dkms/nvidia.ko
alias:          char-major-195-*
version:        460.32.03
supported:      external
license:        NVIDIA
srcversion:     9BFA7969070552C6938D8A8
alias:          pci:v000010DEd*sv*sd*bc03sc02i00*
alias:          pci:v000010DEd*sv*sd*bc03sc00i00*
depends:        
retpoline:      Y
name:           nvidia
vermagic:       5.4.0-65-generic SMP mod_unload 
...

Could anyone give me a hint as to why DIGITS running in Docker does not recognize my graphics card?

1 Answer

I found the answer. https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#platform-requirements specifies the compute capability requirements for the NVIDIA Container Toolkit, but the compute capability requirements for the DIGITS Docker image are specified separately for each image release. For digits:20.12, https://docs.nvidia.com/deeplearning/digits/digits-release-notes/rel_20-12.html#rel_20-12 states the following:

Release 20.12 supports CUDA compute capability 6.0 and higher.

My GPU, with compute capability 3.5, does not meet that requirement.
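
To make the comparison explicit, here is a minimal sketch of the check in bash. The helper name cc_at_least is hypothetical (it is not part of any NVIDIA tool), and both values are typed in by hand: 3.5 is what https://developer.nvidia.com/cuda-gpus lists for my card and 6.0 is the minimum quoted from the 20.12 release notes. It assumes both arguments are given in "major.minor" form.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit status 0) if compute capability $1
# is at least the required minimum $2. Both must look like "major.minor".
cc_at_least() {
    local have_major=${1%%.*} have_minor=${1#*.}
    local need_major=${2%%.*} need_minor=${2#*.}
    if (( have_major != need_major )); then
        # Major versions differ, so they alone decide the result.
        (( have_major > need_major ))
    else
        # Same major version: compare the minor part.
        (( have_minor >= need_minor ))
    fi
}

# Values from this thread, entered manually (assumption: not queried from the GPU).
gpu_cc=3.5       # GeForce GT 640 as listed on the CUDA GPUs page
required_cc=6.0  # minimum stated in the DIGITS 20.12 release notes

if cc_at_least "$gpu_cc" "$required_cc"; then
    echo "GPU meets the image requirement"
else
    echo "GPU is below the image requirement"
fi
```

With the values above this takes the else branch. The same check can be repeated against the release notes of an older DIGITS image by changing required_cc to whatever minimum those notes state.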