Issue: When nvidia-smi is called from within a GPU workspace, no processes are listed even though I have jobs currently running.
Root cause: This behaviour is not intentional; rather, it is a consequence of the design model for running GPUs in containers. The process list cannot be shown because the container runs with restricted privileges in its own isolated namespace on the underlying node, so nvidia-smi inside the container cannot resolve the host-side process IDs that are using the GPU.
If we run the same command on the node itself rather than from the workspace, we will indeed see a process list:
However, there is an important difference to note here. When there are truly no processes running on the GPU, the same command will explicitly show "No running processes found". When processes exist but cannot be listed from within the container, that message does not appear, as shown in the first example.
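This distinction can be checked programmatically. The sketch below is a minimal heuristic, not an official API: the function name and the sample output strings are illustrative assumptions, and only the "No running processes found" marker comes from actual nvidia-smi behaviour.

```python
def interpret_gpu_processes(smi_output: str) -> str:
    """Classify nvidia-smi output (illustrative helper, not part of nvidia-smi).

    - "idle": nvidia-smi explicitly confirmed there are no GPU processes.
    - "possibly hidden": the confirmation message is absent, so processes
      may exist but be invisible from inside the container's namespace.
    """
    if "No running processes found" in smi_output:
        return "idle"
    return "possibly hidden"


# Illustrative fragments of the process section of nvidia-smi output
idle_output = "|  No running processes found                         |"
container_output = "|  GPU   PID   Type   Process name    GPU Memory  |"

print(interpret_gpu_processes(idle_output))       # idle
print(interpret_gpu_processes(container_output))  # possibly hidden
```

In practice this means an empty process table from inside a workspace should not be read as "the GPU is idle" unless the explicit message is present.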