To use NVIDIA GPUs for Kubernetes pods, follow these steps:
1. Install the NVIDIA driver and the NVIDIA Container Toolkit on your Kubernetes nodes. The toolkit provides the runtime libraries and utilities that expose GPUs to containers (the GPU driver itself is installed separately on the host).
2. Deploy the NVIDIA device plugin in your cluster. This plugin lets Kubernetes discover the GPUs available on each node and advertise them as schedulable resources.
3. Create a Kubernetes pod that specifies the GPU resources required by the container. You can do this by adding a `resources` section to your pod's YAML file and specifying the `nvidia.com/gpu` resource type.
4. Launch the pod in your Kubernetes cluster. Kubernetes will use the NVIDIA device plugin to allocate a GPU to the container.
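On a typical setup, the first two steps look roughly like this (a sketch assuming Ubuntu nodes with containerd and the NVIDIA apt repository already configured; the device-plugin manifest URL and version tag are illustrative, so check the NVIDIA k8s-device-plugin releases for a current one):

```shell
# On each GPU node: install the NVIDIA Container Toolkit
# (assumes the NVIDIA driver and the toolkit's apt repository are already set up)
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit

# Configure the container runtime to use the NVIDIA runtime, then restart it
sudo nvidia-ctk runtime configure --runtime=containerd
sudo systemctl restart containerd

# From any machine with cluster access: deploy the NVIDIA device plugin DaemonSet
# (version tag is illustrative; pick a current release)
kubectl apply -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.14.0/nvidia-device-plugin.yml
```

Once the device plugin is running, each node's allocatable resources should include `nvidia.com/gpu`.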
Here's an example YAML file for a Kubernetes pod that uses an NVIDIA GPU:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  containers:
  - name: gpu-container
    image: nvidia/cuda:11.3.0-runtime-ubuntu20.04
    resources:
      limits:
        nvidia.com/gpu: 1
```
In this example, the container uses the `nvidia/cuda` image, and the `resources` section specifies that the container requires one NVIDIA GPU.
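One detail worth noting: GPUs are specified under `limits` only. Kubernetes does not allow overcommitting extended resources like `nvidia.com/gpu`, so if you also set `requests` for it, the value must equal the limit. A fragment illustrating this (the count of 2 is just an example):

```yaml
resources:
  limits:
    nvidia.com/gpu: 2   # requests, if given, must equal limits for GPUs
```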
Once you have launched the pod, you can verify that the GPU is being used by running a CUDA program inside the container.
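For example (a sketch assuming the `gpu-pod` above; `nvidia-smi` is typically injected into GPU containers from the host driver by the toolkit):

```shell
# Check that the container sees the GPU
kubectl exec gpu-pod -- nvidia-smi

# Or inspect how many GPUs a node advertises
# (<node-name> is a placeholder for one of your GPU nodes)
kubectl describe node <node-name> | grep nvidia.com/gpu
```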
Note that GPU support in Kubernetes requires some additional setup and configuration compared to running containers without GPUs. You'll need to ensure that your cluster has compatible NVIDIA GPUs and that the NVIDIA drivers and toolkit are installed on each node. Additionally, some Kubernetes distributions may have different requirements for GPU support, so be sure to consult your distribution's documentation for more information.