Error: ImagePullBackOff

I am trying to install the NVIDIA GPU Operator in my self-managed Kubernetes cluster.

```
helm install --wait --generate-name \
    -n gpu-operator --create-namespace \
    nvidia/gpu-operator \
    --version=v25.10.1
```

The `gpu-operator` pod fails to start and stays in `ImagePullBackOff`:

```
abhijithmallya@abhijithmallya-ROG-Strix-G512LI-G512LI:~$ kubectl get pods -n gpu-operator 
NAME                                                              READY   STATUS             RESTARTS   AGE
gpu-operator-1767031327-node-feature-discovery-gc-6fff94bftgpjn   1/1     Running            0          6m7s
gpu-operator-1767031327-node-feature-discovery-master-55b6z7f4g   1/1     Running            0          6m7s
gpu-operator-1767031327-node-feature-discovery-worker-dslpc       1/1     Running            0          6m7s
gpu-operator-6996bfc8df-82c66                                     0/1     ImagePullBackOff   0          6m7s
```


Output of `kubectl describe` for the failing pod:

```
abhijithmallya@abhijithmallya-ROG-Strix-G512LI-G512LI:~$ kubectl describe pod -n gpu-operator  gpu-operator-6996bfc8df-82c66 
Name:                 gpu-operator-6996bfc8df-82c66
Namespace:            gpu-operator
Priority:             2000001000
Priority Class Name:  system-node-critical
Service Account:      gpu-operator
Node:                 abhijithmallya-rog-strix-g512li-g512li/192.168.1.16
Start Time:           Mon, 29 Dec 2025 23:32:11 +0530
Labels:               app=gpu-operator
                      app.kubernetes.io/component=gpu-operator
                      app.kubernetes.io/instance=gpu-operator-1767031327
                      app.kubernetes.io/managed-by=Helm
                      app.kubernetes.io/name=gpu-operator
                      app.kubernetes.io/version=v24.9.2
                      helm.sh/chart=gpu-operator-v24.9.2
                      nvidia.com/gpu-driver-upgrade-drain.skip=true
                      pod-template-hash=6996bfc8df
Annotations:          cni.projectcalico.org/containerID: 9998b0f74cef936f7d7e95951d4011cb621420e046a9ea46e32d51a16d94a41a
                      cni.projectcalico.org/podIP: 172.16.111.146/32
                      cni.projectcalico.org/podIPs: 172.16.111.146/32
                      openshift.io/scc: restricted-readonly
Status:               Pending
IP:                   172.16.111.146
IPs:
  IP:           172.16.111.146
Controlled By:  ReplicaSet/gpu-operator-6996bfc8df
Containers:
  gpu-operator:
    Container ID:  
    Image:         nvcr.io/nvidia/gpu-operator:v24.9.2
    Image ID:      
    Port:          8080/TCP
    Host Port:     0/TCP
    Command:
      gpu-operator
    Args:
      --leader-elect
      --zap-time-encoding=epoch
      --zap-log-level=info
    State:          Waiting
      Reason:       ImagePullBackOff
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:     500m
      memory:  350Mi
    Requests:
      cpu:      200m
      memory:   100Mi
    Liveness:   http-get http://:8081/healthz delay=15s timeout=1s period=20s #success=1 #failure=3
    Readiness:  http-get http://:8081/readyz delay=5s timeout=1s period=10s #success=1 #failure=3
    Environment:
      WATCH_NAMESPACE:       
      OPERATOR_NAMESPACE:    gpu-operator (v1:metadata.namespace)
      DRIVER_MANAGER_IMAGE:  nvcr.io/nvidia/cloud-native/k8s-driver-manager:v0.7.0
    Mounts:
      /host-etc/os-release from host-os-release (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-dtskt (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   True 
  Initialized                 True 
  Ready                       False 
  ContainersReady             False 
  PodScheduled                True 
Volumes:
  host-os-release:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/os-release
    HostPathType:  
  kube-api-access-dtskt:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    Optional:                false
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node-role.kubernetes.io/control-plane:NoSchedule
                             node-role.kubernetes.io/master:NoSchedule
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                    From               Message
  ----     ------     ----                   ----               -------
  Normal   Scheduled  6m35s                  default-scheduler  Successfully assigned gpu-operator/gpu-operator-6996bfc8df-82c66 to abhijithmallya-rog-strix-g512li-g512li
  Normal   Pulling    2m48s (x5 over 6m35s)  kubelet            Pulling image "nvcr.io/nvidia/gpu-operator:v24.9.2"
  Warning  Failed     2m45s (x5 over 5m59s)  kubelet            Failed to pull image "nvcr.io/nvidia/gpu-operator:v24.9.2": unable to pull image or OCI artifact: pull image err: initializing source docker://nvcr.io/nvidia/gpu-operator:v24.9.2: Requesting bearer token: received unexpected HTTP status: 403 Forbidden; artifact err: pull artifact: initializing source docker://nvcr.io/nvidia/gpu-operator:v24.9.2: Requesting bearer token: received unexpected HTTP status: 403 Forbidden
  Warning  Failed     2m45s (x5 over 5m59s)  kubelet            Error: ErrImagePull
  Warning  Failed     52s (x20 over 5m59s)   kubelet            Error: ImagePullBackOff
  Normal   BackOff    38s (x21 over 5m59s)   kubelet            Back-off pulling image "nvcr.io/nvidia/gpu-operator:v24.9.2"
```
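
One detail in the output above may be relevant: the install command asks for chart `v25.10.1`, while the pod labels and the `Image:` field show `v24.9.2`. The sketch below only makes that comparison explicit using values copied from this report. It does not contact the cluster, and `image_tag` and `compare_versions` are hypothetical helper names introduced for illustration.

```shell
#!/bin/sh
# Values copied from the report above; nothing here talks to the cluster.
requested_version="v25.10.1"                      # from --version in the helm command
pod_image="nvcr.io/nvidia/gpu-operator:v24.9.2"   # from the Image: field of the pod

# image_tag (hypothetical helper): print everything after the last ':'.
# Assumes the image reference carries an explicit tag.
image_tag() {
    printf '%s\n' "${1##*:}"
}

# compare_versions (hypothetical helper): print a one-line verdict and
# return non-zero when the two versions differ.
compare_versions() {
    if [ "$1" = "$2" ]; then
        echo "match: $1"
    else
        echo "mismatch: requested $1, pod uses $2"
        return 1
    fi
}

# '|| true' keeps the script's exit status at zero even on a mismatch.
compare_versions "$requested_version" "$(image_tag "$pod_image")" || true
```

If the versions differ, `helm list -n gpu-operator` shows which chart version was actually deployed, and `helm repo update` refreshes the local chart index. Whether the mismatch is related to the `403 Forbidden` response is an open question, not something this output confirms.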


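Please help me resolve this issue. The kubelet cannot pull `nvcr.io/nvidia/gpu-operator:v24.9.2`: the registry returns `403 Forbidden` while the bearer token is being requested.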


Error: ImagePullBackOff #2016
