
Debugging CrashLoopBackOff in Kubernetes

Step-by-step guide to diagnose and fix CrashLoopBackOff errors. Covers log analysis, common causes, and practical solutions.

Paul Brissaud
2 min read
#troubleshooting #pods

Your pod is stuck in a restart loop. Every few seconds, Kubernetes tries to start it, it fails, and the cycle repeats. The status says CrashLoopBackOff, and you're not sure where to start. This is one of the most common—and frustrating—Kubernetes errors.

Quick Answer

CrashLoopBackOff means your container keeps crashing, and Kubernetes is backing off before retrying. To diagnose:

kubectl logs <pod-name> --previous

This shows logs from the last crashed container. The error message there usually points to the root cause.
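
If the pod runs more than one container, add -c to pick the one that crashed (the container name here is a placeholder):

kubectl logs <pod-name> -c <container-name> --previous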

What Does CrashLoopBackOff Mean?

CrashLoopBackOff is not an error itself—it's a state. It means:

  • Your container started
  • It crashed (exited with non-zero code)
  • Kubernetes restarted it
  • It crashed again
  • Kubernetes is now waiting longer between restarts (the "backoff")

The backoff delay increases exponentially: 10s, 20s, 40s... up to 5 minutes. This prevents a broken container from consuming all cluster resources.
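
To watch the backoff in action, you can follow the pod's events as the delay grows (the pod name is a placeholder):

kubectl get events --field-selector involvedObject.name=<pod-name> -w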

    Common Causes

    The causes you'll run into most often: application errors, missing configuration, failing liveness probes, out-of-memory kills (OOMKilled), a wrong command or entrypoint, and unavailable dependencies. Each has a dedicated fix in the Solutions by Cause section below.

    Step-by-Step Troubleshooting

    Step 1: Get the Pod Status

    kubectl get pods

    Look for the RESTARTS column. A high number confirms the crash loop.

    NAME                     READY   STATUS             RESTARTS   AGE
    my-app-7d4b8c6f9-x2k4j   0/1     CrashLoopBackOff   5          3m
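
    To spot every crash-looping pod in the cluster at once, a rough status filter works (grep on the STATUS text is crude but practical):

    kubectl get pods -A | grep CrashLoopBackOff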

    Step 2: Check the Logs

    This is usually where you find the answer:

    # Current container logs (if it's running)
    kubectl logs <pod-name>
    
    # Previous container logs (after crash)
    kubectl logs <pod-name> --previous
    
    # Logs from a specific time range
    kubectl logs <pod-name> --since=1h
    
    # All containers in a multi-container pod
    kubectl logs <pod-name> --all-containers

    Look for:

  • Stack traces
  • "Error:", "Fatal:", "Exception" messages
  • Connection failures
  • Missing file/config errors
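
    To scan the previous container's logs for these quickly, a rough keyword grep can help (the pattern below is just a starting point):

    kubectl logs <pod-name> --previous | grep -iE 'error|fatal|exception'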

    Step 3: Describe the Pod

    Get detailed information about the pod's state:

    kubectl describe pod <pod-name>

    Key sections to examine:

    Last State - Shows why the container exited:

    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Mon, 27 Jan 2026 10:00:00 +0000
      Finished:     Mon, 27 Jan 2026 10:00:05 +0000

    Events - Shows recent activity:

    Events:
      Type     Reason     Age                From               Message
      ----     ------     ----               ----               -------
      Warning  BackOff    1m (x5 over 3m)    kubelet            Back-off restarting failed container

    Step 4: Check Exit Codes

    Common exit codes and their meanings:

    Exit Code   Meaning
    0           Process finished on its own (a container that keeps completing still triggers backoff)
    1           General application error
    126         Command found but not executable (often a permissions issue)
    127         Command not found
    137         Killed by SIGKILL (128 + 9), usually OOMKilled
    139         Segmentation fault (SIGSEGV, 128 + 11)
    143         Terminated by SIGTERM (128 + 15)

    Solutions by Cause

    Cause A: Application Error

    Symptoms: Logs show stack traces, unhandled exceptions, or application-specific errors.

    Fix: This is a code issue. Debug locally, fix the bug, rebuild and redeploy.

    # Test locally first
    docker run -it <your-image> sh
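
    If the image's default entrypoint is what crashes, overriding it gives you a shell to poke around in (assuming the image ships one):

    docker run -it --entrypoint sh <your-image>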

    Cause B: Missing Configuration

    Symptoms: Logs show "environment variable not set" or "config file not found".

    Fix: Ensure ConfigMaps and Secrets are properly mounted:

    env:
    - name: DATABASE_URL
      valueFrom:
        secretKeyRef:
          name: db-credentials
          key: url

    Verify the Secret exists:

    kubectl get secret db-credentials
    kubectl get secret db-credentials -o json | jq '.data | map_values(@base64d)'
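
    If the Secret doesn't exist, creating it unblocks the pod on its next restart. The connection string below is only a placeholder; use your real credentials:

    kubectl create secret generic db-credentials \
      --from-literal=url='postgres://user:pass@postgres-svc:5432/app'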

    Cause C: Liveness Probe Failing

    Symptoms: kubectl describe pod shows repeated liveness probe failures before the crash.

    Warning  Unhealthy  10s (x3 over 30s)  kubelet  Liveness probe failed: connection refused

    Fix: The probe might be checking too early or too aggressively. Adjust the configuration:

    livenessProbe:
      httpGet:
        path: /health
        port: 8080
      initialDelaySeconds: 30  # Give app time to start
      periodSeconds: 10
      failureThreshold: 3      # Allow some failures

    Important: If your app takes time to initialize, consider using a startupProbe instead of increasing initialDelaySeconds:

    startupProbe:
      httpGet:
        path: /health
        port: 8080
      failureThreshold: 30
      periodSeconds: 10
    # This gives up to 5 minutes (30 x 10s) for startup

    Cause D: OOMKilled

    Symptoms: Last state shows Reason: OOMKilled, exit code 137.

    Fix: Increase memory limits:

    resources:
      limits:
        memory: "512Mi"  # Increase this
      requests:
        memory: "256Mi"

    See our detailed guide: How to Fix Kubernetes OOMKilled Pods

    Cause E: Wrong Command or Entrypoint

    Symptoms: Container exits immediately with code 1 or 127. Logs are empty or show "command not found".

    Fix: Verify the command in your deployment matches what the image expects:

    containers:
    - name: my-app
      image: my-app:v1
      command: ["/app/start.sh"]  # Make sure this exists in the image
      args: ["--config", "/etc/config/app.yaml"]

    Test interactively:

    kubectl run debug --rm -it --image=<your-image> -- sh
    # Then manually run your command

    Cause F: Missing Dependencies

    Symptoms: Logs show connection errors to databases, APIs, or other services.

    Fix:

  • Verify the dependency is running: kubectl get pods -A | grep <service>
  • Check network policies aren't blocking traffic
  • Verify service DNS resolution: kubectl run debug --rm -it --image=busybox -- nslookup <service-name>
  • Use init containers to wait for dependencies:

    initContainers:
    - name: wait-for-db
      image: busybox
      command: ['sh', '-c', 'until nc -z postgres-svc 5432; do sleep 2; done']
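
    If the dependency's Service exists but connections still fail, it may have no healthy backends. Checking its endpoints is a quick confirmation (the service name is a placeholder):

    kubectl get endpoints <service-name>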

    Debugging Tips

    Get Termination Details Directly

    Quickly check why the last container terminated:

    kubectl get pod <pod-name> -o jsonpath='{.status.containerStatuses[0].lastState.terminated}' | jq

    Run a Debug Container

    If the container crashes too fast to debug:

    # Override the command to keep it running
    kubectl run debug --rm -it --image=<your-image> --command -- sleep infinity
    
    # Then exec into it
    kubectl exec -it debug -- sh

    Use Ephemeral Debug Containers (K8s 1.25+)

    Attach a debug container to a running or crashed pod without modifying it:

    kubectl debug <pod-name> -it --image=busybox --target=<container-name>

    This is useful when the original image doesn't have debugging tools.
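
    If the container crashes before you can attach, another option is to debug a copy of the pod with its command overridden; unlike a fresh kubectl run, the copy keeps the original spec, volumes, and environment (names here are placeholders):

    kubectl debug <pod-name> -it --copy-to=<pod-name>-debug --container=<container-name> -- sh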

    Check Events Cluster-Wide

    kubectl get events --sort-by='.lastTimestamp' | grep <pod-name>

    Watch the Restart Cycle

    kubectl get pods -w

    Practice This Scenario

    Start the Probes Drift Challenge →

    In this hands-on challenge, you'll:

  • Investigate why a pod keeps getting killed during startup
  • Understand the relationship between probes and container lifecycle
  • Fix the configuration to achieve stable operation

    Prevention Tips

  • Always check logs locally first - Run docker run before deploying to Kubernetes
  • Use startup probes for slow-starting apps - Don't abuse initialDelaySeconds
  • Set appropriate resource limits - Monitor actual usage with kubectl top pods
  • Handle signals gracefully - Implement proper SIGTERM handling (see the sketch after this list)
  • Add health endpoints - Make debugging easier with /health and /ready endpoints
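
    For the SIGTERM point above, a minimal shell sketch of a graceful shutdown handler (purely illustrative; your runtime likely has a native equivalent):

    #!/bin/sh
    # Trap SIGTERM so the container cleans up and exits 0 instead of being force-killed
    trap 'echo "SIGTERM received, shutting down"; exit 0' TERM
    # Placeholder for the real workload
    while true; do
      sleep 1
    done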

    Written by

    Paul Brissaud

    Paul Brissaud is a DevOps / Platform Engineer and the creator of Kubeasy. He believes Kubernetes education is often too theoretical and that real understanding comes from hands-on, failure-driven learning.
