K8S 08. Why requests, limits, and probes should be reviewed together

2026-06-11

Summary

In Kubernetes, requests and limits define resource boundaries for scheduling and runtime behavior. Probes tell Kubernetes whether a container can receive traffic, needs to be restarted, or needs more startup time.

The conclusion of this post is that resource settings and probes should not be reviewed separately. A request that is too low makes scheduling too optimistic, a limit that is too tight can slow the application down or get it killed, and an aggressive liveness probe can turn that slowdown into a restart loop.

Document Information

Written on: 2026-04-24
Verification date: 2026-04-24
Document type: analysis
Test environment: verified in the author’s separate practice environment. OS, node details, and cluster topology are not fixed in this post.
Test version: Kubernetes official documentation checked on 2026-04-24. The documentation site displayed v1.36 links.
Source level: Kubernetes official documentation.
Note: production sizing should be based on application profiles and real metrics.

Problem Definition

Early manifests often mix these issues:

Deploying Pods without resource requests or limits.
Treating requests and limits as the same concept.
Using the same endpoint for readiness and liveness.
Starting liveness checks too early for slow-starting applications.
Failing to separate application bugs, resource pressure, and probe misconfiguration when restarts happen.
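
To make the list concrete, here is a hypothetical manifest that combines several of these issues. It is a sketch for illustration only: the Pod name, image, paths, and probe values are assumptions, not taken from a real workload.

```yaml
# Hypothetical anti-pattern. Do not copy; every name and value is illustrative.
apiVersion: v1
kind: Pod
metadata:
  name: app-antipattern
spec:
  containers:
    - name: app
      image: example/app:1.0.0
      ports:
        - containerPort: 8080
      # No resources block: the scheduler has no request to account for, and
      # the Pod is classed as BestEffort (assuming no namespace defaults apply).
      readinessProbe:
        httpGet:
          path: /healthz
          port: 8080
      livenessProbe:
        # Same endpoint as readiness, so "not ready for traffic" and
        # "needs a restart" cannot be told apart.
        httpGet:
          path: /healthz
          port: 8080
        # No startupProbe and no initialDelaySeconds: liveness checks begin
        # almost immediately, and a single failure triggers a restart.
        periodSeconds: 5
        failureThreshold: 1
```

When a Pod like this restarts, the manifest alone gives no way to tell whether the cause is an application bug, resource pressure, or the probe settings.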

Verified Facts

According to Kubernetes Resource Management documentation, container resources can define requests and limits. Evidence: Resource Management for Pods and Containers
According to the same documentation, the scheduler uses resource requests when choosing a node for a Pod. Evidence: Resource Management for Pods and Containers
According to Kubernetes Pod QoS documentation, Pods are assigned Guaranteed, Burstable, or BestEffort QoS classes based on resource settings. Evidence: Pod Quality of Service Classes
According to Kubernetes probe documentation, the kubelet can periodically diagnose container health with probes. Evidence: Liveness, Readiness, and Startup Probes
According to the same documentation, when a startup probe is configured, liveness and readiness probes do not run until the startup probe succeeds. Evidence: Liveness, Readiness, and Startup Probes
According to the same documentation, when a readiness probe fails, the Pod IP is removed from ready endpoints for matching Services. Evidence: Liveness, Readiness, and Startup Probes

A basic example:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      containers:
        - name: app
          image: example/app:1.0.0
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "500m"
              memory: "256Mi"
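          # Allows up to 30 x 2s = 60s for startup. Liveness and readiness
          # probes do not run until this probe succeeds.
          startupProbe:
            httpGet:
              path: /healthz
              port: 8080
            failureThreshold: 30
            periodSeconds: 2
          # Failure removes the Pod from Service endpoints; it does not restart the container.
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            periodSeconds: 10
          # After 3 consecutive failures (about 30s), the kubelet restarts the container.
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
            periodSeconds: 10
            failureThreshold: 3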
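
The example above sets requests lower than limits, so the Pod falls into the Burstable QoS class. For comparison, the following fragment sketches what the resources block might look like for the Guaranteed class, which requires every container in the Pod to set CPU and memory requests equal to their limits. The values are placeholders chosen to match the example, not recommendations.

```yaml
# Fragment of a container spec, not a complete manifest.
resources:
  requests:
    cpu: "500m"
    memory: "256Mi"
  limits:
    cpu: "500m"      # equal to the CPU request
    memory: "256Mi"  # equal to the memory request
```

Omitting the resources block for every container in the Pod gives the BestEffort class instead.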

Directly Reproduced Results

Direct reproduction: I ran the main commands and configuration flow in my own practice environment.
Confirmed result: I checked the roles of requests, limits, QoS classes, and startup/readiness/liveness probes against the official documentation.
Directly verified items: kubectl describe pod, kubectl get events, metrics collection, behavior after exceeding limits, and endpoint removal after readiness failure.

Interpretation / Opinion

For operations, I usually look at requests first. Requests are the signal the scheduler uses when choosing a node. If they are too low, the cluster can look roomier than it really is. If they are too high, fewer nodes can accept the Pod and more Pods can stay Pending.

Limits are guardrails, but they are also performance settings. A container that grows past its memory limit can be terminated (OOMKilled). A CPU limit is enforced by throttling, which can reduce throughput and raise latency, including the latency of probe responses. It is risky to treat limits as values that should simply be as low as possible.

Probes are a contract that lets Kubernetes interpret application health. Readiness answers whether traffic should be sent to the container, liveness answers whether the container needs a restart, and startup answers whether Kubernetes should keep waiting during initialization. An overly aggressive liveness probe can turn a temporary delay into restarts, and restarts can create more delay.
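
As a rough sketch of how the three roles can be separated and made less aggressive, the fragment below gives each probe its own time budget. The paths and port follow the basic example. The numbers are illustrative assumptions and should be derived from measured startup time and response latency.

```yaml
# Fragment of a container spec, not a complete manifest.
startupProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 5
  failureThreshold: 24   # up to about 24 x 5s = 120s for initialization
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  periodSeconds: 5
  failureThreshold: 2    # stop routing traffic after about 10s of failures
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 10
  timeoutSeconds: 3      # default is 1s; a CPU-throttled app may answer slowly
  failureThreshold: 6    # restart only after about 60s of consecutive failures
```

The design idea is that readiness reacts quickly because removing a Pod from endpoints is cheap to undo, while liveness reacts slowly because a restart is expensive and can add to the delay it was meant to fix.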

Limits and Exceptions

The basic resource and probe flow was checked in the author’s practice environment. The CPU and memory values in the example are placeholders, not recommendations. Real values should be based on metrics, peak traffic, startup time, garbage collection behavior, and external dependency latency.

Batch jobs, queue workers, databases, JVM applications, Go services, and Node.js services can require different request/limit and probe strategies.

References

Kubernetes Docs, Resource Management for Pods and Containers
Kubernetes Docs, Pod Quality of Service Classes
Kubernetes Docs, Liveness, Readiness, and Startup Probes
