API-initiated Eviction
API-initiated eviction is the process by which you use the Eviction API
to create an Eviction object that triggers graceful pod termination.
You can request eviction by calling the Eviction API directly, or programmatically
using a client of the API server, like the kubectl drain command. This
creates an Eviction object, which causes the API server to terminate the Pod.
API-initiated evictions respect your configured PodDisruptionBudgets
and terminationGracePeriodSeconds.
Using the API to create an Eviction object for a Pod is like performing a
policy-controlled DELETE operation
on the Pod.
Calling the Eviction API
You can use a Kubernetes language client
to access the Kubernetes API and create an Eviction object. To do this, you
POST the attempted operation, similar to the following example:
Note:
policy/v1 Eviction is available in v1.22+. Use policy/v1beta1 with prior releases.
{
  "apiVersion": "policy/v1",
  "kind": "Eviction",
  "metadata": {
    "name": "quux",
    "namespace": "default"
  }
}
Note:
policy/v1beta1 Eviction is deprecated in v1.22 in favor of policy/v1.
{
  "apiVersion": "policy/v1beta1",
  "kind": "Eviction",
  "metadata": {
    "name": "quux",
    "namespace": "default"
  }
}
Alternatively, you can attempt an eviction operation by accessing the API using
curl or wget, similar to the following example:
curl -v -H 'Content-type: application/json' https://your-cluster-api-endpoint.example/api/v1/namespaces/default/pods/quux/eviction -d @eviction.json
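If you generate the request from a script, the body and the subresource path can be assembled from the Pod name and namespace. The following is a minimal sketch using only the Python standard library; the helper names `build_eviction` and `eviction_path` are illustrative and not part of any Kubernetes client.

```python
import json


def build_eviction(name: str, namespace: str = "default",
                   api_version: str = "policy/v1") -> dict:
    """Return an Eviction request body for the named Pod (hypothetical helper)."""
    return {
        "apiVersion": api_version,
        "kind": "Eviction",
        "metadata": {"name": name, "namespace": namespace},
    }


def eviction_path(name: str, namespace: str = "default") -> str:
    """Return the API path of the Pod's eviction subresource (hypothetical helper)."""
    return f"/api/v1/namespaces/{namespace}/pods/{name}/eviction"


if __name__ == "__main__":
    # Print the JSON that could be saved as eviction.json for the curl example.
    print(json.dumps(build_eviction("quux"), indent=2))
    print(eviction_path("quux"))
```

Saving the printed JSON as eviction.json gives the file that the curl example posts with `-d @eviction.json`.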
How API-initiated eviction works
When you request an eviction using the API, the API server performs admission checks and responds in one of the following ways:
- 200 OK: the eviction is allowed, the Eviction subresource is created, and the Pod is deleted, similar to sending a DELETE request to the Pod URL.
- 429 Too Many Requests: the eviction is not currently allowed because of the configured PodDisruptionBudget. You may be able to attempt the eviction again later. You might also see this response because of API rate limiting.
- 500 Internal Server Error: the eviction is not allowed because there is a misconfiguration, like if multiple PodDisruptionBudgets reference the same Pod.
If the Pod you want to evict isn't part of a workload that has a
PodDisruptionBudget, the API server always returns 200 OK and allows the
eviction.
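A client might act on these three responses as sketched below. This is an illustration under stated assumptions, not how kubectl drain is implemented: `send` is a hypothetical callable that posts the Eviction and returns the HTTP status code, and the attempt count and delay are arbitrary defaults.

```python
import time
from typing import Callable


def classify(status: int) -> str:
    """Map an Eviction API status code to a client-side outcome."""
    if status == 200:
        return "evicted"        # eviction allowed, Pod is being deleted
    if status == 429:
        return "retry"          # blocked by a PodDisruptionBudget or rate limiting
    if status == 500:
        return "misconfigured"  # for example, several budgets match the same Pod
    return "unexpected"


def evict_with_retry(send: Callable[[], int], attempts: int = 5,
                     delay: float = 5.0,
                     sleep: Callable[[float], None] = time.sleep) -> str:
    """Call send() until it stops returning 429 or the attempts run out."""
    for attempt in range(attempts):
        outcome = classify(send())
        if outcome != "retry":
            return outcome
        if attempt < attempts - 1:
            sleep(delay)  # wait before asking again
    return "blocked"      # still receiving 429 after all attempts
```

Only 429 is retried here, because a 500 response indicates a misconfiguration that will not clear until someone fixes it.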
If the API server allows the eviction, the Pod is deleted as follows:
- The Pod resource in the API server is updated with a deletion timestamp, after which the API server considers the Pod resource to be terminated. The Pod resource is also marked with the configured grace period.
- The kubelet on the node where the local Pod is running notices that the Pod resource is marked for termination and starts to gracefully shut down the local Pod.
- While the kubelet is shutting the Pod down, the control plane removes the Pod from EndpointSlice objects. As a result, controllers no longer consider the Pod as a valid object.
- After the grace period for the Pod expires, the kubelet forcefully terminates the local Pod.
- The kubelet tells the API server to remove the Pod resource.
- The API server deletes the Pod resource.
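You can observe the first step of this sequence by reading the Pod's metadata. The sketch below assumes you already have the Pod as a parsed JSON object (for example, from a GET on the Pod URL) and checks the `metadata.deletionTimestamp` and `metadata.deletionGracePeriodSeconds` fields; the function name is illustrative.

```python
from typing import Optional, Tuple


def termination_state(pod: dict) -> Tuple[bool, Optional[int]]:
    """Return (is_terminating, grace_period_seconds) for a Pod object.

    A Pod counts as terminating once its metadata carries a deletion
    timestamp; the grace period is None if the field is absent.
    """
    metadata = pod.get("metadata", {})
    terminating = metadata.get("deletionTimestamp") is not None
    grace = metadata.get("deletionGracePeriodSeconds") if terminating else None
    return terminating, grace
```

A Pod that still exists with a deletion timestamp is somewhere between the first and last steps of the list above.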
Troubleshooting stuck evictions
In some cases, your applications may enter a broken state, where the Eviction
API will only return 429 or 500 responses until you intervene. This can
happen if, for example, a ReplicaSet creates pods for your application but new
pods do not enter a Ready state. You may also notice this behavior in cases
where the last evicted Pod had a long termination grace period.
If you notice stuck evictions, try one of the following solutions:
- Abort or pause the automated operation causing the issue. Investigate the stuck application before you restart the operation.
- Wait a while, then directly delete the Pod from your cluster control plane instead of using the Eviction API.
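When investigating a stuck application, the status of the matching PodDisruptionBudget often shows why evictions keep returning 429. The following sketch assumes you have fetched the budget as a parsed JSON object and reads its `status.disruptionsAllowed`, `status.currentHealthy`, and `status.desiredHealthy` fields; the function name and the wording of the messages are illustrative.

```python
def describe_budget(pdb: dict) -> str:
    """Summarize whether a PodDisruptionBudget currently permits an eviction."""
    status = pdb.get("status", {})
    allowed = status.get("disruptionsAllowed", 0)
    current = status.get("currentHealthy", 0)
    desired = status.get("desiredHealthy", 0)

    if allowed > 0:
        return f"allowed: {allowed} disruption(s) available"
    if current < desired:
        # Typical of replacement Pods that never become Ready.
        return f"blocked: {current} healthy Pod(s), {desired} required"
    return "blocked: no disruptions available right now"
```

A persistent "blocked" result with fewer healthy Pods than required matches the case described above, where new Pods are created but do not enter a Ready state.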
What's next
- Learn how to protect your applications with a Pod Disruption Budget.
- Learn about Node-pressure Eviction.
- Learn about Pod Priority and Preemption.