Kubernetes Job
A Job is a workload resource in Kubernetes that represents a one-time, finite task that runs to completion and then terminates. Unlike ordinary Pods or Deployments, which are designed for long-running services that should remain available indefinitely, a Job guarantees that a specified number of Pods terminate successfully with an exit code of zero.
You can also read Kind: A Practical Guide to Local Kubernetes Clusters
The Foundation of the Job Object
The Job object's purpose is to create Pods and ensure that a specified number of them finish successfully. Central to Kubernetes architecture is a reconciliation loop: the Job controller constantly compares the cluster's actual state, notably the number of completed Pods, against the "desired state" declared in the Job manifest. If a Job requires three successful completions and a node issue causes a Pod to fail, the controller creates a replacement Pod to reach the desired state.
A Job's Pod template must set `restartPolicy` to either `OnFailure` or `Never`. This is a crucial difference from long-running services, which usually use `Always`. `OnFailure` is typically advised, since it tells the local kubelet to restart the container inside the existing Pod when it fails, which is more resource-efficient. `Never`, by contrast, makes the Job controller create a brand-new Pod for each failure, which can fill the cluster with failed Pods that need to be cleaned up manually.
Here is an example Job config. It computes π to 2000 places and prints it out. It takes around 10s to complete.
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  template:
    spec:
      containers:
      - name: pi
        image: perl:5.34.0
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
  backoffLimit: 4
```
You can run the example with this command:

```shell
kubectl apply -f https://kubernetes.io/examples/controllers/job.yaml
```

The output is similar to this:

```
job.batch/pi created
```
To view completed Pods of a Job, use `kubectl get pods`.
To list all the Pods that belong to a Job in a machine-readable form, you can use a command like this:

```shell
pods=$(kubectl get pods --selector=batch.kubernetes.io/job-name=pi --output=jsonpath='{.items[*].metadata.name}')
echo $pods
```
The output is similar to this:

```
pi-5rwd7
```
Here, the selector is the same as the selector for the Job. The `--output=jsonpath` option specifies a JSONPath expression that extracts just the name of each Pod in the returned list.
View the standard output of one of the Pods:

```shell
kubectl logs $pods
```
Another way to view the logs of a Job:

```shell
kubectl logs jobs/pi
```
You can also read How to Get Started Kubernetes? Explained Briefly
Core Job Patterns
By combining two key fields, `completions` (the total number of successful runs required) and `parallelism` (the maximum number of Pods that may run simultaneously), Kubernetes enables a number of sophisticated processing patterns.
- One-Shot Jobs: In this simplest design, a single Pod runs exactly once to successful completion. This is ideal for tasks like database migrations, which should run only once to update a schema before a new application version is rolled out.
- Parallel Jobs with Fixed Completions: Multiple Pods process a fixed number of tasks in parallel until all are done. For example, a Job could run many generator Pods to produce 100 report segments while limiting cluster load with a `parallelism` value.
- Work Queues: Jobs can also process items from a centralized work queue. In this approach, the Job manages a pool of worker Pods and typically leaves the `completions` field unset. As Pods successfully process items and exit with a zero code, the Job controller winds the workers down until the queue is empty.
- Indexed Jobs: In this advanced completion mode (stable in v1.24), each Pod is assigned a completion index ranging from 0 to `.spec.completions - 1`. The containerized task can read this index through environment variables or the Pod hostname, enabling static work assignment and Pod-to-Pod communication.
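As a sketch, a fixed-completion Job that also uses the Indexed completion mode could look like the following manifest (the Job name, image, and command are illustrative placeholders):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: report-segments   # hypothetical name
spec:
  completions: 100        # 100 successful Pods required in total
  parallelism: 5          # at most 5 Pods run at any one time
  completionMode: Indexed # each Pod receives a unique completion index
  template:
    spec:
      containers:
      - name: worker
        image: busybox:1.36
        # the index is exposed via the JOB_COMPLETION_INDEX env var
        command: ["sh", "-c", "echo processing segment $JOB_COMPLETION_INDEX"]
      restartPolicy: Never
```

With `completionMode: Indexed`, each worker can derive its slice of the work (e.g., which report segment to generate) from its index instead of coordinating through a queue.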
Setting Up and Configuring Kubernetes Jobs
Setting up and configuring a Kubernetes Job involves several steps: defining the run-to-completion task, executing it, and verifying the result. The following walkthrough, based on the references given, describes the procedure:
Create a YAML Configuration File
The first step is to create a manifest file (e.g., `job-test.yaml`) that defines the Job's desired state.
Define the Job Specification
The YAML file must include the following required fields:
- apiVersion: Set to `batch/v1`.
- kind: Set to `Job`.
- metadata: Includes the name of the Job (maximum 63 characters, following DNS label rules for best compatibility).
- spec.template: The only required field within `.spec`; it defines the Pod template the Job will use to perform its task.
Configure Critical Parameters
Within the Job’s .spec, you must configure specific attributes to control its behavior:
- Restart Policy: You must specify a `restartPolicy` of either `OnFailure` (restarts the container within the existing Pod) or `Never` (creates a new Pod for every failure). The default `Always` policy used for standard Pods is not permitted for Jobs.
- Completions: Define how many successful Pod terminations are required for the Job to be considered complete.
- Parallelism: Specify the maximum number of Pods that should run concurrently at any given time.
- Backoff Limit: Set the number of retries (`.spec.backoffLimit`) allowed before the Job is marked as failed (default is 6).
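Putting these parameters together, a minimal `job-test.yaml` might look like the following sketch (the Job name, image, and command are illustrative, not prescribed):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: job-test            # must follow DNS label rules, max 63 chars
spec:
  completions: 3            # three successful Pod terminations required
  parallelism: 2            # at most two Pods run at the same time
  backoffLimit: 4           # mark the Job as failed after four retries
  template:
    spec:
      containers:
      - name: worker        # hypothetical container name
        image: busybox:1.36
        command: ["sh", "-c", "echo done"]
      restartPolicy: OnFailure
```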
Apply the Job
Execute the Job by applying the configuration file to your cluster using the following command:

```shell
kubectl apply -f job-test.yaml
```
Verify Job Execution
Once applied, monitor the Job’s progress and status:
- Check Status: Use `kubectl get jobs` to see the number of completions and active Pods.
- Describe Details: Use `kubectl describe job <job-name>` to view detailed information, including events and failure reasons.
- List Pods: To identify the specific Pods created by the Job, use `kubectl get pods --selector=batch.kubernetes.io/job-name=<job-name>`.
Inspect Diagnostic Logs
Because Pods are not deleted automatically upon completion, you can inspect their standard output for results or debugging:

```shell
kubectl logs <pod-name>
```
Manage Cleanup and Resource Retention
Finished Jobs should be cleaned up, because completed objects continue to place load on the API server:
- Manual Deletion: Use `kubectl delete job <job-name>` or `kubectl delete -f job-test.yaml` to remove the Job and its associated Pods.
- Automatic Cleanup: Configure the TTL (time to live) mechanism by setting `.spec.ttlSecondsAfterFinished`. This allows the cluster to automatically delete finished Jobs after a specified duration.
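As a sketch, adding TTL-based cleanup to the earlier pi example requires only one extra field under `.spec`:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-with-ttl
spec:
  ttlSecondsAfterFinished: 100  # delete the Job ~100s after it finishes
  template:
    spec:
      containers:
      - name: pi
        image: perl:5.34.0
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
```

Once the Job completes or fails, the TTL controller deletes the Job and its dependent Pods after the configured duration.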
You can also read What is the Importance of Kubernetes & Why Kubernetes?
Handling Failures and Termination
Failures in distributed systems can stem from hardware problems or application defects. Kubernetes offers several mechanisms for handling them:
- Backoff Limits: `.spec.backoffLimit` (default 6) sets the number of retries before a Job is considered failed.
- Pod Failure Policy: This feature (stable in v1.31) allows fine-grained control, such as failing the Job immediately if a Pod exits with a specified code that indicates a non-retriable software issue.
- Active Deadlines: Administrators can cap a Job's total runtime by setting `.spec.activeDeadlineSeconds`. Once the deadline is reached, the Job is marked as failed with the reason `DeadlineExceeded`, and all active Pods are terminated.
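A sketch combining these failure controls in one manifest (exit code 42 is an arbitrary stand-in for a non-retriable application error):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: job-pod-failure-policy  # hypothetical name
spec:
  activeDeadlineSeconds: 600    # fail the Job if it runs longer than 10 minutes
  backoffLimit: 6               # normal retry budget for transient failures
  template:
    spec:
      containers:
      - name: main
        image: busybox:1.36
        command: ["sh", "-c", "exit 42"]  # simulate a non-retriable failure
      restartPolicy: Never
  podFailurePolicy:
    rules:
    - action: FailJob           # stop retrying as soon as this code appears
      onExitCodes:
        containerName: main
        operator: In
        values: [42]
```

The `podFailurePolicy` rule short-circuits the backoff limit: a transient error still gets retried, but a known-fatal exit code fails the Job immediately.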
Administrators can review logs for diagnostic purposes because completed Jobs are typically not deleted automatically. However, the TTL (time to live) mechanism (`.spec.ttlSecondsAfterFinished`) can be used to automatically remove finished Jobs after a predetermined amount of time, avoiding resource accumulation and strain on the API server.
Use Cases For Kubernetes Jobs
- Running data backup and restoration tasks on a regular schedule.
- Carrying out data analysis or processing operations in a distributed way.
- Performing batch work such as image processing and report generation.
- Completing complex or resource-intensive tasks that call for parallelism.
You can also read What is Container Orchestration in Kubernetes?
