How to Optimize Kubernetes Pod Performance with Pod-Level Resource Managers (Alpha)

2026-05-04 18:05:11

Introduction

Kubernetes v1.36 introduces a powerful alpha feature called Pod-Level Resource Managers that transforms how you allocate CPU and memory for performance-sensitive workloads. Instead of forcing every container in a pod to require exclusive, integer-based resources, this enhancement lets you define a pod-level resource budget alongside flexible per-container allocations. The result? You can achieve NUMA alignment and Guaranteed QoS for your main application while letting lightweight sidecars share a common pool—saving cores and reducing waste.

This guide walks you through enabling and using Pod-Level Resource Managers, from prerequisites to real-world configuration examples. By the end, you’ll be able to deploy pods that combine exclusive resources for critical containers with shared resources for helpers, all without losing deterministic performance.

What You Need

Before you begin, ensure your environment meets these prerequisites:

- A Kubernetes v1.36 (or later) cluster, since Pod-Level Resource Managers first ships as alpha in that release
- Administrative access to the kubelet configuration on your nodes, plus the ability to restart the kubelet
- kubectl configured against the cluster
- A non-production cluster for testing, since this feature is still alpha

Step-by-Step Guide

Step 1: Determine If Pod-Level Resource Managers Are Right for You

Pod-Level Resource Managers are ideal when your pod contains at least one container that needs exclusive, NUMA-aligned resources (e.g., a database engine) and others that can tolerate shared resources (e.g., a logging sidecar). If all containers in your pod demand exclusive resources or you don’t require NUMA alignment, the existing per-container model may suffice.

Tip: This feature is still alpha. Test it in a non-production cluster first.

Step 2: Enable Feature Gates on the Kubelet

Edit the kubelet configuration (usually /var/lib/kubelet/config.yaml on the node) to activate the two required feature gates:

featureGates:
  PodLevelResourceManagers: true
  PodLevelResources: true

Restart the kubelet service (systemctl restart kubelet). Verify the gates are active by checking the kubelet logs or using kubectl describe node (look for “PodLevelResourceManagers” in the feature gates list).
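As a sanity check, you can confirm that both gates appear in the config file before restarting. The sketch below is purely illustrative (not part of Kubernetes or kubectl); it assumes the simple "Name: true" layout shown above, whereas a robust check would parse the full YAML.

```python
# Illustrative check: confirm both required feature gates are set to
# "true" in a kubelet config snippet. Assumes the flat key: value
# layout shown above; a real check should use a YAML parser.
import re

KUBELET_CONFIG = """\
featureGates:
  PodLevelResourceManagers: true
  PodLevelResources: true
"""

REQUIRED_GATES = ("PodLevelResourceManagers", "PodLevelResources")

def gates_enabled(config_text: str, gates=REQUIRED_GATES) -> bool:
    """Return True only if every required gate is explicitly true."""
    enabled = dict(re.findall(r"^\s*(\w+):\s*(true|false)\s*$",
                              config_text, re.MULTILINE))
    return all(enabled.get(g) == "true" for g in gates)

print(gates_enabled(KUBELET_CONFIG))  # True
```

If either gate is missing or set to false, the function returns False, which mirrors the situation where the kubelet would silently run without the alpha behavior.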

Step 3: Configure Topology Manager Scope to Pod

Set the Topology Manager scope to pod so that NUMA alignment considers the entire pod’s resource budget, not just individual containers. Add this flag to the kubelet:

--topology-manager-scope=pod

If you previously used container scope, change it now. Restart the kubelet again. Confirm the scope with kubectl describe node | grep TopologyManager.

Step 4: Define Pod-Level Resources in Your Pod Spec

Add a resources field at the pod level (spec.resources) to establish the total CPU and memory budget available to all containers. This budget also defines the alignment size for NUMA.

apiVersion: v1
kind: Pod
metadata:
  name: tightly-coupled-database
spec:
  resources:
    requests:
      cpu: "8"
      memory: "16Gi"
    limits:
      cpu: "8"
      memory: "16Gi"

In this example, the pod requests and limits 8 CPU cores and 16 GiB of memory. The kubelet will attempt to allocate all these resources from a single NUMA node.
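When reasoning about budgets like this, it helps to normalize Kubernetes resource quantities into a single unit: CPU into millicores and memory into bytes. The helpers below are hypothetical (they are not a Kubernetes API), but the quantity formats they handle ("8", "100m", "16Gi") are the standard ones used in pod specs.

```python
# Illustrative helpers (not part of Kubernetes): normalize the resource
# quantities used above so budgets can be compared arithmetically --
# CPU to millicores, memory to bytes.
def cpu_to_millicores(q: str) -> int:
    """'8' -> 8000, '100m' -> 100."""
    return int(q[:-1]) if q.endswith("m") else int(float(q) * 1000)

_MEM_SUFFIXES = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4}

def memory_to_bytes(q: str) -> int:
    """'16Gi' -> 17179869184; a bare number is taken as bytes."""
    for suffix, factor in _MEM_SUFFIXES.items():
        if q.endswith(suffix):
            return int(q[: -len(suffix)]) * factor
    return int(q)

print(cpu_to_millicores("8"))    # 8000
print(memory_to_bytes("16Gi"))   # 17179869184
```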

Step 5: Specify Per-Container Resource Requirements

For each container, you can now set container-level resources that are relative to the pod’s budget. The main application container gets exclusive, integer-based resources (e.g., 4 CPUs), while sidecars can use fractional or shared allocations.

  containers:
  - name: database
    image: postgres:15
    resources:
      requests:
        cpu: "4"
        memory: "10Gi"
      limits:
        cpu: "4"
        memory: "10Gi"
  - name: metrics-exporter
    image: prom/node-exporter:v1
    resources:
      requests:
        cpu: "100m"
        memory: "128Mi"
  - name: backup-agent
    image: my-backup:latest
    resources:
      requests:
        cpu: "200m"
        memory: "256Mi"

Notice that the sidecar containers use millicores (m) and small memory amounts. The kubelet will: (1) allocate exclusive 4 CPUs + 10 GiB to the database from the pod budget, (2) create a “pod shared pool” from the remaining budget (~4 CPUs + 6 GiB), and (3) run the sidecars from that shared pool.
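The shared-pool arithmetic above can be sketched directly. This is a model of the described behavior, not kubelet code; the values mirror the example manifest (8 CPUs / 16 GiB pod budget, 4 CPUs / 10 GiB exclusive to the database).

```python
# Sketch of the shared-pool arithmetic described above: the pod budget
# minus the database's exclusive allocation leaves the pool the
# sidecars draw from. Values mirror the example manifest.
POD_BUDGET = {"cpu_millicores": 8000, "memory_gib": 16.0}
EXCLUSIVE  = {"cpu_millicores": 4000, "memory_gib": 10.0}  # database

shared_pool = {k: POD_BUDGET[k] - EXCLUSIVE[k] for k in POD_BUDGET}
print(shared_pool)  # {'cpu_millicores': 4000, 'memory_gib': 6.0}

# The sidecars' requests must fit inside that pool:
sidecars_cpu_m = 100 + 200               # metrics-exporter + backup-agent
sidecars_mem_gib = (128 + 256) / 1024    # 128Mi + 256Mi
assert sidecars_cpu_m <= shared_pool["cpu_millicores"]
assert sidecars_mem_gib <= shared_pool["memory_gib"]
```

Here the sidecars consume only 300 millicores and about 0.375 GiB of the roughly 4 CPUs and 6 GiB available in the shared pool, which is what makes this layout so much cheaper than giving every container exclusive cores.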

Step 6: Apply the Pod and Verify Allocation

Use kubectl apply -f pod.yaml to create the pod. After the pod runs, inspect its resource allocation:

kubectl describe pod tightly-coupled-database

Look for the Conditions section to confirm that the PodLevelResourcesAllocated condition is set. Also check the Allocated Resources section (kubelet metrics or node status) to confirm that the exclusive and shared pools were formed correctly.

For deeper verification, run kubectl exec into each container and check /proc/self/status for Cpus_allowed_list and Mems_allowed_list to confirm NUMA affinity.
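The Cpus_allowed_list value uses the standard Linux CPU-list format (e.g. "0-3" or "0-1,8"). A small parser like the hypothetical one below makes it easy to compare a container's CPU set against the CPUs belonging to the expected NUMA node.

```python
# Minimal parser for the Linux CPU-list format found in
# /proc/self/status under Cpus_allowed_list (e.g. "0-3,8"), useful for
# checking that a container's CPUs all sit on one NUMA node.
def parse_cpu_list(spec: str) -> set[int]:
    cpus: set[int] = set()
    for part in spec.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus

print(sorted(parse_cpu_list("0-3")))    # [0, 1, 2, 3]
print(sorted(parse_cpu_list("0-1,8")))  # [0, 1, 8]
```

For example, if NUMA node 0 owns CPUs 0-7, the check `parse_cpu_list(value) <= set(range(8))` confirms the container is fully aligned to that node.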

Step 7: Tweak and Iterate

Experiment with different pod-to-container resource ratios. If a sidecar becomes a bottleneck, increase its request from the shared pool. If the main container needs more isolation, raise its exclusive allocation. Remember that total container requests cannot exceed the pod-level requests (that would be rejected by the scheduler or kubelet).
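That budget constraint can be expressed as a simple check: the sum of container-level requests must not exceed the pod-level requests. The sketch below models the rule with pre-converted units (millicores, MiB); it is an illustration of the constraint, not the actual scheduler or kubelet admission logic.

```python
# Hedged sketch of the budget rule mentioned above: the sum of
# container requests may not exceed the pod-level requests.
# Quantities are pre-converted to millicores / MiB for simplicity.
def fits_pod_budget(pod_req: dict, container_reqs: list[dict]) -> bool:
    totals = {k: sum(c.get(k, 0) for c in container_reqs) for k in pod_req}
    return all(totals[k] <= pod_req[k] for k in pod_req)

pod = {"cpu_m": 8000, "mem_mib": 16384}
containers = [
    {"cpu_m": 4000, "mem_mib": 10240},  # database
    {"cpu_m": 100,  "mem_mib": 128},    # metrics-exporter
    {"cpu_m": 200,  "mem_mib": 256},    # backup-agent
]
print(fits_pod_budget(pod, containers))  # True

# Raising the database to 8 exclusive CPUs would blow the budget:
containers[0]["cpu_m"] = 8000
print(fits_pod_budget(pod, containers))  # False
```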

Tips for Success

Tip: Keep this feature behind its feature gates in non-production clusters until it graduates from alpha.

Tip: Set --topology-manager-scope=pod before deploying; otherwise NUMA alignment is evaluated per container rather than for the whole pod.

Tip: Keep the sum of container-level requests within the pod-level budget, or the pod will be rejected by the scheduler or kubelet.

Tip: Verify placement from inside each container by checking Cpus_allowed_list and Mems_allowed_list in /proc/self/status.
