I set up CPU throttle alerts for some Postgres pods. The first alert that fired looked like this:
[FIRING] PodCpuThrottling
ns: , pod: , 98.32% > 90%
Namespace empty. Pod name empty. Completely useless.
The Query
The PromQL was a standard CPU throttle query joining cadvisor metrics with kube-state-metrics:
sum by (namespace, pod) (
  rate(container_cpu_cfs_throttled_periods_total{container!=""}[5m])
) / sum by (namespace, pod) (
  rate(container_cpu_cfs_periods_total{container!=""}[5m])
) * 100 > 90
This returns a percentage, grouped by namespace and pod. Looks correct. But the labels in the alert were empty.
The Problem
cadvisor metrics (container_cpu_*) use the labels namespace and pod. kube-state-metrics metrics (kube_pod_*) end up with exported_namespace and exported_pod instead: when Prometheus scrapes kube-state-metrics with honor_labels: false (the default), the namespace and pod labels exposed by the exporter conflict with Prometheus's own target labels, and the exposed ones are renamed with an exported_ prefix.
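A minimal sketch of the relevant scrape config, assuming a standard kubernetes_sd setup (job name and discovery role are illustrative, not from the original setup):

```yaml
scrape_configs:
  - job_name: kube-state-metrics   # hypothetical job name
    honor_labels: false            # the default: on conflict, the exporter's
                                   # labels are renamed with an exported_ prefix
    kubernetes_sd_configs:
      - role: endpoints
    # With honor_labels: false, a sample exposed by kube-state-metrics as
    #   kube_pod_labels{namespace="db", pod="core-postgres-0"}
    # is stored with namespace/pod describing the kube-state-metrics pod, and
    #   exported_namespace="db", exported_pod="core-postgres-0"
    # carrying the workload's real identity.
```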
So a join like:
container_cpu_cfs_throttled_periods_total{namespace="db"}
* on(namespace, pod) group_left
kube_pod_labels{namespace="db"}
…matches nothing, and fails silently — no error, just an empty result. The namespace in cadvisor is the real namespace. The namespace in kube-state-metrics is the namespace where kube-state-metrics itself runs (usually monitoring). The actual pod namespace is in exported_namespace.
The alert query returned data (the throttle percentage was real), but the labels for display came from the wrong source.
The Fix
Use label_replace to bridge the label names before joining:
label_replace(
  kube_pod_labels,
  "namespace", "$1", "exported_namespace", "(.*)"
)
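Note that pod needs the same treatment as namespace, since it gets renamed too. A sketch of the fully bridged right-hand side for an on (namespace, pod) join:

```promql
# Bridge both renamed labels so on (namespace, pod) can match cadvisor series
label_replace(
  label_replace(
    kube_pod_labels,
    "namespace", "$1", "exported_namespace", "(.*)"
  ),
  "pod", "$1", "exported_pod", "(.*)"
)
```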
Or skip the join entirely and use cadvisor labels directly — they’re correct:
sum by (namespace, pod, container) (
  rate(container_cpu_cfs_throttled_periods_total{
    namespace="db",
    pod=~"core-postgres.*",
    container!=""
  }[5m])
) / sum by (namespace, pod, container) (
  rate(container_cpu_cfs_periods_total{
    namespace="db",
    pod=~"core-postgres.*",
    container!=""
  }[5m])
) * 100 > 90
After fixing this, alerts showed the actual namespace, pod, and container name.
Takeaway
If your Prometheus alerts have empty labels, check whether you're joining cadvisor metrics (which use namespace, pod) with kube-state-metrics (which, scraped with honor_labels: false, end up with exported_namespace, exported_pod). The label mismatch is silent — the query still returns data, it just loses the grouping labels in the result. Use label_replace to bridge them, or avoid the join entirely.
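For reference, the shape of the fixed alerting rule — group name, threshold, and duration are illustrative, and the annotation template is how the namespace/pod labels actually reach the alert text:

```yaml
groups:
  - name: postgres-cpu            # hypothetical group name
    rules:
      - alert: PodCpuThrottling
        expr: |
          sum by (namespace, pod, container) (
            rate(container_cpu_cfs_throttled_periods_total{
              namespace="db", pod=~"core-postgres.*", container!=""
            }[5m])
          ) / sum by (namespace, pod, container) (
            rate(container_cpu_cfs_periods_total{
              namespace="db", pod=~"core-postgres.*", container!=""
            }[5m])
          ) * 100 > 90
        for: 10m
        annotations:
          # $labels comes from the query result — if the grouping labels are
          # lost upstream, these render empty, as in the original alert
          summary: >-
            ns: {{ $labels.namespace }}, pod: {{ $labels.pod }},
            {{ $value | printf "%.2f" }}% > 90%
```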