Patroni Refused to Promote — ‘cannot execute LISTEN during recovery’

Apr 8, 2026

The staging database stopped responding. Backend pods crashed with “cannot execute LISTEN during recovery”, and every write attempt failed with “cannot execute DELETE in a read-only transaction”. The database was stuck in an infinite recovery loop.


The Symptom

Every service that touched the database started failing:

cannot execute LISTEN during recovery
cannot execute DELETE in a read-only transaction

The Postgres pod was running. Patroni was running. But the database was permanently read-only — stuck in recovery mode and refusing to promote to primary.
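
One way to confirm that state is to ask Postgres itself. The sketch below is an illustration, not the exact commands used during the incident. It assumes psql is available in the pod and that the superuser role is named postgres, and it connects to template1 rather than relying on the default database. recovery_state is a hypothetical helper name.

```shell
# Interpret the single-value output of: SELECT pg_is_in_recovery();
# With psql -At, Postgres prints "t" while in recovery and "f" once it is a primary.
recovery_state() {
  case "$1" in
    t) echo "in recovery (read-only)" ;;
    f) echo "primary (read-write)" ;;
    *) echo "unknown"; return 1 ;;
  esac
}

# Hypothetical usage (pod name is a placeholder; role name and template1 are assumptions):
#   out=$(kubectl exec <postgres-pod> -n db -- \
#     psql -U postgres -d template1 -Atc "SELECT pg_is_in_recovery();")
#   recovery_state "$out"
```

An instance that keeps reporting “in recovery” while no other primary exists matches the read-only errors above.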

The Investigation

Patroni manages Postgres high availability. It decides which instance is the primary (read-write) and which are replicas (read-only). When a primary fails, Patroni promotes a replica.

But in this case, Patroni wasn’t promoting anything. It was stuck in a loop — detecting that the instance was in recovery, attempting to promote, failing, and retrying. The logs showed no useful error, just endless promotion attempts.
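
Patroni's own view of the instance can be read from its REST API, which complements the logs. This is a minimal sketch, assuming the API listens on Patroni's default port 8008 and that curl exists in the pod. patroni_role is a hypothetical helper that pulls the role field out of the JSON response without needing jq.

```shell
# Extract the "role" field from the JSON returned by GET /patroni.
# $1: the raw JSON response body.
patroni_role() {
  printf '%s\n' "$1" |
    sed -n 's/.*"role"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Hypothetical usage (pod name is a placeholder; port 8008 is Patroni's default):
#   body=$(kubectl exec <postgres-pod> -n db -- curl -s http://localhost:8008/patroni)
#   patroni_role "$body"
```

A role that never changes from replica, on a cluster with no leader, would be consistent with the promotion loop described here.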

The Root Cause

Patroni expects a database called postgres to exist. It is the default database Patroni connects to for its own queries against the local instance, and those queries feed its health checks, leader race, and promotion decisions. If that database is missing (dropped accidentally, not created during init, or removed by a cleanup script), Patroni has nothing to connect to.

It doesn’t fail loudly. It doesn’t log “missing postgres database.” It just… doesn’t promote. The instance stays in recovery mode forever, and every write is rejected.
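
Since nothing in the logs points at the missing database, it has to be checked by hand. The sketch below lists the databases through template1, because a connection to postgres would itself fail if that database is gone. has_postgres_db is a hypothetical helper, and the psql invocation assumes a superuser role named postgres.

```shell
# Succeeds if a database named exactly "postgres" appears in the list.
# $1: newline-separated database names, as printed by psql -At.
has_postgres_db() {
  printf '%s\n' "$1" | grep -qx 'postgres'
}

# Hypothetical usage (pod name is a placeholder):
#   names=$(kubectl exec <postgres-pod> -n db -- \
#     psql -U postgres -d template1 -Atc "SELECT datname FROM pg_database;")
#   has_postgres_db "$names" || echo "postgres database is missing"
```

The -x flag makes grep match whole lines only, so a database named postgres_old does not count.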

The Fix

The immediate fix was to force promotion manually, bypassing Patroni:

kubectl exec -it <postgres-pod> -n db -- pg_ctl promote -D /home/postgres/pgdata/pgroot/data

This told Postgres directly: “you are the primary now.” Once promoted, the database accepted writes again and all services recovered.

The longer-term fix was ensuring the postgres database always exists and adding a health check that alerts if it’s missing.
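
A health check along those lines might look like the sketch below. It is an assumption about the shape of such a check, not the one that was actually deployed. run_sql and check_postgres_db are hypothetical names, and the connection goes through template1 because the postgres database is the thing that may be missing.

```shell
# Hypothetical hook: replace the body with however you reach Postgres
# (for example, wrapped in kubectl exec). Role name postgres is an assumption.
run_sql() {
  psql -U postgres -d template1 -Atc "$1"
}

# Exit status: 0 = database present, 1 = missing, 2 = query could not run.
check_postgres_db() {
  count=$(run_sql "SELECT count(*) FROM pg_database WHERE datname = 'postgres';") || {
    echo "UNKNOWN: could not query pg_database" >&2
    return 2
  }
  if [ "$count" = "1" ]; then
    echo "OK: postgres database exists"
  else
    echo "ALERT: postgres database is missing" >&2
    return 1
  fi
}

# Hypothetical usage from cron or a monitoring probe:
#   check_postgres_db || notify_oncall   # notify_oncall is a placeholder
```

The non-zero exit status can drive whatever alerting is already in place. Recreating the database is a single CREATE DATABASE postgres; statement, which has to run on a primary because an instance in recovery rejects writes.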

Takeaway

If Patroni is stuck in recovery and won’t promote, check whether the postgres database exists. Patroni silently depends on it for internal operations but doesn’t fail explicitly when it’s gone. The symptom, “cannot execute LISTEN during recovery”, looks like a replication problem but is actually a missing-database problem.