I renamed a Kubernetes Service in a Flux repo and broke external load balancer access for four TCP proxy services. No errors in Flux, no errors in Kubernetes. The services were running fine internally. But all external traffic stopped.
The Setup
The project uses GCP Network Endpoint Groups (NEGs) to connect GKE Services to external TCP load balancers managed by Terraform. The flow:
Internet → GCP TCP LB (Terraform) → Backend Service → NEG → K8s Service → Pods
The K8s Service has an annotation that tells GKE to create a NEG:
metadata:
  annotations:
    cloud.google.com/neg: '{"exposed_ports":{"3333":{"name":"tcp-proxy-neg"}}}'
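Because the NEG name sits inside a JSON string, it is easy to miss in review. Here is a minimal Python sketch of pulling the declared NEG names out of the annotation value. The function name `declared_neg_names` is hypothetical, and the sketch only reports ports that set an explicit `name`, as the annotation above does.

```python
import json


def declared_neg_names(annotation_value: str) -> dict[str, str]:
    """Map each exposed port to the NEG name declared in the annotation.

    Ports without an explicit "name" are skipped, because GKE chooses
    the NEG name itself in that case.
    """
    parsed = json.loads(annotation_value)
    names = {}
    for port, settings in parsed.get("exposed_ports", {}).items():
        name = (settings or {}).get("name")
        if name:
            names[port] = name
    return names


annotation = '{"exposed_ports":{"3333":{"name":"tcp-proxy-neg"}}}'
print(declared_neg_names(annotation))  # {'3333': 'tcp-proxy-neg'}
```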
Terraform references that NEG name in its backend service:
resource "google_compute_backend_service" "tcp_proxy" {
  backend {
    group = "projects/.../zones/.../networkEndpointGroups/tcp-proxy-neg"
  }
}
What Broke
During a migration from one Flux repo to another, the Service names were changed. GKE derives the NEG name from the cloud.google.com/neg annotation: an explicit name is used as-is, and without one GKE generates a name that includes the Service name. Either way, if the annotation changes (or the Service is deleted and recreated with a different name), the result can be a new NEG with a different name.
Terraform still pointed at the old NEG name. GCP returned no errors — the backend service just had zero healthy endpoints.
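This kind of mismatch can be caught before it reaches production by comparing the two sides directly. The sketch below is one way to do that in Python, not the check this project used. The function names are hypothetical, and the inputs are assumed to have been collected elsewhere: the NEG names declared in the cluster, and the `group` strings from Terraform.

```python
def neg_name_from_group(group: str) -> str:
    """Return the last path segment of a backend `group` reference."""
    return group.rstrip("/").rsplit("/", 1)[-1]


def find_neg_drift(declared: set[str], terraform_groups: list[str]) -> dict[str, set[str]]:
    """Compare NEG names declared on Services with those Terraform references."""
    referenced = {neg_name_from_group(g) for g in terraform_groups}
    return {
        # Terraform points at these, but no Service declares them any more.
        "missing_in_cluster": referenced - declared,
        # Services declare these, but no backend service uses them.
        "unused_by_terraform": declared - referenced,
    }


# Illustrative inputs: the name after the rename vs. what Terraform still points at.
drift = find_neg_drift(
    declared={"tcp-proxy-neg"},
    terraform_groups=["projects/.../zones/.../networkEndpointGroups/old-proxy-neg"],
)
print(drift)
```

A non-empty `missing_in_cluster` set is the situation described here: Terraform references a NEG that no Service feeds.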
Why It Was Silent
- Kubernetes: Service is healthy, pods are running, NEG annotation is applied. No warnings.
- Terraform: Backend service exists, references a NEG that exists (the old one, now empty). No drift detected because Terraform manages the backend service, not the NEG itself.
- GCP: Load balancer is healthy from its perspective. It just has no backends to send traffic to.
The only signal was that external connections stopped working.
The Fix
I updated the NEG names in Terraform to match the NEGs created for the renamed Services. One Terraform PR fixed all four services:
# Old
group = "old-proxy-neg"
# New
group = "tcp-proxy-neg"
Takeaway
NEG names are a hidden coupling point between GitOps (Flux/ArgoCD) and IaC (Terraform). Renaming a Kubernetes Service can change the NEG name and silently break the infrastructure layer, with no error on either side. If you use NEGs with external load balancers, document the coupling and consider adding a health check that verifies each NEG Terraform references actually has endpoints.
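Such a health check could look roughly like the Python sketch below, which shells out to `gcloud compute network-endpoint-groups list-network-endpoints` and counts the results. It makes several assumptions: the NEGs are zonal, `gcloud` is installed and authenticated, and the caller supplies a NEG-to-zone mapping. The function names and the `run` parameter (there to make the code testable without GCP access) are hypothetical.

```python
import json
import subprocess


def count_neg_endpoints(neg_name: str, zone: str, run=subprocess.run) -> int:
    """Count the endpoints registered in a zonal NEG via the gcloud CLI."""
    result = run(
        [
            "gcloud", "compute", "network-endpoint-groups",
            "list-network-endpoints", neg_name,
            f"--zone={zone}", "--format=json",
        ],
        capture_output=True, text=True, check=True,
    )
    # With --format=json, gcloud prints a JSON list (empty when no endpoints).
    endpoints = json.loads(result.stdout or "[]")
    return len(endpoints)


def empty_negs(negs: dict[str, str], run=subprocess.run) -> list[str]:
    """Return the names of NEGs (name -> zone) that have no endpoints."""
    return [
        name for name, zone in negs.items()
        if count_neg_endpoints(name, zone, run=run) == 0
    ]
```

Calling `empty_negs({"tcp-proxy-neg": "us-central1-a"})` from a scheduled job (the zone here is only an example) and alerting on a non-empty result would have surfaced this outage. Note that it counts registered endpoints, not healthy ones.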