Runtime degraded performance

Incident Report for Estuary

Resolved

This incident has been resolved.
Posted Mar 15, 2025 - 03:00 UTC

Update

We are rolling out an update with Google's help, and it has reduced task failures. Tasks are recovering.
Posted Mar 14, 2025 - 22:19 UTC

Update

We are continuing to investigate this issue.
Posted Mar 14, 2025 - 20:39 UTC

Investigating

Starting yesterday, we've seen intermittent but escalating DNS resolution failures in the primary public data-plane, which runs in a Google Kubernetes cluster. We've traced the problem to low-level Google-managed components within that cluster and have engaged their support. Unfortunately, their recommendations so far made the problem worse today, and we're currently seeing further-elevated task errors due to DNS resolution failures. We're escalating as much as we can and going back and forth with Google support. This has been frustrating because we depend heavily on them, given how deep the issue sits within the Google-managed environment, but we'll provide updates as we can.

Private data-planes, as well as the EU data-plane, are not affected. We've been planning a migration from this legacy Kubernetes cluster to our new data-plane infrastructure, but unfortunately we aren't yet in a position to start it.
Posted Mar 14, 2025 - 18:30 UTC
This incident affected: Runtime.