Partial outage in US public data plane

Incident Report for Estuary

Resolved

This incident has been resolved.
Posted Mar 27, 2025 - 14:49 UTC

Update

All impacted derivations are now active as of 30 minutes ago.

Summary: There was a partial outage for reporting, captures, materializations, and derivations in the public US data plane. We have now recovered and are monitoring. Some cleanup items and a postmortem are still to come.

If you have data flows that still seem impacted, please reach out via Slack or at support@estuary.dev. If you suspect any missing or duplicate data and would like to backfill to be sure your dataset is complete (or for any reason) in the next 15 days, we will provide a full credit. Please just email support@estuary.dev with backfill details (reason, time, size in GB).
Posted Mar 27, 2025 - 03:31 UTC

Update

We've begun re-enabling derivations; this will continue over the next couple of hours.

We have backfilled most of the captures that had an outage, and are planning to backfill a few more overnight. If you're seeing any issues and would like to backfill to be sure your dataset is complete, or for any reason, in the next 15 days, we will eat the cost and provide a full credit. This issue is on us. Please just email support@estuary.dev with backfill details (reason, time, size in GB).
Posted Mar 27, 2025 - 01:09 UTC

Monitoring

Everything has been resolved except derivations, which are being addressed now.
Posted Mar 26, 2025 - 21:57 UTC

Update

US Data Plane Update:
- One third of materializations are recovered
- Stats are current
Posted Mar 26, 2025 - 20:11 UTC

Update

- Captures are all recovered
- Stats are catching up
- Materializations are being enabled and backfilled now. They are starting to come back, but the process will take at least a couple of hours to fully complete.
Posted Mar 26, 2025 - 18:47 UTC

Update

- All but a handful of captures have begun recovery
- Stats will be back up and running momentarily
- Recovery for materializations is now our primary focus
Posted Mar 26, 2025 - 16:49 UTC

Update

- Tasks not in the US-public data plane are all healthy but lacking reporting.
- Tasks in the US-public data plane that are green are running but lacking reporting.
- Tasks in the US-public data plane that aren’t green will need manual recovery, which we are working on. Recovery involves a re-backfill, which will be done automatically and credited for small tasks. Large tasks will be handled manually, and a member of the team will reach out if that is necessary.
Posted Mar 26, 2025 - 12:35 UTC

Update

Several tasks in the default public US data plane are not running (if your task is green, it's running).
Reporting is also currently down, so if your task is running you will not see stats updating.
We're working on a fix and expect it to be a few hours until it's complete.
Posted Mar 26, 2025 - 07:30 UTC

Identified

The issue has been identified and a fix for some tasks has been rolled out. We're working on a more comprehensive fix.
Posted Mar 26, 2025 - 05:26 UTC

Investigating

Some connectors are paused in the US public data plane. We are investigating.

Tasks may see issues like "INDEX_HAS_GREATER_OFFSET"
Posted Mar 26, 2025 - 01:29 UTC
This incident affected: Runtime.