Resolved
The pipeline is fully caught up and we are now processing and alerting on data as it streams in. There was a short window (~3 minutes) at the beginning of the incident (~1 PM PDT) during which we dropped data. After that point, no data was lost, although processing was delayed. The primary cause has been identified and linked to the remediation steps we took during yesterday's outage. More details to follow.
We have scheduled a postmortem for this incident for Monday of next week and will update this incident with the postmortem notes once it is complete.
Monitoring
The processing pipeline is currently catching up. We have also made a change to prioritize new data: while the system is at or over capacity, incoming data will be processed and alerted on ahead of the backlog.
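For illustration, the sketch below shows one way such a prioritization change can work: fresh records are drained before backlog records whenever workers are saturated. All names here are hypothetical and this is not the actual pipeline code, just a minimal example of the idea.

```python
import heapq
import time

FRESH, BACKLOG = 0, 1  # lower value = higher priority

class PrioritizingQueue:
    """Hypothetical queue that serves fresh records before backlog records."""

    def __init__(self):
        self._heap = []
        self._counter = 0  # tie-breaker keeps FIFO order within each class

    def put(self, record, is_backlog: bool):
        priority = BACKLOG if is_backlog else FRESH
        heapq.heappush(self._heap, (priority, self._counter, record))
        self._counter += 1

    def get(self):
        if not self._heap:
            return None
        _, _, record = heapq.heappop(self._heap)
        return record

# Usage: even if backlog arrives first, the fresh record is processed first.
q = PrioritizingQueue()
q.put({"metric": "cpu", "ts": time.time() - 3600}, is_backlog=True)
q.put({"metric": "cpu", "ts": time.time()}, is_backlog=False)
assert q.get()["ts"] > time.time() - 60  # fresh record comes out first
```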
Identified
Pipeline processing latency peaked at approximately 60 minutes and is now decreasing.
Identified
We've identified an issue causing processing latency of approximately 25 minutes. We're working to resolve the issue.