Web app, API, and pipeline outage
Incident Report for Rollbar
Postmortem

On Nov 25, 2021, from 11:08 AM PT to 11:29 AM PT, our API, web app, and item processing pipeline were sporadically unavailable. Unfortunately, errors sent from 11:08 AM PT to 11:17 AM PT were not processed.

The root cause was an infrastructure change intended to better handle traffic spikes to our API. This change briefly overloaded one of our application databases, which caused the unavailability.

When we detected the database issues, we reverted the change and our services recovered.

To prevent this incident from recurring, we have paused work on that infrastructure change. Meanwhile, we are continuing to scale our application databases. As mentioned in a previous postmortem, we have significantly increased the resources dedicated to database work. The first database projects have been completed, and we have more projects planned for this quarter and next to further improve our databases.

As always, thank you for being a Rollbar customer.

Posted Dec 01, 2021 - 13:37 PST

Resolved
This incident has been resolved. Thank you for your patience. Please expect a post-mortem by Tuesday 5 PM PT.
Posted Nov 25, 2021 - 11:51 PST
Update
The web app and API are functional again. The pipeline is processing a backlog of items.
Posted Nov 25, 2021 - 11:34 PST
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Nov 25, 2021 - 11:19 PST
Identified
An issue with one of our application databases has caused a web app, API, and pipeline outage. We will update when the database is available again.
Posted Nov 25, 2021 - 11:15 PST
This incident affected: Web App (rollbar.com), API Tier (api.rollbar.com) and Processing pipeline (Core Processing Pipeline, iOS Symbolication pipeline).