We've identified the cause of the issue. A set of slow-running queries caused our primary DB to back up. As a side effect, the API tier and web tier became unresponsive while waiting on the DB. We have an internal tool that normally monitors slow-running queries and clears them, but the tool was broken by a recent commit. We've fixed the issue and the system is stable.
Posted Feb 08, 2019 - 15:20 PST
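For readers curious what a tool that "monitors slow running queries and clears them" might look like, here is a minimal sketch. It is not our internal tool: the `RunningQuery` record, the 30-second threshold, and the `kill` callback are all hypothetical names chosen for illustration, and the database-specific calls for listing and terminating queries are deliberately left out.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class RunningQuery:
    """Hypothetical record for one in-flight query (not a real DB API)."""
    query_id: int
    seconds_running: float
    sql: str


def select_queries_to_clear(
    queries: Iterable[RunningQuery], max_seconds: float = 30.0
) -> List[RunningQuery]:
    """Return queries running longer than max_seconds, longest first."""
    slow = [q for q in queries if q.seconds_running > max_seconds]
    return sorted(slow, key=lambda q: q.seconds_running, reverse=True)


def clear_slow_queries(
    queries: Iterable[RunningQuery],
    kill: Callable[[int], None],
    max_seconds: float = 30.0,
) -> List[int]:
    """Call kill(query_id) for each slow query and return the ids cleared.

    `kill` stands in for whatever the database offers to terminate a query;
    it is an assumption of this sketch, supplied by the caller.
    """
    cleared = []
    for q in select_queries_to_clear(queries, max_seconds):
        kill(q.query_id)
        cleared.append(q.query_id)
    return cleared
```

In a sketch like this, a scheduler would call `clear_slow_queries` periodically with a fresh snapshot of running queries. Keeping the selection logic separate from the kill callback makes it possible to test the selection without a live database, which is one way to catch a regression of this kind before it ships.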
The web and API tiers are back up and data is being processed. We are investigating the cause of the outage.
Posted Feb 08, 2019 - 14:23 PST
We are currently investigating this issue.
Posted Feb 08, 2019 - 14:11 PST
This incident affected: Web App (rollbar.com) and API Tier (api.rollbar.com).