All Systems Operational
Web App (rollbar.com) Operational
API Tier (api.rollbar.com) Operational
Processing Pipeline Operational
rollbar.min.js Operational
Mailgun SMTP Operational
Mailgun Outbound Delivery Operational
Rollbar Docs Operational
Past Incidents
May 29, 2020

No incidents reported today.

May 28, 2020

No incidents reported.

May 27, 2020

No incidents reported.

May 26, 2020
Completed - Maintenance is now complete, all systems operational.
May 26, 17:54 PDT
Verifying - Maintenance is complete, pipeline processing has just caught up to realtime.
May 26, 17:51 PDT
Update - Scheduled maintenance is still in progress. We will provide updates as necessary.
May 26, 17:18 PDT
Update - Beginning pipeline downtime window, will update when the maintenance is complete.
May 26, 17:17 PDT
Update - We are starting the maintenance period now, will update when the processing pipeline is halted.
May 26, 16:51 PDT
In progress - Scheduled maintenance is currently in progress. We will provide updates as necessary.
May 26, 16:01 PDT
Scheduled - We will be undergoing some routine database maintenance later today. No downtime to either web or API is expected, although there will be a small (~5-10 minute) pipeline processing delay. We will update when the processing delay starts and stops. Thanks for your patience!
May 26, 10:53 PDT
May 25, 2020

No incidents reported.

May 24, 2020

No incidents reported.

May 23, 2020
Completed - The scheduled maintenance has been completed.
May 23, 18:00 PDT
In progress - Scheduled maintenance is currently in progress. We will provide updates as necessary.
May 22, 18:02 PDT
Scheduled - We will be undergoing some database maintenance in response to last night's outage. No customer impact or downtime of any kind is expected. We will update when the maintenance is complete. Thanks for your patience as we improve Rollbar's stability!
May 22, 11:43 PDT
May 22, 2020
May 21, 2020
Postmortem - Read details
May 22, 23:19 PDT
Resolved - This incident is now resolved. All systems are fully operational, and all backlogged data has been processed.

We'll have a postmortem/RFO on this incident posted within the next few days.
May 21, 20:32 PDT
Update - New events continue to be processed in near-real-time; the backlog of events received during the outage will be cleared in about 30 minutes.
May 21, 20:12 PDT
Update - The processing pipeline is at near-real-time for new data. Events received during the outage are being worked through in the background; we now expect that backlog to be cleared in about 1.5 hours.

We will post the next update in about 1 hour, at 20:00 PDT.
May 21, 19:01 PDT
Update - The processing pipeline is restored to near-real-time for new data. Events received during the outage are being worked through in the background; we expect that backlog to be cleared in 40-60 minutes.

We will post the next update in about 20 minutes, by 18:50 PDT.
May 21, 18:28 PDT
Update - The processing pipeline is about 10 minutes away from reaching near-real-time for new data. (The previous update, which stated that the pipeline was already at near-real-time for new data, was in error.)

We will post the next update within 20 minutes, by 18:40 PDT.
May 21, 18:21 PDT
Monitoring - The Web UI, API, and Processing Pipeline are all operational.

Current Status:

- Web UI: Fully operational
- API: Fully operational
- Processing pipeline (notifications): working through backlog of events received during the outage; the backlog is expected to be cleared in about 40 minutes.

We will provide the next update in 20 minutes, at 18:20 PDT.
May 21, 18:00 PDT
Update - We've restored the API and Processing Pipeline tiers. We're working to restore the Web UI.
May 21, 17:50 PDT
Update - We're continuing to make progress toward recovery. Current status:

- Ingestion API: available; receiving data that will be processed once the processing pipeline comes online
- Web UI: partially available with high error rate
- Processing Pipeline: beginning recovery

We will provide the next update in 20 minutes, at 18:00 PDT.
May 21, 17:44 PDT
Update - We're continuing to make progress toward recovery. Current status:

- Ingestion API: available; receiving data that will be processed once the processing pipeline comes online
- Web UI: unavailable
- Processing Pipeline: unavailable

We will provide the next update in 20 minutes, at 17:40 PDT.
May 21, 17:21 PDT
Update - We're making progress toward recovery. Current status:

- Ingestion API: available; receiving data that will be processed once the processing pipeline comes online
- Web UI: unavailable
- Processing Pipeline: unavailable

We will provide the next update in 20 minutes, at 17:20 PDT.
May 21, 17:00 PDT
Update - We're continuing to work toward recovery. Current status:

- Ingestion API: available; receiving data that will be processed once the processing pipeline comes online
- Web UI: unavailable
- Processing Pipeline: unavailable

We will provide the next update at 17:00 PDT.
May 21, 16:32 PDT
Update - We're continuing to work toward recovery. Current status:

- Ingestion API: available; receiving data that will be processed once the processing pipeline comes online
- Web UI: unavailable
- Processing Pipeline: unavailable

The immediate cause of the outage is a database crash. We're working to bring the database online, with two parallel strategies for recovery. We're also working to restore read-only availability of the web tier.
May 21, 15:59 PDT
Update - The ingestion API is now available, receiving data to be processed once the processing pipeline comes online. We continue to work toward recovery of the remaining services.
May 21, 15:37 PDT
Update - The ingestion API is coming back online. We're still working to recover the web UI and processing pipeline.
May 21, 15:29 PDT
Update - We're continuing to work toward service recovery, prioritizing the ingestion API. We have line of sight toward recovery of the ingestion API as well as of the entire service.

Current service status:

- Web UI is unavailable
- API is unavailable, except that rate-limited requests correctly respond 429
- Processing pipeline is unavailable
May 21, 15:19 PDT
Update - We're continuing to work toward service recovery, prioritizing the ingestion API.
May 21, 15:05 PDT
Identified - We've identified the issue and we anticipate the service will begin coming back online in 5-10 minutes.
May 21, 14:52 PDT
Investigating - We're investigating an outage of the web app and API.
May 21, 14:46 PDT
May 20, 2020
Resolved - This incident has been resolved: the affected components (the Occurrences counts on the Account-Level Dashboard, Project-Level Dashboard, and Reports API) are again at near-real-time.

As noted earlier: as part of the fix, we have implemented a performance optimization that, in rare cases, may cause occurrences to be counted as the framework/level at time of processing instead of the framework/level at time of occurrence. This can affect 1) items whose levels are changed (i.e., manually or upon reactivation) and 2) merged items where the framework changes as a result of the merge (technically possible but very rare). For occurrences of such items, if the level/framework of the item changes between time of occurrence and time the occurrence is processed by the internal service that prepares the dashboard summary data, then the level/framework reported in these summaries will be the value at time of processing rather than the value at time of occurrence.
May 20, 23:41 PDT
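
The counting behavior described in the resolution note above can be illustrated with a small sketch. This is not Rollbar's implementation; the `Item`, `Occurrence`, and `summarize` names are hypothetical and exist only to show how a count can shift when an item's level changes between the time of occurrence and the time of processing.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Item:
    # Hypothetical model: the item's current level, which may change later.
    level: str


@dataclass
class Occurrence:
    item: Item
    # Level the item had when this occurrence happened.
    level_at_occurrence: str


def summarize(occurrences, use_processing_time_level):
    """Count occurrences per level.

    With use_processing_time_level=True, the item's current level is read
    at summary time (the optimized behavior described in the note).
    With False, the level recorded at time of occurrence is used.
    """
    counts = Counter()
    for occ in occurrences:
        if use_processing_time_level:
            level = occ.item.level
        else:
            level = occ.level_at_occurrence
        counts[level] += 1
    return counts


# An occurrence arrives while the item is at level "warning".
item = Item(level="warning")
occ = Occurrence(item=item, level_at_occurrence=item.level)

# The item's level is changed before the summary service processes it.
item.level = "error"

by_occurrence_time = summarize([occ], use_processing_time_level=False)
by_processing_time = summarize([occ], use_processing_time_level=True)
```

In this sketch the same occurrence is counted under "warning" when the level at time of occurrence is used and under "error" when the level at time of processing is used. That is the rare discrepancy the note describes.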
Monitoring - We've implemented a fix for this issue. We expect the affected components (the Occurrences counts on the Account-Level Dashboard, Project-Level Dashboard, and Reports API) to recover to near-real-time within about 45 minutes.

Note: as part of the fix, we have implemented a performance optimization that, in rare cases, may cause occurrences to be counted as the framework/level at time of processing instead of the framework/level at time of occurrence. This can affect 1) items whose levels are changed (i.e., manually or upon reactivation) and 2) merged items where the framework changes as a result of the merge (technically possible but very rare). For occurrences of such items, if the level/framework of the item changes between time of occurrence and time the occurrence is processed by the internal service that prepares the dashboard summary data, then the level/framework reported in these summaries will be the value at time of processing rather than the value at time of occurrence.
May 20, 22:49 PDT
Update - The account-level dashboard and project dashboards are still showing data that may be up to 6 hours out of date.
May 20, 04:42 PDT
Investigating - There is a processing delay in parts of the pipeline. The account-level dashboard and project dashboard pages may show data that is outdated by a few hours.
May 19, 17:59 PDT
May 19, 2020
Resolved - Account-level dashboard and project dashboards are up to date.
May 19, 05:07 PDT
Update - Account-level dashboard and project-level dashboards are still catching up, slower than we expected. We estimate full resolution at approximately 5:00AM PDT.
May 19, 00:21 PDT
Monitoring - We continue to monitor the issue. Account-level dashboard and project-level dashboards are still catching up. We expect full resolution at approximately 23:00 PDT.
May 18, 21:19 PDT
Investigating - There is a processing delay in parts of the pipeline. The account-level dashboard and project dashboard pages may show data that is outdated by 1-2 hours.
May 18, 21:12 PDT
May 18, 2020
May 17, 2020

No incidents reported.

May 16, 2020

No incidents reported.

May 15, 2020
Completed - The scheduled maintenance has been completed.
May 15, 18:00 PDT
In progress - Scheduled maintenance is currently in progress. We will provide updates as necessary.
May 15, 17:00 PDT
Scheduled - The api-alt.rollbar.com (http://api-alt.rollbar.com/) endpoint is being sunset and will be powered down on May 15, 2020. The standard api.rollbar.com (http://api.rollbar.com/) endpoint remains unchanged, and customers using api-alt.rollbar.com (http://api-alt.rollbar.com/) are asked to migrate to our standard endpoint. Please contact customer support if you have questions about migrating to the main endpoint.
Apr 16, 12:07 PDT
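
For clients that still have the sunset hostname in their configuration, the migration described above amounts to swapping the host and leaving the rest of the URL untouched. The following is a minimal sketch, not an official migration tool: the `migrate_endpoint` helper and the `/example/path` URL are hypothetical, and only the two hostnames come from the notice.

```python
from urllib.parse import urlsplit, urlunsplit

OLD_HOST = "api-alt.rollbar.com"  # endpoint being powered down
NEW_HOST = "api.rollbar.com"      # standard endpoint


def migrate_endpoint(url: str) -> str:
    """Return url with the sunset host replaced by the standard host.

    URLs that do not point at the old host are returned unchanged.
    Scheme, path, query, and fragment are preserved. Any explicit port
    is kept; userinfo (user:password@) is not handled in this sketch.
    """
    parts = urlsplit(url)
    if parts.hostname != OLD_HOST:
        return url
    netloc = NEW_HOST if parts.port is None else f"{NEW_HOST}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))


# Hypothetical configured URL; the path is a placeholder, not a real route.
configured = "https://api-alt.rollbar.com/example/path?x=1"
migrated = migrate_endpoint(configured)
```

Because the helper returns non-matching URLs unchanged, running it over an already-migrated configuration is harmless.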