Degraded performance
Incident Report for Buildkite
Resolved
We have completed our mitigation efforts, and have seen a full restoration of service for all users. Our monitoring shows that all customers are now operational and processing normally.
Posted Jan 07, 2025 - 07:33 UTC
Monitoring
The fix has been rolled out and all customers should now see recovery. We will continue to monitor.
Posted Jan 07, 2025 - 07:20 UTC
Update
The majority of customers are now operational and processing normally. Remaining customers experiencing issues are having targeted mitigations applied.
Posted Jan 07, 2025 - 06:10 UTC
Update
The majority of customers are now operational and processing normally. Remaining customers experiencing issues are having targeted mitigations applied.
Posted Jan 07, 2025 - 04:02 UTC
Identified
The majority of customers continue to see improvements as jobs are picked up and run. We are implementing a further mitigation for the remaining impacted customers.
Posted Jan 07, 2025 - 02:48 UTC
Update
The majority of customers continue to see improvements as jobs are picked up and run. We are investigating ways to expand these mitigations to all customers.
Posted Jan 07, 2025 - 01:55 UTC
Update
We are continuing to see a restoration of services for the majority of our customers.
Posted Jan 07, 2025 - 00:44 UTC
Update
We’re seeing a partial restoration of services for the majority of our customers.
Posted Jan 07, 2025 - 00:08 UTC
Update
We are still experiencing significant performance degradation on a database cluster. We are performing targeted load shedding to help restore service to the broader customer base before bringing those specific customers back online.
Posted Jan 06, 2025 - 23:48 UTC
Update
We are still experiencing significant database degradation due to load. We are investigating multiple paths to resolve the issue.
Posted Jan 06, 2025 - 23:12 UTC
Update
We are currently experiencing significant database degradation and are continuing to investigate the issue.
Posted Jan 06, 2025 - 22:12 UTC
Investigating
The fix we rolled out resolved the notification latency, but we ran into another issue during this mitigation, which the team is actively investigating.
Posted Jan 06, 2025 - 21:42 UTC
Monitoring
We've identified the cause of delayed notification delivery. A fix is in place and notification latency is recovering.
Posted Jan 06, 2025 - 20:53 UTC
Identified
We have identified a possible root cause of the issue and are actively working on mitigating it.
Posted Jan 06, 2025 - 19:46 UTC
Update
We are currently experiencing degraded performance due to a recurrence of recent database performance issues. Our engineering team is actively investigating and working to mitigate the impact.
Posted Jan 06, 2025 - 19:04 UTC
Update
We are continuing to investigate this issue.
Posted Jan 06, 2025 - 18:11 UTC
Investigating
We are currently investigating this issue.
Posted Jan 06, 2025 - 17:55 UTC
This incident affected: Notifications (GitHub Commit Status Notifications) and Agent API, Job Queue.