On January 24, 2025, AppSheet customers had trouble loading apps because of 500 errors and timeouts. For 1 hour and 50 minutes, around 60% of requests in the us-east4 (Northern Virginia) and europe-west4 (Netherlands) regions were affected, and customers there saw failed requests, elevated errors, and intermittent latency.

The incident was triggered by a database schema migration in production, which caused failures and timeouts on the primary database. The resulting surge of retries overloaded the secondary authentication database, which stores user authentication tokens, and left it unresponsive to requests in the affected regions.

Google engineers were alerted to the outage, started an investigation, and mitigated the impact by redirecting traffic to the us-central1 and us-west1 regions. The redirected traffic, however, increased load on the service that validates users' Workspace license entitlements, which led to aggressive load shedding and elevated latency for 95% of traffic. The incident was resolved by 11:20 US/Pacific, after the authentication database was restored and traffic was gradually moved back to us-east4 and europe-west4.

To prevent similar incidents, Google is improving alerting and monitoring of license server traffic, reducing the dependency on licensing servers, and reviewing measures to increase the stability of the authentication database. Google apologizes for the disruption and is committed to preventing similar incidents in the future.
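
The summary attributes the authentication database overload to a surge of retries, but it does not say how AppSheet clients retry or what Google changed in response. As a generic illustration only, the sketch below shows one common way to dampen such surges: a bounded number of attempts with capped exponential backoff and random jitter. The function names (`backoff_delay`, `call_with_retries`) and all parameter values are hypothetical and do not come from the report.

```python
import random
import time


def backoff_delay(attempt, base=0.1, cap=5.0, rng=random):
    """Return a jittered delay in seconds for a zero-based attempt number."""
    # Exponential ceiling, capped so waits never grow without bound.
    ceiling = min(cap, base * (2 ** attempt))
    # A random delay in [0, ceiling] spreads clients out so they do not
    # all retry at the same instant.
    return rng.uniform(0, ceiling)


def call_with_retries(operation, max_attempts=4, base=0.1, cap=5.0,
                      sleep=time.sleep, rng=random):
    """Call operation(), retrying failures a bounded number of times."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            # Give up once the attempt budget is spent, so a failing
            # backend is not hit with unbounded retry traffic.
            if attempt == max_attempts - 1:
                raise
            sleep(backoff_delay(attempt, base, cap, rng))
```

A caller would wrap a request in a function and pass it in, for example `call_with_retries(fetch_token)`, where `fetch_token` is a placeholder for whatever call might fail. Capping both the delay and the number of attempts limits how much extra load a struggling backend receives from any single client.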
status.cloud.google.com
