

Marijan Hassan - Tech Journalist

Google Cloud suffers 12-hour outage in Frankfurt, Germany


On Thursday last week, Google Cloud experienced a significant outage in its Europe-west3 region located in Frankfurt, Germany. The outage began at 2:30 AM local time and wasn’t resolved until 3:09 PM, a duration of 12 hours and 39 minutes.



“We apologize for the inconvenience this service disruption/outage may have caused,” Google Cloud stated in an official advisory to users. According to the company, the incident stemmed from a power failure and cooling issue in Europe-west3-c, one of the three zones in the Frankfurt region. This resulted in significant degradation of services, especially in the affected zone.


Scope and Impact of the Outage

The outage disrupted multiple Google Cloud services, including essential tools used across industries for data processing, machine learning, and infrastructure management. Services impacted included Google Compute Engine, Google Kubernetes Engine, Persistent Disk, Cloud Dataflow, Cloud Build, Cloud Pub/Sub, and Vertex AI Batch Prediction.


Users experienced a range of issues, depending on the service:


Google Compute Engine: Users faced virtual machine (VM) creation failures, delays in processing deletions, and inaccessible instances.

Google Kubernetes Engine: Nodes within the affected zone were inaccessible, and attempts to create new nodes often failed.


Persistent Disk: Some instances were unreachable, rendering certain operations impossible.

Google Cloud Dataflow: Users encountered delays in scaling batch jobs, and some streaming jobs failed to progress or scale properly.


Vertex AI Batch Prediction: Multi-zonal issues led to job failures, with some users receiving an error message indicating an inability to prepare infrastructure for processing in time.


While most of the issues were localized within the Europe-west3-c zone, Google acknowledged some regional-level impact as well. For instance, in the two unaffected zones within Europe-west3, less than one percent of operations related to instance and disk resources encountered internal errors.


Mitigation

Google eventually implemented a fix that restored full operation of the data center. However, the company faced criticism for its initial response, which was perceived as slow and lacking in actionable advice. The cloud giant notified users of the disruption 26 minutes after the outage began but did not suggest any workarounds for nearly three hours.


While the Europe-west3 region is not known for frequent outages, this incident highlights the potential impact of such disruptions on businesses and organizations relying on cloud services. It’s a reminder of the importance of having a disaster recovery plan and exploring multi-region deployments.
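
As a rough illustration of the kind of fallback a disaster recovery plan might include, the sketch below tries a list of zones in order and moves on when one is unavailable. It is only a minimal sketch and not Google's recommended procedure: `ZoneUnavailable`, `create_with_zone_fallback`, and `fake_create` are hypothetical names invented for this example, and `fake_create` merely simulates a degraded europe-west3-c rather than calling any real cloud API.

```python
class ZoneUnavailable(Exception):
    """Hypothetical error raised when a zone cannot serve a request."""


def create_with_zone_fallback(create_in_zone, zones):
    """Try each zone in order and return (zone, result) for the first success.

    create_in_zone is any callable that takes a zone name and either returns
    a result or raises ZoneUnavailable. It is an assumption of this sketch,
    not a real cloud SDK call.
    """
    errors = {}
    for zone in zones:
        try:
            return zone, create_in_zone(zone)
        except ZoneUnavailable as exc:
            # Record the failure and fall through to the next zone.
            errors[zone] = str(exc)
    raise RuntimeError(f"all zones failed: {errors}")


def fake_create(zone):
    """Simulated VM creation: pretend europe-west3-c is degraded."""
    if zone == "europe-west3-c":
        raise ZoneUnavailable("zone is degraded")
    return f"vm-in-{zone}"


# Prefer zone c, but fall back to the other two Frankfurt zones.
zone, vm = create_with_zone_fallback(
    fake_create,
    ["europe-west3-c", "europe-west3-a", "europe-west3-b"],
)
print(zone, vm)
```

Extending the zone list with zones from a second region would turn the same loop into a simple multi-region fallback, at the cost of higher latency for users close to Frankfurt.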
