Google broke its own cloud AGAIN, with TWO software bugs

A couple of days ago Google's cloud went offline, just about everywhere, for 18 minutes. Now the Alphabet subsidiary has explained why and issued a personal apology penned by “Veep for 24x7” Benjamin Treynor Sloss.

And yes, that is Sloss' real title.

Sloss says the problem started when “engineers removed an unused Google Compute Engine (GCE) IP block from our network configuration, and instructed Google’s automated systems to propagate the new configuration across our network.” Google announces the IP blocks it uses so that traffic can be routed into its cloud.

On this occasion, the propagation failed due to “a timing quirk in the IP block removal - the IP block had been removed from one configuration file, but this change had not yet propagated to a second configuration file also used in network configuration management.”

When propagation fails, Google usually falls back to the configuration that was in place before the change was made. But on this occasion “a previously-unseen software bug was triggered, and instead of retaining the previous known good configuration, the management software instead removed all GCE IP blocks from the new configuration and began to push this new, incomplete configuration to the network.”
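The intended behaviour described above, keeping the last known good configuration when a new one fails its checks, can be sketched roughly as follows. This is an illustration only, not Google's management software: `NetworkConfig`, `is_consistent` and `choose_config` are hypothetical names, the IP blocks are documentation ranges, and the "two configuration files must agree" check is an assumption based on the timing quirk Sloss describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkConfig:
    """Hypothetical config holding the set of IP blocks to announce."""
    ip_blocks: frozenset


def is_consistent(candidate: NetworkConfig, second_file: NetworkConfig) -> bool:
    # Assumption: both configuration files must list the same IP blocks
    # before a new configuration may be pushed to the network.
    return candidate.ip_blocks == second_file.ip_blocks


def choose_config(candidate: NetworkConfig,
                  second_file: NetworkConfig,
                  last_known_good: NetworkConfig) -> NetworkConfig:
    """Return the config to push: the candidate if it passes the checks,
    otherwise the last known good one."""
    # Never push a config that announces no blocks at all.
    if candidate.ip_blocks and is_consistent(candidate, second_file):
        return candidate
    return last_known_good


# Example: one unused block is removed from the first file only.
known_good = NetworkConfig(frozenset({"192.0.2.0/24", "198.51.100.0/24"}))
candidate = NetworkConfig(frozenset({"192.0.2.0/24"}))
stale_second_file = known_good  # the removal has not reached this file yet

# The two files disagree, so the previous configuration is retained.
pushed = choose_config(candidate, stale_second_file, known_good)
```

In this sketch a mismatch between the two files leaves the old configuration in place, which is the outcome Google says should have occurred. The bug did the opposite: it stripped every GCE IP block from the new configuration and pushed that.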

Link to article: http://www.theregister.co.uk/2016/04/14/google_broke_its_own_cloud_again_with_two_software_bugs/?mt=1462203925272
