How a Google Cloud Catch-22 Broke the Internet

Reported by WIRED:

Five days ago, the internet had a conniption. In broad patches around the globe, YouTube sputtered. Shopify stores shut down. Snapchat blinked out. And millions of people couldn’t access their Gmail accounts. The disruptions all stemmed from Google Cloud, which suffered a prolonged outage that also prevented Google engineers from pushing a fix. And so, for an entire afternoon and into the night, the internet was stuck in a crippling ouroboros: Google couldn’t fix its cloud, because Google’s cloud was broken.

The root cause of the outage, as Google explained this week, was fairly unremarkable. (And no, it wasn’t hackers.) At 2:45 pm ET on Sunday, the company initiated what should have been a routine configuration change, a maintenance event intended for a few servers in one geographic region. When that happens, Google routinely reroutes jobs those servers are running to other machines, like customers switching lines at Target when a register closes. Or sometimes, importantly, it just pauses those jobs until the maintenance is over.

What happened next gets technically complicated—a cascading combination of two misconfigurations and a software bug—but had a simple upshot. Rather than that small cluster of servers blinking out temporarily, Google’s automation software descheduled network control jobs in multiple locations. Think of the traffic running through Google’s cloud like cars approaching the Lincoln Tunnel. In that moment, its capacity effectively went from six tunnels to two. The result: internet-wide gridlock.

Still, even then, everything held steady for a couple of minutes. Google’s network is designed to “fail static,” which means that even after a control plane has been descheduled, the network can keep functioning normally for a short period of time. It wasn’t long enough. By 2:47 pm ET, this happened:

See if you can spot where Sunday’s Google Cloud outage started.
