Fastly Internet Outage Explained: How One Customer Broke Amazon, Reddit And Half The Web
Group: Registered
Joined: 2021-06-19
New Member

About Me

Not Fastly's proudest moment.

Peter Dazeley/Getty Images

Tuesday will be remembered as the day the internet broke -- before swiftly being fixed again. Early in the morning, websites including Amazon, Reddit, Spotify, eBay, Twitch, Pinterest and, unfortunately, CNET went offline due to a major outage at a service called Fastly. Everywhere you looked, there were 503 errors and people complaining they couldn't access key services and news outlets.

After an investigation into what went wrong, Fastly published a blog post describing exactly what went down -- and it turns out the whole incident was triggered by just a single, unnamed Fastly customer.

In mid-May, Fastly issued a software deployment that contained a bug which, if triggered in specific circumstances, could take down vast swaths of its network. The bug lay dormant until June 8, when one Fastly customer inadvertently triggered it during a "valid configuration change," which caused 85% of the company's network to return errors.

"We detected the disruption within 1 minute, then identified and isolated the cause, and disabled the configuration," said Nick Rockwell, Fastly's senior vice president of engineering and infrastructure, in the blog post. "Within 49 minutes, 95% of our network was operating as normal. This outage was broad and severe, and we're truly sorry for the impact to our customers and everyone who relies on them."

What happened during the Fastly outage?

At around 2:58 a.m. PT, Fastly's status page noted an error, saying "we're currently investigating potential impact to performance with our CDN [content delivery network] services." Shortly thereafter, reports emerged on Twitter of major news publications including the BBC, CNN and The New York Times being offline. Twitter itself was still running, although the server that hosted its emojis went down, leading to some odd-looking tweets.

Rather than isolated incidents affecting individual sites, it turned out this was a massive outage that had brought much of the internet to its knees. Across the world, people were receiving Error: 503 messages as they tried to access sites, including some vital services, such as the UK government's gov.uk web properties.

Almost an hour later, at 3:44 a.m. PT -- or 6:44 a.m. ET, on the cusp of the US East Coast workday, and coming up on noon in the UK -- Fastly updated its status page again to say the issue had been identified and a fix was being implemented. At 4:10 a.m. PT, the company tweeted: "We identified a service configuration that triggered disruptions across our POPs globally and have disabled that configuration. Our global network is coming back online."

The same message was sent to CNET as a comment by Fastly spokespeople.

What is Fastly?

Fastly is a cloud computing service provider, headquartered in San Francisco, that's been around since 2011. In 2017, it launched an edge cloud platform designed to bring websites closer to the people who use them. Effectively this means that if you're accessing a website hosted in another country, it will store some of that website closer to you so that there's no need to waste bandwidth by going to fetch all of that website's content from far away every time you need it.

This makes for faster website load times, and optimizes images, videos and other high-payload content to show up quickly and smoothly when you land on a web page. Among the boasts on the company's website, it says it made loading pages on Buzzfeed 50% faster and allowed The New York Times to simultaneously handle 2 million readers on election night. Edge computing also performs vital cybersecurity functions, protecting sites from DDoS attacks and bots, as well as providing a web application firewall.

Because Fastly sits between the back-end web servers and the front-facing internet as we see it, any errors on its part can cause whole websites to be unavailable. The localized nature of the edge cloud platform also means that errors don't necessarily affect all regions in the same way at the same time (although people all across the world reported experiencing problems on Tuesday).
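
To make the caching idea concrete, here is a minimal Python sketch of an edge node that keeps a local copy of a page and only goes back to the origin server when that copy is missing or stale. It is an illustration under stated assumptions, not Fastly's implementation: the `EdgeCache` class, its `ttl_seconds` parameter and the `fetch_from_origin` callable are all hypothetical names invented for this example.

```python
import time
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Toy edge node (hypothetical): serves cached copies, contacts the origin only on a miss."""

    def __init__(
        self,
        fetch_from_origin: Callable[[str], bytes],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_origin      # stand-in for a slow, far-away origin server
        self._ttl = ttl_seconds              # how long a cached copy counts as fresh
        self._clock = clock                  # injectable so the sketch can be tested without waiting
        self._store: Dict[str, Tuple[float, bytes]] = {}
        self.origin_requests = 0             # counts trips back to the origin

    def get(self, path: str) -> bytes:
        now = self._clock()
        entry = self._store.get(path)
        if entry is not None and now < entry[0]:
            return entry[1]                  # cache hit: no trip to the origin
        # Cache miss or expired copy: fetch once and remember the result.
        self.origin_requests += 1
        body = self._fetch(path)
        self._store[path] = (now + self._ttl, body)
        return body


if __name__ == "__main__":
    def origin(path: str) -> bytes:
        return f"<html>{path}</html>".encode()

    edge = EdgeCache(origin, ttl_seconds=30.0)
    edge.get("/news")
    edge.get("/news")
    print(edge.origin_requests)
```

In this sketch the second request for the same path is answered from the edge's own store, which is the bandwidth saving described above. It also shows why a fault in the edge layer itself is so visible: every request passes through `get`, so if that layer fails, the origin is never reached.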

What is a 503 error?

When you see a website displaying a 503 error rather than showing you the page you were expecting, it means the server hosting the website isn't ready to handle the request. It also indicates that the problem is temporary and that it will likely be resolved soon.

Commonly, it is caused when a server is down for maintenance, or when a website has been overloaded -- for example, if too many people are trying to access it at once.
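
Because a 503 signals a temporary condition, a client can reasonably wait and try again. The Python sketch below shows one way to do that, under a few assumptions: the `fetch` callable, the `Response` tuple shape and both function names are hypothetical, and only the numeric form of the standard `Retry-After` header is handled (the header may also carry an HTTP date, which this sketch ignores).

```python
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape for this sketch: (status code, headers, body).
Response = Tuple[int, Mapping[str, str], bytes]


def retry_delay(headers: Mapping[str, str], attempt: int,
                base: float = 0.5, cap: float = 30.0) -> float:
    """Seconds to wait before the next try.

    Prefers a numeric Retry-After header; otherwise falls back to capped
    exponential backoff. Header lookup here is case-sensitive, which is a
    simplification of real HTTP clients.
    """
    value = headers.get("Retry-After", "").strip()
    if value.isdigit():
        return min(float(value), cap)
    return min(base * (2 ** attempt), cap)


def fetch_with_retry(fetch: Callable[[], Response], max_attempts: int = 4,
                     sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call fetch() until it returns something other than 503 or attempts run out."""
    response = fetch()
    for attempt in range(max_attempts - 1):
        if response[0] != 503:
            break
        sleep(retry_delay(response[1], attempt))
        response = fetch()
    return response
```

Passing `sleep` in as a parameter keeps the sketch testable without real waiting. Retrying would not have helped much during an outage lasting the better part of an hour, but it covers the more common cases of brief maintenance or overload.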

Fastly issues service updates throughout the outage.

Screenshot/CNET

Why did Fastly fail on Tuesday and will it happen again?

We now know that Tuesday's internet outage was caused by a service configuration change by one of Fastly's customers that triggered a bug hidden in Fastly's network. The bug had been lying dormant since a software update deployment by Fastly on May 12.

To make sure the problem doesn't repeat itself, Fastly has said it's taking a number of actions. It is deploying a bug fix across its network, while also conducting a complete post-mortem of the processes and practices it followed during the incident. It will also examine why it didn't catch the bug during its own testing processes and evaluate ways to improve remediation time.

"Even though there were specific conditions that triggered this outage, we should have anticipated it," said Rockwell. "We provide mission critical services, and we treat any action that can cause service issues with the utmost sensitivity and priority."

Many people speculated on Twitter that the outage was caused by a cyberattack, but we now know that this wasn't the case. There are many technical reasons a CDN can fail, and cyberattacks are just one of them. It is concerning, however, to see quite how vulnerable they can be.

"CDNs are part of the internet's critical infrastructure and if threat actors hadn't already cottoned on to this as a direct attack vector to bring down the internet, they will now after monitoring [Tuesday's] misfortunate events," said Jake Moore, a cybersecurity specialist at security firm ESET in a statement.

Why were so many websites affected by the Fastly outage?

Fastly is widely used by web publishers -- and it became apparent exactly how widely on Tuesday, when vast swaths of the internet became unavailable. The whole incident demonstrated just how much of the internet relies on this largely unheard-of cloud computing service.

The reason it's so popular is that the services it provides are considered essential by many online web properties, but not many companies provide these services. As such, a vast number of websites are reliant on a very small group of companies to keep running. Similar problems were seen last July and again last November.

As Corinne Cath-Speth, a Ph.D. candidate at the Oxford Internet Institute and the Alan Turing Institute, put it, this means "a technical hiccup in a single company can have huge ramifications."

"This in turn -- raises major questions about the dangers of (power) consolidation in the cloud market and the unquestioned influence these often invisible actors have over access to information," she added.
