Why in the News?
A major Amazon Web Services (AWS) outage disrupted over 1,000 online services worldwide, including WhatsApp, Snapchat, Reddit, and even government and financial platforms like the UK tax service.
The incident exposed the vulnerability of global businesses that depend heavily on cloud-based infrastructure. The recent outage follows similar incidents, such as a Microsoft cloud failure last year, which also caused global service interruptions.
What’s in Today’s Article?
- AWS: The Backbone of the Internet
- The Massive AWS Outage
- What the AWS Outage Reveals About the Fragility of the Internet
- The Call for Cloud Diversification
AWS: The Backbone of the Internet
- AWS has positioned itself as the core infrastructure of the Internet, providing cloud storage, computing tools, databases, and web traffic management to about one-third of all online services.
- Its business model is simple — it hosts and manages computing systems for companies, sparing them the expense of maintaining their own data centres.
- While AWS contributes about 20% of Amazon’s total sales, it generates nearly 60% of its operating profits, highlighting its critical role in the company’s business model — and in keeping the global Internet running.
The Massive AWS Outage
- The AWS outage began on the morning of October 20, with users worldwide reporting problems accessing major online platforms.
- The disruption originated in AWS’s Northern Virginia data centre region (US-EAST-1), one of its key operational hubs.
- On its official health page, Amazon stated that it had experienced “increased error rates and latencies” across several services.
- The company later identified the root cause as a DNS (Domain Name System) resolution issue related to its DynamoDB service endpoints.
- Understanding the DNS Issue
  - The Domain Name System (DNS) acts as the Internet’s address book, translating human-readable domain names (like example.com) into numerical IP addresses that computers use to locate servers.
  - When DNS fails, web browsers cannot find the correct server, leading to slow loading, inaccessibility, or error messages.
  - Such issues are common but disruptive, as they can cascade across multiple services relying on the same cloud infrastructure.
- Role of DynamoDB in the Outage
  - At the centre of the outage was Amazon’s DynamoDB, a fully managed, serverless NoSQL database service that supports high-performance, scalable applications.
  - Unlike traditional SQL databases with fixed table structures, NoSQL databases like DynamoDB can handle flexible, diverse data formats, making them popular for dynamic web platforms and apps.
  - Because many major services depend on DynamoDB, a DNS failure affecting its endpoints had widespread ripple effects, temporarily crippling parts of the Internet.
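The cascade described above can be illustrated with a minimal Python sketch. It is not AWS's actual setup: the endpoint name `db.example.com`, the `fetch_profile` feature, and the two stand-in lookup functions are hypothetical, introduced only to show how an application fails when the name of its database endpoint cannot be resolved, even if the database itself is healthy.

```python
import socket
from typing import Callable, List

# Hypothetical endpoint name, used only for illustration.
DB_ENDPOINT = "db.example.com"


def resolve(hostname: str, lookup: Callable = socket.getaddrinfo) -> List[str]:
    """Translate a hostname into IP addresses, as a DNS lookup does."""
    records = lookup(hostname, 443, proto=socket.IPPROTO_TCP)
    # Each record is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] holds the IP address.
    return sorted({record[4][0] for record in records})


def fetch_profile(user_id: str, lookup: Callable = socket.getaddrinfo) -> str:
    """A hypothetical app feature that depends on the database endpoint."""
    try:
        addresses = resolve(DB_ENDPOINT, lookup)
    except socket.gaierror:
        # The app cannot locate the database, so the feature fails.
        return "error: database endpoint could not be resolved"
    return f"profile {user_id} loaded from {addresses[0]}"


def working_dns(host, port, **kwargs):
    """Stand-in for a healthy resolver (192.0.2.10 is a documentation address)."""
    return [(socket.AF_INET, socket.SOCK_STREAM, 6, "", ("192.0.2.10", port))]


def broken_dns(host, port, **kwargs):
    """Stand-in for a resolver that cannot answer, as during a DNS failure."""
    raise socket.gaierror("simulated DNS resolution failure")


print(fetch_profile("42", working_dns))  # profile 42 loaded from 192.0.2.10
print(fetch_profile("42", broken_dns))   # error: database endpoint could not be resolved
```

The stand-in lookups keep the sketch runnable without network access. Calling `resolve("example.com")` with no second argument would use the system's real resolver instead.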
What the AWS Outage Reveals About the Fragility of the Internet
- Although the Internet hosts billions of online services, most of it runs on cloud infrastructure managed by just three companies: AWS, Microsoft, and Google.
- Experts have long warned that this concentration of digital infrastructure poses a major risk: a minor glitch in one provider can disrupt large portions of the Internet, as seen in the recent AWS outage.
- Why Do Businesses Depend on Big Cloud Providers?
  - Until a few years ago, companies managed their own servers and computing systems in-house.
  - But shifting to major providers proved cheaper, faster, and more efficient, leading to mass outsourcing of IT operations.
  - While such outages remain infrequent, their impact is massive because so many businesses rely on the same limited set of cloud vendors.
- Similar Global Disruptions in the Past
  - In July 2024, a faulty CrowdStrike security software update crashed Microsoft Windows systems worldwide, causing widespread disruptions across sectors like aviation, banking, and broadcasting.
  - The incident showed how a single faulty update in a widely shared digital ecosystem can paralyse global systems almost instantly.
- Impact on India: Aviation and Banking Sectors Hit
  - In India, the 2024 outage particularly affected the aviation industry, disrupting operations as airlines’ digital systems failed and forcing a temporary return to manual processing.
  - According to the Reserve Bank of India, at least ten banks and NBFCs experienced minor service disruptions, most of which were quickly resolved.
The Call for Cloud Diversification
- The outage reignited debate over the need for countries to develop independent cloud infrastructure to reduce reliance on U.S. giants.
- Incidents like this highlight that outsourcing digital infrastructure to a handful of firms carries significant risks, underscoring the need for diversification and digital self-reliance.