AWS global network infrastructure (re:Invent 2019)

Network Aspects

  • Security
  • Availability
  • Scalability
  • Performance
  • Global Reach

Nitro Network Architecture

  • Nitro controller offloads many network features to hardware
    • ACLs, Security Groups, VPC Peering, etc.
  • Gives consistent network performance
  • VPC encryption - hardware accelerated encryption

Building a scalable data center

  • Networking building blocks
    • Make it easy to scale in right-sized segments
    • Strong isolation boundaries
    • Large amounts of network capacity
  • Networking technology
    • Routers
      • Single-chip routers
        • Constrained failure domain
        • Fixed port types
        • Many devices to manage
        • Simpler forwarding architecture
    • Connectivity
      • Host Rack Networking
        • Partition placement groups
          • Ensures that instances in one partition group do not share any underlying hardware with resources in another partition group
        • Spread placement groups
          • Guarantees that each instance in the placement group is placed on a distinct rack, with each rack having its own network and power source
    • Control plane
      • Uses single-chip-based platforms instead of large chassis-based platforms
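
The two placement group behaviours described above can be illustrated with a small sketch. This is not how EC2 implements placement; the function names, rack identifiers, and round-robin strategy are assumptions made purely for illustration.

```python
from collections import defaultdict


def spread_placement(instances, racks):
    """Place every instance on its own rack (hypothetical helper)."""
    if len(instances) > len(racks):
        raise ValueError("spread placement needs one distinct rack per instance")
    # zip pairs each instance with a different rack, so no rack is reused.
    return dict(zip(instances, racks))


def partition_placement(instances_by_partition, racks):
    """Give each partition a disjoint set of racks (hypothetical helper)."""
    partitions = list(instances_by_partition)
    if len(partitions) > len(racks):
        raise ValueError("need at least one rack per partition")
    # Deal racks out in turn so that no rack belongs to two partitions.
    racks_for = defaultdict(list)
    for i, rack in enumerate(racks):
        racks_for[partitions[i % len(partitions)]].append(rack)
    placement = {}
    for partition, instances in instances_by_partition.items():
        owned = racks_for[partition]
        # Instances inside one partition may share racks with each other.
        for j, instance in enumerate(instances):
            placement[instance] = owned[j % len(owned)]
    return placement


spread = spread_placement(["a", "b", "c"], ["r1", "r2", "r3", "r4"])
parts = partition_placement(
    {"p1": ["i1", "i2", "i3"], "p2": ["i4"]}, ["r1", "r2", "r3"]
)
```

In the spread case a rack failure affects at most one instance; in the partition case it affects at most one partition.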

Network Pattern

  • Core cell
    • Provides external network connectivity
  • Spine cell
    • Interconnects placement groups
  • Access cell
    • Provides connectivity for the underlying host racks based on the required number of uplinks
  • Host rack
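
As a rough illustration of the roles listed above, the sketch below traces which layers a flow would cross. It assumes a strictly layered topology in which each host rack attaches to one access cell; the rack and cell names are invented, and the real fabric is certainly more redundant than this.

```python
# Hypothetical mapping of host racks to access cells (names are made up).
ACCESS_CELL_OF = {
    "rack-1": "access-a",
    "rack-2": "access-a",
    "rack-3": "access-b",
}


def layers_traversed(src_rack, dst):
    """Return the ordered layers a flow crosses.

    dst is another rack name, or "external" for traffic leaving the
    data center.
    """
    src_cell = ACCESS_CELL_OF[src_rack]
    if dst == "external":
        # Only the core cell provides external connectivity.
        return [src_rack, src_cell, "spine", "core", "external"]
    dst_cell = ACCESS_CELL_OF[dst]
    if src_cell == dst_cell:
        # Racks under the same access cell never need the spine.
        return [src_rack, src_cell, dst]
    # Different access cells are interconnected through the spine.
    return [src_rack, src_cell, "spine", dst_cell, dst]


same_cell = layers_traversed("rack-1", "rack-2")
cross_cell = layers_traversed("rack-1", "rack-3")
outbound = layers_traversed("rack-3", "external")
```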

Availability Zones

  • Failure isolation from other AZs
  • Directly connects to other AZs
  • One or more data centers
  • Low-latency & close proximity
  • Scalability

Transit Centers

  • Provide internet and inter-region (backbone) connectivity
  • All AZs are connected redundantly
  • Located in facilities with dense internet connectivity
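
The redundancy requirement above can be expressed as a simple check. The sketch assumes that "redundantly" means at least two distinct transit centers per AZ, which the notes do not state; the AZ and transit center names are invented.

```python
def non_redundant_azs(links, minimum=2):
    """Return AZs connected to fewer than `minimum` transit centers.

    links maps an AZ name to the set of transit centers it reaches.
    The threshold of two is an assumption, not a documented AWS figure.
    """
    return sorted(az for az, centers in links.items() if len(centers) < minimum)


# Hypothetical region layout.
region_links = {
    "az-1": {"tc-east", "tc-west"},
    "az-2": {"tc-east", "tc-west"},
    "az-3": {"tc-east"},
}
at_risk = non_redundant_azs(region_links)
```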

Physical Network Encryption

  • Any link outside of AWS physical control, including links between AWS data centers and across the AWS backbone, is protected
  • All traffic between AWS Regions (except China) is carried on the AWS backbone
  • Most links are protected with MACsec or optical encryption using AES-256
  • Small number of short-distance links use laser monitoring

AWS global network backbone

  • AWS Direct Connect
  • Internet Connectivity
  • AWS Global Accelerator
  • Region to Region communication
  • Amazon CloudFront to AWS services

Benefits of having global backbone

  • Security
    • Traffic traverses AWS infrastructure rather than the internet
  • Availability
    • Controlling scaling and redundancy
    • Traffic operates over Amazon-controlled infrastructure
  • Reliable Performance
    • Controlling specific paths customer traffic traverses
  • Connecting close to customers
    • Avoid internet hot spots or sub-optimal external connectivity

Building a global backbone network

  • Latency Matters
    • Optimal under normal conditions
    • Minimize additional latency during path failures
  • 100G is the new normal for backbone links
  • Similar design patterns and operations to the data centers
  • Extreme auditing of fiber paths
    • End-to-end latency
    • Path hazards
    • Repair expectations
  • Path diversity
    • Understanding shared risk link groups (SRLGs)
  • Capacity/Scale
    • Underlying optical transport capabilities
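
The interplay of latency and path diversity above can be sketched as follows: a backup path is useful only if it shares no shared risk link group (SRLG) with the primary, and among such paths the lowest-latency one adds the least delay during a failure. The data model, path names, latencies, and SRLG labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FiberPath:
    name: str
    latency_ms: float
    srlgs: frozenset  # shared-risk identifiers, e.g. a conduit or a bridge


def pick_backup(primary, candidates):
    """Return the lowest-latency candidate sharing no SRLG with primary.

    Returns None when no SRLG-disjoint candidate exists.
    """
    diverse = [
        path
        for path in candidates
        if path != primary and not (path.srlgs & primary.srlgs)
    ]
    return min(diverse, key=lambda path: path.latency_ms, default=None)


# Hypothetical paths between two sites.
primary = FiberPath("north", 12.0, frozenset({"conduit-7", "bridge-2"}))
candidates = [
    primary,
    FiberPath("north-alt", 12.5, frozenset({"conduit-7"})),  # shares a conduit
    FiberPath("south", 15.0, frozenset({"conduit-9"})),
    FiberPath("coastal", 19.0, frozenset({"cable-landing-1"})),
]
backup = pick_backup(primary, candidates)
```

Here `north-alt` is the closest in latency but is rejected because it rides the same conduit as the primary, so a single cut could take out both.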

Inside an Edge POP

  • Multiple AWS Services
    • AWS Direct Connect
      • Low-latency access into AWS
      • Access to all AWS regions
      • Multiple customer-facing edge routers for redundancy
      • Multiple Direct Connect locations for redundancy
    • Amazon CloudFront & Amazon Route 53
      • At the Amazon global network perimeter
      • Low-latency to external networks
      • Origin fetches traverse the AWS network backbone
      • IPv4 and IPv6 DNS anycast services
    • AWS Shield
      • Traffic scrubbing platforms to protect customers automatically
      • Attack traffic is stopped at the internet edge before it reaches the backbone
    • AWS Global Accelerator
  • AWS global network access
    • Optimal interconnection with external networks
    • AWS Region transit centers
  • External internet connectivity

Summary

  • Strong isolation from failures
  • Extensive network monitoring and auto-remediation systems
  • Large amounts of redundancy and over-provisioning
  • Easily scalable at every layer
  • Custom hardware and end-to-end control