Backup and Recovery Approaches using AWS Whitepaper (2016)

AWS Storage Services for Data Protection

  • Amazon S3
    • Single object limit of 5 TB
    • Range of storage classes: Standard, Standard-IA, Glacier
  • Amazon Glacier
    • Extremely low-cost, cloud archive service
    • Secure and durable storage for archiving and online backup
    • Suited to infrequently accessed data where retrieval times of several hours are acceptable
  • AWS Storage Gateway
    • Connects an on-premises software appliance with cloud-based storage
  • AWS Transfer Services
    • AWS Direct Connect, AWS Snowball, AWS Storage Gateway, and Amazon S3 Transfer Acceleration can be used to transfer your data quickly
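
As a rough illustration of the storage classes listed above, the sketch below uploads a single backup file to S3 in a chosen class. The helper names (run, upload_backup), the backups/ key prefix, and the DRY_RUN switch are assumptions made for this example and do not come from the whitepaper; the aws s3 cp command and its --storage-class option are part of the AWS CLI.

```shell
# Print the command instead of running it when DRY_RUN=1 (a convention of this sketch).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: upload one backup file to S3 in the given storage class.
# Usage: upload_backup FILE BUCKET [STORAGE_CLASS]
upload_backup() {
    file=$1
    bucket=$2
    storage_class=${3:-STANDARD_IA}
    # Store the object under a backups/ prefix, keyed by file name (an arbitrary layout).
    key="backups/$(basename "$file")"
    run aws s3 cp "$file" "s3://$bucket/$key" --storage-class "$storage_class"
}
```

Running DRY_RUN=1 upload_backup /tmp/db.dump my-backup-bucket prints the command that would be executed, which is a cheap way to check the arguments before touching a real bucket.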

Designing a Backup and Recovery Solution

  • The backup process should meet the business's recovery time objective (RTO) and recovery point objective (RPO) at each level of recovery:
    • File-level recovery
    • Volume-level recovery
    • Application-level recovery (e.g. databases)
    • Image-level recovery
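
One simple way to reason about the RPO is to compare the age of the newest backup with the RPO itself: if the newest backup is older than the RPO, a failure right now would lose more data than the business accepts. The function below is a minimal sketch of that check; the name within_rpo and its argument order are assumptions made for illustration.

```shell
# Succeed (exit status 0) when the newest backup is no older than the RPO.
# Usage: within_rpo LAST_BACKUP_EPOCH RPO_SECONDS [NOW_EPOCH]
# All times are in seconds; NOW_EPOCH defaults to the current time.
within_rpo() {
    last_backup_epoch=$1
    rpo_seconds=$2
    now_epoch=${3:-$(date +%s)}
    # Age of the newest backup, in seconds.
    age=$((now_epoch - last_backup_epoch))
    [ "$age" -le "$rpo_seconds" ]
}
```

For example, with a one-hour RPO (3600 seconds), a backup taken 3000 seconds ago passes the check and one taken 4000 seconds ago fails it.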

Cloud-Native Infrastructure

  • EBS Snapshot-Based Protection
    • EBS snapshots are stored in Amazon S3, across multiple AZs
    • First snapshot is a full copy of the volume
    • Ongoing snapshots are incremental block-level changes only
    • EBS snapshots can be copied between regions
    • Consistent or Hot Backups
      • Best to have system in a state where it’s not performing any updates
      • For backing up the database, put it into hot backup mode when possible
      • For an XFS file system, you can flush its data and suspend writes for a consistent backup, using xfs_freeze
      • For file systems that don’t support freezing, unmount the volume, issue the snapshot command, and remount the file system
    • Multivolume Backups
      • May require different considerations
      • Data may be striped across multiple EBS volumes using a logical volume manager to increase potential throughput
      • Snapshots should be initiated simultaneously for all volumes making up the RAID set
        • Snapshots should be tagged so that you can manage them collectively during a restore
    • Database Backup Approaches
      • For databases on EC2, you can use native tools for databases or create a snapshot of the volumes
        • AMIs, created with aws ec2 create-image, can be used to restore the instance quickly
      • For databases built on RAID, you can remove the burden of backups from the production database by creating a read replica and backing up the replica
      • RDS fully automates backup and restore operations
        • Automated backups enable point-in-time recovery of your DB instance
          • The retention period for automated backups is configurable, up to a maximum of 35 days
        • DB snapshots are user-initiated backups that enable you to back up your DB instance and then restore to that state at any time
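
The two RDS backup mechanisms above can be sketched with the AWS CLI as follows. The helper names (run, set_backup_retention, snapshot_db) and the DRY_RUN switch are assumptions made for this example; aws rds modify-db-instance and aws rds create-db-snapshot are real CLI commands, and the 35-day limit is the one given in the notes.

```shell
# Print the command instead of running it when DRY_RUN=1 (a convention of this sketch).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Set the automated-backup retention period, in days, for a DB instance.
# Usage: set_backup_retention DB_INSTANCE_ID DAYS
set_backup_retention() {
    instance_id=$1
    days=$2
    # 35 days is the maximum; 0 turns automated backups off.
    if [ "$days" -lt 0 ] || [ "$days" -gt 35 ]; then
        echo "retention must be between 0 and 35 days" >&2
        return 1
    fi
    run aws rds modify-db-instance \
        --db-instance-identifier "$instance_id" \
        --backup-retention-period "$days" \
        --apply-immediately
}

# Take a user-initiated DB snapshot.
# Usage: snapshot_db DB_INSTANCE_ID SNAPSHOT_ID
snapshot_db() {
    instance_id=$1
    snapshot_id=$2
    run aws rds create-db-snapshot \
        --db-instance-identifier "$instance_id" \
        --db-snapshot-identifier "$snapshot_id"
}
```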
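
# Create a snapshot of a volume (myvolume-id is a placeholder)
aws ec2 create-snapshot --volume-id myvolume-id

# Create a new volume from the snapshot; the target is an Availability Zone, not a region
aws ec2 create-volume --availability-zone us-west-1b --snapshot-id mysnapshot-id

# Detach the old volume from the instance
aws ec2 detach-volume --volume-id oldvolume-id --instance-id myec2instance-id

# Attach the restored volume in its place
aws ec2 attach-volume --volume-id newvolume-id --instance-id myec2instance-id --device /dev/sdf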
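
The consistent-backup and multivolume points from the list above can be combined into one sketch: freeze the XFS file system, request a snapshot of every volume in the set with a shared tag, then thaw the file system. The helper names, the backup-set tag key, and the DRY_RUN switch are assumptions made for this example. It relies on an EBS snapshot capturing the volume as of the moment the request is made, so writes can resume once the requests have been issued.

```shell
# Print the command instead of running it when DRY_RUN=1 (a convention of this sketch).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Snapshot every volume of a set while the XFS file system is frozen.
# Usage: snapshot_volume_set MOUNT_POINT BACKUP_SET VOLUME_ID...
snapshot_volume_set() {
    mount_point=$1
    backup_set=$2
    shift 2
    # Flush dirty data and suspend writes to the file system.
    run xfs_freeze -f "$mount_point" || return 1
    status=0
    for volume_id in "$@"; do
        # Tag each snapshot with the same label so the set can be found at restore time.
        run aws ec2 create-snapshot \
            --volume-id "$volume_id" \
            --description "$backup_set" \
            --tag-specifications "ResourceType=snapshot,Tags=[{Key=backup-set,Value=$backup_set}]" \
            || status=1
    done
    # Always thaw the file system, even if a snapshot request failed.
    run xfs_freeze -u "$mount_point" || status=1
    return "$status"
}

# Copy a snapshot into another region.
# Usage: copy_snapshot_to_region SNAPSHOT_ID SOURCE_REGION DEST_REGION
copy_snapshot_to_region() {
    run aws ec2 copy-snapshot \
        --region "$3" \
        --source-region "$2" \
        --source-snapshot-id "$1"
}
```

Tagging at creation time keeps the set discoverable later, for instance by filtering aws ec2 describe-snapshots on the tag:backup-set key.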

On-Premises to AWS Infrastructure

  • S3 and Glacier can be integrated with on-premises applications
  • AWS Storage Gateway can be used if existing software does not natively support the AWS cloud
  • Gateways act as plug-and-play devices providing standard iSCSI devices, which can be integrated into your backup or archive framework
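
Where an on-premises host can run the AWS CLI directly, one simple form of S3 integration is to mirror a local directory into a bucket. The sketch below is only an illustration: the helper names, the onprem prefix, and the DRY_RUN switch are assumptions, while aws s3 sync is a real command that uploads only new or changed files.

```shell
# Print the command instead of running it when DRY_RUN=1 (a convention of this sketch).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Mirror a local directory into a bucket prefix.
# Usage: sync_directory SOURCE_DIR BUCKET [PREFIX]
sync_directory() {
    source_dir=$1
    bucket=$2
    prefix=${3:-onprem}
    # sync compares source and destination and uploads only new or changed files.
    run aws s3 sync "$source_dir" "s3://$bucket/$prefix/"
}
```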


Hybrid Environments

  • Mix of applications, some running in the cloud and others running on-premises
  • AWS Direct Connect provides consistent performance and latency, both for uploading data to the cloud for data protection and for hybrid workloads
  • A combination of native tools, Storage Gateway, Direct Connect, and VPN can be used to securely back up on-premises infrastructure to the cloud

Securing Backup Data in AWS

  • S3 supports encryption in transit and at rest
  • All API endpoints are available over HTTPS, so traffic is encrypted with SSL/TLS
  • Server-side encryption (SSE) can be selected with S3-managed keys (SSE-S3) or AWS KMS keys (SSE-KMS)
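
To make the server-side encryption choice concrete, the sketch below uploads a file with either S3-managed keys or a KMS key. The helper names and the DRY_RUN switch are assumptions made for this example; --sse and --sse-kms-key-id are options of aws s3 cp, and the CLI talks to HTTPS endpoints by default, which covers encryption in transit.

```shell
# Print the command instead of running it when DRY_RUN=1 (a convention of this sketch).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Upload a file with server-side encryption.
# Usage: upload_encrypted FILE S3_URI [KMS_KEY_ID]
upload_encrypted() {
    file=$1
    dest=$2
    kms_key_id=${3:-}
    if [ -n "$kms_key_id" ]; then
        # Encrypt with the given AWS KMS key.
        run aws s3 cp "$file" "$dest" --sse aws:kms --sse-kms-key-id "$kms_key_id"
    else
        # Encrypt with S3-managed keys (AES-256).
        run aws s3 cp "$file" "$dest" --sse AES256
    fi
}
```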