Backup and Recovery Approaches using AWS Whitepaper (2016)
AWS Storage Services for Data Protection
- Amazon S3
- Single object limit of 5 TB
- Range of storage classes: Standard, Standard-IA, Glacier
- Amazon Glacier
- Extremely low-cost, cloud archive service
- Secure and durable storage for archiving and online backup
- Suited to infrequently accessed data where retrieval times of several hours are acceptable
- AWS Storage Gateway
- Connects an on-premises software appliance with cloud-based storage
- AWS Transfer Services
- AWS Direct Connect, AWS Snowball, AWS Storage Gateway, and Amazon S3 Transfer Acceleration can be used to transfer your data quickly
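
As a small illustration of the storage classes listed above, the sketch below uploads a backup archive to S3 in a chosen class. The function name, the `backups/` prefix, and the bucket and file names are hypothetical; only `aws s3 cp` and its `--storage-class` option come from the AWS CLI.

```shell
# Hypothetical helper: upload one backup archive to S3 in a chosen storage class.
# Arguments: <file> <bucket> [storage class, defaults to STANDARD_IA]
upload_backup() {
  local file="$1" bucket="$2" storage_class="${3:-STANDARD_IA}"
  # The "backups/" key prefix is an assumption for this example.
  aws s3 cp "$file" "s3://${bucket}/backups/$(basename "$file")" \
    --storage-class "$storage_class"
}
```

Example call (placeholder names): `upload_backup /tmp/db-2016-01-01.tar.gz my-backup-bucket`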
Designing a Backup and Recovery Solution
- Backup process should meet the RTO and RPO of the business, including:
- File-level recovery
- Volume-level recovery
- Application-level recovery (e.g. databases)
- Image-level recovery
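
A rough way to sanity-check a backup schedule against an RPO is sketched below. It assumes a fixed backup interval and that worst-case data loss is roughly one full interval; the function name and the numbers in the example are illustrative, not from the whitepaper.

```shell
# Succeeds (exit status 0) if a fixed backup interval can satisfy the RPO.
# Assumption: worst-case data loss is about one full interval between backups.
# Arguments: <backup interval in minutes> <RPO in minutes>
meets_rpo() {
  local interval_minutes="$1" rpo_minutes="$2"
  [ "$interval_minutes" -le "$rpo_minutes" ]
}
```

For example, `meets_rpo 60 240` succeeds (hourly backups against a 4-hour RPO), while `meets_rpo 1440 240` fails (daily backups against the same RPO).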
Cloud-Native Infrastructure
- EBS Snapshot-Based Protection
- EBS snapshots are stored in Amazon S3, across multiple AZs
- First snapshot is a full copy of the volume
- Ongoing snapshots are incremental block-level changes only
- EBS snapshots can be copied between regions
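
Copying a snapshot between regions, as noted above, could look like the sketch below. The function name, description text, and IDs are placeholders; `aws ec2 copy-snapshot` is issued against the destination region and names the source region explicitly.

```shell
# Hypothetical helper: copy an EBS snapshot into another region.
# Arguments: <source region> <snapshot id> <destination region>
copy_snapshot_to_region() {
  local source_region="$1" snapshot_id="$2" dest_region="$3"
  # --region is the destination; the copy is requested there.
  aws ec2 copy-snapshot \
    --region "$dest_region" \
    --source-region "$source_region" \
    --source-snapshot-id "$snapshot_id" \
    --description "DR copy of ${snapshot_id} from ${source_region}"
}
```

Example call (placeholder ID): `copy_snapshot_to_region us-west-1 snap-0123456789abcdef0 us-east-1`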
- Consistent or Hot Backups
- Best to have the system in a state where it is not performing any updates
- When backing up a database, put it into hot backup mode when possible
- For an XFS file system, you can flush its data and suspend writes for a consistent backup, using
xfs_freeze
- For file systems that don’t support the ability to freeze, unmount the volume, issue the snapshot command, and remount the file system
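
The freeze, snapshot, thaw sequence for XFS can be sketched as follows. This is a minimal outline that assumes it runs as root on the instance; the function name, mount point, and volume ID are placeholders. The file system is thawed even if the snapshot request fails.

```shell
# Hypothetical helper: take a consistent snapshot of a mounted XFS volume.
# Arguments: <mount point> <EBS volume id>
snapshot_frozen_xfs() {
  local mount_point="$1" volume_id="$2" status=0
  # Flush dirty data and suspend new writes.
  xfs_freeze -f "$mount_point" || return 1
  # Request the snapshot; remember a failure but keep going so we can thaw.
  aws ec2 create-snapshot --volume-id "$volume_id" \
    --description "Consistent snapshot of ${mount_point}" || status=$?
  # Always resume writes, whether or not the snapshot request succeeded.
  xfs_freeze -u "$mount_point" || status=$?
  return "$status"
}
```

Example call (placeholder values): `snapshot_frozen_xfs /data vol-0123456789abcdef0`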
- Multivolume Backups
- May require different considerations
- Data may be striped across multiple EBS volumes using a logical volume manager to increase potential throughput
- Snapshots should be initiated simultaneously for all volumes making up the RAID set
- Snapshots should be tagged so that you can manage them collectively during a restore
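
One way to snapshot and tag every volume in a set is sketched below. The `BackupSet` tag key and both function names are assumptions for illustration. The loop issues the requests back to back rather than truly simultaneously, so writes should be quiesced or frozen first.

```shell
# Hypothetical helper: snapshot each volume in a set and tag it with a shared label.
# Arguments: <backup set label> <volume id>...
snapshot_volume_set() {
  local backup_set="$1"
  shift
  local volume_id
  for volume_id in "$@"; do
    aws ec2 create-snapshot --volume-id "$volume_id" \
      --tag-specifications "ResourceType=snapshot,Tags=[{Key=BackupSet,Value=${backup_set}}]" \
      || return 1
  done
}

# Hypothetical helper: list the snapshots that belong to one backup set.
list_snapshot_set() {
  aws ec2 describe-snapshots --owner-ids self \
    --filters "Name=tag:BackupSet,Values=$1" \
    --query 'Snapshots[].[SnapshotId,VolumeId,StartTime]' --output text
}
```

Example call (placeholder IDs): `snapshot_volume_set raid0-2016-01-01 vol-aaa vol-bbb`, then `list_snapshot_set raid0-2016-01-01` during a restore. Newer CLI versions also provide `aws ec2 create-snapshots`, which snapshots all volumes attached to an instance in one request.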
- Database Backup Approaches
- For databases on EC2, you can use the database's native backup tools or create a snapshot of the volumes
- An AMI of the instance can be created so that the instance can be restored quickly, using
aws ec2 create-image
- For databases built on RAID sets, you can take the backup burden off the primary by creating a read replica of the database and backing up the replica
- RDS fully automates backup and restore operations
- Automated backups enable point-in-time recovery of your DB instance
- The retention period for automated backups is configurable, up to a maximum of 35 days
- DB snapshots are user-initiated backups that enable you to back up your DB instance and then restore to that state at any time
aws ec2 create-snapshot --volume-id myvolume-id
aws ec2 create-volume --availability-zone us-west-1b --snapshot-id mysnapshot-id
aws ec2 detach-volume --volume-id oldvolume-id --instance-id myec2instance-id
aws ec2 attach-volume --volume-id newvolume-id --instance-id myec2instance-id --device /dev/sdf
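
The RDS operations described above (backup retention, user-initiated DB snapshots, point-in-time recovery) map onto AWS CLI calls roughly as sketched below. The function names and all identifiers are placeholders. Point-in-time recovery restores into a new DB instance rather than overwriting the source.

```shell
# Hypothetical helper: set the automated backup retention period (0 to 35 days).
# A value of 0 disables automated backups.
set_backup_retention() {
  local db_instance="$1" days="$2"
  if [ "$days" -lt 0 ] || [ "$days" -gt 35 ]; then
    echo "retention must be between 0 and 35 days" >&2
    return 1
  fi
  aws rds modify-db-instance --db-instance-identifier "$db_instance" \
    --backup-retention-period "$days" --apply-immediately
}

# Hypothetical helper: take a user-initiated DB snapshot.
create_db_snapshot() {
  aws rds create-db-snapshot --db-instance-identifier "$1" \
    --db-snapshot-identifier "$2"
}

# Hypothetical helper: restore to a point in time (UTC, ISO 8601) as a new instance.
restore_to_point_in_time() {
  aws rds restore-db-instance-to-point-in-time \
    --source-db-instance-identifier "$1" \
    --target-db-instance-identifier "$2" \
    --restore-time "$3"
}
```

Example call (placeholder names): `restore_to_point_in_time mydb mydb-restored 2016-06-01T03:00:00Z`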
On-Premises to AWS Infrastructure
- S3 and Glacier can be integrated with on-premises applications
- AWS Storage Gateway can be used if existing software does not natively support the AWS cloud
- Gateways act as plug-and-play devices providing standard iSCSI devices, which can be integrated into your backup or archive framework
Hybrid Environments
- Mix of applications, some running in the cloud and others running on-premises
- AWS Direct Connect provides a dedicated connection with consistent performance and latency, both for uploading data to the cloud for data protection and for hybrid workloads
- A combination of native tools, Storage Gateway, Direct Connect, and VPN can be used to securely back up on-premises infrastructure to the cloud
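
As one example of a native tool for this, an on-premises directory can be synchronized to S3 with the AWS CLI. The sketch below assumes the CLI is installed and configured on the on-premises host; the function name, bucket, and prefix are placeholders.

```shell
# Hypothetical helper: copy new and changed files from a local directory to S3.
# Arguments: <local directory> <bucket> <key prefix>
sync_to_s3() {
  local source_dir="$1" bucket="$2" prefix="$3"
  # sync only uploads files that are new or differ from what is already in S3.
  aws s3 sync "$source_dir" "s3://${bucket}/${prefix}/"
}
```

Example call (placeholder names): `sync_to_s3 /var/backups my-backup-bucket onprem/host1`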
Securing Backup Data in AWS
- S3 supports encryption in transit and at rest
- All API endpoints support SSL/TLS encryption over HTTPS
- Server-side encryption (SSE) can be chosen using KMS-managed keys (SSE-KMS) or S3-managed keys (SSE-S3)
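
Requesting server-side encryption at upload time could look like the sketch below. The function name, bucket, file, and key ID are placeholders; `--sse` and `--sse-kms-key-id` are options of `aws s3 cp`. The CLI talks to the HTTPS endpoint by default, so the upload is also encrypted in transit.

```shell
# Hypothetical helper: upload a file with server-side encryption.
# Arguments: <file> <bucket> [KMS key id]
# With a key id, SSE-KMS is requested; without one, S3-managed keys (AES256).
upload_encrypted() {
  local file="$1" bucket="$2" kms_key_id="${3:-}"
  if [ -n "$kms_key_id" ]; then
    aws s3 cp "$file" "s3://${bucket}/" --sse aws:kms --sse-kms-key-id "$kms_key_id"
  else
    aws s3 cp "$file" "s3://${bucket}/" --sse AES256
  fi
}
```

Example call (placeholder names): `upload_encrypted /tmp/db.dump my-backup-bucket`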