Deep Dive on Amazon Relational Database Service (re:Invent 2017)
Why use Amazon RDS?
- Lower TCO
- Get more leverage from your teams
- Focus on the things that differentiate you
- Built-in high availability and cross-region replication across multiple data centers
- Even a small startup can leverage multiple data centers to design highly available apps with over 99.95% availability
Which instance type should I choose?
- T2 Family
  - Burstable instances
  - Moderate networking performance
  - Good for smaller or variable workloads
  - Monitor CPU credit metrics in Amazon CloudWatch
  - T2.micro is eligible for free tier
- M3/M4 Family
  - General-purpose instances
  - High-performance networking
  - Good for running CPU intensive workloads
- R3/R4 Family
  - Memory-optimized instances
  - High-performance networking
  - Good for query intensive workloads or high connection counts
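The T2 credit metrics mentioned above can be reasoned about with simple arithmetic. The sketch below assumes the EC2 T2 credit model, in which one CPU credit pays for one vCPU at 100% for one minute. The figures of 6 credits earned per hour and a 144-credit balance are the published t2.micro values at the time and are used only as an illustration; the function name is hypothetical and other sizes earn credits at different rates.

```python
def minutes_until_credits_exhausted(credit_balance, credits_earned_per_hour, cpu_utilization):
    """Estimate how long a T2 instance can sustain a given CPU utilization.

    cpu_utilization is a fraction of one vCPU (1.0 means 100%).
    Assumes one CPU credit pays for one vCPU-minute at 100%.
    Returns float('inf') when utilization is at or below the earn rate.
    """
    earned_per_minute = credits_earned_per_hour / 60.0
    net_spend_per_minute = cpu_utilization - earned_per_minute
    if net_spend_per_minute <= 0:
        return float("inf")
    return credit_balance / net_spend_per_minute


# Illustrative t2.micro figures: 144 banked credits, 6 credits earned per hour.
full_burst = minutes_until_credits_exhausted(144, 6, 1.0)
print(round(full_burst))  # roughly 160 minutes at 100% CPU
```

If the CPU credit balance in CloudWatch trends toward zero under normal load, that is a hint the workload has outgrown the T2 family.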
Which storage type should I choose?
- General purpose (GP2)
  - SSD storage
  - Maximum of 16 TB!
  - Leverages Amazon EBS Elastic Volumes
  - IOPS determined by volume size
    - Minimum of 100 IOPS (below 33.33 GiB)
    - Bursts to 3,000 IOPS (applicable below 1.3 TB)
    - Baseline of 10,000 IOPS (at 3.3 TB and above)
  - Affordable performance
- Provisioned IOPS (IO1)
  - SSD storage
  - Maximum of 16 TB!
  - Leverages Amazon EBS Elastic Volumes
  - Maximum of 40K IOPS (20K on SQL Server)
  - Delivers within 10% of the IOPS performance 99.9% of the time
  - High performance and consistency
- Magnetic
  - Magnetic storage
  - Maximum of 1 TB
  - Supported for legacy databases
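The GP2 thresholds listed above are consistent with a ratio of 3 IOPS per GiB, clamped between 100 and 10,000 IOPS. The ratio is inferred from those thresholds rather than stated in the talk, and the function name is hypothetical; treat this as a minimal sketch of the sizing rule.

```python
def gp2_baseline_iops(size_gib):
    """Baseline IOPS for a GP2 volume.

    Assumes 3 IOPS per GiB with a floor of 100 and the 10,000 IOPS
    ceiling quoted in the talk.
    """
    return int(min(max(100, 3 * size_gib), 10_000))


for size_gib in (20, 100, 1000, 3334):
    print(size_gib, "GiB ->", gp2_baseline_iops(size_gib), "IOPS")
```

A volume smaller than about 33.33 GiB sits on the 100 IOPS floor, and one of about 3.3 TB or larger reaches the ceiling.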
How do I decide between GP2 and IO1?
- GP2 is a great choice, but be aware of burst credits on volumes < 1TB
- When burst credits are depleted, IOPS drop to the baseline; latency and queue depth metrics will spike until credits are replenished
- Monitor BurstBalance to see the percentage of burst-bucket I/O credits available
- Monitor read/write IOPS to see if average IOPS is greater than the baseline
- Treat both the GP2 burst rate and the stated PIOPS figure as maximum I/O rates, not guaranteed sustained rates
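To make the burst-credit warning concrete, the sketch below estimates how long a GP2 volume can run above its baseline before the bucket empties. It assumes a full bucket of 5.4 million I/O credits (the figure from the Amazon EBS documentation, not from this talk), a 3,000 IOPS burst ceiling, and a baseline of 3 IOPS per GiB; all function and constant names are hypothetical.

```python
GP2_BURST_IOPS = 3_000
GP2_CREDIT_BUCKET = 5_400_000  # assumed full-bucket size, per the EBS documentation


def gp2_baseline_iops(size_gib):
    """Assumed baseline: 3 IOPS per GiB, clamped to [100, 10000]."""
    return min(max(100, 3 * size_gib), 10_000)


def burst_seconds(size_gib, workload_iops=GP2_BURST_IOPS, burst_balance_pct=100.0):
    """Seconds until the burst bucket is empty at a sustained workload.

    burst_balance_pct plays the role of the BurstBalance metric.
    Returns float('inf') when the workload does not exceed the baseline.
    """
    baseline = gp2_baseline_iops(size_gib)
    demand = min(workload_iops, GP2_BURST_IOPS)
    drain_rate = demand - baseline  # credits spent minus credits refilled, per second
    if drain_rate <= 0:
        return float("inf")
    credits = GP2_CREDIT_BUCKET * burst_balance_pct / 100.0
    return credits / drain_rate


print(burst_seconds(100) / 60)  # minutes of full burst on a 100 GiB volume
```

Under these assumptions a 100 GiB volume sustains a full 3,000 IOPS burst for about 33 minutes, which is why average IOPS above the baseline deserves an alarm.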
How do I scale my database instance?
- Scale compute/memory vertically up or down
  - Scale up to handle higher load as you grow over time
  - Scale down during lower usage to control costs
  - New host is attached to existing storage with minimal downtime
- Scale up Amazon EBS storage (up to 16 TB!)
  - Amazon RDS engines now support Elastic Volumes for fast scaling (now including SQL Server)
  - No downtime for storage scaling
  - Initial scaling operation may take longer, because storage is reconfigured on older instances
  - Can re-provision IOPS on the fly
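The scaling options above map onto a single modify request. The sketch below only builds the request parameters; the helper name and the example identifiers are hypothetical, while the parameter keys follow the RDS ModifyDBInstance API. The 16,384 GiB limit is the 16 TB ceiling quoted in the talk.

```python
MAX_STORAGE_GIB = 16 * 1024  # 16 TB ceiling quoted in the talk


def build_modify_request(instance_id, instance_class=None, allocated_storage_gib=None,
                         iops=None, apply_immediately=False):
    """Build keyword arguments for an RDS modify-instance call (no AWS call is made)."""
    params = {
        "DBInstanceIdentifier": instance_id,
        # False defers the change to the next maintenance window.
        "ApplyImmediately": apply_immediately,
    }
    if instance_class is not None:
        params["DBInstanceClass"] = instance_class  # vertical compute/memory scaling
    if allocated_storage_gib is not None:
        if not 0 < allocated_storage_gib <= MAX_STORAGE_GIB:
            raise ValueError("allocated storage must be between 1 and 16384 GiB")
        params["AllocatedStorage"] = allocated_storage_gib
    if iops is not None:
        params["Iops"] = iops  # re-provision IOPS on IO1 storage
    return params


request = build_modify_request("mydb", instance_class="db.r4.large",
                               allocated_storage_gib=500, apply_immediately=True)
print(request)
```

With boto3 installed and credentials configured, the dictionary could be passed as `boto3.client("rds").modify_db_instance(**request)`.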
What happens during a Multi-AZ failover?
- Each host manages a set of Amazon EBS volumes with a full copy of the data
- Instances are monitored by an external observer to maintain consensus over quorum
- Failover initiated by automation or through the Amazon RDS API
- Redirection to the new primary instance is provided through DNS (watch for TTLs)
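Because failover redirects clients through DNS, an application should reconnect by hostname rather than reuse a cached IP address. The following is a minimal sketch of a retry loop with exponential backoff; `connect_with_retry`, the fake driver, and the endpoint name are all hypothetical stand-ins for a real database driver.

```python
import time


def connect_with_retry(connect, endpoint, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call connect(endpoint) until it succeeds, backing off between attempts.

    `connect` is expected to open a brand-new connection from the hostname
    each time, so DNS is resolved again once the record points at the new primary.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return connect(endpoint)
        except OSError as exc:  # covers refused, reset, and timed-out connections
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise ConnectionError(f"could not reach {endpoint} after {attempts} attempts") from last_error


# Demonstration with a fake driver that fails twice, as it might during a failover.
calls = []


def fake_connect(endpoint):
    calls.append(endpoint)
    if len(calls) < 3:
        raise ConnectionRefusedError("old primary is gone")
    return f"connected to {endpoint}"


delays = []
result = connect_with_retry(fake_connect, "mydb.example.internal", sleep=delays.append)
print(result, delays)
```

Client-side DNS caches that ignore the record's TTL (some runtimes cache lookups indefinitely by default) defeat this pattern, which is the reason for the TTL warning above.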
Why would I use Read Replicas?
- Relieve pressure on your source database with additional read capacity
- Bring data close to your applications in different regions
- Promote a Read Replica to a master for faster recovery in the event of disaster
- Upgrade a Read Replica to a new engine version
- Supported for MySQL, MariaDB, and PostgreSQL
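Using replicas for read capacity requires the application to decide where each statement goes. The sketch below is one simple way to do that, sending SELECT statements to replicas in round-robin order and everything else to the source; the class and endpoint names are hypothetical, and the statement check is deliberately naive.

```python
import itertools


class ReadWriteRouter:
    """Route read-only statements to replicas and all other statements to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # cycle() yields replicas in round-robin order; None means "no replicas".
        self._replicas = itertools.cycle(replicas) if replicas else None

    def endpoint_for(self, sql):
        words = sql.split(None, 1)
        first_word = words[0].upper() if words else ""
        if first_word == "SELECT" and self._replicas is not None:
            return next(self._replicas)
        return self.primary


router = ReadWriteRouter("primary.example.internal",
                         ["replica-1.example.internal", "replica-2.example.internal"])
print(router.endpoint_for("SELECT * FROM orders"))
print(router.endpoint_for("UPDATE orders SET status = 'shipped'"))
```

Replication to Read Replicas is asynchronous, so a read routed this way may not yet see a write that was just committed on the source.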
How does Amazon RDS manage backups?
- Two options - automated backups and manual snapshots
- Backups leverage Amazon EBS snapshots stored in S3
- Transaction logs are stored every 5 minutes in Amazon S3 to support point-in-time recovery (PITR)
- No performance penalty for backups
- Snapshots can be copied across regions or shared with other accounts
When to use Automated vs Manual backups?
- Automated
  - Specify backup retention window per instance (7-day default)
  - Kept until outside of window (35-day maximum) or instance is deleted
  - Support PITR
  - Good for disaster recovery
- Manual
  - Manually created through AWS console, AWS CLI, or Amazon RDS API
  - Kept until you delete them
  - Restores to saved snapshot
  - Use as a checkpoint before making large changes, for non-production/test environments, or as a final copy before deleting a database
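The retention window and the 5-minute transaction log interval together bound what point-in-time recovery can reach. The sketch below approximates that window; it is a simplification with hypothetical names, and in practice the service reports the actual latest restorable time for each instance (the `LatestRestorableTime` field returned when describing an instance).

```python
from datetime import datetime, timedelta, timezone

MAX_RETENTION_DAYS = 35                     # maximum retention quoted in the talk
LOG_UPLOAD_INTERVAL = timedelta(minutes=5)  # transaction logs are stored every 5 minutes


def pitr_window(now, retention_days=7):
    """Approximate (earliest, latest) restorable times for automated backups."""
    if not 1 <= retention_days <= MAX_RETENTION_DAYS:
        raise ValueError("retention_days must be between 1 and 35 for PITR")
    earliest = now - timedelta(days=retention_days)
    latest = now - LOG_UPLOAD_INTERVAL  # worst case: the last log upload was 5 minutes ago
    return earliest, latest


now = datetime(2017, 11, 30, 12, 0, tzinfo=timezone.utc)
earliest, latest = pitr_window(now)
print(earliest.isoformat(), latest.isoformat())
```

A manual snapshot, by contrast, restores only to the single moment at which it was taken.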
Restoring Backups
- Restoring creates an entirely new database instance
- New volumes are hydrated from Amazon S3
- While the volume is usable immediately, full performance requires the volume to warm up until fully instantiated
- Migrate to a DB instance class with high I/O capacity
- Maximize I/O during restore process
When should I use Multi-AZ as opposed to Read Replicas?
- Multi-AZ
  - Synchronous replication - highly durable
  - Only primary instance is active at any point in time
  - Backups can be taken from secondary
  - Always in two Availability Zones within a Region
  - Database engine version upgrades happen on primary
  - Automatic failover when a problem is detected
- Read Replicas
  - Asynchronous replication - highly scalable
  - All replicas are active and can be used for read scaling
  - No backups configured by default
  - Can be within an AZ, cross-AZ or cross-region
  - Database engine version upgrades are applied independently of the source instance
  - Can be manually promoted to a standalone database
How do I secure my Amazon RDS database?
- Designed to be secure by default: patches, updates, etc…
- Network isolation with VPC
- AWS IAM based resource-level permission controls
- Encryption at rest using AWS KMS (all engines) or Oracle/Microsoft TDE
  - No performance penalty for encrypting data
  - Encryption cannot be removed from DB instances
  - If source is encrypted, Read Replicas must be encrypted
  - Add encryption to an unencrypted DB instance by encrypting a snapshot copy
- Use SSL protection for data in transit
- Do not use AWS root credentials to manage RDS resources - create an IAM user for everyone, including yourself
- Can use AWS Multi-Factor Authentication (MFA) to provide an extra level of protection
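Adding encryption to an existing unencrypted instance by way of a snapshot copy amounts to three API calls in sequence. The sketch below only lists those calls and their arguments; the helper name and the identifier naming scheme are hypothetical, while the operation and parameter names are those of the boto3 RDS client.

```python
def encryption_plan(instance_id, kms_key_id, suffix="encrypted"):
    """Return the ordered (operation, arguments) steps; no AWS call is made here."""
    snapshot_id = f"{instance_id}-pre-encryption"
    encrypted_snapshot_id = f"{instance_id}-{suffix}-snapshot"
    new_instance_id = f"{instance_id}-{suffix}"
    return [
        # 1. Snapshot the unencrypted instance.
        ("create_db_snapshot", {
            "DBInstanceIdentifier": instance_id,
            "DBSnapshotIdentifier": snapshot_id,
        }),
        # 2. Copy the snapshot, supplying a KMS key so the copy is encrypted.
        ("copy_db_snapshot", {
            "SourceDBSnapshotIdentifier": snapshot_id,
            "TargetDBSnapshotIdentifier": encrypted_snapshot_id,
            "KmsKeyId": kms_key_id,
        }),
        # 3. Restore a new, encrypted instance from the encrypted copy.
        ("restore_db_instance_from_db_snapshot", {
            "DBInstanceIdentifier": new_instance_id,
            "DBSnapshotIdentifier": encrypted_snapshot_id,
        }),
    ]


for operation, arguments in encryption_plan("mydb", "alias/my-rds-key"):
    print(operation, arguments)
```

Each step must finish before the next one starts, and because a restore creates a new instance, the application has to be pointed at the new endpoint afterwards.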
How do I monitor my Amazon RDS database?
- Amazon CloudWatch Metrics
  - CPU/Storage/Memory
  - Swap Usage
  - I/O (read and write)
  - Latency
  - Throughput
  - Replica lag
- Amazon CloudWatch Alarms
- Enhanced monitoring for RDS
  - Access to over 50 CPU, memory, file system and disk I/O metrics
  - As low as 1-second intervals
  - Integration with third-party monitoring tools
- Amazon RDS Performance Insights
  - Measures DB Load - Average Active Sessions (AAS)
  - Identifies database bottlenecks (TOP SQL)
  - Identifies source of bottlenecks
  - Enables problem discovery
  - Adjustable time frame (hour, day, week and longer)
- Subscribe to SNS notifications on events
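A CloudWatch alarm on one of these metrics is defined by a handful of parameters. The sketch below builds them for a single RDS instance; the helper name, instance identifiers, and thresholds are hypothetical examples, while the keys follow the CloudWatch PutMetricAlarm API and `AWS/RDS` is the namespace RDS publishes to.

```python
def rds_alarm_params(instance_id, metric_name, threshold,
                     comparison="LessThanThreshold",
                     period_seconds=300, evaluation_periods=3):
    """Build keyword arguments for a CloudWatch alarm on one RDS instance metric."""
    return {
        "AlarmName": f"{instance_id}-{metric_name}",
        "Namespace": "AWS/RDS",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": instance_id}],
        "Statistic": "Average",
        "Period": period_seconds,
        "EvaluationPeriods": evaluation_periods,
        "Threshold": threshold,
        "ComparisonOperator": comparison,
    }


# Alarm when GP2 burst credits fall below 20%, and when replica lag exceeds 60 seconds.
burst_alarm = rds_alarm_params("mydb", "BurstBalance", threshold=20.0)
lag_alarm = rds_alarm_params("mydb-replica", "ReplicaLag", threshold=60.0,
                             comparison="GreaterThanThreshold")
print(burst_alarm["AlarmName"], lag_alarm["AlarmName"])
```

With boto3 available, either dictionary could be passed as `boto3.client("cloudwatch").put_metric_alarm(**burst_alarm)`; an alarm action pointing at an SNS topic would need to be added for notifications.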
How do you maintain my database?
- Any maintenance that causes downtime will be scheduled in your maintenance window
- Operating system or Amazon RDS software patches are usually performed without restarting databases
- Database engine upgrades require downtime
  - Minor version upgrades - automatically or manually applied
  - Major version upgrades - manually applied
  - Version deprecations - three to six-month notification before scheduled upgrades
- View upcoming maintenance events in your AWS Personal Health Dashboard
How am I charged for Amazon RDS?
- Database instance (instance hours)
- Database storage (GB-mo)
- Backup storage
  - No charge for backup storage up to 100% of total database storage
- Data transfer (GB)
  - Uses AWS regional data-transfer pricing
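The billing dimensions above can be combined into a rough monthly estimate. In the sketch below every rate is a made-up placeholder rather than an actual AWS price, and the function name is hypothetical; the only rule taken from the talk is that backup storage is free up to 100% of database storage.

```python
def monthly_rds_estimate(instance_hours, hourly_rate,
                         storage_gb, storage_rate_gb_month,
                         backup_gb, backup_rate_gb_month,
                         transfer_gb=0.0, transfer_rate_gb=0.0):
    """Break a monthly RDS bill into its four dimensions (all rates are inputs)."""
    # Backup storage is only billed beyond 100% of provisioned database storage.
    billable_backup_gb = max(0.0, backup_gb - storage_gb)
    costs = {
        "instance": instance_hours * hourly_rate,
        "storage": storage_gb * storage_rate_gb_month,
        "backup": billable_backup_gb * backup_rate_gb_month,
        "transfer": transfer_gb * transfer_rate_gb,
    }
    costs["total"] = sum(costs.values())
    return costs


# Placeholder rates chosen for easy arithmetic, not real prices.
estimate = monthly_rds_estimate(instance_hours=720, hourly_rate=0.5,
                                storage_gb=100, storage_rate_gb_month=0.25,
                                backup_gb=150, backup_rate_gb_month=0.5)
print(estimate)
```

In this example only 50 GB of the 150 GB of backups is billable, because the first 100 GB matches the provisioned database storage.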
How do I understand my bill?
- Amazon RDS charges are grouped by region
- Instances are grouped by engine
- Storage and backup charges are cross-engine
- Use AWS Cost Explorer for graphical comparison
- Use the AWS Cost & Usage Report for billing details
  - Must be enabled for account
  - Stored in your Amazon S3 bucket
Saving Money
- Use Reserved Instances
- Stop database when not in use
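As a rough illustration of the second point, the sketch below compares the instance-hour cost of an always-on database with one that runs only during working hours. The hourly rate and the 50-hour week are hypothetical, and storage is still billed while an instance is stopped, so only the instance-hour portion shrinks.

```python
HOURS_PER_WEEK = 24 * 7


def weekly_instance_cost(hourly_rate, running_hours=HOURS_PER_WEEK):
    """Instance-hour cost for one week; storage and backup charges are excluded."""
    if not 0 <= running_hours <= HOURS_PER_WEEK:
        raise ValueError("running_hours must be between 0 and 168")
    return hourly_rate * running_hours


always_on = weekly_instance_cost(0.5)                       # placeholder rate
office_hours = weekly_instance_cost(0.5, running_hours=50)  # e.g. a dev/test database
savings_pct = 100 * (1 - office_hours / always_on)
print(f"{savings_pct:.1f}% lower instance cost")
```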