Deep Dive on Amazon Relational Database Service (re:Invent 2017)

Why use Amazon RDS?

  • Lower TCO
    • Get more leverage from your teams
    • Focus on the things that differentiate you
  • Built-in high availability and cross-region replication across multiple data centers
  • Even a small startup can leverage multiple data centers to design highly available apps with over 99.95% availability

Which instance type should I choose?

  • T2 Family
    • Burstable instances
    • Moderate networking performance
    • Good for smaller or variable workloads
    • Monitor CPU credit metrics in Amazon CloudWatch
    • T2.micro is eligible for free tier
  • M3/M4 Family
    • General-purpose instances
    • High-performance networking
    • Good for running CPU-intensive workloads
  • R3/R4 Family
    • Memory-optimized instances
    • High-performance networking
    • Good for query-intensive workloads or high connection counts
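
The advice to monitor CPU credits on T2 instances can be made concrete with a little arithmetic. The sketch below assumes the standard burstable-instance credit model (one CPU credit is one vCPU at 100% for one minute); the function name is hypothetical, and the t2.micro figures used in the example (6 credits earned per hour, a balance of 144) are assumptions taken from the EC2 documentation rather than from these notes.

```python
def minutes_until_credits_exhausted(credit_balance, utilization,
                                    credits_per_hour, vcpus=1):
    """Estimate how long a burstable instance can sustain a CPU load.

    utilization is a fraction (1.0 means 100% on every vCPU).
    Returns None when credits are earned at least as fast as they are spent.
    """
    spend_per_minute = utilization * vcpus      # credits burned each minute
    earn_per_minute = credits_per_hour / 60.0   # credits accrued each minute
    net_drain = spend_per_minute - earn_per_minute
    if net_drain <= 0:
        return None  # the load is at or below the baseline
    return credit_balance / net_drain


# Assumed t2.micro figures: 144 credits banked, 6 earned per hour.
print(minutes_until_credits_exhausted(144, 1.0, 6))
```

If the result is shorter than a typical busy period, a larger T2 size or an M-family instance is probably a better fit.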

Which storage type should I choose?

  • General purpose (GP2)
    • SSD storage
    • Maximum of 16 TB!
    • Leverages Amazon EBS Elastic Volumes
    • IOPS determined by volume size
    • Minimum of 100 IOPS (below 33.33 GiB)
    • Bursts to 3,000 IOPS (applicable below 1 TB)
    • Baseline of 10,000 IOPS (at 3.3 TB and above)
    • Affordable performance
  • Provisioned IOPS (IO1)
    • SSD storage
    • Maximum of 16 TB!
    • Leverages Amazon EBS Elastic Volumes
    • Maximum of 40K IOPS (20K on SQL Server)
    • Delivers within 10% of the IOPS performance 99.9% of the time
    • High performance and consistency
  • Magnetic
    • Magnetic storage
    • Maximum of 1TB
    • Supported for legacy databases
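
Since GP2 IOPS are determined by volume size, the baseline can be computed directly. This is a minimal sketch, assuming the 3 IOPS per GiB ratio implied by the figures above (100 IOPS at about 33.33 GiB, 10,000 IOPS at about 3.3 TB). The function and constant names are hypothetical, and the limits are the 2017 values quoted in these notes.

```python
GP2_IOPS_PER_GIB = 3     # assumption: ratio implied by the figures in these notes
GP2_MIN_IOPS = 100       # floor for small volumes
GP2_MAX_IOPS = 10_000    # ceiling quoted in these notes (2017)


def gp2_baseline_iops(size_gib):
    """Return the baseline IOPS for a GP2 volume of the given size."""
    scaled = int(size_gib * GP2_IOPS_PER_GIB)
    return max(GP2_MIN_IOPS, min(GP2_MAX_IOPS, scaled))


for size in (20, 500, 1000, 4000):
    print(size, "GiB ->", gp2_baseline_iops(size), "IOPS")
```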

How do I decide between GP2 and IO1?

  • GP2 is a great choice, but be aware of burst credits on volumes smaller than 1 TB
    • Depleting the credits drops IOPS to the baseline; latency and queue depth metrics will spike until credits are replenished
    • Monitor BurstBalance to see percent of burst-bucket I/O credits available
    • Monitor read/write IOPS to see if average IOPS is greater than the baseline
  • Think of the GP2 burst rate and the stated PIOPS figure as maximum I/O rates, not guaranteed averages
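
To see how quickly burst credits can run out, here is a rough estimate of how long a GP2 volume can sustain I/O above its baseline. It is a sketch under stated assumptions: the bucket size of 5.4 million I/O credits comes from the EBS documentation, not from these notes, and the function name is hypothetical. The balance_pct argument corresponds to the BurstBalance percentage mentioned above.

```python
GP2_BURST_IOPS = 3_000          # burst ceiling quoted in these notes
GP2_BURST_BUCKET = 5_400_000    # assumption: full bucket size per EBS documentation


def seconds_of_burst(baseline_iops, demand_iops=GP2_BURST_IOPS, balance_pct=100.0):
    """Estimate seconds until the burst bucket is empty.

    Credits refill at the baseline rate, so the bucket drains at
    (demand - baseline) credits per second. Returns None when demand
    does not exceed the baseline.
    """
    demand = min(demand_iops, GP2_BURST_IOPS)   # cannot exceed the burst ceiling
    drain_per_second = demand - baseline_iops
    if drain_per_second <= 0:
        return None
    credits = GP2_BURST_BUCKET * balance_pct / 100.0
    return credits / drain_per_second


# A 100 GiB volume has a 300 IOPS baseline under the 3 IOPS/GiB assumption.
print(seconds_of_burst(300))
```

If the sustained average IOPS is above the baseline for longer than this estimate, a larger GP2 volume or IO1 is worth considering.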

How do I scale my database instance?

  • Scale compute/memory vertically up or down
    • Handle higher load to grow over time
    • Lower usage to control costs
    • New host is attached to existing storage with minimal downtime
  • Scale up Amazon EBS storage (up to 16 TB!)
    • Amazon EBS-based engines now support Elastic Volumes for fast scaling (now including SQL Server)
    • No downtime for storage scaling
    • Initial scaling operation may take longer, because storage is reconfigured on older instances
    • Can re-provision IOPS on the fly
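
The scaling operations above map onto a single modify call in the Amazon RDS API. The sketch below only builds the request parameters, so it runs without AWS access. The helper name and the "example-db" identifier are hypothetical, the parameter names are those of the ModifyDBInstance operation, and the 16 TB limit is read here as 16,384 GiB (an assumption).

```python
MAX_STORAGE_GIB = 16 * 1024   # assumption: the 16 TB limit expressed in GiB


def build_scale_request(instance_id, instance_class=None, storage_gib=None,
                        iops=None, apply_immediately=False):
    """Build keyword arguments for an RDS ModifyDBInstance request."""
    params = {
        "DBInstanceIdentifier": instance_id,
        # False defers the change to the next maintenance window.
        "ApplyImmediately": apply_immediately,
    }
    if instance_class is not None:
        params["DBInstanceClass"] = instance_class   # vertical compute/memory scaling
    if storage_gib is not None:
        if storage_gib > MAX_STORAGE_GIB:
            raise ValueError("requested storage exceeds the 16 TB limit")
        params["AllocatedStorage"] = storage_gib     # storage scaling
    if iops is not None:
        params["Iops"] = iops                        # re-provision IOPS
    return params


request = build_scale_request("example-db", storage_gib=500, iops=5000)
print(request)
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("rds").modify_db_instance(**request)
```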

What happens during a Multi-AZ failover?

  • Each host manages a set of Amazon EBS volumes with a full copy of the data
  • Instances are monitored by an external observer to maintain consensus over quorum
  • Failover initiated by automation or through the Amazon RDS API
  • Redirection to the new primary instance is provided through DNS (watch for TTLs)
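
Because redirection happens through DNS, clients need to reconnect by hostname after a failover rather than hold on to a resolved IP address. The following is a minimal sketch of a reconnect loop with exponential backoff. The connect argument is a hypothetical callable that opens a fresh connection to the instance endpoint each time it is called, and the retry counts are illustrative only.

```python
import time


def connect_with_retry(connect, attempts=5, delay_seconds=1.0, sleep=time.sleep):
    """Call connect() until it succeeds or the attempts run out.

    connect should resolve the endpoint hostname on every call so that
    a DNS change made by a failover is picked up once the TTL expires.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return connect()
        except OSError as error:          # covers refused and reset connections
            last_error = error
            if attempt < attempts - 1:
                sleep(delay_seconds * (2 ** attempt))   # 1s, 2s, 4s, ...
    raise last_error
```

Connection pools and drivers that cache DNS lookups may need their own TTL settings adjusted for this to be effective.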

Why would I use Read Replicas?

  • Relieve pressure on your source database with additional read capacity
  • Bring data close to your applications in different regions
  • Promote a Read Replica to a master for faster recovery in the event of disaster
  • Upgrade a Read Replica to a new engine version
  • Supported for MySQL, MariaDB, and PostgreSQL
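
One common way to relieve pressure on the source database is to route read-only statements to replicas in the application. The sketch below is illustrative only: the class name and endpoint strings are hypothetical, and the SELECT check is deliberately naive. Replicas can lag behind the source, so reads that must see the latest write should still go to the source.

```python
import itertools


class EndpointRouter:
    """Send SELECT statements to replicas round-robin, everything else to the source."""

    def __init__(self, writer, readers):
        self.writer = writer
        self._readers = itertools.cycle(readers) if readers else None

    def endpoint_for(self, sql):
        words = sql.split(None, 1)
        first_word = words[0].upper() if words else ""
        if first_word == "SELECT" and self._readers is not None:
            return next(self._readers)
        return self.writer


router = EndpointRouter("source.example", ["replica-1.example", "replica-2.example"])
print(router.endpoint_for("SELECT * FROM orders"))
print(router.endpoint_for("UPDATE orders SET status = 'shipped'"))
```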

How does Amazon RDS manage backups?

  • Two options - automated backups and manual snapshots
  • Backups leverage Amazon EBS snapshots stored in S3
  • Transaction logs are stored every 5 minutes in Amazon S3 to support point-in-time recovery (PITR)
  • No performance penalty for backups
  • Snapshots can be copied across regions or shared with other accounts

When to use Automated vs Manual backups?

  • Automated
    • Specify backup retention window per instance (7-day default)
    • Kept until outside of window (35-day maximum) or instance is deleted
    • Support PITR
    • Good for disaster recovery
  • Manual
    • Manually created through AWS console, AWS CLI, or Amazon RDS API
    • Kept until you delete them
    • Restores to saved snapshot
    • Use for checkpoint before making large changes, non-production/test environments, final copy before deleting a database
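
The retention window and the 5-minute transaction log interval together bound the times a point-in-time recovery can target. This sketch estimates that window conservatively. The function names are hypothetical, and the actual latest restorable time is reported by the service, so treat the result as an approximation.

```python
from datetime import datetime, timedelta, timezone

MAX_RETENTION_DAYS = 35                     # maximum quoted in these notes
LOG_UPLOAD_INTERVAL = timedelta(minutes=5)  # transaction log interval from these notes


def pitr_window(now, retention_days):
    """Return (earliest, latest) restorable times as a conservative estimate."""
    if not 1 <= retention_days <= MAX_RETENTION_DAYS:
        raise ValueError("retention_days must be between 1 and 35")
    earliest = now - timedelta(days=retention_days)
    latest = now - LOG_UPLOAD_INTERVAL      # the newest log may not be uploaded yet
    return earliest, latest


def can_restore_to(target, now, retention_days):
    """Check whether a target time falls inside the estimated PITR window."""
    earliest, latest = pitr_window(now, retention_days)
    return earliest <= target <= latest


now = datetime(2017, 11, 30, 12, 0, tzinfo=timezone.utc)
print(can_restore_to(now - timedelta(days=3), now, retention_days=7))
```

A restore target older than the window needs a manual snapshot instead, which is one reason to take one before large changes.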

Restoring Backups

  • Restoring creates an entirely new database instance
  • New volumes are hydrated from Amazon S3
    • While the volume is usable immediately, full performance requires the volume to warm up until fully instantiated
    • Migrate to a DB instance class with high I/O capacity
    • Maximize I/O during restore process
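
A back-of-the-envelope estimate shows why I/O capacity matters during a restore: warming a volume means reading every block once, so the time is roughly the volume size divided by the sustained read throughput. The function name and the throughput figure below are hypothetical, and real hydration speed also depends on factors not modeled here.

```python
def full_read_seconds(volume_gib, throughput_mib_per_s):
    """Rough time to read an entire volume once at a sustained throughput."""
    if throughput_mib_per_s <= 0:
        raise ValueError("throughput must be positive")
    return volume_gib * 1024 / throughput_mib_per_s


# Example: a 1,000 GiB volume read at an assumed 250 MiB/s.
seconds = full_read_seconds(1000, 250)
print(round(seconds / 60, 1), "minutes")
```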

When should I use Multi-AZ as opposed to Read Replicas?

  • Multi-AZ
    • Synchronous replication - highly durable
    • Only primary instance is active at any point in time
    • Backups can be taken from secondary
    • Always in two Availability Zones within a Region
    • Database engine version upgrades happen on primary
    • Automatic failover when a problem is detected
  • Read Replicas
    • Asynchronous replication - highly scalable
    • All replicas are active and can be used for read scaling
    • No backups configured by default
    • Can be within an AZ, cross-AZ or cross-region
    • Database engine version upgrades independently from source instance
    • Can be manually promoted to a standalone database

How do I secure my Amazon RDS database?

  • Designed to be secure by default: patches, updates, etc…
  • Network isolation with VPC
  • AWS IAM based resource-level permission controls
  • Encryption at rest using AWS KMS (all engines) or Oracle/Microsoft TDE
    • No performance penalty for encrypting data
    • Encryption cannot be removed from DB instances
    • If source is encrypted, Read Replicas must be encrypted
    • Add encryption to an unencrypted DB instance by encrypting a snapshot copy
  • Use SSL protection for data in transit
  • Do not use AWS root credentials to manage RDS resources - create an IAM user for everyone, including yourself
  • Can use AWS Multi-Factor Authentication (MFA) to provide extra level of protection

How do I monitor my Amazon RDS database?

  • Amazon CloudWatch Metrics
    • CPU/Storage/Memory
    • Swap Usage
    • I/O (read and write)
    • Latency
    • Throughput
    • Replica lag
  • Amazon CloudWatch Alarms
  • Enhanced monitoring for RDS
    • Access to over 50 CPU, memory, file system and disk I/O metrics
    • As low as 1-second intervals
  • Integration with third-party monitoring tools
  • Amazon RDS Performance Insights
    • Measures DB Load - Average Active Sessions (AAS)
    • Identifies database bottlenecks (Top SQL)
    • Identifies source of bottlenecks
    • Enables problem discovery
    • Adjustable time frame (hour, day, week and longer)
  • Subscribe to SNS notifications on events
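
The DB Load metric in Performance Insights, Average Active Sessions, is simply the mean number of sessions that were active at each sampling point. The sketch below illustrates the idea on a list of hypothetical sample counts. Comparing the result with the vCPU count is a common rule of thumb and is an assumption here, not something stated in these notes.

```python
def average_active_sessions(samples):
    """Mean number of active sessions across sampling points."""
    if not samples:
        raise ValueError("at least one sample is required")
    return sum(samples) / len(samples)


def load_exceeds_vcpus(samples, vcpus):
    """Rule of thumb: sustained AAS above the vCPU count suggests a bottleneck."""
    return average_active_sessions(samples) > vcpus


# Hypothetical per-second counts of active sessions.
samples = [2, 4, 6, 4]
print(average_active_sessions(samples), load_exceeds_vcpus(samples, vcpus=2))
```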

How do you maintain my database?

  • Any maintenance that causes downtime will be scheduled in your maintenance window
  • Operating system or Amazon RDS software patches are usually performed without restarting databases
  • Database engine upgrades require downtime
    • Minor version upgrades - automatically or manually applied
    • Major version upgrades - manually applied
    • Version deprecations - three to six-month notification before scheduled upgrades
    • View upcoming maintenance events in your AWS Personal Health Dashboard

How am I charged for Amazon RDS?

  • Database instance (instance hours)
  • Database storage (GB-mo)
  • Backup storage
    • No charge for backup storage up to 100% of total database storage
  • Data transfer (GB)
    • Uses AWS regional data-transfer pricing

How do I understand my bill?

  • Amazon RDS charges are grouped by region
  • Instances are grouped by engine
  • Storage and backup charges are cross-engine
  • Use AWS Cost Explorer for graphical comparison
  • Use the AWS Cost & Usage Report for billing details
    • Must be enabled for account
    • Stored in your Amazon S3 bucket

Saving Money

  • Use Reserved Instances
  • Stop database when not in use