Migrating Applications Running Relational Databases to AWS: Best Practices Guide (2019)

Migration Steps and Tools

  1. Migration assessment analysis
  2. Schema conversion to target database platform
  3. SQL statement and application code conversion
  4. Data migration
  5. Testing of converted database and application code
  6. Setting up replication and failover scenarios for data migration to the target platform
  7. Setting up monitoring for the new production environment and going live with the target environment

Tools to automate migration steps:

  • AWS Schema Conversion Tool (AWS SCT)
  • AWS Database Migration Service (AWS DMS)

Development Environment Setup Prerequisites

  • Ideally, the development environment mirrors the production environment
  • Set up the AWS SCT on a server in the development environment
  • Procure an Amazon RDS instance to serve as the migration target

Step 1: Migration Assessment

Database Migration Assessment Report

  • Identifies schema objects (tables, views, stored procedures, triggers, etc.) in the source database and the Action Items required to convert them to the target database, ranging from fully automated conversion, to small changes such as selecting data types or table attributes, to significant rewrites of stored procedure code
  • Recommends the best target engine, based on the source database and the features used
  • Recommends other AWS services that can substitute for missing features
  • Recommends unique features available in Amazon RDS that can save the customer licensing and other costs
  • Recommends re-architecting for the cloud, for example sharding a large database into multiple Amazon RDS instances, such as sharding by customer or tenant, sharding by geography, or sharding by partition key

Report Sections:

  • Executive summary
    • Key migration metrics that help you choose the best target engine
  • Conversion statistics graph
    • Visualizes the schema objects by the number and complexity of the conversion actions required in the migration project
      • Objects automatically converted
      • Objects with simple actions (less than 1 hour)
      • Objects with medium-complexity actions (1-4 hours)
      • Objects with significant actions (4 hours or more)
  • Conversion action items
    • Detailed list with recommendations and references in the database code

Step 2: Schema Conversion

Translation of the source database's DDL into the target database's syntax

  1. Convert the schema
  2. Apply the schema to the target database

AWS SCT

  • Automatically creates DDL scripts for objects that can be converted automatically
  • Highlights objects that require manual intervention to convert to the target platform
  • Supported languages:
    • Standard objects
    • Application code in triggers
    • PL/SQL and T-SQL
    • Simpler procedural languages of MySQL and PostgreSQL
  • Schema Mapping Rules
    • Create custom schema transformations and mapping rules during the conversion
    • Standardize naming conventions
    • Can be exported for use by AWS DMS during the Data Migration step (the exported rule format is sketched after this list)
    • Supported transformations:
      • Rename
      • Add prefix
      • Add suffix
      • Remove prefix
      • Remove suffix
      • Replace prefix
      • Replace suffix
      • Convert uppercase (not available for columns)
      • Convert lowercase (not available for columns)
      • Move to (tables only)
      • Change data type (columns only)
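
For reference, transformation rules of this kind travel to AWS DMS as a JSON table-mapping document. Below is a minimal Python sketch of one selection rule plus a rename transformation; the schema and table names are illustrative assumptions, not values from the guide.

    import json

    # A DMS table-mapping document: one selection rule that includes every
    # table in a schema, plus a transformation that renames a single table.
    # "DBO" and "INVOICES" are hypothetical names used for illustration.
    table_mappings = {
        "rules": [
            {
                "rule-type": "selection",
                "rule-id": "1",
                "rule-name": "include-schema",
                "object-locator": {"schema-name": "DBO", "table-name": "%"},
                "rule-action": "include",
            },
            {
                "rule-type": "transformation",
                "rule-id": "2",
                "rule-name": "rename-invoices",
                "rule-target": "table",
                "object-locator": {"schema-name": "DBO", "table-name": "INVOICES"},
                "rule-action": "rename",
                "value": "invoices_v2",
            },
        ]
    }

    # The serialized document is what a replication task consumes.
    print(json.dumps(table_mappings, indent=2))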

Step 3: Conversion of Embedded SQL and Application Code

  1. Run an assessment report to understand the level of effort required to convert the application code to the target platform
  2. Analyze the code to extract embedded SQL statements
  3. Allow the AWS SCT to automatically convert as much code as possible
  4. Work through the remaining conversion Action Items manually
  5. Save code changes

AWS SCT to convert application code:

  1. Extract SQL statements from the surrounding application code
  2. Convert SQL statements

Embedded SQL conversion process:

  1. Analyze the selected code folder to extract embedded SQL (a simplified extraction sketch follows this list)
  2. Convert the SQL to the target script. If the AWS SCT can convert a statement automatically, the converted script appears in the lower-right pane; any manual conversion code can also be entered there.
  3. Apply the converted SQL to the source code base, swapping out the original snippet for the newly converted snippet.
  4. Save the changes to the source code. A backup of the original source code is saved to your AWS SCT working directory with an extension of .old.
  5. Click the green checkmark to the right of Parsed SQL Script to validate the Target SQL script against the target database.
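
As a rough intuition for step 1, the sketch below scans a code folder for string literals that begin with a SQL keyword. It is a deliberately naive stand-in for what the AWS SCT automates; the regular expression and the assumption of a Java code base are both illustrative.

    import re
    from pathlib import Path

    # Naive pattern: double-quoted string literals that open with a common
    # SQL keyword. Real extraction must handle string concatenation,
    # prepared statements, and ORM calls, which this sketch ignores.
    SQL_LITERAL = re.compile(
        r'"\s*(SELECT|INSERT|UPDATE|DELETE)\b[^"]*"', re.IGNORECASE
    )

    def extract_embedded_sql(code_folder):
        """Yield (file, statement) pairs for every SQL-looking literal."""
        for source_file in Path(code_folder).rglob("*.java"):  # assumed Java sources
            text = source_file.read_text(encoding="utf-8", errors="ignore")
            for match in SQL_LITERAL.finditer(text):
                yield str(source_file), match.group(0)

    for file_name, statement in extract_embedded_sql("./src"):
        print(f"{file_name}: {statement}")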

Step 4: Data Migration

AWS DMS process:

  1. Set up a replication instance (the AWS-managed instance that performs the migration work)
  2. Define connections for the source and target databases
  3. Define data replication tasks (a minimal boto3 sketch follows this list)
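
The same three steps can be scripted against the AWS DMS API. A minimal boto3 sketch follows, assuming an Oracle source and a PostgreSQL target on Amazon RDS; every identifier, host name, and credential is a placeholder.

    import boto3

    dms = boto3.client("dms", region_name="us-east-1")  # region is an assumption

    # 1. Replication instance: the managed server that performs the migration.
    instance = dms.create_replication_instance(
        ReplicationInstanceIdentifier="migration-instance",
        ReplicationInstanceClass="dms.c5.large",
        AllocatedStorage=100,
    )

    # 2. Connections for the source and target databases.
    source = dms.create_endpoint(
        EndpointIdentifier="source-oracle",
        EndpointType="source",
        EngineName="oracle",
        ServerName="source.example.com",        # placeholder host
        Port=1521,
        Username="admin",
        Password="example-password",            # prefer AWS Secrets Manager
        DatabaseName="ORCL",
    )
    target = dms.create_endpoint(
        EndpointIdentifier="target-postgres",
        EndpointType="target",
        EngineName="postgres",
        ServerName="target.rds.amazonaws.com",  # placeholder RDS endpoint
        Port=5432,
        Username="admin",
        Password="example-password",
        DatabaseName="appdb",
    )

    # 3. Full-load task that copies the existing data.
    dms.create_replication_task(
        ReplicationTaskIdentifier="full-load-task",
        SourceEndpointArn=source["Endpoint"]["EndpointArn"],
        TargetEndpointArn=target["Endpoint"]["EndpointArn"],
        ReplicationInstanceArn=instance["ReplicationInstance"]["ReplicationInstanceArn"],
        MigrationType="full-load",
        TableMappings='{"rules": [{"rule-type": "selection", "rule-id": "1", '
                      '"rule-name": "1", "object-locator": {"schema-name": "%", '
                      '"table-name": "%"}, "rule-action": "include"}]}',
    )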

Supported Data Migrations:

  • Full load of existing data
  • Full load of existing data, followed by continuous replication of data changes to the target

Step 5: Testing Converted Code

  • Thoroughly test the migrated application
  • Ensure correct functional behavior on the new platform
  • Perform automated tests on the converted database
  • Examine data rows affected by the tests
  • Analyze data independently from application functionality (a comparison sketch follows this list)
  • Accounts for 45% of the overall migration effort
    • Exercising critical functionality in the application
    • Verifying that converted SQL objects are functioning as intended
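
In the spirit of analyzing data independently from the application, here is a minimal row-count comparison sketch. It assumes a SQL Server source reached through pyodbc and a PostgreSQL target reached through psycopg2; the driver choices, connection strings, and table list are all assumptions.

    import pyodbc
    import psycopg2

    TABLES = ["customers", "orders", "invoices"]  # illustrative table list

    source = pyodbc.connect("DSN=SourceSqlServer")  # placeholder DSN
    target = psycopg2.connect(
        "dbname=appdb host=target.rds.amazonaws.com user=admin"  # placeholder
    )

    # Compare row counts table by table; a real test plan would also
    # compare checksums or sampled rows, not just counts.
    for table in TABLES:
        src_count = source.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
        with target.cursor() as cur:
            cur.execute(f"SELECT COUNT(*) FROM {table}")
            tgt_count = cur.fetchone()[0]
        status = "OK" if src_count == tgt_count else "MISMATCH"
        print(f"{table}: source={src_count} target={tgt_count} {status}")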

Step 6: Data Replication

AWS DMS Change Data Capture (CDC) process:

  • Downtime during the full load migration is not required
  • Implements ongoing replication from the source database to the target database
  • Requires no installation on either the source or the target database

CDC supported ongoing replication:

  • Migrate existing data and replicate ongoing changes
    • Creating the target schema
    • Migrating existing data and caching changes to existing data as it is migrated
    • Applying those cached data changes until the database reaches a steady state
    • Applying current data changes to the target as soon as they are received by the replication instance
  • Replicate data changes only (a CDC-only sketch follows this list)
    • Replicates data changes from a specific point in time (when the schema and initial data load are already complete)
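
In the AWS DMS API, these two modes correspond to the MigrationType values full-load-and-cdc and cdc. A minimal boto3 sketch of a CDC-only task follows; the ARNs and the start time are placeholders.

    import boto3
    from datetime import datetime, timezone

    dms = boto3.client("dms")

    # CDC-only task: assumes the schema conversion and initial full load
    # are already complete on the target.
    dms.create_replication_task(
        ReplicationTaskIdentifier="cdc-only-task",
        SourceEndpointArn="arn:aws:dms:us-east-1:111122223333:endpoint:SOURCE",
        TargetEndpointArn="arn:aws:dms:us-east-1:111122223333:endpoint:TARGET",
        ReplicationInstanceArn="arn:aws:dms:us-east-1:111122223333:rep:INSTANCE",
        MigrationType="cdc",  # use "full-load-and-cdc" to migrate, then replicate
        TableMappings='{"rules": [{"rule-type": "selection", "rule-id": "1", '
                      '"rule-name": "1", "object-locator": {"schema-name": "%", '
                      '"table-name": "%"}, "rule-action": "include"}]}',
        # Replicate changes from this point in time onward (placeholder date).
        CdcStartTime=datetime(2019, 6, 1, tzinfo=timezone.utc),
    )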

Replication Sources:

  • MS SQL Server
    • Replication - must be enabled on the source server and distribution database
    • Transaction logs - the source database should use the Full or Bulk-Logged recovery model to enable transaction log backups
  • Oracle
    • BinaryReader or LogMiner
    • ARCHIVELOG
    • Supplemental Logging
  • PostgreSQL
    • Write-Ahead Logging (WAL)
    • Primary Key
  • MySQL
    • Binary Logging
    • Automatic backups
  • SAP ASE (Sybase)
    • Replication
  • MongoDB
    • Oplog
  • IBM Db2 LUW
    • LOGARCHMETH1 and LOGARCHMETH2 should be set to ON

Step 7: Deployment to AWS and Go Live

  • Enable the Validation option in the Task Settings of AWS DMS
    • AWS DMS validates the data migration by comparing the data in the source and target databases (a sketch for checking per-table results follows this list)
  • Post-Deployment Monitoring
    • Table statistics pane
    • Migration task metrics pane
    • Amazon CloudWatch Logs
      • Enable CloudWatch for DMS Task in the Task Settings
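
A minimal boto3 sketch for checking per-table load and validation results once the task is running; the task ARN is a placeholder.

    import boto3

    dms = boto3.client("dms")

    # Per-table statistics include validation columns when the task's
    # Validation option is enabled.
    stats = dms.describe_table_statistics(
        ReplicationTaskArn="arn:aws:dms:us-east-1:111122223333:task:EXAMPLE"
    )
    for table in stats["TableStatistics"]:
        print(
            f'{table["SchemaName"]}.{table["TableName"]}: '
            f'rows={table.get("FullLoadRows")} '
            f'validation={table.get("ValidationState")} '
            f'failed={table.get("ValidationFailedRecords")}'
        )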

Best Practices

  • Schema Conversion Best Practices
    • Save the Database Migration Assessment Report (CSV/PDF)
    • Apply DDL in the following order to avoid dependency errors:
      • Sequences
      • Tables
      • Views
      • Procedures
    • Configure the AWS SCT with the memory performance settings you need
    • Apply the additional schema that AWS SCT creates, aws_[source platform]_ext, to the target database
    • Use source code version control to track changes to target objects (both database and application code)
  • Application Code Conversion Best Practices
    • After running the initial application assessment report, save it as CSV/PDF
  • Data Migration Best Practices
    • Choose a replication instance class large enough to support your database size and transactional load
      • By default DMS loads 8 tables at a time. Performance can be increased on a larger instance.
    • On the target database, disable what isn’t needed
      • Disable unnecessary triggers, validation, foreign keys, secondary indexes, jobs, backups, and logging on the target database, if possible
    • Allocate tables in the source database that do not participate in common transactions to different tasks
    • Monitor performance of the source system to ensure it can handle the load of the database migration tasks
    • Enable logging using Amazon CloudWatch Logs - helps to troubleshoot DMS errors
    • Optimize the loading of BLOBs, XML, and other binary data using Task Settings (a tuning sketch follows this subsection)
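
A sketch of adjusting these knobs through the AWS DMS API with boto3. The setting names (MaxFullLoadSubTasks, LimitedSizeLobMode, LobMaxSize) are AWS DMS task settings; the specific values and the ARN are illustrative.

    import json

    import boto3

    dms = boto3.client("dms")

    task_settings = {
        "FullLoadSettings": {
            "MaxFullLoadSubTasks": 16    # default is 8 tables loaded in parallel
        },
        "TargetMetadata": {
            "LimitedSizeLobMode": True,  # truncate LOBs for faster loading
            "LobMaxSize": 32             # maximum LOB size in KB
        },
        "Logging": {
            "EnableLogging": True        # send task logs to CloudWatch Logs
        }
    }

    dms.modify_replication_task(
        ReplicationTaskArn="arn:aws:dms:us-east-1:111122223333:task:EXAMPLE",
        ReplicationTaskSettings=json.dumps(task_settings),
    )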
  • Data Replication Best Practices
    • For the best write performance on the target database, do not apply indexes or foreign keys during the initial load
    • Apply indexes and foreign keys to the target database before the application is ready to go live
    • For ongoing replication (such as for high availability), enable the Multi-AZ option of the replication instance, which provides high availability and failover support for the replication instance (a sketch follows this subsection)
    • Use the AWS API or AWS CLI for more advanced AWS DMS task settings - more granular control over data replication tasks and additional settings that are not currently available in the AWS Management Console
    • Disable backups on the target database during the full load for better performance. Enable them during cutover.
    • Wait until cutover to make your target RDS instance Multi-AZ for better performance.
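
Enabling Multi-AZ on an existing replication instance can also be done through the API; a minimal boto3 sketch with a placeholder ARN.

    import boto3

    dms = boto3.client("dms")

    # Turn on Multi-AZ for high availability and failover support on the
    # replication instance used for ongoing replication.
    dms.modify_replication_instance(
        ReplicationInstanceArn="arn:aws:dms:us-east-1:111122223333:rep:INSTANCE",
        MultiAZ=True,
        ApplyImmediately=True,
    )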
  • Testing Best Practices
    • Have a test environment where full regression tests of the original application can be conducted.
    • In the absence of automated testing, run “smoke” tests on the old and new applications, comparing data values and UI functionality to ensure like behavior.
    • Apply standard practices for database-driven software testing regardless of the migration process.
    • Have sample test data that is used only for testing.
    • Know your data logic and apply it to your test plans; this helps you produce comprehensive test data that covers mission-critical application functionality.
    • Test using a dataset similar in size to the production dataset to expose performance bottlenecks, such as missing or non-performant indexes.
  • Deployment and Go Live Best Practices
    • Have a rollback plan in place
    • Test the deployment on a staging or pre-production environment to ensure that all needed objects, libraries, code, etc., are included in the deployment and created in the correct order of dependency
    • Verify that AWS DMS has reached a steady state and all existing data has been replicated to the new server before cutting off access to the old application in preparation for the cutover
    • Verify that database maintenance jobs are in place, such as backups and index maintenance.
    • Turn on Multi-AZ, if required
    • Verify that monitoring is in place
    • AWS provides several services to make deployments easier and trouble-free, such as CloudFormation, OpsWorks, and CodeDeploy.
  • Post-Deployment Monitoring Best Practices
    • Create CloudWatch alarms and notifications to monitor for unusual database activity and to alert production staff if the AWS instance is not performing well (an alarm sketch follows this list).
    • Monitor logs and exception reports for unusual activity and errors.
    • Determine if there are additional platform-specific metrics to capture and monitor.
    • Monitor instance health.
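
As one concrete form of the alarm practice above, the sketch below creates a CloudWatch alarm on the AWS/DMS CDCLatencyTarget metric. The identifiers, SNS topic, and threshold are illustrative assumptions.

    import boto3

    cloudwatch = boto3.client("cloudwatch")

    # Alert when the replication task's target apply latency stays above
    # 60 seconds for three consecutive 5-minute periods.
    cloudwatch.put_metric_alarm(
        AlarmName="dms-cdc-target-latency",
        Namespace="AWS/DMS",
        MetricName="CDCLatencyTarget",
        Dimensions=[
            {"Name": "ReplicationInstanceIdentifier", "Value": "migration-instance"},
            {"Name": "ReplicationTaskIdentifier", "Value": "cdc-only-task"},
        ],
        Statistic="Average",
        Period=300,
        EvaluationPeriods=3,
        Threshold=60.0,
        ComparisonOperator="GreaterThanThreshold",
        AlarmActions=["arn:aws:sns:us-east-1:111122223333:ops-alerts"],  # placeholder topic
    )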