Maximizing Value with AWS Whitepaper (2017)

Achieve Total Cost of Operation Benefits Using Cloud Computing

Create a Culture of Cost Management

  • Use tools like AWS Trusted Advisor and AWS Cost Explorer
    • Put data in the hands of everyone
      • Shortens the feedback loop between information/data and the action needed to correct usage and sizing issues
    • Enact policies and evangelize
      • Best practices to drive operational excellence
    • Spend time training
      • Educate staff on the factors that affect cost and the steps to eliminate waste
    • Create incentives for good behavior
      • Encourage cost effectiveness throughout the organization

Driving Cost of Operation

Funding Models

  • Traditional Data Center
    • Few big purchase decisions are made by a few people every few years
    • Typically over-provisioned as a result of planning up front for spikes in usage
  • Cloud
    • Decentralized spending power
    • Small decisions made by a lot of people
    • Resources are spun up and down as new services are designed and then decommissioned
    • Cost ramifications felt by the organization as a whole are closely monitored and tracked

Spending Habits

  • Actively manage workloads - turn services on and off as needed
  • Eliminate surprises. Provide visibility into costs by making dashboard review a daily habit
  • Make cost optimization a joint effort. “Spenders” closely working with “watchers”.
  • Allocate charges to organizations actually using services.
  • Savings. Know who uses services and how they use them. Select best rate, evaluate pricing options that best meet the workload.
  • Tie spending to business metrics. Determine what gets measured, track usage, and identify areas for improvement.
  • Use innovative approaches to optimize spend. Consider policies such as “default off” for test and dev, as opposed to 24/7 or even “on during business hours”.
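
The effect of a schedule policy on run time can be estimated with simple arithmetic. The sketch below is illustrative only: the function name, the 30-day month, and the example schedules (220 hours for "on during business hours", 40 hours for "default off") are assumptions, not figures from the whitepaper.

```python
def fraction_saved(scheduled_hours: float, days_in_month: int = 30) -> float:
    """Fraction of always-on instance hours avoided by running on a schedule.

    Inputs are hypothetical; actual savings depend on the pricing model in use.
    """
    always_on_hours = 24 * days_in_month
    if not 0 <= scheduled_hours <= always_on_hours:
        raise ValueError("scheduled_hours must fit within the month")
    return 1 - scheduled_hours / always_on_hours


# Assumed "business hours" policy: 10 hours a day, 22 working days a month.
business_hours = fraction_saved(10 * 22)
# Assumed "default off" policy: environment is started for 40 hours a month.
default_off = fraction_saved(40)

print(f"business hours: {business_hours:.0%} fewer instance hours than 24/7")
print(f"default off:    {default_off:.0%} fewer instance hours than 24/7")
```

Comparing the two fractions shows why "default off" can beat a fixed business-hours schedule when test and dev environments are used only occasionally.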

Total Cost of Operation

  • Reduced investment in large capital expenditures
  • Reduced operating expense (OpEx) costs involved with management and maintenance of data

Migration model that delivers optimal cost efficiency:

  • Identify Current and Migration Costs
    • Labor, Network, Capacity, Availability/Power, Servers, Space
  • Determine Break-Even Costs and Timeframes
  • Calculate Savings Following the Transition
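
The three steps above can be sketched as a small calculation. This is a minimal sketch under simplifying assumptions: costs are flat per month, the migration cost is a single one-time figure, and all names and numbers are hypothetical rather than taken from the whitepaper.

```python
import math
from typing import Optional


def break_even_months(current_monthly: float, cloud_monthly: float,
                      migration_cost: float) -> Optional[int]:
    """Months until cumulative savings cover the one-time migration cost.

    Returns None when the cloud run rate is not lower than the current one,
    because the migration cost is then never recovered.
    """
    monthly_savings = current_monthly - cloud_monthly
    if monthly_savings <= 0:
        return None
    return math.ceil(migration_cost / monthly_savings)


def net_savings(current_monthly: float, cloud_monthly: float,
                migration_cost: float, months: int) -> float:
    """Cumulative savings after `months`, net of the migration cost."""
    return months * (current_monthly - cloud_monthly) - migration_cost


# Hypothetical figures for illustration only.
months = break_even_months(current_monthly=100_000, cloud_monthly=70_000,
                           migration_cost=180_000)
first_year = net_savings(100_000, 70_000, 180_000, months=12)
print(f"break-even after {months} months; net savings in year one: {first_year}")
```

A real model would also account for costs that change over time, such as running duplicate environments during the move.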

Total Costs of Migration (TCM)

  • IT staff will need to acquire new skills
  • New business processes will need to be defined
  • Existing business processes will need to be modified
  • Cost of discovery and migration tooling needs to be calculated
  • Duplicate environments will need to run until one is decommissioned
  • Penalties could be incurred for breaking data center, colocation, or licensing agreements

Migration Bubble - the time and cost of moving applications and infrastructure from on-premises data centers to the AWS Cloud. Certain costs increase during the move.

[Figure: Migration Bubble]

Employ Best Practices

  • Determine top-line business metrics
  • Stay on top of instance utilization
    • Choose a cadence, and regularly measure results for services that have moved to the cloud
    • Use tools that track performance and usage to reduce cost overruns
    • Keep track of running instances. Optimize the size of servers and adjust as needed
    • If an instance is underutilized, determine whether you still need it, or resize it
    • Be up-to-date with new AWS technologies with certain cost benefits
  • Distribute daily spending updates
    • Provide weekly reporting to evaluate and drive accountability
    • Teams should review bills and optimize costs during dev/test and prod
    • Create an atmosphere of friendly competition, such as a leaderboard that highlights teams with the best cost efficiencies
  • Every engineer can be a cost engineer
    • Innovate. Spin up instances to test new ideas.
    • Build sizing into architecture. Use tagging to help with cost allocation.
    • Schedule dev/test. Eliminate waste of resources not in use.
  • Build automation into services
    • Automate processes so that they turn off when not in use to eliminate waste
    • Automate alerts to show when thresholds have been exceeded
    • Configuration Management. Every machine defined in code spins up or down as needed to drive performance and cost optimization.
    • Set alerts on old snapshots, oversized resources, and unattached volumes and then automate and re-balance for optimal sizing.
    • Eliminate troubleshooting. If an instance goes down, spin up a new one. Stop wasting time on unproductive activities.
  • Implement a reservation process
    • Track usage and modify reservations as needed
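
Several of the practices above (reviewing underutilized instances, and alerting on old snapshots and unattached volumes) amount to simple rules over a resource inventory. The sketch below shows one way such rules might look. The `Resource` record, the thresholds, and the sample inventory are all hypothetical; in practice the data would come from monitoring or billing tools, which are not modeled here.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Resource:
    """Hypothetical inventory record; fields are assumptions for illustration."""
    name: str
    kind: str                          # "instance", "volume", or "snapshot"
    avg_cpu_percent: float = 0.0       # meaningful for instances only
    attached: bool = True              # meaningful for volumes only
    created: date = date(2017, 1, 1)   # meaningful for snapshots only


def find_waste(resources: List[Resource], today: date,
               cpu_threshold: float = 10.0,
               max_snapshot_age_days: int = 90) -> List[str]:
    """Return one alert string per resource that matches a waste rule."""
    findings = []
    for r in resources:
        if r.kind == "instance" and r.avg_cpu_percent < cpu_threshold:
            findings.append(f"{r.name}: underutilized instance, review or resize")
        elif r.kind == "volume" and not r.attached:
            findings.append(f"{r.name}: unattached volume")
        elif (r.kind == "snapshot"
              and (today - r.created).days > max_snapshot_age_days):
            findings.append(
                f"{r.name}: snapshot older than {max_snapshot_age_days} days")
    return findings


# Sample inventory with made-up names and values.
inventory = [
    Resource("web-1", "instance", avg_cpu_percent=55.0),
    Resource("batch-7", "instance", avg_cpu_percent=3.0),
    Resource("vol-a", "volume", attached=False),
    Resource("snap-old", "snapshot", created=date(2017, 1, 1)),
    Resource("snap-new", "snapshot", created=date(2017, 5, 20)),
]
findings = find_waste(inventory, today=date(2017, 6, 1))
for line in findings:
    print(line)
```

The thresholds are parameters on purpose, so each team can tune them to its own workloads before wiring the findings into automated alerts.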