Infrastructure as Code Whitepaper (2017)
Infrastructure Resource Lifecycle
IaaC Features:
- Both administrators and developers can instantiate infrastructure using configuration files
- Code can be used to produce compute, storage, network and application services
- Eliminates configuration drift through automation
- Increases the speed and agility of infrastructure deployments
Lifecycle:
- Resource provisioning
- Configuration management
- Monitoring and performance
- Compliance and governance
- Resource optimization
Resource Provisioning
AWS CloudFormation
- Uses JSON/YAML to describe a collection of AWS resources (known as a stack), their dependencies, and runtime parameters
- Templates can be used repeatedly to create copies of the same stack across AWS regions
- Code can be versioned
- Change Sets enable you to preview proposed changes to a stack without performing the associated updates
  - Create a change set
  - View the change set
  - Execute the change set
- Reusable Templates
  - Nested Stacks: associating parent / child stacks
  - Cross-stack referencing: referencing resources from one stack in another stack
- Template Linting
  - Static analysis of AWS CloudFormation templates
  - ValidateTemplate
    - CLI: aws cloudformation validate-template --template-url [url]
  - cfn-nag
    - Performs additional evaluations on templates to look for potential security concerns
  - cfn-check
    - Performs deeper checks on resource specifications to identify potential errors before they emerge during stack creation
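The three change set steps listed above (create, view, execute) can be sketched with boto3. This is a minimal sketch, not a definitive implementation: it assumes cfn is a boto3 CloudFormation client passed in by the caller, and the function names, stack name, and change set name are hypothetical.

```python
def summarize_changes(description):
    """Turn a describe_change_set response into 'Action Type LogicalId' lines."""
    lines = []
    for change in description.get("Changes", []):
        rc = change["ResourceChange"]
        lines.append(f"{rc['Action']} {rc['ResourceType']} {rc['LogicalResourceId']}")
    return lines


def preview_update(cfn, stack_name, template_body, change_set_name):
    """Steps 1 and 2: create a change set for an existing stack, then view it."""
    cfn.create_change_set(
        StackName=stack_name,
        TemplateBody=template_body,
        ChangeSetName=change_set_name,
        ChangeSetType="UPDATE",
    )
    # Wait until CloudFormation has finished computing the change set.
    cfn.get_waiter("change_set_create_complete").wait(
        StackName=stack_name, ChangeSetName=change_set_name
    )
    description = cfn.describe_change_set(
        StackName=stack_name, ChangeSetName=change_set_name
    )
    return summarize_changes(description)


def apply_update(cfn, stack_name, change_set_name):
    """Step 3: execute the change set once the preview has been reviewed."""
    cfn.execute_change_set(StackName=stack_name, ChangeSetName=change_set_name)
```

Keeping the preview and the execution in separate functions leaves room for a human or automated review between them, which is the point of using a change set.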
Template Anatomy
---
AWSTemplateFormatVersion: "version date"
Description:
  String
Parameters:
  set of parameters
Mappings:
  set of mappings
Conditions:
  set of conditions
Transform:
  set of transforms
Resources:
  set of resources
Outputs:
  set of outputs
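To make the anatomy above concrete, here is a small sketch that assembles a template as a Python dict and prints it as JSON, which CloudFormation accepts alongside YAML. The bucket, parameter, and condition names are hypothetical; only Resources is a required section, and Mappings and Transform are left out for brevity.

```python
import json


def build_template(bucket_logical_id="ExampleBucket"):
    """Assemble a small CloudFormation template as a Python dict."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative template with one S3 bucket",
        "Parameters": {
            "Environment": {
                "Type": "String",
                "AllowedValues": ["dev", "prod"],
                "Default": "dev",
            }
        },
        "Conditions": {
            # True only when the stack is launched with Environment=prod.
            "IsProd": {"Fn::Equals": [{"Ref": "Environment"}, "prod"]}
        },
        "Resources": {
            bucket_logical_id: {
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    "VersioningConfiguration": {
                        "Status": {"Fn::If": ["IsProd", "Enabled", "Suspended"]}
                    }
                },
            }
        },
        "Outputs": {
            "BucketName": {
                "Description": "Name of the created bucket",
                "Value": {"Ref": bucket_logical_id},
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_template(), indent=2))
```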
Best Practices for designing and implementing AWS CloudFormation templates:
- Planning and Organizing
  - Organize your stacks by lifecycle and ownership
  - Use IAM to control access
  - Reuse templates to replicate stacks in multiple environments
  - Use nested stacks to reuse common template patterns
  - Use cross-stack references to export shared resources
- Creating templates
  - Do not embed credentials in your templates
  - Use AWS-specific parameter types
  - Use parameter constraints
  - Use AWS::CloudFormation::Init to deploy software applications to Amazon EC2 instances
  - Use the latest helper scripts
  - Validate templates before using them
  - Use Parameter Store to centrally manage parameters in your templates
- Managing stacks
  - Manage all stack resources through AWS CloudFormation
  - Create Change Sets before updating your stacks
  - Use Stack Policies
  - Use AWS CloudTrail to log AWS CloudFormation calls
  - Use code reviews and revision control to manage your templates
  - Update your Amazon EC2 Linux instances regularly
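Three of the template-creation practices above (no embedded credentials, AWS-specific parameter types, parameter constraints) can be illustrated with a Parameters section written as a Python dict. NoEcho, AllowedValues, AllowedPattern, MinLength, and the AWS::EC2::KeyPair::KeyName type are CloudFormation features; the parameter names, the name-based heuristic, and the helper function are assumptions made for this sketch.

```python
import re

# A Parameters section that passes secrets in at deploy time instead of
# embedding them, and constrains what callers may supply.
PARAMETERS = {
    # AWS-specific type: CloudFormation checks that the key pair exists.
    "KeyName": {"Type": "AWS::EC2::KeyPair::KeyName"},
    "InstanceType": {
        "Type": "String",
        "Default": "t2.micro",
        "AllowedValues": ["t2.micro", "t2.small", "t2.medium"],
    },
    "DBPassword": {
        "Type": "String",
        "NoEcho": True,  # masks the value in console and API output
        "MinLength": 8,
        "AllowedPattern": "[a-zA-Z0-9]*",
    },
}

# Hypothetical heuristic: parameter names that suggest a secret.
SECRET_HINT = re.compile(r"password|secret|token", re.IGNORECASE)


def unprotected_secret_parameters(parameters):
    """Return names that look like secrets but do not set NoEcho."""
    return sorted(
        name
        for name, spec in parameters.items()
        if SECRET_HINT.search(name) and not spec.get("NoEcho")
    )
```

A check like this could run in a code review or pipeline step before the template is validated and deployed.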
Configuration Management
Amazon EC2 Systems Manager
Task List
- Run Command
  - Manages the configuration of managed instances at scale by distributing commands across a fleet
- Inventory
  - Automates the collection of the software inventory from managed instances
- State Manager
  - Keeps managed instances in a defined and consistent state
- Maintenance Windows
  - Define a maintenance window for running administrative tasks
- Patch Manager
  - Deploys software patches automatically across groups of instances
- Automation
  - Performs common maintenance and deployment tasks, such as updating Amazon Machine Images (AMIs)
- Parameter Store
  - Store, control, access, and retrieve configuration data, whether plain-text data such as database strings or secrets such as passwords, encrypted through AWS Key Management Service (KMS)
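As an illustration of the Parameter Store capability above, the following sketch writes a secret as a SecureString and reads it back decrypted. It assumes ssm is a boto3 SSM client supplied by the caller; the helper names and the parameter path are hypothetical.

```python
def secure_parameter_request(name, value, kms_key_id=None):
    """Build the arguments for ssm.put_parameter for an encrypted value."""
    request = {
        "Name": name,
        "Value": value,
        "Type": "SecureString",  # encrypted with AWS KMS
        "Overwrite": True,
    }
    if kms_key_id:
        # Without KeyId, the account's default KMS key for SSM is used.
        request["KeyId"] = kms_key_id
    return request


def read_parameter(ssm, name):
    """Fetch a parameter and return its decrypted value."""
    response = ssm.get_parameter(Name=name, WithDecryption=True)
    return response["Parameter"]["Value"]
```

Usage, assuming credentials and a region are configured: ssm = boto3.client("ssm"), then ssm.put_parameter(**secure_parameter_request("/app/db/password", "example")) followed by read_parameter(ssm, "/app/db/password").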
Document Structure
- A document defines the actions that Systems Manager performs on your managed instances
- Includes pre-configured documents to support the capabilities
- Supports creation of custom version-controlled documents to augment the capabilities of Systems Manager
- Steps in the document define execution order
- Written in JSON
Example:
{
  "schemaVersion": "2.0",
  "description": "Sample version 2.0 document v2",
  "parameters": {},
  "mainSteps": [
    {
      "action": "aws:runPowerShellScript",
      "name": "runShellScript",
      "inputs": {
        "runCommand": ["ipconfig"]
      }
    },
    {
      "action": "aws:applications",
      "name": "installapp",
      "inputs": {
        "action": "Install",
        "source": "http://dev.mysql.com/get/Downloads/MySQLInstaller/mysql-installer-community-5.6.22.0.msi"
      }
    }
  ]
}
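Because the steps in mainSteps define the execution order, a document can be inspected before it is registered. The sketch below rebuilds a shortened version of the sample document as a Python dict, lists its steps in order, and prepares the arguments for the boto3 call ssm.create_document. The helper names and the document name are hypothetical.

```python
import json

# Shortened version of the sample document (one step instead of two inputs).
DOCUMENT = {
    "schemaVersion": "2.0",
    "description": "Sample version 2.0 document v2",
    "parameters": {},
    "mainSteps": [
        {
            "action": "aws:runPowerShellScript",
            "name": "runShellScript",
            "inputs": {"runCommand": ["ipconfig"]},
        },
        {
            "action": "aws:applications",
            "name": "installapp",
            "inputs": {"action": "Install", "source": "https://example.com/app.msi"},
        },
    ],
}


def step_order(document):
    """Return (name, action) pairs in the order the steps will run."""
    return [(step["name"], step["action"]) for step in document["mainSteps"]]


def create_document_request(name, document):
    """Build the arguments for ssm.create_document for a Command document."""
    return {
        "Name": name,
        "Content": json.dumps(document),
        "DocumentType": "Command",
    }
```

The source URL in the second step is a placeholder, not a real installer location.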
Best Practices
- Run Command
  - Improve your security posture by leveraging Run Command to access your EC2 instances, instead of SSH/RDP
  - Audit all API calls made by or on behalf of Run Command using AWS CloudTrail
  - Use the rate control feature in Run Command to perform staged command execution
  - Use fine-grained access permissions for Run Command (and all Systems Manager capabilities) by using IAM policies
- Inventory
  - Use Inventory in combination with AWS Config to audit your application configuration over time
- State Manager
  - Update the SSM agent periodically (at least once a month) using the pre-configured AWS-UpdateSSMAgent document
  - Bootstrap EC2 instances on launch using EC2Config for Windows
  - (Specific to Windows) Upload the PowerShell or Desired State Configuration (DSC) module to Amazon S3, and use AWS-InstallPowerShellModule
  - Use tags to create application groups. Then target instances using the Targets parameter, instead of specifying individual instance IDs
  - Automatically remediate findings generated by Amazon Inspector using Systems Manager
  - Use a centralized configuration repository for all of your Systems Manager documents, and share documents across your organization
- Maintenance Windows
  - Define a schedule for performing disruptive actions on your instances
- Patch Manager
  - Use Patch Manager to roll out patches at scale and to increase fleet compliance visibility across your EC2 instances
- Automation
  - Create self-serviceable runbooks for infrastructure as Automation documents
  - Use Automation to simplify creating AMIs from the AWS Marketplace or custom AMIs, using public documents or authoring your own workflows
  - Use the documents AWS-UpdateLinuxAmi or AWS-UpdateWindowsAmi, or create a custom Automation document, to build and maintain images
- Parameter Store
  - Use Parameter Store to manage global configuration settings in a centralized manner
  - Use Parameter Store for secrets management, encrypted through AWS KMS
  - Use Parameter Store with Amazon EC2 Container Service (ECS) task definitions to store secrets
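Two of the practices above, targeting instances by tag and using rate control, map onto arguments of the Run Command send_command API. The sketch below only builds the request; the helper name, tag values, and default limits are assumptions, while Targets, MaxConcurrency, MaxErrors, and the AWS-RunShellScript document come from Systems Manager itself.

```python
def run_command_request(tag_key, tag_value, commands,
                        max_concurrency="10%", max_errors="1"):
    """Build the arguments for ssm.send_command against a tagged fleet."""
    return {
        "DocumentName": "AWS-RunShellScript",
        # Target by tag instead of listing individual instance IDs.
        "Targets": [{"Key": f"tag:{tag_key}", "Values": [tag_value]}],
        "Parameters": {"commands": list(commands)},
        # Rate control: run on a slice of the fleet at a time and stop
        # sending the command once the error limit is reached.
        "MaxConcurrency": max_concurrency,
        "MaxErrors": max_errors,
    }
```

Usage, assuming a configured boto3 SSM client named ssm: ssm.send_command(**run_command_request("Application", "web", ["uptime"])).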
AWS OpsWorks for Chef Automate
- Brings capabilities of Chef Automate to support DevOps capabilities at scale
- Based on concepts of recipes
- Configuration scripts written in Ruby
- Supports DevOps practices: workflow, compliance, visibility
Supported resource definitions
- Bash
- Directory
- Execute
- File
- Git
- Group
- Package
- Route
- Service
- User
Example:
package 'apache2' do
  case node[:platform]
  when 'centos', 'redhat', 'fedora', 'amazon'
    package_name 'httpd'
  when 'debian', 'ubuntu'
    package_name 'apache2'
  end
  action :install
end
Recipe Linting and Testing
- Linting with Rubocop
  - Static analysis based on the Ruby style guide
- Linting with Foodcritic
  - Checks Chef recipes based on a set of built-in rules
- Unit Testing with ChefSpec
- Integration Testing with Test Kitchen
  - Creates test environments and validates the creation of resources specified in Chef recipes
Best Practices
- Consider storing Chef recipes in an Amazon S3 archive, with Amazon S3 versioning
- Establish a backup schedule that meets your organizational governance requirements
- Use IAM to limit access to the OpsWorks for Chef Automate API calls
Monitoring and Performance
- Amazon CloudWatch Metrics
  - Create alarms in Amazon CloudWatch
  - Respond to metric-based alarms using built-in notifications, Amazon SNS, or custom Lambda functions
- Make use of Amazon CloudWatch Logs
  - Install the CloudWatch Logs agent on EC2 instances
  - Logstash, Graylog, and Fluentd can also ship logs
  - Logs stored in S3 can also be shipped to CloudWatch Logs, e.g. by a Lambda function triggered on an S3 event
  - CloudWatch Logs can be used to generate metrics, which can trigger alarms
  - Log processing and correlation allow deeper analysis
- CloudWatch Events
  - Events from changes to AWS environments
  - Targets can include built-in actions, SNS notifications, Lambda functions
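A metric-based alarm that notifies through Amazon SNS, as described above, might be defined as follows. This is a sketch under stated assumptions: the helper name, alarm name, threshold, and periods are illustrative, and the resulting dict is meant to be passed to the boto3 CloudWatch call put_metric_alarm.

```python
def cpu_alarm_request(instance_id, topic_arn, threshold=80.0):
    """Build the arguments for cloudwatch.put_metric_alarm on EC2 CPU usage."""
    return {
        "AlarmName": f"high-cpu-{instance_id}",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,            # seconds per datapoint
        "EvaluationPeriods": 2,   # two consecutive breaches raise the alarm
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        # SNS topic notified when the alarm enters the ALARM state.
        "AlarmActions": [topic_arn],
    }
```

Usage, assuming a configured boto3 CloudWatch client named cloudwatch: cloudwatch.put_metric_alarm(**cpu_alarm_request(instance_id, topic_arn)).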
Best Practices
- Ensure that all AWS resources are emitting metrics
- Create CloudWatch alarms for metrics that provide the appropriate responses as metric-related events arise
- Send logs from AWS resources, including Amazon S3 and Amazon EC2, to CloudWatch Logs for analysis using log stream triggers and Lambda functions
- Schedule ongoing maintenance tasks with CloudWatch and Lambda
- Use CloudWatch custom events to respond to application-level issues
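Scheduling a maintenance task with CloudWatch and Lambda, as recommended above, comes down to a scheduled rule plus a target. The sketch below builds the arguments for the boto3 CloudWatch Events calls put_rule and put_targets; the helper name, rule name, target ID, and schedule are hypothetical.

```python
def schedule_requests(rule_name, lambda_arn, expression="rate(1 day)"):
    """Build (put_rule, put_targets) arguments for a scheduled Lambda run."""
    rule = {
        "Name": rule_name,
        "ScheduleExpression": expression,  # rate(...) or cron(...) syntax
        "State": "ENABLED",
    }
    targets = {
        "Rule": rule_name,
        "Targets": [{"Id": "maintenance-lambda", "Arn": lambda_arn}],
    }
    return rule, targets
```

Usage, assuming a configured boto3 client named events: rule, targets = schedule_requests("nightly-cleanup", lambda_arn), then events.put_rule(**rule) and events.put_targets(**targets). The Lambda function also needs a resource policy that lets CloudWatch Events invoke it, which this sketch does not cover.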
Governance and Compliance
- AWS Config
  - Assess, audit, and evaluate the configurations of AWS resources
  - Automatically builds an inventory of your resources and tracks changes made to them
  - Provides a clear view of the resource change timeline
- AWS Config Rules
  - Every change triggers an evaluation by the rules associated with the resources
  - Provides managed rules for common requirements
  - Easily identify noncompliant resources, which helps with reporting and remediation
  - Supports custom rules using AWS Lambda functions
AWS Config Rule Structure
Example: Lambda to evaluate if flow logs are enabled on a given VPC:
import json

import boto3


def evaluate_compliance(config_item, vpc_id):
    """Return the compliance status of a VPC based on its flow logs."""
    if config_item['resourceType'] != 'AWS::EC2::VPC':
        return 'NOT_APPLICABLE'
    elif is_flow_logs_enabled(vpc_id):
        return 'COMPLIANT'
    else:
        return 'NON_COMPLIANT'


def is_flow_logs_enabled(vpc_id):
    """Return True if at least one flow log is attached to the VPC."""
    ec2 = boto3.client('ec2')
    response = ec2.describe_flow_logs(
        Filter=[{'Name': 'resource-id', 'Values': [vpc_id]}],
    )
    return len(response['FlowLogs']) != 0


def lambda_handler(event, context):
    # AWS Config passes the triggering configuration item as a JSON string.
    invoking_event = json.loads(event['invokingEvent'])
    config_item = invoking_event['configurationItem']
    vpc_id = config_item['resourceId']
    compliance_value = evaluate_compliance(config_item, vpc_id)

    # Report the evaluation result back to AWS Config.
    config = boto3.client('config')
    config.put_evaluations(
        Evaluations=[
            {
                'ComplianceResourceType': config_item['resourceType'],
                'ComplianceResourceId': vpc_id,
                'ComplianceType': compliance_value,
                'OrderingTimestamp': config_item['configurationItemCaptureTime'],
            }
        ],
        ResultToken=event['resultToken'],
    )
Best Practices
- Enable AWS Config for all regions to record the configuration item history, to facilitate auditing and compliance tracking
- Implement a process to respond to changes detected by AWS Config. This could include email notifications and the use of AWS Config rules to respond to changes programmatically.
Resource Optimization
- AWS Trusted Advisor
  - Observe best practices by scanning your AWS resources and comparing their usage against AWS best practices in four categories
    - Cost optimization
    - Performance
    - Security
    - Fault Tolerance
  - Trusted Advisor integrates with CloudWatch Events
    - You can design a Lambda function to respond to a change in the status of Trusted Advisor checks, e.g. send a notification
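A Lambda function that reacts to a Trusted Advisor status change by sending a notification could look roughly like this. It is a sketch under stated assumptions: the event fields read here (detail, status, check-name) and the WARN/ERROR status values reflect the author's understanding of the Trusted Advisor event format and should be checked against a real event; sns is a boto3 SNS client supplied by the caller, and the function names and subject line are hypothetical.

```python
def format_alert(event):
    """Return a message for WARN/ERROR check results, or None otherwise."""
    detail = event.get("detail", {})          # assumed event layout
    status = detail.get("status", "UNKNOWN")
    if status not in ("WARN", "ERROR"):
        return None
    check = detail.get("check-name", "unknown check")
    return f"Trusted Advisor check '{check}' reported {status}"


def make_handler(sns, topic_arn):
    """Create a Lambda-style handler that publishes alerts to an SNS topic."""
    def handler(event, context):
        message = format_alert(event)
        if message is not None:
            sns.publish(
                TopicArn=topic_arn,
                Subject="Trusted Advisor alert",
                Message=message,
            )
        return message
    return handler
```

Passing the SNS client in, rather than creating it inside the handler, keeps the message logic testable without AWS access.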
Best Practices
- Subscribe to Trusted Advisor notifications through email or another system
- Use distribution lists and ensure that the appropriate recipients are included
- With AWS Business or Enterprise Support, use the AWS Support API in conjunction with Trusted Advisor to create cases to perform remediation
Key Actions to Implement IaaC
- Start by using source control services, e.g. AWS CodeCommit
- Incorporate a quality control process via unit tests and static code analysis before deployments
- Remove the human element and automate infrastructure provisioning, including infrastructure permission policies
- Create idempotent infrastructure code that you can easily redeploy
- Roll out every new update via code by updating idempotent stacks. Avoid making one-off changes manually.
- Embrace end-to-end automation
- Include infrastructure automation work as part of regular product sprints
- Make your changes auditable, and make logging mandatory
- Define common standards across your organization and continuously optimize
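The idea of idempotent infrastructure code in the actions above can be illustrated without any AWS calls: compute the difference between desired and current state, apply it, and confirm that a second run has nothing left to do. The functions plan and apply and the resource dicts are hypothetical, in-memory stand-ins for what a tool such as CloudFormation does against real resources.

```python
def plan(desired, current):
    """Return the actions needed to move `current` to `desired`."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in current:
            actions.append(("create", name))
        elif current[name] != spec:
            actions.append(("update", name))
    for name in sorted(current):
        if name not in desired:
            actions.append(("delete", name))
    return actions


def apply(desired, current):
    """Carry out the plan on a copy of `current` and return the new state."""
    state = dict(current)
    for action, name in plan(desired, current):
        if action == "delete":
            del state[name]
        else:
            state[name] = desired[name]
    return state


if __name__ == "__main__":
    desired = {"vpc": {"cidr": "10.0.0.0/16"}, "bucket": {"versioning": True}}
    current = {"bucket": {"versioning": False}, "old-queue": {}}
    state = apply(desired, current)
    # Re-running against the converged state yields an empty plan.
    print(plan(desired, state))
```

Because the plan is derived from the difference rather than from a list of steps, redeploying the same definition is safe, which is what makes one-off manual changes unnecessary.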