Notes

Datastore

  • Files stored in S3 can be served over BitTorrent to decrease costs
  • File Gateway (Storage Gateway) can expose S3 bucket files in the office through NFS
  • http://registry.opendata.aws lists publicly available datasets
  • AWS Glue can crawl data in an S3 bucket and catalog it as a table that can be queried using Amazon Athena
  • Graph databases are best at storing complex relationship data, and Amazon Neptune is a graph database. While other options might work, none would work as well as a true graph database. We can also run such a database, like SAP HANA or Neo4j, on EC2.
  • Secondary indexes and DynamoDB Accelerator (DAX), an in-memory cache in front of DynamoDB, can accelerate DynamoDB performance
  • Gateway Stored Volume Mode, or Volume Gateway Stored Mode as it is also called, would be a way to maintain a full local copy of the data and have it replicated asynchronously to S3.
  • Amazon ElastiCache offers a fully managed Memcached and Redis service. Although the name only suggests caching functionality, the Redis service in particular can offer a number of operations such as Pub/Sub, Sorted Sets and an In-Memory Data Store. However, ElastiCache is only a key-value store and therefore cannot store relational data.
  • A global secondary index can be used to speed up queries against non-primary-key attributes. A local secondary index, on the other hand, must retain the partition key of the table. Hash key is another term for partition key.
  • Historically, if you made a HEAD or GET request for an S3 key name before creating the object, S3 provided only eventual consistency for read-after-write, so a 404 Not Found error could persist until the upload was fully replicated (usually only a few seconds). Since December 2020, S3 provides strong read-after-write consistency, so this caveat no longer applies.
  • The ACID consistency model is Atomic, Consistent, Isolated and Durable.
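
The secondary-index rules above can be illustrated with a table definition. The following is a minimal sketch, assuming a hypothetical Orders table: the table, index, and attribute names are invented for illustration, and the dictionary only mirrors the shape of the arguments that boto3's create_table call accepts, so nothing here contacts AWS. The helper checks that every local secondary index keeps the table's partition (hash) key, while the global secondary index is free to use a different one.

```python
# Hypothetical table definition (all names are assumptions for illustration).
# The keys follow the shape of boto3's DynamoDB create_table arguments.
ORDERS_TABLE = {
    "TableName": "Orders",
    "BillingMode": "PAY_PER_REQUEST",
    "AttributeDefinitions": [
        {"AttributeName": "CustomerId", "AttributeType": "S"},
        {"AttributeName": "OrderDate", "AttributeType": "S"},
        {"AttributeName": "OrderTotal", "AttributeType": "N"},
        {"AttributeName": "Status", "AttributeType": "S"},
    ],
    "KeySchema": [
        {"AttributeName": "CustomerId", "KeyType": "HASH"},   # partition key
        {"AttributeName": "OrderDate", "KeyType": "RANGE"},   # sort key
    ],
    "LocalSecondaryIndexes": [
        {
            # Same partition key as the table, different sort key.
            "IndexName": "ByCustomerAndTotal",
            "KeySchema": [
                {"AttributeName": "CustomerId", "KeyType": "HASH"},
                {"AttributeName": "OrderTotal", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
        }
    ],
    "GlobalSecondaryIndexes": [
        {
            # Different partition key: speeds up queries on a non-key attribute.
            "IndexName": "ByStatus",
            "KeySchema": [
                {"AttributeName": "Status", "KeyType": "HASH"},
                {"AttributeName": "OrderDate", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "KEYS_ONLY"},
        }
    ],
}


def partition_key(key_schema):
    """Return the attribute name of the HASH (partition) key."""
    return next(k["AttributeName"] for k in key_schema if k["KeyType"] == "HASH")


def lsi_violations(table):
    """List local secondary indexes whose partition key differs from the table's."""
    table_pk = partition_key(table["KeySchema"])
    return [
        idx["IndexName"]
        for idx in table.get("LocalSecondaryIndexes", [])
        if partition_key(idx["KeySchema"]) != table_pk
    ]
```

In a real setup the dictionary could be passed as keyword arguments to a DynamoDB client's create_table call; that step is left out so the sketch runs without credentials.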

Networking

  • Only two components allow VPC-to-Internet communication using IPv6 addresses: “Internet Gateways” (inbound and outbound) and “Egress-Only Internet Gateways” (outbound only). “NAT Instances” and “NAT Gateways” explicitly do not support IPv6 traffic, and a “Direct Connect” connection carries data between a Data Centre and an AWS VPC but does not travel over the Internet.
  • You can use DHCP Option Sets to configure which DNS servers are issued via DHCP to instances. This can be any DNS address. So long as it is reachable from the VPC, instances can use it to resolve names.
  • Jumbo frames - allow more than 1500 bytes of data per packet (9001 MTU)
    • Increased packet sizes mean fewer packets are required
    • Fewer packets mean less total overhead, since total overhead is the sum of the per-packet overheads
    • Faster transmission time, maximizing network throughput
    • Should be used with caution for traffic leaving the VPC, which may slow down due to packet fragmentation
    • Configure the MTU size per route or per network interface
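
The jumbo-frame argument above is simple arithmetic, and can be sketched as follows. This is an illustration under a stated assumption: each packet carries 40 bytes of headers (a 20-byte IPv4 header plus a 20-byte TCP header, no options). The function names and the 1 MB transfer size are made up for the example.

```python
HEADER_BYTES = 40  # assumption: 20-byte IPv4 header + 20-byte TCP header, no options


def packets_needed(payload_bytes, mtu):
    """Number of packets needed to carry payload_bytes at the given MTU."""
    payload_per_packet = mtu - HEADER_BYTES
    # Integer ceiling division: a partly filled last packet still counts.
    return (payload_bytes + payload_per_packet - 1) // payload_per_packet


def header_overhead(payload_bytes, mtu):
    """Total header bytes sent, i.e. the sum of the per-packet overheads."""
    return packets_needed(payload_bytes, mtu) * HEADER_BYTES


if __name__ == "__main__":
    transfer = 1_000_000  # bytes of application data (illustrative)
    for mtu in (1500, 9001):
        print(mtu, packets_needed(transfer, mtu), header_overhead(transfer, mtu))
```

Under these assumptions, a 1,000,000-byte transfer needs 685 packets at a 1500-byte MTU and 112 packets at 9001, so the header overhead drops from 27,400 bytes to 4,480 bytes.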

Security

  • By default, CloudTrail will log all regions and store them in a single S3 location. It can however be configured to only log specific regions.
  • OAuth 2.0 provides authorization only.
  • A Service Control Policy (SCP) is the best way to restrict the allowed regions at the OU level.
    • ACLs and resource-based policies apply to resources, not to users or groups. Identity-based policies using the aws:RequestedRegion condition key could do the job, but since we are trying to control at the OU level, an SCP would require less management and localized care. We can use a Deny with a StringNotEqualsIfExists condition against aws:RequestedRegion that lists the allowed regions.
  • DDoS Layer 7 Attacks: The challenge with layer 7 detection is telling an attack from normal user traffic. CloudFront in conjunction with AWS WAF can be an effective way to create DDoS resilience at Layer 7. Network Load Balancers and NACLs are Layer 4 solutions, and would have no visibility of Layer 7 DDoS. CloudTrail and GuardDuty are focused on the security of the AWS account, and would not be suitable in isolation for securing at Layer 7. Further information: https://d1.awsstatic.com/whitepapers/Security/DDoS_White_Paper.pdf
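
The region restriction described above (a Deny with StringNotEqualsIfExists against aws:RequestedRegion) might look like the following. This is a sketch, not a production policy: the allowed regions and the Sid are assumptions, and the statement denies every action, whereas a real SCP would usually exempt global services. The is_denied helper is a toy evaluator written for this example and is not part of any AWS SDK.

```python
import json

ALLOWED_REGIONS = ["eu-west-1", "eu-central-1"]  # assumed allow-list

REGION_SCP = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyOutsideAllowedRegions",  # hypothetical statement ID
            "Effect": "Deny",
            "Action": "*",
            "Resource": "*",
            "Condition": {
                "StringNotEqualsIfExists": {
                    "aws:RequestedRegion": ALLOWED_REGIONS
                }
            },
        }
    ],
}


def is_denied(policy, requested_region=None):
    """Toy evaluation of the Deny statements in the policy above.

    StringNotEquals matches when the requested region equals none of the
    listed values. With the IfExists suffix, a request that carries no
    region value also makes the condition true, so the Deny applies.
    """
    for stmt in policy["Statement"]:
        if stmt["Effect"] != "Deny":
            continue
        allowed = stmt["Condition"]["StringNotEqualsIfExists"]["aws:RequestedRegion"]
        if requested_region is None or requested_region not in allowed:
            return True
    return False


if __name__ == "__main__":
    # The JSON text is what would be attached to the OU as the SCP document.
    print(json.dumps(REGION_SCP, indent=2))
```

Attaching one such policy to the OU covers every account beneath it, which is why it takes less upkeep than maintaining identity-based policies per account.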

Business Continuity

  • Via Aurora Global Database, an Aurora PostgreSQL database does support automatic failover to a secondary region.
  • AWS does not recommend RAID5 or RAID6 on EBS because the parity write operations consume some of the IOPS available to the volumes
  • RAID0 offers no drive fault-tolerance. RAID1, also known as mirroring, requires 2x the required volume space. RAID5 requires 3 volumes at a minimum.
  • ElastiCache for Redis supports Multi-AZ failover
  • Recovery Point Objective will define the potential for data loss during a disaster. This can inform an expectation of manual data re-entry for BC planners.
  • Redshift currently only supports single-AZ deployments but you can run multiple clusters in different AZs.
  • Both spread placement groups and horizontal scaling spread risk across more resources. These are reasonable approaches if hardware failure is a concern.
  • RAID0, sometimes known as striping, provides the highest write performance of these options because writes are distributed across disks and no parity is required.
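
The RAID trade-offs above can be made concrete with a small capacity calculation. This is a minimal sketch, assuming equally sized volumes and a RAID1 set that mirrors every volume; the function names and the 500 GiB figure are illustrative only.

```python
def usable_capacity_gib(level, volumes, volume_size_gib):
    """Usable capacity of an array of equally sized volumes."""
    if level == 0:
        if volumes < 2:
            raise ValueError("RAID0 needs at least 2 volumes")
        return volumes * volume_size_gib          # striping, no redundancy
    if level == 1:
        if volumes < 2:
            raise ValueError("RAID1 needs at least 2 volumes")
        return volume_size_gib                    # every volume holds a full copy
    if level == 5:
        if volumes < 3:
            raise ValueError("RAID5 needs at least 3 volumes")
        return (volumes - 1) * volume_size_gib    # one volume's worth of parity
    raise ValueError(f"unsupported RAID level: {level}")


def tolerated_failures(level, volumes):
    """How many volumes can fail without losing data."""
    return {0: 0, 1: volumes - 1, 5: 1}[level]


if __name__ == "__main__":
    for level, count in ((0, 2), (1, 2), (5, 3)):
        print(level, usable_capacity_gib(level, count, 500), tolerated_failures(level, count))
```

With two 500 GiB volumes, RAID0 yields 1000 GiB with no fault tolerance, while RAID1 yields 500 GiB, which is the "2x the required volume space" cost noted above.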

Deployment and Operations

  • A CloudFormation stack policy protects all resources by default once it is set, so it needs explicit “Allow” statements for the update actions that should be permitted
  • Once applied, a stack policy can be updated only using the CLI or API, not the console
  • Continuous Delivery differs from Continuous Deployment in that Delivery still includes a manual check before release to production.
  • A Canary Release is a way to introduce a new version of an application into production with limited exposure.
  • AWS CodePipeline is a continuous delivery service that enables you to model, visualize, and automate the steps required to release your software.
  • A Disposable Upgrade is one where a new release is deployed on new instances while instances containing the old version are terminated.
  • Amazon EKS runs the Kubernetes management platform for you on AWS across multiple AZs. Because it is Kubernetes conformant, you can use third-party add-ons.
  • Service Discovery makes it easy for containers within an ECS cluster to discover and connect with each other, using Route 53 endpoints. Task Definitions define the resource utilisation and configuration of tasks, using JSON templates. Task Scheduling allows you to run batch processing jobs on a schedule. File Storage is not a component of ECS. Storage within ECS is handled by EBS volumes attached to the underlying EC2 instances and not by ECS itself.
  • OpsWorks is a global service but when creating a stack you must specify a region and it will not allow you to clone to another region.
  • AWS CodeDeploy does not provide Scaling or Provisioning of the deployment. Elastic Beanstalk, CloudFormation and OpsWorks can do this.
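
To illustrate the stack policy notes above, here is a sketch of a policy that allows updates to everything except one protected resource. The logical resource names (ProductionDatabase, WebServerGroup) are hypothetical, and update_allowed is a toy check written for this example: it only models "an explicit Deny wins, otherwise an Allow is required" and ignores per-action matching.

```python
import json

STACK_POLICY = {
    "Statement": [
        {
            # Without an Allow, every resource would stay protected.
            "Effect": "Allow",
            "Action": "Update:*",
            "Principal": "*",
            "Resource": "*",
        },
        {
            # Hypothetical logical ID of a resource that must never be updated.
            "Effect": "Deny",
            "Action": "Update:*",
            "Principal": "*",
            "Resource": "LogicalResourceId/ProductionDatabase",
        },
    ]
}


def update_allowed(policy, logical_id):
    """Toy check: an explicit Deny wins, otherwise an Allow is required."""
    target = f"LogicalResourceId/{logical_id}"
    matches = [s for s in policy["Statement"] if s["Resource"] in ("*", target)]
    if any(s["Effect"] == "Deny" for s in matches):
        return False
    return any(s["Effect"] == "Allow" for s in matches)


if __name__ == "__main__":
    # Write the policy out so it can be applied to an existing stack, e.g.:
    #   aws cloudformation set-stack-policy --stack-name <name> \
    #       --stack-policy-body file://policy.json
    print(json.dumps(STACK_POLICY, indent=2))
```

An empty statement list leaves every resource protected in this model, which matches the point that the policy must whitelist what is allowed.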

Cost Management

  • The primary value proposition around cost for AWS is that it creates the opportunity for agility using a pay-as-you-go model. Traditional CapEx models make it difficult to quickly test new ideas.
  • You should first get a solid understanding of current costs. It may turn out that a move to the cloud is not warranted even with financial evidence, in which case the other activities would be wasted effort.
  • Bulk buys are almost always cheaper than on-demand, so RIs can be a good proxy. Managed services will be more cost-effective than just mimicking a pure on-prem server farm. Additionally, soft costs like agility or maintenance should be accounted for in the model.
  • Tagging can be directly used for all of these purposes except Purchasing. However, indirectly, I could configure a CloudWatch event to trigger some action when a tag changes. That action might be a call to an API that places an order with a vendor.
  • Right sizing is using the lowest cost resource that still meets the technical specifications of a specific workload. CloudWatch and Trusted Advisor are the most direct tools for this.
  • Dedicated Hosts reserve capacity because you are paying for the whole physical server that cannot be allocated to anyone else. Dedicated Instances are available as on-demand, reserved and spot instances.
  • Costs will most certainly increase during a migration given items like training, dual environments, lease penalties, consulting and planning. AWS calls this period the migration bubble.
  • Regional RIs are not specific to an AZ and can be consumed across a region. Zonal RIs can be modified for use in another AZ using the console or the ModifyReservedInstances API.
  • Consolidated Billing is a feature of AWS Organizations. Once enabled and configured, you will receive a bill containing the costs and charges for all of the AWS accounts within the Organization. Although the individual AWS accounts are combined into a single bill, they can still be tracked individually and the cost data can be downloaded in a separate file. Using Consolidated Billing may ultimately reduce the amount you pay, as you may qualify for Volume Discounts. There is no charge for using Consolidated Billing.
  • Consolidated Billing allows you to potentially realize lower prices on some services with tiered pricing.
  • A buffering pattern is useful in smoothing demand. We can do this with SQS, using a FIFO queue to satisfy the in-order requirement. If we rely solely on a Spot Fleet, we might be outbid and have no instances available, so we can use an RI instead.
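
The tiered-pricing benefit of Consolidated Billing mentioned above can be shown with a small calculation. The tiers below are hypothetical and are not real AWS prices; the account names and usage figures are likewise made up. Prices are kept in whole cents to avoid floating-point rounding.

```python
# Hypothetical tiers, NOT real AWS prices: (tier size in GB, price in cents per GB).
# None marks the open-ended final tier.
TIERS = [(1000, 10), (None, 5)]


def tiered_cost_cents(usage_gb, tiers=TIERS):
    """Cost of usage_gb when each tier is filled before the next one starts."""
    remaining = usage_gb
    total = 0
    for size, price in tiers:
        in_tier = remaining if size is None else min(remaining, size)
        total += in_tier * price
        remaining -= in_tier
        if remaining == 0:
            break
    return total


# Two member accounts (illustrative names and usage).
ACCOUNT_USAGE_GB = {"dev": 800, "prod": 800}

# Billed separately, neither account leaves the most expensive tier.
separate_cents = sum(tiered_cost_cents(u) for u in ACCOUNT_USAGE_GB.values())

# Billed together, the combined usage spills into the cheaper tier.
consolidated_cents = tiered_cost_cents(sum(ACCOUNT_USAGE_GB.values()))

if __name__ == "__main__":
    print(separate_cents, consolidated_cents)
```

With these made-up numbers the two accounts would pay 16,000 cents separately but 13,000 cents when their usage is aggregated, because 600 GB of the combined usage lands in the cheaper tier.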

Areas of Focus:

  • Fault Tolerance, High Availability, Disaster Recovery
  • AWS Organizations, Service Control Policies
  • AWS Support Plans
  • AWS Trusted Advisor
  • Direct Connect, VPN (Gateway)