Deep Dive on Amazon S3 & Amazon Glacier Storage Management (re:Invent 2017)

Storage Management on S3

  • Organize
    • Object Tagging
  • Monitor and Analyze
    • S3 Inventory
    • Amazon CloudWatch
    • Storage Class Analysis
    • AWS CloudTrail
  • Act
    • Cross-Region Replication
    • Event Notification
    • Lifecycle Policy
  • Security Management
    • AWS KMS
    • AWS IAM
    • Bucket Permissions Check
    • Encryption Status in S3 Inventory
    • Default Encryption
    • AWS Trusted Advisor
    • Amazon Macie

User Permission Management By Tagging

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [...],
                "Resource": "arn:aws:s3:::Project-bucket/*",
                "Condition": {"StringEquals": {"s3:RequestObjectTag/Project": "x"}}
            }
        ]
    }

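The policy fragment above can also be assembled programmatically. The following is a minimal sketch, not a policy from the talk: the function name `tag_scoped_policy` is hypothetical, and `s3:PutObject` is only an illustrative action, since the notes do not show the action list.

```python
import json


def tag_scoped_policy(bucket: str, tag_key: str, tag_value: str, actions: list) -> dict:
    """Build an IAM policy that allows `actions` on objects in `bucket`
    only when the request carries the object tag tag_key=tag_value."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": actions,
                "Resource": f"arn:aws:s3:::{bucket}/*",
                "Condition": {
                    "StringEquals": {f"s3:RequestObjectTag/{tag_key}": tag_value}
                },
            }
        ],
    }


# Assumption: s3:PutObject is an illustrative action; the notes do not list one.
policy = tag_scoped_policy("Project-bucket", "Project", "x", ["s3:PutObject"])
print(json.dumps(policy, indent=2))
```
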
S3 Inventory

  • Generates a CSV or ORC file listing all objects in an S3 bucket that match the filter criteria.
  • Can drive business workflows and applications such as secondary indexes, garbage collection, data auditing, and offline analytics.


  • Save time
  • Daily or Weekly delivery
  • Delivery notification
  • Delivery to S3 bucket
  • Same set of metadata as the LIST API
  • Can add size, last modified date, storage class, ETag, or replication status
  • Object-level Encryption Status
  • Encrypt Inventory with SSE-S3 or SSE-KMS
  • CSV or ORC output format
  • Query with Athena, Redshift Spectrum or any Hive tools
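
As one example of the data auditing use case, a CSV inventory report can be scanned for objects that are not encrypted. This is a rough sketch under assumptions: the column list below is hypothetical (the real order depends on the fields chosen in the inventory configuration and is recorded in the report's manifest), and the sample rows are made up.

```python
import csv
import io

# Assumption: this column order is illustrative only. The actual order
# follows the fields selected in the inventory configuration.
COLUMNS = ["bucket", "key", "size", "last_modified_date", "storage_class", "encryption_status"]


def unencrypted_keys(csv_text: str) -> list:
    """Return the keys of all rows whose encryption status is NOT-SSE."""
    reader = csv.DictReader(io.StringIO(csv_text), fieldnames=COLUMNS)
    return [row["key"] for row in reader if row["encryption_status"] == "NOT-SSE"]


# Made-up sample rows for illustration only.
sample = (
    '"my-bucket","logs/a.gz","1024","2017-11-01T00:00:00.000Z","STANDARD","SSE-S3"\n'
    '"my-bucket","logs/b.gz","2048","2017-11-02T00:00:00.000Z","STANDARD","NOT-SSE"\n'
)
print(unencrypted_keys(sample))
```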

S3 Inventory can be queried with Amazon Athena:

CREATE EXTERNAL TABLE my_inventory_table(
    `bucket` string,
    `key` string,
    `version_id` string,
    `is_latest` boolean,
    `is_delete_marker` boolean,
    `size` bigint,
    `last_modified_date` timestamp,
    `e_tag` string,
    `storage_class` string,
    `is_multipart_uploaded` boolean,
    `replication_status` string,
    `encryption_status` string
)
PARTITIONED BY (dt string)
-- The ROW FORMAT SERDE / STORED AS clauses matching the inventory output format (CSV or ORC) go here.
LOCATION 's3://bucketname/inventory/output_destination/hive';

Storage Class Analysis

  • Data-driven storage management for S3
  • Daily Storage Class Analysis
  • Export Analysis data to your S3 Bucket
  • Filter by Bucket, Prefix, or Object Tags


  1. Monitors access patterns to understand your storage usage
  2. After 30 days, recommends when to move objects to Standard-Infrequent Access (Standard-IA)
  3. Export file includes a daily report of storage, retrieved bytes, and GETs by object age
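
To illustrate how the exported per-age data might be read, the sketch below picks the youngest age group whose retrieved-to-stored ratio falls under a threshold. This is a hypothetical heuristic, not the method Storage Class Analysis itself uses; the class name, the age-group labels, the 0.1 threshold, and the sample numbers are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class AgeGroupStats:
    age_group: str        # illustrative label, e.g. "030-044" days
    storage_bytes: int    # bytes stored in this age group
    retrieved_bytes: int  # bytes retrieved from this age group


def first_infrequent_group(stats, max_ratio=0.1):
    """Return the first age group (in the given order) whose
    retrieved/stored ratio is at or below max_ratio, or None."""
    for s in stats:
        if s.storage_bytes > 0 and s.retrieved_bytes / s.storage_bytes <= max_ratio:
            return s.age_group
    return None


# Made-up numbers, ordered from youngest to oldest objects.
report = [
    AgeGroupStats("000-014", 1000, 900),
    AgeGroupStats("015-029", 1000, 400),
    AgeGroupStats("030-044", 1000, 50),
    AgeGroupStats("045-059", 1000, 10),
]
print(first_infrequent_group(report))
```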

Object-Level Logging

  • Enables CloudTrail logging of read and write events on objects (S3 data events)

Cross-Region Replication (CRR)

Use cases:

  • Compliance
  • Lower latency
  • Security


  • Ownership overwrite for cross-account CRR
  • Support SSE-KMS Encrypted objects
  • Choose any S3 Storage Class as target
  • Choose any AWS region as target
  • Bi-directional replication
  • Lifecycle Policy
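
Several of the options above (ownership overwrite, SSE-KMS support, target storage class) map onto fields of a replication configuration. The sketch below only builds the configuration dictionary in the shape accepted by boto3's `put_bucket_replication`; the function name, rule ID, and all ARNs and account IDs are placeholders, not values from the talk.

```python
def replication_config(role_arn, dest_bucket_arn, dest_account_id,
                       replica_kms_key_arn, storage_class="STANDARD_IA"):
    """Build a single-rule cross-account replication configuration."""
    return {
        "Role": role_arn,  # IAM role S3 assumes to replicate objects
        "Rules": [
            {
                "ID": "replicate-all",  # hypothetical rule name
                "Prefix": "",           # empty prefix: all objects
                "Status": "Enabled",
                # Opt in to replicating SSE-KMS encrypted objects.
                "SourceSelectionCriteria": {
                    "SseKmsEncryptedObjects": {"Status": "Enabled"}
                },
                "Destination": {
                    "Bucket": dest_bucket_arn,
                    "Account": dest_account_id,
                    "StorageClass": storage_class,
                    # Ownership overwrite: replicas owned by the destination account.
                    "AccessControlTranslation": {"Owner": "Destination"},
                    "EncryptionConfiguration": {"ReplicaKmsKeyID": replica_kms_key_arn},
                },
            }
        ],
    }


# Placeholder identifiers for illustration only.
config = replication_config(
    "arn:aws:iam::111111111111:role/example-crr-role",
    "arn:aws:s3:::example-destination-bucket",
    "222222222222",
    "arn:aws:kms:us-west-2:222222222222:key/example-key-id",
)
```

The resulting dictionary would be passed as `ReplicationConfiguration` to `put_bucket_replication` on a bucket with versioning enabled.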

Automate with Trigger-Based Workflows: Amazon S3 Event Notifications

  • Notifications when objects are created (Put, Post, Copy, Multipart Upload) or deleted
  • Filter on prefixes and suffixes
  • Trigger workflows with Amazon SNS, Amazon SQS, and AWS Lambda functions
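
A Lambda function on the receiving end of such a notification typically starts by pulling the bucket and key out of each record. The handler below is a minimal sketch; the sample event is made up and trimmed to the fields the code reads. Object keys arrive URL-encoded in the event, hence the `unquote_plus` call.

```python
from urllib.parse import unquote_plus


def handler(event, context):
    """Collect (event name, bucket, decoded key) for each S3 record."""
    results = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        results.append(
            (
                record["eventName"],
                s3["bucket"]["name"],
                unquote_plus(s3["object"]["key"]),  # keys are URL-encoded
            )
        )
    return results


# Made-up, trimmed-down sample event for illustration only.
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "my-bucket"},
                "object": {"key": "logs/2017-06/file+one.gz"},
            },
        }
    ]
}
print(handler(sample_event, None))
```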

Default Encryption

  • Automatically encrypts all objects written to your Amazon S3 bucket
  • Choose SSE-S3 or SSE-KMS
  • Makes it easy to satisfy compliance needs
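
The SSE-S3 / SSE-KMS choice corresponds to a small configuration document. The sketch below builds it in the shape accepted by boto3's `put_bucket_encryption`; the function name and the key ARN are placeholders.

```python
def default_encryption_config(kms_key_id=None):
    """Return a default-encryption configuration: SSE-S3 when no KMS key
    is given, SSE-KMS with the given key otherwise."""
    if kms_key_id is None:
        rule = {"SSEAlgorithm": "AES256"}  # SSE-S3
    else:
        rule = {"SSEAlgorithm": "aws:kms", "KMSMasterKeyID": kms_key_id}  # SSE-KMS
    return {"Rules": [{"ApplyServerSideEncryptionByDefault": rule}]}


sse_s3 = default_encryption_config()
# Placeholder key ARN for illustration only.
sse_kms = default_encryption_config("arn:aws:kms:us-east-1:111111111111:key/example-key-id")
```

Either dictionary would be passed as `ServerSideEncryptionConfiguration` to `put_bucket_encryption`.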

Amazon Macie

  • Security service that uses machine learning to automatically discover, classify and protect sensitive data in AWS
  • Recognizes sensitive data
  • Continuously monitors data access
  • Provides dashboards and alerts

Alert Logic Use Case on Amazon S3

S3 Object Management

  • S3 Object Keys use hash prefix for performance: logmsgs-001:/X-OGA/11543.2016-03/...
  • S3 Objects written with two Tags
    • Customer identifier (cid=1234567890)
    • Date (date=2017-06)
  • AWS KMS used to generate data encryption keys
    • Customer Master Key (CMK) for each data type with automatic rotation enabled
    • Data Keys generated per-customer/per-month

Tags with Lifecycle Expiration Policies

  • Per Customer Expiration Rule
  • Uses cid and date tags as filter
  • Independent of object creation time; depends entirely on the tag values

Tags with Lifecycle Transition Policies

  • One Transition Rule per month
  • Uses date tag as filter
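
The two tag-filtered rule types described above could be expressed roughly as below, in the shape accepted by boto3's `put_bucket_lifecycle_configuration`. This is a sketch under assumptions: the notes do not say whether the rules use fixed dates or day counts, so date-based actions are assumed here to reflect independence from object creation time. The rule IDs, dates, and the GLACIER target are illustrative.

```python
from datetime import datetime, timezone


def expiration_rule(cid: str, month: str, expire_on: datetime) -> dict:
    """Per-customer rule: expire objects tagged with both cid and date."""
    return {
        "ID": f"expire-{cid}-{month}",  # hypothetical naming scheme
        "Status": "Enabled",
        "Filter": {"And": {"Tags": [
            {"Key": "cid", "Value": cid},
            {"Key": "date", "Value": month},
        ]}},
        "Expiration": {"Date": expire_on},
    }


def transition_rule(month: str, transition_on: datetime,
                    storage_class: str = "GLACIER") -> dict:
    """Per-month rule: transition all objects tagged with the given date."""
    return {
        "ID": f"transition-{month}",  # hypothetical naming scheme
        "Status": "Enabled",
        "Filter": {"Tag": {"Key": "date", "Value": month}},
        "Transitions": [{"Date": transition_on, "StorageClass": storage_class}],
    }


# Dates are made up for illustration; lifecycle dates must be midnight UTC.
lifecycle = {
    "Rules": [
        expiration_rule("1234567890", "2017-06", datetime(2018, 7, 1, tzinfo=timezone.utc)),
        transition_rule("2017-06", datetime(2017, 9, 1, tzinfo=timezone.utc)),
    ]
}
```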

Demonstrate Scale of Storage Solution (AWS re:Invent 2017)

  • Scaled workload 100x successfully
    • 140PB/month of customer data
    • 30k writes/second sustained
    • Write latency 200ms at 95th percentile
    • Read latency 125ms at 95th percentile
  • Limited only by resources driving traffic