S3

Notes on Amazon Simple Storage Service (S3).

Overview

  • S3 is one of the main building blocks of AWS
  • Advertised as “infinitely scaling” storage
  • Many AWS services integrate with S3

Use Cases

  • Backup and Storage
  • Disaster Recovery
  • Archive
  • Hybrid Cloud Storage
  • Application Hosting
  • Media Hosting
  • Data Lakes & Big Data Analytics
  • Software delivery
  • Static Website

Bucket

  • S3 stores objects (files) in “buckets” (directories)
  • Buckets must have a globally unique name (across all regions, all accounts)
  • Buckets are defined at the region level
  • S3 looks like a global service but buckets are created in a region

Naming Convention

  • No uppercase letters, no underscores
  • 3-63 characters long
  • Must not be formatted as an IP address
  • Must start with a lowercase letter or a number
  • Must not start with the xn-- prefix
  • Must not end with the -s3alias suffix
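
The rules above can be checked mechanically. The following is a minimal sketch, not the full AWS rule set: the function name `is_valid_bucket_name` is hypothetical, and the allowed character set (lowercase letters, digits, dots, hyphens) is an assumption that goes slightly beyond what these notes list.

```python
import ipaddress
import re

# Assumption: only lowercase letters, digits, dots and hyphens are allowed,
# and the first character must be a lowercase letter or a digit.
_ALLOWED = re.compile(r"[a-z0-9][a-z0-9.-]*")


def is_valid_bucket_name(name: str) -> bool:
    """Check a name against the rules listed in these notes (hypothetical helper)."""
    if not 3 <= len(name) <= 63:
        return False
    if not _ALLOWED.fullmatch(name):
        return False  # uppercase, underscore, or bad first character
    if name.startswith("xn--") or name.endswith("-s3alias"):
        return False
    try:
        ipaddress.IPv4Address(name)
    except ValueError:
        return True  # not formatted as an IP address
    return False
```

For example, `is_valid_bucket_name("my-bucket")` passes, while `"My_Bucket"`, `"192.168.5.4"` and `"xn--bucket"` are rejected.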

Objects

  • Objects have a Key (full path)
    • s3://my-bucket/my_file.txt
    • s3://my-bucket/my_folder1/another_folder/my_file.txt
  • Key = prefix + object name
  • No concept of “directories” (just keys with slashes)
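
Since a key is just a string, the "Key = prefix + object name" split can be illustrated with plain string handling. This is a small sketch; `split_key` is a hypothetical helper, and the key is assumed to be given without the `s3://my-bucket/` part.

```python
def split_key(key: str) -> tuple[str, str]:
    """Split an object key into (prefix, object name) at the last slash."""
    prefix, slash, name = key.rpartition("/")
    return prefix + slash, name


# The "folders" are only part of the key string.
print(split_key("my_folder1/another_folder/my_file.txt"))
print(split_key("my_file.txt"))
```

A key without any slash simply has an empty prefix.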

Content

  • The object value is the content of the body
  • Max Object Size: 5 TB (5000 GB)
  • Objects larger than 5 GB must be uploaded with multipart upload
  • Metadata (list of text key/value pairs)
  • Tags (Unicode key/value pair, up to 10)
  • Version ID (if versioning is enabled)
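
To make the 5 GB multipart threshold concrete, here is a rough sketch that decides whether an upload needs multipart and how many parts it would take. The 5 GB limit is treated as 5 GiB, and the limits of 10,000 parts and a 5 MiB minimum part size are assumptions taken from the S3 documentation rather than from these notes. `plan_upload` and its default part size are hypothetical.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

SINGLE_PUT_LIMIT = 5 * GIB  # above this size, multipart upload is required
MAX_PARTS = 10_000          # assumption: S3 multipart part-count limit
MIN_PART_SIZE = 5 * MIB     # assumption: S3 minimum part size


def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up."""
    return -(-a // b)


def plan_upload(size_bytes: int, part_size: int = 100 * MIB) -> tuple[bool, int]:
    """Return (needs_multipart, part_count) for an object of size_bytes."""
    if size_bytes <= SINGLE_PUT_LIMIT:
        return False, 1
    # Grow the part size if needed so the upload fits within MAX_PARTS.
    part_size = max(part_size, MIN_PART_SIZE, ceil_div(size_bytes, MAX_PARTS))
    return True, ceil_div(size_bytes, part_size)
```

With the default part size, a 10 GiB object would be split into 103 parts, while a 1 GiB object fits in a single upload.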

Security

Bucket Policy

User-Based:

  • IAM Policies: which API calls are allowed for a specific user

Resource-Based:

  • Bucket Policies: bucket-wide rules
  • Object Access Control List (ACL) - finer-grained
  • Bucket Access Control List (ACL) - less common

An IAM principal can access an S3 object if:

  • The user's IAM permissions ALLOW it OR the resource policy ALLOWS it
  • AND there’s no explicit DENY
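
The access rule above reduces to a small boolean expression. The sketch below is a deliberately simplified model: `can_access` is a hypothetical function, and it ignores cross-account access and other policy types.

```python
def can_access(iam_allows: bool, resource_policy_allows: bool, explicit_deny: bool) -> bool:
    """Simplified model: (IAM allow OR resource policy allow) AND no explicit deny."""
    return (iam_allows or resource_policy_allows) and not explicit_deny


# A public-read bucket policy grants access even without an IAM allow...
print(can_access(iam_allows=False, resource_policy_allows=True, explicit_deny=False))
# ...but an explicit deny always wins.
print(can_access(iam_allows=True, resource_policy_allows=True, explicit_deny=True))
```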

Policy Structure

{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "PublicRead",
    "Effect": "Allow",
    "Principal": "*",
    "Action": ["s3:GetObject"],
    "Resource": ["arn:aws:s3:::examplebucket/*"]
  }]
}
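
As a sketch of how the same policy document could be generated for an arbitrary bucket, the snippet below builds it as a Python dict and serializes it with the standard `json` module. The function name `public_read_policy` is hypothetical. Nothing here calls AWS; it only produces the JSON text.

```python
import json


def public_read_policy(bucket: str) -> dict:
    """Build the public-read bucket policy shown above for the given bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "PublicRead",
            "Effect": "Allow",
            "Principal": "*",
            "Action": ["s3:GetObject"],
            # The trailing /* targets the objects in the bucket, not the bucket itself.
            "Resource": [f"arn:aws:s3:::{bucket}/*"],
        }],
    }


print(json.dumps(public_read_policy("examplebucket"), indent=2))
```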

Block Public Access Settings

  • Block all public access: ON (recommended)
  • Created to prevent company data leaks
  • If bucket should never be public, leave these on
  • Can be set at the account level

Static Website Hosting

  • S3 can host static websites
  • URL format: http://<bucket-name>.s3-website-<aws-region>.amazonaws.com (some regions use s3-website.<aws-region> with a dot instead)
  • If you get a 403 Forbidden error, make sure the bucket policy allows public reads
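
The URL format above can be assembled with a one-line helper. This is only a sketch: `website_endpoint` is a hypothetical name, and the `dash_style` flag reflects the assumption that some regions use a dot rather than a dash after `s3-website`.

```python
def website_endpoint(bucket: str, region: str, dash_style: bool = True) -> str:
    """Build the static website URL for a bucket in the given region."""
    sep = "-" if dash_style else "."  # assumption: region-dependent separator
    return f"http://{bucket}.s3-website{sep}{region}.amazonaws.com"


print(website_endpoint("my-bucket", "us-east-1"))
```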

Versioning

  • Enabled at the bucket level
  • Overwriting the same key creates a new “version”: 1, 2, 3…
  • Best practice to version your buckets:
    • Protect against unintended deletes
    • Easy rollback to previous version
  • Objects uploaded before versioning was enabled have version “null”
  • Suspending versioning doesn’t delete previous versions
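
The overwrite and rollback behavior can be pictured with a toy in-memory model. This is an illustration only, not how S3 is implemented: `VersionedBucket` is a hypothetical class, and it numbers versions 1, 2, 3 as in these notes, whereas real S3 version IDs are opaque strings.

```python
class VersionedBucket:
    """Toy model of a versioned bucket: each key keeps its full history."""

    def __init__(self):
        self._versions = {}  # key -> list of bodies, oldest first

    def put(self, key, body):
        """Store a new version and return its version number (starting at 1)."""
        history = self._versions.setdefault(key, [])
        history.append(body)
        return len(history)

    def get(self, key, version=None):
        """Return the latest body, or a specific version if one is given."""
        history = self._versions[key]
        return history[-1] if version is None else history[version - 1]

    def rollback(self, key):
        """Drop the newest version so the previous one becomes current."""
        history = self._versions[key]
        if len(history) > 1:
            history.pop()
        return history[-1]


bucket = VersionedBucket()
bucket.put("report.txt", "first draft")
bucket.put("report.txt", "accidental overwrite")
print(bucket.rollback("report.txt"))
```

After the rollback, reading `report.txt` returns the first draft again.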

Replication (CRR & SRR)

Prerequisite: versioning must be enabled on both the source and destination buckets

  • Cross-Region Replication (CRR): compliance, lower latency, cross-account
  • Same-Region Replication (SRR): log aggregation, live replication between prod/test

Storage Classes

| Class | Description |
|---|---|
| Standard - General Purpose | 99.99% availability, frequently accessed data |
| Standard-IA | Infrequent access, 99.9% availability |
| One Zone-IA | Single AZ, 99.5% availability |
| Glacier Instant Retrieval | Millisecond retrieval, 90 days minimum |
| Glacier Flexible Retrieval | Expedited (1-5 min), Standard (3-5 hrs), Bulk (5-12 hrs) |
| Glacier Deep Archive | Standard (12 hrs), Bulk (48 hrs) |
| Intelligent Tiering | Auto-moves objects between tiers based on usage |

Transfer Acceleration

  • Speeds up content transfer to/from S3
  • Can increase transfer speeds by 50-500% for long-distance transfers
  • Uses CloudFront Edge Locations and AWS backbone network
  • Pay only for accelerated transfers

Summary

| Feature | Description |
|---|---|
| Buckets vs Objects | Global unique name, tied to region |
| S3 Security | IAM policy, Bucket Policy, Encryption |
| S3 Websites | Host static websites |
| S3 Versioning | Multiple versions, prevent accidental deletes |
| S3 Replication | Same-region or cross-region, requires versioning |
| S3 Storage Classes | Standard, IA, Glacier variants, Intelligent Tiering |
| Snow Family | Import data via physical device |
| Storage Gateway | Hybrid solution to extend on-premises storage |