S3
Notes on Amazon Simple Storage Service (S3).
Overview
- S3 is one of the main building blocks of AWS
- Advertised as “infinitely scaling” storage
- Many AWS services integrate with S3
Use Cases
- Backup and Storage
- Disaster Recovery
- Archive
- Hybrid Cloud Storage
- Application Hosting
- Media Hosting
- Data Lakes & Big Data Analytics
- Software delivery
- Static Website
Bucket
- S3 stores objects (files) in “buckets” (directories)
- Buckets must have a globally unique name (across all regions, all accounts)
- Buckets are defined at the region level
- S3 looks like a global service but buckets are created in a region
Naming Convention
- No uppercase, no underscore
- 3-63 characters long
- Not an IP
- Must start with lowercase letter or number
- Must not start with the prefix xn--
- Must not end with the suffix -s3alias
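The naming rules above can be checked mechanically. The following is a minimal sketch, not an official validator: the function name `is_valid_bucket_name` is hypothetical, and the allowed character set (lowercase letters, digits, hyphens, dots) is an assumption taken from the general AWS naming rules rather than from these notes.

```python
import ipaddress
import re

# Assumption: names consist of lowercase letters, digits, hyphens and dots,
# and start with a lowercase letter or digit.
_ALLOWED = re.compile(r"[a-z0-9][a-z0-9.-]*")


def is_valid_bucket_name(name: str) -> bool:
    """Return True if `name` passes the rules listed in these notes."""
    if not 3 <= len(name) <= 63:
        return False
    if not _ALLOWED.fullmatch(name):  # rejects uppercase and underscores
        return False
    if name.startswith("xn--") or name.endswith("-s3alias"):
        return False
    try:
        ipaddress.IPv4Address(name)  # names formatted like an IP are rejected
    except ValueError:
        return True
    return False
```

For example, `is_valid_bucket_name("my-bucket-01")` passes, while `"My_Bucket"` and `"192.168.5.4"` are rejected.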
Objects
- Objects have a Key (full path)
  - s3://my-bucket/my_file.txt
  - s3://my-bucket/my_folder1/another_folder/my_file.txt
- Key = prefix + object name
- No concept of “directories” (just keys with slashes)
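To illustrate "Key = prefix + object name", here is a small sketch that splits an S3 URI into bucket, prefix, and object name. The helper `split_s3_uri` is hypothetical and does plain string handling only; it does not call AWS.

```python
def split_s3_uri(uri: str) -> tuple[str, str, str]:
    """Split 's3://bucket/prefix/name' into (bucket, prefix, object_name)."""
    scheme = "s3://"
    if not uri.startswith(scheme):
        raise ValueError(f"not an S3 URI: {uri!r}")
    bucket, _, key = uri[len(scheme):].partition("/")
    # The prefix is everything up to and including the last slash in the key.
    prefix, slash, object_name = key.rpartition("/")
    return bucket, prefix + slash, object_name


bucket, prefix, name = split_s3_uri(
    "s3://my-bucket/my_folder1/another_folder/my_file.txt"
)
# prefix + name reproduces the full key; the "folders" are only part of the key.
full_key = prefix + name
```

An object at the top level of the bucket simply has an empty prefix.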
Content
- Object values are the content of the body
- Max Object Size: 5 TB (5000 GB)
- Uploads larger than 5 GB must use multipart upload
- Metadata (list of text key/value pairs)
- Tags (Unicode key/value pair, up to 10)
- Version ID (if versioning is enabled)
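The size limits above translate into a simple decision when uploading. This is a rough sketch under stated assumptions: the function `upload_strategy` is hypothetical, and the limits are interpreted in binary units (GiB/TiB), which may differ slightly from the rounded figures in these notes.

```python
GIB = 1024 ** 3
TIB = 1024 ** 4

# Assumption: limits from the notes, interpreted in binary units.
MAX_SINGLE_PUT = 5 * GIB
MAX_OBJECT_SIZE = 5 * TIB


def upload_strategy(size_bytes: int) -> str:
    """Return 'single-put' or 'multipart' for an object of the given size."""
    if size_bytes < 0:
        raise ValueError("size must be non-negative")
    if size_bytes > MAX_OBJECT_SIZE:
        raise ValueError("object exceeds the maximum object size")
    return "multipart" if size_bytes > MAX_SINGLE_PUT else "single-put"
```

In practice, multipart upload is often used well below the mandatory threshold, since parts can be uploaded in parallel and retried individually.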
Security
Bucket Policy
User-Based:
- IAM Policies: which API calls are allowed for a specific user
Resource-Based:
- Bucket Policies: bucket-wide rules
- Object Access Control List (ACL) - finer-grained
- Bucket Access Control List (ACL) - less common
An IAM principal can access an S3 object if:
- The user’s IAM permissions ALLOW it OR the resource policy ALLOWS it
- AND there’s no explicit DENY
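The access rule above is a boolean expression. The sketch below models only the simplified rule stated in these notes; the function `can_access` and its parameters are hypothetical, and real policy evaluation involves more inputs than three flags.

```python
def can_access(
    iam_allows: bool,
    resource_policy_allows: bool,
    explicit_deny: bool,
) -> bool:
    """Simplified model: (IAM allow OR resource allow) AND no explicit deny."""
    return (iam_allows or resource_policy_allows) and not explicit_deny


# An allow from either side is enough, but an explicit deny always wins.
allowed_by_bucket_policy = can_access(False, True, False)
denied_despite_allow = can_access(True, True, True)
```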
Policy Structure
{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "PublicRead",
    "Effect": "Allow",
    "Principal": "*",
    "Action": ["s3:GetObject"],
    "Resource": ["arn:aws:s3:::examplebucket/*"]
  }]
}
Block Public Access Settings
- Block all public access: ON (recommended)
- Created to prevent company data leaks
- If a bucket should never be public, leave these settings on
- Can be set at the account level
Static Website Hosting
- S3 can host static websites
- URL format:
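  - http://<bucket-name>.s3-website-<aws-region>.amazonaws.com
  - http://<bucket-name>.s3-website.<aws-region>.amazonaws.com (some Regions use a dot instead of a dash)
- If you get a 403 Forbidden error, ensure the bucket policy allows public reads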
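The website URL can be assembled from the bucket name and Region. This is a minimal sketch: `website_endpoint` is a hypothetical helper, and the `separator` parameter is an assumption added because some Regions use a dot rather than a dash before the Region name.

```python
def website_endpoint(bucket: str, region: str, separator: str = "-") -> str:
    """Build the static website URL for a bucket in a given Region."""
    if separator not in ("-", "."):
        raise ValueError("separator must be '-' or '.'")
    return f"http://{bucket}.s3-website{separator}{region}.amazonaws.com"


url = website_endpoint("my-bucket", "us-east-1")
```

Which separator applies depends on the Region, so it is worth checking the endpoint shown in the bucket's static website hosting settings.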
Versioning
- Enabled at the bucket level
- Overwriting the same key creates a new “version”: 1, 2, 3…
- Best practice to version your buckets:
  - Protect against unintended deletes
  - Easy rollback to previous version
- Files not versioned prior to enabling versioning will have version “null”
- Suspending versioning doesn’t delete previous versions
Replication (CRR & SRR)
Prerequisites: Must enable versioning in source and destination buckets
- Cross-Region Replication (CRR): compliance, lower latency, cross-account
- Same-Region Replication (SRR): log aggregation, live replication between prod/test
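The prerequisite and the CRR/SRR distinction can be expressed as a short check. The function `replication_kind` below is a hypothetical helper for illustration; it only compares inputs and does not configure replication.

```python
def replication_kind(
    source_region: str,
    destination_region: str,
    source_versioned: bool,
    destination_versioned: bool,
) -> str:
    """Return 'SRR' or 'CRR', enforcing the versioning prerequisite."""
    if not (source_versioned and destination_versioned):
        raise ValueError("versioning must be enabled on both buckets")
    return "SRR" if source_region == destination_region else "CRR"


kind = replication_kind("us-east-1", "eu-west-1", True, True)
```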
Storage Classes
| Class | Description |
|---|---|
| Standard - General Purpose | 99.99% availability, frequently accessed data |
| Standard-IA | Infrequent access, 99.9% availability |
| One Zone-IA | Single AZ, 99.5% availability |
| Glacier Instant Retrieval | Millisecond retrieval, 90-day minimum storage duration |
| Glacier Flexible Retrieval | Expedited (1-5 min), Standard (3-5 hrs), Bulk (5-12 hrs) |
| Glacier Deep Archive | Standard (12 hrs), Bulk (48 hrs) |
| Intelligent Tiering | Auto-moves objects between tiers based on usage |
Transfer Acceleration
- Speeds up content transfer to/from S3
- Can increase transfer speeds by 50-500% for long-distance transfers
- Uses CloudFront Edge Locations and AWS backbone network
- Pay only for accelerated transfers
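Clients opt in to Transfer Acceleration by sending requests to a dedicated endpoint instead of the regional one. The sketch below builds that endpoint; `accelerate_endpoint` is a hypothetical helper, and it assumes acceleration has already been enabled on the bucket.

```python
def accelerate_endpoint(bucket: str) -> str:
    """Build the Transfer Acceleration endpoint for a bucket."""
    if "." in bucket:
        # Bucket names containing periods cannot use Transfer Acceleration.
        raise ValueError("bucket names with periods are not supported")
    return f"https://{bucket}.s3-accelerate.amazonaws.com"


endpoint = accelerate_endpoint("my-bucket")
```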
Summary
| Feature | Description |
|---|---|
| Buckets vs Objects | Globally unique bucket name, tied to a region |
| S3 Security | IAM policy, Bucket Policy, Encryption |
| S3 Websites | Host static websites |
| S3 Versioning | Multiple versions, prevent accidental deletes |
| S3 Replication | Same-region or cross-region, requires versioning |
| S3 Storage Classes | Standard, IA, Glacier variants, Intelligent Tiering |
| Snow Family | Import data via physical device |
| Storage Gateway | Hybrid solution to extend on-premises storage |