Question 1

How does Amazon S3 achieve 11-nine durability (99.999999999%)?

Accepted Answer

S3 is designed for 99.999999999% durability by storing each object redundantly across at least 3 physically separated Availability Zones (AZs) within a region (the One Zone storage classes are the exception). When you PUT an object: (1) S3 receives the object data, (2) synchronously writes it to storage nodes in 2+ additional AZs before acknowledging success, (3) returns 200 OK only after the redundant copies are confirmed persisted to durable storage.

Rather than keeping only full copies, S3 also uses erasure coding. AWS does not publish the exact parameters, so treat the following as an illustrative scheme: split the object into k=10 data shards and m=4 parity shards (14 total). Any 10 of the 14 shards can reconstruct the original, so you can lose any 4 shards without losing data. This costs 1.4× storage overhead versus 3× for three-way replication, which tolerates only 2 lost copies.

Additional durability measures: checksums on all data at rest (MD5/SHA-256) and during transfer (Content-MD5 header) detect bit rot and transmission errors. Background integrity scanning periodically checks stored objects and automatically repairs corruption by rebuilding the damaged shard from the surviving shards.

The 11-nine figure corresponds to an expected annual loss rate of about 1 object in 100 billion: with 10 million objects stored, you expect to lose a single object roughly once every 10,000 years; with 1 trillion objects, about 10 objects per year. Practical comparison: a single consumer drive with an annual failure rate on the order of 1% gives only about 2 nines (99%) of annual durability.
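The arithmetic behind these figures can be checked with a short sketch. It assumes shards fail independently with the same probability `p` within one repair window, which is a simplification (real failures are correlated), and the function names are illustrative rather than anything S3 exposes. Three-way replication is modeled as the degenerate case k=1, m=2.

```python
from math import comb


def storage_overhead(k: int, m: int) -> float:
    """Raw bytes stored per byte of user data for a k+m erasure code."""
    return (k + m) / k


def prob_data_loss(k: int, m: int, p: float) -> float:
    """Probability that more than m of the k+m shards fail.

    Assumes each shard fails independently with probability p before
    repair completes (a simplifying assumption, not an AWS figure).
    """
    n = k + m
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(m + 1, n + 1))


def expected_annual_losses(num_objects: int, annual_loss_rate: float = 1e-11) -> float:
    """Expected objects lost per year; 11 nines means a 1e-11 loss rate."""
    return num_objects * annual_loss_rate


# Illustrative per-shard failure probability within one repair window.
p = 0.01
ec_loss = prob_data_loss(10, 4, p)          # 10+4 erasure coding
replica_loss = prob_data_loss(1, 2, p)      # 3 full copies: all 3 must fail
```

Under these assumptions the 10+4 scheme is both cheaper (1.4× versus 3× overhead) and less likely to lose data than three full copies, because five shards must fail at once instead of three copies.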

Question 2

What is a pre-signed URL and when should you use it?

Accepted Answer

A pre-signed URL is a time-limited URL that grants temporary access to a specific S3 object; whoever holds the URL does not need AWS credentials of their own. Your application server generates it with the AWS SDK from four inputs: bucket name, object key, HTTP method (GET or PUT), and expiration time in seconds (up to 7 days with Signature Version 4; a URL signed with temporary credentials also stops working when those credentials expire). The URL includes an AWS signature that is cryptographically tied to the specific object, method, and expiry.

Use cases: (1) Browser-to-S3 direct upload: the user selects a file in your web app; your server generates a pre-signed PUT URL (expiring in, say, 5 minutes) and returns it to the browser; the browser uploads directly to S3 using the URL. This avoids routing large files through your application server, which reduces server load, cost, and latency. (2) Temporary download access: generate a pre-signed GET URL for a protected object. If the URL is valid for only 1 hour, a user who shares it cannot grant permanent access. (3) Sharing private content: send a pre-signed URL in an email or webhook for a time-limited download. Note that the URL is not single-use; it can be fetched repeatedly until it expires.

Security considerations: a pre-signed URL does not bypass IAM. It carries the permissions of the credentials that signed it, and S3 authorizes the request as if the signer had made it. It is, however, a bearer token: anyone with the URL can perform that one operation on that one object until expiry. Use short expiry times (minutes for uploads, hours for downloads) and never embed pre-signed URLs in client-side code from which they could be extracted.
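In practice you would let the SDK do the signing (for example boto3's `generate_presigned_url`). To show what "cryptographically tied to the object and expiry" means, here is a standard-library sketch of Signature Version 4 query-string signing. The function name `presign_s3_url`, the virtual-hosted endpoint format, and the 5-minute default are assumptions made for illustration, and the sketch ignores session tokens and extra signed headers.

```python
import hashlib
import hmac
from datetime import datetime, timezone
from urllib.parse import quote


def _hmac_sha256(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def presign_s3_url(method, bucket, key, region, access_key, secret_key,
                   expires_in=300, now=None):
    """Build a SigV4 pre-signed URL for one object and one HTTP method."""
    if not 1 <= expires_in <= 604800:  # SigV4 caps expiry at 7 days
        raise ValueError("expires_in must be between 1 and 604800 seconds")
    now = now or datetime.now(timezone.utc)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    date_stamp = now.strftime("%Y%m%d")
    host = f"{bucket}.s3.{region}.amazonaws.com"  # assumed endpoint style
    scope = f"{date_stamp}/{region}/s3/aws4_request"

    canonical_uri = "/" + quote(key, safe="/-_.~")
    params = {
        "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
        "X-Amz-Credential": f"{access_key}/{scope}",
        "X-Amz-Date": amz_date,
        "X-Amz-Expires": str(expires_in),
        "X-Amz-SignedHeaders": "host",
    }
    # Query parameters are sorted by name and strictly percent-encoded.
    canonical_query = "&".join(
        f"{quote(k, safe='-_.~')}={quote(v, safe='-_.~')}"
        for k, v in sorted(params.items())
    )
    # Method, path, query (including expiry) and host are all signed, so
    # changing any of them invalidates the signature.
    canonical_request = "\n".join([
        method,
        canonical_uri,
        canonical_query,
        f"host:{host}\n",   # canonical headers block ends with a newline
        "host",             # signed header names
        "UNSIGNED-PAYLOAD",
    ])
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256",
        amz_date,
        scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
    ])

    # Derive the signing key: secret -> date -> region -> service -> terminator.
    k = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k = _hmac_sha256(k, region)
    k = _hmac_sha256(k, "s3")
    k = _hmac_sha256(k, "aws4_request")
    signature = hmac.new(k, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()

    return f"https://{host}{canonical_uri}?{canonical_query}&X-Amz-Signature={signature}"
```

A server would call something like `presign_s3_url("PUT", "my-bucket", "uploads/photo.jpg", "us-east-1", access_key, secret_key)` (bucket and key here are placeholders) and hand the result to the browser. Because the secret key never leaves the server, the client can use the URL but cannot mint new ones.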

Question 3

How do S3 storage classes reduce costs for infrequently accessed data?

Accepted Answer

S3 offers tiered storage classes with different cost/access trade-offs, letting you minimize cost by matching the storage class to your access pattern. Prices below are approximate us-east-1 list prices.

S3 Standard: ~$0.023/GB-month, millisecond first-byte latency. For data accessed frequently.
S3 Standard-IA (Infrequent Access): ~$0.0125/GB-month plus a retrieval fee of $0.01/GB, with a 30-day minimum storage duration and a 128 KB minimum billable object size. Cost-effective when data is read less than about once per month: the $0.0105/GB-month storage saving is cancelled out by roughly 1.05 full retrievals per month.
S3 Glacier Instant Retrieval: ~$0.004/GB-month, millisecond retrieval. For archives accessed about once a quarter.
S3 Glacier Flexible Retrieval: ~$0.0036/GB-month, 3-5 hour standard retrieval. For compliance archives where retrieval is rare.
S3 Glacier Deep Archive: ~$0.00099/GB-month, retrieval within 12 hours. For 7-year compliance retention at minimum cost, roughly 95% cheaper than Standard.

S3 Lifecycle policies automate transitions: create → Standard → Standard-IA (after 30 days) → Glacier (after 90 days) → expire (after 365 days). For application logs: you need instant access to the last 30 days (Standard), occasional access to days 30-90 (Standard-IA), rare access for the rest of the year (Glacier), then delete. Lifecycle rules cannot move objects to Standard-IA before they are 30 days old, so a shorter hot window cannot be expressed as a transition. A lifecycle policy handles all transitions automatically; for this schedule it cuts steady-state storage cost by roughly 70% versus keeping everything in Standard (before request and retrieval fees), and by more when retention is longer or Deep Archive is used.
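The break-even and savings figures can be worked out from the per-GB prices quoted above. The sketch below writes the log lifecycle as a Python dict in the shape accepted by boto3's `put_bucket_lifecycle_configuration`, then computes the blended price for data ingested at a constant rate. The rule ID and `logs/` prefix are hypothetical, and request, transition, and retrieval fees are deliberately ignored.

```python
# Approximate USD per GB-month, as quoted in the answer above.
PRICE = {"STANDARD": 0.023, "STANDARD_IA": 0.0125, "GLACIER": 0.0036}
IA_RETRIEVAL_FEE_PER_GB = 0.01


def ia_break_even_reads_per_month() -> float:
    """Full reads per month at which Standard-IA stops being cheaper."""
    return (PRICE["STANDARD"] - PRICE["STANDARD_IA"]) / IA_RETRIEVAL_FEE_PER_GB


# Lifecycle rule for application logs (ID and prefix are hypothetical).
LOG_LIFECYCLE = {
    "Rules": [{
        "ID": "app-logs-tiering",
        "Status": "Enabled",
        "Filter": {"Prefix": "logs/"},
        "Transitions": [
            {"Days": 30, "StorageClass": "STANDARD_IA"},
            {"Days": 90, "StorageClass": "GLACIER"},  # Glacier Flexible Retrieval
        ],
        "Expiration": {"Days": 365},
    }]
}


def blended_price_per_gb_month(rule: dict) -> float:
    """Average storage price over an object's lifetime under the rule.

    Assumes steady ingestion, so the share of data in each class equals
    the share of days an object spends there.
    """
    stages = [(0, "STANDARD")] + [
        (t["Days"], t["StorageClass"])
        for t in sorted(rule["Transitions"], key=lambda t: t["Days"])
    ]
    end = rule["Expiration"]["Days"]
    ends = [day for day, _ in stages[1:]] + [end]
    total = sum((stop - start) * PRICE[cls]
                for (start, cls), stop in zip(stages, ends))
    return total / end


blended = blended_price_per_gb_month(LOG_LIFECYCLE["Rules"][0])
savings = 1 - blended / PRICE["STANDARD"]
```

With these prices the break-even comes out at about 1.05 full reads per month, and the blended price is about $0.0067/GB-month, a saving of roughly 71% over Standard. Heavy retrieval from the colder tiers would reduce that.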

System Design Interview: Object Storage (Amazon S3)

What Is Object Storage?

Core API

Data Durability via Replication

Consistency Model

Bucket Internals: Data Placement

Storage Classes and Lifecycle Policies

Access Control

CDN Integration

Companies That Ask This