System Design: Design a Social Media Feed (Twitter/Instagram)
News feed design is one of the most comprehensive system design problems — it combines social graph traversal, content ranking, fan-out, caching, and real-time updates. Understanding the push vs. pull vs. hybrid approaches is key to a strong answer.
Requirements
Functional: user follows others, sees recent posts from followed users, ranked by recency or relevance, infinite scroll, support for photos/video/text.
Non-functional: serve feeds in <200ms, 500M DAU, 500K posts per second at peak, eventual consistency acceptable, different SLA for celebrities vs. regular users.
Two Core Approaches
Fan-Out on Write (Push Model)
When user A posts, immediately write to the feed list of every follower. Feed reads are O(1) — just read user’s pre-built feed list from cache.
- Pros: instant feed reads, simple read path
- Cons: write amplification — a celebrity with 50M followers generates 50M writes per post. Unusable for high-follower accounts.
Fan-Out on Read (Pull Model)
When user opens feed, query recent posts from all followed users and merge-sort them. No write amplification.
- Pros: no write amplification, storage-efficient
- Cons: high read latency — user follows 500 people, must query 500 users’ post lists and merge. Unusable for high fan-out readers.
Hybrid (Twitter/Instagram approach)
Fan-out on write for regular users (≤ 1000 followers). Pull on read for celebrities (> 1000 followers). On feed read, merge pre-built feed with real-time pull from followed celebrities.
def get_feed(user_id, page):
# 1. Get pre-built feed from Redis (fan-out on write from regular users)
feed = redis.lrange(f"feed:{user_id}", 0, 99)
# 2. Get followed celebrities
celebrities = get_followed_celebrities(user_id) # followers > threshold
# 3. Pull recent posts from celebrities
celeb_posts = []
for celeb_id in celebrities:
posts = get_recent_posts(celeb_id, limit=20)
celeb_posts.extend(posts)
# 4. Merge and rank
all_posts = feed + celeb_posts
return sorted(all_posts, key=lambda p: p.timestamp, reverse=True)[:20]
Data Model
-- Users and social graph
users(id, username, follower_count, following_count, ...)
follows(follower_id, followee_id, created_at)
-- Posts
posts(id, user_id, content, media_url, created_at, like_count, comment_count)
-- Feed cache (Redis)
feed:{user_id} → sorted list of post IDs (newest first)
post:{post_id} → cached post object (hash)
user:{user_id} → cached user object (hash)
Fan-Out Implementation
# When user posts
def on_new_post(post_id, author_id):
post = get_post(post_id)
# Cache the post
redis.hmset(f"post:{post_id}", post.__dict__)
redis.expire(f"post:{post_id}", 86400)
if get_follower_count(author_id) > CELEB_THRESHOLD:
return # celebrities: pull-based, skip fan-out
# Fan-out to followers (async via Kafka)
followers = get_followers(author_id) # paginated
for follower_id in followers:
kafka.publish("fanout", {
"post_id": post_id,
"user_id": follower_id
})
# Fanout worker
def process_fanout(post_id, user_id):
redis.lpush(f"feed:{user_id}", post_id)
redis.ltrim(f"feed:{user_id}", 0, 999) # keep latest 1000
Feed Ranking
Pure recency is simple but engagement-based ranking increases time-on-platform:
- Recency: sort by post timestamp. Predictable, transparent to users.
- Engagement: score = f(recency, likes, comments, shares, author relationship). Facebook/Instagram use ML models trained on click-through rate.
- Interest graph: boost posts from users you interact with frequently (reply, like) vs. just follow.
Scale Numbers
- Twitter: ~500M tweets/day, ~200M DAU, Elon Musk (~100M followers) generates 100M fan-out writes per tweet → pull-based for celebrities
- Instagram: 100M photos/day, ~2B MAU
- Facebook: 100B feed impressions/day
Interview Checklist
- Start with the write path: post → fan-out workers → Redis feed lists
- Explain the celebrity problem and the hybrid push+pull solution
- Draw the read path: Redis feed + real-time celebrity posts → merge → rank
- Discuss feed ranking: recency vs. engagement scoring
- Address storage: Redis for feeds (1000 posts max), CDN for media