System Design: News Feed (Facebook / Instagram / Twitter)
A social media news feed aggregates posts from a user’s network and ranks them for relevance. This is one of the most common system design interview problems — it tests your knowledge of fanout, ranking, caching, and the tradeoffs between write-time and read-time computation.
Requirements
Functional: Users follow other users. When a user creates a post, it appears in followers’ feeds. Feeds are ranked (not purely chronological). Support pagination. Handle celebrities (users with millions of followers).
Non-functional: 500M DAU, feed loads in <500ms, 1 billion feed reads per day, eventual consistency acceptable.
Core Models
User: user_id, username, follower_count
Post: post_id, author_id, content, media_url, created_at, like_count
Follow: follower_id, followee_id, created_at
FeedItem: user_id, post_id, score, created_at (pre-built feed entries)
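A minimal sketch of these models as Python dataclasses. The field types and defaults are assumptions for illustration; production schemas vary by datastore:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class User:
    user_id: int
    username: str
    follower_count: int = 0

@dataclass
class Post:
    post_id: int
    author_id: int
    content: str
    created_at: datetime
    media_url: Optional[str] = None  # CDN URL; media itself lives in object storage
    like_count: int = 0

@dataclass
class Follow:
    follower_id: int
    followee_id: int
    created_at: datetime

@dataclass
class FeedItem:
    # A pre-built feed entry: one row per (user, post) produced by fanout-on-write.
    user_id: int
    post_id: int
    score: float
    created_at: datetime
```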
Fanout Strategies
When a user posts, how do followers see it in their feed? Two approaches:
Fanout on Write (Push Model)
On post creation: asynchronously write a FeedItem row to every follower’s feed cache in Redis. When the user opens their feed, it reads directly from their pre-built feed — O(1) lookup.
def on_post_created(post_id, author_id):
    followers = get_all_followers(author_id)  # from Follow DB
    for follower_id in followers:
        redis.lpush(f"feed:{follower_id}", post_id)
        redis.ltrim(f"feed:{follower_id}", 0, 999)  # keep last 1000 posts
    kafka.publish("post.created", {"post_id": post_id, "author_id": author_id})
Pros: Fast reads — feed is pre-built. Cons: Write amplification — a celebrity with 10M followers causes 10M Redis writes per post. Unacceptable for accounts like @BarackObama (130M followers).
Fanout on Read (Pull Model)
On feed load: query all followees of the user, fetch their recent posts, merge and rank. No pre-building.
def get_feed(user_id, page=0):
    followees = get_followees(user_id)  # up to 5,000 follows
    posts = []
    for followee_id in followees:
        posts += get_recent_posts(followee_id, limit=10)
    ranked = rank_posts(posts, user_id)
    return ranked[page*20 : (page+1)*20]
Pros: No write amplification. Cons: Read is slow — N followees = N DB queries. Unacceptable for users who follow thousands of accounts.
Hybrid Model (Industry Standard)
The actual approach used by Facebook, Instagram, and Twitter:
- Regular users (<10K followers): Fanout on write. Pre-build feed in Redis.
- Celebrities (>10K followers): Fanout on read. Their posts are NOT pushed to followers’ feeds. Instead, at read time, the feed service fetches the top celebrity posts and merges them with the pre-built feed from non-celebrity follows.
def get_feed(user_id):
    # 1. Pre-built feed (regular follows, fanout-on-write)
    regular_posts = redis.lrange(f"feed:{user_id}", 0, 99)
    # 2. Celebrity posts (fanout-on-read)
    celebrity_followees = get_celebrity_follows(user_id)
    celebrity_posts = []
    for celeb_id in celebrity_followees:
        celebrity_posts += get_recent_posts(celeb_id, limit=5)
    # 3. Merge and rank
    all_posts = fetch_posts(regular_posts + celebrity_posts)
    return rank_posts(all_posts, user_id)[:20]
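The merge in step 3 can be sketched with heapq.merge, which lazily combines sorted streams without concatenating and re-sorting them. This assumes each source is already sorted newest-first and that posts are (created_at, post_id) tuples; the names are illustrative:

```python
import heapq
from itertools import islice

def merge_feeds(regular, celebrity, limit=20):
    """Merge two newest-first lists of (created_at, post_id) tuples.

    heapq.merge yields in ascending order, so timestamps are negated
    to produce a descending (newest-first) merge lazily.
    """
    streams = (((-ts, pid) for ts, pid in src) for src in (regular, celebrity))
    merged = heapq.merge(*streams)
    return [(-neg_ts, pid) for neg_ts, pid in islice(merged, limit)]
```

Lazy merging matters here because the celebrity stream may be fetched on demand; only enough items to fill one page are ever pulled.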
Feed Ranking
Chronological ordering gave way to ML ranking years ago: Facebook moved to ranked feeds (EdgeRank) around 2009, and Twitter introduced its algorithmic timeline in 2016. Ranking signals:
- Affinity: How closely does the user interact with the author? (replies, likes, DMs)
- Content weight: Video > images > links > text (based on engagement data)
- Recency: Decay function — newer posts score higher, but not purely chronological
- Predicted engagement: ML model estimates probability of like/share for this user-post pair
In an interview, a weighted formula suffices: score = affinity * 0.4 + content_weight * 0.2 + recency_decay * 0.4. A full ML model is not expected, but mention that real systems use gradient-boosted trees or neural ranking.
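The interview formula above can be sketched directly. The content weights and the one-hour half-life are assumptions for illustration, not values from any production system:

```python
# Illustrative only: weights and half-life are assumed, not production values.
CONTENT_WEIGHTS = {"video": 1.0, "image": 0.8, "link": 0.5, "text": 0.3}

def recency_decay(age_seconds, half_life_seconds=3600.0):
    # Exponential decay: a post loses half its recency score per half-life.
    return 0.5 ** (age_seconds / half_life_seconds)

def score(affinity, content_type, age_seconds):
    # score = affinity*0.4 + content_weight*0.2 + recency_decay*0.4
    return (0.4 * affinity
            + 0.2 * CONTENT_WEIGHTS.get(content_type, 0.3)
            + 0.4 * recency_decay(age_seconds))
```

Note that with exponential decay an old post from a high-affinity author can still outrank a brand-new post from a low-affinity one, which is the intended non-chronological behavior.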
Pagination
Avoid offset-based pagination (LIMIT 20 OFFSET 100): the database scans 120 rows to return 20, and the cost grows with page depth. Use cursor-based pagination: the client sends the created_at (plus post_id, to break ties) of the last seen post; the server fetches posts with created_at < that timestamp via an index seek, so deep pages cost the same as page one.
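The cursor scheme can be sketched over an in-memory list (a real implementation would do an index seek in the database, not a filter). The (created_at, post_id) compound cursor keeps pagination stable when two posts share a timestamp:

```python
def feed_page(posts, cursor=None, limit=20):
    """Cursor-based pagination over posts sorted newest-first.

    posts: dicts with 'created_at' and 'post_id', pre-sorted by
    (created_at DESC, post_id DESC).
    cursor: (created_at, post_id) of the last item of the previous
    page, or None for the first page.
    """
    if cursor is not None:
        # Keep only items strictly older than the cursor; ties on
        # created_at fall back to post_id for a stable ordering.
        posts = [p for p in posts
                 if (p["created_at"], p["post_id"]) < cursor]
    page = posts[:limit]
    next_cursor = ((page[-1]["created_at"], page[-1]["post_id"])
                   if page else None)
    return page, next_cursor
```

The client treats the cursor as opaque and echoes it back; page 500 and page 1 then cost the same on the server.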
Storage
- Posts: Cassandra — append-only writes, partition by author_id, sort by created_at DESC. Efficiently fetch all posts by an author ordered by time.
- Follow graph: Graph DB (Neo4j) or Cassandra (two tables: followers_of, following_of). For read performance, denormalize.
- Feed cache: Redis lists per user (max 1000 items). TTL = 7 days (inactive users’ caches expire).
- Post content: Images/videos → CDN (S3 + CloudFront). Store only the CDN URL in the Post record.
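The feed-cache behavior (LPUSH newest-first, LTRIM to 1000 entries) can be mimicked in memory with a bounded deque; this is a stand-in sketch for testing fanout logic without a Redis server, not the production cache:

```python
from collections import deque

class FeedCache:
    """In-memory stand-in for the per-user Redis feed lists.

    deque(maxlen=N) evicts the oldest entry automatically on append,
    mimicking LTRIM feed:{user_id} 0 N-1 after every LPUSH.
    """
    def __init__(self, max_items=1000):
        self.max_items = max_items
        self._feeds = {}

    def push(self, user_id, post_id):
        feed = self._feeds.setdefault(user_id, deque(maxlen=self.max_items))
        feed.appendleft(post_id)  # newest first, like LPUSH

    def range(self, user_id, start, stop):
        feed = self._feeds.get(user_id, deque())
        return list(feed)[start:stop + 1]  # inclusive stop, like LRANGE
```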
Scale Numbers
- 1B feed reads/day ≈ 11,600 reads/sec on average (well within a single Redis node's capacity; peak traffic is higher)
- Average user follows 300 people → pre-built feed write: 300 Redis writes/post
- Celebrity with 100M followers: 100M Redis writes/post → use fanout-on-read instead
- Celebrity threshold at Twitter: ~10K followers triggers pull model
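The arithmetic behind these numbers is worth being able to reproduce on a whiteboard; a quick sanity check:

```python
# Back-of-envelope check of the scale numbers above.
SECONDS_PER_DAY = 24 * 60 * 60          # 86,400

avg_reads_per_sec = 1_000_000_000 / SECONDS_PER_DAY   # ~11,574, i.e. ~11.6K
writes_per_regular_post = 300            # avg followee count -> Redis writes
writes_per_celebrity_post = 100_000_000  # why push fanout breaks for celebrities
amplification = writes_per_celebrity_post / writes_per_regular_post
```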
Interview Tips
- Start with the hybrid model — it demonstrates depth. Don’t just pick one fanout strategy.
- Mention the celebrity problem explicitly — it’s the key tradeoff interviewers want to hear.
- Feed ranking: even a simple formula shows you know real feeds aren’t chronological.
- Cursor-based pagination over offset: always mention this for feed-type problems.