System Design Interview: Design a Social Media Feed System

What Is a Social Media Feed System?

A social media feed aggregates and ranks posts from followed users and surfaces them in a personalized order. Examples: Twitter/X Home Timeline, Instagram Feed, LinkedIn Feed. Core challenges: fanout at scale (a post by a celebrity with 50M followers must populate 50M feeds), ranking (relevance over chronological order), and low read latency (<100ms to load the feed).

System Requirements

    Functional

    • User posts content; followers see it in their feed
    • Feed is ranked by relevance, not just chronology
    • Pagination: infinite scroll with cursor-based pagination
    • Real-time updates for followed users’ new posts
    • Scale target (non-functional): 10M DAU, 100K posts/second at peak

    Fanout Strategies

    Fanout on Write (Push Model)

    When a user posts, immediately write the post_id to each follower’s feed cache. Fast reads (pre-computed). Expensive writes for users with many followers. Works well for “regular” users (<10K followers).

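    import time
    import redis

    r = redis.Redis()  # connection settings are an assumption; get_followers() is defined elsewhere

    def on_post_created(post_id, author_id):
        now = time.time()  # initial score; a ranker may overwrite it later
        pipe = r.pipeline()  # batch all writes into one round trip
        for follower_id in get_followers(author_id):
            # Sorted set, so the read path's ZREVRANGE works on the same key
            pipe.zadd(f'feed:{follower_id}', {post_id: now})
            pipe.zremrangebyrank(f'feed:{follower_id}', 0, -1001)  # keep top 1000 posts
        pipe.execute()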
    

    Fanout on Read (Pull Model)

    When a user loads their feed, fetch the latest posts from each followed user and merge. Expensive reads. Works well for celebrities/influencers (>10K followers) — avoids writing to millions of feeds.
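The pull model can be sketched as a k-way merge over per-author timelines. This is a minimal illustration, assuming each author's recent posts are already available as a newest-first list; the `Post` tuple, the `recent_posts_by_author` dict, and `pull_feed` are hypothetical names, and a real system would read the timelines from a store rather than from memory.

```python
import heapq
from itertools import islice
from typing import NamedTuple

class Post(NamedTuple):
    created_at: float  # Unix timestamp
    post_id: int
    author_id: int

def pull_feed(followee_ids, recent_posts_by_author, limit=20):
    """Merge each followee's newest-first post list into one newest-first feed."""
    per_author = (recent_posts_by_author.get(a, []) for a in followee_ids)
    # heapq.merge lazily merges already-sorted inputs; reverse=True means
    # every input is ordered from largest to smallest (newest first).
    merged = heapq.merge(*per_author, key=lambda p: p.created_at, reverse=True)
    return list(islice(merged, limit))

posts = {
    1: [Post(300.0, 13, 1), Post(100.0, 11, 1)],
    2: [Post(250.0, 22, 2), Post(200.0, 21, 2)],
}
feed = pull_feed([1, 2], posts, limit=3)
```

The read cost grows with the number of followed accounts, which is why this approach is usually reserved for the few high-follower authors.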

    Hybrid Model (Production Standard)

    Use push for regular users, pull for celebrities. Threshold: if the followee has >10K followers, skip the push and compute on read. At read time: load the pre-computed feed from Redis, fetch recent posts from followed celebrities, then merge and re-rank. Large platforms such as Twitter have publicly described hybrid designs along these lines; the exact threshold is a tuning parameter, not a fixed rule.
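A rough sketch of both halves of the hybrid model: the write-time decision and the read-time merge. The names `CELEBRITY_THRESHOLD`, `fanout_targets`, `hybrid_feed`, and the in-memory `followers_of` dict are assumptions made for illustration, and the scores are placeholders for whatever the ranker produces.

```python
CELEBRITY_THRESHOLD = 10_000  # follower cutoff taken from the text above

def fanout_targets(author_id, followers_of):
    """Write path: return the follower ids to push to, or [] for celebrities."""
    followers = followers_of.get(author_id, [])
    if len(followers) > CELEBRITY_THRESHOLD:
        return []  # skip push; these posts are pulled at read time
    return list(followers)

def hybrid_feed(precomputed, celebrity_posts, limit=20):
    """Read path: merge pushed entries with pulled celebrity posts.

    Both inputs are lists of (score, post_id) pairs. Duplicates are
    dropped by post_id and the result is sorted by score, highest first.
    """
    best = {}
    for score, post_id in list(precomputed) + list(celebrity_posts):
        if post_id not in best or score > best[post_id]:
            best[post_id] = score
    ranked = sorted(best.items(), key=lambda kv: kv[1], reverse=True)
    return [post_id for post_id, _ in ranked[:limit]]

followers_of = {"alice": ["bob", "carol"], "star": [f"u{i}" for i in range(10_001)]}
pushed = fanout_targets("alice", followers_of)
skipped = fanout_targets("star", followers_of)
feed = hybrid_feed([(0.9, "p1"), (0.4, "p2")], [(0.7, "c1"), (0.4, "p2")], limit=3)
```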

    Feed Ranking

    Chronological is simple but suboptimal. Ranked feeds use ML models. Features used:

    • Post signals: age, engagement rate (likes/views), media type (video ranks higher)
    • Author signals: closeness to viewer (interaction history), account age
    • Viewer signals: historical engagement with similar content, time of day

    Ranking pipeline: candidate retrieval (top 500 posts from pool) → lightweight scoring model (logistic regression, on the order of 1 ms) → top 100 → heavy ranking model (neural net) → top 20 → diversity filter (avoid same author twice in a row) → final feed.
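The funnel above might look like the following sketch. The feature names, the weights in `light_score`, and the `heavy_score` placeholder are illustrative assumptions, not a trained model; candidate retrieval is assumed to have happened upstream.

```python
import math

def light_score(post):
    """Stage 1: cheap logistic score over two hand-picked features."""
    z = 2.0 * post["engagement_rate"] - 0.1 * post["age_hours"]
    return 1.0 / (1.0 + math.exp(-z))

def heavy_score(post):
    """Stage 2 placeholder: stands in for an expensive neural model."""
    return light_score(post) + 0.5 * post["author_affinity"]

def diversify(posts):
    """Greedy pass that avoids placing the same author twice in a row."""
    remaining, out = list(posts), []
    while remaining:
        last_author = out[-1]["author_id"] if out else None
        # First remaining post by a different author, else fall back to the head.
        pick = next((p for p in remaining if p["author_id"] != last_author), remaining[0])
        remaining.remove(pick)
        out.append(pick)
    return out

def rank_feed(candidates, light_k=100, final_k=20):
    stage1 = sorted(candidates, key=light_score, reverse=True)[:light_k]
    stage2 = sorted(stage1, key=heavy_score, reverse=True)[:final_k]
    return diversify(stage2)

candidates = [
    {"post_id": 1, "author_id": "a", "engagement_rate": 0.9, "age_hours": 1, "author_affinity": 0.8},
    {"post_id": 2, "author_id": "a", "engagement_rate": 0.8, "age_hours": 1, "author_affinity": 0.8},
    {"post_id": 3, "author_id": "b", "engagement_rate": 0.5, "age_hours": 2, "author_affinity": 0.2},
]
ranked = rank_feed(candidates)
```

In this example the two posts by author "a" score highest, and the diversity pass moves the post by author "b" between them.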

    Feed Storage

    posts: id, author_id, content, media_url, created_at, like_count, comment_count
    follows: follower_id, followee_id, created_at
    feed_cache: Redis sorted set per user (key feed:{user_id}); member = post_id, score = relevance * timestamp
    

    Read Path

    1. Read from Redis: ZREVRANGE feed:{user_id} 0 19 WITHSCORES — top 20 posts by score
    2. Fetch post content: Redis hash or Cassandra lookup by post_id
    3. Fetch engagement counts: Redis counters (updated in real-time)
    4. Return to client with cursor for next page
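The four steps above can be sketched against in-memory stand-ins. The dict arguments below are hypothetical substitutes for the Redis sorted set, the post store, and the live counters, chosen so the example runs without any external service.

```python
def read_feed(user_id, feed_index, post_store, like_counters, page_size=20):
    """Run the four read-path steps against in-memory stand-ins.

    feed_index:    {user_id: {post_id: score}}   (stands in for the sorted set)
    post_store:    {post_id: {"content": ...}}   (stands in for the post lookup)
    like_counters: {post_id: int}                (stands in for live counters)
    """
    scored = feed_index.get(user_id, {})
    # Step 1: highest-scoring post ids first (what ZREVRANGE would return).
    top = sorted(scored.items(), key=lambda kv: kv[1], reverse=True)[:page_size]
    items = []
    for post_id, score in top:
        post = post_store.get(post_id)
        if post is None:
            continue  # post was deleted after fanout; skip it
        # Steps 2 and 3: hydrate content and attach the current like count.
        items.append({"post_id": post_id, "score": score, **post,
                      "likes": like_counters.get(post_id, 0)})
    # Step 4: the lowest score on a full page becomes the next-page cursor.
    next_cursor = top[-1][1] if len(top) == page_size else None
    return {"items": items, "next_cursor": next_cursor}

feed_index = {"u1": {"p1": 3.0, "p2": 2.0, "p3": 1.0}}
post_store = {"p1": {"content": "hello"}, "p3": {"content": "bye"}}
likes = {"p1": 5}
page = read_feed("u1", feed_index, post_store, likes, page_size=2)
```

Skipping deleted posts during hydration means a page can come back shorter than `page_size`; clients should rely on the cursor, not the item count, to detect the end of the feed.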

    Cursor-Based Pagination

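    Avoid offset pagination (LIMIT 20 OFFSET 100): new posts inserted between the page 1 and page 2 requests shift every row, so items repeat or get skipped. Use a cursor instead: the last_seen_post_id or last_score. “Give me 20 posts with score < cursor_score.” The cursor is returned to the client and sent back on the next request. If scores can tie, include post_id in the cursor as a tie-breaker so posts with equal scores are not skipped.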
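A minimal sketch of cursor-based pagination, assuming the feed is a collection of (score, post_id) pairs. The opaque base64 token format and the function names are illustrative choices; in a Redis-backed feed the list filter would be replaced by a score-range query on the sorted set.

```python
import base64
import json

def encode_cursor(score, post_id):
    raw = json.dumps([score, post_id]).encode()
    return base64.urlsafe_b64encode(raw).decode()

def decode_cursor(token):
    score, post_id = json.loads(base64.urlsafe_b64decode(token))
    return score, post_id

def get_page(entries, cursor=None, page_size=20):
    """entries: list of (score, post_id), in any order.

    Returns (page, next_cursor). Ordering is by (score, post_id) descending,
    so posts with equal scores still have a stable position.
    """
    ordered = sorted(entries, reverse=True)
    if cursor is not None:
        boundary = decode_cursor(cursor)
        ordered = [e for e in ordered if e < boundary]  # strictly after the cursor
    page = ordered[:page_size]
    more = len(ordered) > page_size
    next_cursor = encode_cursor(*page[-1]) if more else None
    return page, next_cursor

entries = [(5.0, 1), (4.0, 2), (4.0, 3), (3.0, 4)]
page1, cursor1 = get_page(entries, page_size=2)
entries.append((6.0, 5))  # a new post arrives between requests
page2, cursor2 = get_page(entries, cursor1, page_size=2)
```

The new post with score 6.0 does not disturb page 2, because the cursor pins the position by value rather than by row offset.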

    Feed Cache TTL and Eviction

    Feed caches for inactive users waste memory. Set a TTL so the feed cache expires if the user has not logged in for 7 days. On the next login, rebuild the feed from scratch (cold start): pull the latest 200 posts from followed users and rank them. Trigger this rebuild when the session is created, so the feed is warm before the client's first feed request.
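The expiry-and-rebuild behaviour can be sketched with an in-memory cache and an injectable clock. `FeedCache`, `on_session_created`, and `rebuild_feed` are hypothetical names; with Redis, the same effect would typically come from setting an expiry on the feed key and refreshing it on activity.

```python
import time

FEED_TTL_SECONDS = 7 * 24 * 3600  # 7-day inactivity window from the text

class FeedCache:
    """In-memory stand-in for a cache whose entries expire after a TTL."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._entries = {}  # user_id -> (expires_at, feed)

    def get(self, user_id):
        entry = self._entries.get(user_id)
        if entry is None or entry[0] <= self._clock():
            self._entries.pop(user_id, None)  # expired or never cached
            return None
        return entry[1]

    def put(self, user_id, feed):
        self._entries[user_id] = (self._clock() + FEED_TTL_SECONDS, feed)

def on_session_created(user_id, cache, rebuild_feed):
    """Serve the cached feed, or rebuild it on a cold start; refresh the TTL."""
    feed = cache.get(user_id)
    if feed is None:
        feed = rebuild_feed(user_id)  # e.g. pull latest 200 posts and rank them
    cache.put(user_id, feed)  # each login pushes the expiry out another 7 days
    return feed

now = [0.0]
cache = FeedCache(clock=lambda: now[0])
rebuilds = []

def rebuild(user_id):
    rebuilds.append(user_id)
    return ["p1", "p2"]

on_session_created("u1", cache, rebuild)   # day 0: cold start
now[0] = 3 * 24 * 3600
on_session_created("u1", cache, rebuild)   # day 3: cache hit, expiry moves to day 10
now[0] = 11 * 24 * 3600
feed = on_session_created("u1", cache, rebuild)  # day 11: expired, rebuilt
```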

    Interview Tips

    • Hybrid push/pull with the 10K follower threshold is the key insight.
    • Cursor-based pagination prevents duplicated or skipped posts across pages when new posts arrive between requests.
    • Two-stage ranking (lightweight then heavy) balances latency and quality.
    • Separate the fanout service from the post service — fanout is asynchronous.
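To illustrate the last tip, here is a small sketch in which the post service only enqueues a job and a separate worker performs the fanout. The in-process `queue.Queue` and the function names are assumptions for the example; a production system would more likely use a durable message broker between the two services.

```python
import queue
import threading

def create_post(post_id, author_id, posts, fanout_queue):
    """Post service: persist the post, enqueue a fanout job, return at once."""
    posts[post_id] = author_id
    fanout_queue.put((post_id, author_id))

def fanout_worker(fanout_queue, followers_of, feeds):
    """Fanout service: drain jobs and add post ids to follower feeds."""
    while True:
        job = fanout_queue.get()
        if job is None:  # sentinel: shut the worker down
            fanout_queue.task_done()
            break
        post_id, author_id = job
        for follower_id in followers_of.get(author_id, []):
            feeds.setdefault(follower_id, []).insert(0, post_id)  # newest first
        fanout_queue.task_done()

posts, feeds = {}, {}
followers_of = {"alice": ["bob", "carol"]}
jobs = queue.Queue()
worker = threading.Thread(target=fanout_worker, args=(jobs, followers_of, feeds))
worker.start()
create_post("p1", "alice", posts, jobs)
create_post("p2", "alice", posts, jobs)
jobs.put(None)  # stop the worker once the queued jobs are processed
worker.join()
```

Because `create_post` returns as soon as the job is queued, post latency does not depend on how many followers the author has.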