Question 1

What is the hybrid push-pull model for social media feed fanout?

Accepted Answer

The hybrid model uses push (fanout on write) for regular users and pull (fanout on read) for high-follower accounts, with the cutoff typically set somewhere between 10K and 100K followers. When a regular user posts, the post_id is pushed into each follower's feed cache in Redis, so feed reads are fast: the list is already computed. For a celebrity with 50M followers, writing to 50M Redis keys on every post is expensive and slow, so the push is skipped entirely. At read time, the service loads the user's pre-computed feed (posts from the regular users they follow), separately fetches the most recent N posts from each celebrity they follow, then merges and re-ranks the two sets. A user typically follows only a handful of celebrities (<10), so the extra per-read fetches are bounded. This is the pattern usually cited for why a feed such as Instagram's loads quickly: most of it was pre-computed, and only the posts from the few celebrities you follow are fetched on demand. The threshold is tunable (10K, 100K, or 1M) depending on the write-versus-read cost tradeoff for the specific platform.
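The write and read paths described above can be sketched in a few lines. This is a minimal in-memory illustration, not a production design: plain dictionaries stand in for the Redis feed caches and the post store, ranking is reduced to "newest first", and the names `HybridFeed`, `publish`, `read_feed`, and `per_celebrity` are hypothetical.

```python
import heapq
from collections import defaultdict


class HybridFeed:
    """In-memory sketch of hybrid fanout (dicts stand in for Redis and the post store)."""

    def __init__(self, threshold=10_000):
        self.threshold = threshold              # assumed celebrity cutoff
        self.followers = defaultdict(set)       # author -> follower ids
        self.following = defaultdict(set)       # user -> authors they follow
        self.feed_cache = defaultdict(list)     # user -> pushed (ts, post_id) entries
        self.author_posts = defaultdict(list)   # author -> own (ts, post_id) entries

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) >= self.threshold

    def publish(self, author, post_id, ts):
        """Store the post; push to follower caches only for regular authors.

        Returns the number of cache writes performed.
        """
        self.author_posts[author].append((ts, post_id))
        if self.is_celebrity(author):
            return 0  # fanout on read: followers pull this post later
        for follower in self.followers[author]:
            self.feed_cache[follower].append((ts, post_id))
        return len(self.followers[author])

    def read_feed(self, user, limit=20, per_celebrity=5):
        """Merge the pre-computed cache with recent posts pulled from celebrities."""
        # A set removes duplicates if an author crossed the threshold after
        # some of their posts had already been pushed.
        candidates = set(self.feed_cache[user])
        for author in self.following[user]:
            if self.is_celebrity(author):
                candidates.update(self.author_posts[author][-per_celebrity:])
        # "Re-rank" here is simply newest first; a real ranker would score posts.
        return [post_id for _, post_id in heapq.nlargest(limit, candidates)]
```

The point of the split is visible in `publish`: the cost of a regular post is proportional to the author's follower count, while a celebrity post costs one write and shifts a small, bounded amount of work to each reader.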

Question 2

How does cursor-based pagination prevent duplicate posts in an infinite scroll feed?

Accepted Answer

Offset-based pagination (for example, LIMIT 20 OFFSET 20 for page 2) has a critical flaw: if a new post is inserted between your first and second page requests, every post shifts down by one position, and you see a duplicate (the last post of page 1 reappears as the first post of page 2). Cursor-based pagination uses a stable reference point instead. The server returns a cursor with each page that identifies the last item delivered: its score or timestamp, plus a unique tie-breaker such as the post_id. The next request says "give me the 20 items that sort after this cursor." New posts have higher scores or newer timestamps, so they sort before the cursor: they show up at the top on pull-to-refresh, not in the middle of the feed. The cursor is typically an opaque token (for example, a base64-encoded score plus post_id) that the client sends back unchanged. Database implementation: "WHERE (score, post_id) < (?, ?) ORDER BY score DESC, post_id DESC LIMIT 20", with the decoded cursor values as the WHERE bound. Filtering on score alone ("WHERE score < ?") would silently skip posts that share the boundary score, which is why the tie-breaker matters. With a unique sort key whose values do not change between requests, the result is stable regardless of concurrent inserts.
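A small runnable sketch of the cursor query, using Python's built-in `sqlite3` module. The `posts` table, its `post_id` and `score` columns, and the helper names are assumptions made for illustration. The WHERE clause is written in expanded form (`score < ? OR (score = ? AND post_id < ?)`), which is equivalent to a row-value comparison on (score, post_id) and works on engines that lack row-value syntax.

```python
import base64
import json
import sqlite3


def encode_cursor(score, post_id):
    """Pack the last row's sort key into an opaque, URL-safe token."""
    raw = json.dumps([score, post_id]).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(token):
    score, post_id = json.loads(base64.urlsafe_b64decode(token.encode()))
    return score, post_id


def fetch_page(conn, cursor=None, limit=20):
    """Return (post_ids, next_cursor) for the page that follows `cursor`."""
    if cursor is None:
        rows = conn.execute(
            "SELECT post_id, score FROM posts "
            "ORDER BY score DESC, post_id DESC LIMIT ?",
            (limit,),
        ).fetchall()
    else:
        score, post_id = decode_cursor(cursor)
        rows = conn.execute(
            "SELECT post_id, score FROM posts "
            "WHERE score < ? OR (score = ? AND post_id < ?) "
            "ORDER BY score DESC, post_id DESC LIMIT ?",
            (score, score, post_id, limit),
        ).fetchall()
    next_cursor = encode_cursor(rows[-1][1], rows[-1][0]) if rows else None
    return [row[0] for row in rows], next_cursor


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (post_id INTEGER PRIMARY KEY, score INTEGER NOT NULL)")
conn.executemany("INSERT INTO posts VALUES (?, ?)", [(i, i * 10) for i in range(1, 6)])

page1, token = fetch_page(conn, limit=2)
conn.execute("INSERT INTO posts VALUES (6, 60)")  # a new post arrives between requests
page2, token = fetch_page(conn, token, limit=2)   # continues below the cursor; the new post is not included
```

An index on (score, post_id) lets the database seek directly to the cursor position, whereas a large OFFSET forces it to scan and discard every skipped row.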

Question 3

How do you implement real-time feed updates without polling?

Accepted Answer

Real-time feed updates (new posts appearing in your feed as they happen) require a push mechanism from server to client. Options: WebSocket (a bidirectional, persistent TCP connection; reportedly used by Twitter), Server-Sent Events (one-way server push over HTTP; simpler, and the browser reconnects automatically), or long polling (a fallback for environments that block WebSocket). Architecture: when a new post is created and fanout completes (the post has been pushed to follower feed caches in Redis), the notification service publishes a "new feed item" event to a Redis pub/sub channel keyed by user_id. Connection servers (roughly one per ~50K concurrent connections) subscribe to the channels of the users connected to them. On receiving a pub/sub message, the connection server pushes a "new post available" signal to the connected client. The client either appends the post directly or shows an "N new posts" banner. Redis pub/sub is fire-and-forget, so an event published while a client is disconnected is lost; on reconnect the client should re-fetch the feed rather than rely on the stream alone. For mobile clients, use an APNs/FCM push notification to wake the app, then have the app fetch the new feed item via REST. WebSocket for web clients and APNs/FCM for mobile is a common pattern.
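The publish-and-forward flow can be sketched with `asyncio`. To keep the example self-contained, a small in-memory class stands in for Redis pub/sub, and a plain callback stands in for the WebSocket or SSE write. `InMemoryPubSub`, `serve_client`, the `feed:{user_id}` channel naming, and the message shape are all hypothetical.

```python
import asyncio
from collections import defaultdict


class InMemoryPubSub:
    """Stand-in for Redis pub/sub: delivers only to current subscribers."""

    def __init__(self):
        self.subscribers = defaultdict(set)  # channel -> set of asyncio.Queue

    def subscribe(self, channel):
        queue = asyncio.Queue()
        self.subscribers[channel].add(queue)
        return queue

    def unsubscribe(self, channel, queue):
        self.subscribers[channel].discard(queue)

    def publish(self, channel, message):
        """Fire-and-forget: returns how many subscribers received the message."""
        for queue in self.subscribers[channel]:
            queue.put_nowait(message)
        return len(self.subscribers[channel])


async def serve_client(bus, user_id, send, max_events):
    """Connection-server loop for one user; `send` stands in for a socket write."""
    channel = f"feed:{user_id}"
    queue = bus.subscribe(channel)
    try:
        for _ in range(max_events):
            post_id = await queue.get()
            await send({"type": "new_post_available", "post_id": post_id})
    finally:
        bus.unsubscribe(channel, queue)


async def demo():
    bus = InMemoryPubSub()
    received = []

    async def send(message):
        received.append(message)

    client = asyncio.create_task(serve_client(bus, 42, send, max_events=2))
    await asyncio.sleep(0)                    # let the client subscribe first
    dropped_to = bus.publish("feed:7", "p0")  # nobody is connected for user 7
    bus.publish("feed:42", "p1")
    bus.publish("feed:42", "p2")
    await client
    return dropped_to, received
```

Running `asyncio.run(demo())` shows both behaviours from the answer: the connected user receives a signal per post, and the event for the user with no live connection reaches zero subscribers, which is why a reconnecting client needs to re-fetch its feed.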

System Design Interview: Design a Social Media Feed System

System Requirements

Functional

Fanout Strategies

Fanout on Write (Push Model)

Fanout on Read (Pull Model)

Hybrid Model (Production Standard)

Feed Ranking

Feed Storage

Read Path

Feed Cache TTL and Eviction

Interview Tips

What Is a Social Media Feed System?