Low-Level Design: Social Media Platform — Posts, Feeds, Follows, and Notifications

Core Entities

User: user_id, username, display_name, bio, avatar_url, follower_count, following_count, created_at.

Post: post_id, author_id, content, media_urls[], like_count, comment_count, share_count, visibility (PUBLIC, FOLLOWERS, PRIVATE), created_at.

Follow: follower_id, followee_id, created_at.

Like: user_id, post_id, created_at.

Comment: comment_id, post_id, author_id, content, parent_comment_id (for replies), like_count, created_at.

FeedItem: user_id, post_id, score, created_at (a cached news feed entry).
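As a rough illustration, a few of these entities could be modeled as Python dataclasses. This is only a sketch: the integer ID types, the UTC timestamps, and the can_view helper (including its reading of the three visibility levels) are assumptions, not something the design above specifies.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class Visibility(Enum):
    PUBLIC = "PUBLIC"
    FOLLOWERS = "FOLLOWERS"
    PRIVATE = "PRIVATE"


def _now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class Post:
    post_id: int
    author_id: int
    content: str
    media_urls: list[str] = field(default_factory=list)
    like_count: int = 0
    comment_count: int = 0
    share_count: int = 0
    visibility: Visibility = Visibility.PUBLIC
    created_at: datetime = field(default_factory=_now)


@dataclass
class Comment:
    comment_id: int
    post_id: int
    author_id: int
    content: str
    parent_comment_id: Optional[int] = None  # None for a top-level comment
    like_count: int = 0
    created_at: datetime = field(default_factory=_now)


def can_view(post: Post, viewer_id: int, follows: set[tuple[int, int]]) -> bool:
    """Hypothetical visibility check; follows holds (follower_id, followee_id) pairs.

    Assumed semantics: PUBLIC is visible to anyone, FOLLOWERS to the author
    and the author's followers, PRIVATE to the author only.
    """
    if viewer_id == post.author_id or post.visibility is Visibility.PUBLIC:
        return True
    if post.visibility is Visibility.FOLLOWERS:
        return (viewer_id, post.author_id) in follows
    return False
```

For example, a FOLLOWERS post by user 10 would be visible to user 20 only if the pair (20, 10) is present in follows.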

Post Creation and Storage

Post creation flow: (1) The user submits text and media. (2) The client uploads media directly to object storage (S3) via a presigned URL, so large files bypass the application servers. (3) A post record is created in PostgreSQL with post_id, author_id, content, and media_urls[]. (4) The post is published to the Kafka topic posts. (5) The feed fanout service consumes the event and distributes the post to follower feeds.

Post storage at scale: shard the posts table by author_id, since most queries are author-scoped (for example, “show this user's posts”). Index on (author_id, created_at DESC) for profile pages. For global post IDs, use a distributed ID generator (Snowflake-style: timestamp + machine_id + sequence) so that IDs stay unique across shards and remain roughly time-ordered.

Media CDN: serve images and videos through a CDN (CloudFront, Cloudflare) rather than directly from S3 in production.
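A minimal sketch of a Snowflake-style generator follows. The bit widths (10 bits of machine_id, 12 bits of sequence), the custom epoch, and the class name are assumptions chosen for illustration; the text only specifies the timestamp + machine_id + sequence layout.

```python
import threading
import time


class SnowflakeGenerator:
    """Illustrative ID layout: [timestamp_ms | machine_id | sequence]."""

    EPOCH_MS = 1_600_000_000_000  # hypothetical custom epoch (September 2020)
    MACHINE_BITS = 10             # assumed: up to 1024 generator instances
    SEQUENCE_BITS = 12            # assumed: up to 4096 IDs per ms per instance

    def __init__(self, machine_id: int):
        if not 0 <= machine_id < (1 << self.MACHINE_BITS):
            raise ValueError("machine_id out of range")
        self._machine_id = machine_id
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    @staticmethod
    def _now_ms() -> int:
        return int(time.time() * 1000)

    def next_id(self) -> int:
        with self._lock:
            now = self._now_ms()
            if now < self._last_ms:
                # Refuse to issue IDs that could collide or go out of order.
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & ((1 << self.SEQUENCE_BITS) - 1)
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = self._now_ms()
            else:
                self._sequence = 0
            self._last_ms = now
            timestamp = now - self.EPOCH_MS
            return (
                (timestamp << (self.MACHINE_BITS + self.SEQUENCE_BITS))
                | (self._machine_id << self.SEQUENCE_BITS)
                | self._sequence
            )
```

Because the timestamp occupies the high bits, IDs from a single generator sort by creation time, and IDs from different machines are ordered to within clock skew, which is what "rough time ordering" means here.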

Feed Generation: Push vs Pull

Push (fanout on write): when a post is created, immediately write it to every follower's feed. Feed reads are cheap because the feed is precomputed and can be fetched with a single lookup. Downside: for users with millions of followers (celebrities), a single post triggers millions of feed writes. Use push for regular users (followers < 10K).

Pull (fanout on read): when a user opens their feed, fetch the latest posts from all followed accounts and merge them. There is no precomputation. Downside: reads are expensive for users who follow many accounts (for example, merging the latest posts from 1,000 followees). Use pull for posts by celebrities.

Hybrid: use push for regular followees and skip the fanout for celebrity followees. On feed read, combine the precomputed feed (from push) with posts fetched in real time from the celebrities the user follows. Large platforms such as Twitter and Instagram are commonly described as using a hybrid of this kind.
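The hybrid strategy can be sketched in memory as follows. This is an assumption-laden toy: FeedService and its method names are hypothetical, Python dicts stand in for the feed cache and the posts store, and the only rule taken from the text is the 10K follower threshold.

```python
import heapq
from collections import defaultdict


class FeedService:
    def __init__(self, celebrity_threshold: int = 10_000):
        self.celebrity_threshold = celebrity_threshold
        self.followers = defaultdict(set)        # followee -> set of followers
        self.following = defaultdict(set)        # follower -> set of followees
        self.posts_by_author = defaultdict(list)  # author -> [(created_at, post_id)]
        self.feeds = defaultdict(list)           # user -> pushed [(created_at, post_id)]

    def follow(self, follower, followee) -> None:
        self.followers[followee].add(follower)
        self.following[follower].add(followee)

    def is_celebrity(self, user) -> bool:
        return len(self.followers[user]) >= self.celebrity_threshold

    def publish(self, author, post_id, created_at) -> None:
        self.posts_by_author[author].append((created_at, post_id))
        if self.is_celebrity(author):
            return  # skip fanout; followers pull these posts at read time
        for follower in self.followers[author]:
            self.feeds[follower].append((created_at, post_id))  # fanout on write

    def read_feed(self, user, limit: int = 20) -> list:
        # Each source is sorted newest-first so heapq.merge can combine them lazily.
        sources = [sorted(self.feeds[user], reverse=True)]
        for followee in self.following[user]:
            if self.is_celebrity(followee):
                sources.append(sorted(self.posts_by_author[followee], reverse=True))
        result, seen = [], set()
        for _, post_id in heapq.merge(*sources, reverse=True):
            if len(result) >= limit:
                break
            if post_id not in seen:  # a post may be both pushed and pulled
                seen.add(post_id)
                result.append(post_id)
        return result
```

The deduplication step covers an author who crosses the threshold after some posts were already pushed. A real system would also cap the stored feed length and page through results rather than sorting whole lists on every read.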

Feed Ranking

Chronological feed: the simplest option; show posts in reverse time order.

Algorithmic feed: rank by engagement signals. Features: post_age (recency), user_engagement_history (the viewer's like rate with this author), post_engagement_rate ((likes + comments) / impressions), media_type (video is often ranked higher), relationship_strength (how often the viewer interacts with the author). Ranking model: gradient boosted trees or a neural network trained on click/like/share signals. Score each candidate post: score = model.predict(features). Sort by score and return the top K.

Update frequency: re-score the feed every time the user opens the app (or every 5 minutes for active users). Cache the scored feed in Redis per user with a short TTL (5 minutes).
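The score-and-select step might look like the sketch below. It is not a trained model: a hand-weighted linear score stands in for model.predict, and the weights, the 24-hour recency decay, and the Candidate fields are all assumptions made for illustration.

```python
import heapq
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    post_id: int
    age_hours: float
    likes: int
    comments: int
    impressions: int
    author_like_rate: float  # viewer's historical like rate with this author
    is_video: bool


# Hypothetical weights; a real system would learn these from engagement data.
WEIGHTS = {
    "recency": 1.0,
    "post_engagement_rate": 2.0,
    "user_engagement_history": 1.5,
    "is_video": 0.3,
}


def features(c: Candidate) -> dict:
    # Guard against division by zero for posts with no impressions yet.
    rate = (c.likes + c.comments) / c.impressions if c.impressions else 0.0
    return {
        "recency": math.exp(-c.age_hours / 24.0),  # assumed exponential decay
        "post_engagement_rate": rate,
        "user_engagement_history": c.author_like_rate,
        "is_video": 1.0 if c.is_video else 0.0,
    }


def score(c: Candidate) -> float:
    """Stand-in for model.predict(features): a weighted sum of the features."""
    return sum(WEIGHTS[name] * value for name, value in features(c).items())


def top_k(candidates: list, k: int) -> list:
    # nlargest avoids fully sorting the candidate set when k is small.
    return heapq.nlargest(k, candidates, key=score)
```

The resulting list of post IDs is what would be cached per user with the short TTL described above.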

Like and Comment Systems

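Likes: store in a likes table (user_id, post_id) with a unique constraint so that a user can like a post only once. Like count: cached on the post row (like_count column). On like: INSERT INTO likes and UPDATE posts SET like_count=like_count+1 in the same transaction, incrementing only if the insert actually added a row. On unlike: DELETE FROM likes and UPDATE posts SET like_count=like_count-1 in the same transaction, decrementing only if a row was deleted. Otherwise repeated or retried requests make the cached count drift from the likes table. At extreme scale: use Redis INCR for the count and sync it to the database periodically. Has-liked check: a Redis SET per post holding user_ids (SISMEMBER gives an O(1) check).

Comments: threaded using parent_comment_id (a self-referential foreign key). Fetch top-level comments with a count of replies and expand replies on click. Sort comments by newest, oldest, or most-liked. The comment count is also cached on the post row.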
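The like and unlike paths can be sketched as below. To keep the example self-contained it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, so the schema and the INSERT OR IGNORE syntax are SQLite-specific assumptions; in PostgreSQL the analogous statement is INSERT ... ON CONFLICT DO NOTHING. The function names are hypothetical.

```python
import sqlite3


def init_schema(conn: sqlite3.Connection) -> None:
    conn.executescript(
        """
        CREATE TABLE posts (
            post_id    INTEGER PRIMARY KEY,
            like_count INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE likes (
            user_id    INTEGER NOT NULL,
            post_id    INTEGER NOT NULL REFERENCES posts(post_id),
            created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
            PRIMARY KEY (user_id, post_id)  -- enforces one like per user per post
        );
        """
    )


def like(conn: sqlite3.Connection, user_id: int, post_id: int) -> bool:
    """Return True if a new like was recorded, False if it already existed."""
    with conn:  # one transaction: commit on success, roll back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO likes (user_id, post_id) VALUES (?, ?)",
            (user_id, post_id),
        )
        if cur.rowcount == 0:
            return False  # duplicate like: leave the counter untouched
        conn.execute(
            "UPDATE posts SET like_count = like_count + 1 WHERE post_id = ?",
            (post_id,),
        )
        return True


def unlike(conn: sqlite3.Connection, user_id: int, post_id: int) -> bool:
    """Return True if a like was removed, False if there was none."""
    with conn:
        cur = conn.execute(
            "DELETE FROM likes WHERE user_id = ? AND post_id = ?",
            (user_id, post_id),
        )
        if cur.rowcount == 0:
            return False  # nothing to remove: do not decrement
        conn.execute(
            "UPDATE posts SET like_count = like_count - 1 WHERE post_id = ?",
            (post_id,),
        )
        return True


def like_count(conn: sqlite3.Connection, post_id: int) -> int:
    row = conn.execute(
        "SELECT like_count FROM posts WHERE post_id = ?", (post_id,)
    ).fetchone()
    return row[0]
```

Checking the affected row count before touching like_count is what keeps the cached counter consistent with the likes table when a client retries a request.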

Notification System

Notification types: like on your post, comment on your post, new follower, mention in a comment, re-share.

Generation: consume events from Kafka (for example LikeEvent, CommentEvent, FollowEvent). For each event, create a Notification record (recipient_id, type, actor_id, reference_id, is_read=false, created_at).

Delivery: push via FCM/APNs for mobile and WebSocket for web.

Aggregation: instead of three separate notifications (“Alice liked your post”, “Bob liked your post”, “Carol liked your post”), show one: “Alice and 2 others liked your post.” Aggregate within a 1-hour window per (recipient, reference_id, type).

Rate limiting: cap notification delivery at N per hour per user to avoid spam. Users can configure per-type notification preferences.
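A small in-memory sketch of the aggregation rule follows. The class and field names are hypothetical, and it assumes a fixed window that opens with the first event for a given (recipient, reference_id, type) key; a sliding window would be an equally valid reading of the text.

```python
from dataclasses import dataclass, field

WINDOW_SECONDS = 3600  # the 1-hour aggregation window

PHRASES = {
    "like": "liked your post",
    "comment": "commented on your post",
}


@dataclass
class NotificationGroup:
    recipient_id: str
    kind: str
    reference_id: str
    window_start: float
    actors: list = field(default_factory=list)


class NotificationAggregator:
    def __init__(self, window_seconds: int = WINDOW_SECONDS):
        self.window_seconds = window_seconds
        self._open = {}  # (recipient_id, reference_id, kind) -> NotificationGroup

    def add(self, recipient_id, kind, reference_id, actor, ts) -> NotificationGroup:
        key = (recipient_id, reference_id, kind)
        group = self._open.get(key)
        if group is None or ts - group.window_start >= self.window_seconds:
            # No open window for this key, or the old one expired: start a new group.
            group = NotificationGroup(recipient_id, kind, reference_id, ts)
            self._open[key] = group
        if actor not in group.actors:  # ignore repeated actions by the same actor
            group.actors.append(actor)
        return group


def render(group: NotificationGroup) -> str:
    first, others = group.actors[0], len(group.actors) - 1
    phrase = PHRASES[group.kind]
    if others == 0:
        return f"{first} {phrase}"
    noun = "other" if others == 1 else "others"
    return f"{first} and {others} {noun} {phrase}"
```

Feeding in likes from Alice, Bob, and Carol on the same post within the hour yields a single group that renders as "Alice and 2 others liked your post"; a like arriving after the window has expired starts a new group.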

