Core Entities
User: user_id, username (unique, URL-safe), email, password_hash, display_name, bio, avatar_url, role (READER, AUTHOR, EDITOR, ADMIN), created_at. Post: post_id, author_id, title, slug (unique, URL-safe: “my-first-post”), body (Markdown or HTML), excerpt (auto-generated or manual), status (DRAFT, PUBLISHED, ARCHIVED), featured_image_url, published_at, created_at, updated_at, view_count, reading_time_minutes. Tag: tag_id, name, slug. PostTag: post_id, tag_id. Comment: comment_id, post_id, author_id (nullable for guests), author_name (for guests), author_email, body, status (PENDING, APPROVED, SPAM), parent_id (nullable for threading), created_at. PostRevision: revision_id, post_id, author_id, body_snapshot, title_snapshot, created_at, change_summary.
SEO-Friendly URLs and Slug Generation
Posts are accessed via /author-username/post-slug (e.g., /john/how-to-design-systems). Slug generation: convert title to lowercase, replace spaces with hyphens, strip non-alphanumeric characters except hyphens, truncate to 100 characters. Ensure uniqueness per author: if slug already exists, append -2, -3, etc. Redirect handling: when a post’s title (and thus slug) changes, keep the old slug mapping in a PostSlug table: (slug, post_id, is_primary). The old slug returns HTTP 301 to the new slug. Never return 404 for previously valid URLs — search engines penalize broken links. Canonical URL: set the canonical link tag to the primary slug URL to consolidate SEO credit if multiple paths reach the same post. RSS feed: generate /feed.xml for each author and globally. Cache the RSS XML (regenerate only on new post publish). Sitemap: generate /sitemap.xml with all published posts, tags, and author pages. Refresh daily. Submit to Google Search Console.
Draft and Publishing Workflow
class PostService:
def publish_post(self, post_id: int, publisher_id: int) -> Post:
post = self.repo.get_post(post_id)
if post.status != PostStatus.DRAFT:
raise InvalidStatusError()
if post.author_id != publisher_id and not self.is_editor(publisher_id):
raise PermissionError()
# Compute auto-fields
post.reading_time_minutes = self._estimate_reading_time(post.body)
post.excerpt = post.excerpt or self._generate_excerpt(post.body)
post.published_at = datetime.utcnow()
post.status = PostStatus.PUBLISHED
self.repo.save(post)
# Clear post cache
self.cache.delete(f"post:{post.slug}")
self.cache.delete("recent_posts")
self.cache.delete(f"author_posts:{post.author_id}")
# Trigger async jobs
self.events.publish("post.published", {
"post_id": post.post_id,
"author_id": post.author_id
})
return post
The post.published event triggers: notification to subscribers (newsletter send), ping to search engines (IndexNow API for instant indexing), social media auto-post (if configured), CDN cache warmup.
Comments with Moderation
Comment workflow: (1) Guest/user submits comment → status=PENDING. (2) Spam check: run through Akismet API (industry standard for blog spam). If spam: status=SPAM (not shown). (3) Auto-approve trusted commenters: users who have had 2+ comments approved are auto-approved (status=APPROVED immediately). First-time commenters wait for moderation. (4) Author/editor moderation queue: PENDING comments are visible in the admin dashboard. Approve or mark as spam. Email notification to the post author when a new comment awaits moderation. Threading: store parent_id on each comment. Load top-level comments first; load replies on expand. Flat representation in the database, tree representation in the UI. Comment count: cache the approved comment count per post in Redis. Increment on approve, decrement on delete. Avoid counting in SQL on every page load. Anti-spam: rate limit by IP (max 3 comments per 10 minutes), require JavaScript challenge (honeypot field), validate email format for guests.
Post Discovery and Recommendations
Tag-based discovery: posts are tagged with topics. Tag page (/tag/system-design) shows all posts for a tag sorted by published_at DESC. Index: (tag_id, published_at DESC) on PostTag join with Posts. Author page (/john) shows all of an author’s published posts. Related posts: after reading a post, show 3-5 related posts. Simple approach: posts sharing the most tags with the current post. Query: find posts sharing any tag, rank by number of shared tags, return top 5. More sophisticated: TF-IDF similarity on post content (compute offline, store in a related_posts cache). Cache related posts per post_id for 24 hours. Trending posts: posts with the most views in the last 7 days. Maintain a sorted set in Redis: ZINCRBY trending:posts 1 post_id on each page view. Refresh daily from the database for accuracy (Redis count is approximate). Full-text search: Elasticsearch index for searching across titles, bodies, and tags. Autocomplete suggestions from a completion suggester on post titles.
Asked at: Shopify Interview Guide
Asked at: Twitter/X Interview Guide
Asked at: Snap Interview Guide
Asked at: Cloudflare Interview Guide