Low-Level Design: Content Moderation System — Automated Filtering, Human Review, and Appeals

Core Entities

Content: content_id, type (TEXT, IMAGE, VIDEO), author_id, platform, raw_content / storage_url, created_at.

ModerationDecision: decision_id, content_id, verdict (APPROVED, REMOVED, ESCALATED), confidence_score, reason_codes[], decided_by (model_id or reviewer_id), decided_at.

Appeal: appeal_id, content_id, user_id, reason_text, status (PENDING, UPHELD, OVERTURNED), reviewer_id.

ReviewQueue: priority-ordered queue of content awaiting human review.
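The entities above can be sketched as dataclasses. This is a minimal illustration of the schema, not a production model; enum values and field types follow the definitions above.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from typing import Optional, List

class ContentType(Enum):
    TEXT = "TEXT"
    IMAGE = "IMAGE"
    VIDEO = "VIDEO"

class Verdict(Enum):
    APPROVED = "APPROVED"
    REMOVED = "REMOVED"
    ESCALATED = "ESCALATED"

class AppealStatus(Enum):
    PENDING = "PENDING"
    UPHELD = "UPHELD"
    OVERTURNED = "OVERTURNED"

@dataclass
class Content:
    content_id: str
    type: ContentType
    author_id: str
    platform: str
    storage_url: str            # or raw_content for small text payloads
    created_at: datetime

@dataclass
class ModerationDecision:
    decision_id: str
    content_id: str
    verdict: Verdict
    confidence_score: float
    reason_codes: List[str]
    decided_by: str             # model_id or reviewer_id
    decided_at: datetime

@dataclass
class Appeal:
    appeal_id: str
    content_id: str
    user_id: str
    reason_text: str
    status: AppealStatus = AppealStatus.PENDING
    reviewer_id: Optional[str] = None
```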

Moderation Pipeline

Content ingestion → Automated ML classifiers → Threshold decision → Human review queue (if escalated) → Final decision → Action enforcement → Appeal handling.

Automated classifiers: text classifiers (toxicity, spam, hate speech), image classifiers (NSFW, violence, graphic content), video frame sampling + audio transcription. Classifiers return a confidence score [0, 1] per violation category.
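As a sketch of the classifier interface, the toy class below maps text to a per-category confidence score in [0, 1]. The keyword matching is a stand-in for a real ML model, and the category keywords are invented for illustration.

```python
class TextClassifier:
    """Toy stand-in for an ML text classifier.

    A real system would run a model per category; here, a crude keyword
    overlap produces a score in [0, 1] so the interface is concrete.
    """

    # Illustrative keyword sets, not real moderation lists.
    KEYWORDS = {
        "toxicity": {"idiot", "trash"},
        "spam": {"free", "winner", "click"},
        "hate_speech": set(),  # omitted in this sketch
    }

    def classify(self, text: str) -> dict:
        tokens = set(text.lower().split())
        scores = {}
        for category, keywords in self.KEYWORDS.items():
            hits = len(tokens & keywords)
            scores[category] = min(1.0, hits / 2)  # crude score in [0, 1]
        return scores
```

The key design point is the return shape: one confidence score per violation category, which the threshold logic in the next section consumes directly.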

Decision Thresholds

Define per-category thresholds: if confidence >= HIGH_THRESHOLD (e.g., 0.95): auto-remove. If confidence >= LOW_THRESHOLD (e.g., 0.6): escalate to human review. If confidence < LOW_THRESHOLD: auto-approve. Thresholds are tunable without code changes (stored in config). Calibrate thresholds using precision/recall trade-off: high precision (few false positives) for auto-removal; lower precision acceptable for escalation (humans catch false positives).
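The threshold logic above can be sketched as a small pure function. The threshold values mirror the examples in the text and would live in config in a real system; the per-category numbers here are illustrative.

```python
# Per-category thresholds; in production these load from config so they
# are tunable without code changes. Values are illustrative.
THRESHOLDS = {
    "toxicity":    {"high": 0.95, "low": 0.60},
    "spam":        {"high": 0.95, "low": 0.60},
    "hate_speech": {"high": 0.95, "low": 0.60},
}

def decide(category_scores: dict) -> str:
    """Map per-category confidence scores to a verdict.

    The most severe outcome across categories wins: any score above the
    high threshold auto-removes; otherwise any score above the low
    threshold escalates to human review; otherwise auto-approve.
    """
    verdict = "APPROVED"
    for category, score in category_scores.items():
        t = THRESHOLDS[category]
        if score >= t["high"]:
            return "REMOVED"        # high precision: few false positives
        if score >= t["low"]:
            verdict = "ESCALATED"   # humans catch false positives here
    return verdict
```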

Human Review Queue

Priority queue ordered by: escalation reason severity (CSAM > violence > hate speech > spam), content virality (high-impression content reviewed first), and time in queue (prevent starvation). Reviewers are assigned content only from their approved categories (CSAM reviewers have specialized training). Track reviewer decisions and inter-rater agreement — flag reviewers with low agreement for calibration. Time-box each review (e.g., 60 seconds) — content not decided within the time limit returns to the queue with escalated priority.
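The queue ordering above can be sketched with a min-heap keyed by (severity rank, negated impressions, arrival order). The severity ranks follow the ordering in the text; the class and field names are assumptions for illustration.

```python
import heapq
import itertools

# Lower rank = reviewed first, per the severity ordering above.
SEVERITY = {"csam": 0, "violence": 1, "hate_speech": 2, "spam": 3}

class ReviewQueue:
    """Min-heap keyed by (severity rank, -impressions, enqueue order).

    The monotonically increasing counter breaks ties by arrival time,
    so older items within the same severity/virality band surface first.
    """

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()

    def enqueue(self, content_id: str, reason: str, impressions: int):
        key = (SEVERITY[reason], -impressions, next(self._counter))
        heapq.heappush(self._heap, (key, content_id))

    def dequeue(self) -> str:
        _, content_id = heapq.heappop(self._heap)
        return content_id
```

Note that a static heap key does not age items; the anti-starvation re-prioritization described above would need a periodic sweep or a time-decaying key in a real implementation.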

Action Enforcement

On REMOVED decision: hide content from feed immediately (soft delete — set is_visible=false). Notify the author with the reason code. Apply strike to the author account. After N strikes within 30 days: temporary suspension. After M total strikes: permanent ban. Strike records are part of the appeal. Action enforcement is a separate service that subscribes to ModerationDecision events — decoupled from the moderation pipeline.
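The strike policy above can be sketched as follows. In production this service would consume ModerationDecision events from a bus; here the event handler is a direct method call, and the N/M values are placeholders.

```python
from collections import defaultdict
from datetime import datetime, timedelta

N_STRIKES_WINDOW = 3    # N strikes within 30 days -> temporary suspension
M_STRIKES_TOTAL = 10    # M lifetime strikes -> permanent ban

class EnforcementService:
    """Applies the strike policy on REMOVED decisions.

    A real deployment subscribes to ModerationDecision events and
    persists strikes; this sketch keeps them in memory.
    """

    def __init__(self):
        self.strikes = defaultdict(list)   # author_id -> strike timestamps

    def on_removed(self, author_id: str, now: datetime = None) -> str:
        now = now or datetime.utcnow()
        self.strikes[author_id].append(now)
        recent = [t for t in self.strikes[author_id]
                  if now - t <= timedelta(days=30)]
        if len(self.strikes[author_id]) >= M_STRIKES_TOTAL:
            return "PERMANENT_BAN"
        if len(recent) >= N_STRIKES_WINDOW:
            return "TEMPORARY_SUSPENSION"
        return "STRIKE_RECORDED"
```

Keeping enforcement event-driven means the moderation pipeline never blocks on suspension logic, and strike policy changes deploy independently of the classifiers.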

Appeals System

class AppealService:
    def submit_appeal(self, content_id, user_id, reason):
        content = self.db.get_content(content_id)
        # Only the content's author may appeal its removal.
        if content.author_id != user_id:
            raise PermissionError("Can only appeal own content")
        # Reject duplicate appeals while one is still open.
        existing = self.db.get_appeal_for_content(content_id)
        if existing and existing.status == AppealStatus.PENDING:
            raise ValueError("Appeal already pending")
        appeal = Appeal(content_id=content_id, user_id=user_id,
                        reason_text=reason, status=AppealStatus.PENDING)
        self.db.insert(appeal)
        self.queue.enqueue(appeal, priority=Priority.NORMAL)
        return appeal

    def resolve_appeal(self, appeal_id, reviewer_id, decision):
        appeal = self.db.get_appeal(appeal_id)
        if appeal is None or appeal.status != AppealStatus.PENDING:
            raise ValueError("No pending appeal to resolve")
        appeal.status = decision  # UPHELD or OVERTURNED
        appeal.reviewer_id = reviewer_id
        if decision == AppealStatus.OVERTURNED:
            # Reversal restores the content and removes the strike.
            self.enforcement.restore_content(appeal.content_id)
            self.enforcement.reverse_strike(appeal.user_id)
        self.db.update(appeal)
        self.notify(appeal)

Scaling Considerations

At 100M posts/day: the automated pipeline must process about 1,200 items/second. Use a Kafka topic per content type; classifier workers auto-scale by consumer lag. ML inference is the bottleneck — GPU workers with batch inference (batch 32-64 items per forward pass) improve throughput significantly. Cache model predictions for duplicate content (hash-based deduplication: same image/text seen before gets the cached verdict). Human review queue is smaller — only 2-5% of content escalates at typical platforms.
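The hash-based deduplication mentioned above can be sketched as a verdict cache keyed by a content hash. The class name and in-memory dict are assumptions for illustration; a real deployment would use a shared store such as Redis and, for images, a perceptual hash rather than an exact byte hash so near-duplicates also hit the cache.

```python
import hashlib

class VerdictCache:
    """Maps a content hash to a cached moderation verdict.

    Duplicate uploads skip ML inference entirely: hash the raw bytes,
    look up the verdict, and only run classifiers on a cache miss.
    """

    def __init__(self):
        self._cache = {}

    @staticmethod
    def key(raw_bytes: bytes) -> str:
        return hashlib.sha256(raw_bytes).hexdigest()

    def get(self, raw_bytes: bytes):
        # Returns the cached verdict, or None on a miss.
        return self._cache.get(self.key(raw_bytes))

    def put(self, raw_bytes: bytes, verdict: str):
        self._cache[self.key(raw_bytes)] = verdict
```

At 100M posts/day, even a modest duplicate rate translates into a large fraction of GPU inference avoided, since spam campaigns in particular re-post identical content.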
