Core Entities
Content: content_id, type (TEXT, IMAGE, VIDEO), author_id, platform, raw_content / storage_url, created_at.
ModerationDecision: decision_id, content_id, verdict (APPROVED, REMOVED, ESCALATED), confidence_score, reason_codes[], decided_by (model_id or reviewer_id), decided_at.
Appeal: appeal_id, content_id, user_id, reason_text, status (PENDING, UPHELD, OVERTURNED), reviewer_id.
ReviewQueue: a priority-ordered queue of content awaiting human review.
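The entities above could be sketched as Python dataclasses. This is a minimal illustration, not a schema: the field types, the default values, the `_now` helper, and the choice to leave `appeal_id` unset until the record is stored are all assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class ContentType(Enum):
    TEXT = "TEXT"
    IMAGE = "IMAGE"
    VIDEO = "VIDEO"


class Verdict(Enum):
    APPROVED = "APPROVED"
    REMOVED = "REMOVED"
    ESCALATED = "ESCALATED"


class AppealStatus(Enum):
    PENDING = "PENDING"
    UPHELD = "UPHELD"
    OVERTURNED = "OVERTURNED"


def _now() -> datetime:
    # Hypothetical helper: timezone-aware UTC timestamps.
    return datetime.now(timezone.utc)


@dataclass
class Content:
    content_id: str
    type: ContentType
    author_id: str
    platform: str
    raw_content: Optional[str] = None   # inline text, if any
    storage_url: Optional[str] = None   # pointer to an image/video blob
    created_at: datetime = field(default_factory=_now)


@dataclass
class ModerationDecision:
    decision_id: str
    content_id: str
    verdict: Verdict
    confidence_score: float
    decided_by: str                     # model_id or reviewer_id
    reason_codes: list[str] = field(default_factory=list)
    decided_at: datetime = field(default_factory=_now)


@dataclass
class Appeal:
    content_id: str
    user_id: str
    reason_text: str
    status: AppealStatus = AppealStatus.PENDING
    reviewer_id: Optional[str] = None
    appeal_id: Optional[str] = None     # assumed to be assigned on insert
```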
Moderation Pipeline
Content ingestion → Automated ML classifiers → Threshold decision → Human review queue (if escalated) → Final decision → Action enforcement → Appeal handling.
Automated classifiers: text classifiers (toxicity, spam, hate speech), image classifiers (NSFW, violence, graphic content), and, for video, frame sampling plus audio transcription. Each classifier returns a confidence score in [0, 1] per violation category.
Decision Thresholds
Define per-category thresholds. If confidence >= HIGH_THRESHOLD (e.g., 0.95), auto-remove. If LOW_THRESHOLD (e.g., 0.6) <= confidence < HIGH_THRESHOLD, escalate to human review. If confidence < LOW_THRESHOLD, auto-approve. Thresholds are stored in config so they can be tuned without code changes. Calibrate them on the precision/recall trade-off: auto-removal needs high precision (few false positives), while escalation can tolerate lower precision because human reviewers catch the false positives.
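A minimal sketch of the threshold rule, assuming classifier output arrives as a dict of category to score. The `THRESHOLDS` table, its spam values, the `DEFAULT_THRESHOLD` fallback, and the `decide` function are hypothetical; a real system would load the thresholds from its config store.

```python
from enum import Enum


class Verdict(Enum):
    APPROVED = "APPROVED"
    REMOVED = "REMOVED"
    ESCALATED = "ESCALATED"


# Hypothetical per-category config; values are illustrative only.
THRESHOLDS = {
    "hate_speech": {"low": 0.60, "high": 0.95},
    "spam": {"low": 0.70, "high": 0.98},
}
DEFAULT_THRESHOLD = {"low": 0.60, "high": 0.95}


def decide(scores: dict[str, float], thresholds=THRESHOLDS) -> tuple[Verdict, list[str]]:
    """Return the strictest verdict across categories and the categories that triggered it."""
    removed, escalated = [], []
    for category, score in scores.items():
        t = thresholds.get(category, DEFAULT_THRESHOLD)
        if score >= t["high"]:
            removed.append(category)
        elif score >= t["low"]:
            escalated.append(category)
    if removed:
        return Verdict.REMOVED, removed
    if escalated:
        return Verdict.ESCALATED, escalated
    return Verdict.APPROVED, []
```

For example, `decide({"hate_speech": 0.97, "spam": 0.1})` auto-removes on hate speech alone, while a post that only crosses low thresholds is escalated with every triggering category attached as a reason code.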
Human Review Queue
Priority queue ordered by: escalation reason severity (CSAM > violence > hate speech > spam), content virality (high-impression content is reviewed first), and time in queue (to prevent starvation). Reviewers are assigned content only from categories they are approved for (CSAM reviewers have specialized training). Track reviewer decisions and inter-rater agreement, and flag reviewers with low agreement for calibration. Time-box each review (e.g., 60 seconds); content that is not resolved within the time box goes back to the queue with escalated priority.
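One way to combine the three ordering criteria is a single urgency score on a heap. The sketch below assumes linear aging; the `SEVERITY` weights, the `AGING_PER_SECOND` rate, the virality term, and the `ReviewQueue` method names are all hypothetical choices made for illustration.

```python
import heapq
import itertools
import math

# Hypothetical weights; higher means more urgent.
SEVERITY = {"csam": 4, "violence": 3, "hate_speech": 2, "spam": 1}
AGING_PER_SECOND = 1 / 3600  # one severity level of urgency per hour waited


class ReviewQueue:
    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps equal scores FIFO

    def push(self, content_id, reason, impressions, enqueued_at):
        # Urgency at time `now` is
        #   severity + virality + AGING_PER_SECOND * (now - enqueued_at).
        # The AGING_PER_SECOND * now term is identical for every item, so the
        # ordering depends only on the static part computed here, and the heap
        # stays valid as time passes without any re-scoring.
        static = (SEVERITY[reason]
                  + 0.1 * math.log10(1 + impressions)
                  - AGING_PER_SECOND * enqueued_at)
        heapq.heappush(self._heap, (-static, next(self._counter), content_id))

    def pop(self):
        # heapq is a min-heap, so the negated score yields the most urgent item.
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)
```

With these weights, a spam item that has waited about three hours outranks a fresh hate-speech item, which is the anti-starvation behavior. A platform that wants CSAM to always go first regardless of age would likely keep it in a separate hard tier instead of relying on a blended score.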
Action Enforcement
On a REMOVED decision: hide the content from feeds immediately (soft delete: set is_visible=false), notify the author with the reason code, and apply a strike to the author's account. After N strikes within 30 days, the account is temporarily suspended; after M total strikes, it is permanently banned. Strike records are tied to the content they were issued for, so a successful appeal can reverse them. Action enforcement is a separate service that subscribes to ModerationDecision events, which keeps it decoupled from the moderation pipeline.
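The strike policy can be sketched as a pure function over an author's strike timestamps. `SUSPEND_AFTER` and `BAN_AFTER` are hypothetical stand-ins for N and M, and `account_action` is an illustrative name; reversed strikes are assumed to have been removed from the list already.

```python
from datetime import datetime, timedelta

# Hypothetical policy values standing in for N and M.
SUSPEND_AFTER = 3             # strikes within the rolling window
BAN_AFTER = 10                # total strikes
WINDOW = timedelta(days=30)


def account_action(strike_times: list[datetime], now: datetime) -> str:
    """Return "BAN", "SUSPEND", or "NONE" for an author's active strikes."""
    if len(strike_times) >= BAN_AFTER:
        return "BAN"
    recent = [t for t in strike_times if now - t <= WINDOW]
    if len(recent) >= SUSPEND_AFTER:
        return "SUSPEND"
    return "NONE"
```

Keeping the policy free of side effects makes it easy to re-evaluate after an appeal reverses a strike: the enforcement service recomputes the action from the remaining strikes and lifts a suspension if the result drops to "NONE".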
Appeals System
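class AppealService:
    # db, queue, enforcement, and notify are collaborators injected elsewhere;
    # Appeal, AppealStatus, and Priority are assumed to be defined elsewhere.
    def submit_appeal(self, content_id, user_id, reason_text):
        content = self.db.get_content(content_id)
        if content.author_id != user_id:
            raise PermissionError("Can only appeal own content")
        # Hypothetical lookup by content ID; get_appeal() below takes an appeal ID.
        existing = self.db.get_appeal_by_content(content_id)
        if existing and existing.status == AppealStatus.PENDING:
            raise ValueError("Appeal already pending")
        appeal = Appeal(content_id=content_id, user_id=user_id,
                        reason_text=reason_text, status=AppealStatus.PENDING)
        self.db.insert(appeal)
        self.queue.enqueue(appeal, priority=Priority.NORMAL)
        return appeal

    def resolve_appeal(self, appeal_id, reviewer_id, decision):
        appeal = self.db.get_appeal(appeal_id)
        # Guard against resolving the same appeal twice.
        if appeal.status != AppealStatus.PENDING:
            raise ValueError("Appeal already resolved")
        appeal.status = decision  # AppealStatus.UPHELD or AppealStatus.OVERTURNED
        appeal.reviewer_id = reviewer_id
        if decision == AppealStatus.OVERTURNED:
            # The original removal was wrong: restore the content and undo the strike.
            self.enforcement.restore_content(appeal.content_id)
            self.enforcement.reverse_strike(appeal.user_id)
        self.db.update(appeal)
        self.notify(appeal)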
Scaling Considerations
At 100M posts/day, the automated pipeline must process about 1,200 items/second on average (100M / 86,400 s ≈ 1,157), and more at peak. Use a Kafka topic per content type; classifier workers auto-scale on consumer lag. ML inference is the bottleneck, so run GPU workers with batch inference (32-64 items per forward pass) to raise throughput. Cache model predictions for duplicate content using hash-based deduplication: an image or text that has been seen before gets the cached verdict. The human review queue is far smaller: if 2-5% of content escalates, that is 2-5M items/day.
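The throughput arithmetic and the hash-based deduplication can be sketched as follows. `VerdictCache` and its `classify` callable are hypothetical, and the in-process dict stands in for whatever shared cache a real deployment would use.

```python
import hashlib

POSTS_PER_DAY = 100_000_000
SECONDS_PER_DAY = 24 * 60 * 60
avg_rate = POSTS_PER_DAY / SECONDS_PER_DAY  # about 1,157 items/second on average


class VerdictCache:
    """Exact-duplicate cache keyed by the SHA-256 of the content bytes."""

    def __init__(self, classify):
        self._classify = classify  # hypothetical callable: bytes -> verdict
        self._cache = {}
        self.hits = 0

    def verdict_for(self, payload: bytes):
        key = hashlib.sha256(payload).hexdigest()
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        # Cache miss: run the (expensive) classifier once and remember the result.
        verdict = self._classify(payload)
        self._cache[key] = verdict
        return verdict
```

A cryptographic hash only matches byte-identical content. Re-encoded or cropped images hash differently, so catching near-duplicates would need a different technique such as perceptual hashing, which this sketch does not attempt.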