System Design Interview: Design a Feature Flag System

Feature flag (feature toggle) systems allow engineers to enable or disable features at runtime without deploying code. They support gradual rollouts, A/B testing, and instant rollback. This question is asked at LinkedIn, Atlassian, Shopify, and other growth-focused companies.

Requirements Clarification

Functional Requirements

  • Create and manage feature flags with multiple targeting rules
  • Target flags by user ID, percentage rollout, country, user segment, or custom attributes
  • Evaluate flags in real-time (<1ms latency)
  • Gradual rollout: increase percentage from 0% to 100%
  • Kill switch: instantly disable a feature for all users
  • Audit log: track who changed what and when

Non-Functional Requirements

  • Scale: 1B flag evaluations/day, 10K flags, 100M users
  • Latency: <1ms for flag evaluation (must not slow down critical paths)
  • Availability: 99.99% (flag service outage should not take down main application)
  • Consistency: eventual OK (brief inconsistency during flag updates acceptable)

Core Concept: Client-Side SDK

The key architectural insight is that flag evaluation happens in the application process, not via a network call. SDKs that follow the LaunchDarkly, Unleash, and GrowthBook model work as follows:

  1. On startup, SDK fetches all flag configurations from flag service
  2. SDK caches rules in local memory
  3. Flag evaluation: pure local computation using cached rules (microseconds)
  4. SDK subscribes to streaming updates (SSE or WebSocket) for real-time rule changes
  5. Fallback: if streaming disconnects, poll every 30s

This eliminates network latency from the hot path. Flag evaluations are local dictionary lookups.
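The five steps above can be sketched as a minimal in-process cache. This is an illustrative sketch, not a real SDK's API; the class and method names (`FlagSDK`, `apply_update`, `refresh`) are assumptions:

```python
class FlagSDK:
    """Minimal in-process flag cache (illustrative sketch, not a real SDK API)."""

    def __init__(self, fetch_fn):
        self._fetch = fetch_fn        # callable returning {flag_key: config}
        self._cache = self._fetch()   # steps 1-2: fetch once, cache in memory

    def get(self, flag_key, default=None):
        # Step 3: evaluation reads local memory only -- no network on the hot path.
        return self._cache.get(flag_key, default)

    def apply_update(self, flag_key, config):
        # Step 4: a streaming-update handler swaps in the new rules.
        self._cache[flag_key] = config

    def refresh(self):
        # Step 5: fallback -- re-fetch everything if streaming disconnects.
        self._cache = self._fetch()


# Demo with an in-memory stand-in for the flag service
server_flags = {"new-checkout-flow": {"status": "ACTIVE"}}
sdk = FlagSDK(lambda: dict(server_flags))
```

In production the streaming handler would call `apply_update` on each pushed event, and a background timer would call `refresh` on the 30s polling fallback.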

Flag Data Model

Flag:
  id: string
  key: string (e.g., "new-checkout-flow")
  status: ACTIVE | INACTIVE | ARCHIVED
  variations: [{value: true}, {value: false}]  # or strings, numbers
  targeting_rules: [
    {
      condition: {attribute: "country", operator: "in", values: ["US", "CA"]},
      variation: 0  # index into variations
    },
    {
      condition: {attribute: "user_id", operator: "in_percentage", values: [0, 10]},
      variation: 0  # 10% rollout
    }
  ]
  default_variation: 1  # fallback if no rule matches
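The schema above could be instantiated as a plain dictionary, as a sketch (the `id` value is hypothetical):

```python
# A concrete instance of the flag model above (id is a made-up example)
flag = {
    "id": "flag_123",
    "key": "new-checkout-flow",
    "status": "ACTIVE",
    "variations": [{"value": True}, {"value": False}],
    "targeting_rules": [
        {"condition": {"attribute": "country", "operator": "in", "values": ["US", "CA"]},
         "variation": 0},
        {"condition": {"attribute": "user_id", "operator": "in_percentage", "values": [0, 10]},
         "variation": 0},
    ],
    "default_variation": 1,  # feature off unless a rule matches
}

# Sanity check: every rule must reference a real variation index
assert all(r["variation"] < len(flag["variations"]) for r in flag["targeting_rules"])
```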

Flag Evaluation Algorithm

def evaluate(flag_key, user_context):
    flag = local_cache[flag_key]
    if flag.status != ACTIVE:
        return flag.variations[flag.default_variation]

    # First matching rule wins; rule order is significant
    for rule in flag.targeting_rules:
        if matches_condition(rule.condition, user_context, flag_key):
            return flag.variations[rule.variation]

    return flag.variations[flag.default_variation]

def matches_condition(condition, user_context, flag_key):
    value = user_context.get(condition.attribute)
    if condition.operator == "in_percentage":
        # Deterministic bucketing: stable_hash must give the same result in
        # every process (Python's built-in hash() is salted per process)
        bucket = stable_hash(user_context["user_id"] + flag_key) % 100
        return condition.values[0] <= bucket < condition.values[1]
    if condition.operator == "in":
        return value in condition.values
    # ... other operators

Deterministic Hashing for Percentage Rollout

Use a stable hash, bucket = hash(user_id + flag_key) % 100, for bucket assignment (a seeded non-cryptographic hash such as MurmurHash, or a cryptographic digest; anything whose output varies across processes breaks stickiness). This ensures: the same user always gets the same bucket for the same flag (sticky); different flags have independent bucketing; and gradual rollout from 0-100% admits users in a predictable order, since raising the threshold only adds users, never reshuffles them. Note this is deterministic bucketing, not consistent hashing in the hash-ring sense.
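This bucketing property can be checked concretely. The sketch below uses MD5 purely for its stable, well-distributed output; the `":"` separator and helper name are assumptions:

```python
import hashlib

def bucket(user_id, flag_key):
    """Map (user, flag) to a stable bucket in [0, 100).

    MD5 is chosen only because its output is stable across processes;
    a seeded non-cryptographic hash (e.g. MurmurHash) works equally well.
    """
    digest = hashlib.md5(f"{user_id}:{flag_key}".encode()).hexdigest()
    return int(digest, 16) % 100

# A rollout at p% enables every user whose bucket is below p, so raising
# p from 10 to 20 keeps all previously enabled users enabled.
enabled_at_10 = {u for u in range(1000) if bucket(f"user-{u}", "new-checkout-flow") < 10}
enabled_at_20 = {u for u in range(1000) if bucket(f"user-{u}", "new-checkout-flow") < 20}
```

Because the enable condition is `bucket < p`, the 10% cohort is by construction a subset of the 20% cohort: rollout expansion never flips a user from on to off.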

Architecture

Engineers use UI/API to create/modify flags
    |
Flag Service (CRUD API)
    |
PostgreSQL (source of truth)
    |
Change events -> Kafka
    |
Streaming Update Service (SSE/WebSocket)
    |
SDK instances in application servers (local cache)
    |
Flag evaluation (local, microseconds)

Real-Time Updates

When a flag changes, updates propagate to all SDK instances:

  • Server-Sent Events (SSE): SDK maintains persistent HTTP connection to streaming service. On flag change, server pushes update. Simple, works through load balancers, one-directional
  • WebSocket: bi-directional, better for high-frequency updates
  • Propagation latency: <1 second for 99% of SDK instances
  • Fallback: SDK polls every 30s if streaming connection drops
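On the SDK side, consuming the SSE stream mostly means splitting frames on blank lines and reading `data:` fields. A standard-library-only sketch (the JSON payload shape is an assumption, not a real product's wire format):

```python
import json

def parse_sse(stream_text):
    """Split decoded SSE text into JSON payloads.

    Per the SSE format, events are separated by blank lines and each
    'data:' line carries one line of the payload. Event names, ids,
    and retry hints are ignored in this sketch.
    """
    events = []
    for frame in stream_text.split("\n\n"):
        data = [line[len("data:"):].strip()
                for line in frame.splitlines()
                if line.startswith("data:")]
        if data:
            events.append(json.loads("\n".join(data)))
    return events

# Two pushed flag updates as they would arrive on the wire
stream = (
    'data: {"key": "new-checkout-flow", "status": "INACTIVE"}\n'
    "\n"
    'data: {"key": "dark-mode", "status": "ACTIVE"}\n'
    "\n"
)
updates = parse_sse(stream)
```

A real client would read the HTTP response incrementally and feed each parsed payload into the SDK's local cache; if the connection drops, it falls back to the 30s poll.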

Percentage Rollout and A/B Testing

  • Gradual rollout: increase the percentage in the flag config (5% → 20% → 50% → 100%). Users in buckets 0-4 see the feature at 5%; buckets 0-19 at 20%, so already-enabled users stay enabled as the rollout widens.
  • A/B testing: run two variations simultaneously (50/50 split). Track conversion metrics per variation in analytics system.
  • Multi-variate testing: multiple variations (A/B/C/D). Each variation gets a percentage range.
  • Sticky sessions: hash-based bucketing ensures same user sees same variation consistently.
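Multi-variate assignment falls out of the same bucketing: give each variation a slice of the 0-99 range and walk the slices cumulatively. A hedged sketch (function name, `":"` separator, and MD5 choice are assumptions mirroring the rollout scheme above):

```python
import hashlib

def assign_variation(user_id, flag_key, split):
    """Deterministically assign a user to a variation.

    `split` maps variation name -> percentage; percentages must sum to 100.
    Each variation owns a contiguous slice of the 0-99 bucket range.
    """
    digest = hashlib.md5(f"{user_id}:{flag_key}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    cumulative = 0
    for name, pct in split.items():
        cumulative += pct
        if bucket < cumulative:
            return name
    raise ValueError("split percentages must sum to 100")

# 50/50 A/B split; the same user always lands in the same group
split = {"control": 50, "treatment": 50}
groups = {assign_variation(f"user-{u}", "checkout-test", split) for u in range(200)}
```

Because assignment is a pure function of (user_id, flag_key), stickiness needs no server-side session state; the analytics system only has to log which variation each user was served.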

Kill Switch and Rollback

Kill switch = set flag status to INACTIVE. Propagates via streaming to all SDKs within 1 second. All users see default variation (feature off). No deployment needed. This is the primary benefit over code-level feature gating.

Interview Tips

  • Lead with SDK caching – flag evaluation must be <1ms so it cannot be a network call
  • Explain streaming updates (SSE) for real-time propagation
  • Describe hash-based bucketing for deterministic percentage rollout
  • Know the difference between feature flags and A/B testing (flags control visibility; A/B tests measure impact)
  • Mention GrowthBook, LaunchDarkly, Unleash as real implementations

