System Design Interview: Design a Feature Flag System
Feature flag (feature toggle) systems allow engineers to enable or disable features at runtime without deploying code. They support gradual rollouts, A/B testing, and instant rollback. Asked at LinkedIn, Atlassian, Shopify, and growth-focused companies.
Requirements Clarification
Functional Requirements
- Create and manage feature flags with multiple targeting rules
- Target flags by user ID, percentage rollout, country, user segment, or custom attributes
- Evaluate flags in real-time (<1ms latency)
- Gradual rollout: increase percentage from 0% to 100%
- Kill switch: instantly disable a feature for all users
- Audit log: track who changed what and when
Non-Functional Requirements
- Scale: 1B flag evaluations/day, 10K flags, 100M users
- Latency: <1ms for flag evaluation (must not slow down critical paths)
- Availability: 99.99% (flag service outage should not take down main application)
- Consistency: eventual OK (brief inconsistency during flag updates acceptable)
Core Concept: Client-Side SDK
The key architectural insight is that flag evaluation happens inside the application process, not over the network. SDKs built on the LaunchDarkly, Unleash, and GrowthBook model work as follows:
- On startup, SDK fetches all flag configurations from flag service
- SDK caches rules in local memory
- Flag evaluation: pure local computation using cached rules (microseconds)
- SDK subscribes to streaming updates (SSE or WebSocket) for real-time rule changes
- Fallback: if streaming disconnects, poll every 30s
This eliminates network latency from the hot path: a flag evaluation is a local dictionary lookup plus in-memory rule matching.
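A minimal sketch of the SDK bootstrap under these assumptions; the /api/flags endpoint, auth scheme, and response shape are hypothetical, not any vendor's actual API:

import requests

class FlagClient:
    """In-process SDK: fetch all flag configs once, then serve them from memory."""

    def __init__(self, base_url, api_key):
        self.base_url = base_url  # flag service URL (hypothetical)
        self.api_key = api_key
        self.cache = {}           # flag_key -> flag config

    def bootstrap(self):
        # The only network call, made at startup; evaluation never hits the network
        resp = requests.get(
            f"{self.base_url}/api/flags",
            headers={"Authorization": f"Bearer {self.api_key}"},
            timeout=5,
        )
        resp.raise_for_status()
        self.cache = {flag["key"]: flag for flag in resp.json()["flags"]}

    def get_flag(self, flag_key):
        # Pure local lookup; matching against targeting rules is shown in the
        # Flag Evaluation Algorithm section below
        return self.cache.get(flag_key)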
Flag Data Model
Flag:
  id: string
  key: string (e.g., "new-checkout-flow")
  status: ACTIVE | INACTIVE | ARCHIVED
  variations: [{value: true}, {value: false}]  # or strings, numbers
  targeting_rules: [
    {
      condition: {attribute: "country", operator: "in", values: ["US", "CA"]},
      variation: 0  # index into variations
    },
    {
      condition: {attribute: "user_id", operator: "in_percentage", values: [0, 10]},
      variation: 0  # 10% rollout
    }
  ]
  default_variation: 1  # fallback if no rule matches
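For concreteness, here is the same flag written out as the Python dict that the evaluation sketch in the next section operates on (the id, key, and rule values are illustrative):

new_checkout_flag = {
    "id": "flag_123",
    "key": "new-checkout-flow",
    "status": "ACTIVE",
    "variations": [{"value": True}, {"value": False}],
    "targeting_rules": [
        {
            "condition": {"attribute": "country", "operator": "in", "values": ["US", "CA"]},
            "variation": 0,  # country match -> feature on
        },
        {
            "condition": {"attribute": "user_id", "operator": "in_percentage", "values": [0, 10]},
            "variation": 0,  # 10% rollout -> feature on
        },
    ],
    "default_variation": 1,  # no rule matched -> feature off
}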
Flag Evaluation Algorithm
import hashlib

# local_cache is the SDK's in-memory copy of all flag configs, keyed by flag key
local_cache = {}

def stable_bucket(user_id, flag_key):
    # Python's built-in hash() is salted per process, so use a stable digest instead
    digest = hashlib.md5(f"{user_id}:{flag_key}".encode()).hexdigest()
    return int(digest, 16) % 100

def evaluate(flag_key, user_context):
    flag = local_cache[flag_key]
    if flag["status"] != "ACTIVE":
        return flag["variations"][flag["default_variation"]]
    for rule in flag["targeting_rules"]:
        if matches_condition(rule["condition"], user_context, flag_key):
            return flag["variations"][rule["variation"]]
    return flag["variations"][flag["default_variation"]]

def matches_condition(condition, user_context, flag_key):
    value = user_context.get(condition["attribute"])
    if condition["operator"] == "in_percentage":
        # Deterministic bucketing: same user + same flag -> same bucket
        bucket = stable_bucket(user_context["user_id"], flag_key)
        return condition["values"][0] <= bucket < condition["values"][1]
    if condition["operator"] == "in":
        return value in condition["values"]
    # ... other operators (equals, greater_than, segment membership)
    return False
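Wiring the sample flag from the data model section into the cache (identifiers hypothetical):

local_cache["new-checkout-flow"] = new_checkout_flag
user = {"user_id": "user-42", "country": "US"}
print(evaluate("new-checkout-flow", user))  # {'value': True} - matched the country rule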
Consistent Hashing for Percentage Rollout
Use a stable hash of user_id + flag_key, modulo 100, for bucket assignment (a stable digest such as MD5 or MurmurHash, not a per-process hash function). This ensures that the same user always gets the same bucket for the same flag (sticky), that different flags bucket users independently, and that a gradual rollout from 0% to 100% enables users in a fixed, predictable order.
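A small check of that rollout property, reusing the stable_bucket helper from the evaluation sketch above: raising the upper bound of the percentage range only adds users, it never drops anyone who was already enabled.

def in_rollout(user_id, flag_key, percentage):
    # A user is enabled when their bucket falls below the rollout percentage
    return stable_bucket(user_id, flag_key) < percentage

users = [f"user-{i}" for i in range(1000)]
at_10 = {u for u in users if in_rollout(u, "new-checkout-flow", 10)}
at_50 = {u for u in users if in_rollout(u, "new-checkout-flow", 50)}
assert at_10 <= at_50  # everyone enabled at 10% is still enabled at 50%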
Architecture
Engineers use UI/API to create/modify flags
|
Flag Service (CRUD API)
|
PostgreSQL (source of truth)
|
Change events -> Kafka
|
Streaming Update Service (SSE/WebSocket)
|
SDK instances in application servers (local cache)
|
Flag evaluation (local, microseconds)
Real-Time Updates
When a flag changes, updates propagate to all SDK instances:
- Server-Sent Events (SSE): SDK maintains persistent HTTP connection to streaming service. On flag change, server pushes update. Simple, works through load balancers, one-directional
- WebSocket: bi-directional, better for high-frequency updates
- Propagation latency: <1 second for 99% of SDK instances
- Fallback: if the streaming connection drops, the SDK falls back to polling every 30s (see the update-loop sketch after this list)
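A rough sketch of that update loop, assuming the FlagClient from the SDK sketch above and a hypothetical /api/stream SSE endpoint that emits one JSON-encoded flag config per "data:" line; a production SDK would add reconnect backoff and event IDs for resuming:

import json
import time

import requests

POLL_INTERVAL_SECONDS = 30

def stream_updates(client):
    # Persistent SSE connection: each "data:" line carries one updated flag config
    resp = requests.get(f"{client.base_url}/api/stream", stream=True, timeout=(5, None))
    for line in resp.iter_lines(decode_unicode=True):
        if line and line.startswith("data:"):
            flag = json.loads(line[len("data:"):])
            client.cache[flag["key"]] = flag  # apply the change to the local cache

def run_update_loop(client):
    while True:
        try:
            stream_updates(client)
        except requests.RequestException:
            # Streaming dropped: fall back to periodic full refreshes
            time.sleep(POLL_INTERVAL_SECONDS)
            try:
                client.bootstrap()  # re-fetch all flag configs
            except requests.RequestException:
                pass  # keep serving from the (possibly stale) cache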
Percentage Rollout and A/B Testing
- Gradual rollout: increase the percentage in the flag config (5% → 20% → 50% → 100%). Users in buckets 0-4 see the feature at 5%, buckets 0-19 at 20%, so everyone already enabled stays enabled as the rollout widens.
- A/B testing: run two variations simultaneously (50/50 split). Track conversion metrics per variation in analytics system.
- Multi-variate testing: multiple variations (A/B/C/D), each assigned its own percentage range (see the sketch after this list)
- Sticky sessions: hash-based bucketing ensures same user sees same variation consistently.
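One way to extend the same bucketing to an A/B/n split, reusing the stable_bucket helper from the evaluation sketch; each variation owns a contiguous slice of the 100 buckets (names and percentages illustrative):

def assign_variation(user_id, flag_key, split):
    # split: list of (variation_name, percentage) pairs summing to 100,
    # e.g. [("A", 50), ("B", 50)] or [("A", 25), ("B", 25), ("C", 25), ("D", 25)]
    bucket = stable_bucket(user_id, flag_key)
    lower = 0
    for name, pct in split:
        if lower <= bucket < lower + pct:
            return name
        lower += pct
    return split[-1][0]  # rounding guard; buckets run 0-99, so normally unreachable

print(assign_variation("user-42", "checkout-experiment", [("A", 50), ("B", 50)]))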
Kill Switch and Rollback
Kill switch = set flag status to INACTIVE. Propagates via streaming to all SDKs within 1 second. All users see default variation (feature off). No deployment needed. This is the primary benefit over code-level feature gating.
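In terms of the sketches above, the kill switch is nothing more than a status change pushed through the streaming channel and applied to every local cache:

new_checkout_flag["status"] = "INACTIVE"  # change pushed to all SDKs via SSE/polling
print(evaluate("new-checkout-flow", user))  # {'value': False} - default variation, feature off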
Interview Tips
- Lead with SDK caching: flag evaluation must be <1ms, so it cannot be a network call
- Explain streaming updates (SSE) for real-time propagation
- Describe hash-based bucketing for deterministic percentage rollout
- Know the difference between feature flags and A/B testing (flags control visibility; A/B tests measure impact)
- Mention GrowthBook, LaunchDarkly, Unleash as real implementations
Companies that ask this: Netflix, Stripe, Twitter/X, Shopify, Atlassian, LinkedIn