System Design Interview: Design a Subscription Billing System

What Is a Subscription Billing System?

A subscription billing system automates recurring charges for SaaS products, streaming services, and membership platforms. It handles plan management, trial periods, proration, dunning (failed payment retries), and invoicing. Examples: Stripe Billing, Recurly, Chargebee. Core challenges: idempotent charge processing, clock-based subscription state machines, and handling partial billing periods.

System Requirements

    Functional

    • Create/upgrade/downgrade/cancel subscriptions
    • Charge on billing cycle (monthly/annual)
    • Prorate mid-cycle plan changes
    • Trial periods with automatic conversion
    • Dunning: retry failed payments 1, 3, 7 days after failure
    • Generate invoices and send receipts

    Non-Functional

    • Exactly-once charge guarantee (never double-charge)
    • 100M subscriptions, 5M daily billing events
    • 99.99% uptime during billing cycles

    Core Data Model

    plans: id, name, price_cents, interval(month/year), trial_days, features
    subscriptions: id, customer_id, plan_id, status, current_period_start,
                   current_period_end, trial_end, cancel_at_period_end
    invoices: id, subscription_id, amount_due, amount_paid, status, due_date
    invoice_items: id, invoice_id, description, amount, period_start, period_end
    payment_methods: id, customer_id, stripe_pm_id, is_default
    charges: id, invoice_id, amount, status, idempotency_key, created_at
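
    One way the charges table could back the never-double-charge requirement is a UNIQUE constraint on idempotency_key. The sketch below assumes SQLite and a trimmed-down column set; the DDL and the record_charge helper are hypothetical illustrations, not part of the model above.

```python
import sqlite3

# Hypothetical DDL: only the columns needed to show the constraint.
SCHEMA = """
CREATE TABLE charges (
    id INTEGER PRIMARY KEY,
    invoice_id INTEGER NOT NULL,
    amount INTEGER NOT NULL,
    status TEXT NOT NULL,
    idempotency_key TEXT NOT NULL UNIQUE,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
"""


def record_charge(conn, invoice_id, amount_cents, key):
    """Insert a pending charge row; return False if the key was already used."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO charges (invoice_id, amount, status, idempotency_key) "
                "VALUES (?, ?, 'pending', ?)",
                (invoice_id, amount_cents, key),
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
first = record_charge(conn, 42, 1000, "charge-inv-42")
second = record_charge(conn, 42, 1000, "charge-inv-42")  # duplicate key
row_count = conn.execute("SELECT COUNT(*) FROM charges").fetchone()[0]
```

    With this constraint, a worker that crashes and reruns cannot create a second charge row for the same invoice; the duplicate insert is rejected by the database rather than by application logic.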
    

    Subscription State Machine

    trialing ──trial_end──► active ──payment_fail──► past_due ──retries_exhausted──► canceled
                 │                                                                        │
                 └──cancel_now──────────────────────────────────────────────────────► canceled
    active ──cancel_at_period_end──► (stays active until period_end) ──► canceled
    active ──upgrade──► active (new plan, prorate immediately)
    past_due ──retry_succeeded──► active
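
    The diagram can be encoded as a lookup table so that illegal transitions fail loudly. This is a minimal sketch: the event names are illustrative labels, and allowing cancel_now from both trialing and active is an assumption about how the diagram should be read.

```python
# Allowed transitions, keyed by (current_status, event).
TRANSITIONS = {
    ("trialing", "trial_end"): "active",
    ("trialing", "cancel_now"): "canceled",
    ("active", "payment_fail"): "past_due",
    ("active", "cancel_now"): "canceled",
    ("active", "upgrade"): "active",
    # Reached when cancel_at_period_end is set and the period runs out.
    ("active", "period_end_with_cancel_flag"): "canceled",
    # A dunning retry went through.
    ("past_due", "retry_succeeded"): "active",
    ("past_due", "retries_exhausted"): "canceled",
}


def apply_event(status, event):
    """Return the next status, or raise ValueError for an illegal transition."""
    try:
        return TRANSITIONS[(status, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {status} on {event}") from None
```

    Because canceled has no outgoing entries, any event applied to a canceled subscription raises instead of silently reviving it.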
    

    Billing Engine: The Core Loop

    A daily cron job queries subscriptions where current_period_end <= NOW() and status = active. For each:

    1. Create invoice with period start/end
    2. Add invoice items (subscription fee, usage charges)
    3. Attempt charge with idempotency_key = "charge-inv-" + invoice_id
    4. If success: advance current_period_start/end, keep status active
    5. If failure: set status = past_due, schedule dunning retries
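
    The five steps above can be sketched as a single pass over in-memory objects. Everything here is a simplification under stated assumptions: FakeGateway stands in for a real processor, invoices are a dict keyed by (subscription id, period start) so that a rerun reuses the same invoice, the period is advanced by its own length rather than by calendar month, and scheduling dunning retries is left out.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Subscription:
    id: int
    price_cents: int
    status: str
    current_period_start: date
    current_period_end: date


class FakeGateway:
    """Hypothetical processor stub that remembers results by idempotency key."""

    def __init__(self, failing_keys=()):
        self.results = {}
        self.failing_keys = set(failing_keys)

    def charge(self, amount_cents, idempotency_key):
        # A repeated key returns the stored result instead of charging again.
        if idempotency_key not in self.results:
            self.results[idempotency_key] = idempotency_key not in self.failing_keys
        return self.results[idempotency_key]


def run_billing(subs, invoices, gateway, today):
    """One pass of the loop; `invoices` maps (sub_id, period_start) -> invoice id."""
    for sub in subs:
        if sub.status != "active" or sub.current_period_end > today:
            continue
        length = sub.current_period_end - sub.current_period_start
        next_start = sub.current_period_end
        # Steps 1-2: get-or-create the invoice for the upcoming period.
        invoice_id = invoices.setdefault((sub.id, next_start), len(invoices) + 1)
        # Step 3: the key is derived from the invoice, so retries reuse it.
        ok = gateway.charge(sub.price_cents, f"charge-inv-{invoice_id}")
        if ok:
            # Step 4: advance the period; status stays active.
            sub.current_period_start = next_start
            sub.current_period_end = next_start + length
        else:
            # Step 5: hand the subscription over to dunning.
            sub.status = "past_due"
```

    The get-or-create step matters: if the job crashed after creating an invoice but before charging, a rerun that created a fresh invoice would also produce a fresh idempotency key and could charge twice.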

    Idempotent Charges

    Every charge request sent to the payment processor carries an idempotency key. With Stripe, pass it in the Idempotency-Key header (the Python library exposes this as the idempotency_key argument). If the network times out and you retry with the same key, Stripe recognizes the duplicate and returns the original result instead of charging again. Derive the key from the invoice ID (or subscription ID plus billing period): something deterministic that stays stable across retries.

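    import stripe

    # Retrying with the same key returns the original result, not a new charge.
    response = stripe.Charge.create(
        amount=amount_cents,
        currency="usd",  # assumed currency; required by the API
        customer=stripe_customer_id,
        idempotency_key=f"charge-inv-{invoice_id}",
    )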
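
    When no invoice ID is available yet, the key can be built from the subscription ID and billing period instead. The sketch below shows that derivation plus a retry wrapper that reuses one key across timeouts; both function names and the charge_fn callback are hypothetical, and the callback is assumed to raise TimeoutError on a network timeout.

```python
from datetime import date


def charge_idempotency_key(subscription_id, period_start):
    """Build a key that is identical for every attempt at the same period."""
    return f"charge-sub-{subscription_id}-{period_start.isoformat()}"


def charge_with_retry(charge_fn, key, attempts=3):
    """Call charge_fn(key) until it returns, passing the same key each time."""
    last_error = None
    for _ in range(attempts):
        try:
            return charge_fn(key)
        except TimeoutError as exc:
            # The charge may or may not have gone through; only a retry
            # with the same key is safe.
            last_error = exc
    raise last_error
```

    Generating a random key inside the retry loop would defeat the purpose, since each attempt would look like a new charge to the processor.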
    

    Proration

    When a user upgrades mid-cycle from Plan A ($10/month) to Plan B ($20/month) on day 15 of a 30-day period, 15 days remain. Credit for the unused Plan A time = $10 * (15/30) = $5. Charge for Plan B over the remaining period = $20 * (15/30) = $10. Net charge = $10 - $5 = $5. Add these as invoice_items: one credit item (negative) and one charge item (positive). The invoice then shows the full math while netting to the correct amount.
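
    The same arithmetic in integer cents, as a minimal sketch. The function name is hypothetical, and rounding down with floor division is an assumption; a real system would need an explicit rounding policy.

```python
def prorate_upgrade(old_price_cents, new_price_cents, days_remaining, days_in_period):
    """Return (credit_cents, charge_cents, net_cents) for a mid-cycle plan change."""
    if days_in_period <= 0 or not 0 <= days_remaining <= days_in_period:
        raise ValueError("days_remaining must be within the billing period")
    # Credit for the unused part of the old plan (negative invoice item).
    credit = -(old_price_cents * days_remaining // days_in_period)
    # Charge for the new plan over the same remaining days (positive item).
    charge = new_price_cents * days_remaining // days_in_period
    return credit, charge, credit + charge


# The $10 -> $20 upgrade with 15 of 30 days remaining:
print(prorate_upgrade(1000, 2000, 15, 30))  # (-500, 1000, 500)
```

    Working in cents with integers avoids floating-point drift on money amounts.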

    Dunning — Failed Payment Recovery

    Dunning is the process of retrying failed payments and notifying customers. Schedule:

    • Day 0: payment fails → mark past_due, send “payment failed” email
    • Day 1: retry charge, send reminder if still failed
    • Day 3: retry, escalate email
    • Day 7: final retry, send “subscription will be canceled in 3 days”
    • Day 10: cancel subscription, send cancellation confirmation

    Use a dunning_attempts table to track retry count and next_retry_at. A separate dunning worker queries for past_due subscriptions where next_retry_at <= NOW() and retries them.
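
    A dunning worker needs to know what to do next and when. The sketch below computes that from the failure time and the number of retries already made, using the day 1/3/7 retry and day 10 cancellation schedule above; the function and constant names are hypothetical.

```python
from datetime import datetime, timedelta

RETRY_OFFSETS_DAYS = [1, 3, 7]  # days after the initial failure
CANCEL_AFTER_DAYS = 10


def next_dunning_step(failed_at, attempts_made):
    """Return (action, when) given how many retries have already run."""
    if attempts_made < len(RETRY_OFFSETS_DAYS):
        offset = RETRY_OFFSETS_DAYS[attempts_made]
        return "retry", failed_at + timedelta(days=offset)
    # All retries used: the only step left is cancellation.
    return "cancel", failed_at + timedelta(days=CANCEL_AFTER_DAYS)
```

    The returned time is what would be stored in next_retry_at; offsets are measured from the original failure, not from the previous retry, so the schedule does not drift.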

    Scaling Billing

    • Spread billing events over a window instead of firing them all at midnight: add up to 4 hours of jitter to current_period_end
    • Partition subscriptions table by billing_anchor_day for efficient batch queries
    • Use async queues (SQS/Kafka) for invoice generation and email sending — decouple from charge processing
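
    Jitter is easier to reason about when it is deterministic per subscription, so a customer is charged at the same offset every cycle. A minimal sketch, assuming the offset is derived by hashing the subscription ID (the function names are hypothetical, and this computes a charge time rather than rewriting current_period_end):

```python
import hashlib
from datetime import datetime, timedelta

MAX_JITTER_SECONDS = 4 * 60 * 60  # the 4-hour window from the list above


def jitter_for(subscription_id):
    """Map a subscription ID to a stable offset in [0, 4 hours)."""
    digest = hashlib.sha256(str(subscription_id).encode()).digest()
    seconds = int.from_bytes(digest[:4], "big") % MAX_JITTER_SECONDS
    return timedelta(seconds=seconds)


def scheduled_charge_time(period_end, subscription_id):
    """When to attempt the charge for a period ending at `period_end`."""
    return period_end + jitter_for(subscription_id)
```

    A hash is used instead of random.random() so that reruns and retries compute the same time without storing the offset.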

    Interview Tips

    • Idempotency key is the single most important reliability concept — say it first.
    • Draw the state machine before writing any DB schema.
    • Proration formula: credit = plan_price * (days_remaining / days_in_period).
    • Dunning shows you understand the full product, not just the happy path.