Low-Level Design: Payment Processor — Idempotency, State Machine, and Retry Handling

Core Entities

PaymentIntent: intent_id (UUID), merchant_id, customer_id, amount (integer cents), currency, status (CREATED, PROCESSING, REQUIRES_ACTION, SUCCEEDED, FAILED, CANCELLED, REFUNDED), payment_method_id, idempotency_key, created_at, updated_at, metadata (JSON).

PaymentMethod: method_id, customer_id, type (CARD, BANK_TRANSFER, WALLET), card_fingerprint, last4, expiry_month, expiry_year, billing_address_id, is_default.

Charge: charge_id, intent_id, processor (STRIPE, ADYEN, BRAINTREE), processor_charge_id, amount, status, created_at, processor_response (JSON).

Refund: refund_id, charge_id, amount, reason, status (PENDING, SUCCEEDED, FAILED), created_at.

WebhookEvent: event_id, source, event_type, payload (JSON), received_at, processed_at, status (PENDING, PROCESSED, FAILED).
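
As a rough illustration of the PaymentIntent entity, the sketch below models it as a Python dataclass. The field names and status values come from the list above; the concrete types (str identifiers, dict metadata), the defaults, and the positive-amount check are assumptions made for the example, not part of the original design.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from uuid import UUID, uuid4


class IntentStatus(Enum):
    CREATED = "CREATED"
    PROCESSING = "PROCESSING"
    REQUIRES_ACTION = "REQUIRES_ACTION"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"
    CANCELLED = "CANCELLED"
    REFUNDED = "REFUNDED"


def _utc_now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class PaymentIntent:
    # Identifier types are assumed to be strings for this sketch.
    merchant_id: str
    customer_id: str
    amount: int  # integer cents, never a float
    currency: str
    payment_method_id: str
    idempotency_key: str
    intent_id: UUID = field(default_factory=uuid4)
    status: IntentStatus = IntentStatus.CREATED
    created_at: datetime = field(default_factory=_utc_now)
    updated_at: datetime = field(default_factory=_utc_now)
    metadata: dict = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject floats (and bools) so rounding errors cannot enter the amount.
        if isinstance(self.amount, bool) or not isinstance(self.amount, int):
            raise TypeError("amount must be an integer number of cents")
        # Assumption for this sketch: zero or negative amounts are invalid.
        if self.amount <= 0:
            raise ValueError("amount must be positive")
```

Storing the amount as integer cents keeps arithmetic exact; the type check simply makes that rule fail fast instead of relying on convention.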

Idempotency

Idempotency prevents double charges when clients retry on network failure. Every payment creation request must include an idempotency_key (client-generated UUID). Server logic:

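from uuid import uuid4

IDEMPOTENCY_TTL_SECONDS = 24 * 60 * 60  # 24 hours


class PaymentService:
    def create_intent(self, request: CreateIntentRequest) -> PaymentIntent:
        # Keys are scoped per merchant, so every lookup includes merchant_id.
        # (Assumes CreateIntentRequest carries merchant_id.)
        cache_key = (request.merchant_id, request.idempotency_key)

        # Fast path: check the idempotency cache first
        cached = self.idempotency_store.get(cache_key)
        if cached:
            return cached  # Return exact same response as original

        with self.db.transaction():
            # Re-check inside the transaction. A transaction alone takes no
            # lock, so this is only race-free if the database enforces a
            # unique constraint on (merchant_id, idempotency_key); the losing
            # writer then fails on save and can re-read the winner's row.
            existing = self.repo.find_by_idempotency_key(
                request.merchant_id, request.idempotency_key
            )
            if existing:
                return existing

            intent = PaymentIntent(
                intent_id=uuid4(),
                merchant_id=request.merchant_id,
                idempotency_key=request.idempotency_key,
                amount=request.amount,
                status=IntentStatus.CREATED,
                # ... (remaining fields elided)
            )
            self.repo.save(intent)

        # Cache the result for 24 hours
        self.idempotency_store.set(
            cache_key, intent, ttl=IDEMPOTENCY_TTL_SECONDS
        )
        return intent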

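Idempotency key uniqueness is scoped per merchant: the same key used by two different merchants is valid because each merchant has its own namespace. This is typically enforced with a unique constraint on (merchant_id, idempotency_key). When a key is reused, validate that the request body matches the original request. If the body differs, return 422 Unprocessable Entity, because that is a client error rather than a retry.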
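
One way to implement the body check is to store a hash of the original request next to the cached response. The sketch below assumes an in-memory store and JSON-serializable request bodies; `InMemoryIdempotencyStore`, `IdempotencyConflict`, and `request_fingerprint` are hypothetical names introduced for illustration only.

```python
import hashlib
import json


class IdempotencyConflict(Exception):
    """Key reused with a different body; map to HTTP 422 at the API layer."""


def request_fingerprint(body: dict) -> str:
    # Canonical JSON (sorted keys, fixed separators) so that the key order
    # in the client's payload does not change the hash.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class InMemoryIdempotencyStore:
    """Hypothetical store; a real one would live in Redis or the database."""

    def __init__(self) -> None:
        # (merchant_id, idempotency_key) -> (fingerprint, response)
        self._records = {}

    def check(self, merchant_id: str, key: str, body: dict):
        """Return the stored response for a true retry, or None for a new key."""
        record = self._records.get((merchant_id, key))
        if record is None:
            return None
        fingerprint, response = record
        if fingerprint != request_fingerprint(body):
            raise IdempotencyConflict(f"key {key!r} reused with a different body")
        return response

    def record(self, merchant_id: str, key: str, body: dict, response) -> None:
        self._records[(merchant_id, key)] = (request_fingerprint(body), response)
```

Because the lookup key is the (merchant_id, idempotency_key) pair, two merchants that happen to generate the same key never see each other's responses.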

Payment State Machine

Transitions:

CREATED → PROCESSING (charge attempt started)
PROCESSING → SUCCEEDED (charge confirmed) or FAILED (charge declined)
PROCESSING → REQUIRES_ACTION (3DS authentication needed) → PROCESSING (customer completes 3DS)
CREATED → CANCELLED (cancelled before processing)
SUCCEEDED → REFUNDED (full refund)

Each transition is persisted atomically with a conditional update: UPDATE payment_intents SET status=:new, updated_at=NOW() WHERE intent_id=:id AND status=:expected. If rows_affected=0, the row was no longer in the expected state, which means a concurrent update happened: re-read the intent and either retry or return a conflict. All transitions are logged in an events table for audit.

State machine validation: define VALID_TRANSITIONS = {CREATED: [PROCESSING, CANCELLED], PROCESSING: [SUCCEEDED, FAILED, REQUIRES_ACTION], …} and check it before any state update. Invalid transitions are rejected with a 409 Conflict response.
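
The validation and the conditional update can be combined in one helper. The following is a minimal sketch using SQLite from the standard library, so CURRENT_TIMESTAMP stands in for NOW(). The transition map contains only the transitions named in this section; the `intent_events` table name, its columns, and the exception classes are assumptions made for the example.

```python
import sqlite3

# Only the transitions spelled out in the text; a real system may need more.
VALID_TRANSITIONS = {
    "CREATED": {"PROCESSING", "CANCELLED"},
    "PROCESSING": {"SUCCEEDED", "FAILED", "REQUIRES_ACTION"},
    "REQUIRES_ACTION": {"PROCESSING"},
    "SUCCEEDED": {"REFUNDED"},
}


class InvalidTransition(Exception):
    """The state machine does not allow this move; map to HTTP 409."""


class ConcurrentUpdate(Exception):
    """The row was not in the expected state when the UPDATE ran."""


def transition(conn: sqlite3.Connection, intent_id: str,
               expected: str, new: str) -> None:
    # 1. Validate against the state machine before touching the database.
    if new not in VALID_TRANSITIONS.get(expected, set()):
        raise InvalidTransition(f"{expected} -> {new}")

    # 2. Compare-and-set: the WHERE clause only matches if nobody else
    #    changed the status since the caller read it.
    with conn:  # commits on success, rolls back if an exception escapes
        cur = conn.execute(
            "UPDATE payment_intents "
            "SET status = ?, updated_at = CURRENT_TIMESTAMP "
            "WHERE intent_id = ? AND status = ?",
            (new, intent_id, expected),
        )
        if cur.rowcount == 0:
            raise ConcurrentUpdate(intent_id)
        # 3. Audit log entry in the same transaction as the status change.
        conn.execute(
            "INSERT INTO intent_events (intent_id, from_status, to_status) "
            "VALUES (?, ?, ?)",
            (intent_id, expected, new),
        )
```

Writing the audit row inside the same transaction means the events table can never disagree with the status column, and a stale caller gets ConcurrentUpdate instead of silently overwriting a newer state.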

Retry Logic and Processor Failover

Transient failures: network timeouts, processor rate limits (HTTP 429), temporary unavailability. Retry these with exponential backoff and jitter: first retry after 1s, second after 2s, third after 4s, doubling each time, up to 5 retries. Jitter adds ±20% random variance to each delay to prevent a thundering herd (all retries hitting the processor simultaneously).

Non-retriable failures: card declined (insufficient funds, invalid card number). Do not retry; return FAILED immediately.

Processor failover: maintain two processors, a primary and a backup (e.g., Stripe primary, Adyen backup). If the primary fails 3 consecutive times for a single intent, fail over to the backup. Track failover events in the Charge table.

Reconciliation: a request that timed out may still have succeeded at the primary. After a processor failover, reconcile the charge status with the primary processor once it recovers to detect such charges.

Dead letter queue: payments that still fail after all retries are placed in a DLQ for manual review and customer notification.
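
The retry and failover rules above can be sketched as a single loop. In this example the processors are plain callables that raise `TransientError` or `DeclinedError`; those exception classes, the function names, and the "DEAD_LETTER" return value (a signal for the caller to enqueue the payment in the DLQ) are all assumptions made for illustration, not any real processor SDK.

```python
import random
import time


class TransientError(Exception):
    """Timeout, HTTP 429, or temporary unavailability: safe to retry."""


class DeclinedError(Exception):
    """Card declined: never retried."""


def backoff_delay(attempt: int, base: float = 1.0, jitter: float = 0.2,
                  rng=random.random) -> float:
    """Delay before retry number `attempt` (1-based): 1s, 2s, 4s, ... ±20%."""
    delay = base * 2 ** (attempt - 1)
    # rng() is in [0, 1), so the factor falls in [1 - jitter, 1 + jitter).
    return delay * (1 + jitter * (2 * rng() - 1))


def charge_with_failover(intent, primary, backup, max_retries: int = 5,
                         failover_after: int = 3, sleep=time.sleep):
    """Return (outcome, failed_over).

    outcome is "SUCCEEDED", "FAILED" (declined), or "DEAD_LETTER"
    (all retries exhausted; the caller should enqueue for manual review).
    """
    processor, consecutive_failures, failed_over = primary, 0, False
    # One initial attempt plus up to `max_retries` retries.
    for attempt in range(max_retries + 1):
        if attempt > 0:
            sleep(backoff_delay(attempt))
        try:
            # A real call should pass the same idempotency key on every
            # attempt so the processor can deduplicate retried requests.
            processor(intent)
            return "SUCCEEDED", failed_over
        except DeclinedError:
            return "FAILED", failed_over  # non-retriable
        except TransientError:
            consecutive_failures += 1
            if not failed_over and consecutive_failures >= failover_after:
                processor, failed_over = backup, True
    return "DEAD_LETTER", failed_over
```

The `sleep` and `rng` parameters are injectable so the loop can be tested without real waiting. When `failed_over` is true, the caller would record the failover on the Charge row and schedule reconciliation with the primary.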
