System Design Interview: Design a Task Scheduling System (Cron/Airflow)

What Is a Task Scheduling System?

A task scheduling system triggers jobs at specified times or on recurring schedules. Examples: Unix cron, Apache Airflow, AWS EventBridge Scheduler, Celery Beat. Core challenges: exactly-once execution (no job fires twice), reliability under scheduler crashes, distributed clock synchronization, and handling long-running vs short tasks differently.

    System Requirements

    Functional

    • Schedule jobs: one-time (fire at T) and recurring (cron expression)
    • Execute jobs reliably: at-least-once, with idempotency for exactly-once
    • Monitor: job history, success/failure, execution duration
    • Priority queues: urgent jobs run before lower-priority ones
    • Dependencies: Job B starts only after Job A completes (DAG)

    Non-Functional

    • 100M scheduled jobs, 10K executions/second
    • Job trigger latency: <1 second from scheduled time
    • 99.99% reliability (no missed jobs)

    Core Data Model

    jobs: id, name, schedule(cron/timestamp), handler, payload,
          status(active/paused/deleted), max_retries, timeout_seconds
    executions: id, job_id, status(pending/running/success/failed),
                started_at, completed_at, worker_id, attempt_number
    
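
    The model above could be expressed as DDL. Here is a minimal SQLite sketch; the column types, defaults, and the index are illustrative assumptions, with a next_run_time column added since the scheduler needs it for due-job polling:

```python
import sqlite3

# Minimal SQLite sketch of the data model; types, defaults, and the index
# are illustrative assumptions, not a production schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE jobs (
    id              INTEGER PRIMARY KEY,
    name            TEXT NOT NULL,
    schedule        TEXT NOT NULL,                   -- cron expression or one-shot timestamp
    handler         TEXT NOT NULL,
    payload         TEXT,
    status          TEXT NOT NULL DEFAULT 'active',  -- active / paused / deleted
    max_retries     INTEGER NOT NULL DEFAULT 3,
    timeout_seconds INTEGER NOT NULL DEFAULT 300,
    next_run_time   INTEGER                          -- unix ts, recomputed after each run
);
CREATE TABLE executions (
    id             INTEGER PRIMARY KEY,
    job_id         INTEGER NOT NULL REFERENCES jobs(id),
    status         TEXT NOT NULL DEFAULT 'pending',  -- pending / running / success / failed
    started_at     INTEGER,
    completed_at   INTEGER,
    worker_id      TEXT,
    attempt_number INTEGER NOT NULL DEFAULT 1
);
-- supports the "jobs due within next N seconds" polling query
CREATE INDEX idx_jobs_due ON jobs(status, next_run_time);
""")
```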

    Scheduling Architecture

    Scheduler Service ──polls DB──► jobs due within next 30s
            │
       Acquires distributed lock (per job)
            │
       Creates execution record + publishes to Job Queue (Kafka/SQS)
            │
    Worker Pool ──consumes──► executes handler ──► updates execution status
    

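    The worker side of this flow can be sketched end-to-end. This stand-in uses an in-process queue.Queue in place of Kafka/SQS and a dict in place of the executions table; all names here are illustrative:

```python
import queue

def worker_loop(job_queue, executions, handlers, worker_id="w1"):
    """Minimal worker: consume an execution, run its handler, record the outcome.
    job_queue stands in for Kafka/SQS; executions stands in for the executions table."""
    while True:
        try:
            exec_id, job_name, payload = job_queue.get(timeout=0.1)
        except queue.Empty:
            return  # a real worker would block forever; here we drain and exit
        executions[exec_id] = {"status": "running", "worker_id": worker_id}
        try:
            handlers[job_name](payload)
            executions[exec_id]["status"] = "success"
        except Exception:
            executions[exec_id]["status"] = "failed"  # retry/backoff logic takes over here

def fail(payload):
    raise RuntimeError("boom")

q = queue.Queue()
execs = {}
q.put((1, "send_email", {"to": "a@example.com"}))
q.put((2, "bad_job", {}))
worker_loop(q, execs, {"send_email": lambda p: None, "bad_job": fail})
print(execs[1]["status"], execs[2]["status"])  # success failed
```

    Because the worker records status transitions rather than deleting the execution record, a watchdog can later detect executions stuck in "running" past their timeout.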
    Polling vs Event-Driven Trigger

    Polling: the scheduler queries for jobs due in the next N seconds and runs every N/2 seconds. Simple, and scales to millions of jobs. At 100M jobs, partition the jobs table by next_run_time bucket so each scheduler instance owns a range.

    Event-driven (time-wheel or delay queue): store jobs in a sorted set (Redis ZADD with score = next_run_unix_timestamp) and poll with ZRANGEBYSCORE min max, where max = now plus a small lookahead of a few seconds. Lower latency, and handles high volumes efficiently.

    # Redis time-wheel approach
    # Insertion (client creating the job): ZADD scheduled_jobs {next_run_unix} job_id
    # Scheduler loop (every 1 second):
    due = redis.zrangebyscore('scheduled_jobs', 0, time.time() + 1)
    for job_id in due:
        redis.zrem('scheduled_jobs', job_id)  # make fetch+remove atomic with a Lua script in production
        enqueue(job_id)                       # publish to the job queue (Kafka/SQS)
        if is_recurring(job_id):              # recurring: re-insert with the next occurrence
            redis.zadd('scheduled_jobs', {job_id: next_run_time(job_id)})
    

    Exactly-Once Execution

    The scheduler may crash after enqueuing a job but before recording it as enqueued — causing double execution on restart. Solution: use a distributed lock per job (Redis SET NX EX) before enqueuing. Only the scheduler holding the lock enqueues the job. Lock TTL = job execution timeout + buffer. If the worker crashes mid-execution: the execution record stays in “running” state. A watchdog process detects executions running longer than timeout_seconds and marks them as failed, releasing the lock for retry.
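
    The SET NX EX pattern can be sketched with an in-memory stand-in (FakeRedis, try_enqueue, and the lock-key format are illustrative, not from any library); in production you would call redis-py's set(key, token, nx=True, ex=ttl) and release via a Lua compare-and-delete so only the lock holder can release:

```python
import time
import uuid

class FakeRedis:
    """In-memory stand-in for Redis SET NX EX plus compare-and-delete release."""
    def __init__(self):
        self.store = {}  # key -> (value, expires_at)

    def set_nx_ex(self, key, value, ttl):
        now = time.monotonic()
        entry = self.store.get(key)
        if entry and entry[1] > now:
            return False                     # lock already held and not expired
        self.store[key] = (value, now + ttl)
        return True

    def release(self, key, value):
        entry = self.store.get(key)
        if entry and entry[0] == value:      # only the token holder may release
            del self.store[key]
            return True
        return False

def try_enqueue(r, job_id, ttl=60):
    """Acquire the per-job lock; only the holder creates the execution and enqueues."""
    token = str(uuid.uuid4())
    if not r.set_nx_ex(f"lock:job:{job_id}", token, ttl):
        return None                          # another scheduler instance owns this job
    # ... create execution record + publish to the job queue ...
    return token                             # kept so the lock can be released afterwards

r = FakeRedis()
t1 = try_enqueue(r, 42)   # first scheduler wins the lock
t2 = try_enqueue(r, 42)   # second scheduler is refused -> no double enqueue
```

    The random token matters: releasing by comparing the token prevents a scheduler whose lock already expired from deleting a lock now held by someone else.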

    Cron Expression Parsing

    * * * * *    every minute
    0 9 * * MON  every Monday at 9am
    0 */4 * * *  every 4 hours
    

    Parse cron expressions to compute next_run_time. Use a library (croniter in Python, cron-parser in Node). Store next_run_time in the DB; after execution, compute and update the next occurrence.
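
    As a toy illustration of next-occurrence computation, here is the daily "M H * * *" subset only; next_daily_run is a made-up helper, and a real system would use croniter for full cron semantics:

```python
from datetime import datetime, timedelta

def next_daily_run(minute, hour, after):
    """Next occurrence of a daily 'M H * * *' schedule strictly after `after`.
    Toy for one cron subset; use croniter for arbitrary cron expressions."""
    candidate = after.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= after:        # today's slot already passed -> tomorrow
        candidate += timedelta(days=1)
    return candidate

now = datetime(2024, 3, 1, 10, 30)
print(next_daily_run(0, 9, now))   # 2024-03-02 09:00:00 (9am already passed today)
print(next_daily_run(0, 12, now))  # 2024-03-01 12:00:00
```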

    DAG Dependencies (Airflow-style)

    For pipeline orchestration: model tasks as a DAG. Store edges in a task_dependencies table. Before executing a task, check that all of its upstream tasks in the current run have status=success. A dependency resolver runs after each task completion and enqueues newly unblocked tasks.
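
    The unblocking check can be sketched in memory (the unblocked function and the sample DAG are illustrative; a real resolver would run the equivalent query against the executions table):

```python
def unblocked(dependencies, status):
    """Pending tasks whose upstream tasks have all succeeded.
    dependencies maps child -> set of parents; tasks absent from it have none."""
    ready = []
    for task, state in status.items():
        if state != "pending":
            continue
        parents = dependencies.get(task, set())
        if all(status.get(p) == "success" for p in parents):
            ready.append(task)
    return sorted(ready)

# Diamond DAG: A -> {B, C} -> D
deps = {"B": {"A"}, "C": {"A"}, "D": {"B", "C"}}
status = {"A": "success", "B": "pending", "C": "pending", "D": "pending"}
print(unblocked(deps, status))  # ['B', 'C'] -- D still waits on B and C
```

    Note the fan-out: B and C become eligible simultaneously once A succeeds, so they can run in parallel.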

    Retry and Backoff

    On task failure: increment attempt_number and compute next_retry_at = now + 2^attempt * base_delay + jitter (exponential backoff; the random jitter spreads out retries so failures don't retry in lockstep). Cap at max_retries. After max_retries: mark the job as permanently failed and send an alert. Idempotent handlers make retries safe.
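
    The delay computation can be sketched as follows; the function name is illustrative, and the "full jitter" variant shown (uniform draw up to the capped exponential ceiling) is one common choice among several jitter strategies:

```python
import random

def next_retry_delay(attempt, base_delay=1.0, cap=300.0):
    """Delay in seconds before retry `attempt` (0-based): exponential backoff
    with full jitter, i.e. a uniform draw from [0, min(cap, base_delay * 2**attempt)]."""
    return random.uniform(0.0, min(cap, base_delay * 2 ** attempt))

# Each retry waits a random amount up to the (capped) exponential ceiling:
for attempt in range(4):
    delay = next_retry_delay(attempt)
    assert 0.0 <= delay <= min(300.0, 2 ** attempt)
```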

    Interview Tips

    • Redis sorted set (score = next_run_unix) is the canonical time-wheel data structure.
    • Distributed lock per job prevents double-execution at the scheduler level.
    • Separate scheduler (which decides when) from worker (which executes) for independent scaling.
    • DAG dependency model is Airflow’s core — mention it for pipeline use cases.

    Frequently Asked Questions

    How does a Redis sorted set work as a task scheduler queue?

    A Redis sorted set (ZADD) stores tasks with their scheduled unix timestamp as the score. The scheduler polls with ZRANGEBYSCORE queue 0 {now} LIMIT 0 10 to fetch tasks due now or in the past, then removes them with ZREM after pickup (use a Lua script to make fetch-and-remove atomic). This is the time-wheel pattern: O(log N) insert and O(log N + K) fetch for K tasks, which handles millions of scheduled tasks efficiently. The scheduler runs on a leader node (chosen via distributed lock) to prevent double-execution. Tasks not yet due remain in the sorted set with future timestamps; for recurring tasks, the next occurrence is re-inserted after execution with a timestamp computed from the cron expression.

    How do you guarantee exactly-once execution in a distributed task scheduler?

    Exactly-once requires two components. (1) Leader election via distributed lock: only one scheduler node dequeues tasks at a time, preventing multiple nodes from picking up the same task. Use Redis SET NX EX or an etcd leader lease; if the leader crashes, a new leader acquires the lock and resumes. (2) Idempotent task execution: even with exactly-once dequeue, a worker crash after dequeue but before completion causes re-execution on retry. Tag each task with a unique execution_id and have workers write to an execution_log table with a UNIQUE constraint on execution_id; on retry the second INSERT fails (ON CONFLICT DO NOTHING) and the task is skipped. True exactly-once delivery is impossible across distributed systems without idempotent consumers; deduplication via the unique execution_id gives effectively-once semantics.

    How do you implement task dependency DAGs in a scheduler like Airflow?

    Task dependencies form a directed acyclic graph (DAG): each task lists its upstream dependencies and is eligible to run only when all of them have completed successfully. Implementation: maintain a task_executions table with status (pending/running/success/failed) and a dependencies table linking child_task_id to parent_task_id. A scheduler loop selects pending tasks for which no unfinished parent exists (a NOT EXISTS subquery joining dependencies to task_executions on parent status != 'success'). When a task completes, the scheduler checks which downstream tasks are now unblocked; once all predecessors of multiple successors have succeeded, those successors can be enqueued in parallel (fan-out). Cycle detection at DAG definition time (topological sort) prevents deadlocks at runtime.
