Low-Level Design: CRM System — Contact Management, Pipeline Tracking, and Activity Logging

Core Entities

Contact: contact_id, first_name, last_name, email (unique), phone, company_id, job_title, source (INBOUND, OUTBOUND, REFERRAL), owner_id (assigned sales rep), created_at, updated_at, tags (array). Company: company_id, name, domain, industry, size, website, owner_id. Deal: deal_id, name, contact_id, company_id, owner_id, pipeline_id, stage_id, amount, currency, close_date (expected), status (OPEN, WON, LOST), created_at, closed_at. Pipeline: pipeline_id, name, stages (ordered list). Stage: stage_id, pipeline_id, name, order_index, win_probability (0-100). Activity: activity_id, type (EMAIL, CALL, MEETING, NOTE, TASK), contact_id, deal_id (nullable), user_id, subject, body, scheduled_at, completed_at, created_at. CustomField: field_id, entity_type (CONTACT, DEAL, COMPANY), field_name, field_type (TEXT, NUMBER, DATE, SELECT, MULTI_SELECT), options (JSON for SELECT types).

Deal Pipeline and Stage Transitions

class DealService:
    def move_stage(self, deal_id: int, new_stage_id: int,
                   user_id: int) -> Deal:
        deal = self.repo.get_deal(deal_id)
        old_stage = deal.stage_id
        new_stage = self.repo.get_stage(new_stage_id)

        # Validate stage belongs to the same pipeline
        if new_stage.pipeline_id != deal.pipeline_id:
            raise InvalidStageError("Stage not in deal's pipeline")

        deal.stage_id = new_stage_id
        deal.updated_at = datetime.utcnow()

        # Auto-close on WON/LOST stages
        if new_stage.name in ["Closed Won", "Closed Lost"]:
            deal.status = DealStatus.WON if "Won" in new_stage.name else DealStatus.LOST
            deal.closed_at = datetime.utcnow()

        self.repo.save(deal)

        # Log stage transition as activity
        self.activity_service.log(ActivityType.NOTE,
            contact_id=deal.contact_id, deal_id=deal.deal_id,
            user_id=user_id,
            body=f"Deal moved from {self.repo.get_stage(old_stage).name} "
                 f"to {new_stage.name}")
        return deal

Custom Fields

CRMs need custom fields because each business tracks different attributes. Two storage approaches: (1) EAV (Entity-Attribute-Value): store all custom field values in a generic table: (entity_id, field_id, value_text/value_number/value_date). Pros: flexible, no schema migrations for new fields. Cons: terrible query performance (JOINs for every field), type casting issues, hard to enforce constraints. (2) JSONB column: add a custom_data JSONB column to each entity table. Store all custom field values as a JSON object: {“lead_score”: 85, “preferred_contact”: “email”}. Index specific fields: CREATE INDEX ON contacts ((custom_data->>”lead_score”)). Pros: fast for most queries, single row per entity, no JOIN explosion. Cons: schema validation must be in the application layer (not the database). Recommendation: use JSONB for new CRM systems. EAV was the standard before JSONB support in PostgreSQL; it is now generally avoided.

Activity timeline: ordered log of all interactions with a contact or deal. Query: SELECT * FROM activities WHERE contact_id=:id ORDER BY created_at DESC LIMIT 50. Index: (contact_id, created_at DESC). For deal activities: (deal_id, created_at DESC). Inbox view: show each user’s upcoming tasks and overdue activities: SELECT * FROM activities WHERE user_id=:u AND type=’TASK’ AND completed_at IS NULL ORDER BY scheduled_at ASC. Full-text search across contacts, companies, and activities: Elasticsearch index with a unified search endpoint. Documents: contact (first_name, last_name, email, company, tags), deal (name, description), activity (subject, body). Query time: search across all entity types in one query, merge and rank results by relevance + recency. Sync from PostgreSQL via Debezium CDC. Autocomplete: use a completion suggester for contact names and company names in the search bar.

Reporting and Sales Analytics

Pipeline report: deals grouped by stage with total value. Query: SELECT stage_id, COUNT(*) as deal_count, SUM(amount) as total_value FROM deals WHERE status=’OPEN’ AND owner_id=:id GROUP BY stage_id. Forecast report: expected revenue by month = SUM(amount * win_probability / 100) WHERE close_date BETWEEN :start AND :end. Conversion rates: deals that moved from stage A to stage B / total deals that entered stage A. Requires tracking stage history (deal_stage_history table: deal_id, from_stage, to_stage, changed_at). Activity metrics: calls per rep per day, email response rate, avg deal age by stage. Run reports against a read replica to avoid impacting the transactional primary. For complex analytics: sync to a data warehouse (BigQuery, Snowflake) and use BI tools (Tableau, Metabase).

Asked at: Shopify Interview Guide

Asked at: Stripe Interview Guide

Asked at: LinkedIn Interview Guide

Asked at: Airbnb Interview Guide

Scroll to Top