Low-Level Design: Poll and Voting System — Real-Time Results, Fraud Prevention, and Analytics

Core Entities

Poll: poll_id, creator_id, question, poll_type (SINGLE_CHOICE, MULTIPLE_CHOICE, RANKED, RATING), status (DRAFT, ACTIVE, CLOSED), is_anonymous, allow_change_vote, visibility (PUBLIC, PRIVATE, ORGANIZATION), start_time, end_time, settings (JSONB: max_choices, require_authentication, show_results_before_close), created_at.

PollOption: option_id, poll_id, text, image_url, display_order.

Vote: vote_id, poll_id, voter_id (NULL if anonymous), voter_fingerprint (browser fingerprint hash), selected_options (int array for multi-choice), rank_order (JSONB for ranked choice), created_at, ip_address, user_agent.

VoteResult: poll_id, option_id, vote_count, percentage, last_updated. This is a materialized aggregate, updated on each vote.

PollAnalytics: poll_id, total_votes, unique_voters, completion_rate, avg_response_time_seconds, geographic_distribution (JSONB), device_breakdown (JSONB), time_series (JSONB: votes per hour).
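As a rough illustration, a few of the entities above could be mirrored in application code as dataclasses. This is a minimal sketch, not a full schema: it covers only a subset of the fields, and the enum classes, defaults, and the `options` list on `Poll` are assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from typing import Optional


class PollType(Enum):
    SINGLE_CHOICE = "SINGLE_CHOICE"
    MULTIPLE_CHOICE = "MULTIPLE_CHOICE"
    RANKED = "RANKED"
    RATING = "RATING"


class PollStatus(Enum):
    DRAFT = "DRAFT"
    ACTIVE = "ACTIVE"
    CLOSED = "CLOSED"


@dataclass
class PollOption:
    option_id: int
    poll_id: int
    text: str
    display_order: int
    image_url: Optional[str] = None


@dataclass
class Poll:
    poll_id: int
    creator_id: int
    question: str
    poll_type: PollType
    status: PollStatus = PollStatus.DRAFT      # assumed default
    allow_change_vote: bool = False            # assumed default
    end_time: Optional[datetime] = None
    settings: dict = field(default_factory=dict)   # mirrors the JSONB column
    options: list[PollOption] = field(default_factory=list)  # assumed convenience field


@dataclass
class Vote:
    vote_id: int
    poll_id: int
    selected_options: list[int]
    voter_id: Optional[int] = None             # None for anonymous votes
    voter_fingerprint: Optional[str] = None
```

Keeping `settings` as a plain dict matches the JSONB column, so new per-poll settings do not require a schema change.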

Vote Submission with Deduplication

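from datetime import datetime, timezone
from typing import Optional

class VotingService:
    def cast_vote(self, poll_id: int, voter_id: Optional[int],
                  options: list[int], request_meta: dict) -> VoteResult:
        poll = self.db.get_poll(poll_id)

        # Validate poll is active
        if poll.status != "ACTIVE":
            raise PollNotActive(poll_id)
        # Assumes end_time is stored as a timezone-aware UTC timestamp
        if poll.end_time and datetime.now(timezone.utc) > poll.end_time:
            raise PollExpired(poll_id)

        # Validate options: each must belong to this poll, with no repeats
        valid_option_ids = {o.option_id for o in poll.options}
        if not options or not all(o in valid_option_ids for o in options):
            raise InvalidOption()
        if len(set(options)) != len(options):
            raise InvalidSelection("Duplicate options are not allowed")
        if poll.poll_type == "SINGLE_CHOICE" and len(options) != 1:
            raise InvalidSelection("Single choice poll requires exactly 1 option")
        # Assumes settings (JSONB) is loaded as a dict
        max_choices = poll.settings.get("max_choices")
        if (poll.poll_type == "MULTIPLE_CHOICE" and max_choices
                and len(options) > max_choices):
            raise InvalidSelection(f"At most {max_choices} options allowed")

        # Deduplication key
        fingerprint = None
        if voter_id is not None:
            dedup_key = f"voted:{poll_id}:{voter_id}"
        else:
            # Anonymous: use IP + user agent fingerprint
            fingerprint = self._compute_fingerprint(request_meta)
            dedup_key = f"voted:{poll_id}:{fingerprint}"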
        # Claim the dedup key atomically (SET NX) so that two concurrent
        # requests cannot both pass the check. Redis is the fast path; the
        # unique index on (poll_id, voter_id) is the authoritative fallback.
        # TTL is a flat 30 days; poll end time + 24h would be a tighter bound.
        claimed = self.redis.set(dedup_key, "1", nx=True, ex=86400 * 30)
        if not claimed:
            if poll.allow_change_vote:
                return self._change_vote(poll_id, voter_id, options)
            raise AlreadyVoted()

        try:
            with self.db.transaction():
                # Insert vote record
                vote = self.db.insert("votes", {
                    "poll_id": poll_id, "voter_id": voter_id,
                    "selected_options": options,
                    "ip_address": request_meta["ip"],
                    "user_agent": request_meta["user_agent"],
                    "voter_fingerprint": fingerprint if not voter_id else None
                })

                # Update materialized vote counts (in same transaction)
                for option_id in options:
                    self.db.execute(
                        "UPDATE vote_results SET vote_count = vote_count + 1 "
                        "WHERE poll_id = %s AND option_id = %s",
                        poll_id, option_id
                    )
                # Recompute percentages for every option in the poll
                self.db.execute(
                    "UPDATE vote_results SET percentage = "
                    "vote_count * 100.0 / (SELECT SUM(vote_count) FROM vote_results "
                    "WHERE poll_id = %s) WHERE poll_id = %s",
                    poll_id, poll_id
                )
        except Exception:
            # Release the claim so a failed write does not block a retry
            self.redis.delete(dedup_key)
            raise

        # Publish real-time update
        self.pubsub.publish(f"poll:{poll_id}:update", vote.to_json())
        return self.db.get_vote_results(poll_id)

Real-Time Results with WebSocket

Live results update as votes come in. On each vote, the service publishes a PollUpdateEvent to a Redis pub/sub channel (poll:{poll_id}:update). A WebSocket gateway subscribes to these channels and pushes updates to every connected client watching that poll. Clients receive either a partial update ({option_id: count, …}) or a full snapshot of the results.

Throttling: a viral poll can receive thousands of votes per second, so the gateway pushes at most one update per poll every 500 ms. The pub/sub consumer tracks the last-push timestamp per poll. If less than 500 ms has passed since the last push, it buffers the update and schedules a single delayed push that carries the latest counts. This caps each client at two messages per second and prevents WebSocket message floods to thousands of clients on viral polls.
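
The per-poll rate limit can be sketched as a small bookkeeping class. This is a minimal, framework-free sketch: `PollPushThrottle`, `on_vote`, and `due` are hypothetical names, time is passed in explicitly so the logic stays testable, and the actual WebSocket send and timer scheduling are left to the caller.

```python
class PollPushThrottle:
    """Allow at most one push per poll per `interval` seconds."""

    def __init__(self, interval: float = 0.5):
        self.interval = interval
        self._last_push: dict[int, float] = {}  # poll_id -> time of last push
        self._pending: dict[int, float] = {}    # poll_id -> time a delayed push is due

    def on_vote(self, poll_id: int, now: float) -> bool:
        """Return True if the caller should push a snapshot immediately.

        Otherwise a single delayed push is scheduled for this poll.
        """
        last = self._last_push.get(poll_id)
        if last is None or now - last >= self.interval:
            self._last_push[poll_id] = now
            self._pending.pop(poll_id, None)
            return True
        # Too soon: note that a push is owed, without moving the deadline
        self._pending.setdefault(poll_id, last + self.interval)
        return False

    def due(self, now: float) -> list[int]:
        """Return polls whose delayed push is due and mark them as pushed."""
        ready = [p for p, t in self._pending.items() if t <= now]
        for p in ready:
            del self._pending[p]
            self._last_push[p] = now
        return ready
```

A gateway loop would call `on_vote` for each pub/sub message and call `due` on a short timer, sending the current snapshot for every poll ID returned. The deadline is never reset by later votes, so a sustained stream of votes cannot postpone the push indefinitely.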

Fraud Detection and Vote Integrity

Fraud vectors: (1) Ballot stuffing: the same person voting multiple times with different accounts. Detection: IP rate limiting (max N votes per IP per poll per hour), browser fingerprinting (canvas, WebGL, fonts), device ID tracking. (2) Bot voting: automated scripts casting votes. Detection: CAPTCHA on anonymous polls, rate limiting by IP, behavioral analysis (click patterns too fast to be human). (3) Vote buying: users selling votes. Mitigation: real-time anomaly detection on vote velocity per IP subnet.

Implementation: track votes per IP prefix in Redis. For a true sliding window, keep a sorted set per poll and IP prefix, scored by timestamp, and trim entries older than the window. For a cheaper fixed-window approximation, INCR vote_count:{poll_id}:{ip_prefix} with a TTL equal to the window length. Threshold alerts: if any IP prefix casts more than 50 votes in 10 minutes, flag it for manual review, require CAPTCHA for subsequent votes from that prefix, and mark those votes as SUSPICIOUS in the database. Do not automatically delete suspicious votes; human review determines validity.
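
The sliding-window threshold check can be illustrated without a Redis server. The sketch below keeps timestamps in an in-memory deque per (poll, IP prefix), which plays the role a sorted set would play in Redis (ZADD to record, ZREMRANGEBYSCORE to trim, ZCARD to count). The class and function names are hypothetical, and grouping by /24 for IPv4 and /64 for IPv6 is an assumption, not something the design prescribes.

```python
import ipaddress
from collections import defaultdict, deque


def ip_prefix(ip: str) -> str:
    """Map an address to its subnet (assumed: /24 for IPv4, /64 for IPv6)."""
    addr = ipaddress.ip_address(ip)
    bits = 24 if addr.version == 4 else 64
    return str(ipaddress.ip_network(f"{ip}/{bits}", strict=False))


class SlidingWindowVoteMonitor:
    def __init__(self, threshold: int = 50, window_seconds: float = 600.0):
        self.threshold = threshold
        self.window = window_seconds
        # (poll_id, ip_prefix) -> timestamps of recent votes, oldest first
        self._events: dict[tuple[int, str], deque] = defaultdict(deque)

    def record_vote(self, poll_id: int, prefix: str, now: float) -> bool:
        """Record a vote; return True if the prefix now exceeds the threshold."""
        events = self._events[(poll_id, prefix)]
        events.append(now)
        # Drop timestamps that have fallen out of the window
        while events and events[0] <= now - self.window:
            events.popleft()
        return len(events) > self.threshold
```

A True result would trigger the responses described above (flag for review, require CAPTCHA, mark votes as SUSPICIOUS) rather than rejecting or deleting votes outright.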


Frequently Asked Questions

How do you prevent the same user from voting multiple times in a poll?

Authenticated users: create a unique index on (poll_id, voter_id) in the votes table. The database then enforces one vote per user per poll at the storage level. Additionally, check a Redis key (voted:{poll_id}:{voter_id}) before inserting, as a fast pre-check that avoids hitting the DB. Anonymous polls: use multi-factor fingerprinting: IP address (often shared behind NAT, so not reliable alone), browser fingerprint (a hash of canvas, WebGL, and installed-font signals), and a device cookie (a long-TTL cookie holding a UUID, set on first visit). A vote is considered a duplicate if two or more factors match a previous vote for the same poll. Limitation: determined adversaries can clear cookies or use VPNs. For high-stakes polls, require email verification (send a one-time vote link) or phone number verification (SMS OTP). Each verification method has a cost-vs-security trade-off.

How do materialized vote counts avoid expensive real-time aggregation?

Naive approach: count votes with SELECT COUNT(*) FROM votes WHERE poll_id = ? AND option_id = ? on each request. At 10M votes, this query takes seconds without careful indexing. Materialized counts: maintain a vote_results table with pre-aggregated counts (vote_count per option). On each vote, run UPDATE vote_results SET vote_count = vote_count + 1 WHERE poll_id = ? AND option_id = ? in the same database transaction as the vote insert. This keeps counts exactly in sync with the votes table. A read is now O(1) per option (SELECT * FROM vote_results WHERE poll_id = ?). Trade-off: a slightly more complex write path (an insert plus a counter update per vote). Alternative: Redis INCR for real-time counts (fast) with periodic sync to the DB for durability. Use Redis counts for display and the DB aggregate for billing/audit.

How do you implement real-time vote result updates without overloading clients?

Push-based updates: after each vote, publish a VoteUpdate event to Redis pub/sub (PUBLISH poll:{poll_id}:update {option_id, new_count}). WebSocket gateway instances subscribe and forward to connected clients. For viral polls (thousands of votes/second), unthrottled delivery means each vote generates a WebSocket message to every connected client, which is far too many small messages. Throttling: the gateway coalesces updates per poll. After the first VoteUpdate in a quiet period, schedule a push 500ms later. Updates that arrive while a push is pending do not reset the timer, because resetting it would postpone the push indefinitely under sustained load. After the delay, push the current vote_results (a snapshot, not individual updates). Clients receive at most 2 updates per second regardless of vote rate. Alternative: clients poll every 2-3 seconds (simpler, less real-time). WebSocket is better for live polls where sub-second freshness matters (election night, live events).

How do you handle ranked-choice voting in the data model?

Ranked-choice voting (RCV): voters rank options by preference (1st choice, 2nd choice, etc.). Data model: Vote.rank_order stores a JSON object {option_id: rank} or an ordered array of option IDs. Example: [3, 1, 4, 2] means option 3 is 1st choice, option 1 is 2nd, etc. Instant-runoff algorithm: (1) Count 1st-choice votes for each option. (2) If any option has > 50% of the ballots still in play, that option wins. (3) Otherwise, eliminate the option with the fewest 1st-choice votes. (4) Redistribute the eliminated option's votes to each voter's next-ranked remaining option. (5) Repeat until a winner emerges. Implementation: this is computationally expensive at scale, so precompute results in a batch job after the poll closes rather than in real time. Store intermediate round results in PollAnalytics for auditability.

How do you design a poll system that supports very high vote rates (viral polls)?

For polls that go viral (millions of votes in minutes): (1) Vote ingestion: accept votes via a lightweight endpoint that writes to Kafka and returns immediately (fire-and-forget). Background consumers process Kafka events and write to the database in micro-batches. This decouples the vote ingestion rate from the database write rate. (2) Rate limiting: per-IP and per-user rate limits prevent bot floods. Token bucket in Redis: max 1 vote per second per IP per poll. (3) Counter caching: use Redis INCR for vote counts (sub-millisecond, handles 1M+ ops/second). Periodically flush Redis counts to the DB (every 30 seconds). Accept slight count lag for read traffic. (4) Read scaling: cache poll results in Redis with a 1-5 second TTL. Most reads hit the cache, not the DB. (5) Database sharding: sharding the votes table by poll_id spreads load across polls, but it keeps all of a single hot poll's writes on one shard. If one poll exceeds DB write capacity, shard within the poll as well, for example by a hash of voter_id or vote_id.
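
The instant-runoff steps from the ranked-choice answer can be sketched as a batch function over ballots. This is a minimal sketch under stated assumptions: each ballot is an ordered list of option IDs (most preferred first), exhausted ballots are left out of the majority calculation, and ties are broken by option ID, which is an arbitrary rule chosen only to make the example deterministic.

```python
from collections import Counter
from typing import Optional


def instant_runoff(
    ballots: list[list[int]],
) -> tuple[Optional[int], list[dict[int, int]]]:
    """Return (winner, per-round tallies) for ranked ballots."""
    remaining = {opt for ballot in ballots for opt in ballot}
    rounds: list[dict[int, int]] = []
    while remaining:
        # Count each ballot toward its highest-ranked option still in the race
        tally = Counter({opt: 0 for opt in remaining})
        for ballot in ballots:
            top = next((opt for opt in ballot if opt in remaining), None)
            if top is not None:
                tally[top] += 1
        rounds.append(dict(tally))
        active = sum(tally.values())  # exhausted ballots are not counted
        # Leader: most votes; ties go to the lowest option ID (arbitrary rule)
        leader, leader_votes = max(tally.items(), key=lambda kv: (kv[1], -kv[0]))
        if leader_votes * 2 > active or len(remaining) == 1:
            return leader, rounds
        # Eliminate the fewest-votes option; ties drop the highest option ID
        loser = min(tally.items(), key=lambda kv: (kv[1], -kv[0]))[0]
        remaining.discard(loser)
    return None, rounds  # no ballots at all


example_ballots = [[1, 2, 3]] * 4 + [[2, 1, 3]] * 3 + [[3, 2, 1]] * 2
winner, rounds = instant_runoff(example_ballots)
```

In the example, no option has a majority in the first round, option 3 is eliminated, and its two ballots transfer to option 2, which then wins. The returned per-round tallies are the kind of intermediate results that could be stored for auditability.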
