Overview
A notification service delivers time-sensitive messages to users via push notifications (mobile), SMS, email, and in-app alerts. The scale can be very large: WhatsApp handles on the order of 100 billion messages per day, and Facebook sends billions of push notifications daily. Requirements: sub-second delivery for high-priority notifications (payment alerts, 2FA codes), guaranteed delivery (at-least-once), user preference management (opt-out per channel), and per-user rate limiting to prevent notification fatigue.
Notification Types and Priority
- Transactional (highest priority): OTP codes, payment confirmations, security alerts. Must be delivered immediately, regardless of user preferences. Delivered via multiple channels in parallel (SMS + push simultaneously).
- Operational (high priority): order shipped, flight gate change, account activity. Delivered within seconds, respecting preferred channel.
- Marketing (low priority): promotions, weekly digests, product announcements. Can be batched and delayed; must respect opt-out preferences. Rate limited to avoid spam.
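The three tiers above can be read as a channel-selection policy. The following is a minimal sketch of that policy, not a definitive implementation: the names NotificationType, UserChannels, TRANSACTIONAL_CHANNELS, and select_channels are hypothetical, and the choice of SMS plus push for transactional messages simply mirrors the example in the list.

```python
from dataclasses import dataclass, field
from enum import Enum


class NotificationType(Enum):
    TRANSACTIONAL = "transactional"
    OPERATIONAL = "operational"
    MARKETING = "marketing"


@dataclass
class UserChannels:
    """Hypothetical per-user channel settings."""
    preferred: str = "push"
    opted_out: set = field(default_factory=set)


# Assumption: transactional messages fan out to SMS and push in parallel.
TRANSACTIONAL_CHANNELS = ["sms", "push"]


def select_channels(ntype: NotificationType, user: UserChannels) -> list:
    if ntype is NotificationType.TRANSACTIONAL:
        # Sent on several channels at once, regardless of preferences.
        return list(TRANSACTIONAL_CHANNELS)
    if ntype is NotificationType.OPERATIONAL:
        # Delivered on the user's preferred channel.
        return [user.preferred]
    # Marketing: dropped if the user opted out of the preferred channel.
    if user.preferred in user.opted_out:
        return []
    return [user.preferred]
```

An empty list means the notification is dropped; batching and delay for marketing traffic would be handled further downstream.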
Architecture Overview
- Notification API: upstream services request a notification with POST /notify {user_id, type, template_id, params, priority}. The API validates the request, looks up user preferences, and publishes the notification to Kafka.
- Kafka topics: separate topics by priority (notification-critical, notification-high, notification-low). This prevents a burst of marketing notifications from blocking OTP delivery.
- Notification dispatcher: consumers read from the Kafka topics, look up user preferences (preferred channel, opted-out channels, DND schedule) and device tokens (for push), select the delivery channel(s), and route the notification to the appropriate sender service.
- Channel sender services: push sender (APNs/FCM), SMS sender (Twilio/AWS SNS), email sender (SES/SendGrid). Each sender retries independently with exponential backoff.
- Delivery tracking: events (sent, delivered, clicked, failed) are written to a tracking database. Used for analytics and retry logic.
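The first two steps of this pipeline (validate the POST /notify body, then pick a priority topic) might look roughly like the sketch below. It stops short of calling a Kafka client and only returns what would be published. The priority values "critical", "high", and "low" are assumed from the topic names, and keying messages by user_id is an illustrative choice, not something the design above prescribes.

```python
import json

# Topic names come from the text; the priority labels are an assumption.
TOPIC_BY_PRIORITY = {
    "critical": "notification-critical",
    "high": "notification-high",
    "low": "notification-low",
}
REQUIRED_FIELDS = ("user_id", "type", "template_id", "params", "priority")


def validate_and_route(request: dict) -> tuple:
    """Return (topic, key, value) for a valid request body, else raise ValueError."""
    missing = [f for f in REQUIRED_FIELDS if f not in request]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    priority = request["priority"]
    if priority not in TOPIC_BY_PRIORITY:
        raise ValueError(f"unknown priority: {priority!r}")
    # Assumption: keying by user_id keeps one user's messages in one partition.
    key = str(request["user_id"]).encode()
    value = json.dumps(request, sort_keys=True).encode()
    return TOPIC_BY_PRIORITY[priority], key, value
```

A producer would then publish value under key to the returned topic. Because each priority has its own topic, the critical consumers can be scaled and monitored separately from the marketing ones.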
Mobile Push Notifications
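Push notifications reach mobile devices via APNs (Apple Push Notification service) for iOS and FCM (Firebase Cloud Messaging) for Android. The flow: the notification service sends an HTTPS request to APNs/FCM with the device token and payload, and APNs/FCM delivers it when the device has a connection. If the device is offline, the provider holds the notification until it expires: FCM keeps a message for up to 28 days (the default and maximum time-to-live), while APNs honors the expiration set in the apns-expiration header and retains only the most recent notification per app for an offline device.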
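As an illustration of the request described above, the sketch below assembles the URL, headers, and JSON body for an APNs alert without sending it. APNs requires HTTP/2 and provider authentication (a token or certificate), both omitted here. The function name build_apns_request and its parameters are hypothetical; the header names and the /3/device/ path follow Apple's provider API.

```python
import json
import time

APNS_HOST = "https://api.push.apple.com"


def build_apns_request(device_token, bundle_id, title, body, ttl_seconds, now=None):
    """Assemble (but do not send) an APNs alert request."""
    now = int(time.time()) if now is None else int(now)
    headers = {
        "apns-topic": bundle_id,        # the app's bundle ID
        "apns-push-type": "alert",      # user-visible notification
        "apns-priority": "10",          # deliver immediately
        # Unix timestamp after which APNs stops trying to deliver.
        "apns-expiration": str(now + ttl_seconds),
    }
    payload = {"aps": {"alert": {"title": title, "body": body}}}
    return {
        "url": f"{APNS_HOST}/3/device/{device_token}",
        "headers": headers,
        "body": json.dumps(payload).encode(),
    }
```

A short ttl_seconds suits OTP codes, which are useless once expired; a longer value suits operational messages that are still relevant when the device comes back online.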
Device token management: each app install registers a unique device token. Tokens change when the app is reinstalled or when the provider refreshes them. Sends to a token that is no longer registered are rejected with an error (for example, APNs returns HTTP 410 with reason Unregistered, or BadDeviceToken for a malformed token; FCM returns UNREGISTERED), and the notification service must delete such tokens from its registry. One user may have multiple devices (iPhone, iPad), so store multiple tokens per user_id.
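A minimal in-memory sketch of this bookkeeping follows. TokenRegistry, send_to_user, and the send callback are hypothetical names; a real registry would live in a database, and the set of reason strings treated as "dead token" is an assumption to adjust per provider.

```python
from collections import defaultdict

# Assumed provider error reasons meaning the token should be deleted.
DEAD_TOKEN_REASONS = {"Unregistered", "BadDeviceToken", "UNREGISTERED"}


class TokenRegistry:
    """Maps each user_id to the set of device tokens registered for it."""

    def __init__(self):
        self._tokens = defaultdict(set)

    def register(self, user_id, token):
        self._tokens[user_id].add(token)

    def tokens_for(self, user_id):
        return sorted(self._tokens.get(user_id, ()))

    def remove(self, user_id, token):
        self._tokens[user_id].discard(token)


def send_to_user(registry, user_id, send):
    """Send to every device of a user; send(token) returns None or an error reason."""
    delivered = 0
    for token in registry.tokens_for(user_id):
        reason = send(token)
        if reason is None:
            delivered += 1
        elif reason in DEAD_TOKEN_REASONS:
            # Prune tokens the provider reports as no longer valid.
            registry.remove(user_id, token)
    return delivered
```

Other errors (timeouts, throttling) leave the token in place so that the sender's retry logic can try again.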
SMS Delivery
SMS is delivered via SMS gateway providers (Twilio, AWS SNS, Vonage), which connect to carrier networks. Key considerations: phone number formatting (E.164: +12025551234), carrier blocking (spam filters), and country-specific regulations that require explicit SMS opt-in (GDPR in the EU, TCPA in the US). For 2FA OTPs, use Twilio Verify or a dedicated OTP service that handles rate limiting and delivery reports. Delivery receipts arrive via callback webhooks; use them to track delivered/failed status. Fallback: if the primary provider fails, route to a secondary provider.
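Two of these points, E.164 validation and provider fallback, are easy to sketch. The code below is illustrative only: send_sms_with_fallback and the (name, send) provider pairs are hypothetical stand-ins for real gateway clients, and the regular expression checks only the E.164 shape (a plus sign, a non-zero leading digit, at most 15 digits), not whether the number exists.

```python
import re

# E.164: "+" followed by up to 15 digits, the first of which is non-zero.
E164 = re.compile(r"^\+[1-9]\d{1,14}$")


def is_e164(number: str) -> bool:
    return bool(E164.fullmatch(number))


def send_sms_with_fallback(number, text, providers):
    """Try each (name, send) provider in order; return the name that succeeded.

    Each send(number, text) is assumed to raise an exception on failure.
    """
    if not is_e164(number):
        raise ValueError(f"not an E.164 number: {number!r}")
    errors = []
    for name, send in providers:
        try:
            send(number, text)
            return name
        except Exception as exc:  # any provider error triggers the fallback
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all SMS providers failed: " + "; ".join(errors))
```

Rejecting badly formatted numbers before calling a provider avoids paying for sends that cannot succeed.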
Email Delivery
Email is sent via SMTP through services like SendGrid, SES, or Mailgun. Deliverability (reaching the inbox rather than the spam folder) depends on SPF/DKIM/DMARC configuration, sender reputation (bounce rate < 2%, complaint rate < 0.1%), warming up new dedicated IPs, and content quality. Bounce handling: a hard bounce (invalid address) permanently suppresses the address; a soft bounce (temporary failure) is retried 3× before suppression. Unsubscribe: include List-Unsubscribe headers and unsubscribe links, and honor opt-out requests within 10 business days (CAN-SPAM). For transactional email (receipts, OTPs), use dedicated sending infrastructure separate from marketing email, so that a poor marketing sender reputation does not hurt transactional deliverability.
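The bounce rules above can be sketched as a small suppression list. This is one possible reading, stated as an assumption: "retry 3×" is taken to mean the initial attempt plus three retries, so the fourth consecutive soft bounce suppresses the address. SuppressionList and its methods are hypothetical names.

```python
from collections import Counter

MAX_SOFT_RETRIES = 3  # from the text: soft bounces are retried 3 times


class SuppressionList:
    def __init__(self):
        self.suppressed = set()
        self.soft_bounces = Counter()

    def record_bounce(self, address: str, hard: bool) -> None:
        if hard:
            # Invalid address: suppress permanently.
            self.suppressed.add(address)
            return
        self.soft_bounces[address] += 1
        # Assumption: initial attempt + 3 retries, so the 4th soft bounce suppresses.
        if self.soft_bounces[address] > MAX_SOFT_RETRIES:
            self.suppressed.add(address)

    def record_delivered(self, address: str) -> None:
        # A successful delivery resets the consecutive soft-bounce count.
        self.soft_bounces.pop(address, None)

    def is_suppressed(self, address: str) -> bool:
        return address in self.suppressed
```

The email sender would call is_suppressed before every send and skip suppressed addresses, which keeps the bounce rate under the reputation threshold.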
User Preference Management
Users control which notification types they receive on which channels. Data model: (user_id, notification_type, channel, opted_in, dnd_start_hour, dnd_end_hour), stored in a relational database and cached in Redis with a TTL. DND (Do Not Disturb): non-transactional notifications that arrive during the DND window are queued and delivered when the window ends; transactional notifications are sent immediately. The cached entry must not outlive a preference change, so update or invalidate it on every write, using a write-through cache or pub/sub invalidation.
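The DND check has one subtle case: a window such as 22:00 to 07:00 wraps past midnight. A minimal sketch of the decision logic follows, assuming hour is the user's local hour (0 to 23) and that a preference row is represented as a dict with the fields from the data model. The names in_dnd and decide are hypothetical.

```python
def in_dnd(hour: int, dnd_start_hour: int, dnd_end_hour: int) -> bool:
    """True if hour falls inside the DND window [start, end)."""
    if dnd_start_hour == dnd_end_hour:
        return False  # assumption: equal bounds mean no DND window
    if dnd_start_hour < dnd_end_hour:
        return dnd_start_hour <= hour < dnd_end_hour
    # Window wraps past midnight, e.g. 22 -> 7.
    return hour >= dnd_start_hour or hour < dnd_end_hour


def decide(pref: dict, notification_type: str, hour: int) -> str:
    """Return 'send', 'queue', or 'drop' for one notification."""
    if notification_type == "transactional":
        return "send"  # bypasses opt-out and DND
    if not pref["opted_in"]:
        return "drop"
    if in_dnd(hour, pref["dnd_start_hour"], pref["dnd_end_hour"]):
        return "queue"  # deliver when the DND window ends
    return "send"
```

For a user with DND from 22 to 7, a marketing notification at 23:00 is queued, while an OTP at the same hour is sent immediately.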
Rate Limiting and Throttling
Preventing notification spam: (1) Per-user rate limit: at most N marketing notifications per day per user, tracked with a Redis counter (INCR notification:marketing:{user_id}:{date}, then EXPIRE with 86400 seconds). (2) Per-notification-type deduplication: a single trigger shouldn’t send the same notification twice. Deduplication key = (user_id, notification_type, idempotency_key), stored in Redis with a TTL. (3) Global rate limit per channel: push and SMS providers throttle senders that exceed their quotas, so check each provider's currently published limits and use a leaky bucket to spread sends evenly below them. (4) Batching marketing notifications: instead of sending to 10M users immediately, send to 100K/hour over 100 hours, which reduces load on the provider APIs and leaves time to pause the campaign if engagement drops.
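The first three mechanisms can be sketched as follows. To stay self-contained, the code uses a small InMemoryStore that exposes only the three Redis-style calls it needs (incr, expire, and set with nx/ex) and ignores TTLs; with a real Redis client the same calls would be used. All class and function names, the dedup: key prefix, and the default TTL are hypothetical.

```python
class InMemoryStore:
    """Stand-in for the Redis calls used below; TTLs are accepted but ignored."""

    def __init__(self):
        self.data = {}

    def incr(self, key):
        self.data[key] = int(self.data.get(key, 0)) + 1
        return self.data[key]

    def expire(self, key, seconds):
        return key in self.data

    def set(self, key, value, nx=False, ex=None):
        if nx and key in self.data:
            return None  # key already present, nothing written
        self.data[key] = value
        return True


def allow_marketing(store, user_id, date, daily_limit):
    """Per-user daily counter, using the key pattern from the text."""
    key = f"notification:marketing:{user_id}:{date}"
    count = store.incr(key)
    if count == 1:
        store.expire(key, 86400)  # counter disappears after one day
    return count <= daily_limit


def first_time_seen(store, user_id, notification_type, idempotency_key, ttl=3600):
    """True only for the first caller with this deduplication key."""
    key = f"dedup:{user_id}:{notification_type}:{idempotency_key}"
    return store.set(key, 1, nx=True, ex=ttl) is True


class LeakyBucket:
    """Paces sends at a fixed rate: one send every 1/rate seconds."""

    def __init__(self, rate_per_second):
        self.interval = 1.0 / rate_per_second
        self.next_slot = 0.0

    def schedule(self, now):
        """Return the earliest time at which the next send may go out."""
        slot = max(now, self.next_slot)
        self.next_slot = slot + self.interval
        return slot
```

The batching figure in (4) is plain arithmetic: 10,000,000 recipients at 100,000 per hour takes 100 hours, a little over four days.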
Monitoring and Reliability
Critical metrics: notification delivery latency (P99). If latency exceeds 5 seconds, page on-call immediately. Reliability: Kafka provides at-least-once delivery. The dispatcher commits an offset only after the send is confirmed, so if it crashes after reading from Kafka but before confirming delivery, the message is reprocessed on restart. Idempotency via the deduplication key prevents duplicate sends on redelivery.
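The interaction between commit-after-send and deduplication can be simulated without Kafka. In the sketch below, a list stands in for a topic partition, an integer for the committed offset, and a set for the deduplication store. The function name run_dispatcher and the crash_before_commit_at parameter are hypothetical and exist only to make the crash reproducible.

```python
def run_dispatcher(messages, committed_offset, send, sent_keys, crash_before_commit_at=None):
    """Process messages starting at committed_offset; return the new committed offset.

    messages:  list of (dedup_key, payload) pairs, standing in for a partition.
    send:      callable that performs the delivery.
    sent_keys: persistent set of deduplication keys already delivered.
    """
    offset = committed_offset
    while offset < len(messages):
        key, payload = messages[offset]
        if key not in sent_keys:
            send(payload)
            sent_keys.add(key)  # record delivery so a replay is skipped
        if crash_before_commit_at == offset:
            # Simulated crash: the send happened but the offset was never committed.
            return committed_offset
        offset += 1
        committed_offset = offset  # commit only after the send is done
    return committed_offset
```

Restarting from the returned offset replays the uncommitted message, and the deduplication set turns the replay into a no-op. A crash between send and recording the key could still produce a duplicate, so in this sketch duplicates are made rare rather than impossible.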