System Design: Audit Log — Immutable Event Trail, Compliance, and Tamper Detection

Why Audit Logs?

An audit log records every significant action in a system: who did what, to which resource, when, and from where. Audit logs serve several purposes: security forensics (after a breach, which accounts were accessed?), compliance (SOC 2, HIPAA, and PCI-DSS all require audit trails), debugging (trace the sequence of events that led to a bug), and accountability (prove that a change was authorized). The defining property of an audit log is that it must be append-only and tamper-evident. No one, database administrators included, should be able to delete or modify a past audit entry, and any modification that does occur must be detectable.

What to Log

Log every action that changes state or accesses sensitive data. Minimum fields per event:

  • event_id: UUID
  • timestamp: with timezone, millisecond precision
  • actor_id: the user or service that performed the action
  • actor_ip: source IP address
  • actor_user_agent
  • action_type: LOGIN, LOGOUT, CREATE, UPDATE, DELETE, EXPORT, VIEW_PII
  • resource_type: USER, ORDER, PAYMENT, REPORT
  • resource_id
  • outcome: SUCCESS, FAILURE, UNAUTHORIZED
  • request_id: correlates the event with application logs
  • before_state: JSON snapshot of the resource before the change
  • after_state: JSON snapshot of the resource after the change

For authentication events, log both successful logins and failed attempts; patterns of failed logins reveal brute-force attacks. For sensitive data (PII, financial records), log even read-only views; HIPAA requires this for PHI access.
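As a rough illustration of the field list above, the event could be modeled as an immutable record. This is a minimal sketch, not a prescribed schema: the class name AuditEvent and the to_json helper are assumptions introduced here, and the field names simply mirror the list.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, Optional


@dataclass(frozen=True)  # frozen: an event cannot be mutated after creation
class AuditEvent:
    actor_id: str
    actor_ip: str
    actor_user_agent: str
    action_type: str      # e.g. "UPDATE", "VIEW_PII"
    resource_type: str    # e.g. "USER", "PAYMENT"
    resource_id: str
    outcome: str          # "SUCCESS", "FAILURE", or "UNAUTHORIZED"
    request_id: str
    before_state: Optional[Dict[str, Any]] = None
    after_state: Optional[Dict[str, Any]] = None
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    # UTC timestamp with explicit offset and millisecond precision
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat(
            timespec="milliseconds"
        )
    )

    def to_json(self) -> str:
        # sort_keys yields a stable serialization for the same event
        return json.dumps(asdict(self), sort_keys=True)


event = AuditEvent(
    actor_id="user-17",
    actor_ip="203.0.113.7",
    actor_user_agent="curl/8.0",
    action_type="UPDATE",
    resource_type="USER",
    resource_id="42",
    outcome="SUCCESS",
    request_id="req-001",
    before_state={"email": "old@example.com"},
    after_state={"email": "new@example.com"},
)
print(event.to_json())
```

Generating event_id and timestamp inside the record keeps callers from forgetting them; a real service might instead take the timestamp from a trusted clock at the ingestion point.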

Immutability and Tamper Detection

An audit log is only trustworthy if it cannot be silently modified. Four techniques, often combined:

(1) Append-only database table: revoke UPDATE, DELETE, and TRUNCATE privileges on the audit_log table from all users, including the application service account, so the application can only INSERT. Even a compromised application then cannot delete logs. A superuser or the table owner can still bypass these grants, which is why the remaining techniques matter.

(2) Hash chaining: each log entry includes a hash of the previous entry, similar to a blockchain: entry_hash = SHA256(prev_hash + event_data). If any entry is modified or removed from the middle of the chain, every subsequent hash becomes invalid. Verify integrity by recomputing the chain. Entries deleted from the end of the chain leave the remainder valid, so periodically record the latest hash somewhere the database cannot reach.

(3) Write to an external system: replicate logs to a separate, isolated system, such as a dedicated S3 bucket with Object Lock or another write-once storage service. The primary database cannot reach this system to modify it.

(4) Digital signatures: sign each batch of log entries with the service’s private key and verify with the public key. Any modification invalidates the signature.
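The hash-chaining formula entry_hash = SHA256(prev_hash + event_data) can be sketched in a few lines. This is a minimal in-memory illustration under stated assumptions: the all-zero GENESIS_HASH for the first entry, the canonical JSON serialization, and the function names are choices made here, not part of any standard.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS_HASH = "0" * 64  # assumed placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, event_data: Dict[str, Any]) -> str:
    # Canonical JSON so the same event always serializes to the same bytes.
    payload = json.dumps(event_data, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(chain: List[Dict[str, Any]], event_data: Dict[str, Any]) -> None:
    prev_hash = chain[-1]["entry_hash"] if chain else GENESIS_HASH
    chain.append({
        "event_data": event_data,
        "prev_hash": prev_hash,
        "entry_hash": entry_hash(prev_hash, event_data),
    })


def verify_chain(chain: List[Dict[str, Any]]) -> int:
    """Return the index of the first invalid entry, or -1 if the chain is intact."""
    prev_hash = GENESIS_HASH
    for i, entry in enumerate(chain):
        if entry["prev_hash"] != prev_hash:
            return i  # link to the previous entry is broken
        if entry["entry_hash"] != entry_hash(prev_hash, entry["event_data"]):
            return i  # entry contents no longer match the stored hash
        prev_hash = entry["entry_hash"]
    return -1


chain: List[Dict[str, Any]] = []
for action in ("LOGIN", "VIEW_PII", "LOGOUT"):
    append_entry(chain, {"actor_id": "user-17", "action_type": action})

intact = verify_chain(chain)
chain[1]["event_data"]["action_type"] = "DELETE"  # simulate tampering
tampered_at = verify_chain(chain)
```

Serializing with sorted keys matters: if two writers serialize the same event with different key orders, verification would fail even though nothing was tampered with.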

Schema and Storage

Audit logs are write-heavy and rarely read (they are queried mainly during investigations or compliance audits), so optimize for writes. Common storage options:

  • PostgreSQL with append-only access and monthly partitioning (partition by timestamp month): old partitions can be archived to cold storage without affecting active write performance.
  • Elasticsearch: index audit logs for fast full-text and filter queries during investigations (search events by actor_id, resource_type, or time range). Sync from the primary database via CDC (Debezium).
  • S3 + Athena: archive partitioned Parquet files to S3 and query them with Athena for compliance exports.

Retention: regulatory requirements vary. HIPAA is commonly read as requiring 6 years, PCI-DSS requires at least 12 months of history with the most recent 3 months immediately available, and SOC 2 prescribes no fixed period but auditors typically expect at least 1 year. Store all events for the required period; archive older events to Glacier.
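Monthly partitions have to be created ahead of the data that lands in them. The sketch below generates the PostgreSQL statement for one month; it assumes a parent table named audit_log that was declared with PARTITION BY RANGE on its timestamp column, and the helper names month_bounds and partition_ddl are illustrative only.

```python
from datetime import date
from typing import Tuple


def month_bounds(day: date) -> Tuple[date, date]:
    """Return the first day of day's month and the first day of the next month."""
    start = day.replace(day=1)
    if start.month == 12:
        end = date(start.year + 1, 1, 1)
    else:
        end = date(start.year, start.month + 1, 1)
    return start, end


def partition_ddl(day: date, parent: str = "audit_log") -> str:
    """Build the CREATE TABLE statement for the partition covering day's month."""
    start, end = month_bounds(day)
    name = f"{parent}_{start:%Y_%m}"
    # Range partitions include the lower bound and exclude the upper bound.
    return (
        f"CREATE TABLE IF NOT EXISTS {name} PARTITION OF {parent} "
        f"FOR VALUES FROM ('{start.isoformat()}') TO ('{end.isoformat()}');"
    )


print(partition_ddl(date(2024, 12, 15)))
```

A scheduled job could run this for the upcoming month so that inserts never arrive before their partition exists.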

Implementation Patterns

Interceptor/middleware approach: add audit logging as a cross-cutting concern. In an API gateway or service middleware layer, extract the actor, action, and resource from each request/response, then write to the audit log asynchronously: publish the event to a Kafka topic and let a dedicated audit log consumer write it to the database. Never block the main request path on audit log writes. Because a pure fire-and-forget publish can drop events when the broker is unavailable, buffer and retry failed publishes rather than discarding them.

Decorator pattern: wrap repository methods with an audit decorator that automatically captures before/after state for updates. The decorator reads the current state before the update, applies the update, then logs both states.

Database-level audit: PostgreSQL triggers on sensitive tables (users, payments) automatically insert audit entries on INSERT/UPDATE/DELETE, so changes made through direct database access are audited as well, not only those made through the application.
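The decorator pattern described above might look like the following. It is a sketch under simplifying assumptions: AUDIT_SINK and USERS are in-memory stand-ins for a message queue and a data store, and the names audited and update_email are hypothetical.

```python
import copy
import functools
from typing import Any, Callable, Dict, List

# Hypothetical in-memory stand-ins: a real system would publish to a queue
# such as Kafka and load state from a real repository.
AUDIT_SINK: List[Dict[str, Any]] = []
USERS: Dict[str, Dict[str, Any]] = {"42": {"email": "old@example.com"}}


def audited(action_type: str, resource_type: str, load: Callable[[str], Any]):
    """Wrap an update function so every call records before/after state."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(actor_id: str, resource_id: str, *args, **kwargs):
            # Snapshot the state first; deepcopy so later mutation cannot alter it.
            before = copy.deepcopy(load(resource_id))
            outcome = "FAILURE"
            try:
                result = func(actor_id, resource_id, *args, **kwargs)
                outcome = "SUCCESS"
                return result
            finally:
                # Runs on success and on exception, so failures are logged too.
                AUDIT_SINK.append({
                    "actor_id": actor_id,
                    "action_type": action_type,
                    "resource_type": resource_type,
                    "resource_id": resource_id,
                    "outcome": outcome,
                    "before_state": before,
                    "after_state": copy.deepcopy(load(resource_id)),
                })
        return wrapper
    return decorator


@audited("UPDATE", "USER", load=lambda rid: USERS.get(rid))
def update_email(actor_id: str, resource_id: str, new_email: str) -> None:
    USERS[resource_id]["email"] = new_email


update_email("admin-1", "42", "new@example.com")
```

Recording the event in a finally block means a failed update still leaves a trail, which is usually what an investigator wants to see.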

Interview Tips

  • Separation: audit logs must be in a separate system from application logs. Application logs are operational (debug, warn, error); audit logs are compliance-grade (who accessed what). Different retention, access controls, and integrity requirements.
  • PII in audit logs: audit logs often contain PII (names, emails, IPs). Apply data masking for non-privileged viewers. Only compliance officers and security engineers should see full PII in audit logs.
  • Alerting on audit events: security events (multiple failed logins, bulk data export, admin privilege grant) should trigger real-time alerts via SIEM (Security Information and Event Management) systems like Splunk or Datadog SIEM.
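
To make the alerting tip concrete, here is one way a consumer of audit events could flag repeated failed logins with a sliding time window. It is only a sketch: the class name FailedLoginDetector, the default threshold, and the window length are assumptions, and events are assumed to arrive in timestamp order. A production setup would more likely express this as a rule inside the SIEM itself.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class FailedLoginDetector:
    """Flag an actor once enough failed logins fall inside a time window."""

    def __init__(self, threshold: int = 5, window_seconds: float = 60.0):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._failures: Dict[str, Deque[float]] = defaultdict(deque)

    def observe(self, actor_id: str, action_type: str, outcome: str, ts: float) -> bool:
        """Return True when this event brings the actor to the alert threshold."""
        if action_type != "LOGIN" or outcome != "FAILURE":
            return False
        window = self._failures[actor_id]
        window.append(ts)
        # Drop failures that are older than the window relative to this event.
        while window and ts - window[0] > self.window_seconds:
            window.popleft()
        return len(window) >= self.threshold


detector = FailedLoginDetector(threshold=3, window_seconds=60.0)
alerts = [
    detector.observe("user-17", "LOGIN", "FAILURE", ts)
    for ts in (0.0, 10.0, 20.0, 100.0)
]
```

The last event in the example does not alert because the three earlier failures have aged out of the 60-second window by then.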

