Which Sorting Algorithm for Interviews?
You rarely implement a sorting algorithm from scratch in interviews — Python’s sorted() and Java’s Arrays.sort() are fine. But you must understand: (1) which algorithm is used internally and why (Timsort for Python, dual-pivot quicksort for Java primitives), (2) how to provide custom comparators for complex sort keys, (3) how to implement merge sort or quicksort when the interviewer asks (divide-and-conquer pattern), and (4) when to use non-comparison sorts (counting sort, radix sort) to beat the O(n log n) lower bound for comparison-based sorts. Knowing these puts you above candidates who just call sort() without understanding it.
Merge Sort: Stable, Divide-and-Conquer
Merge Sort: split array in half, sort each half recursively, merge. O(n log n) always (no bad cases). Stable: equal elements preserve relative order. O(n) extra space for the merge step. The merge step is the key: maintain two pointers, one for each half, advance the smaller one.
def merge_sort(arr):
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])
    return merge(left, right)

def merge(left, right):
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:  # <= (not <) keeps the sort stable
            result.append(left[i]); i += 1
        else:
            result.append(right[j]); j += 1
    result.extend(left[i:])
    result.extend(right[j:])
    return result
Use merge sort when: you need a stable sort on linked lists (no random access), you need to count inversions (modified merge: count right-before-left picks), or you need an external sort (files too large for RAM: sort chunks, then combine with a k-way merge). The "count inversions" problem uses merge sort directly: during the merge, when right[j] is picked before left[i], it contributes (len(left) - i) inversions, one for each element still remaining in the left half.
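The inversion-counting variant described above can be sketched as a small modification of the merge step. This is a minimal sketch; the function name is illustrative:

```python
def count_inversions(arr):
    """Return (sorted_arr, inversion_count) via a modified merge sort."""
    if len(arr) <= 1:
        return arr, 0
    mid = len(arr) // 2
    left, inv_left = count_inversions(arr[:mid])
    right, inv_right = count_inversions(arr[mid:])
    merged, inv = [], inv_left + inv_right
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            # right[j] jumps ahead of every element still in the left half
            inv += len(left) - i
            merged.append(right[j]); j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, inv
```

The only change from the plain merge is the one `inv += len(left) - i` line, so the total stays O(n log n).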
Quick Sort: Fast In-Place, Unstable
Quick Sort: pick a pivot, partition around it, recurse on each side. O(n log n) average, O(n^2) worst case (e.g., already-sorted input with a fixed pivot). O(log n) expected stack space (in-place partition). Key: pivot selection. A random pivot avoids the worst case with very high probability. Three-way partition (Dutch National Flag): handles duplicate elements by splitting the array into regions less than, equal to, and greater than the pivot, recursing only on the outer two. This avoids O(n^2) on arrays with many duplicates (e.g., [3,3,3,…,3]).
import random

def quicksort(arr, lo, hi):
    if lo >= hi:
        return
    pivot_idx = partition(arr, lo, hi)
    quicksort(arr, lo, pivot_idx - 1)
    quicksort(arr, pivot_idx + 1, hi)

def partition(arr, lo, hi):
    # Random pivot to avoid the O(n^2) worst case
    rand = random.randint(lo, hi)
    arr[rand], arr[hi] = arr[hi], arr[rand]
    pivot = arr[hi]
    i = lo - 1
    for j in range(lo, hi):
        if arr[j] <= pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[hi] = arr[hi], arr[i + 1]
    return i + 1
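The three-way (Dutch National Flag) variant mentioned above can be sketched as follows; the function name is illustrative, and the invariants are noted in comments:

```python
import random

def quicksort_3way(arr, lo=0, hi=None):
    """Quicksort with a three-way partition; fast on arrays with many duplicates."""
    if hi is None:
        hi = len(arr) - 1
    if lo >= hi:
        return
    pivot = arr[random.randint(lo, hi)]
    # Invariants: arr[lo:lt] < pivot, arr[lt:i] == pivot, arr[gt+1:hi+1] > pivot
    lt, i, gt = lo, lo, hi
    while i <= gt:
        if arr[i] < pivot:
            arr[lt], arr[i] = arr[i], arr[lt]
            lt += 1; i += 1
        elif arr[i] > pivot:
            arr[i], arr[gt] = arr[gt], arr[i]
            gt -= 1  # don't advance i: the swapped-in element is unexamined
        else:
            i += 1
    # Equal-to-pivot block is already in place; recurse only on the outer regions
    quicksort_3way(arr, lo, lt - 1)
    quicksort_3way(arr, gt + 1, hi)
```

On an all-equal array the equal region swallows everything in one pass, so each element is touched O(1) times instead of triggering O(n) recursion levels.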
Heap Sort and the Heap Property
Heap Sort: build a max-heap in O(n), then repeatedly extract the max (O(log n) each) to sort in O(n log n). In-place, O(1) extra space. Not stable. Rarely used in practice (poor cache performance vs merge sort), but the heap-building algorithm is important: heapify from the bottom up in O(n) — NOT by inserting one-by-one (which is O(n log n)). The O(n) build-heap proof: summing the work at each level of the heap gives O(n) via geometric series. Real use case: finding the k largest elements. Build a min-heap of size k. For each remaining element: if > heap[0]: replace heap[0] and heapify. O(n log k) — better than sorting all n elements.
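The k-largest pattern above can be sketched with the standard-library heapq module (which provides a min-heap); the function name is illustrative:

```python
import heapq

def k_largest(nums, k):
    """Top-k elements via a size-k min-heap: O(n log k) time, O(k) space."""
    heap = nums[:k]
    heapq.heapify(heap)  # O(k) bottom-up build
    for x in nums[k:]:
        if x > heap[0]:  # beats the smallest of the current top-k
            heapq.heapreplace(heap, x)  # pop min and push x in one O(log k) step
    return sorted(heap, reverse=True)
```

For one-off use, heapq.nlargest(k, nums) does the same job; writing it out shows the min-heap-of-size-k idea interviewers ask about.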
Custom Comparators and Non-Comparison Sorts
Custom comparators: sort by multiple keys, or by a derived property. Python: sorted(items, key=lambda x: (x.priority, -x.timestamp)). For a comparator that doesn't reduce to a simple key (e.g., "largest number from digits"), define a comparator function and use functools.cmp_to_key. LC 179 (Largest Number): place string a before string b when the concatenation a+b is lexicographically greater than b+a.
Counting sort: when values are bounded integers in [0, K]. O(n + K) time, O(K) space. No comparisons needed. Use for: sorting characters in a string, sorting by small integer keys (grades, ages).
Radix sort: sort integers digit by digit using a stable sort (counting sort) for each digit. O(d * (n + K)) where d = number of digits, K = base (10 or 256). Beats O(n log n) when d is small.
Bucket sort: distribute elements into buckets (ranges), sort each bucket, concatenate. O(n) average for uniformly distributed data. Use for: sorting floating-point numbers in [0, 1).
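The cmp_to_key comparator for LC 179 and a basic counting sort can be sketched as follows; both function names are illustrative:

```python
from functools import cmp_to_key

def largest_number(nums):
    """LC 179: arrange numbers so their concatenation is maximal."""
    strs = [str(n) for n in nums]
    # a before b iff a+b > b+a as strings; cmp_to_key turns this into a sort key
    strs.sort(key=cmp_to_key(
        lambda a, b: -1 if a + b > b + a else (1 if a + b < b + a else 0)))
    result = "".join(strs)
    return "0" if result[0] == "0" else result  # collapse the all-zeros case

def counting_sort(nums, K):
    """Sort integers in [0, K] in O(n + K) time with no comparisons."""
    counts = [0] * (K + 1)
    for x in nums:
        counts[x] += 1
    out = []
    for value, count in enumerate(counts):
        out.extend([value] * count)
    return out
```

For example, largest_number([3, 30, 34, 5, 9]) yields "9534330": "34" sorts before "3" because "343" > "334", and "3" before "30" because "330" > "303".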
FAQ
Q: What sorting algorithm does Python's sorted() use and why?
A: Python uses Timsort, a hybrid of merge sort and insertion sort. Timsort exploits natural runs (already-sorted sequences) in real-world data. It has O(n log n) worst case, O(n) best case (already sorted), is stable, and uses O(n) extra space. Java uses dual-pivot quicksort for primitive arrays (faster in practice, not stable) and Timsort for object arrays (stable, needed for correctness with Comparator).
Q: How do you count inversions in an array using merge sort?
A: During the merge step, when an element from the right half is placed before elements remaining in the left half, it creates inversions with all remaining left-half elements. Add len(left) - i to the inversion count when right[j] < left[i]. This modification adds O(1) work per merge step, keeping the total O(n log n). The inversion count equals the number of swaps bubble sort would need to sort the array.
Q: When should you use counting sort instead of comparison-based sorts?
A: Use counting sort when values are bounded integers in a known range [0, K] and K is not much larger than n. It runs in O(n + K) time without any comparisons. Good use cases: sorting characters in a string (K = 26 or 256), sorting ages, grades, or other small-range integers. Avoid it when K >> n (a mostly empty count array wastes space) or when the values are not integers.
Q: How does the "largest number" problem (LC 179) require a custom comparator?
A: The goal is to arrange numbers so their concatenation is largest. For two numbers a and b, compare str(a)+str(b) vs str(b)+str(a); if str(a)+str(b) > str(b)+str(a), a should come first. Use functools.cmp_to_key to convert this comparator function into a key function for Python's sorted(). The comparator defines a total order that, when applied, produces the lexicographically largest concatenation.
Q: Why is heap sort rarely used in practice despite being O(n log n) in-place?
A: Heap sort has poor cache performance. Its access pattern (jumping between parent and child nodes in a heap stored as an array) causes frequent cache misses, especially for large arrays. Merge sort and quicksort have much better cache locality: they access elements sequentially. In benchmarks, introsort (quicksort with a heap sort fallback for the worst case) or Timsort typically outperforms pure heap sort on real data.