Which Sorting Algorithm for Interviews?
You rarely implement a sorting algorithm from scratch in interviews — Python’s sorted() and Java’s Arrays.sort() are fine. But you must understand: (1) which algorithm is used internally and why (Timsort for Python, dual-pivot quicksort for Java primitives), (2) how to provide custom comparators for complex sort keys, (3) how to implement merge sort or quicksort when the interviewer asks (divide-and-conquer pattern), and (4) when to use non-comparison sorts (counting sort, radix sort) to beat the O(n log n) lower bound for comparison-based sorts. Knowing these puts you above candidates who just call sort() without understanding it.
Merge Sort: Stable, Divide-and-Conquer
Merge Sort: split array in half, sort each half recursively, merge. O(n log n) always (no bad cases). Stable: equal elements preserve relative order. O(n) extra space for the merge step. The merge step is the key: maintain two pointers, one for each half, advance the smaller one.
def merge_sort(arr):
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])
    return merge(left, right)

def merge(left, right):
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:  # <= (not <) keeps the sort stable
            result.append(left[i]); i += 1
        else:
            result.append(right[j]); j += 1
    result.extend(left[i:])
    result.extend(right[j:])
    return result
Use merge sort when: you need a stable sort on linked lists (no random access), you need to count inversions (modified merge: count right-before-left picks), or you need an external sort (files too large for RAM: sort chunks, then combine with a k-way merge). The "count inversions" problem uses merge sort directly: during the merge, when right[j] is picked before left[i], it contributes (len(left) - i) inversions, one for each element still remaining in the left half.
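The inversion-counting variant described above can be sketched as a small modification of the merge step. This is a minimal sketch; the function name is illustrative:

```python
def count_inversions(arr):
    """Return (sorted_arr, inversion_count) via a modified merge sort."""
    if len(arr) <= 1:
        return arr, 0
    mid = len(arr) // 2
    left, inv_left = count_inversions(arr[:mid])
    right, inv_right = count_inversions(arr[mid:])
    merged, inv = [], inv_left + inv_right
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            # right[j] jumps ahead of every element still in the left half
            inv += len(left) - i
            merged.append(right[j]); j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, inv
```

The only change from the plain merge is the one `inv += len(left) - i` line, so the total stays O(n log n).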
Quick Sort: Fast In-Place, Unstable
Quick Sort: pick a pivot, partition around it, recurse on each side. O(n log n) average, O(n^2) worst case (e.g., already-sorted input with a fixed pivot). O(log n) expected stack space (in-place partition). Key: pivot selection. A random pivot avoids the worst case with very high probability. Three-way partition (Dutch National Flag): handles duplicate elements by splitting the array into regions less than, equal to, and greater than the pivot, recursing only on the outer two. This avoids O(n^2) on arrays with many duplicates (e.g., [3,3,3,…,3]).
import random

def quicksort(arr, lo, hi):
    if lo >= hi:
        return
    pivot_idx = partition(arr, lo, hi)
    quicksort(arr, lo, pivot_idx - 1)
    quicksort(arr, pivot_idx + 1, hi)

def partition(arr, lo, hi):
    # Random pivot to avoid the O(n^2) worst case
    rand = random.randint(lo, hi)
    arr[rand], arr[hi] = arr[hi], arr[rand]
    pivot = arr[hi]
    i = lo - 1
    for j in range(lo, hi):
        if arr[j] <= pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[hi] = arr[hi], arr[i + 1]
    return i + 1
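The three-way (Dutch National Flag) variant mentioned above can be sketched as follows; the function name is illustrative, and the invariants are noted in comments:

```python
import random

def quicksort_3way(arr, lo=0, hi=None):
    """Quicksort with a three-way partition; fast on arrays with many duplicates."""
    if hi is None:
        hi = len(arr) - 1
    if lo >= hi:
        return
    pivot = arr[random.randint(lo, hi)]
    # Invariants: arr[lo:lt] < pivot, arr[lt:i] == pivot, arr[gt+1:hi+1] > pivot
    lt, i, gt = lo, lo, hi
    while i <= gt:
        if arr[i] < pivot:
            arr[lt], arr[i] = arr[i], arr[lt]
            lt += 1; i += 1
        elif arr[i] > pivot:
            arr[i], arr[gt] = arr[gt], arr[i]
            gt -= 1  # don't advance i: the swapped-in element is unexamined
        else:
            i += 1
    # Equal-to-pivot block is already in place; recurse only on the outer regions
    quicksort_3way(arr, lo, lt - 1)
    quicksort_3way(arr, gt + 1, hi)
```

On an all-equal array the equal region swallows everything in one pass, so each element is touched O(1) times instead of triggering O(n) recursion levels.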
Heap Sort and the Heap Property
Heap Sort: build a max-heap in O(n), then repeatedly extract the max (O(log n) each) to sort in O(n log n). In-place, O(1) extra space. Not stable. Rarely used in practice (poor cache performance vs merge sort), but the heap-building algorithm is important: heapify from the bottom up in O(n) — NOT by inserting one-by-one (which is O(n log n)). The O(n) build-heap proof: summing the work at each level of the heap gives O(n) via geometric series. Real use case: finding the k largest elements. Build a min-heap of size k. For each remaining element: if > heap[0]: replace heap[0] and heapify. O(n log k) — better than sorting all n elements.
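The k-largest pattern above can be sketched with the standard-library heapq module (which provides a min-heap); the function name is illustrative:

```python
import heapq

def k_largest(nums, k):
    """Top-k elements via a size-k min-heap: O(n log k) time, O(k) space."""
    heap = nums[:k]
    heapq.heapify(heap)  # O(k) bottom-up build
    for x in nums[k:]:
        if x > heap[0]:  # beats the smallest of the current top-k
            heapq.heapreplace(heap, x)  # pop min and push x in one O(log k) step
    return sorted(heap, reverse=True)
```

For one-off use, heapq.nlargest(k, nums) does the same job; writing it out shows the min-heap-of-size-k idea interviewers ask about.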
Custom Comparators and Non-Comparison Sorts
Custom comparators: sort by multiple keys, or by a derived property. Python: sorted(items, key=lambda x: (x.priority, -x.timestamp)). For a comparator that doesn't reduce to a simple key (e.g., "largest number from digits"), define a comparator function and use functools.cmp_to_key. LC 179 (Largest Number): place string a before string b when the concatenation a+b is lexicographically greater than b+a.
Counting sort: when values are bounded integers in [0, K]. O(n + K) time, O(K) space. No comparisons needed. Use for: sorting characters in a string, sorting by small integer keys (grades, ages).
Radix sort: sort integers digit by digit using a stable sort (counting sort) for each digit. O(d * (n + K)) where d = number of digits, K = base (10 or 256). Beats O(n log n) when d is small.
Bucket sort: distribute elements into buckets (ranges), sort each bucket, concatenate. O(n) average for uniformly distributed data. Use for: sorting floating-point numbers in [0, 1).
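The cmp_to_key comparator for LC 179 and a basic counting sort can be sketched as follows; both function names are illustrative:

```python
from functools import cmp_to_key

def largest_number(nums):
    """LC 179: arrange numbers so their concatenation is maximal."""
    strs = [str(n) for n in nums]
    # a before b iff a+b > b+a as strings; cmp_to_key turns this into a sort key
    strs.sort(key=cmp_to_key(
        lambda a, b: -1 if a + b > b + a else (1 if a + b < b + a else 0)))
    result = "".join(strs)
    return "0" if result[0] == "0" else result  # collapse the all-zeros case

def counting_sort(nums, K):
    """Sort integers in [0, K] in O(n + K) time with no comparisons."""
    counts = [0] * (K + 1)
    for x in nums:
        counts[x] += 1
    out = []
    for value, count in enumerate(counts):
        out.extend([value] * count)
    return out
```

For example, largest_number([3, 30, 34, 5, 9]) yields "9534330": "34" sorts before "3" because "343" > "334", and "3" before "30" because "330" > "303".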
FAQ
Q: What sorting algorithm does Python's sorted() use and why?
A: Python uses Timsort, a hybrid of merge sort and insertion sort. Timsort exploits natural runs (already-sorted sequences) in real-world data. It has O(n log n) worst case, O(n) best case (already sorted), is stable, and uses O(n) extra space. Java uses dual-pivot quicksort for primitive arrays (faster in practice, not stable) and Timsort for object arrays (stable, needed for correctness with Comparator).
Q: How do you count inversions in an array using merge sort?
A: During the merge step, when an element from the right half is placed before elements remaining in the left half, it creates inversions with all remaining left-half elements. Add len(left) - i to the inversion count when right[j] < left[i]. This modification adds O(1) work per merge step, keeping the total O(n log n). The inversion count equals the number of swaps bubble sort would need to sort the array.
Q: When should you use counting sort instead of comparison-based sorts?
A: Use counting sort when values are bounded integers in a known range [0, K] and K is not much larger than n. It runs in O(n + K) time without any comparisons. Good use cases: sorting characters in a string (K = 26 or 256), sorting ages, grades, or other small-range integers. Avoid it when K >> n (a mostly empty count array wastes space) or when the values are not integers.
Q: How does the "largest number" problem (LC 179) require a custom comparator?
A: The goal is to arrange numbers so their concatenation is largest. For two numbers a and b, compare str(a)+str(b) vs str(b)+str(a); if str(a)+str(b) > str(b)+str(a), a should come first. Use functools.cmp_to_key to convert this comparator function into a key function for Python's sorted(). The comparator defines a total order that, when applied, produces the lexicographically largest concatenation.
Q: Why is heap sort rarely used in practice despite being O(n log n) in-place?
A: Heap sort has poor cache performance. Its access pattern (jumping between parent and child nodes in a heap stored as an array) causes frequent cache misses, especially for large arrays. Merge sort and quicksort have much better cache locality: they access elements sequentially. In benchmarks, introsort (quicksort with a heap sort fallback for the worst case) or Timsort typically outperforms pure heap sort on real data.