Graph BFS and DFS Interview Patterns: Islands, Shortest Path, Cycle Detection

Graph BFS and DFS Interview Patterns

Graph traversal is one of the most tested interview topics. BFS (Breadth-First Search) and DFS (Depth-First Search) are the two fundamental algorithms — all other graph algorithms build on them. The key skill: recognize which traversal is appropriate and handle visited state correctly.

  • Coinbase Interview Guide
  • Databricks Interview Guide
  • LinkedIn Interview Guide
  • Uber Interview Guide
  • Apple Interview Guide
  • Meta Interview Guide
  • BFS Template

    from collections import deque
    
    def bfs(graph, start):
        visited = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            process(node)
            for neighbor in graph[node]:
                if neighbor not in visited:
                    visited.add(neighbor)
                    queue.append(neighbor)
    

    BFS explores level by level — all nodes at distance 1, then distance 2, etc. This guarantees shortest path in unweighted graphs. Time: O(V+E). Space: O(V) for the queue.

    DFS Template (Iterative and Recursive)

    # Recursive DFS
    def dfs(graph, node, visited):
        visited.add(node)
        process(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                dfs(graph, neighbor, visited)
    
    # Iterative DFS (avoids recursion limit)
    def dfs_iterative(graph, start):
        visited = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in visited: continue
            visited.add(node)
            process(node)
            for neighbor in graph[node]:
                if neighbor not in visited:
                    stack.append(neighbor)
    

    Pattern 1: Number of Islands (Grid BFS/DFS)

    def num_islands(grid):
        rows, cols = len(grid), len(grid[0])
        count = 0
        def dfs(r, c):
            if r = rows or c = cols or grid[r][c] != '1': return
            grid[r][c] = '0'  # mark visited by sinking
            for dr, dc in [(0,1),(0,-1),(1,0),(-1,0)]:
                dfs(r+dr, c+dc)
        for r in range(rows):
            for c in range(cols):
                if grid[r][c] == '1':
                    dfs(r, c)
                    count += 1
        return count
    

    Pattern 2: Shortest Path in Grid (BFS)

    def shortest_path(grid, start, end):
        rows, cols = len(grid), len(grid[0])
        queue = deque([(start[0], start[1], 0)])  # (row, col, distance)
        visited = {(start[0], start[1])}
        while queue:
            r, c, dist = queue.popleft()
            if (r, c) == (end[0], end[1]): return dist
            for dr, dc in [(0,1),(0,-1),(1,0),(-1,0)]:
                nr, nc = r+dr, c+dc
                if 0<=nr<rows and 0<=nc<cols and grid[nr][nc]==0 and (nr,nc) not in visited:
                    visited.add((nr,nc))
                    queue.append((nr,nc,dist+1))
        return -1  # no path
    

    BFS guarantees shortest path for unit-weight edges. Add distance to queue state — do not track distances separately.

    Pattern 3: Course Schedule (Cycle Detection with DFS)

    def can_finish(numCourses, prerequisites):
        graph = [[] for _ in range(numCourses)]
        for a, b in prerequisites: graph[b].append(a)
        # 0=unvisited, 1=in progress, 2=done
        state = [0] * numCourses
        def dfs(node):
            if state[node] == 1: return False  # cycle
            if state[node] == 2: return True   # already processed
            state[node] = 1
            for neighbor in graph[node]:
                if not dfs(neighbor): return False
            state[node] = 2
            return True
        return all(dfs(i) for i in range(numCourses))
    

    Three-color DFS: unvisited (0), in-progress/gray (1), done/black (2). A back edge to a gray node = cycle.

    Pattern 4: Multi-Source BFS

    # "01 Matrix": distance to nearest 0 for each cell
    def update_matrix(mat):
        rows, cols = len(mat), len(mat[0])
        dist = [[float('inf')]*cols for _ in range(rows)]
        queue = deque()
        for r in range(rows):
            for c in range(cols):
                if mat[r][c] == 0:
                    dist[r][c] = 0
                    queue.append((r, c))    # seed ALL zeros simultaneously
        while queue:
            r, c = queue.popleft()
            for dr, dc in [(0,1),(0,-1),(1,0),(-1,0)]:
                nr, nc = r+dr, c+dc
                if 0<=nr<rows and 0<=nc dist[r][c]+1:
                    dist[nr][nc] = dist[r][c]+1
                    queue.append((nr,nc))
        return dist
    

    Add ALL source nodes to the queue before starting BFS. The wave expands simultaneously from all sources. Equivalent to adding a virtual source node connected to all real sources.

    BFS vs DFS: When to Use Which

    • BFS: shortest path in unweighted graph, level-order traversal, minimum steps
    • DFS: cycle detection, topological sort, connected components, maze solving (any path)
    • Both work: number of islands, number of connected components

    {
    “@context”: “https://schema.org”,
    “@type”: “FAQPage”,
    “mainEntity”: [
    {
    “@type”: “Question”,
    “name”: “Why does BFS find the shortest path in an unweighted graph but DFS does not?”,
    “acceptedAnswer”: { “@type”: “Answer”, “text”: “BFS explores nodes in order of their distance from the source: all nodes at distance 1 are processed before any node at distance 2, all distance-2 nodes before distance-3, and so on. This level-by-level expansion guarantees that the first time BFS reaches a node, it has found the shortest path to that node. Proof: suppose BFS first reaches node v via path P of length d. If there were a shorter path P' of length d-1, BFS would have reached v when processing nodes at level d-1, before processing level d. Contradiction. DFS explores as deep as possible before backtracking. It finds *a* path to the target, but not necessarily the shortest — it might explore a long path first and reach the target before exploring shorter paths. To find the shortest path with DFS, you would need to explore all paths and take the minimum (exponential time). In weighted graphs, BFS no longer finds shortest paths (a longer path in hops might have smaller total weight). Use Dijkstra for weighted non-negative edges.” }
    },
    {
    “@type”: “Question”,
    “name”: “How does three-color DFS detect cycles and why does it use three states instead of two?”,
    “acceptedAnswer”: { “@type”: “Answer”, “text”: “Two-state visited (visited/unvisited) detects if a node has been seen, but cannot distinguish between a back edge (cycle) and a cross edge (to a completed subtree). Three-color DFS: WHITE (0) = unvisited; GRAY (1) = currently being processed (on the DFS stack); BLACK (2) = fully processed (all descendants visited). A back edge exists if during DFS we encounter a GRAY node — a node that is an ancestor in the current DFS stack, meaning we have a cycle. A cross/forward edge to a BLACK node is not a cycle (that subtree has no path back to the current node, since it was fully processed without finding a cycle). Algorithm: on entering a node, set GRAY. For each neighbor: if GRAY, cycle found. If WHITE, recurse. If BLACK, skip. On exiting a node, set BLACK. Two-state visited would mark all visited nodes as "done" and could not distinguish a back edge (to a node on the current recursion stack) from a cross edge (to a node in a completed, separate subtree). The GRAY state specifically marks nodes currently on the recursion stack.” }
    },
    {
    “@type”: “Question”,
    “name”: “What is multi-source BFS and what problems does it solve that single-source BFS cannot?”,
    “acceptedAnswer”: { “@type”: “Answer”, “text”: “Single-source BFS finds shortest distances from one source node to all other nodes. Multi-source BFS seeds the queue with multiple source nodes simultaneously before starting. This is equivalent to adding a virtual super-source S connected to all real sources with zero-cost edges, then running single-source BFS from S — but without the overhead of an extra node. What multi-source BFS solves: problems asking "for each node, what is the distance to the nearest source?" when there are multiple sources. Examples: (1) 01 Matrix (LeetCode 542): distance to nearest 0 for each cell — seed all zeros simultaneously. (2) Rotting Oranges: time for all oranges to rot — seed all initially rotten oranges. (3) Walls and Gates: fill each room with the distance to the nearest gate — seed all gates. (4) Nearest land from water: seed all land cells, compute distance for each water cell. The key implementation detail: add ALL source nodes to the queue BEFORE starting the BFS loop. This ensures they are all treated as distance-0 simultaneously, and the wave expands outward from all sources at the same speed.” }
    }
    ]
    }

    🏢 Asked at: Atlassian Interview Guide

    Scroll to Top