PATTERN 15: Union–Find / Disjoint Set (Connectivity Thinking)

D

Qubits of DPK

April 11, 2026

Core DSA

Core Idea (lock this)

“I need to
You are not traversing paths.
You are maintaining connectivity information dynamically.
If BFS/DFS answers reachability,
Union–Find answers connectivity.

How to Recognize This Pattern (CRITICAL)

You should instantly think Union–Find (DSU) when:
  • The problem talks about:
  • You repeatedly need to check:
  • Graph edges are added incrementally
  • Order of traversal does not matter
Trigger words:
  • “union”
  • “connected”
  • “components”
  • “merge”
  • “same group”

30 MUST-DO INTERVIEW PROBLEMS

(10 Easy → 10 Medium → 10 Hard)

EASY (Mechanics & Cycle Detection)

Goal: make find() and union() automatic.
  1. #
    Redundant Connection – LC 684
  2. #
    Satisfiability of Equality Equations – LC 990
  3. #
    Most Stones Removed with Same Row or Column – LC 947
  4. #
    Number of Operations to Make Network Connected – LC 1319
  5. #
    Smallest String With Swaps – LC 1202
  6. #
    Lexicographically Smallest Equivalent String – LC 1061
  7. #
    Find if Path Exists in Graph (DSU view) – LC 1971
  8. #
    Count Connected Components (DSU view) – LC 323
  9. #
    Graph Valid Tree (DSU view) – LC 261
  10. #
    Number of Provinces (DSU view) – LC 547
Now EASY covers:
  • Basic connectivity
  • Cycle detection
  • String grouping
  • Swap grouping
  • Component counting

️ MEDIUM (DSU as Core Strategy)

Now DSU becomes necessary.
  1. #
    Accounts Merge – LC 721
  2. #
    Similar String Groups – LC 839
  3. #
    Regions Cut by Slashes – LC 959
  4. #
    Sentence Similarity II – LC 737
  5. #
    Couples Holding Hands – LC 765
  6. #
    Earliest Moment When Everyone Becomes Friends – LC 1101
  7. #
    Remove Max Number of Edges to Keep Graph Fully Traversable – LC 1579
  8. #
    Optimize Water Distribution in a Village – LC 1168
  9. #
    Minimize Malware Spread – LC 924
  10. #
    Largest Component Size by Common Factor – LC 952
This section forces:
  • Smart modeling
  • Component weighting
  • Graph + DSU hybrid
  • Union ordering logic

HARD (Dynamic & Advanced Connectivity)

  1. #
    Number of Islands II – LC 305
  2. #
    Bricks Falling When Hit – LC 803
  3. #
    Number of Good Paths – LC 2421
  4. #
    Checking Existence of Edge Length Limited Paths – LC 1697
  5. #
    Path With Minimum Effort (Union-Find view) – LC 1631
  6. #
    Critical and Pseudo-Critical Edges in MST – LC 1489
  7. #
    Graph Connectivity With Threshold – LC 1627
  8. #
    Making A Large Island – LC 827
  9. #
    Kruskal’s Algorithm (MST implementation) – GFG
  10. #
    Count Sub Islands (Union-Find approach) – LC 1905
Now HARD truly tests:
  • Dynamic addition
  • Offline queries
  • DSU + sorting
  • Reverse operations
  • Component size tracking

The ONE Rule That Wins Interviews

Always be able to answer:
“What does one
If you can define that clearly,
your DSU model is correct.

Common Interview Mistakes

  • Using DFS when dynamic merging is needed
  • Forgetting path compression
  • Treating directed graphs like undirected in DSU
  • Unioning blindly without rank / size logic

INTERVIEW REVISION CARD — Pattern 15

1️⃣ Pattern in One Line

"Dynamically group elements into connected components — find() checks membership, union() merges groups."

2️⃣ Recognize It When...

  • Dynamic connectivity — edges added incrementally
  • Trigger words: "connected components", "same group", "merge", "union", "cycle in undirected graph"
  • Order of traversal doesn't matter
  • Need to check "are X and Y connected?" repeatedly

3️⃣ Don't Confuse With...

  • Graph DFS/BFS (P14) → DFS/BFS for static traversal; DSU for dynamic merging
  • Topological Sort (P14A) → ordering; DSU is about grouping
  • Hashing (P2) → hashing stores state; DSU tracks component membership

4️⃣ Approach in 3 Steps

  1. #
    Initialize parent[i] = i, rank[i] = 0 for all nodes
  2. #
    find(x) with path compression — flatten tree during lookup
  3. #
    union(x, y) with rank — attach smaller tree under larger

5️⃣ Invariant to Maintain

"At every step, find(x) always returns the true root of x's component — path compression keeps this efficient."

6️⃣ Complexity to Justify

  • find/union: O(α(N)) ≈ O(1) amortized with path compression + rank
  • Space: O(N)
  • Why: Path compression + union by rank gives nearly constant time

7️⃣ Edge Cases to Always Check

  • Self-loop — union(x, x) should do nothing
  • Already connected nodes — union should not create cycle
  • Disconnected graph — multiple components remain
  • Directed vs undirected — DSU only for undirected
  • Component size tracking needed?

8️⃣ One Line to Say in Interview

"I'll use Union-Find with path compression and union by rank — near O(1) per operation, ideal for dynamic connectivity queries."