Wednesday, January 12, 2011

UNION-FIND Algorithm

Union-Find Algorithm uses a disjoint-set data structure to solve a problem. We are allowed to merge any two items to consider them equal. This is a common way to find connectivity between nodes, or to find connected components.

Basically a Union-Find data structure implements two functions:
  • union( A, B ) - merge A's set with B's set
  • find( A ) - finds what set A belongs to
The most naive method involves using multiple lists. To merge two elements, we will find which lists they are in, and then append one of them to the other. To check for equality, we will again find which lists the two elements are in, and check that they are the same list.

The first optimization is union by rank. The first optimization is known as union by rank. We augment the node structure by adding a new parameter rank, which is initialized to 0 for all elements. When we want to merge two elements, we will first find the representative elements. If they are different, then check their ranks. If one is greater than the other, then set the parent of the node with smaller rank to be the node with greater rank. Otherwise, arbitrarily pick one node. Set its parent to be the other, and increment that other node's rank by 1. This heuristic attempts to add smaller trees to larger trees, meaning that when we need to run a find operation, the depth of the tree is as small as possible.

Reference:
http://www.algorithmist.com/index.php/Union_Find