Hierarchical Clustering in Data Mining
Hierarchical clustering is an unsupervised learning technique used to
group similar data points into clusters. It organizes the data in a
tree-like structure called a dendrogram. In this method, clusters are
formed step by step based on the similarity between data points. At the beginning, each data point is treated as a separate cluster.
Gradually, similar clusters are merged together or split apart depending
on the clustering method used.
Hierarchical clustering mainly has two types:
- Agglomerative Hierarchical Clustering
- Divisive Hierarchical Clustering
1. Agglomerative Hierarchical Clustering
Agglomerative clustering is the most commonly used hierarchical
clustering technique. It follows a bottom-up approach, where each data point starts as an individual
cluster and similar clusters are gradually merged together.
This method is also called AGNES (Agglomerative Nesting). At every step, the two clusters that are most similar are combined
until all data points form one single cluster.
Algorithm for Agglomerative Hierarchical Clustering
- Calculate the similarity or distance between all data points (create a proximity matrix).
- Consider each data point as an individual cluster.
- Find the two clusters that are most similar.
- Merge those clusters into a single cluster.
- Recalculate the distances between the new cluster and other clusters.
- Repeat steps 3–5 until only one cluster remains.
Example Explanation
Assume we have six data points:
P, Q, R, S, T, V
Step 1:
Each point is treated as a separate cluster:
(P), (Q), (R), (S), (T), (V)
Step 2:
The closest clusters are merged. Suppose Q and R are similar.
Clusters become:
(P), (QR), (S), (T), (V)
Step 3:
Next, S and T are merged.
Clusters become:
(P), (QR), (ST), (V)
Step 4:
Clusters ST and V are merged.
Clusters become:
(P), (QR), (STV)
Step 5:
Clusters QR and STV are merged.
Clusters become:(P), (QRSTV)
Step 6:
Finally, P merges with QRSTV to form one cluster:
(PQRSTV)
This merging process can be visualized using a dendrogram diagram.
2. Divisive Hierarchical Clustering
Divisive hierarchical clustering works in the opposite way compared to
agglomerative clustering.
It follows a top-down approach:
Initially, all data points are placed in one cluster.
- The algorithm then splits the cluster into smaller clusters based on dissimilarity.
- This splitting continues until each data point becomes its own cluster or the desired number of clusters is reached a dendrogram diagram.
Advantages of Hierarchical Clustering
- Easy to understand and implement.
- Produces a hierarchical structure (dendrogram) which provides more information about data relationships.
- The number of clusters does not need to be specified in advance.
Disadvantages of Hierarchical Clustering
- Not suitable for very large datasets.
- Difficult to handle clusters with different sizes and shapes.
- Sensitive to noise and outliers.
- Once clusters are merged or split, the process cannot be reversed.