Hierarchical Clustering in Data Mining
Hierarchical clustering is an unsupervised learning technique used to group
similar data points
into clusters. It organizes the data in a tree-like structure called a
dendrogram. In this method,
clusters are formed step by step based on the similarity between data
points.
At the beginning, each data point is treated as a separate cluster.
Gradually, similar clusters are
merged together or split apart depending on the clustering method
used.
Hierarchical clustering mainly has two types:
- Agglomerative Hierarchical Clustering
- Divisive Hierarchical Clustering
1. Agglomerative Hierarchical Clustering
Agglomerative clustering is the most commonly used hierarchical clustering
technique. It follows a bottom-up approach, where each data point starts as
an individual cluster and similar clusters are gradually merged
together.
This method is also called AGNES (Agglomerative Nesting).
At every step, the two clusters that are most similar are combined until
all data points form one single cluster.
Algorithm for Agglomerative Hierarchical Clustering
- Calculate the similarity or distance between all data points (create a proximity matrix).
- Consider each data point as an individual cluster.
- Find the two clusters that are most similar.
- Merge those clusters into a single cluster.
- Recalculate the distances between the new cluster and other clusters.
- Repeat steps 3–5 until only one cluster remains.
Example Explanation
Assume we have six data points:
P, Q, R, S, T, V
Step 1:
Each point is treated as a separate cluster:
(P), (Q), (R), (S), (T), (V)
Step 2:
The closest clusters are merged. Suppose Q and R are similar.
Clusters become:
(P), (QR), (S), (T), (V)
Step 3:
Next, S and T are merged.
Clusters become:
(P), (QR), (ST), (V)
Step 4:
Clusters ST and V are merged.
Clusters become:
(P), (QR), (STV)
Step 5:
Clusters QR and STV are merged.
Clusters become:(P), (QRSTV)
Step 6:
Finally, P merges with QRSTV to form one cluster:
(PQRSTV)
This merging process can be visualized using a dendrogram diagram.
2. Divisive Hierarchical Clustering
Divisive hierarchical clustering works in the opposite way compared to
agglomerative clustering.
It follows a top-down approach:
- Initially, all data points are placed in one cluster.
- The algorithm then splits the cluster into smaller clusters based on dissimilarity.
- This splitting continues until each data point becomes its own cluster or the desired number of clusters is reached
Advantages of Hierarchical Clustering
- Easy to understand and implement.
- produces a hierarchical structure (dendrogram) which provides more information about data relationships.
- The number of clusters does not need to be specified in advance.
Disadvantages of Hierarchical Clustering
- Not suitable for very large datasets.
- Difficult to handle clusters with different sizes and shapes.
- Sensitive to noise and outliers.
- Once clusters are merged or split, the process cannot be reversed.