Different Types of Clustering

kumudha

Cluster analysis is a technique used to divide data into groups called clusters. Objects within the same cluster are similar to each other, while objects in different clusters are dissimilar.

The main purpose of clustering is to identify patterns and meaningful groups in data. Sometimes, clustering is also used as an initial step for data summarization or further analysis.

Cluster analysis plays an important role in many fields such as:
  • Biology
  • Psychology
  • Statistics
  • Pattern Recognition
  • Machine Learning
  • Data Mining

What is Cluster Analysis?

Cluster Analysis is the process of grouping data objects based on the information present in the dataset.

The goal is:
  • Objects within the same group should be similar.
  • Objects from different groups should be dissimilar.
However, defining what exactly forms a cluster is not always simple. The same set of data points can sometimes be grouped in different ways depending on the method used. Therefore, the best clustering structure depends on the nature of the data and the objective of the analysis.

Clustering vs Classification 

Clustering is sometimes compared with classification, but they are different.

Classification

  • Uses predefined class labels
  • Requires training data
  • It is a supervised learning method

Clustering

  • Does not require predefined labels
  • Groups data based on similarity
  • It is an unsupervised learning method

Because of this, clustering is often called unsupervised classification.

Other Terms Related to Clustering

Some terms are often used as alternatives for clustering:

Segmentation

Dividing data into groups using simple rules.

Examples:

  • Grouping people based on income
  • Segmenting images based on color

Partitioning

Dividing a dataset into smaller parts or subsets. The term is also used in other areas, such as graph partitioning.

Different Types of Clustering

Clustering methods can be categorized into several types:
  • Hierarchical vs Partitional Clustering
  • Exclusive vs Overlapping vs Fuzzy Clustering
  • Complete vs Partial Clustering

1. Hierarchical vs Partitional Clustering

Partitional Clustering

In partitional clustering, the dataset is divided into non-overlapping clusters, so each data object belongs to exactly one cluster.

Clusters do not contain subclusters.

Example:

If we divide 100 customers into 5 groups, each customer belongs to only one group.

Hierarchical Clustering

In hierarchical clustering, clusters are organized in a tree-like structure.
  • Clusters may contain subclusters
  • The structure is called a hierarchy
The top level contains one cluster with all data objects, and as we move down the tree, clusters are divided into smaller groups.

Important points:

  • The root contains all objects
  • The leaf nodes contain individual objects
  • A hierarchical cluster can be converted into partitional clusters by cutting the tree at a specific level
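
The relationship between a hierarchy and a partition can be sketched in a few lines of Python. This is a minimal illustration only, under several assumptions that are not in the text above: the data is one-dimensional, the hierarchy is built bottom-up (from the leaves toward the root), and clusters are merged by single-link distance. The function name `single_link_clusters` is hypothetical. Stopping the merging once `k` clusters remain corresponds to cutting the tree at the level that has `k` clusters.

```python
def single_link_clusters(points, k):
    """Merge the two closest clusters repeatedly until only k remain."""
    # Start at the leaves of the hierarchy: one cluster per object.
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single link (an assumption of this sketch): the distance
                # between two clusters is that of their closest pair of members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge cluster j into cluster i
        del clusters[j]
    return [sorted(c) for c in clusters]


points = [1.0, 1.5, 5.0, 5.4, 9.0]
three = single_link_clusters(points, 3)  # cut lower in the tree
two = single_link_clusters(points, 2)    # cut one level higher
print(three)
print(two)
```

Each cut yields an ordinary partitional clustering, and the clusters from the lower cut are subclusters of those from the higher cut.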

2. Exclusive vs Overlapping vs Fuzzy Clustering

Exclusive Clustering

In exclusive clustering, each object belongs to only one cluster.

Example:

A student belongs to one class section only.

Overlapping Clustering

In overlapping clustering, an object can belong to multiple clusters.

Example:

A person can be both an employee and a student trainee in a company.
This approach is useful when objects naturally belong to more than one group.

Fuzzy Clustering

In fuzzy clustering, objects belong to clusters with a membership value between 0 and 1.

For example, an object might have:
  • A membership of 0.7 in cluster A
  • A membership of 0.3 in cluster B

The membership values of each object are usually constrained to sum to 1.

Fuzzy clustering is useful when boundaries between clusters are unclear.
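
As a rough sketch of how such membership values can be computed, the snippet below uses the membership formula from fuzzy c-means, one common fuzzy clustering algorithm that this article does not name. The function name `fuzzy_memberships`, the one-dimensional data, and the fuzzifier value `m=2.0` are assumptions made for illustration.

```python
def fuzzy_memberships(x, centers, m=2.0):
    """Return one membership value per cluster center for the point x."""
    dists = [abs(x - c) for c in centers]
    # A point that sits exactly on a center belongs fully to that cluster.
    if 0.0 in dists:
        return [1.0 if d == 0.0 else 0.0 for d in dists]
    # Fuzzy c-means weighting: closer centers receive larger memberships.
    power = 2.0 / (m - 1.0)
    weights = [d ** -power for d in dists]
    total = sum(weights)
    return [w / total for w in weights]  # normalized, so the values sum to 1


memberships = fuzzy_memberships(4.0, [0.0, 10.0])
print(memberships)  # roughly 0.69 for the first cluster and 0.31 for the second
```

The point at 4.0 is closer to the center at 0.0, so it receives the larger membership there, while still belonging partly to the other cluster.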

3. Complete vs Partial Clustering

Complete Clustering

In complete clustering, every data object is assigned to a cluster.

Example:

Grouping all documents in a dataset into topics.

Partial Clustering

In partial clustering, some objects may not belong to any cluster.

These objects may be:
  • Noise
  • Outliers
  • Irrelevant data

Example:

When analyzing news articles, only articles related to important topics may be grouped into clusters, while the others are left unassigned.

Different Types of Clusters

Different clustering methods define clusters in different ways.

Main cluster types include:
  • Well-separated clusters
  • Prototype-based clusters
  • Graph-based clusters
  • Density-based clusters
  • Shared-property (Conceptual) clusters

1. Well-Separated Clusters

In this type, each object in a cluster is closer to other objects in the same cluster than to objects in other clusters.

Characteristics:
  • Clear separation between clusters
  • Clusters can have any shape
This definition works well when clusters are far apart from each other.
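
The well-separated condition can be checked directly. The sketch below assumes one-dimensional data and reads the definition strictly: every object must be closer to every member of its own cluster than to any object outside it. The function name `is_well_separated` is hypothetical.

```python
def is_well_separated(clusters):
    """Check the well-separated condition for a list of 1-D clusters."""
    for idx, members in enumerate(clusters):
        # All objects that belong to some other cluster.
        others = [p for k, c in enumerate(clusters) if k != idx for p in c]
        for p in members:
            farthest_same = max(abs(p - q) for q in members)
            nearest_other = min((abs(p - q) for q in others), default=float("inf"))
            # The farthest cluster-mate must still be nearer than any outsider.
            if farthest_same >= nearest_other:
                return False
    return True


print(is_well_separated([[1.0, 2.0, 3.0], [10.0, 11.0]]))  # True
print(is_well_separated([[1.0, 2.0, 6.0], [7.0, 8.0]]))    # False: 6.0 is nearer to 7.0
```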

2. Prototype-Based Clusters

In this approach, each cluster is represented by a prototype.

The prototype can be:

Centroid

The average of all data points in the cluster.

Medoid

The most representative data point in the cluster. Unlike a centroid, a medoid is always an actual member of the dataset.

Objects are assigned to the cluster whose prototype is closest to them.

These clusters are often called center-based clusters and tend to be roughly spherical (globular) in shape.

Example algorithms:
  • K-Means
  • K-Medoids
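
The difference between the two prototypes, and the assignment rule, can be sketched as follows. This is an illustration rather than a full K-Means or K-Medoids implementation: it assumes Euclidean distance, picks the medoid as the member with the smallest total distance to the other members, and uses hypothetical function names.

```python
import math


def centroid(points):
    """Coordinate-wise average of the points; need not be a data point."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def medoid(points):
    """Member whose total distance to all members is smallest."""
    return min(points, key=lambda p: sum(math.dist(p, q) for q in points))


def nearest_prototype(point, prototypes):
    """Index of the prototype closest to the given point."""
    return min(range(len(prototypes)), key=lambda i: math.dist(point, prototypes[i]))


cluster = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (14.0, 0.0)]
print(centroid(cluster))  # (4.0, 0.0), pulled toward the distant point
print(medoid(cluster))    # (2.0, 0.0), an actual member of the cluster
print(nearest_prototype((5.0, 0.0), [(2.0, 0.0), (14.0, 0.0)]))  # 0
```

In this example the distant point at 14.0 shifts the centroid away from the bulk of the data, while the medoid stays on a real data point.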

3. Graph-Based Clusters

In this method, data is represented as a graph:
  • Nodes represent data objects
  • Edges represent connections or similarity
A cluster is defined as a connected component in the graph.

One common example is contiguity-based clustering, where objects are connected if they are within a certain distance.

A limitation of this method is that noise points may connect different clusters, creating incorrect groupings.
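
Both the idea and its limitation can be seen in a small sketch. It assumes one-dimensional points and a hypothetical distance threshold named `eps`; two points are connected when they lie within `eps` of each other, and each connected component becomes a cluster.

```python
def contiguity_clusters(points, eps):
    """Group 1-D points into the connected components of the eps-graph."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        start = unvisited.pop()
        component, frontier = [start], [start]
        while frontier:
            i = frontier.pop()
            # Edges of the graph: unvisited points within eps of point i.
            neighbors = [j for j in unvisited if abs(points[i] - points[j]) <= eps]
            for j in neighbors:
                unvisited.remove(j)
            component.extend(neighbors)
            frontier.extend(neighbors)
        clusters.append(sorted(points[i] for i in component))
    return sorted(clusters)


clean = [0.0, 1.0, 2.0, 6.0, 7.0, 8.0]
print(contiguity_clusters(clean, eps=1.5))  # two separate clusters

# Two noise points between the groups form a bridge that merges them.
noisy = clean + [3.5, 5.0]
print(contiguity_clusters(noisy, eps=1.5))  # a single cluster
```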

4. Density-Based Clusters

In density-based clustering, clusters are defined as regions with high density of points, separated by regions with low density.

Characteristics:
  • Can detect irregular-shaped clusters
  • Handles noise and outliers

Example algorithm:

  • DBSCAN
Density-based clustering works well when clusters are irregular or intertwined, and when noise and outliers are present.
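
A simplified sketch of the DBSCAN idea is shown below. It is not a reference implementation: it assumes one-dimensional data, uses a brute-force neighbor search, and counts a point as one of its own neighbors. The parameter names `eps` (neighborhood radius) and `min_pts` (minimum neighbors for a dense point) follow common usage but are assumptions here.

```python
NOISE = -1


def dbscan_1d(points, eps, min_pts):
    """Label each 1-D point with a cluster id, or NOISE if it is isolated."""
    labels = [None] * len(points)

    def neighbors(i):
        # Brute-force search; the point itself is included in the result.
        return [j for j in range(len(points)) if abs(points[i] - points[j]) <= eps]

    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # not dense enough; may later become a border point
            continue
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: joins, but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # dense point: keep growing the cluster
        cluster_id += 1
    return labels


points = [1.0, 1.2, 1.4, 5.0, 9.0, 9.1, 9.3]
labels = dbscan_1d(points, eps=0.5, min_pts=3)
print(labels)  # [0, 0, 0, -1, 1, 1, 1]
```

The isolated point at 5.0 is left unassigned, which also makes this an example of partial clustering.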

5. Shared-Property (Conceptual) Clusters

In this type, objects in a cluster share a common property or concept.

Example:

  • Documents discussing the same topic
  • Products belonging to the same category
These clusters are discovered using conceptual clustering, which focuses on understanding the meaning or concept behind the data rather than just distance.

