Different Types of Clustering

kumudha

Cluster Analysis is a technique used to divide data into groups called clusters. Objects within the same cluster are similar to each other, while objects in different clusters are different from one another.

The main purpose of clustering is to identify patterns and meaningful groups in data. Sometimes, clustering is also used as an initial step for data summarization or further analysis.

Cluster analysis plays an important role in many fields such as:

Biology
Psychology
Statistics
Pattern Recognition
Machine Learning
Data Mining

What is Cluster Analysis?

Cluster Analysis is the process of grouping data objects based on the information present in the dataset.

The goal is:

Objects within the same group should be similar.
Objects from different groups should be dissimilar.

However, defining what exactly forms a cluster is not always simple. The same set of data points can sometimes be grouped in different ways depending on the method used. Therefore, the best clustering structure depends on the nature of the data and the objective of the analysis.

Clustering vs Classification

Clustering is sometimes compared with classification, but they are different.

Classification

Uses predefined class labels
Requires labeled training data
It is a supervised learning method

Clustering

Does not require predefined labels
Groups data based on similarity
It is an unsupervised learning method

Because of this, clustering is often called unsupervised classification.

Other Terms Related to Clustering

Some terms are often used as alternatives for clustering:

Segmentation

Dividing data into groups using simple rules.

Examples:

Grouping people based on income
Segmenting images based on color
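As a rough illustration of the income example, segmentation by simple rules can be sketched as below. The function name and the threshold values are hypothetical and chosen only for this example.

```python
def income_segment(income):
    """Assign a person to a segment using fixed, hand-written rules."""
    # Hypothetical thresholds, picked only for illustration.
    if income < 30_000:
        return "low"
    if income < 80_000:
        return "middle"
    return "high"

incomes = [12_000, 45_000, 95_000, 30_000]
segments = [income_segment(x) for x in incomes]
print(segments)
```

Unlike most clustering methods, the groups here come from rules fixed in advance, not from similarity found in the data.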

Partitioning

Dividing a dataset into smaller parts or subsets.

This term is also used in graph partitioning and other areas.

Different Types of Clustering

Clustering methods can be categorized into several types:

Hierarchical vs Partitional Clustering
Exclusive vs Overlapping vs Fuzzy Clustering
Complete vs Partial Clustering

1. Hierarchical vs Partitional Clustering

Partitional Clustering

In partitional clustering, the dataset is divided into non-overlapping clusters. Each data object belongs to only one cluster.

Clusters do not contain subclusters

Example:

If we divide 100 customers into 5 groups, each customer belongs to only one group.
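A minimal sketch of a partitional method is shown below, using a small k-means style loop on six 2-D points instead of 100 customers. The function name, the sample points, and the choice of the first k points as starting centroids are assumptions made for this example; real implementations usually pick starting centroids more carefully.

```python
from math import dist

def kmeans(points, k, iterations=10):
    """Tiny k-means sketch: returns exactly one cluster label per point."""
    # Assumption: the first k points serve as initial centroids (deterministic).
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its single closest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return labels

points = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (8.2, 7.9), (0.9, 1.1), (7.8, 8.1)]
labels = kmeans(points, k=2)
print(labels)  # [0, 1, 0, 1, 0, 1]
```

Every point receives one label, so the result is a partition: no point is shared between groups.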

Hierarchical Clustering

In hierarchical clustering, clusters are organized in a tree-like structure.

Clusters may contain subclusters
The structure is called a hierarchy

The top level contains one cluster with all data objects, and as we move down the tree, clusters are divided into smaller groups.

Important points:

The root contains all objects
The leaf nodes contain individual objects
A hierarchical cluster can be converted into partitional clusters by cutting the tree at a specific level
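The idea of cutting the tree can be sketched with a small bottom-up (agglomerative) procedure on 1-D values. This is only one assumed way to build the hierarchy: it uses single-link distance, and the function name and data are made up for the example. Stopping the merging early at a chosen number of clusters plays the role of cutting the tree at that level.

```python
def agglomerative(values, num_clusters):
    """Merge the two closest clusters until num_clusters remain (single link)."""
    clusters = [[v] for v in values]  # leaves: one object per cluster
    while len(clusters) > num_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single-link distance: closest pair across the two clusters.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return [sorted(c) for c in clusters]

values = [1.0, 1.5, 5.0, 5.4, 9.0]
print(agglomerative(values, 5))  # leaves: every object on its own
print(agglomerative(values, 3))  # [[1.0, 1.5], [5.0, 5.4], [9.0]]
print(agglomerative(values, 1))  # root: all objects in one cluster
```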

2. Exclusive vs Overlapping vs Fuzzy Clustering

Exclusive Clustering

In exclusive clustering, each object belongs to only one cluster.

Example:

A student belongs to one class section only.

Overlapping Clustering

In overlapping clustering, an object can belong to multiple clusters.

Example:

A person can be both an employee and a student trainee in a company.

This approach is useful when objects naturally belong to more than one group.

Fuzzy Clustering

In fuzzy clustering, objects belong to clusters with a membership value between 0 and 1.

For example, an object might belong 70% to cluster A and 30% to cluster B.

Usually, the membership values of each object are constrained to sum to 1.

Fuzzy clustering is useful when boundaries between clusters are unclear.
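One common way to compute such membership values is the rule used in fuzzy c-means, where membership falls off with distance to each cluster center. The sketch below assumes 1-D data, known centers, and a fuzziness parameter m = 2; the function and parameter names are illustrative and not taken from any particular library.

```python
def fuzzy_memberships(point, centers, m=2.0):
    """Membership of `point` in each cluster, in the style of fuzzy c-means."""
    distances = [abs(point - c) for c in centers]
    # A point sitting exactly on a center belongs fully to that cluster.
    if 0.0 in distances:
        return [1.0 if d == 0.0 else 0.0 for d in distances]
    exponent = 2.0 / (m - 1.0)
    # Closer centers get larger weights; normalizing makes the weights sum to 1.
    weights = [1.0 / d ** exponent for d in distances]
    total = sum(weights)
    return [w / total for w in weights]

memberships = fuzzy_memberships(point=4.0, centers=[2.0, 8.0])
print(memberships)  # [0.8, 0.2]
```

Here the point is twice as close to the first center, so it belongs mostly to that cluster while keeping a smaller membership in the second.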

3. Complete vs Partial Clustering

Complete Clustering

In complete clustering, every data object is assigned to a cluster.

Example:

Grouping all documents in a dataset into topics.

Partial Clustering

In partial clustering, some objects may not belong to any cluster.

These objects may be:

Noise
Outliers
Irrelevant data

Example:

When analyzing news articles, only articles related to important topics may be grouped into clusters, while others may be ignored.
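The difference between complete and partial clustering can be sketched with a simple distance cutoff. The function name, the centers, and the cutoff value below are assumptions for illustration: a value gets a cluster label only if it lies close enough to some center, otherwise it is left unassigned.

```python
def partial_assign(values, centers, max_distance):
    """Label each value with its nearest center's index, or None if too far."""
    labels = []
    for v in values:
        nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
        # Values beyond the cutoff are treated as noise and left unclustered.
        close_enough = abs(v - centers[nearest]) <= max_distance
        labels.append(nearest if close_enough else None)
    return labels

values = [1.0, 1.3, 9.8, 10.1, 55.0]
labels = partial_assign(values, centers=[1.0, 10.0], max_distance=2.0)
print(labels)  # [0, 0, 1, 1, None]
```

Removing the cutoff (always taking the nearest center) would turn this into a complete clustering.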

Different Types of Clusters

Different clustering methods define clusters in different ways.

Main cluster types include:

Well-separated clusters
Prototype-based clusters
Graph-based clusters
Density-based clusters
Shared-property (Conceptual) clusters

1. Well-Separated Clusters

In this type, each object in a cluster is closer (or more similar) to every other object in the same cluster than to any object in a different cluster.

Characteristics:

Clear separation between clusters
Clusters can have any shape

This type works well when clusters are far apart from each other.
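A strict reading of the well-separated definition can be checked directly. The sketch below assumes 1-D values with one label per value; the function name is made up for this example. For every object, the farthest object in its own cluster must still be closer than the nearest object in any other cluster.

```python
def is_well_separated(values, labels):
    """True if every object is closer to all cluster mates than to any outsider."""
    items = list(zip(values, labels))
    for i, (v, lab) in enumerate(items):
        same = [abs(v - w) for j, (w, l2) in enumerate(items) if l2 == lab and j != i]
        other = [abs(v - w) for w, l2 in items if l2 != lab]
        # Compare the farthest cluster mate with the nearest outsider.
        if same and other and max(same) >= min(other):
            return False
    return True

values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
print(is_well_separated(values, [0, 0, 0, 1, 1, 1]))  # True
print(is_well_separated(values, [0, 0, 1, 1, 1, 1]))  # False
```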

2. Prototype-Based Clusters

In this approach, each cluster is represented by a prototype.

The prototype can be:

Centroid

The average of all data points in the cluster.

Medoid

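The most representative data point in the cluster. Unlike a centroid, a medoid is always an actual data point, typically the one with the smallest total distance to the other points in the cluster.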

Objects are assigned to the cluster whose prototype is closest to them.

These clusters are often called center-based clusters and usually have a spherical shape.

Example algorithms:

K-Means

K-Medoids
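The two kinds of prototype, and the nearest-prototype assignment rule, can be sketched in a few lines. The helper names and sample points are assumptions for this example; the medoid is taken here as the point with the smallest total distance to all other points in the cluster.

```python
from math import dist

def centroid(points):
    """Coordinate-wise average; usually not one of the original points."""
    return tuple(sum(xs) / len(points) for xs in zip(*points))

def medoid(points):
    """Actual data point with the smallest total distance to the others."""
    return min(points, key=lambda p: sum(dist(p, q) for q in points))

def nearest_prototype(point, prototypes):
    """Index of the prototype closest to `point`."""
    return min(range(len(prototypes)), key=lambda i: dist(point, prototypes[i]))

cluster = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (14.0, 0.0)]
print(centroid(cluster))  # (4.0, 0.0), pulled toward the far point
print(medoid(cluster))    # (2.0, 0.0), an actual member of the cluster
print(nearest_prototype((3.5, 0.0), [(2.0, 0.0), (20.0, 0.0)]))  # 0
```

The far point at 14.0 drags the centroid away from the bulk of the data, while the medoid stays on a real, central object.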

3. Graph-Based Clusters

In this method, data is represented as a graph:

Nodes represent data objects
Edges represent connections or similarity

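A cluster is defined as a connected component in the graph, that is, a group of objects that are connected to one another but not to objects outside the group.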

One common example is contiguity-based clustering, where objects are connected if they are within a certain distance.

A limitation of this method is that noise points may connect different clusters, creating incorrect groupings.
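Both the idea and the limitation can be seen in a small sketch. It assumes 1-D values and links two objects whenever they are within a distance threshold; the function name, data, and threshold are illustrative only. Clusters are the connected components of the resulting graph.

```python
def contiguity_clusters(values, threshold):
    """Connected components of the graph linking values within `threshold`."""
    unvisited = set(range(len(values)))
    clusters = []
    while unvisited:
        start = min(unvisited)
        unvisited.remove(start)
        component, stack = [start], [start]
        while stack:
            i = stack.pop()
            # Collect neighbors first, then remove them from the unvisited set.
            neighbors = [j for j in unvisited if abs(values[i] - values[j]) <= threshold]
            for j in neighbors:
                unvisited.remove(j)
                component.append(j)
                stack.append(j)
        clusters.append(sorted(values[k] for k in component))
    return clusters

values = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
clean = contiguity_clusters(values, threshold=2.0)
noisy = contiguity_clusters(values + [5.0], threshold=2.0)
print(clean)  # [[1.0, 2.0, 3.0], [7.0, 8.0, 9.0]]
print(noisy)  # [[1.0, 2.0, 3.0, 5.0, 7.0, 8.0, 9.0]]
```

A single noise point at 5.0 is enough to bridge the gap and merge the two groups into one.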

4. Density-Based Clusters

In density-based clustering, clusters are defined as regions with high density of points, separated by regions with low density.

Characteristics:

Can detect irregular-shaped clusters
Handles noise and outliers

Example algorithm:

DBSCAN

Density-based clustering works well when clusters are irregular or intertwined, and when noise and outliers are present.
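The following is a simplified sketch of the DBSCAN idea for 1-D data, not a reference implementation. The parameter names eps (neighborhood radius) and min_pts (minimum number of points for a dense region) follow common convention but are assumptions here, as are the sample values. Points that are not density-reachable from any dense region are labeled -1 as noise.

```python
def dbscan(values, eps, min_pts):
    """Minimal DBSCAN sketch for 1-D data; returns a label per value, -1 = noise."""
    NOISE, UNSEEN = -1, None
    labels = [UNSEEN] * len(values)

    def region(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j in range(len(values)) if abs(values[i] - values[j]) <= eps]

    cluster_id = 0
    for i in range(len(values)):
        if labels[i] is not UNSEEN:
            continue
        neighbors = region(i)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster_id
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: reachable but not core
            if labels[j] is not UNSEEN:
                continue
            labels[j] = cluster_id
            j_neighbors = region(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point, keep expanding
        cluster_id += 1
    return labels

values = [1.0, 1.2, 1.4, 5.0, 5.1, 5.3, 20.0]
labels = dbscan(values, eps=0.5, min_pts=3)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

The isolated value 20.0 has no dense neighborhood, so it is left out of both clusters.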

5. Shared-Property (Conceptual) Clusters

In this type, objects in a cluster share a common property or concept.

Examples:

Documents discussing the same topic
Products belonging to the same category

These clusters are discovered using conceptual clustering, which focuses on understanding the meaning or concept behind the data rather than just distance.
