Grid-Based Method in Data Mining

Vinithra

Grid-Based Method in Data Mining

The grid-based clustering method is a technique used in data mining to group data points. Instead of looking at each data point individually, it divides the entire data space into a fixed number of small regions called cells (or grids).

All clustering operations are performed on these grid cells, not directly on the data points. This makes the method very fast, because the processing time depends on the number of grid cells, not the number of data points

Key Idea:

Divide data space into grids
Perform clustering on grids instead of individual points
Faster and efficient for large datasets

Examples of Grid-Based Methods

Some popular grid-based approaches include:

STING (Statistical Information Grid) – Uses statistical information stored in grid cells

WaveCluster – Uses wavelet transformation for clustering

CLIQUE – Works for high-dimensional data using grid and density concepts

Basics of Grid-Based Methods

Grid-based methods are especially useful when working with multi-dimensional data, such as:

Geographic data
Image data
Data with many attributes

Dividing data into grids provides several advantages:

1. Data Partitioning

Data is divided into multiple groups (clusters) based on similarity

Each partition represents a cluster

The number of clusters (K) is often predefined

Examples of partitioning algorithms:

K-Means

PAM (K-Medoids)

CLARA

2. Data Reduction

Reduces the size of large datasets

Removes unnecessary or duplicate data

Keeps only important information

This helps in faster processing and better performance.

3. Local Pattern Discovery

Helps find patterns within specific regions (cells)

Useful for identifying hidden trends in small areas

Good for detecting localized behaviors in data

4. Scalability

Works well with large datasets

Efficient for high-dimensional data

Grid structure simplifies complex data

5. Density Estimation

Groups are formed based on dense regions of data

Sparse regions are treated as noise

Key concepts:

ε (epsilon) → Radius of neighborhood

MinPts → Minimum number of points required

If a region has enough points → it forms a cluster

Otherwise → considered noise

6. Clustering and Classification

Uses grid cells instead of individual data points

Improves processing speed

Efficient for large datasets

7. Grid-Based Indexing

Organizes data using grid structure

Helps in faster data retrieval and searching

Improves query performance

Popular Grid-Based Methods

1. K-Means Clustering

K-Means is a simple and widely used clustering algorithm.

Key Points:

It is an unsupervised learning algorithm
The number of clusters (K) is predefined
Each cluster has a centroid (center point)
Goal: Minimize distance between points and their centroid

How K-Means Works

Step 1: Choose number of clusters (K)

Step 2: Select K random centroids

Step 3: Assign each data point to nearest centroid

Step 4: Recalculate centroids

Step 5: Repeat steps 3 & 4 until no change

2. DBSCAN (Density-Based Clustering)

DBSCAN groups data based on density.

Key Concepts:

Eps (ε): Radius of neighborhood

MinPts: Minimum number of points required

Types of Points:

Core Point: Has enough neighbors

Border Point: Close to core point

Noise: Outliers

Important Condition:

A point is a core point if:

Number of points in its ε-neighborhood ≥ MinPts

3. STING (Statistical Information Grid)

STING divides space into multiple grid levels.

Key Features:

Uses rectangular grid cells
Each cell stores statistical values like:
Mean
Min
Max
Works in a hierarchical structure (top → bottom levels)

How STING Works

Step 1: Select a starting layer

Step 2: Calculate probability for each cell

Step 3: Mark cells as relevant or not

Step 4: Move to lower level if needed

Step 5: Repeat for relevant cells

Step 6: Retrieve data from selected cells

Step 7: Return final result

« Previous Next »

Grid-Based Method in Data Mining

Grid-Based Method in Data Mining

Key Idea:

Examples of Grid-Based Methods

Basics of Grid-Based Methods

1. Data Partitioning

Examples of partitioning algorithms:

2. Data Reduction

3. Local Pattern Discovery

4. Scalability

5. Density Estimation

Key concepts:

6. Clustering and Classification

7. Grid-Based Indexing

Popular Grid-Based Methods

1. K-Means Clustering

Key Points:

How K-Means Works

2. DBSCAN (Density-Based Clustering)

Key Concepts:

Types of Points:

3. STING (Statistical Information Grid)

Key Features:

How STING Works

Translate

Related course

Social Plugin

Ads

Ads

Website by

Categories

Our Services

Footer Copyright

Contact form

Grid-Based Method in Data Mining

Grid-Based Method in Data Mining

Key Idea:

Examples of Grid-Based Methods

Basics of Grid-Based Methods

1. Data Partitioning

Examples of partitioning algorithms:

2. Data Reduction

3. Local Pattern Discovery

4. Scalability

5. Density Estimation

Key concepts:

6. Clustering and Classification

7. Grid-Based Indexing

Popular Grid-Based Methods

1. K-Means Clustering

Key Points:

How K-Means Works

2. DBSCAN (Density-Based Clustering)

Key Concepts:

Types of Points:

3. STING (Statistical Information Grid)

Key Features:

How STING Works

You may like these posts

Footer Copyright

Contact form