Support and Confidence in Data Mining
Data mining is the process of finding useful information in large amounts
of data. It is widely used in industries such as healthcare, marketing,
and education to help businesses make better decisions.
What is Association Rule Mining (ARM)?
Association Rule Mining (ARM) is a technique used to discover relationships
between items in a dataset.
It is also known as:
- Market Basket Analysis (MBA)
- Affinity Analysis
In simple terms, ARM helps answer questions like:
“If a customer buys item A, are they likely to buy item B?”
A typical rule looks like this:
A ⇒ B (If A is bought, B is also likely to be bought)
Where:
A = Antecedent (input item or itemset)
B = Consequent (result item or itemset)
Examples of Association Rules
{Cheese, Butter} ⇒ {Bread}
{Smartphone} ⇒ {Phone Case}
{Table} ⇒ {Chair}
These rules help businesses understand customer buying patterns.
Support in Data Mining
What is Support?
Support measures how often an item or itemset appears in the dataset, as a fraction of all transactions.
Formula
For a single item:
Support(A) = Frequency of A / Total Transactions
For two items (A ∪ B denotes the itemset that contains both A and B):
Support(A ∪ B) = Frequency of (A and B together) / Total Transactions
Example Dataset
Transaction Items
T1 Apple, Mango, Melon, Orange
T2 Apple, Pear, Orange
T3 Apple, Orange, Pear
T4 Apple, Mango
Total Transactions = 4
Support Calculation Examples
Support (Apple) = 4/4 = 100%
Support (Orange) = 3/4 = 75%
Support (Mango) = 2/4 = 50%
Support (Melon) = 1/4 = 25%
Support of combinations:
Support (Apple & Orange) = 3/4 = 75%
Support (Apple & Mango) = 2/4 = 50%
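The calculations above can be reproduced with a few lines of Python. This is a minimal sketch, not a reference implementation: the names `transactions` and `support` are chosen for illustration, and each transaction is assumed to be stored as a set of item names.

```python
# Transactions from the example dataset above, one set of items per transaction.
transactions = [
    {"Apple", "Mango", "Melon", "Orange"},  # T1
    {"Apple", "Pear", "Orange"},            # T2
    {"Apple", "Orange", "Pear"},            # T3
    {"Apple", "Mango"},                     # T4
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    # `itemset <= t` is True when all items of `itemset` are in transaction t.
    matches = sum(1 for t in transactions if itemset <= t)
    return matches / len(transactions)


print(support({"Apple"}, transactions))            # 1.0
print(support({"Melon"}, transactions))            # 0.25
print(support({"Apple", "Orange"}, transactions))  # 0.75
print(support({"Apple", "Mango"}, transactions))   # 0.5
```

The same function handles single items and combinations, because a single item is just an itemset of size one.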
Key Idea
High Support → Item appears frequently
Low Support → Item appears rarely
Businesses use this to focus on popular items.
Confidence in Data Mining
What is Confidence?
Confidence measures how reliably the consequent appears in transactions that contain the antecedent.
It answers:
“If a customer buys A, how often do they also buy B?”
Formula
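Confidence (A ⇒ B) = Support (A ∪ B) / Support (A)
Both supports share the same denominator (Total Transactions), so this equals
Frequency of (A and B together) / Frequency of A.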
Example Dataset
Transaction Items
T1 Shoes, Socks, Bag, Perfume
T2 Shoes, Socks, Perfume, Watch
T3 Bag, Perfume, Watch, Socks
T4 Socks, Necktie, Watch
Example 1:
{Shoes} ⇒ {Socks}
Support (Shoes & Socks) = 2/4 = 0.5
Support (Shoes) = 2/4 = 0.5
Confidence = 0.5 / 0.5 = 1 (100%)
Meaning: In this dataset, every transaction that contains Shoes also contains Socks.
This is a strong relationship.
Example 2:
{Socks} ⇒ {Shoes}
Support (Socks & Shoes) = 2/4 = 0.5
Support (Socks) = 4/4 = 1
Confidence = 0.5 / 1 = 0.5 (50%)
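Meaning: Only half of the transactions that contain Socks also contain Shoes.
This is a moderate relationship. Comparing the two examples shows that
confidence depends on the direction of the rule.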
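Both examples can be checked in code. The sketch below is illustrative only: `support` and `confidence` are hypothetical helper names, transactions are assumed to be Python sets, and an antecedent that never occurs is treated as an error because the ratio would be undefined.

```python
# Transactions from the example dataset above.
transactions = [
    {"Shoes", "Socks", "Bag", "Perfume"},    # T1
    {"Shoes", "Socks", "Perfume", "Watch"},  # T2
    {"Bag", "Perfume", "Watch", "Socks"},    # T3
    {"Socks", "Necktie", "Watch"},           # T4
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support(A ∪ B) / Support(A) for the rule A ⇒ B."""
    antecedent, consequent = set(antecedent), set(consequent)
    support_a = support(antecedent, transactions)
    if support_a == 0:
        # Assumption of this sketch: refuse to divide by zero.
        raise ValueError("antecedent never occurs; confidence is undefined")
    return support(antecedent | consequent, transactions) / support_a


print(confidence({"Shoes"}, {"Socks"}, transactions))  # 1.0
print(confidence({"Socks"}, {"Shoes"}, transactions))  # 0.5
```

Swapping the antecedent and consequent changes only the denominator, which is why the two rules score differently even though they involve the same pair of items.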
Key Idea
- High Confidence → Strong relationship
- Low Confidence → Weak or moderate relationship
Support vs Confidence
Support
- Measures frequency of items
- Focuses on “how often”
- Formula: Support(A) = Freq(A)/Total
- Helps find popular items
- High support = frequent item
Confidence
- Measures strength of relationship
- Focuses on “how likely”
- Formula: Confidence(A⇒B) = Support(A∪B)/Support(A)
- Helps find related items
- High confidence = strong rule
Support tells how frequently an item appears in the dataset.
Confidence tells how strongly two items are related.
Together, they help businesses:
- Understand customer behavior
- Improve product placement
- Increase sales through better recommendations
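One way to use the two measures together is to keep only the rules that pass a minimum value for each. The sketch below does this for single-item rules on the Shoes and Socks dataset. The function name `find_rules` and the thresholds 0.5 and 0.8 are assumptions made for illustration and do not come from the text above; suitable thresholds depend on the data.

```python
from itertools import permutations

transactions = [
    {"Shoes", "Socks", "Bag", "Perfume"},    # T1
    {"Shoes", "Socks", "Perfume", "Watch"},  # T2
    {"Bag", "Perfume", "Watch", "Socks"},    # T3
    {"Socks", "Necktie", "Watch"},           # T4
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def find_rules(transactions, min_support, min_confidence):
    """Return (A, B, support, confidence) for single-item rules A => B
    that meet both thresholds."""
    items = sorted(set().union(*transactions))
    found = []
    for a, b in permutations(items, 2):  # ordered pairs: A => B and B => A
        sup_ab = support({a, b}, transactions)
        if sup_ab < min_support:
            continue  # the pair is too rare to be interesting
        conf = sup_ab / support({a}, transactions)
        if conf >= min_confidence:
            found.append((a, b, sup_ab, conf))
    return found


# Hypothetical thresholds, chosen only for this example.
for a, b, sup, conf in find_rules(transactions, 0.5, 0.8):
    print(f"{{{a}}} => {{{b}}}  support={sup:.2f}  confidence={conf:.2f}")
```

On this dataset, {Shoes} ⇒ {Socks} passes both thresholds while {Socks} ⇒ {Shoes} is filtered out by the confidence threshold. Every rule with Socks as the consequent has confidence 1.0 simply because Socks appear in all four transactions, so it helps to read a rule's confidence alongside the support figures.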