Support and Confidence in Data Mining

shareef

Data mining is the process of finding useful information in large amounts of data. It is widely used in industries such as healthcare, marketing, and education to help businesses make better decisions.

What is Association Rule Mining (ARM)?

Association Rule Mining (ARM) is a technique used to discover relationships between items in a dataset.

It is also known as:
  • Market Basket Analysis (MBA)
  • Affinity Analysis

In simple terms, ARM helps answer questions like:
“If a customer buys item A, are they likely to buy item B?”

A typical rule looks like this:
A ⇒ B (If A is bought, B is also likely to be bought)

Where:
A = Antecedent (input item)
B = Consequent (result item)

Examples of Association Rules
{Cheese, Butter} ⇒ {Bread}
{Smartphone} ⇒ {Phone Case}
{Table} ⇒ {Chair}

These rules help businesses understand customer buying patterns.

Support in Data Mining

What is Support?

Support tells how often an item or itemset appears in the dataset, expressed as a fraction of all transactions.

Formula

For a single item:
Support(A) = Frequency of A / Total Transactions

For two items (A ∪ B denotes the combined itemset, so only transactions containing both A and B are counted):
Support(A ∪ B) = Frequency of (A and B together) / Total Transactions

Example Dataset
Transaction    Items
T1             Apple, Mango, Melon, Orange
T2             Apple, Pear, Orange
T3             Apple, Orange, Pear
T4             Apple, Mango

Total Transactions = 4

Support Calculation Examples
Support (Apple) = 4/4 = 100%
Support (Orange) = 3/4 = 75%
Support (Mango) = 2/4 = 50%
Support (Melon) = 1/4 = 25%

Support of combinations:

Support (Apple & Orange) = 3/4 = 75%
Support (Apple & Mango) = 2/4 = 50%
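
The calculations above can be reproduced with a few lines of Python. This is only a minimal sketch: the function name `support` and the choice to store each transaction as a set are assumptions made for illustration, not part of any particular library.

```python
# Fruit transactions from the example dataset above, one set per transaction.
transactions = [
    {"Apple", "Mango", "Melon", "Orange"},  # T1
    {"Apple", "Pear", "Orange"},            # T2
    {"Apple", "Orange", "Pear"},            # T3
    {"Apple", "Mango"},                     # T4
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    # `itemset <= t` is True when itemset is a subset of transaction t.
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


print(support({"Apple"}, transactions))            # 1.0
print(support({"Orange"}, transactions))           # 0.75
print(support({"Melon"}, transactions))            # 0.25
print(support({"Apple", "Orange"}, transactions))  # 0.75
print(support({"Apple", "Mango"}, transactions))   # 0.5
```

Using sets makes the "A and B together" check a simple subset test, so the same function works for single items and for combinations.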

Key Idea
  • High Support → Item appears frequently
  • Low Support → Item appears rarely

Businesses use this to focus on popular items.

Confidence in Data Mining

What is Confidence?

Confidence measures how strong a rule A ⇒ B is: of the transactions that contain A, it is the fraction that also contain B.

It answers:
“If a customer buys A, how often do they also buy B?”

Formula
Confidence (A ⇒ B) = Support (A ∪ B) / Support (A)

Example Dataset

Transaction    Items
T1             Shoes, Socks, Bag, Perfume
T2             Shoes, Socks, Perfume, Watch
T3             Bag, Perfume, Watch, Socks
T4             Socks, Necktie, Watch

Example 1:

{Shoes} ⇒ {Socks}
Support (Shoes & Socks) = 2/4 = 0.5
Support (Shoes) = 2/4 = 0.5
Confidence = 0.5 / 0.5 = 1 (100%)

Meaning: Every transaction that contains Shoes also contains Socks.
Strong relationship

Example 2:

{Socks} ⇒ {Shoes}
Support (Socks & Shoes) = 2/4 = 0.5
Support (Socks) = 4/4 = 1

Confidence = 0.5 / 1 = 0.5 (50%)

Meaning: Only half of the transactions that contain Socks also contain Shoes, so confidence depends on the direction of the rule.
Moderate relationship
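
Both examples can be checked in code. The sketch below assumes the same set-based representation of transactions; the helper names `support` and `confidence` are hypothetical and chosen only for this illustration.

```python
# Clothing transactions from the example dataset above.
transactions = [
    {"Shoes", "Socks", "Bag", "Perfume"},    # T1
    {"Shoes", "Socks", "Perfume", "Watch"},  # T2
    {"Bag", "Perfume", "Watch", "Socks"},    # T3
    {"Socks", "Necktie", "Watch"},           # T4
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Confidence(A => B) = Support(A ∪ B) / Support(A)."""
    base = support(antecedent, transactions)
    if base == 0:
        # The rule is undefined if the antecedent never occurs.
        raise ValueError("antecedent does not appear in any transaction")
    both = support(set(antecedent) | set(consequent), transactions)
    return both / base


print(confidence({"Shoes"}, {"Socks"}, transactions))  # 1.0
print(confidence({"Socks"}, {"Shoes"}, transactions))  # 0.5
```

Swapping the antecedent and consequent changes only the denominator, which is why the two rules share a support of 0.5 but differ in confidence.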

Key Idea
  • High Confidence → Strong relationship
  • Low Confidence → Weak or moderate relationship

Support vs Confidence

Support

  • Measures frequency of items
  • Focuses on “how often”
  • Formula: Support(A) = Freq(A)/Total
  • Helps find popular items
  • High support = frequent item

Confidence

  • Measures strength of relationship
  • Focuses on “how likely”
  • Formula: Confidence(A⇒B) = Support(A∪B)/Support(A)
  • Helps find related items
  • High confidence = strong rule

In summary, support tells how frequently an item or itemset appears in the dataset, and confidence tells how often the consequent appears when the antecedent is present.

Together, they help businesses:
  • Understand customer behavior
  • Improve product placement
  • Increase sales through better recommendations
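
To see the two measures side by side, the following sketch computes support and confidence for every single-item rule {a} ⇒ {b} in the clothing dataset used earlier. It is an illustrative brute-force enumeration under assumed names (`support`, `pairwise_rules`), not an efficient mining algorithm.

```python
from itertools import permutations

# Clothing transactions from the confidence example.
transactions = [
    {"Shoes", "Socks", "Bag", "Perfume"},    # T1
    {"Shoes", "Socks", "Perfume", "Watch"},  # T2
    {"Bag", "Perfume", "Watch", "Socks"},    # T3
    {"Socks", "Necktie", "Watch"},           # T4
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def pairwise_rules(transactions):
    """Return {(a, b): (support, confidence)} for every rule {a} => {b}."""
    items = sorted(set().union(*transactions))
    rules = {}
    for a, b in permutations(items, 2):  # ordered pairs: a => b and b => a
        sup_ab = support({a, b}, transactions)
        # Every item comes from some transaction, so support({a}) > 0.
        rules[(a, b)] = (sup_ab, sup_ab / support({a}, transactions))
    return rules


rules = pairwise_rules(transactions)

# Show the five rules with the highest confidence.
ranked = sorted(rules.items(), key=lambda kv: kv[1][1], reverse=True)
for (a, b), (sup, conf) in ranked[:5]:
    print(f"{{{a}}} => {{{b}}}: support={sup:.2f}, confidence={conf:.2f}")
```

Reading both numbers together helps: a rule can have high confidence yet rest on only a few transactions, which its support value makes visible.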