What is a snowflake schema?
A snowflake schema is a data modeling technique where a central fact table is connected to dimension tables, and those dimension tables are further divided into multiple related tables.
- The fact table contains measurable data (like sales and revenue)
- The dimension tables contain descriptive data (like product, customer, date)
- These dimension tables are normalized into multiple levels
In Simple Terms
Snowflake Schema = One fact table + multiple normalized dimension tables (split into sub-tables)
Example
- Fact Table → Sales
- Dimension Tables → Product, Customer, Date
Further breakdown:
- Product → Subcategory → Category
- Location → City → State → Country
The dimension tables are not flat; each one is split into multiple related tables, one for each level of its hierarchy.
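The example can be sketched as actual tables. The following is a minimal illustration using Python's built-in sqlite3 module, assuming each product belongs to one subcategory and each subcategory to one category. All table names, column names, and sample rows are hypothetical and chosen only for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce the links between tables

conn.executescript("""
-- Outermost level of the product hierarchy (hypothetical names throughout).
CREATE TABLE category (
    category_id   INTEGER PRIMARY KEY,
    category_name TEXT NOT NULL
);
CREATE TABLE subcategory (
    subcategory_id   INTEGER PRIMARY KEY,
    subcategory_name TEXT NOT NULL,
    category_id      INTEGER NOT NULL REFERENCES category(category_id)
);
-- The product dimension only points at its subcategory.
CREATE TABLE product (
    product_id     INTEGER PRIMARY KEY,
    product_name   TEXT NOT NULL,
    subcategory_id INTEGER NOT NULL REFERENCES subcategory(subcategory_id)
);
-- Fact table: measures plus a key into the dimension.
CREATE TABLE sales (
    sale_id    INTEGER PRIMARY KEY,
    product_id INTEGER NOT NULL REFERENCES product(product_id),
    quantity   INTEGER NOT NULL,
    revenue    REAL NOT NULL
);

INSERT INTO category VALUES (1, 'Electronics');
INSERT INTO subcategory VALUES (1, 'Computers', 1);
INSERT INTO product VALUES (1, 'Laptop', 1);
INSERT INTO sales VALUES (1, 1, 2, 1800.0);
""")

# List the tables and, for each one, the table it references.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
parents = {
    t: [fk[2] for fk in conn.execute(f"PRAGMA foreign_key_list({t})")]
    for t in tables
}
print(parents)
```

The printed mapping shows the chain sales → product → subcategory → category, which is the branching shape that gives the schema its name.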
Why it is Used
- Reduces data redundancy
- Improves data consistency
- Handles complex and hierarchical data
- Saves storage space
Features of Snowflake Schema
The snowflake schema is used when data needs to be more structured and normalized.
1. Central Fact Table
At the center, there is a fact table that stores numerical measures like sales, revenue, and quantity.
2. Normalized Dimension Tables
Dimension tables are split into multiple related tables, creating a normalized structure.
3. Reduced Redundancy
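Each descriptive value, such as a category name, is stored once in its own table instead of being repeated on every dimension row. This reduces duplication and improves consistency.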
4. Supports Hierarchies
It efficiently handles hierarchical relationships like the following:
- Product → Subcategory → Category
- Geography → City → State → Country
5. More Joins Required
Since data is divided into multiple tables, queries require more joins than they would in a star schema.
6. Structured Design
The schema is more organized and follows proper database normalization rules.
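To make the hierarchy and join features concrete, here is a small sketch of a geography hierarchy, again using Python's sqlite3 module. The table names, column names, and sample rows are made up for illustration, and the fact table is assumed to reference the city level directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE country (country_id INTEGER PRIMARY KEY, country_name TEXT);
CREATE TABLE state (state_id INTEGER PRIMARY KEY, state_name TEXT,
                    country_id INTEGER REFERENCES country(country_id));
CREATE TABLE city (city_id INTEGER PRIMARY KEY, city_name TEXT,
                   state_id INTEGER REFERENCES state(state_id));
CREATE TABLE sales (sale_id INTEGER PRIMARY KEY,
                    city_id INTEGER REFERENCES city(city_id),
                    revenue REAL);

-- Hypothetical sample data.
INSERT INTO country VALUES (1, 'India'), (2, 'USA');
INSERT INTO state VALUES (1, 'Karnataka', 1), (2, 'Kerala', 1), (3, 'Texas', 2);
INSERT INTO city VALUES (1, 'Bengaluru', 1), (2, 'Kochi', 2), (3, 'Austin', 3);
INSERT INTO sales VALUES (1, 1, 100.0), (2, 2, 50.0), (3, 3, 200.0), (4, 1, 25.0);
""")

# Rolling revenue up to the country level walks one join per hierarchy level.
revenue_by_country = conn.execute("""
    SELECT co.country_name, SUM(s.revenue) AS total_revenue
    FROM sales AS s
    JOIN city    AS ci ON ci.city_id    = s.city_id
    JOIN state   AS st ON st.state_id   = ci.state_id
    JOIN country AS co ON co.country_id = st.country_id
    GROUP BY co.country_name
    ORDER BY co.country_name
""").fetchall()
print(revenue_by_country)  # [('India', 175.0), ('USA', 200.0)]
```

Grouping by state or city instead only requires stopping the join chain one or two levels earlier, which is why this layout suits hierarchical reporting.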
Merits of Snowflake Schema
1. Reduced Data Redundancy
Because of normalization:
- Data duplication is minimized
- Storage is optimized
2. Better Data Integrity
Since data is stored in separate tables:
- Updates happen in one place
- Data remains consistent
3. Efficient for Complex Data
Works well when:
- Data has multiple levels
- Relationships are complex
4. Optimized Storage
Less repeated data means:
- Lower storage usage
- Smaller dimension tables to load into memory
5. Logical Data Organization
Data is well-structured and follows a clear hierarchy.
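The redundancy and integrity points can be illustrated with a small comparison. This sketch, which uses Python's sqlite3 module with hypothetical table names and sample rows, renames a category in a flat product table and in a normalized one, then reports how many rows each update had to touch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Flat product dimension: the category name repeats on every product row.
CREATE TABLE product_flat (product_id INTEGER PRIMARY KEY,
                           product_name TEXT, category_name TEXT);
INSERT INTO product_flat VALUES
    (1, 'Laptop', 'Electronics'), (2, 'Phone', 'Electronics'),
    (3, 'Tablet', 'Electronics'), (4, 'Desk', 'Furniture');

-- Snowflaked version: the category name lives in its own table.
CREATE TABLE category (category_id INTEGER PRIMARY KEY, category_name TEXT);
CREATE TABLE product (product_id INTEGER PRIMARY KEY, product_name TEXT,
                      category_id INTEGER REFERENCES category(category_id));
INSERT INTO category VALUES (1, 'Electronics'), (2, 'Furniture');
INSERT INTO product VALUES (1, 'Laptop', 1), (2, 'Phone', 1),
                           (3, 'Tablet', 1), (4, 'Desk', 2);
""")

# Renaming a category touches every matching row in the flat table...
flat_rows_changed = conn.execute(
    "UPDATE product_flat SET category_name = 'Consumer Electronics' "
    "WHERE category_name = 'Electronics'").rowcount

# ...but only a single row in the normalized category table.
snowflake_rows_changed = conn.execute(
    "UPDATE category SET category_name = 'Consumer Electronics' "
    "WHERE category_name = 'Electronics'").rowcount

print(flat_rows_changed, snowflake_rows_changed)  # 3 1
```

With only four products the difference is small, but the flat table's update cost grows with the number of products while the normalized update stays at one row.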
Demerits of Snowflake Schema
1. Complex Structure
Compared to Star Schema:
- More tables
- More relationships
2. Slower Query Performance
Due to multiple joins:
- Queries take more time
- Report performance may decrease
3. Not Ideal for Power BI
Power BI works best with simpler, star-shaped models:
- Extra relationships between tables can slow down reports
- DAX measures become harder to write and maintain
4. Difficult for Beginners
Non-technical users may find it difficult to:
- Understand relationships
- Navigate multiple tables
5. Higher Development Effort
Building and maintaining the schema:
- Requires more time
- Needs careful planning
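The extra complexity shows up as soon as a report is written. The sketch below, using Python's sqlite3 module with hypothetical tables and sample data, runs the same "revenue by category" report against a flat product dimension and a snowflaked one. It only counts the joins in each query; actual run time depends on the database engine, indexes, and data volume, none of which are measured here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, product_id INTEGER, revenue REAL);
INSERT INTO sales VALUES (1, 1, 900.0), (2, 2, 500.0), (3, 3, 150.0);

-- Star-style dimension: category stored on the product row.
CREATE TABLE dim_product_star (product_id INTEGER PRIMARY KEY,
                               product_name TEXT, category_name TEXT);
INSERT INTO dim_product_star VALUES
    (1, 'Laptop', 'Electronics'), (2, 'Phone', 'Electronics'),
    (3, 'Desk', 'Furniture');

-- Snowflake-style dimension: product -> subcategory -> category.
CREATE TABLE category (category_id INTEGER PRIMARY KEY, category_name TEXT);
CREATE TABLE subcategory (subcategory_id INTEGER PRIMARY KEY,
                          subcategory_name TEXT, category_id INTEGER);
CREATE TABLE product (product_id INTEGER PRIMARY KEY,
                      product_name TEXT, subcategory_id INTEGER);
INSERT INTO category VALUES (1, 'Electronics'), (2, 'Furniture');
INSERT INTO subcategory VALUES (1, 'Computers', 1), (2, 'Mobiles', 1),
                               (3, 'Tables', 2);
INSERT INTO product VALUES (1, 'Laptop', 1), (2, 'Phone', 2), (3, 'Desk', 3);
""")

star_query = """
    SELECT p.category_name, SUM(s.revenue)
    FROM sales AS s
    JOIN dim_product_star AS p ON p.product_id = s.product_id
    GROUP BY p.category_name ORDER BY p.category_name
"""
snowflake_query = """
    SELECT c.category_name, SUM(s.revenue)
    FROM sales AS s
    JOIN product     AS p  ON p.product_id      = s.product_id
    JOIN subcategory AS sc ON sc.subcategory_id = p.subcategory_id
    JOIN category    AS c  ON c.category_id     = sc.category_id
    GROUP BY c.category_name ORDER BY c.category_name
"""

star_result = conn.execute(star_query).fetchall()
snowflake_result = conn.execute(snowflake_query).fetchall()

# Both layouts answer the question identically; only the query shape differs.
print(star_result == snowflake_result)                            # True
print(star_query.count("JOIN"), snowflake_query.count("JOIN"))    # 1 3
```

Every additional hierarchy level adds one more join that report authors must know about and get right.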
Difference Between Star Schema and Snowflake Schema
| Feature | Star Schema | Snowflake Schema |
|---|---|---|
| Structure | Central fact table with directly connected dimension tables | Fact table with dimension tables further split into sub-tables |
| Design | Denormalized | Normalized |
| Complexity | Simple and easy to understand | More complex with multiple tables |
| Joins | Fewer joins required | More joins required |
| Performance | Faster query performance | Slower compared to star schema |
| Data Redundancy | Higher (data may be repeated) | Lower (minimal redundancy) |
| Storage | Requires more storage | Saves storage space |
| Hierarchy Handling | Limited | Better for hierarchical data |
| Ease of Use | Easy for beginners and business users | Difficult for non-technical users |
| Power BI Usage | Highly recommended | Used only when the model needs normalized, hierarchical dimensions |
Conclusion
The snowflake schema is well suited to complex, hierarchical data, but it comes at the cost of a more complex model and slower queries because of the extra joins.
It is most useful when data integrity and normalization matter more than query simplicity.


