What is a Data Warehouse?
A Data Warehouse (DW) is a large, centralized system used to store data collected from different sources within an organization. This data is organized and prepared mainly for analysis, reporting, and decision-making, not for daily operations.
It is a type of database designed for querying and analysis, rather than handling routine transactions. It stores both current and historical data, helping organizations understand past trends and make better decisions.
In simple terms, a data warehouse:
- Combines data from multiple sources
- Stores large amounts of historical data
- Helps managers and analysts make informed decisions
- Is used for analysis, not daily work
Characteristics of a Data Warehouse
1. Integration
Data from different sources (like databases, spreadsheets, and external
systems) is combined using a process called ETL (Extract, Transform,
Load).
This ensures the data is consistent and clean for analysis.
2. Time-Variant
A data warehouse stores historical data over a long period.
Users can analyze data from:
- 3 months ago
- 6 months ago
- 1 year or more
This helps in identifying trends and changes over time.
3. Subject-Oriented
Data is organized based on key business subjects such as:
- Customers
- Products
- Sales
This makes it easier to analyze specific areas of the business instead of
overall operations.
4. Non-Volatile
Once data is stored in the warehouse:
- It is not frequently updated or deleted
- It remains stable for analysis
This ensures consistency and reliability of data.
Types of Data Warehouse
1. Enterprise Data Warehouse (EDW)
Centralized system for the entire organization
Combines data from all departments
Supports strategic decision-making
2. Data Mart
A smaller part of a data warehouse
Designed for a specific department (like sales or finance)
Provides focused and faster analysis
3. Operational Data Store (ODS)
Stores real-time or near real-time data
Used for quick reporting and operational decisions
4. Offline Data Warehouse
Data is updated periodically (not in real-time)
Suitable when immediate updates are not required
What is OLAP Technology?
OLAP (Online Analytical Processing) is a technology used to analyze data
in multiple dimensions quickly and interactively.
It allows users like analysts and managers to:
- View data from different angles
- Perform complex analysis easily
- Make faster and better decisions
In simple words, OLAP helps users explore data deeply and
interactively.
Types of OLAP Technology
1. ROLAP (Relational OLAP)
Uses relational databases
Data is stored in tables
Flexible but slightly slower
2. MOLAP (Multidimensional OLAP)
Stores data in multidimensional cubes
Very fast for analysis
Requires more storage
3. HOLAP (Hybrid OLAP)
Combines ROLAP and MOLAP
Balances performance and flexibility
4. DOLAP (Desktop OLAP)
Installed on a user’s computer
Allows local data analysis
Does not require a server
5. TOLAP (Temporal OLAP)
Focuses on time-based analysis
Helps study trends over time
Difference Between Data Warehouse and OLAP
1. Purpose
Data Warehouse: Stores and manages large amounts of data
OLAP: Analyzes and explores data
2. Data Structure
Data Warehouse: Uses structured tables (relational model)
OLAP: Uses multidimensional cubes
3. Processing
Data Warehouse: Uses ETL to collect and prepare data
OLAP: Uses operations like:
- Slice
- Dice
- Drill-down
- Pivot