Data Warehouse

A data warehouse is a specialized database system used for storing, managing, and analyzing large volumes of structured data from multiple sources.

Components: 1. Source systems. 2. Staging area. 3. Presentation layer (Reporting). 4. Metadata. 5. ETL tools. Types: Enterprise Data Warehouse (EDW), Operational Data Store (ODS), and Data Mart.

        graph LR
  Center["Data Warehouse"]:::main
  classDef main fill:#7c3aed,stroke:#8b5cf6,stroke-width:2px,color:white,font-weight:bold,rx:5,ry:5;
  classDef pre fill:#0f172a,stroke:#3b82f6,color:#94a3b8,rx:5,ry:5;
  classDef child fill:#0f172a,stroke:#10b981,color:#94a3b8,rx:5,ry:5;
  classDef related fill:#0f172a,stroke:#8b5cf6,stroke-dasharray: 5 5,color:#94a3b8,rx:5,ry:5;
  linkStyle default stroke:#4b5563,stroke-width:2px;

      

🧒 Explain Like I'm 5

Imagine a supermarket. The cash register (a regular [database](/en/terms/database)) only cares about the person currently buying bread. But in the back office, there is a giant filing cabinet (the [Data Warehouse](/en/terms/data-warehouse)) that keeps track of every loaf of bread sold every day for years. The manager uses that cabinet to decide how much bread to order next week. It’s the 'big picture' storage for the whole company.

🤓 Expert Deep Dive

Technically, data warehouses use 'Columnar Storage' rather than 'Row-based Storage' to speed up analytical queries. This allows the system to read only the specific columns needed for a report (e.g., 'Price') without touching millions of 'Customer Names'. Architectural patterns typically involve 'Dimensional Modeling', where data is organized into 'Fact Tables' (events like sales) and 'Dimension Tables' (attributes like time, product, or location). This is often viewed as a 'Star Schema'. Modern cloud warehouses like Snowflake have decoupled 'Storage' from 'Compute', allowing teams to scale their processing power up or down in seconds depending on the complexity of the report.

📚 Sources