Kimball vs. Inmon: The Two Titans of Data Warehouse Architecture
·
17 min read

Table of Contents
When organizations set out to build an enterprise data warehouse (EDW), two foundational schools of thought dominate the landscape:
- Kimball’s Bottom-Up Dimensional Approach
- Inmon’s Top-Down Corporate Information Factory
Both methodologies aim to deliver trusted, consistent, and analytics-ready data, but they differ in philosophy, design flow, and implementation strategy. Let’s dive deep into their differences with examples and practical guidance.
Kimball Methodology – Bottom-Up Approach
The Kimball approach focuses on delivering fast business value by creating data marts organized around business processes.
Architectural Flow
Plain Text
[Source Systems]
↓
[Staging Area]
↓
[Data Marts (Star/Snowflake Schemas)]
↓
[BI / Reporting Tools]
Key Characteristics
- Data Marts First: Each business process (Sales, Inventory, Finance) gets its own data mart.
- Star Schema Design: Fact tables + Dimension tables, highly denormalized.
- Conformed Dimensions: Shared dimensions (like Time, Customer) ensure consistency across marts.
- Rapid Delivery: Business teams see value quickly with focused marts.
👉 Example: Star Schema for Sales
Plain Text
-- Fact table
CREATE TABLE Fact_Sales (
Sale_ID BIGINT PRIMARY KEY,
Date_Key INT,
Product_Key INT,
Customer_Key INT,
Sales_Amount DECIMAL(10,2),
Quantity_Sold INT
);
-- Dimension table: Product
CREATE TABLE Dim_Product (
Product_Key INT PRIMARY KEY,
Product_Name VARCHAR(100),
Category VARCHAR(50),
Price DECIMAL(10,2)
);
👉 BI Query Example (Kimball-style):
Plain Text
SELECT
p.Category,
SUM(f.Sales_Amount) AS Total_Sales
FROM Fact_Sales f
JOIN Dim_Product p ON f.Product_Key = p.Product_Key
GROUP BY p.Category;
- Fast results for business analytics.
- Optimized for querying but not ideal for long-term historical detail.
Inmon Methodology – Top-Down Approach
The Inmon approach focuses on building a centralized EDW in 3NF (normalized form) before creating downstream data marts.
Architectural Flow
Plain Text
[Source Systems]
↓
[ETL / Integration Layer]
↓
[Enterprise Data Warehouse (3NF; normalized)]
↓
[Departmental Data Marts (denormalized)]
↓
[BI / Reporting Tools]
Key Characteristics
- Enterprise Data Warehouse First: A single, corporate-wide EDW in normalized form.
- Subject-Oriented: Data organized around entities like Customer, Product, Order.
- Data Marts Later: Built off EDW as denormalized subsets for reporting.
- Longer Setup: Slower initial delivery but consistent single source of truth.
👉 Example: Inmon EDW (3NF style)
Plain Text
-- Core entities in normalized structure
CREATE TABLE Customer (
Customer_ID INT PRIMARY KEY,
First_Name VARCHAR(50),
Last_Name VARCHAR(50),
Email VARCHAR(100)
);
CREATE TABLE Product (
Product_ID INT PRIMARY KEY,
Name VARCHAR(100),
Category VARCHAR(50)
);
CREATE TABLE Orders (
Order_ID INT PRIMARY KEY,
Customer_ID INT,
Order_Date DATE,
FOREIGN KEY (Customer_ID) REFERENCES Customer(Customer_ID)
);
👉 BI Query Example (Inmon-style, via Data Mart):
Plain Text
SELECT
c.Last_Name,
SUM(o.Sales_Amount) AS Total_Spent
FROM Orders o
JOIN Customer c ON o.Customer_ID = c.Customer_ID
GROUP BY c.Last_Name;
- Strong data consistency & governance.
- EDW as the “single version of truth”.
- Slower to deliver analytics-ready outputs compared to Kimball.
Side-by-Side Comparison
Aspect | Kimball (Bottom-Up) | Inmon (Top-Down) |
Design Flow | Data marts first → EDW by integration | EDW first → data marts by extraction |
Data Model | Denormalized (star/snowflake schemas) | Normalized (3NF) in EDW, marts denormalized |
Speed to Value | Fast, iterative | Slower initial, but enterprise-wide |
Data Consistency | Conformed dimensions | Centralized EDW as golden source |
Flexibility | Easy to extend marts per business area | Harder upfront, but stable enterprise view |
Best For | Agile BI, smaller orgs, fast delivery | Large enterprises, regulatory needs |
Reporting | Queries hit data marts directly | Queries hit marts, not EDW directly |
Visual Comparison Diagram
Kimball (Bottom-Up)

Plain Text
Sales Mart Finance Mart HR Mart
│ │ │
└──► Conformed Dimensions ◄─┘
│
BI Tools
Inmon (Top-Down)

Plain Text
Enterprise Data Warehouse (3NF)
│
┌───────┴────────┐
Sales Mart Finance Mart
│ │
BI Tools BI Tools
Hybrid Approaches
In practice, many organizations blend Kimball and Inmon:
- Use Inmon-style EDW for governance & compliance.
- Build Kimball-style marts for agile analytics.
- Add Data Vault Modeling as a flexible middle layer for scalability.
👉 Modern architectures often look like:
Plain Text
[Raw Data Lake] → [Data Vault / EDW] → [Star Schemas / BI Marts] → [Reports / AI Models]
Business scenario & raw sources
We’re tracking retail sales for customers buying products.
Raw sources (Bronze / landing)
Plain Text
SRC_CUSTOMER(CustomerID, FirstName, LastName, Email, Status, UpdatedAt, SourceSystem)
SRC_PRODUCT(ProductID, Name, Category, ListPrice, UpdatedAt, SourceSystem)
SRC_ORDER_HEADER(OrderID, CustomerID, OrderDate, Status, SourceSystem, ExtractTS)
SRC_ORDER_LINE(OrderID, LineNumber, ProductID, Quantity, UnitPrice, DiscountPct, SourceSystem, ExtractTS)
Common downstream questions:
- Revenue by category / month
- Top customers by spend
- Order status pipeline (open vs shipped)
- Slowly changing attributes (customer status changes)
1) Kimball (bottom-up, star/snowflake)
mental model
deliver subject-area data marts quickly; enforce consistency via conformed dimensions; tables are denormalized for BI.
Plain Text
[Staging] → [Star/Snowflake Marts] → [BI]
schema (star)
Plain Text
DimCustomer DimProduct DimDate
│ │ │
└──────┐ ┌────┘ │
▼ ▼ │
FactSales ◄─────────────────┘
tables (SQL)
Dimensions
Plain Text
CREATE TABLE DimCustomer (
CustomerKey INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
CustomerID VARCHAR(50) NOT NULL, -- natural key
FullName VARCHAR(150),
Email VARCHAR(200),
Status VARCHAR(30),
StartDate DATE NOT NULL,
EndDate DATE,
IsCurrent BOOLEAN NOT NULL,
SourceSystem VARCHAR(50)
);
CREATE TABLE DimProduct (
ProductKey INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
ProductID VARCHAR(50) NOT NULL,
Name VARCHAR(150),
Category VARCHAR(100),
ListPrice DECIMAL(18,2),
StartDate DATE NOT NULL,
EndDate DATE,
IsCurrent BOOLEAN NOT NULL,
SourceSystem VARCHAR(50)
);
CREATE TABLE DimDate (
DateKey INT PRIMARY KEY, -- e.g., 20250131
Date DATE NOT NULL,
Month INT,
MonthName VARCHAR(20),
Year INT,
Quarter INT
);
Fact
Plain Text
CREATE TABLE FactSales (
SalesKey BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
DateKey INT NOT NULL REFERENCES DimDate(DateKey),
CustomerKey INT NOT NULL REFERENCES DimCustomer(CustomerKey),
ProductKey INT NOT NULL REFERENCES DimProduct(ProductKey),
OrderID VARCHAR(50),
LineNumber INT,
Quantity INT,
UnitPrice DECIMAL(18,2),
DiscountPct DECIMAL(5,2),
SalesAmount DECIMAL(18,2), -- Quantity * UnitPrice * (1-Discount)
Status VARCHAR(30),
SourceSystem VARCHAR(50),
LoadTS TIMESTAMP NOT NULL
);
load flow (ELT outline)
- Stage: copy raw → STG_* tables.
- SCD2 upsert to DimCustomer / DimProduct (track status, price changes).
- Derive measures & keys; insert FactSales with surrogate keys.
SCD2 example (DimCustomer)
Plain Text
-- Close out changed rows
UPDATE DimCustomer d
SET EndDate = CURRENT_DATE - 1, IsCurrent = FALSE
FROM Stg_Customer s
WHERE d.CustomerID = s.CustomerID
AND d.IsCurrent = TRUE
AND (d.Email <> s.Email OR d.Status <> s.Status);
-- Insert new current rows
INSERT INTO DimCustomer (CustomerID, FullName, Email, Status, StartDate, EndDate, IsCurrent, SourceSystem)
SELECT CustomerID,
CONCAT(FirstName, ' ', LastName),
Email,
Status,
CURRENT_DATE,
NULL,
TRUE,
SourceSystem
FROM Stg_Customer s
LEFT JOIN DimCustomer d ON d.CustomerID = s.CustomerID AND d.IsCurrent = TRUE
WHERE d.CustomerID IS NULL
OR d.Email <> s.Email
OR d.Status <> s.Status;
Fact load (surrogate lookups)
Plain Text
INSERT INTO FactSales
(DateKey, CustomerKey, ProductKey, OrderID, LineNumber, Quantity, UnitPrice, DiscountPct, SalesAmount, Status, SourceSystem, LoadTS)
SELECT
to_char(h.OrderDate, 'YYYYMMDD')::INT,
dc.CustomerKey,
dp.ProductKey,
l.OrderID,
l.LineNumber,
l.Quantity,
l.UnitPrice,
l.DiscountPct,
l.Quantity * l.UnitPrice * (1 - l.DiscountPct),
h.Status,
l.SourceSystem,
CURRENT_TIMESTAMP
FROM Stg_Order_Line l
JOIN Stg_Order_Header h ON h.OrderID = l.OrderID
JOIN DimCustomer dc ON dc.CustomerID = h.CustomerID AND dc.IsCurrent = TRUE
JOIN DimProduct dp ON dp.ProductID = l.ProductID AND dp.IsCurrent = TRUE;
BI query: monthly revenue by category
Plain Text
SELECT d.Year, d.Month, p.Category, SUM(f.SalesAmount) AS Revenue
FROM FactSales f
JOIN DimDate d ON f.DateKey = d.DateKey
JOIN DimProduct p ON f.ProductKey = p.ProductKey
GROUP BY d.Year, d.Month, p.Category
ORDER BY d.Year, d.Month, p.Category;
dbt-style model (fact)
Plain Text
-- models/marts/f_sales.sql
{{ config(materialized='incremental', unique_key='SalesKey') }}
SELECT ...
FROM {{ ref('stg_order_line') }} l
JOIN {{ ref('stg_order_header') }} h ON h.orderid = l.orderid
JOIN {{ ref('dim_customer') }} dc ON ...
JOIN {{ ref('dim_product') }} dp ON ...
{% if is_incremental() %}
WHERE l.extractts > (SELECT max(loadts) FROM {{ this }})
{% endif %}
pros: fast BI, easy analytics
watchouts: upstream changes can ripple into facts/dims; history depth limited by SCD design
2) Inmon (top-down, EDW-first 3NF)
mental model
build a normalized EDW as enterprise truth; publish data marts off it.
Plain Text
[Staging] → [EDW 3NF] → [Dimensional Data Marts] → [BI]
EDW schema (3NF)
Plain Text
Customer(CustomerID PK, ...status...)
Product(ProductID PK, ...price...)
OrderHeader(OrderID PK, CustomerID FK, OrderDate, Status)
OrderLine(OrderID FK, LineNumber, ProductID FK, Quantity, UnitPrice, DiscountPct)
EDW DDL (trimmed)
Plain Text
CREATE TABLE EDW_Customer (
CustomerID VARCHAR(50) PRIMARY KEY,
FirstName VARCHAR(80),
LastName VARCHAR(80),
Email VARCHAR(200),
Status VARCHAR(30),
RowEffStart TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
RowEffEnd TIMESTAMP,
RecordSource VARCHAR(50)
);
CREATE TABLE EDW_Product (
ProductID VARCHAR(50) PRIMARY KEY,
Name VARCHAR(150),
Category VARCHAR(100),
ListPrice DECIMAL(18,2),
RowEffStart TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
RowEffEnd TIMESTAMP,
RecordSource VARCHAR(50)
);
CREATE TABLE EDW_OrderHeader (
OrderID VARCHAR(50) PRIMARY KEY,
CustomerID VARCHAR(50) NOT NULL REFERENCES EDW_Customer(CustomerID),
OrderDate DATE NOT NULL,
Status VARCHAR(30),
RecordSource VARCHAR(50),
RowLoadTS TIMESTAMP NOT NULL
);
CREATE TABLE EDW_OrderLine (
OrderID VARCHAR(50) REFERENCES EDW_OrderHeader(OrderID),
LineNumber INT,
ProductID VARCHAR(50) REFERENCES EDW_Product(ProductID),
Quantity INT,
UnitPrice DECIMAL(18,2),
DiscountPct DECIMAL(5,2),
RecordSource VARCHAR(50),
RowLoadTS TIMESTAMP NOT NULL,
PRIMARY KEY (OrderID, LineNumber)
);
publish dimensional mart
from the EDW
Plain Text
-- DimCustomer from EDW (SCD2 if needed)
INSERT INTO DimCustomer (...)
SELECT ...
FROM EDW_Customer
-- apply effective dating → SCD2 rows
-- FactSales from EDW tables
INSERT INTO FactSales (...)
SELECT ...
FROM EDW_OrderLine l
JOIN EDW_OrderHeader h ON h.OrderID = l.OrderID
JOIN DimCustomer dc ON dc.CustomerID = h.CustomerID AND dc.IsCurrent = TRUE
JOIN DimProduct dp ON dp.ProductID = l.ProductID AND dp.IsCurrent = TRUE;
pros: enterprise consistency, de-duplication, strong governance
watchouts: longer time-to-BI; more modeling upfront
3) Data Vault 2.0 (agile, auditable backbone)
mental model
store all data, all the time, insert-only with metadata. separate business keys (Hubs), relationships (Links), and descriptive history (Satellites). then publish marts.
Plain Text
[Staging] → [Hubs/Links/Sats] → [PIT/Bridge] → [Star Marts] → [BI]
schema (mini vault)
Plain Text
HubCustomer HubOrder HubProduct
│ │ │
└──────┐ ┌────┘ │
▼ ▼ │
Link_OrderLine ──────────────┘
│ │
│ └─ Sat_OrderLine (qty, price, discount, status…)
└─ Sat_OrderHeader (status, dates…)
Sat_Customer Sat_Product
hashing helpers (Python)
Plain Text
import hashlib
def h(s): return hashlib.md5(s.encode('utf-8')).hexdigest()
def hk(*vals): return h('||'.join(v if v is not None else '' for v in vals))
Hubs
Plain Text
CREATE TABLE Hub_Customer (
Customer_HK CHAR(32) PRIMARY KEY,
Customer_BK VARCHAR(50) NOT NULL, -- business key
LoadDTS TIMESTAMP NOT NULL,
RecordSource VARCHAR(50) NOT NULL
);
CREATE TABLE Hub_Product (
Product_HK CHAR(32) PRIMARY KEY,
Product_BK VARCHAR(50) NOT NULL,
LoadDTS TIMESTAMP NOT NULL,
RecordSource VARCHAR(50) NOT NULL
);
CREATE TABLE Hub_Order (
Order_HK CHAR(32) PRIMARY KEY,
Order_BK VARCHAR(50) NOT NULL,
LoadDTS TIMESTAMP NOT NULL,
RecordSource VARCHAR(50) NOT NULL
);
Link (n-ary relationship: order line)
Plain Text
CREATE TABLE Link_OrderLine (
OrderLine_HK CHAR(32) PRIMARY KEY, -- hash of (Order_BK, LineNumber)
Order_HK CHAR(32) NOT NULL,
Product_HK CHAR(32) NOT NULL,
Customer_HK CHAR(32) NOT NULL,
LoadDTS TIMESTAMP NOT NULL,
RecordSource VARCHAR(50) NOT NULL
);
Satellites (context & history)
Plain Text
CREATE TABLE Sat_Customer (
Customer_HK CHAR(32) NOT NULL,
HashDiff CHAR(32) NOT NULL, -- hash of descriptive attrs
FullName VARCHAR(150),
Email VARCHAR(200),
Status VARCHAR(30),
EffectiveDTS TIMESTAMP NOT NULL,
LoadDTS TIMESTAMP NOT NULL,
RecordSource VARCHAR(50) NOT NULL,
PRIMARY KEY (Customer_HK, EffectiveDTS)
);
CREATE TABLE Sat_Product (
Product_HK CHAR(32) NOT NULL,
HashDiff CHAR(32) NOT NULL,
Name VARCHAR(150),
Category VARCHAR(100),
ListPrice DECIMAL(18,2),
EffectiveDTS TIMESTAMP NOT NULL,
LoadDTS TIMESTAMP NOT NULL,
RecordSource VARCHAR(50) NOT NULL,
PRIMARY KEY (Product_HK, EffectiveDTS)
);
CREATE TABLE Sat_OrderLine (
OrderLine_HK CHAR(32) NOT NULL,
HashDiff CHAR(32) NOT NULL,
LineNumber INT,
Quantity INT,
UnitPrice DECIMAL(18,2),
DiscountPct DECIMAL(5,2),
Status VARCHAR(30),
EffectiveDTS TIMESTAMP NOT NULL,
LoadDTS TIMESTAMP NOT NULL,
RecordSource VARCHAR(50) NOT NULL,
PRIMARY KEY (OrderLine_HK, EffectiveDTS)
);
load pattern (ELT)
- Hubs: insert new BKs (CustomerID, ProductID, OrderID) with *_HK = md5(BK).
- Links: insert relationships using hub HKs + link HK (md5(OrderID||LineNumber)).
- Satellites: compute HashDiff = md5(concat(attrs)); insert only when hash changes.
sat insert (change detection)
Plain Text
INSERT INTO Sat_Customer (Customer_HK, HashDiff, FullName, Email, Status, EffectiveDTS, LoadDTS, RecordSource)
SELECT
md5(c.CustomerID),
md5(coalesce(c.FirstName,'') || '|' || coalesce(c.LastName,'') || '|' || coalesce(c.Email,'') || '|' || coalesce(c.Status,'')),
c.FirstName || ' ' || c.LastName,
c.Email,
c.Status,
c.UpdatedAt,
CURRENT_TIMESTAMP,
c.SourceSystem
FROM Stg_Customer c
LEFT JOIN (
SELECT Customer_HK, HashDiff
FROM Sat_Customer sc
QUALIFY ROW_NUMBER() OVER (PARTITION BY Customer_HK ORDER BY EffectiveDTS DESC) = 1
) last_sc
ON last_sc.Customer_HK = md5(c.CustomerID)
WHERE last_sc.HashDiff IS DISTINCT FROM
md5(coalesce(c.FirstName,'') || '|' || coalesce(c.LastName,'') || '|' || coalesce(c.Email,'') || '|' || coalesce(c.Status,''));
PIT/Bridge (for point-in-time joins & performance)
Plain Text
CREATE TABLE PIT_OrderLine AS
SELECT
l.OrderLine_HK,
l.Order_HK,
l.Product_HK,
l.Customer_HK,
-- latest sat rows as of a calendar grain
s_ol.EffectiveDTS as OL_EffDTS,
s_c.EffectiveDTS as C_EffDTS,
s_p.EffectiveDTS as P_EffDTS
FROM Link_OrderLine l
LEFT JOIN LATEST_SAT_OL s_ol ON s_ol.OrderLine_HK = l.OrderLine_HK
LEFT JOIN LATEST_SAT_CUSTOMER s_c ON s_c.Customer_HK = l.Customer_HK
LEFT JOIN LATEST_SAT_PRODUCT s_p ON s_p.Product_HK = l.Product_HK;
publish a Kimball-style star from the vault
Plain Text
CREATE VIEW DimCustomer AS
SELECT
hc.Customer_HK AS CustomerKey,
sc.FullName,
sc.Email,
sc.Status
FROM Hub_Customer hc
JOIN (
SELECT * FROM Sat_Customer QUALIFY ROW_NUMBER()
OVER (PARTITION BY Customer_HK ORDER BY EffectiveDTS DESC)=1
) sc ON sc.Customer_HK = hc.Customer_HK;
CREATE VIEW DimProduct AS
SELECT
hp.Product_HK AS ProductKey,
sp.Name,
sp.Category,
sp.ListPrice
FROM Hub_Product hp
JOIN (
SELECT * FROM Sat_Product QUALIFY ROW_NUMBER()
OVER (PARTITION BY Product_HK ORDER BY EffectiveDTS DESC)=1
) sp ON sp.Product_HK = hp.Product_HK;
CREATE VIEW FactSales AS
SELECT
to_char(s_ol.EffectiveDTS, 'YYYYMMDD')::INT AS DateKey,
l.Customer_HK AS CustomerKey,
l.Product_HK AS ProductKey,
l.OrderLine_HK,
s_ol.Quantity,
s_ol.UnitPrice,
s_ol.DiscountPct,
s_ol.Quantity * s_ol.UnitPrice * (1 - s_ol.DiscountPct) AS SalesAmount,
s_ol.Status,
s_ol.EffectiveDTS AS EffectiveDTS
FROM Link_OrderLine l
JOIN (
SELECT * FROM Sat_OrderLine QUALIFY ROW_NUMBER()
OVER (PARTITION BY OrderLine_HK ORDER BY EffectiveDTS DESC)=1
) s_ol ON s_ol.OrderLine_HK = l.OrderLine_HK;
pros: insert-only audit trail, agile to new sources, scales massively
watchouts: not query-optimized without PIT/bridges/marts; more objects to manage
🔁 Same BI questions, three implementations
Q1) Revenue by category & month
- Kimball: simple group by on FactSales + DimDate + DimProduct (shown earlier).
- Inmon: either build a view from EDW that aggregates lines or (recommended) use the published mart (same query as Kimball).
- Data Vault: use published star or a vault reporting view over PIT/bridge (query mirrors Kimball once published).
Plain Text
-- works for Kimball, Inmon mart, or DV-published star
SELECT d.Year, d.Month, p.Category, SUM(f.SalesAmount) AS Revenue
FROM FactSales f
JOIN DimDate d ON f.DateKey = d.DateKey
JOIN DimProduct p ON f.ProductKey = p.ProductKey
GROUP BY d.Year, d.Month, p.Category
ORDER BY d.Year, d.Month, p.Category;
Q2) Top 10 customers by spend (last 90 days)
Plain Text
SELECT c.FullName, SUM(f.SalesAmount) AS Spend
FROM FactSales f
JOIN DimCustomer c ON f.CustomerKey = c.CustomerKey
JOIN DimDate d ON f.DateKey = d.DateKey
WHERE d.Date >= CURRENT_DATE - INTERVAL '90 day'
GROUP BY c.FullName
ORDER BY Spend DESC
FETCH FIRST 10 ROWS ONLY;
Q3) Order pipeline (open vs shipped)
- In DV, get latest status from Sat_OrderLine or Sat_OrderHeader (via PIT).
- In Kimball/Inmon marts, status is in Fact or a degenerate dimension.
Plain Text
SELECT Status, COUNT(*) AS Lines
FROM FactSales
GROUP BY Status;
🧱 dbt project structure (side-by-side)
Plain Text
models/
staging/
stg_customer.sql
stg_product.sql
stg_order_header.sql
stg_order_line.sql
-- Kimball mart
kimball/
dim_customer.sql (SCD2)
dim_product.sql (SCD2)
fact_sales.sql
-- Inmon
edw/
edw_customer.sql
edw_product.sql
edw_order_header.sql
edw_order_line.sql
marts_from_edw/
dim_customer.sql
dim_product.sql
fact_sales.sql
-- Data Vault
vault/
hub_customer.sql
hub_product.sql
hub_order.sql
link_orderline.sql
sat_customer.sql
sat_product.sql
sat_orderline.sql
vault_publish/
pit_orderline.sql
dim_customer.sql
dim_product.sql
fact_sales.sql
Example dbt model (vault hub):
Plain Text
-- models/vault/hub_customer.sql
{{ config(materialized='incremental', unique_key='Customer_HK') }}
SELECT
md5(CustomerID) AS Customer_HK,
CustomerID AS Customer_BK,
CURRENT_TIMESTAMP AS LoadDTS,
SourceSystem AS RecordSource
FROM {{ ref('stg_customer') }}
{% if is_incremental() %}
WHERE UpdatedAt > (SELECT COALESCE(MAX(LoadDTS),'1900-01-01') FROM {{ this }})
{% endif %}
which should you choose?
- need fast BI with a few subject areas and clear KPIs? → Kimball.
- need enterprise-wide consistency & governance across many systems? → Inmon (EDW first).
- need agility, auditability, frequent source changes, and scale? → Data Vault backbone + published stars.
Modern stacks often combine them:
Plain Text
Raw/Lake → Data Vault (Silver) → Dimensional Marts (Gold) → BI/ML
if you want, i can zip this into a starter dbt repo (kimball + inmon + data-vault folders, with macros for hashing & SCD2) and include a small seed dataset so you can run models locally. just say the word and i’ll drop the files.
Recommendation
- Choose Kimball if your org prioritizes speed, business focus, and agility.
- Choose Inmon if you need a centralized, consistent, and governed EDW.
- Choose Hybrid if you want both agility and governance, especially in cloud-native lakehouse setups.
Both methodologies remain cornerstones of data warehousing. The right choice depends on scale, governance needs, and time-to-value expectations.