Need some real/sample examples of Information Mart using Raw and Business vault

Hi,

I need real examples of converting a raw data vault into Information Mart (star schema). All the examples I found on the internet are too simple. I need to know the following.

  1. Using PIT and Satellites to create a dimension
  2. Using Bridge (and/or Link) tables to create fact tables.
  3. How multiple hubs and their satellites (with or without a PIT) can be used to create a dimension (e.g., Category and Product)
  4. How staging and CDC can be used in an effective way to achieve all of the above.
  5. How effectivity and status-tracking satellites can be used to create these marts.
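To make item 1 concrete, here is a minimal sketch of joining a PIT table to its satellites to build a snapshot-style dimension. All table names, column names, and data are hypothetical; this is just the shape of the join, not an example from any book.

```python
# Satellite rows keyed by (hub_key, load_date) -- hypothetical data
sat_product_details = {
    ("P1", "2024-01-01"): {"name": "Widget", "colour": "red"},
    ("P1", "2024-02-01"): {"name": "Widget", "colour": "blue"},
}
sat_product_price = {
    ("P1", "2024-01-01"): {"price": 9.99},
}

# PIT: for each hub key and snapshot date, the applicable load date
# per satellite has already been resolved.
pit_product = [
    {"hub_key": "P1", "snapshot_date": "2024-01-15",
     "details_ldts": "2024-01-01", "price_ldts": "2024-01-01"},
    {"hub_key": "P1", "snapshot_date": "2024-02-15",
     "details_ldts": "2024-02-01", "price_ldts": "2024-01-01"},
]

def build_dim_product(pit, details, price):
    """Equi-join the PIT to each satellite on its pre-resolved load date."""
    dim = []
    for row in pit:
        rec = {"hub_key": row["hub_key"], "snapshot_date": row["snapshot_date"]}
        rec.update(details[(row["hub_key"], row["details_ldts"])])
        rec.update(price[(row["hub_key"], row["price_ldts"])])
        dim.append(rec)
    return dim

dim_product = build_dim_product(pit_product, sat_product_details, sat_product_price)
```

The point of the PIT here is that the expensive "which satellite row applies at this date" logic is done once, so the dimension build is a set of simple equi-joins.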

Please suggest some good books with chapters on these topics. I don’t mind Microsoft samples (Northwind, WideWorldImporters, AdventureWorks) either. Currently, I can’t find any good books or internet materials on these topics. Maybe I have been searching the wrong way. So please help me find them.

Thanks and Regards,
Irfan

I do agree that there isn’t a huge amount of content out there on the information-mart layer specifically as it relates to Data Vault (as opposed, of course, to dimensional modelling in general).


If you use auto-increment columns in each of the sats, you can build a Bridge table to simulate the fact table. I’ll show how to do this in an upcoming blog post; no date for when it will come out yet! 🙂
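A rough sketch of that idea (my own illustration, not the upcoming blog post): give each satellite row an auto-increment surrogate key, then let a Bridge row carry those keys plus the measures, so it can stand in for a fact table. All names and values here are made up.

```python
from itertools import count

# Each satellite gets its own auto-increment sequence (hypothetical data)
cust_seq = count(1)
sat_customer = [{"sk": next(cust_seq), "hub_key": "C1", "segment": "retail"}]

order_seq = count(1)
sat_order = [{"sk": next(order_seq), "hub_key": "O1", "amount": 120.0}]

# Bridge row: one surrogate key per participating satellite, plus measures
bridge_sales = [{
    "customer_sk": sat_customer[0]["sk"],
    "order_sk": sat_order[0]["sk"],
    "amount": sat_order[0]["amount"],   # measure materialised into the bridge
}]

# Querying the "fact": join the bridge back to a satellite on its integer key
cust_by_sk = {r["sk"]: r for r in sat_customer}
fact_rows = [{"segment": cust_by_sk[b["customer_sk"]]["segment"],
              "amount": b["amount"]}
             for b in bridge_sales]
```

The joins are on plain integers, which is exactly why the auto-increment columns make the bridge behave like a conventional fact table.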


I am waiting for it desperately. 🙂

Will this be an evolution of this one?

Yes, you might figure it out before I publish!

Temporal surrogate sequence keys: if you pull those into a Bridge table that includes the metrics needed for your fact table, you have facts and dimensions. Reuse the history views over the other sats as pseudo-dimensions (they have the auto-increment column), and you have a star schema you can slice and dice using a date dimension.
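The star described above can be sketched like this (a minimal illustration under my own assumptions; every name and number is invented): the Bridge with its metrics plays the fact, a history view over a satellite plays a pseudo-dimension keyed by its temporal surrogate key, and a date dimension drives the slicing.

```python
from collections import defaultdict

# Pseudo-dimension: history view over a satellite, keyed by temporal surrogate key
hist_product = {1: {"category": "Tools"}, 2: {"category": "Toys"}}

# Date dimension (trivially small here)
dim_date = {"2024-02-01": {"month": "2024-02"},
            "2024-02-09": {"month": "2024-02"}}

# Bridge acting as the fact: integer keys into the pseudo-dimension, plus metrics
bridge = [
    {"product_sk": 1, "order_date": "2024-02-01", "qty": 3},
    {"product_sk": 2, "order_date": "2024-02-09", "qty": 5},
    {"product_sk": 1, "order_date": "2024-02-09", "qty": 2},
]

# Slice and dice: total quantity by category and month, joined on simple integers
totals = defaultdict(int)
for b in bridge:
    key = (hist_product[b["product_sk"]]["category"],
           dim_date[b["order_date"]]["month"])
    totals[key] += b["qty"]
```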


Thank you, Patrick. I get the idea of using temporal surrogate keys to make joins faster, since they are simple integers.
If you can, suggest something on the CDC side. Do you think the PIT can/should be CDC-based as well, to cater to multiple satellites on the next run? Similarly, the same could be true of the bridge table for loading facts. What do you think?

CDC based? Incremental PITs?

By CDC, I meant “change data capture.”

Actually, I assume there are two pipelines. The first one takes everything from the staging/landing zone into the Raw Vault and the Business Vault (PIT and Bridge).

The second one takes everything from the Raw Vault (and Business Vault) into the Information Mart (star schema).

It is the second part I am most interested in right now. You suggested temporal surrogate sequence keys, which will help load the information mart faster. I understand that part.

My question is whether (and how) CDC (i.e., change data capture) can improve the performance of loading the information mart, given that this pipeline will be loading continuously.
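One common incremental pattern (an assumption on my part, not something anyone in this thread has endorsed) is to drive the mart load from the vault's load timestamps: each run applies only rows newer than the last high-water mark, which is the same effect CDC gives you upstream. Names and data below are illustrative only.

```python
# Hypothetical vault rows carrying a monotonically increasing load timestamp
vault_rows = [
    {"hub_key": "C1", "load_ts": 1, "balance": 100},
    {"hub_key": "C1", "load_ts": 2, "balance": 150},
    {"hub_key": "C2", "load_ts": 3, "balance": 80},
]

def incremental_load(mart, rows, high_water_mark):
    """Apply only changes captured after the last successful run;
    return the new high-water mark."""
    new_hwm = high_water_mark
    for r in rows:
        if r["load_ts"] > high_water_mark:
            mart[r["hub_key"]] = r["balance"]   # upsert the latest value
            new_hwm = max(new_hwm, r["load_ts"])
    return new_hwm

mart = {}
hwm = incremental_load(mart, vault_rows, 0)    # initial load picks up everything
hwm = incremental_load(mart, vault_rows, hwm)  # re-run: nothing new, no-op
```

The second call touches no rows, which is the whole point: continuous loading only pays for what changed since the last run.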

I know CDC, mate. I was wondering in what context with respect to Data Vault.
CDC, snapshot, accumulating snapshot: it doesn’t matter.

  • Raw Vault = Hub, Link and Satellites (auditable and non-disposable)
  • Business Vault = Link and Satellites (auditable and non-disposable)
  • Query Assistance = PITs and Bridges (disposable)
  • Information Marts = Flat Wide, Views, Star Schema … whatever (disposable)

Info marts, being disposable, are by default deployed as views until performance lags or better latency is desired. Snowflake’s new Dynamic Tables are a great fit for this area.


Thanks for the clarifications. Do you recommend that historized data (the different values in satellites, each with a start date and an end date) should be maintained in the information mart? If so, how do you expose the complexity of the point-in-time logic to end users?
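To make the question concrete, one way the start/end-date complexity is often hidden (my assumption, not a recommendation from this thread) is to wrap the historized satellite behind an "as of" helper, much as an as-of or current-state view would in the mart. Everything below is hypothetical.

```python
# Historized satellite rows with validity intervals (ISO dates compare as strings)
sat_history = [
    {"hub_key": "P1", "start": "2024-01-01", "end": "2024-02-01", "colour": "red"},
    {"hub_key": "P1", "start": "2024-02-01", "end": "9999-12-31", "colour": "blue"},
]

def as_of(history, hub_key, date):
    """Return the satellite row valid for hub_key at the given date,
    using the half-open interval [start, end)."""
    for row in history:
        if row["hub_key"] == hub_key and row["start"] <= date < row["end"]:
            return row
    return None
```

End users then ask "what did P1 look like on this date" without ever seeing the interval logic.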