How to design data vault in case of multiple source systems and many satellites for each source

Chandra · 20 January 2023 09:29

Hi,
I have Implemented data vault for single source system using dbtvault on Big query. Now I have a use case where data is coming from multiple source systems(3 to 4) for one data product and most of the attributes are same in all source systems and per source system I need to create multiple satellites based on rate of change or type of data(around 15 types).
How to design hubs, Links, and Sats in case of multiple source systems and per system has multiple satellites based on rate of change and type of data.

I found below options.

Creating a satellite for each source system(hub and link will hold all keys) . In this case I may end up creating many sat tables ,so there would be cost and performance effective when joining all tables in next layer.If I create single satellite for all source systems per each type , I have to do union for all source systems records in staging layer before pushing it raw data vault (as dbt vault satellite macro not supporting to load satellite from multiple staging table as of now). But in this case I miss data vault principle like keeping raw vault as close to source.

Can you please suggest how to design in these cases?

Thanks!

FrenkLoonen · 24 January 2023 16:14

Hi Chandra,

combining multiple sources into one satellite is never an option. Your first option sounds like a proper one. Yes, you might end up with more satellites than you would have wanted, but introducing a Point-In-Time table for each hub should take away most of the performance issues you might run into.

Chandra · 24 January 2023 22:39

Thanks FrenkLoonen for the suggestion.

patrickcuba · 5 February 2023 21:47

As @FrenkLoonen stated, it’s not an option

Keeping them seperate is the standard in Raw Vault.

Hub tables are your integration points — or “shared kernel” (in DDD parlance). Keeping raw sats seperate per source maintains scalability of your model

Topic		Replies	Views
Multi hub from one source. Split the satellite ? Data Vault 2.0 hub , satellite	4	98	31 January 2025
Two different source table with same columns, can I load to same satellite or different satellite Data Vault 2.0	7	829	25 October 2022
Splitting the satellite in Business Vault Data Vault 2.0 satellite	2	108	7 November 2024
For JSON Source if there is a LINK why need HUB and SAT ? Data Vault 2.0 business-key , link , hub	3	355	6 February 2023
Raw Vault layer data modeling questions Data Vault 2.0	7	1153	24 May 2022

How to design data vault in case of multiple source systems and many satellites for each source

Related topics