How to design data vault in case of multiple source systems and many satellites for each source

I have Implemented data vault for single source system using dbtvault on Big query. Now I have a use case where data is coming from multiple source systems(3 to 4) for one data product and most of the attributes are same in all source systems and per source system I need to create multiple satellites based on rate of change or type of data(around 15 types).
How to design hubs, Links, and Sats in case of multiple source systems and per system has multiple satellites based on rate of change and type of data.

I found below options.

Creating a satellite for each source system(hub and link will hold all keys) . In this case I may end up creating many sat tables ,so there would be cost and performance effective when joining all tables in next layer.If I create single satellite for all source systems per each type , I have to do union for all source systems records in staging layer before pushing it raw data vault (as dbt vault satellite macro not supporting to load satellite from multiple staging table as of now). But in this case I miss data vault principle like keeping raw vault as close to source.

Can you please suggest how to design in these cases?


1 Like

Hi Chandra,

combining multiple sources into one satellite is never an option. Your first option sounds like a proper one. Yes, you might end up with more satellites than you would have wanted, but introducing a Point-In-Time table for each hub should take away most of the performance issues you might run into.


Thanks FrenkLoonen for the suggestion.

As @FrenkLoonen stated, it’s not an option

Keeping them seperate is the standard in Raw Vault.

Hub tables are your integration points — or “shared kernel” (in DDD parlance). Keeping raw sats seperate per source maintains scalability of your model