Implementing a Data Vault architecture and model across on-prem and cloud (hybrid)


I am new to this group, and I would like to ask for guidance or advice (from those who have done it) on how to implement a Data Vault architecture and model where some parts of the raw data vault are on-prem and other parts are on a cloud platform.

For example, some Customer attributes can be held in cloud storage, but other Customer attributes must stay in on-prem storage.

If this is the case, what is the best practice? Any guidance on how to implement this would be appreciated.


My suggestion is to try a join between data held in the two different locations and measure its performance, then run the same join with all the data in one location and compare.
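As a rough sketch of that comparison, here is a small Python harness. Everything in it is an assumption for illustration: the table names (sat_onprem, sat_cloud), the customer_hk key column, and the use of in-memory SQLite databases as stand-ins for the two locations. In-memory databases have no network latency, so the numbers it prints mean nothing by themselves; the point is the shape of the test. Swap in real connections to your on-prem and cloud platforms to get figures that matter.

```python
import sqlite3
import statistics
import time


def median_seconds(run, repeats=5):
    """Call run() several times and return the median wall-clock time."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def load(conn, table, rows):
    """Create a hypothetical satellite-like table and fill it with rows."""
    conn.execute(f"CREATE TABLE {table} (customer_hk TEXT PRIMARY KEY, attr TEXT)")
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)
    conn.commit()


# Made-up sample data: the same customer keys exist on both sides.
keys = [f"hk{i:05d}" for i in range(2000)]
onprem_rows = [(k, f"onprem-{k}") for k in keys]
cloud_rows = [(k, f"cloud-{k}") for k in keys]

# Case 1: both tables in one location, joined by the database engine.
colocated = sqlite3.connect(":memory:")
load(colocated, "sat_onprem", onprem_rows)
load(colocated, "sat_cloud", cloud_rows)

# Case 2: each table in its own database (stand-ins for on-prem and cloud).
onprem = sqlite3.connect(":memory:")
cloud = sqlite3.connect(":memory:")
load(onprem, "sat_onprem", onprem_rows)
load(cloud, "sat_cloud", cloud_rows)


def join_colocated():
    return colocated.execute(
        "SELECT o.customer_hk, o.attr, c.attr "
        "FROM sat_onprem o JOIN sat_cloud c ON c.customer_hk = o.customer_hk"
    ).fetchall()


def join_split():
    # Pull one side across to the client, then join on the key in Python.
    remote = dict(cloud.execute("SELECT customer_hk, attr FROM sat_cloud"))
    local = onprem.execute("SELECT customer_hk, attr FROM sat_onprem")
    return [(hk, attr, remote[hk]) for hk, attr in local if hk in remote]


print("co-located join, median seconds:", median_seconds(join_colocated))
print("split join, median seconds:", median_seconds(join_split))
```

With real systems, the split case would also include the time to move rows over the network, which is the cost worth measuring at realistic data volumes.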


Sounds like two different vaults!

A DV has many joins, and if you’re joining data between on-prem and cloud, what is the cost of running those joins across different locations?

There isn’t any guide because I don’t think anyone does this within a data vault. A DV includes historical data, and having a client join that data across the ether just sounds impractical!