Hello, in one use case we have data coming from 20+ sources, which will be loaded into the RDV as separate source-driven satellites, so 20 source feeds will load 20 different satellites.
Next, we are divided on the best approach for the Business Data Vault: should the source-based satellites (some of which need more enrichment and transformation than others) be merged into one harmonized satellite in the Business Data Vault, or does it make more sense to maintain the source-based split in the Business Data Vault satellites as well? What factors should be considered when making this choice?
Other details: data will be loaded daily, with some 100 million records per day from one source, on an Apache Spark-based cloud platform.
If you merge them into one satellite in the BDV, how will you align the timelines?
My gut says to leave them separate in the BDV. In the InfoMart layer you can then combine information from the various sources. Depending on your requirements, you may need query assistance tables such as a PIT (Point-In-Time) table to align your timelines.
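To make the timeline alignment concrete, here is a minimal sketch of how a PIT table could be derived. It is plain Python rather than Spark, and everything in it is an assumption for illustration: the satellite names (`sat_crm`, `sat_erp`), the hub key, the load dates, and the ghost-record date are all made up. For each hub key and snapshot date, it records the latest load date per satellite that is on or before the snapshot.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical load dates per hub key for two source-driven satellites.
sat_crm_loads = {"cust-1": [date(2024, 1, 1), date(2024, 1, 5)]}
sat_erp_loads = {"cust-1": [date(2024, 1, 3)]}

# Assumed ghost-record date, used when a satellite has no row yet.
GHOST = date(1900, 1, 1)


def latest_on_or_before(load_dates, snapshot):
    """Return the most recent load date <= snapshot, or GHOST if none exists."""
    i = bisect_right(load_dates, snapshot)
    return load_dates[i - 1] if i else GHOST


def build_pit(hub_keys, snapshots, satellites):
    """Build one PIT row per (hub key, snapshot date), one load-date column per satellite."""
    rows = []
    for key in hub_keys:
        for snap in snapshots:
            row = {"hub_key": key, "snapshot_date": snap}
            for name, loads in satellites.items():
                ordered = sorted(loads.get(key, []))
                row[name + "_ldts"] = latest_on_or_before(ordered, snap)
            rows.append(row)
    return rows


pit = build_pit(
    ["cust-1"],
    [date(2024, 1, 2), date(2024, 1, 4), date(2024, 1, 6)],
    {"sat_crm": sat_crm_loads, "sat_erp": sat_erp_loads},
)
for row in pit:
    print(row)
```

Each satellite keeps its own timeline; the PIT row only records which version of each one was current at the snapshot, so nothing has to be merged in the BDV to get an aligned view.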
Just remember, BV is not a replica of RV. Satellite splitting should already be done in RV, so if you want to bring data together in BV, what is the purpose? Are you applying a business rule? Adding transformations?
As @Carl mentioned, if you’re looking to bring content together, use PITs. They’re intended to make efficient use of a platform’s OLAP hash-joining; see here: Simple PIT table Constructs
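As a rough illustration of why a PIT suits hash joins: once the PIT row carries the exact load date for each satellite, every satellite lookup becomes an equality match on (hub key, load date) instead of a date-range comparison. The sketch below mimics that with Python dictionaries standing in for the hash join. The satellite names, attributes, and values are hypothetical, and the PIT row is hard-coded rather than derived.

```python
from datetime import date

# Hypothetical satellite contents keyed by (hub key, load date).
sat_crm = {
    ("cust-1", date(2024, 1, 1)): {"email": "a@example.com"},
    ("cust-1", date(2024, 1, 5)): {"email": "b@example.com"},
}
sat_erp = {
    ("cust-1", date(2024, 1, 3)): {"credit_limit": 5000},
}

# A pre-built PIT row: for this snapshot, which load date applies per satellite.
pit = [
    {
        "hub_key": "cust-1",
        "snapshot_date": date(2024, 1, 4),
        "sat_crm_ldts": date(2024, 1, 1),
        "sat_erp_ldts": date(2024, 1, 3),
    },
]


def resolve(pit_rows):
    """Combine satellite attributes via exact-key lookups driven by the PIT."""
    out = []
    for p in pit_rows:
        record = {"hub_key": p["hub_key"], "snapshot_date": p["snapshot_date"]}
        # Equality lookups only: no BETWEEN-style range predicates are needed.
        record.update(sat_crm.get((p["hub_key"], p["sat_crm_ldts"]), {}))
        record.update(sat_erp.get((p["hub_key"], p["sat_erp_ldts"]), {}))
        out.append(record)
    return out


mart = resolve(pit)
print(mart[0])
```

On a real platform the same shape would be expressed as equi-joins from the PIT to each satellite, which is the kind of join a hash-join strategy can handle.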