Hi, let's say I have two source tables, agents_tx and agents_ny, from the same source system. They have exactly the same columns; they are two different tables only because of the way the source system loads data. Would you recommend loading them into two different satellites or the same satellite?
How does the source system store the data? Based on the information you gave, I would be inclined to load them to the same satellite.
I think I would build two satellites, plus one business satellite (with the combined data) as a view.
Always separate. If one evolves and the other doesn’t, what then?
As tempting as it is, I would keep them as separate satellites – they are two independent sources, after all. The Business Data Vault would be the place to bring them together (when appropriate).
In a previous discussion, we talked about data from different implementations of an application that had the same identifier value for different “things” (for example, customer number 1 in appA is a different customer than customer number 1 in appB). By keeping these in separate satellites, you would not have any issues with the load, and you could apply the necessary logic to bring them together appropriately when needed (e.g. by adding a tag that identifies the source system).
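To make the tagging idea concrete, here is a minimal sketch in Python. It assumes the tag is folded into the key before hashing; the function name `hub_hash_key`, the `||` delimiter, the upper-casing, and the choice of MD5 are all illustrative assumptions, not something prescribed in this thread.

```python
import hashlib

def hub_hash_key(business_key: str, source_tag: str) -> str:
    """Hypothetical helper: combine a source tag with the business key
    so that identical identifiers from different apps do not collide."""
    # Normalisation (trim + upper-case) and the '||' delimiter are assumptions.
    normalized = f"{source_tag.strip().upper()}||{business_key.strip().upper()}"
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()

# Customer number 1 exists in both apps but refers to different customers.
key_a = hub_hash_key("1", "appA")
key_b = hub_hash_key("1", "appB")
print(key_a != key_b)
```

With the tag included, the two customers get different keys, so bringing the data together later does not merge them by accident.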
You should not copy raw vault data into the business vault; instead, just create a view to consolidate the two if you wish.
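As a rough illustration of the view approach, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The table names (`sat_agents_tx`, `sat_agents_ny`), the view name (`bv_sat_agents`), and the columns are assumptions made up for the example; a real satellite would carry more metadata.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Two raw vault satellites with identical structure (hypothetical columns).
CREATE TABLE sat_agents_tx (hub_agent_hk TEXT, load_dts TEXT, agent_name TEXT);
CREATE TABLE sat_agents_ny (hub_agent_hk TEXT, load_dts TEXT, agent_name TEXT);
INSERT INTO sat_agents_tx VALUES ('hk1', '2020-01-01', 'Alice');
INSERT INTO sat_agents_ny VALUES ('hk2', '2020-01-02', 'Bob');

-- The consolidation is a view: no raw vault data is copied.
CREATE VIEW bv_sat_agents AS
SELECT hub_agent_hk, load_dts, 'agents_tx' AS record_source, agent_name
FROM sat_agents_tx
UNION ALL
SELECT hub_agent_hk, load_dts, 'agents_ny' AS record_source, agent_name
FROM sat_agents_ny;
""")

rows = conn.execute(
    "SELECT record_source, agent_name FROM bv_sat_agents ORDER BY record_source"
).fetchall()
print(rows)
```

The `record_source` column keeps track of which satellite each row came from, so the combined view can still be split apart if one source evolves.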
I have one scenario where I am tempted to load to the same satellite. It’s the same source, with the same schema; it’s just that my delivery systems are different. My company has two integration layers, with one being decommissioned for this source on 2019-12-31 and the other one being used from 2020-01-01 onwards. The integration layer is where I’m getting my data from (not directly from the sources), and because the two layers are mutually exclusive for this source, I think I can get away with loading to one satellite.
However, there’s another source that was decommissioned around 2020-03-31, while the new delivery system had already started on 2020-01-01. If I want to load to one satellite, I must make sure that I stop loading from the old delivery system after 2019-12-31. In this case, I’m more tempted to load everything that I can from that system and decide what the golden source is on the way to my information mart. For that, I will need to use separate satellites.
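For the overlapping case, deciding the golden source downstream could look something like this sketch. It assumes a simple cutover rule (old delivery system before 2020-01-01, new delivery system from that date on); the function `pick_golden`, the row layout, and the sample rows are all hypothetical, and a real rule might be more involved.

```python
from datetime import date

CUTOVER = date(2020, 1, 1)  # assumed cutover date for the golden-source rule

def pick_golden(old_rows, new_rows, cutover=CUTOVER):
    """Rows are (business_key, load_date, payload) tuples read from the
    two separate satellites. Keep old-system rows before the cutover and
    new-system rows from the cutover onwards."""
    golden = [r for r in old_rows if r[1] < cutover]
    golden += [r for r in new_rows if r[1] >= cutover]
    return sorted(golden, key=lambda r: (r[0], r[1]))

# Both satellites hold everything that was delivered, including the overlap.
old_sat = [("C1", date(2019, 12, 30), "old-a"), ("C1", date(2020, 2, 1), "old-b")]
new_sat = [("C1", date(2020, 1, 15), "new-a")]

golden = pick_golden(old_sat, new_sat)
print(golden)
```

Because both satellites keep all delivered rows, the rule can be changed later without reloading anything.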
At a minimum, maintain the separation of the source data as provided. Aggregating later is far easier than disaggregating. Separate further based on PII, rate of change, etc. You’ll have lots of sats, but that’s a strength.