DV Implementation

We currently run a production system built on a dimensional model that uses surrogate keys and business keys. In parallel, we are building a Data Vault 2.0 warehouse. The plan is to run both systems side by side for a year so that business stakeholders can carry out user acceptance testing (UAT). We are concerned about problems caused by the different key choices in the two systems: the legacy system may use primary keys (PK), surrogate keys (SK), or business keys, while Data Vault 2.0 will use hash keys and business keys. These differences could lead to inconsistencies or mismatches when the data is compared, so we want to plan mitigations for the transition period. How can we ensure the data matches apples to apples across the two systems?

… by doing your due diligence…

I’d appreciate it if you could elaborate, so we don’t end up in a situation where end users or customers scratch their heads because the new system shows different keys from the ones they are used to seeing.

Well, what you’re after is typically what consultants charge for. Sure, your customers might be scratching their heads, but quality input is not free.
This forum is mostly about modelling problems; this thread, I think, is the kind of thing a consultancy would spend whiteboard sessions working through, and get paid for.

DV 2.0 hubs are defined by business keys, not surrogate keys. You might bring in the SK from a source system and keep it in a satellite, but only the true business keys should be in the hub. The only potential conflict between business keys arises when the same business key value is provided by two independent systems (think of an item number from two different sources), in which case a business key collision code is recommended.
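To make the collision code idea concrete, here is a minimal Python sketch of a hub hash key that includes a business key collision code. The function name `hub_hash_key`, the `||` delimiter, the trim-and-uppercase normalization, the choice of MD5, and the `ERP` / `WMS` codes are all assumptions for illustration; your own hashing standard may differ.

```python
import hashlib


def hub_hash_key(business_key_parts, bkcc="", delimiter="||"):
    """Build a hub hash key from business key parts (hypothetical helper).

    Assumed convention: trim and uppercase every part, prepend the
    business key collision code (bkcc) when one is given, join the parts
    with a delimiter, and hash the result with MD5.
    """
    normalized = [str(part).strip().upper() for part in business_key_parts]
    if bkcc:
        normalized = [bkcc.strip().upper()] + normalized
    payload = delimiter.join(normalized)
    return hashlib.md5(payload.encode("utf-8")).hexdigest().upper()


# The same item number arriving from two independent sources.
erp_key = hub_hash_key(["12345"], bkcc="ERP")
wms_key = hub_hash_key(["12345"], bkcc="WMS")

# Different collision codes keep the two items apart in the hub.
print(erp_key != wms_key)
```

Because the hash is derived only from the business key (plus the collision code), the business key itself stays the common ground for comparing the legacy model with the vault; the legacy SK never needs to match the hash key.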

I’m going to assume that “variance in key structure” means that different source systems define the business key at a different “grain” (e.g. one defines an item by item number, while the other defines it as item number + size). Analysis should help identify those cases. (This can be a challenging conversation when each part of the business is convinced that their way is the only correct way.)
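One way to start that analysis is to test whether a candidate business key is actually unique in each source. The sketch below is only an illustration: `duplicate_keys`, the column names, and the sample rows are made up, and it assumes the source rows are already deduplicated.

```python
from collections import Counter


def duplicate_keys(rows, key_columns):
    """Return the candidate key values that occur more than once.

    An empty result suggests key_columns matches the grain of the rows;
    a non-empty result suggests the key needs more columns.
    """
    counts = Counter(tuple(row[col] for col in key_columns) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


# Made-up sample from a source that treats item number + size as the item.
sample_rows = [
    {"item_number": "A100", "size": "S"},
    {"item_number": "A100", "size": "M"},
    {"item_number": "B200", "size": "L"},
]

# Item number alone repeats, so it is too coarse for this source.
print(duplicate_keys(sample_rows, ["item_number"]))

# Item number + size is unique, so that is the grain here.
print(duplicate_keys(sample_rows, ["item_number", "size"]))
```

Running the same check against each source with the same candidate key shows quickly where the definitions of “item” diverge.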

I hope that’s helpful.