Can a single hub store records from different sources with different natural keys?

Can a single hub store records for different natural keys (each key concatenated from different natural-key columns in different sources)?
For example:
Source 1 hash key
H_Customer
cust_hash_key = (concatenation of fname, last name, source1)

Source 2 hash key
H_Customer
cust_hash_key = (concatenation of full_name, source1)

I am looking for the right approach in Data Vault with respect to AutomateDV.

No… this is an anti-pattern

Thanks @patrickcuba. What if my hub has a superset of the natural keys from both source systems?
H_Customer
Cust_Hash_Key: fname, last name, full_name, source
For records from source1, full_name will be null; for records from source2, the fname and last name fields will be null.

These are not business keys

Hi @shuja,

In a hub we store the business key, the full business key and nothing but the business key. So the question is: what is the stable (over time), unique identifier with which one can identify an instance of the business object?
So in your case: how does the business identify a unique customer? If that is fname + lastname, then the hub contains those two fields and nothing else.
Now, the same name might come in from two different sources. Assuming you’ve picked the correct business key, that by definition means it’s data about the very same customer, so no new hub record needs to be created when the second source delivers its data.
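
To make that concrete, here is a minimal Python sketch of a hub load fed by two sources. The names (`hash_key`, `load_hub`, `hub_customer`) and the normalisation (trim, upper-case, `||` delimiter, MD5) are my own assumptions for illustration, not AutomateDV’s actual implementation.

```python
import hashlib

def hash_key(*parts):
    """Hash a business key: trim and upper-case each part, join with '||', then MD5."""
    normalised = "||".join(str(p).strip().upper() for p in parts)
    return hashlib.md5(normalised.encode("utf-8")).hexdigest()

# Hypothetical in-memory hub: hash key -> hub row
hub_customer = {}

def load_hub(rows, record_source):
    """Insert a hub row only when its business key has not been seen before."""
    for row in rows:
        hk = hash_key(row["fname"], row["lname"])
        # setdefault leaves an existing row untouched, so the first source wins
        hub_customer.setdefault(hk, {
            "fname": row["fname"],
            "lname": row["lname"],
            "record_source": record_source,
        })

load_hub([{"fname": "Ada", "lname": "Lovelace"}], "source1")
load_hub([{"fname": "ada ", "lname": "LOVELACE"},
          {"fname": "Alan", "lname": "Turing"}], "source2")

ada = hub_customer[hash_key("Ada", "Lovelace")]
```

The second load adds only the new customer; the customer already delivered by source1 keeps its original hub row and record source.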

All attributes other than the business key are stored in a satellite per source. A business-vault satellite is then used to add derived values and/or to pick and choose attribute values for ‘the golden record’ from the separate source-based satellites.
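
As a rough sketch of that pick-and-choose step, assuming a simple "first non-null value wins, in satellite priority order" rule (a real business rule may well be more involved), with made-up satellite contents and a hypothetical `golden_record` helper:

```python
# Hypothetical per-source satellites, keyed by the hub hash key
sat_customer_source1 = {"hk1": {"email": "ada@example.com", "phone": None}}
sat_customer_source2 = {"hk1": {"email": None, "phone": "555-0100"}}

def golden_record(hk, satellites):
    """Take, per attribute, the first non-null value in satellite priority order."""
    merged = {}
    for sat in satellites:
        for attr, value in sat.get(hk, {}).items():
            # Only fill attributes that are still missing or null
            if merged.get(attr) is None:
                merged[attr] = value
    return merged

golden = golden_record("hk1", [sat_customer_source1, sat_customer_source2])
```

Here the email comes from source1 and the phone number from source2, while both satellites stay untouched as the raw, source-specific history.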

Hope this helps!

Some additional notes:

  • Customers are a painful business object. In lots of cases it’s hard to determine a correct business key. What I’ve seen most often is that an e-mail address is used for this, combined with a same-as link to identify the use of multiple e-mail addresses by the same customer over time, if you are able to detect that at all.
  • Sometimes one might be forced to use ‘customer number’ as the business key, for lack of anything better. In those cases you might end up with a business key containing both the customer number and the source system, as both systems might issue the same customer number to different customers.
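
The second case can be sketched as follows. The function name, the source-system codes and the hashing recipe (trim, upper-case, `||` delimiter, MD5) are assumptions for illustration only.

```python
import hashlib

def customer_hash_key(customer_number, source_system):
    """Hash customer number plus source system so equal numbers do not collide."""
    raw = "||".join([
        str(source_system).strip().upper(),
        str(customer_number).strip().upper(),
    ])
    return hashlib.md5(raw.encode("utf-8")).hexdigest()

# Both systems issued customer number 1001, but to different customers
crm_key = customer_hash_key("1001", "crm")
webshop_key = customer_hash_key("1001", "webshop")
```

With the source system in the key, the two customers get separate hub rows; a same-as link can still tie them together later if they turn out to be the same person.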