Can a single hub store records whose natural keys differ per source (i.e. keys built by concatenating different natural-key columns in each source)?
For example:
Source 1 hash key
H_Customer
cust_hash_key = (concatenation of fname, last name, source1)
Source 2 hash key
H_Customer
cust_hash_key = (concatenation of full_name, source1)
I am looking for the right approach in Data Vault with respect to AutomateDV.
Thanks @patrickcuba. What if my hub has a superset of the natural keys from both source systems?
H_Customer
Cust_Hash_Key: fname, last name, full_name, source
For records from source 1, full_name will be null; for records from source 2, the fname and last name fields will be null.
In a hub we store the business key, the full business key and nothing but the business key. So the question is: what is the stable (over time), unique identifier by which one can identify an instance of the business object?
So in your case: how does the business identify a unique customer? If that is fname + last name, then the hub contains those two fields and nothing else.
Now, the same name might come in from two different sources. Assuming you’ve picked the correct business key, that by definition means it’s data about the very same customer, so no new hub record needs to be created when the second source delivers its data.
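To make that concrete, here is a minimal Python sketch of the idea, not of what AutomateDV generates. The function names, the MD5 choice, the `||` delimiter and the trim/upper-case normalisation are all assumptions for illustration; check your own hashing configuration before relying on any of them.

```python
import hashlib

def hub_hash_key(*business_key_parts):
    """Hash the normalised business key parts into one hub hash key.
    Normalisation (trim + upper-case) and the '||' delimiter are assumptions."""
    normalised = [str(part).strip().upper() for part in business_key_parts]
    return hashlib.md5("||".join(normalised).encode("utf-8")).hexdigest().upper()

def load_hub(hub, rows, record_source):
    """Insert only business keys the hub does not hold yet."""
    for fname, last_name in rows:
        hk = hub_hash_key(fname, last_name)
        if hk not in hub:  # an already-known key is skipped, whatever the source
            hub[hk] = {
                "fname": fname,
                "last_name": last_name,
                "record_source": record_source,  # first source to deliver the key
            }

h_customer = {}
load_hub(h_customer, [("Ada", "Lovelace")], "source1")
# source2 delivers the same customer (different casing/whitespace) plus a new one
load_hub(h_customer, [("ada ", "LOVELACE"), ("Alan", "Turing")], "source2")

print(len(h_customer))  # 2
```

Note that the source system is deliberately not part of the hashed key here: including it would give the same customer a different hash key per source.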
The attributes other than the business key are stored in a satellite per source. A business-vault satellite is then used to add derived values and/or to pick and choose attribute values for ‘the golden record’ from the separate source-based satellites.
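A rough sketch of that pick-and-choose step, in plain Python. The satellite contents, the hash key `HK_ADA` and the preference rule (first non-null value in a fixed source order) are hypothetical; real survivorship rules come from the business.

```python
# One satellite per source, both keyed on the same hub hash key (hypothetical data).
satellites = {
    "source1": {"HK_ADA": {"email": "ada@example.com", "phone": None}},
    "source2": {"HK_ADA": {"email": "a.lovelace@example.org", "phone": "555-0100"}},
}

# Assumed rule: per attribute, take the first non-null value in this source order.
PREFERENCE = {
    "email": ["source1", "source2"],
    "phone": ["source1", "source2"],
}

def golden_record(hash_key, satellites, preference):
    """Build a business-vault style record from the source satellites."""
    record = {}
    for attribute, sources in preference.items():
        record[attribute] = None
        for source in sources:
            value = satellites[source].get(hash_key, {}).get(attribute)
            if value is not None:
                record[attribute] = value
                break
    return record

golden = golden_record("HK_ADA", satellites, PREFERENCE)
print(golden)  # {'email': 'ada@example.com', 'phone': '555-0100'}
```

The source satellites stay untouched; the golden record is derived from them and can be rebuilt whenever the rule changes.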
Hope this helps!
Some additional notes:
Customers are a painful business object. In lots of cases it’s hard to determine a correct business key. What I’ve seen most often is that an e-mail address is used for this, combined with a same-as link to identify the use of multiple e-mail addresses by the same customer over time, if you are able to detect that at all.
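As a small illustration of what a same-as link gives you, the sketch below maps each duplicate e-mail address to the one treated as the master. The addresses and the dictionary layout are made up; in a real vault this would be a link table between two hub hash keys, and detecting the duplicates is the hard part.

```python
# Hypothetical same-as link: duplicate e-mail address -> master e-mail address.
same_as = {
    "a.lovelace@example.org": "ada@example.com",
    "ada.old@example.net": "a.lovelace@example.org",  # chains are possible
}

def resolve_master(email, same_as):
    """Follow same-as entries until an address without a master is reached."""
    seen = set()
    while email in same_as and email not in seen:  # 'seen' guards against cycles
        seen.add(email)
        email = same_as[email]
    return email

print(resolve_master("ada.old@example.net", same_as))  # ada@example.com
print(resolve_master("alan@example.com", same_as))     # alan@example.com
```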
Sometimes one might be forced to use ‘customernumber’ as the business key, for lack of anything better. In those cases you might end up with a business key containing both the customer number and the source system, as two systems might hand out the same customer number to different customers.
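The collision is easy to show with a few lines of Python. The rows, the system names `CRM` and `WEBSHOP`, and the hashing details (MD5, `||` delimiter, trim/upper-case) are assumptions for illustration only.

```python
import hashlib

def hash_key(*parts):
    """Hash the delimited, normalised key parts (assumed convention)."""
    joined = "||".join(str(part).strip().upper() for part in parts)
    return hashlib.md5(joined.encode("utf-8")).hexdigest().upper()

# Two systems that each issued customer number 1001 to a different person.
crm_row = {"customer_number": "1001", "name": "Ada Lovelace", "source_system": "CRM"}
shop_row = {"customer_number": "1001", "name": "Alan Turing", "source_system": "WEBSHOP"}

# Customer number alone: both rows land on the same hub record.
number_only_equal = (
    hash_key(crm_row["customer_number"]) == hash_key(shop_row["customer_number"])
)

# Customer number + source system: the two customers stay separate.
with_source_equal = (
    hash_key(crm_row["customer_number"], crm_row["source_system"])
    == hash_key(shop_row["customer_number"], shop_row["source_system"])
)

print(number_only_equal)  # True
print(with_source_equal)  # False
```

The trade-off is that the same real-world customer in both systems now gets two hub records, so you would need something like a same-as link to tie them together again.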