I have two tables:
Customer_Core with PK CustomerID that I load into Hub_Customer and Sat_Customer
Customer_Additional with PK CustomerID+Method+Number
Method is always a 2-character string (of about 20 distinct values) for which I haven’t figured out what value means what. In current implementations using this same table, they always filter on value ‘01’ but I feel like this is done solely to fix the grain difference in the easiest way. ‘01’ always exists.
Number is just an extra unique identifier, so it seems, that only one rare occasions makes a record unique. Looks like a table-scoped sequence (instead of a record-scoped sequence).
Basically, the table contains funky data, that I just have to deal with.
How would you model this table?
MAS_Customer_Additional with both Method+Number as child dependent keys, OR
(weak) Hub_Customer_Additional with a (concatenated or composite?) three-part BK and Link_Customer_Customer_Additional
I’m in favor of the first. I’ll end up with fewer tables and going towards the BDV it will be easier to apply a business rule (where I most probably will be doing some filtering similar to the current implementation) to end up with a satellite with the same grain as Customer.
My client had all its data analysts and engineers do the CDVDM certification so I’m trying to figure out which patterns would be preferrable. Personally I’ve always preferred satellites with DCK’s.
By now I don’t think there’s any article of yours I myself haven’t yet read… with this particular post, I was already waiting for the right moment to drop this bomb on them
I’m going to order your book tonight (I really am!), hoping to find the answer in there as well. In the meantime, do you happen to have a blog post somewhere that describes the difference between MSat and a Satellite with a dependent-child key (and isn’t there a nice abbreviation for that as well?)? I think it will be similar to the explanation in this blog post, right?
Is the dbtvault implementation of an MSat what you call a DCK Sat? (also, kind of funny, the use the term Child-Dependent Key, not Dependent Child Key)
Ah! The book I spent a largish section on the difference and their use case. Where choosing one or the other may succeed or fail depending on how you structure them.
I believe dbtvault has implemented the msat style from the book! @alex.higgs or @neil.strange can confirm!
Book also includes the load code as a template for the SQL literate to grasp how these sats work.
The sat with a dep-key code doesn’t differ from a regular Sat load code too much.
Btw Christian follows Ensemble Data Vault, not Data Vault 2.0. Ensemble does not allow for dep-keys in sats, instead each of these dep-keys end up being a hub!
That means you end up with far too many tables to join on whereas the DV2.0 approach keeps this number down!
Sweet, looking forward to the book then! It makes me wonder, what the dbtvault team has used for inspiration (or in a more literal sense, where they found the SQL or to what degree they came up with the SQL themselves) for their implementation of all patterns.
Really appreciate your swift and elaborate responses