Multi-Active Satellite or Weak Hub + Link?

FrenkLoonen · 16 June 2022 08:19

Which flavor do you guys like best?

I have two tables:
Customer_Core with PK CustomerID that I load into Hub_Customer and Sat_Customer
Customer_Additional with PK CustomerID+Method+Number

Method is always a 2-character string (of about 20 distinct values) for which I haven’t figured out what value means what. In current implementations using this same table, they always filter on value ‘01’ but I feel like this is done solely to fix the grain difference in the easiest way. ‘01’ always exists.
Number is just an extra unique identifier, so it seems, that only one rare occasions makes a record unique. Looks like a table-scoped sequence (instead of a record-scoped sequence).
Basically, the table contains funky data, that I just have to deal with.

How would you model this table?

MAS_Customer_Additional with both Method+Number as child dependent keys, OR
(weak) Hub_Customer_Additional with a (concatenated or composite?) three-part BK and Link_Customer_Customer_Additional

I’m in favor of the first. I’ll end up with fewer tables and going towards the BDV it will be easier to apply a business rule (where I most probably will be doing some filtering similar to the current implementation) to end up with a satellite with the same grain as Customer.

All thoughts are appreciated!

patrickcuba · 16 June 2022 15:07

No such think as “weak” entities in DV2.0; that is an Ensemble DV concept.

Satellite with a dependent-child key (your method column)
Or if you wish to treat your descriptive content as a SET then use an MSat

FrenkLoonen · 16 June 2022 17:07

My client had all its data analysts and engineers do the CDVDM certification so I’m trying to figure out which patterns would be preferrable. Personally I’ve always preferred satellites with DCK’s.

patrickcuba · 16 June 2022 17:16

That’s unfortunate

Get them to read this

FrenkLoonen · 16 June 2022 17:32

By now I don’t think there’s any article of yours I myself haven’t yet read… with this particular post, I was already waiting for the right moment to drop this bomb on them

FrenkLoonen · 21 June 2022 12:58

@patrickcuba

I’m going to order your book tonight (I really am!), hoping to find the answer in there as well. In the meantime, do you happen to have a blog post somewhere that describes the difference between MSat and a Satellite with a dependent-child key (and isn’t there a nice abbreviation for that as well?)? I think it will be similar to the explanation in this blog post, right?

Is the dbtvault implementation of an MSat what you call a DCK Sat? (also, kind of funny, the use the term Child-Dependent Key, not Dependent Child Key)

patrickcuba · 21 June 2022 14:12

Hello!

Ah! The book I spent a largish section on the difference and their use case. Where choosing one or the other may succeed or fail depending on how you structure them.

I believe dbtvault has implemented the msat style from the book! @alex.higgs or @neil.strange can confirm!

Book also includes the load code as a template for the SQL literate to grasp how these sats work.
The sat with a dep-key code doesn’t differ from a regular Sat load code too much.

I don’t have a blog on that I’m afraid!

patrickcuba · 21 June 2022 14:15

Btw Christian follows Ensemble Data Vault, not Data Vault 2.0. Ensemble does not allow for dep-keys in sats, instead each of these dep-keys end up being a hub!

That means you end up with far too many tables to join on whereas the DV2.0 approach keeps this number down!

patrickcuba · 21 June 2022 14:16

Of course the choice is yours, follow Hultgren or Linstedt
There’s a lot more different between the two than just using hash keys!

FrenkLoonen · 22 June 2022 07:45

Sweet, looking forward to the book then! It makes me wonder, what the dbtvault team has used for inspiration (or in a more literal sense, where they found the SQL or to what degree they came up with the SQL themselves) for their implementation of all patterns.

Really appreciate your swift and elaborate responses

alex.higgs · 22 June 2022 17:54

We implemented as in Patrick’s book, with a few minor adjustments to support dbt’s materialisations

Topic		Replies	Views
Multi Active Satelite or Normaization needed? Data Vault 2.0 multi-active-sat , satellite	5	42	13 April 2025
Modelling Question on DV2.0 Data Vault 2.0	27	1863	14 December 2023
DQ issues: Error Vault? Data Vault 2.0	4	439	7 August 2022
Business Entity with multiple Business Keys Data Vault 2.0 business-key	1	63	19 March 2025
Including additional keys as primary keys in satellite Data Vault 2.0 satellite	3	672	10 March 2024

Multi-Active Satellite or Weak Hub + Link?

Related topics