Two or more dependent children in a LINK?

saqib · 28 February 2022 01:22

Hi all,

We have two dependent children (subclass and class) that do not have any meaning by themselves. We are thinking of adding them to the LINK as follows:

Question: Is this the correct way to hang two satellites from multiple dependent children in a LINK?

Thanks,
Saqib

Nicruzer · 28 February 2022 13:41

@saqib - I am really curious about this one, and I’m not sure I fully understand the situation here.

Questions I have regarding this:

Is it possible that CLASS and SUBCLASS would qualify as reference tables?
Are you able to provide a sample with data?

I’m very interested in people’s responses here…

patrickcuba · 1 March 2022 00:23

Me too! Generally speaking a Reference table does not have hash keys, using dependent child keys is synonymous with degenerate dimensions under Kimball

saqib · 1 March 2022 15:57

Hi @Nicruzer and @patrickcuba ,

The data looks as follows:

ITEM_ID	CLASS	SUBCLASS	DEPT
aaa	14	1	100
bbb	14	1	100
ccc	14	1	100
ddd	16	1	100
eee	16	1	100
fff	16	1	200

DEPT	CLASS	CLASS_DESC	Attr1	Attr2	Attr3
100	14	SEDANS	nnn	nnn	nnn
100	16	WOMEN SHOES	nnn	nnn	nnn
200	16	NOTEBOOKS	nnn	nnn	nnn

DEPT	CLASS	SUBCLASS	SUBCLASS_DESC
100	14	1	VEHICLES
100	16	1	SHOES
200	16	1	COMPUTERS

Note that the CLASS and SUBCLASS only hold meaning when combined with DEPT. This is the reason I was thinking of hanging the two SATs (SAT_ITEM_DEPT_CLASS and SAT_ITEM_DEPT_SUBCLASS) from the LINK (LINK_ITEM_DEPT) table.

Thanks,
Saqib

patrickcuba · 1 March 2022 22:58

They look like REFs.

If you refer to the book I do mention how to treat ref data and codes,

Ref data is provided as autonomous tables & how to apply them to your data
Ref data is provided already resolved from source (no need to apply them)

Can a subclass be joined to other classes?

The way you have modelled it suggests that it does, I am betting it doesn’t and therefore modelling it this way breaks the “Unit of Work”. A subclass belong to a class and no other class (right?).

Again, ref data does not need a hashkey, I think they should be ref-code look ups, whether it is in the link or is in the sat is up to you and the business as you understand the data then I do.

For a relationship can the class+subclass change? How would you then model that?
You could consider a Link-Sat holding those ref-code lookups instead. That way the ref-codes could change for a relationship and the model naturally handles that change… because thats what a sat is built for.

saqib · 2 March 2022 05:20

Hi @patrickcuba . Is that section 6.4 in the book?

Thanks,
Saqib

saqib · 3 March 2022 22:56

Hi @patrickcuba , if we define the SUBCLASS and CLASS as reference tables, how will we keep a audit history of the changes to attributes on those tables?

patrickcuba · 3 March 2022 23:26

Ref tables can be historised too, doesn’t mean it must be a satellite table

patrickcuba · 3 March 2022 23:31

ah and I’m not sure in which chapter/ section i’ve got info on ref tables!

spachunuri · 3 March 2022 23:35

Yes. It is from Patrick’s book section 6.4. We have a bunch of reference tables. Per Patrick’s book, we defined a reference table as whatever falls under the lookup category. In our case, we track the data from the source system more of change data capture. So I have all the history if I need to but for consumption on Infomart, we just have the latest snapshot reference table. Per Patrick’s book, you can do it as a Hub reference and Hub reference satellite and join it for reporting but for us, reference tables did not categorize as a Hub per business modelling so ended up being reference tables mostly as I said above as lookups.

saqib · 4 March 2022 01:59

Thanks @spachunuri and @patrickcuba . CDC is one way, but if we don’t want to depend on the CDC, what is the best approach?

What do the Hub Reference and Hub Reference Satellites look like?

Thanks,
Saqib

patrickcuba · 4 March 2022 02:12

yea… you can go with the hub_ref sat_ref approach or simply build sat_refs with natural keys (no hashes) and in that way you limit the number of join tables

hub_refs would be in a sense centralise ref-codes from multiple sources and perhaps build a BV to consolidate it… it’s an approach but you’ll need to balance it with your requirements. For the simplest approach (KISS), refs with change tracking should be fine!

spachunuri · 4 March 2022 13:46

Can i ask how you are getting data from the source system now? and which data warehouse you are using?

saqib · 4 March 2022 14:56

Hi @spachunuri , we are using Fivetran with Snowflake.

Thanks,
Saqib

cmeyersohn · 12 March 2022 21:06

Saqib, there seems to be a bit of confusion regarding this book or that book. To stay DV2 compliant and to reduce confusion, my recommendation is that you stay with the DV2 standards as published by Dan Linstedt. I’m not familiar with the referenced materials, so I cannot attest to whether they are DV2 compliant or not - unless your are referring to your CDVP2 course materials, “Building a Scalable Data Warehouse with Data Vault 2.0”, or any of Dan Linstedt’s blogs and posts.
As it relates to reference data, keep in mind that the team has to determine where to draw the line between what is reference data and what is operational/transactional data. At first glance, and based upon your description, you could treat the relationship as a link with two dependent child keys and hang the descriptive data (Attr1, Attr2, Attr3, etc.) in the link satellite to capture history. The question is, does the class/subclass description change over time? When looking at this data from a reference data perspective, are the class and subclass identifiers buried across your enterprise operational systems as descriptive codes? If they are, then that should influence how the team chooses to design this data. Hopefully this is helpful for clarification.

saqib · 13 March 2022 22:56

Hi @cmeyersohn ,

Thanks for detailed response. This clears up the confusion I had earlier. Both subclass and class have attributes that can change over time, so we will go ahead and hang it as a SAT from the LINK.

Thanks again,
Saqib

Topic		Replies	Views
Modeling a dependent child Data Vault 2.0 dependent-child , link	11	4055	13 March 2022
Same as Link for Dependent Child Data Vault 2.0 business-key , dependent-child , link , same-as-link , hashkey	3	602	21 May 2022
Handling relationship between parties in Data Vault Data Vault 2.0	7	545	3 January 2024
Modelling Question on DV2.0 Data Vault 2.0	27	1875	14 December 2023
Query optimization for the Link SAT with Dependent Child Data Vault 2.0 dependent-child , link	12	853	7 June 2022

Two or more dependent children in a LINK?

Related topics