We have a business process where the the Manufacturer is a Dependent Child i.e. it only holds meaning when tied to a Dept. We are planning to add that Manufacturer ID to the LINK_ table as a dependent key as following:
Is this a good approach to model this process?
- links are stateless, as that id there breaks link.
- you need hub for that entity and add new hk from new hub into link
- depending if system in production, data quantity, etc… when a model change (e g. one new hub key on the link) that changes the grain of the link, and as always in data vault we dont drop stuff, delete stuff or update stuff… with these 3 ideas in mind its not too far to think that you need a new link table with same columns + new hub column, then a business view that union all both link tables assigning some zero-key (avoid nulls) on all records from old video.
I wonder what more experts in here think on this ?
Hi @emanueol . Thanks for the response. Since Manufacture is a not a Business concept for us and only holds meaning when tied to a Dept. As such, we can not create a HUB for it.
In that case its part of the link context, so added it to link satellite, if currently theres none, just do the following:
And finally because link can change or die, you can also add effectivity columns:
- add column EFF_BEGIN date, datetime or timestamp
- add column EFF_END date, datetime or timestamp
In case you need to change the MFG_NO_INT you just terminate in the satellite SAT_DEPT_MFG row:
- update EFF_END to current/some date, datetime or timestamp (from null or magic date).
- insert new row on SAT_DEPT_MFG with new value of MFG_NO and of course the EFF_BEGIN with current/some date, datetime or timestamp.
Hope this helps, makes sense and that more experienced DV like Cuba, Kent, Veronika etc agree with my assessment about your business case… A link its just a link, columns of HUBs, all context is on LINK satellite.
A) a dep-key in the link denotes that the dep-key is a part of the relationship forever, like a code or degenerate dimension; this relationship (dept + vendor + brand) will never have the same dep-key between them. Likely the offshoot to describe that dep-key is a ref table.
B) a dep-key in the link does not participate in the generation of the link hash key
C) if the attributes associated with the dep-key change then would this be better to include that dep-key in a link-satellite as a dep-key to that parent link-hash-key? Advantage of this approach is that you reduce the number of joins needed to get the data out and the resolution to which applicable dep-key attributes are appropriate to the recorded relationship at that point in time is resolved. Less complex SQL to get that data out. This of course means the data from source is already embedded with that ref-content! If it is seperate then it should be loaded separately and only resolved upon query time.
Am reading the diagram right? You have generated an Effectivity Satellite (SATE)?
Where did BEGIN and END dates under SATE come from?
- if it is from source then the satellite is a normal Link-Sat, you may choose to split the satellite between the effectivity columns and the other attributes if you like but that just means you’ll have to resolve them at query time should the marts/reports need them (adds complexity that might not be necessary)
- if you have derived those effectivity columns then yes this should be an effectivity satellite where we only care about tracking that movement, all other attributes describing that relationship goes to its own link-satellite
Try to represent the Business Process / Value Stream Stage in your model, but also keep in mind not to make the interpretation of the model too complex that you have to do mental gymnastics to try and get it out!
Effectivity Satellites constructed by deriving the relationship against a driving key is very complex; use it if you really need it, I’d rate that as one of the most complex builds and queries in a Data Vault model. Definite examples in the book!
HI @patrickcuba . Thanks for the detailed response. This is very helpful.
A) Yes, in our case the MFG is part of the DEPT_MFG relationship for all data records. We are thinking of putting the descriptive attributes for the MFG in the SATL. Is this is a reasonable approach?
B) Got it. Thanks. This is one area I was not sure of.
C) Sounds good. We will include the MFG_NO (Dependent Key) in the SATL
D) We will have to derive those effectivity columns. This is definitely complex. What we are trying to capture is when a relation between a DEPT and a MFG is removed from the system. Is there a better approach to capture this?
D) 1st prize, get the source to send that to you as a flag,
2nd prize, EFS = identify Driver Key, code in book
Alternate Status Tracking Satellite off a Link
Hi @patrickcuba ,
My colleague just corrected me, we do get the deleted flag. So we don’t need a SATE or SATS.
Thanks again for all the guidance!
Hi @patrickcuba . Just to confirm. Does the dep-key participate in the generation of the link HK?
Saqib, you are correct. This appears to be a dependent child and is used to maintain the unit of work of the transaction represented in the link. You are correct in modeling it as such. Just make sure that it is included in your link hash string.
The caveat here is that manufacturer is not a business concept, that it meets the litmus test of a dependent child key (manufacturer id) (i.e., cannot stand alone and have meaning without the context of the relationship). If this is truly the situation, then it is a dependent child and should never be modeled as a hub, and as a DV2 standard should be included in the link object, not the satellite. We talked about this in class. Moving the dependent child to the satellite changes the grain of the relationship represented by the link. The standard for DV2 is to keep the unit of work together (the grain).
Also, with the Effectivity Satellite, don’t overload what the satellite represents by mixing meaning. The effectivity sat should be tracking the status of the relationship of the key set represented in the link. The team needs to be very clear about what a deleted flag actually means in the context of the Effectivity Satellite if you make the design decision to add it to the structure.
Sorry for the late response … I’ve been heads down in CDVP2 training for the last two weeks.
As your instructor, check your email for a little gift to help with all of this.
Hi @cmeyersohn ,
Thank you for the clarity. This is very helpful. I really appreciate you taking time to respond.
Yes the Manufacturer can not stand alone without the dept.
So just the confirm, the LINK_HK would be constructed as following:
DEPT_BK + VENDOR_BK + BRAND+BK + MFG_NO
Hey, Saqib. It looks like you’ve got the structure of the Link hash string correct based on the business keys and dependent child key that you’ve outlined. Don’t forget the delimiters and all the other pieces of the hash string construct formatting pattern (I know you’ve got this!). I’ll forward the email that I sent to you at your Cohesity email. I’ve got a “gift” coming for you …