Hey all,
Here’s my (Healthcare) scenario: I have a hub entity called H_Payor. Each payor has a number of Benefit Plans. Benefit plans only exist in reference to a specific Payer.
Modeling question : Do we create a hub for Benefit Plans, and link them to the Payor hub, losing the information about the dependency (i.e. H_Payor and H_Benefit_Plan, linked with a Link table. Which doesn’t have any indication of the dependency unless it’s kept in the Link tables Satellite)? or is there some other approach which is preferred?
It depends on whether you consider a Benefit Plan to be a business concept?
If yes, then HUB for Payor, HUB for Benefit Plan, a LINK to show the relationship between the two, and if you want to split your attribute information because of the dependency you can have a HUB_SAT for Benefit plan which holds the information about the Plan that is specific to the plan itself regardless of the relationship, and then a LINK_SAT for the attribute information that is solely based on the relationship between Payor and Benefit Plan.
Alternatively, if Benefit Plan is not a business concept, then the Benefit Plan ID (or whatever your unique identifier is) can be considered a dependent child. From previous posts on this Forum there are a number of considerations regarding this:
Should the dependent child exist in the LINK (Payor to something else), or, should the dependent child exist in the LINK_SAT
If chosen to exist in the LINK, then should the dependent child be included in the LINK_HASHKEY generation?
In my opinion, it depends on your situation and what you can get from Source. What is your grain, does Source provide deletions or do you have to infer this, do you need peripheral tables such as Status Tracking Satellite or Status Effectivity Satellite.
If you search for ‘Dependent Child’ on the Forum you’ll find numerous related previous posts. Below are a couple of discussions which relate to the questions above:
@Carl , I have an extension question. If we have two or more hierarchical dependent children, for example, document → document_line → document_sub_lines. How do we model these.
Will document_line and document_sub_lines be a link tables join each other ?
Note: document_line and sub lines are not business concepts.
I can only give my opinion as I’m not a qualified Data Vault Practitioner and I’m still learning DV myself.
Please see above for considerations around whether to include Dependent Children in the LINK HASH KEY as well as:
Assuming you include the Dependent Children in the LINK_HASH_KEY, would the following work?
HUB_Document
LINK_Document - dependent children of both Document_Line and Document_Sub_Lines included in LINK and the LINK_HASH_KEY
HUB_SAT_Document
LINK_SAT_Document_Line - LINK_HASH_KEY and Load Date make up your Primary Key and you include attributes relating to the Document Line
LINK_SAT_Document_Sub_Lines - LINK_HASH_KEY and Load Date make up your Primary Key and you include attributes relating to the Document Sub Line
Alternatively, if you do not include the Dependent Children in the LINK_HASH_KEY:
HUB_Document
LINK_Document - dependent children of both Document_Line and Document_Sub_Lines included in LINK
HUB_SAT_Document
LINK_SAT_Document_Line - LINK_HASH_KEY, Document Line ID and Load Date make up your Primary Key (partition)
LINK_SAT_Document_Sub_Lines - LINK_HASH_KEY, Document_Line ID, Document_Sub_Line ID, Load Date make up your Primary Key (partition). Document_Line ID is required in here so that you can relate, via the LINK, which Document_Sub_Line IDs belong to which Document_Line ID.
You would need to JOIN from your LINK_SATs to your LINK using the LINK_HASH_KEY and the dependent children.