Loading a Link Satellite with Driving Key

Hello, forgive me if this topic has already been exhausted (I looked but haven’t found an answer anywhere), but I’m just not understanding something about how to load a link satellite using a driving key.

I’m dealing with a source system in which many key relationships are subject to update. My understanding is that, because link structures are insert only, a link satellite should be implemented that maintains the effectivity dates of the satellite data and key relationships.

The satellite and link implicitly have a subset of keys from the link that “drive” the updates of any remaining keys within the link and this driving key should be used to populate the satellite, maintain the effectivity dates, and track the correct hash key for joining to the link. All of this makes sense to me.

Where my confusion lies is in section 4.5.5 of “Building a Scalable Data Warehouse with Data Vault 2.0” (and other resources) the driving key appears to be excluded from the satellite table and only the link hash key is present. It’s even stated in that section that the driving key is not visible within the model. If the driving key is not stored in the satellite, how am I to identify the correct row for which to update the link hash key when relationships change? I have to have something to match on within my merge statement in order to update the link hash key. So I’m at a loss as to how this is done from a technical perspective.

If anyone has some guidance or even an example of a merge statement that would accomplish this, I’d greatly appreciate it!

your ultimate resource for effectivity satellite and driver keys

Hi Patrick and BMendes,

Thanks for starting this conversation. I have read both Patrick’s book and the article on Medium and I understand the concept behind effectivity satellites and also the staging process.

In order to select the current effectivity, it is necessary to join the effectivity satellite to the link table. This allows to select the hub hash keys which make up the driver keys and to detect changes of the non-driver keys. This is explained in detail on page 655 of Patrick’s book.

This approach works perfectly fine if we assume that the link load has successfully loaded all links into the link table. We might skip changes if the link load has failed and not all links are present in the link table. Has anybody ever come up with an approach that doesn’t require an inner join to the link table? It feels like an unlikely scenario, but everything else in Data Vault is very consistent, whereas here I can’t find a way around this small inconsistency without persisting the current effectivity during the effectivity satellite load in just another table…

Thanks a lot, Henning :slight_smile: