Storing the start and end date for a link in effectivity satellite or "plain" satellite

I am currently modelling data vault 2.0 tables for one of our sources and I’m a bit stuck when it comes to handle validity dates.
What the source delivers us, is the relationship between persons and certificates. Each person can have multiple certificates (Person A has a Data Vault Practioner and the Data Vault Architect certificate). Each of those certificates does come with a ValidFrom (date the certificate was obtained) and ValidTo date (expiry date) and some additional descriptive columns (content, issuer etc)

The obvious choice is to model it as a link between the person and certificate hub. However, I’m unsure about how to store the validity dates.
Reading the Data Vault Guru by Patrick Cuba it says the following:

Don’t mix with the definition of a source-supplied effectivity (sat_mdm_party_address) with data vault’s own effectivity satellite, their purpose is the same but an effectivity satellite is the result of the execution of business rules staged through the identification of a driving key; rather than the source already supplying the business rule output in raw vault. Effectivity supplied by the source is merely hashed into a record hash (hashdiff) column and loaded like any other satellite load.

Building a scalable data warehouse from Linstedt and Olschimke states the following:
The begin and end dates are not system generated. Instead, they have to be provided from a data source, for example the audit trail of Microsoft Master Data Services, effectivity dates within the master data, change data capture (CDC) or any other audit trail from operational systems. In order to pass an audit of the data warehouse, it must be possible to trace back the dates to the source system. Effectivity satellites are only reasonable if the source system provides the effectivity dates.

I have the feeling both sources are contradicting each other. While one is saying that source provided business effectivity dates are loaded into the link satellite, the other one says if it needs to go into a seperate effectivity satellite. Which approach is more reasonable?

1 Like

My suggestion is store everything that the source provides in “plain” satellites. Use RTS to be jedi compliant and keep track of exactly what was delivered and when.

Is certificate really a business object in your scenario?

Start and end dates from a source are just source supplied attributes, it can get complex if you consider a source with start and end dates updates the end-date of the previous now inactive row. To a change tracking mechanism this may appear as a change in itself. As long as these are not merely metadata-columns added by your ELT/ETL tool then you might consider,

  • If loading to a regular satellite you may integrity issues, two “active” records in a regular satellite is not allowed
  • Declare the start-date as a dependent-child key to load a sat with a dependent-child key
  • If you need aLL records to appear active then you might consider an MSAT

It really depends on what you want to get out of it! But don’t default to an MSAT is that is not what you’re expecting from source!

Dear Patrick, dear AHenning,

thanks for your responses.

We figured out that our certificate is not a business object in our DV.

Therefore we decided to implement a person_certificate_sat which has the person_hashkey, ValidFrom, ValidTo, and certificate (name, description, issuer) columns.

Let’s say it is not multi-active, but we have a dependent-child on the ValidFrom and ValidTo.

We could then have something like that:

Person_HashKey | ValidFrom | ValidTo | CertificateData | LDTS
PersonA 01.01.23 31.05.23 AWS 01.01.24
PersonA 01.06.23 31.12.23 AWS Advanced 01.01.24

Two questions I have here:

  • From my understanding this is not a multi-active satellite as we have the dependent-childs. Is that correct?
  • How would you proceed if the source is deleting row 1 (01.01.23-31.05.23) and you wouldn’t want to show this entry anymore (mark it as deleted)? Do we need any kind of supporting satellite?
1 Like

I find it odd that you ask how to model the relation between person and certificate and at the same time say that certificate is not a business object.