Modelling Question on DV2.0

These are for streaming usecases because they are not idempotent, load to a link-satellite instead — I thought that was clear in the link (pun unintended) I shared.

Hi Patrick,

Yes, thank you. In my replies to yourself and Andreas yesterday I indicated I would be using the dependent child key in the Link Sat. :smiley:

1 Like

Hi Andreas,

Yes, we’ve already started to hit problems where the data does not conform to business process which is then breaking our ELT logic. For relationships which do not provide effectivity from source we have implemented the use of status tracking satellites against the LINK. We’ve ‘simplified’ (I think) these to only indicate whether the relationship is active or not (we don’t need effectivity dates for these particular requirements at the moment). Anything which has a latest status of “I” is active and anything with a latest status of “D” is inactive. These status tracking satellites are mostly enabling us to identify, and exclude, where data errors have occurred.

Does a Record tracking satellite (RTS) work in a similar way but records every time the record is seen and therefore has a larger volume of rows per record than the Status tracking satellite? Do you have any reasons why you would use one over the other?

Hi @patrickcuba,

I was discussing the choice to include the dependent-child key in the LINK SAT rather than the LINK itself with a colleague this morning and the discussion came to how we handle / track the deletion of a dependent-child key occurrence. The LINK HASH KEY in the LINK, and subsequently in the LINK SAT, could have a number of dependent-child key occurrences within it. So a Status Tracking Satellite (STS) on the LINK would not be able to not be able to provide us with sufficient information at the right grain to determine which of the dependent-child key occurrences had been deleted.

I have searched for other posts regarding this issue but these seemed to indicate that a STS on the LINK would not be sufficient and I found another post where you stated that you shouldn’t use a STS against another satellite (the LINK SAT).

So, how do you track the deletion of a dependent child in a LINK SAT?

Thanks in advance for any additional guidance you can provide.

Carl

1 Like

I use rts, but i delta load them. It will basically be the same as status tracking. Most important is to be able to exactly recreate the data that was delivered at any point in time.

Good morning,

STS tracks the insert, update and deletion of the parent hash-key but it is intended for the relationship or business object itself. STS is also an inference satellite table, i.e. you’re calculating whether the parent key has been dropped or not, while it is true that if you were loading the dep-key into the link-hash key you might be inferring the deleting of the relationship + dep-key, the optimal choice would be to get a deletion indicator from the source itself — therefore you would not need to infer anything, it would be fact.

Do not build a DV so you can fit an STS, STS is not a commonly used artefact that in fact also introduces the need for second-level staging — in the years I have done DV I have not seen anyone have the need for an STS.

Good Morning Patrick,

Yes, in some cases we do have a deletion indicator in the form of a status id. However, for other tables we do not have a deletion indicator on the source table itself. Some of these tables are audited though which tracks the changes against these tables and we would be able to determine a row deletion based on the Audit table. But, should we then include this as a derived deletion indicator on the initial source table, and, then bring this into the satellite so we can see that the latest row for the dependent-child key has a deletion indicator? This doesn’t sound right to me but I thought i would ask the question.

If the source doesn’t provide the deletion indicator then are we forced to move the dependent-child key into the link-hash key in order to be able to track a deletion?

With regard to STS I’m afraid we have already implemented an instance of this. Not for the reason of checking off a list of DataVault artefacts, but to help us identify deletions of relationships where the business have incorrectly created a relationship due to user error. Unfortunately, the source system doesn’t provide us with the deletion indication. The issue we had was that the HUB instances, and the related SAT information, of the relationship were still valid and correct but it was the relationship itself that was erroneous and as LINKs are immutable we needed a way to determine whether a LINK was actually still valid / accurate or whether it was an error.

Thanks again taking the time to discuss these points with me.

Carl

1 Like

Hi Andreas,

Ok, great. If they are achieving the same purpose then that works. :slight_smile:
Thanks,

Carl