When you have a sat with a dependent child and no CDC, how do you implement a Record Tracking Satellite (RTS)? Normally the RTS sits on the hub and tracks BKs. However, the BK grain is not sufficient to track a dependent child. If only one of the sat rows are removed the BK is not deleted, but it is still necessary to know which row has been deleted.
E.G. Here’s Monday
On Tuesday one of the rows is deleted from source, so we need to be able to report that
Thing1 | 2 | Monday
has been deleted and is no longer current.
Thoughts on dealing with this? A standard RTS would not be at the needed grain to report on this. Adding a dependent child on the RTS feels odd since it would not be needed for every satellite attached to the hub. A separate RTS satellite for just the satellites with a single dependent key seems plausible if a bit clunky.
Thankfully this is theoretical for me since I haven’t been stuck with this situation yet. I’m looking forward to people’s thoughts on the situation.
RTS is built to track Bkeys if it hangs off a hub and the relationship if it hangs off a Link.
RTS would then be tracking that something happened against that parent hub or link hash key, if you wanted to track the dep-key why not refer back to that satellite?
Is the thought putting a RTS on the satellite itself? That makes more sense from a model perspective than trying to track the satellite through the hub, given the different grain. I’ve never tried to implement sat-sat. Very interesting.
Sat relates to a hub or a link, you’ll never a see sat-sat relationship!
That was my impression. So if that’s not what you were implying, what did you mean? I’m not following when you say
hmmm… maybe have a read what is RTS meant for. You might be trying to do too much in RTS.
As I said before RTS is about tracking the parent key, hub or link — it is not about tracking dep-keys.
You seem to think I want an RTS focused solution. I’m interested in thoughts on handling the described scenario. I misunderstood your response and thought you had a solution in mind. I appreciate you sharing your thoughts on what wouldn’t work, though.
It’s you not me, check your initial post
Yeah I can see that being unclear. I am interested in any approach, not only an RTS one.
Hi @Christopher . Thanks for posting this. I have been thinking about this as well, but did get around posting it. Perhaps community experts can shed some light on this.
We use Fivetran so we get the
_FIVETRAN_DELETED flag. We can use that to handle the deletes in Dependent Child. But probably there is a better way to handle this at the DV level.
This is a very good question. I have a somewhat unorthodox solution to your problem. I was facing the same issue and came to the conclusion that using a Multiactive satellite won’t track the deletes as you mentioned.
Create a peg leged link and insert the DepChild value into the link. This enables you to handle each child in a satellite, “row by row” (no need for a multi active satellite anymore).
After that you create an Effectivity satellite connected to your peg leged link.
All descriptive data for each child + BK can be versioned in a regular delta satellite.
Wheater a bk + dep child is effective or not can be derived from the Effectivity satellite that is connected to the peg leged link.
There will be a lot easier to handle different versions in the satellite, if you dont use a multi active satellite. Data can also be split to either be versioned by the hub or by the hub + dep child (peg leged link).
You can derive the effectivity of the BK from the effectivity satellite connected to the Peg Leged Link.
Please come back to me if this is unclear, It is a little bit messy!
Forgive me, @Christopher , I am a little confused here. I thought dependent child keys were only part of Links. That dependent child key then becomes part of the link’s hash key, making the whole relationship unique (Thing1_hk + Thing2_hk + DepChild = Thing1Thing2_link_hk).
Can you clarify for me when there might be a situation where a Hub would contain a dependent child key? I mean, if there is a dependent child, doesn’t that imply a relationship and thereby a link is required?
I’m not trying to be critical. I sincerely am looking for clarity here.
They can be used in Satellites too mate.
Full list of RV-sats:
- regular satellite table (SAT) - single row state of the parent entity (hub or link, mutually exclusive)
- satellite with the dependent child key (SAT+DEP) - single row state of the parent entity + dep-key (hub or link, mutually exclusive)
- multi-active satellite (MSAT) - single/multi-row SET state of the parent entity (hub or link, mutually exclusive)
- record tracking satellite - single row occurance of the parent entity (hub or link, mutually exclusive)
- status tracking satellite - single row status of the parent entity (hub or link, mutually exclusive) of snapshot source
- effectivity satellite - tracking the driver key relationship over a link
- non-historised satellite - by definition the source is always a change then there is no need to check if the incoming data is new, micro-batch/streaming use case. High velocity data loads.
SAT, SAT+DEP, MSAT can be expressed as BV artefacts too
The dep-key can be modelled in links too.
Peg-legged links is not best practice and not recommended for Data Vault models intended to scale. I don’t even mention it in my book because you’re essentially building something that really just creates tech debt, managing a table for the sake of filling a checkbox of data vault tables you’ve constructed.