I’ve got a modelling challenge.
I have to model account limits. My source file contains (in its simple form):
AccountNumber, LimitDate, LimitAmount
The PK is AccountNumber+LimitDate. Basically, it’s telling me per AccountNumber, what all the LimitAmounts are per all future LimitDates.
I’m getting full loads. For each AccountNumber, I can get up to 500 different LimitDates.
I’ve already got an Account Hub. If I were to model this as an MSAT on Account, every time a new LimitDate gets added, or when one single LimitAmount changes, the whole set for that AccountNumber gets inserted. That feels like a lot of extra rows that I keep inserting every time. The upside is that it does track whenever a AccountNumber+LimitDate gets deleted from the source (which can happen).
I could create a Hub with a composite BK of AccountNumber+LimitDate, but then I would include a date field which is sort of an anti-pattern. But, whenever one AccountNumber+LimitDate gets added or changed, it will result in only one inserted row in the satellite. And the deleted records I can easily catch in a status tracking satellite.
If I were to create a DepKey Satellite, I would restrict the amount of inserts as well. However, tracking the deletes is much harder. And the pattern itself hasn’t been implemented in dbtvault, as far as I’m aware.
Would love to hear everyone’s thoughts. “Just go ahead and implement the DepKey Sat pattern yourself” could be a valid one
My suggestion is that you create a link on hub account and limit date, L_Accountlimit.
Put an rts connected to L_AccountLimit.
Create a delta satellite for limitamount and connect it also to the link.
Why: jedi safe for full loads with deletions in the sourcesystem. Maybe the solution that stores the least amount of data. Not messing up the account hub.
Thanks for the suggestion! That is an alternative I have pondering over… but then I would end up with a peg-legged link which I rather don’t. It’s not a pattern that our technical implementation currently supports and I prefer to keep it that way.
AHenning like DV anti-patterns, don’t worry MSATs should be fine and should compress well anyway.
Dates in a hub is nonsense and not a hub, id hate to see those modelling DVs that way. These are destined to be chucked into the pile of failed DV implementations.
Deletions could be detected by comparing SETs and if detected could also be persisted into a BV status tracking satellite
Wich compressing algoritm works so well that you can load 499 rows instead of 1 and then not worring about it?
Is this Trivia…………you’ve got me, which one?
Is it trivia? Dont you care about data redundancy?
Which algorithm? I’m stumped.
Data redundancy and data compression are two very different things.
Compression and redundancy are not the same. But please understand that it is more efficient to store as less data as possible.
LOL… still no algorithm hey? When you recommend to people to use dates as a part of or as business keys in hub tables you lose all credibility. Fake Vault.
“Please understand” … I do… I don’t think you do. Which through the forums you have shown time and time again that you don’t understand DV, or what it’s trying to accomplish.
Please read my suggestion again. I did not recommend Frenk to put the date field in the hub. I dont understand why you misunderstand me?
I am open to discuss pros and cons of data vault. That is what a forum is used for. I have for the last 10 years built dv solutions for clients.
The question about the compressing algorithm is directed to you.
I’m facing similar challenge - detecting deletions from MA satellite.
We’ve implemented plenty of MA satellites using various dependent child keys in sats (dates, statuses, variants). When dependent child key is used in sat we save either entries in satellite (we don’t compare groups of data but respective entries) and we don’t use peg legged links.
Though what gets more complex is the detection of deletes and I have not seen a good example of code pattern to handle deletions for MA sats.
Have you blogged about it already? Can you share the pattern of populating it and then using in BV (e.g. building fact / dimension view)?
Maybe, for fun, try a peg link with an rts. Maybe it makes you happy.