Need help on Building and Maintaining PiTs

DataDude · 27 October 2023 16:53

It’s the PiTs! It’s not often you get to trot that one out

So I’m building dimensions atop my Vault, and I want to build PiTs for each hub that’s going into the Information Layer, but reading the DV 2.0 book has left me with a clear concept but little clarity on actual implementation.

He keeps referencing using the Hkeys and it makes no sense to me (p155). Why wouldn’t you want the Hdiffs from each satellite in the pit instead of a bunch of Hkeys?
How do you actually build and maintain these things in Snowflake? I have a mandate of 2 years of info layer, what is a basic creation template?
What is the recommended maintenance method for rolling off old records?
I had planned to have my info layer be a view of the PIT, but there are some derived attributes that need to be created somewhere. Is it wise to store these in the PiTs? It would make the view simple obviously, but need some advice on that.

Thank you all, this community is a breath of fresh air!

patrickcuba · 27 October 2023 23:25

The entire series and your questions answered here, https://www.snowflake.com/blog/authors/patrick-cuba/

neil.strange · 28 October 2023 19:14

I can’t do better than refer you to Patrick’s articles. A PIT is a helper table that is only produced when you need one - where you have multiple sats off a hub or link and you need to align retrieval of data to a reporting calendar. The PIT is a pointer table - it tells you which records in a set of satellites are the correct ones for a given date. The PIT contains the primary keys of the satellites - the pk is the hash key + the load datetime. You use the PIT to join to the satellite to retrieve the payload columns you want to read. Patrick discusses the SQL to load the PIT and roll it forward. You might consider derived attributes to be Business vault components, derived by business rules and held in the Data Vault for use in reporting.

Topic		Replies	Views
Clarification on what satellites are involved in a PIT table Data Vault 2.0	2	623	19 July 2022
What are PIT & Bridge Tables? Data Vault 2.0	13	10508	15 August 2022
PIT for SAT of multiple Hubs Data Vault 2.0	2	413	21 December 2022
PIT table for Realtime Data Data Vault 2.0	1	202	4 November 2022
Technical Advice on SAL and PIT Related to Same Hub Data Vault 2.0	23	773	19 June 2023

Need help on Building and Maintaining PiTs

Related topics