Re-Assembling Split Satellite Tables

Our organization has adopted both the type of and rate of change data splits of our satellites. In doing so, I’m wrestling with how you reconstruct the entire record for the end user to use instead of having to know which split satellite contains the data that they’re looking for?

After profiling my table data coming into staging, I have 4 split tables as follows:

  1. Table name ends with _pii;
  2. Table name ends with _slow;
  3. Table name ends with _intr; and
  4. Table name ends with _fast.

The consistent columns on all 4 tables are:
1) The hub hash key; and
2) The load datetime value.

Within the descriptive data, there are 2 columns that can act as indexed to organize the data coming in. They are the applied date column and a version number. Unfortunately, they both are included in the _fast table. Using the windowing function, I can locate the record for a date in question from the _fast table. However, I don’t have any columns in the other 3 tables where I can identify the row as belonging to the record in the _fast table. I’ve thought of including those 2 additional columns in all remaining tables and ONLY loading them when a row in the table is created with the current values on the record. Is this the correct solution and what would the query look like to accomplish it?

Thanks for your time

Clay

Hi BigGuy,

All satellites should have standard DV-Tags columns:

  • DV_HASH_KEY_{HUB|LINK-TABLENAME}
  • DV_LOAD_DATE
  • DV_APPLIED_DATE (optional but recommended)
  • DV_RECSOURCE

At the centre of these satellites is the hub (of course), bringing the satellites together should be aligned by parent key (hub-hash-key here), applied-date (if applicable) and load date.
That said, there are two paths as far as I know to bring these together,

  • using a query to combine these without a PIT, Roelant Vos has a nice PDF from his website on how to do this, or
  • using logarithmic PIT managed windows

The beauty of the latter is that it is repeatable, and takes away the complexity of combining these structures away from the business user, i.e it is DRY.

For guidance on the latter refer to:

Hope this helps!

Hello Clay,

If you are using dbtvault, it has a macro for creating PIT that makes the whole process super easy. cc: @alex.higgs

Thanks,
Saqib

Thanks for the heads. I’ll check it out.

Clay

It most certainly did. Thank you.

Clay