Hi
I would like to validate a code template of virtualized dimension, that considers status tracking satellites.
Concept_PIT - regular PIT for a Concept, indexes both regular but also status tracking satellite
Concept_SAT_S_source1 - regular satellite from source1, golden source of business keys of a Concept
Concept_SATST_source1 - status tracking of the source1
Concept_SAT_S_source2 - regular satellite from source2, it’s a descriptive information for the concept
Concept_SATST_source2 - status tracking of the source2, tracks whether descriptive information was deleted or not
I wonder whether it’s a good practice to index all satellites both regular and status tracking and use as below
create view Concept_DIM
as select
pit.pit.concept_dim_id DimConceptId -- surrogate key dimension (autoincrement added when populating PIT)
,pit.BusinessKey -- business key of the entity
,ISNULL( CASE WHEN sat1ST.IsActive=1 then sat1.AttrA1 END , -- check whether latest source1 entry is active
CASE WHEN sat2ST.IsActive=1 then sat2.AttrB1 END -- check whether latest source2 entry is active
) Attribute1 -- simple attributes are calculated using coalesce
...
...
, sat1ST.IsActive -- flag which says that bKey was deleted in to source system (calculates only based upon golden source of business keys)
, pit.SnapshotDate -- snapshot date is exposed, so that it can be used in downstream use case by use case
FROM
Concept_PIT pit
JOIN Concept_SAT_S_source1 sat1 -- golden source of the concept keys
ON pit.hk_concept_source1 = sat1.hk_concept and pit.dv_loaddate_source1 = sat1.dv_loaddate
JOIN Concept_SATST_source1 sat1ST -- status tracking satellite for source1
ON pit.hk_concept_source1_ST = sat1ST.hk_concept and pit.dv_loaddate_source1_ST = sat1ST.dv_loaddate
JOIN Concept_SAT_S_source2 sat2 -- source of the descriptive information
ON pit.hk_concept_source2 = sat2.hk_concept and pit.dv_loaddate_source2 = sat2.dv_loaddate
JOIN Concept_SATST_source2 sat2ST -- status tracking satellite for source1
ON pit.hk_concept_source2_ST = sat2ST.hk_concept and pit.dv_loaddate_source2_ST = sat2ST.dv_loaddate
e
or simplify and while loading the pit already enter 0x000 keys in PIT hash keys to regular satellites when given business key was absent at the point in time?
create view Concept_DIM
as select
pit.concept_dim_id DimConceptId -- surrogate key dimension
,pit.BusinessKey -- business key of the entity
,ISNULL(sat1.AttrA1,sat2.AttrB1) Attribute1 -- simple attributes are calculated using coalesce
...
...
, IIF(pit.hk_concept_source1,0x00000,0,1) -- flag which says that bKey was deleted in to source system - check whether there is an active entry (<>0x0)
, pit.SnapshotDate -- snapshot date is exposed, so that it can be used in downstream use case by use case
FROM
Concept_PIT pit
JOIN Concept_SAT_S_source1 sat1 -- golden source of the concept keys
ON pit.hk_concept_source1 = sat1.hk_concept and pit.dv_loaddate_source1 = sat1.dv_loaddate
JOIN Concept_SAT_S_source2 sat2 -- source of the descriptive information
ON pit.hk_concept_source2 = sat2.hk_concept and pit.dv_loaddate_source2 = sat2.dv_loaddate
For the PIT-less implementation would you build code as below or differently?
create view Concept_DIM
as select
hub.Concept_dim_id --??? surrogate key dimension in a hub?
,hub.BusinessKey -- business key of the entity
,ISNULL(
CASE WHEN sat1ST.IsActive=1 then sat1.AttrA1 END, -- check whether latest source1 entry is active
CASE WHEN sat2ST.IsActive=1 then sat2.AttrB1 END -- check whether latest source2 entry is active
) Attribute1 -- simple attributes are calculated using coalesce
...
...
, sat1ST.IsActive -- flag which says that bKey was deleted in to source system
, hub.SnapshotDate -- snapshot date is exposed, so that it can be used in downstream use case by use case
FROM
Concept_hub hub
JOIN Concept_SAT_S_source1_current sat1 -- golden source of the concept keys
ON hub.hk_concept_source1 = sat1.hk_concept
JOIN Concept_SATST_source1_current sat1ST -- status tracking satellite for source1
ON hub.hk_concept_source1_ST = sat1ST.hk_concept
JOIN Concept_SAT_S_source2_current sat2 -- source of the descriptive information
ON hub.hk_concept_source2 = sat2.hk_concept
JOIN Concept_SATST_source2_current sat2ST -- status tracking satellite for source1
ON hub.hk_concept_source2_ST = sat2ST.hk_concept
Thanks for review and comments,
Marcin