Creating information mart table from a Link-Sat with history

Hi All,
Hope you are well!

I have a question on how to create a model in the information mart corresponding to a specific Link-Sat combination. Creating a fact table might be an obvious answer, but the satellite hanging off the link stores history, hence want to make sure that this the correct approach.
Explaining through an example

If you look at the first snippet
Link_Sat_1.JPG

There is a Hub and Sat Product which consists of the base attributes of the product
There is a Hub and Sat Banner with the details of the retail banners under which these products are sold
Then there are banner specific attributes of a product like the website specific description, website URL etc, that I have put in a satellite hanging off a link between product and banner.
Now these attributes are non-additive and can change over time as shown in the example for max purchase quantity.

Now if you look at the second screenshot for the information mart layer

Link_Sat_2.JPG

It shows an illustration of Hub & Sat Product being converted into dim_product
Hub & Sat Banner being converted into dim_banner.
The link between product and banner and the corresponding satellite converted into a factless fact table with product_key and banner_key

My concerns are the following:-
This fact table will need to store records as a SCD type 2 to reflect the changes captured in the satellite.
This fact table will contain all attributes from the satellite making it quite wide

Do you think this is the right approach?
I don’t want to manufacture a dimension which is a combination of product and banner.

Please let me know what you think.

Link-Sat for your facts ← oh yes!
Aggregate the metrics and bring in the identity columns to create a star schema, even better

Thanks @patrickcuba . Appreciate your prompt response.

Just to clarify, in the given example, are we saying it’s ok to have a wide SCD type 2 fact table?

nope — no where in the article does it state the fact is a an scd type 2 table; please review

That was my whole question. Can you please look at the example that I had shared originally and kindly suggest what should be best way to do it.
It’s not a regular scenario.

i did … the satellite in a DV can be used to support an SCD AND fact table, remeber sats don’t have end-dates

Apologies, maybe I am unable to explain my question.

The satellite hanging off my link needs to be converted into a SCD type 2.
The grain of this satellite is a combination of keys coming from 3 different hubs which forms the link.
These attributes can’t be aggregated, like I have shown in the example.

How do I convert this Link + Sat into a fact and dimension?
There is no natural key for this dimension as it is a combination of 3 concepts coming together.

Note: These 3 concepts will have their own separate dimension tables containing their base attributes.
These attributes are over and above that.

Maybe the problem is how I have done the DV layer itself?

Hi Blitz,

From what i can understand, I think you are trying to create a Factless Fact table? Or, is your Fact table tracking the Sales of Products?

I’ll propose an option based on my understanding but perfectly happy for those more in the know to correct me if I’m barking up the wrong tree.

If you are trying to create a Factless Fact table then I think you need to determine a grain at which you want to be able to track the ‘Facts’. For example, do you want to know what the Product, Banner and Product_Banner attribute combinations are for every day, or, at the end of a month?

I think you need to determine your grain and then utilise the approach of a creating a PIT table to assist in the creation of your Fact table. Keep the attribute information in the Satellites / Dimensions (all 3 of them). Your Fact table should then include a Date column (to denote your grain) and then Dimension Key columns for each date (row), which link back to your Dimensions to retrieve the correct attribute information for that particular date.

Your Dimensions are created from your Satellites. You will need to create a DimensionKey to denote row uniqueness. You cannot use the HUB Key as this will not denote row uniqueness.
You cannot use your DiffHashKey as this could also not be unique if attributes revert back to a historical combination of attributes. You may have a sequence number column you can use. Or, in my case I created another HashKey value by combining the Hub_Key and the LoadDate and then utilised this as my DimKey.

Your Fact table is now no longer an SCD2 as the attributes exist in the Dimensions (which are created from your Satellites). But, you need to determine a grain that meets your need for the Factless Fact table.

Alternatively, if your Fact table is tracking the sale of Products then your Grain is the date of sale but you still need to include DimensionKeys for each of the 3 Dimensions to return the correct attribute information for the date of sale.

I hope I have helped in someway, even if to generate a clearer discussion :slight_smile:

1 Like

Thanks Carl, yes I am trying to create a factless fact table, however the I can’t move the attributes into individual dimensions, since these attributes exist against a combination of Product and Banner.

Like “maximum quantity” you can purchase is not just a banner or product attribute.
Ex.
You can purchase 6 bottles of Cat amongst pigeons from Liquorland
You can purchase 8 bottles of Cat amongst pigeons from Vintage Cellars
You can purchase 3 bottles of Jacob’s Creek from Vintage Cellars

As you can see this value differs for the combination of Product and Banner
Also this maximum quantity can change over time and hence the need of storing SCD2 data,

So to answer your question I want the product-banner attributes to be tracked whenever they get changed, similar to tracking changes to only product attributes or only banner attributes.

Product-Banner in itself maybe qualified to become a ‘manufactured’ dimension, if not a factless fact, but I am not sure if that’s the right thing to do - hence the original question.

Hope this clarifies my predicament.

Hi Blitz,

It looks like your sat_product_banner is a Link Sat off of link_product_banner. So, can you not treat sat_product_banner as any other Sat and track your attribute changes over time? Your ‘Product Banner HK’, from the link, acts as a sort of HUB_Key but then you create a Hash Diff Key to track changes in your SAT. If needed as a Dimension later you can just use your sat_product_banner SAT as a source for your dimension and create a DimKey by either creating a hashkey of ‘Product Banner HK’ and ‘LoadDate’ or by using a sequence number if you have one.

Does that sound about right?

Carl

Hey Carl,
“It looks like your sat_product_banner is a Link Sat off of link_product_banner”

Correct, if you look at my original post, that’s exactly what I have shown in the diagram :slight_smile:

My problem isn’t tracking the changes in DV layer, that’s very straightforward.
My question always was regarding converting this structure into either a factless fact or a dimension table, and both of them don’t seem to be an ideal approach

  • With a dimension approach - I am creating a new business concept of Banner-Product which doesn’t really exist in real life, even though there are attributes against it.
  • With the factless fact approach - I need to create a really wide fact table with SCD2 type data,

My question is which is the better approach of the two or is there a better way of handling such a scenario.
Unfortunately this type of scenario is not covered in Dan’s book or the links that Patrick shared.

Hi Blitz,

Ok, I get what you’re asking. My gut is leaning me towards thinking that the Dimension approach would be the better option, however, like you, i’d be interested to hear what other people think.

As a side, could i ask which tool you use to show the ERD like image showing the relationships and DV Objects?

Thanks,

Carl

Thanks Carl, yes I am leaning towards creating a dimension as well.

The tool is called Lucid charts. It’s not that great tbh, a lot of manual effort required :sweat_smile:

1 Like

Dimension seems to be the best choice. When I generate marts from DV I create dimensions from satellites and facts from non historized links. That makes is easy and straight forward.