Skip hubs in joins

Hi All,
I’m reasonably fresh to DV but have now built a PoC DV based on dbtvault (thanks guys) and I’m now learning the joys of extracting data from the vault. The joins are getting quite extensive. I have one query (loosely based on previous non-DV query) that is up to 25 joins.
My question is, do I need to actually join to a hub if I have the key from one link to drive into the next link in the chain. I can’t see why this would be an issue but I seem to remember reading always go hub → link → hub (repeat) but no specifics were provided for why. For my scenario I have many links that contain the hub key needed for the next link and I don’t need anything from the intermediate hub (or any of it’s satellites). Is there any material reason I shouldn’t go hub->link->link->hub?

Regards,

Ashley

It depends.

You can do a Link-Link navigation if all you are interested in is equi-joins and don’t need access to business keys (unless you put the business keys in the link like some practitioners do). You will probably start the navigation at a hub to drive the grain of the query. It is possible that hub records do not have corresponding link records so if you need a complete treatment of hubs you must include hubs in the navigation.

2 Likes

Thanks Neil. I couldn’t see why it would make a difference. I thought there might be some aspect that I wasn’t considering. This will certainly reduce the joins somewhat in the cases where no fields from the hub are projected or filtered.

candidate for a bridge table The Lost Art of Building Bridges. As a data vault grows with your… | by Patrick Cuba | Snowflake | Medium