I need to model a couple of complex ragged hierarchies. One will have about 45k nodes and the other about 200k nodes.
I am concerned the standard data vault way of using a parent child approach and then using a recursive cte will cause performance issues.
The model outlined by Kimball of using a bridge table (chapter 7 in The Data Warehouse Toolkit; Ragged Variable Depth Hierarchies), scales very well and is elegant. My thinking is that the bridge table would become a link table, with two keys in the same way as outlined in DV2 for dealing with hierarchies. These keys would link to hubs and satalittes for data about the nodes. It would need a three additional columns to store depth from parents, highest parent flag and lowest child flag. These could go into a separate hub satellite combination so not break the model.
I would like some feed back for any flaws in my thinking.
Thank you in advance.