The more you add as Dep-Keys the slower it will be!
Judging by your code you’re looking for a single record for hk + all the dep-keys… should just the hk suffice? I mean you can make it two step, get the latest record by hk, then get the dep-records you want? no?
If this is a Link Sat, then you are storing the dependent keys in the link itself, correct? From there, you are concatenating and hashing all the keys (including dependents) as the link_hk. Thus, you are storing the link_hk + ldts in the satellite, correct? This would eliminate the need for using the row_number function, since you’ll have the composite hash and load date together in the Link Sat.
Am I way off base in my understanding of the situation?
Hello Saqib!
Without knowing on what kind of environment you are running on my suggestion might not be faster. But try to extract a subset in a inner query and then join back to the table.
ex of inner query;
select max(loaddate), hashkey, prime_dep_child_no, sub_dep_child_no, base_dep_child_no
group by hashkey, prime_dep_child_no, sub_dep_child_no, base_dep_child_no
full query:
select x,y,z
from satellite s
inner join (
select max(loaddate) as ld, hashkey, prime_dep_child_no, sub_dep_child_no, base_dep_child_no
group by hashkey, prime_dep_child_no, sub_dep_child_no, base_dep_child_no
) lastrow on lastrow.ld = s.load_date and lastrow.hashkey = s.hashkey and lastrow.prime_dep_child_no = s.prime_dep_child_no and lastrow.sub_dep_child_no = s.sub_dep_child_no
Without the use of a window function you are able to reduce the risk table spooling (MS SQL Server).
Another note is that if the childfields would reside in a Link then the complexity of extracting data would be less complex (and faster).
Another note2: It is strange (my opinion) that you choose to have the same grain for order and orderlines or is it something that I missed here?
Thanks @Nicruzer and @AHenning . We have the dependent child(ren) as part of the LINKHASHKEY. So I should probably be using the LINKHASHKEY to partition by, rather than one of the Business HK participating in the LINK. Let me try that out.
dep-keys are degenerate dimensions that better qualify a relationship.
If you include them in the link-hash-key generation they no longer represent the participating business entities and the link-sat itself will be defined at a deeper grain that the relationship depicted in the link.
You will effectively create a link record for every combination of participating hub-hash-keys + dep-keys and the link ceases to be a unique list of relationships
it’s part of the reason why you would choose to either have the Dep-Key in the Link vs in the Sat…
in the link — these attributes are applicable to the relationship forever and never change
in the sat – these attributes change and we track those in the sat+dep-key design
That way, in choosing one or the other with the desired purpose, your model will scale