Loading of Hubs and Links

We have a reasonably disciplined shop where consistency exists within various applications when referencing an employee. For example, we have four applications that all use the same key value that identifies the employee. So, the hub load compares the hashkeys, and the record is not loaded when they’re the same. Perfect. Now, I understand that the BKCC code differentiates the same hashkey values for different rows. In the pipeline to load the hub, what is the orchestration flow if a duplicate occurs when the business key value is the same but represents a different employee? In other words, what is the logic used to identify this situation?

Why do you need to “identify” it in the loader?

Bkcc is used to distinguish identical bk values that load to the same hub but represent different business objects. In that case you will see multiple rows in the hub.

You say the same bk value is used across your apps for the same business objects. In that case they don’t need bkccs, or at least they share the same bkcc. This is excellent and serves Passive Integration well. You will see a single row for the same bk value.

As you know, bkcc is used in conjunction with the bk to create the hk.

You will not get duplicate keys in the hub.
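To make the last two points concrete, here is a minimal sketch of a hub load. It is an illustration only, not the method from Dan's book: the MD5 hash, the "||" delimiter, the trim and upper-case treatment, and the source names and bkcc values ("app_a", "default", "other") are all assumptions made for the example.

```python
import hashlib

def hash_key(bkcc: str, bk: str) -> str:
    # Assumed convention: trim and upper-case each part, join with a
    # delimiter, then hash. Your standard may differ.
    payload = "||".join(part.strip().upper() for part in (bkcc, bk))
    return hashlib.md5(payload.encode("utf-8")).hexdigest()

def load_hub(hub: dict, staged_rows: list) -> None:
    # Insert only hash keys the hub has not seen yet, so the hub can
    # never hold two rows with the same hk.
    for bkcc, bk, record_source in staged_rows:
        hk = hash_key(bkcc, bk)
        if hk not in hub:
            hub[hk] = {"bkcc": bkcc, "bk": bk, "record_source": record_source}

hub = {}
load_hub(hub, [
    ("default", "12345", "app_a"),  # same business object ...
    ("default", "12345", "app_b"),  # ... same bkcc, same hk, not reloaded
    ("other", "12345", "app_c"),    # different business object, own bkcc
])
print(len(hub))  # two hub rows for the one bk value
```

The loader never inspects names or other attributes. The only thing that separates the two rows is the bkcc that arrives with the staged record.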

The image below is from Dan’s book

The spreadsheet below contains the example of the loads for the employee hub.

The processing for John Doe resulted in just one record being added to the hub. However, the activity “Retrieve Distinct List of Business Keys from Stage” will produce the same HashKey based on the calculation even though it’s a different employee. Question – what is the logic used to identify Frank Smith as a different employee even though his employee number and HashKey are identical to John Doe’s? The BKCC should contain a value to reflect that it’s not the same employee, resulting in a different HashKey and thus in him being loaded into the hub.


Of course!

Bkcc makes up the hashkey together with the bk.

If they are different business objects with the same bk, then you must use a different bkcc.

What am I missing?

I guess it comes down to what you’re referring to as a “business object”. In my example, my interpretation is that I have business objects from five different applications: four that are the same employee and one that’s a different employee. Where am I wrong?


The different bk should not be ignored.
Give it the non-default bkcc.

This allows for equi-joins between a hub and its sats and links, and you will return only the relevant results for that business object.
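A small sketch of that equi-join, using the John Doe and Frank Smith example from above. The hashing convention (MD5 over upper-cased parts joined by "||"), the bkcc values "default" and "app5", and the row layouts are assumptions for illustration, not a prescribed design.

```python
import hashlib

def hash_key(bkcc: str, bk: str) -> str:
    # Assumed convention: upper-case, delimiter-joined, MD5.
    return hashlib.md5(f"{bkcc.upper()}||{bk.upper()}".encode("utf-8")).hexdigest()

john_hk = hash_key("default", "12345")
frank_hk = hash_key("app5", "12345")  # same bk, different bkcc

hub_employee = [
    {"hk": john_hk, "bkcc": "default", "employee_number": "12345"},
    {"hk": frank_hk, "bkcc": "app5", "employee_number": "12345"},
]
sat_employee = [
    {"hk": john_hk, "name": "John Doe"},
    {"hk": frank_hk, "name": "Frank Smith"},
]

def join_on_hk(hub_rows: list, sat_rows: list) -> list:
    # Plain equi-join on the hash key, with no extra predicate on bkcc.
    sat_by_hk = {}
    for row in sat_rows:
        sat_by_hk.setdefault(row["hk"], []).append(row)
    return [{**h, **s} for h in hub_rows for s in sat_by_hk.get(h["hk"], [])]

joined = join_on_hk(hub_employee, sat_employee)
for row in joined:
    print(row["bkcc"], row["employee_number"], row["name"])
```

Because the bkcc is already baked into the hk, joining on hk alone keeps each employee's satellite rows with the right hub row even though both share employee number 12345.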

Hey Patrick

The good news is that we’re both on the same page here. We both agree that the correct approach is to replace the default for the bkcc with something that would result in a different hashkey for Frank Smith. Where we’re having a disconnect is the logic in the automated processing to distinguish the different employees. From my previous example, you can determine by referencing the employee’s name that it’s not the same as the current 12345 hashkey in the hub. But that only fixes this one scenario. If at all possible, how would you code this to have the bkcc determined at runtime? Perhaps a soft rule for those duplicate hashkey values to determine if in fact it is a duplicate before ignoring the record. If not a duplicate, then add a value to the bkcc and recalculate the hashkey eventually resulting in the Frank Smith record being added to the hub.

Bkcc is not determined at runtime.
As I have said, it is included in the generation of the hashkey, and therefore it is impossible to join to the wrong record in the link or sat.
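One way to picture "not determined at runtime" is a fixed mapping from source system to bkcc that is agreed when the source is onboarded and that staging simply looks up. The sketch below assumes that approach. The mapping name BKCC_BY_SOURCE, the function stage_row, the source names "app1" to "app5", and the bkcc values are hypothetical.

```python
# Hypothetical design-time configuration: one collision code per source.
# Four apps share the same employee numbering, the fifth does not.
BKCC_BY_SOURCE = {
    "app1": "default",
    "app2": "default",
    "app3": "default",
    "app4": "default",
    "app5": "app5",
}

def stage_row(record_source: str, employee_number: str) -> tuple:
    # The bkcc is looked up from configuration, never derived from the data.
    # An unmapped source raises KeyError instead of silently defaulting.
    bkcc = BKCC_BY_SOURCE[record_source]
    return (bkcc, employee_number.strip())

print(stage_row("app1", "12345"))
print(stage_row("app5", "12345"))
```

With this in place the hub loader needs no rule to compare names or other attributes. A record from the fifth application carries its own bkcc before the hashkey is ever calculated.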

Thank you, Patrick. :+1: