How to Integrate Same-As-Links into PITs

We are currently integrating 2 source systems for employee information where the second source system contains many of the employees from the original source system. The business key in the 2nd system prepends the “employeeId” with a 3-character source identifier. For example, my ID in the first system is “000123456” and “GSS-000123456” in the second system (numbers have been obfuscated to protect the innocent…though one might argue my innocence…but that’s a different matter entirely). The second system also contains employees from external entities (“non-integrated entities” or NIE). Thus, you might see another employee ID as “ABC-7890” in source 2.

We have chosen to create a same-as-link for relating sources 1 and 2 with matching employees, such as my example from above. We are also separating out the 2 source systems into their own satellites, according to DV2 best practices, hanging off the common hub (hub.Identity - the common name for “employee” in our system).

We’d like to create a PIT for the Identity hub and its two satellites but are a little perplexed as to how we might be able to include the same-as-link to join the records together to avoid having 2 records each in the PIT for the same employee.

One thought we had was to first build a Bridge with the hub and same-as-link. Then we would build the PIT on top of that Bridge. Although, that idea seems like it would deviate from DV2 a bit.

Ideas, please?

I reviewed my certification materials and realized that same-as-links can be included in the creation of the PITs. Has anyone ever done this? Have any sample pseudo-code you can/want to share?

Noel, did you look at the code in the DV-101 Baseline Course? There are code samples for a properly built PIT and Bridge. Remember, how a PIT or Bridge starts its life - on the full key sets. You can also build a PIT/Bridge hybrid. In the Same-As link, are you mastering the system 1 employees to the system 2 employees - or vice versa? Which is acting as the master key? How are you handling employees in system 1 that don’t exist in system 2; or that exist in system 2 but don’t exist in system 1? Using the Same-As link as Master Data would allow you to build a PIT focused on the Same-As link and join to the associated satellites, flattening out the key set - if that’s what you’re trying to do. You can build a PIT/Bridge hybrid to capture all of the keys sets - link, hubs, sats - in one flattened structure. Hope this helps.

1 Like

@cmeyersohn - Yes, this helps significantly.

At this point, we are using system 1 as the master; although, both systems actually receive their integrated employee data from another source system, which is the true master. Thus, only a full outer join would properly capture the data either system has because of the possibility that some data may exist in one and not the other and vice versa.

I will review the 101 course. Forgot to look there.

Thank you for the confirmation!

No problem, Noel. Happy Labor Day!

Hi Team,

Where can I get more information for these scenarios? I have a challenge now that entails exactly this type of scenario where I have multiple systems providing data for employees and with different data structures.

Was thinking on what would be the approach to design the Hub and Satellites and how you would build the SAL. I could not find a lot of information about these.

Thanks :slight_smile: