Hi everybody!
I need some advice regarding best practices in hub and defining the business key - precisely on how to represent a recycled business key.
My situation:
I have a source table providing the business key (Client Number) and descriptive data for the entity “Client”. I have a second table providing the business key (Account Number), the link to the Client entity (Client Number) and descriptive data for the entity “Account”.
For both tables, at record level, the source provides the change type (A-add, C-change, D-delete). The source is allowing a purge process for the 2 entities and eventually the business key could be reused to represent a different client/account.
In this case the creation date (provided both for the client and account data) could uniquely identify the instance of the entity.
Using a composite business key (Client Number + Creation Date) seem like an option, but I have 2 problems her:
- Using dates as business keys (or in a composite business key) is not recommended;
- Creating the link_account_client is not possible in the RAW DV as the account source table is not providing the client creation date to have the client composite business key;
What is your best solution for this situation?
Will you use the date in the business keys?
Many thanks for any advice!
That’s not a business key — it’s a system surrogate key. If a key can be reloaded and represents something else how will you resolve what descriptive data belongs to what key? You need to raise this as technical debt and find the business key to host in the hub table.
Never use dates as a part of or as a business key.
The source system is an old mainframe system coming with its limitation. The client/account numbers are allocated in the system in ranges par tenant. The recycling process is business controlled, checking that the account has no activity and it’s closed, for the client all the allocated accounts are no active and all are closed. Following the due diligence, the purge process is ordered. After a wile (months, years,…) the same number is reused and could represent a different client/account.
The number is even a smart key as some of the positions in the number represents a business meaning (ex. Account type).
As I was mentioning the source is sending the info regarding creation (A-add) and purge (D-delete). With the record A (Add) on the recycled number we are getting the new descriptive data from the source system.
The recycled number are a small percentage of the data volume.
If you are certain these are business keys you’ll need a way to track its deletion/removal and reinsertion. This will likely be loaded to a satellite with this flag either supplied by source or derived by you. It becomes tech debt you have to manage because every historical query must be cognisant of the change to key context through time, i.e. from this date the historical satellite records are not applicable, the new history builds from this date. The DV2.0 artefact to track this in the absence of the source sending you a flag is the status tracking satellites, status being:
- ‘I’ - insert
- ‘U’ - update and
- ‘D’ - delete denoting what has happened to that key
Yes, actually the source is providing this information and I have already the satellite tracking all the I/U/D.
Thanks for the advice. Greatly appreciated.
1 Like
It would be good to ask how the business differentiates between the two uses of the same key. I have seen similar problems with business keys such as badge number being reused in the police.