Re-using business keys across hubs? Or use child dependent key?

evan.phillips · 26 February 2024 09:23

Hi all,

I’m working on a project that models political campaign finance data, and the source system data we are using is messy, to say the least. The business objects we’ve landed on for this specific case are Committee, Donor, Company, and Receipt.

Committee: represents a political action committee (PAC).
Donor: represents a person or other entity that donates to a PAC; each donation is captured via a receipt.
Company: represents the employer of an individual donor, or a business that is donating.

In the source data we have the following columns:

Recipient committee id
Donor name
Donor address
Donor employer
Donor occupation
Donor type (could be Individual, PAC, or Company)
Donor committee id
Receipt date
Receipt amount

The committee hub feels straightforward: we can load it from the two committee id columns (Recipient committee id and Donor committee id). When a committee is the donor, the Donor committee id column is filled in; otherwise it is null.

The business key for the company hub can be found across multiple columns depending on the donor type (Donor employer OR Donor name). When the Donor type is Company, the Donor employer column is null and the company name is in the Donor name column instead. The company names themselves are free text fields, so we master these later in the process.

Additionally, to uniquely identify a donor we must use the Donor name and Donor address columns.

In this case, is it appropriate to create a hard business rule to tease this out in the raw vault based on the Donor type? Or is Donor type somehow a child dependent key?
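For illustration, the routing rule described above could be sketched as follows. This is only a minimal sketch, not a recommendation on where the rule should live. The column names (`donor_type`, `donor_name`, `donor_employer`) and the function `company_key_candidate` are hypothetical, and the whitespace/upper-case clean-up is an assumption, since the thread only says the names are free text that gets mastered later. It also assumes a PAC donor contributes no company key, which the post does not state.

```python
from typing import Optional


def company_key_candidate(row: dict) -> Optional[str]:
    """Pick the column that holds the company name, based on Donor type.

    Hypothetical column names; returns None when no company name is present.
    """
    donor_type = (row.get("donor_type") or "").strip().lower()

    if donor_type == "company":
        # Company donors: Donor employer is null, the name is in Donor name.
        raw = row.get("donor_name")
    elif donor_type == "individual":
        # Individual donors: the employer is the company.
        raw = row.get("donor_employer")
    else:
        # Assumption: PAC donors are identified by Donor committee id only.
        raw = None

    if raw is None:
        return None

    # Assumed light clean-up: collapse whitespace and upper-case.
    cleaned = " ".join(raw.split()).upper()
    return cleaned or None


rows = [
    {"donor_type": "Company", "donor_name": " Acme  Corp ", "donor_employer": None},
    {"donor_type": "Individual", "donor_name": "Jane Doe", "donor_employer": "Acme Corp"},
    {"donor_type": "PAC", "donor_name": "Some PAC", "donor_employer": None},
]
candidates = [company_key_candidate(r) for r in rows]
```

With these sample rows, the first two yield the same candidate key and the PAC row yields none, so both the company donor and the individual's employer would land on the same company hub record.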

patrickcuba · 27 February 2024 06:16

That is a problem already (of course).

To manage the uniqueness of donors you might look at changing something at the source, i.e. have the data content processed in an application that your business users can become familiar with, and introduce a donor id. The donor id should be permanent, so that it always identifies the same donor. This will likely require more than just an address and free-form text, but the business rules that determine the donor id should be consistent (lightweight mastering of business keys). This happens before the data enters the vault.

Alternatively, the donor is not a business object, but rather an attribute in a satellite describing something else.
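As a rough illustration of what a consistent rule for a donor id could look like before the data enters the vault, here is a minimal sketch. It is an assumption, not the mastering process described above: it derives a deterministic id from only the Donor name and Donor address, which the reply notes is likely not enough on its own. The names `normalize` and `donor_id`, the `||` delimiter, and the use of SHA-256 are all hypothetical choices.

```python
import hashlib


def normalize(value):
    """Collapse whitespace and upper-case; treat None as an empty string."""
    return " ".join((value or "").split()).upper()


def donor_id(name, address):
    """Derive a deterministic donor id from name and address.

    Assumption: the same normalized inputs must always give the same id.
    A delimiter keeps ("AB", "C") distinct from ("A", "BC").
    """
    payload = "||".join([normalize(name), normalize(address)])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


id_a = donor_id("Jane  Doe", "1 Main St")
id_b = donor_id("jane doe", " 1 main st ")
id_c = donor_id("Jane Doe", "2 Main St")
```

Here `id_a` and `id_b` are equal because the inputs differ only in case and spacing, while `id_c` differs because the address changed. A donor who moves would therefore get a new id under this rule, which is one reason a permanent id assigned at the source is the stronger option.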
