Re-using business keys across hubs? Or use child dependent key?

Hi all,

I’m working on a project that models political campaign finance data and the source system data we are using is messy to say the least. The business objects we’ve landed on for this specific case are Committee, Donor, Company, and Receipt.

  • Committee: represents a political action committee (PAC).
  • Donor: represents a person or other entity that donates to a PAC which is captured via a receipt.
  • Company: represents the employer of an individual donor, or a business that is donating.

In the source data we have the following columns:

  • Recipient committee id
  • Donor name
  • Donor address
  • Donor employer
  • Donor occupation
  • Donor type (could be Individual, PAC, or Company)
  • Donor committee id
  • Receipt date
  • Receipt amount

The committee hub feels straightforward as we can simply use the committee id columns to create a committee hub. When a committee is the donor, the Donor committee id column is filled in, otherwise it is null.

The business key for the company hub can be found across multiple columns depending on the donor type (Donor employer OR Donor name). When the Donor type is Company, the Donor employer column is null and the company name is in the Donor name column instead. The company names themselves are free text fields, so we master these later in the process.

Additionally, to uniquely identify a donor we must use the Donor name and Donor address columns.

In this case, is it appropriate to create a hard business rule to tease this out in the raw vault based on the Donor type? Or is Donor type somehow a child dependent key?

That is a problem already (of course).

To manage the uniqueness of donors you might be looking at changing something at the source, i.e. have the data content processed in an application that your business users can become familiar with and introduce a donor id. The donor id should become permanent to identify the donor in question. This will likely require more than just an address and free form text but the business rules to determine the donor id should be consistent (lightweight mastering of business keys). This happens before entering the vault.

Alternatively, the donor is not a business object, but rather an attribute in a satellite describing something else.