Split or consolidate hubs?

I have four source systems describing target individuals who work for target organisations. The context is selling or promoting healthcare products and services to private and public healthcare services.

Source 1 entities:
-Person
-Organisation
-Address
-Job Role

Source 2 entities:
-Person
-Employment
-Organisation

Source 3 entities:
-Person
-Employment
-Address

Source 4 entities:
-Target Customer (holds all employee, job role, and organisation attributes)

Many of these sources are missing the healthcare business key for organisations and healthcare professionals, and in some cases, such as retail pharmacies, there simply isn’t one. I’m going with the surrogate key, with a BKCC (business key collision code) should there be any collision. I also don’t want to construct a business key from multiple attributes that we want to track in our satellites, as these can change over time.
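To make the collision-code idea concrete, here is a minimal sketch of how a hub hash key might be derived from a source key plus a BKCC. The function name, the "||" delimiter, the upper-casing rule, and the choice of SHA-256 are all assumptions for illustration, not something the sources or any particular Data Vault tool prescribe.

```python
import hashlib


def hub_hash_key(business_key: str, bkcc: str = "default", delimiter: str = "||") -> str:
    """Hash a normalised key together with its collision code (hypothetical helper)."""
    # Assumed normalisation: trim whitespace and upper-case both parts.
    normalised_key = business_key.strip().upper()
    normalised_bkcc = bkcc.strip().upper()
    payload = f"{normalised_bkcc}{delimiter}{normalised_key}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# The same key value arriving from two sources that identify different things:
key_src1 = hub_hash_key("1001", bkcc="src1")
key_src4 = hub_hash_key("1001", bkcc="src4")
print(key_src1 != key_src4)
```

With different collision codes the two rows land on different hub keys. If two sources are known to share a key space, giving them the same BKCC would let them resolve to the same hub row.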

Anyway, the question is how many hubs (and links) I should go with. Do I have target individual, target organisation (customer), job role, and address hubs, or is it best to consolidate the latter two and simply have the target individual and organisation hubs with a works-for link? There are multiple addresses and job roles in the mix.

Thanks.

Target hub tables that represent the business architecture’s business capabilities. That is the Data Vault way.

Okay. So, one for the target individual and another for the target organisation? If the organisation has multiple addresses, for example a retail pharmacy, would multi effective sats be appropriate? The same question applies to healthcare professionals who carry out multiple job roles. Thanks.

What is that? I’m not familiar with this term.

Sorry, multi-active satellites.

Ask your business what the business entities are.

Address is not a business entity; it might be a candidate for an msat… maybe.

The business would like to separate the target contact from the target customer. However, this comes from a concern that they might not be able to roll up to the customer level for reporting.

To me, the process/domain suggests that you cannot call or interact with a customer organisation without a target contact and a sales rep being involved. So it feels like there should be one hub for the target contact, with the customer organisation they work for held as attributes of the target contact, although I could be wrong.

hub-contact, maybe… Telcos need this as an entity because it represents their business model. Simply creating a hub-address might actually give you incorrect query results, especially if you consider the multiple address types a customer might have: postal, physical, business, etc.
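As a rough illustration of the address-type point, the sketch below models addresses as a multi-active satellite on the organisation hub, using Python's built-in sqlite3. The table name, column names, and sample rows are invented for the example, and treating address_type as the sub-sequence key is only one of several ways a multi-active satellite can be keyed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE msat_organisation_address (
    organisation_hk TEXT NOT NULL,
    address_type    TEXT NOT NULL,  -- assumed sub-sequence key: postal, physical, business
    load_date       TEXT NOT NULL,
    record_source   TEXT NOT NULL,
    address_line    TEXT NOT NULL,
    PRIMARY KEY (organisation_hk, address_type, load_date)
);
""")

# Hypothetical rows: one organisation, two address types, one later change.
rows = [
    ("org1", "postal", "2024-01-01", "source_1", "PO Box 12"),
    ("org1", "physical", "2024-01-01", "source_1", "1 High Street"),
    ("org1", "physical", "2024-06-01", "source_1", "5 Market Square"),
]
conn.executemany("INSERT INTO msat_organisation_address VALUES (?, ?, ?, ?, ?)", rows)

# Latest address per type for the organisation.
current = conn.execute("""
    SELECT s.address_type, s.address_line
    FROM msat_organisation_address AS s
    WHERE s.organisation_hk = 'org1'
      AND s.load_date = (
          SELECT MAX(load_date)
          FROM msat_organisation_address
          WHERE organisation_hk = s.organisation_hk
            AND address_type = s.address_type
      )
    ORDER BY s.address_type
""").fetchall()
print(current)
```

Because the address type is part of the key, a query can ask for "the current physical address" without accidentally mixing it with the postal one, which is the kind of incorrect result a bare hub-address risks.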

I agree the address should be an attribute of the hub. The thing is, our customer data is so messy across the different data sources that, at the minute, it is a question of whether to have separate hubs for the contact (employee) and customer (organisation), or to amalgamate them into one hub for the contact, with the attributes of the organisation that the person works for part of that same hub.

Some data sources will have separate contact and customer entities, whereas at the other end of the spectrum there might be target list data with a lot of redundancy, no reliable business or surrogate keys, and column names that don’t make clear which concept/entity they belong to.
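For comparison, the separate-hubs option could look something like the sketch below: one hub per entity plus a works-for link, again using sqlite3. All table names, column names, and keys here are made up for illustration. The final query shows that contacts can still be rolled up to the organisation level through the link.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hub_individual (
    individual_hk TEXT PRIMARY KEY,
    individual_bk TEXT NOT NULL,
    load_date     TEXT NOT NULL,
    record_source TEXT NOT NULL
);
CREATE TABLE hub_organisation (
    organisation_hk TEXT PRIMARY KEY,
    organisation_bk TEXT NOT NULL,
    load_date       TEXT NOT NULL,
    record_source   TEXT NOT NULL
);
CREATE TABLE link_works_for (
    works_for_hk    TEXT PRIMARY KEY,
    individual_hk   TEXT NOT NULL REFERENCES hub_individual (individual_hk),
    organisation_hk TEXT NOT NULL REFERENCES hub_organisation (organisation_hk),
    load_date       TEXT NOT NULL,
    record_source   TEXT NOT NULL
);
""")

# Hypothetical sample data: two contacts working for the same pharmacy.
conn.executemany(
    "INSERT INTO hub_individual VALUES (?, ?, ?, ?)",
    [("ind1", "P-001", "2024-01-01", "source_2"),
     ("ind2", "P-002", "2024-01-01", "source_3")],
)
conn.execute(
    "INSERT INTO hub_organisation VALUES (?, ?, ?, ?)",
    ("org1", "PHARM-01", "2024-01-01", "source_2"),
)
conn.executemany(
    "INSERT INTO link_works_for VALUES (?, ?, ?, ?, ?)",
    [("lnk1", "ind1", "org1", "2024-01-01", "source_2"),
     ("lnk2", "ind2", "org1", "2024-01-01", "source_3")],
)

# Roll contacts up to the organisation level via the link.
rollup = conn.execute("""
    SELECT o.organisation_bk, COUNT(DISTINCT l.individual_hk) AS contacts
    FROM link_works_for AS l
    JOIN hub_organisation AS o ON o.organisation_hk = l.organisation_hk
    GROUP BY o.organisation_bk
    ORDER BY o.organisation_bk
""").fetchall()
print(rollup)
```

A flat source such as the target list would then feed both hubs and the link from the same row, rather than forcing the model into a single hub.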

Your DV should be a top-down model of the business ontology… I’ll give two quotes from my blog that may be relevant to your situation:

"1. Data Vault is not the repository of Technical Debt."

and

"The Data Vault Model should not be dictated by Technical Limitations"

Blog post: Rules for an (almost) unbreakable Data Vault | by Patrick Cuba | The Modern Scientist | Medium