Hi,
I’m relatively new to DV2.0 and I’m hoping to understand a few things that I’m having trouble finding resources for online. I’ll keep them as separate threads so as not to confuse.
We have two Kafka event streams and three Salesforce tables.
Kafka_customer_event
Kafka_team_member_event
SF_account
SF_user
SF_contact
The SF_contact table contains additional attributes for both accounts and users. The SF_account table holds the key for the Kafka_customer, and likewise the SF_user table holds the key for the Kafka team_member. Between the customer and account, there should be a 1:1, but there are records in Kafka that aren’t in Salesforce, and likewise records in Salesforce that don’t exist in Kafka. Same on the user/team member side.
Any transaction records in Kafka refer to the Kafka id, and any records in SF use the SF id.
We have been forbidden from holding PII within our warehouse - no exceptions - so we cannot use the customer name as the business key.
So how should I go about modelling this in the raw layer?
Is this a customer and team member hub, with satellites each source table and same-as links for the twe hubs to link the Kafka and SF records? This would suggest that the contact satellite would be used in part for the customer hub and in part for the team member hub.
Is it a customer, team member and contact hub with link between customer-contact and team_member-contact?
Is it a hub for all 5 tables?
I’m leaning towards the second option, but keep second guessing myself.