New DV project - First steps

Hi together,

we are currently planning to map three ERP systems from different locations in one DV model and to integrate further ERP systems later.
The three ERP systems are based on MS Dynamics NAV 2016 and are structurally identical, as they come from one codebase and are maintained by one development team. Some data is the same across all three locations and ERP systems (e.g. customers), others are independent of each other (e.g. articles). Unfortunately, there is no CDC process in place, so we get the data in full copy mode.

We would like to start with a simple use case: All orders for customers, articles, G/L accounts, storage location.

We plan to use a Business Key Collision Code (BKCC) so that we can achieve passive integration for customers and differentiation for e.g. articles. In addition, we plan to use a Tenant ID, as there may be different tenants in each ERP system. The hash key thus results from BK + BKCC. Source system and Tenant ID would be additional descriptive attributes in HUB, SAT, LNK.

  1. This means that we do not cover all the relationships that exist in these individual source tables for the first use case. How do you deal with such issues in the early stages of development? For example, in the customer table we have references to the tables “Base Calendar”, “Customer Bank Account”, “Customer Posting Group”, etc, which are not currently relevant. Are such attributes transferred to the satellites as descriptive attributes? What if the “Customer Posting Group” becomes relevant later on? Is to be modelled as an independent hub then? Will these fields then be added to the existing link or will a new link be created with these additional fields and the previous link (without these attributes) gets deprecated?

  2. MS Dynamics NAV does not know NULL values. In the case of strings it is an empty string (‘’), in the case of date values it is a dummy value (01.01.1753) and in the case of numeric values it is set to 0. In such cases, would one use a hard rule on the way into the RAW Vault that replaces these values with NULL or is it kept as is?


@jbo My first question is - has your team been through the CDVP2 training? This would be a first priority to ensure everyone has the same understanding what needs to be built and how to build it properly.

Second, from the business’ perspective, are the customers across all three ERP systems supposed to be stored uniquely if it is the exact same customer identified by the exact same business key? What does the business say about this?

Third, if you are using a Mutlitenant ID, it should be included in your hash key computation to ensure that identical customers from different tenants are uniquely identified and segmented by tenant. This is not descriptive data, but becomes part of the composite primary key and part of the business key hash key computation. It’s okay to carry both the BKCC and Multitenant identifiers in the satellite, but that doesn’t work well for the Link.

Fourth, what is the cost of bringing reference data into the customer satellite? Aren’t these descriptive attributes of the business key associated with Customer or perhaps descriptive of the relationship . Customer Posting Group sounds more like reference data than transactional data, but that is a business decision and an IT modeling decision that needs to be made - i.e., what is reference data and what is transactional data? The answers to the other questions in paragraph 1 should help define how you will model this. I question the legitimacy of Base Calendar, Customer Bank Account, or Customer Posting Group as being part of a link. I think you need to understand if they are supposed to be business keys or descriptive data.

Fifth, regarding #2, the answer depends on whether the value represents a business key or whether it represents descriptive data. If it’s descriptive data, nothing more needs to be done. If it is a business key, then you need a mechanism to implement the NULL Key Concept that we talk about in class. I’d be a bit cautious about replacing a numeric value with a zero. How do you know that the value is a NULL key replacement?

it sounds like there is more work to do with data profiling side-by-side with the business before you dive into any kind of design.

CDVP2 Instructor

Hi Cindi,

thanks for your reply.

Currently I am the only one having CDVP2 Certification.

From the business perspective the customers in the “Customer” tables are the same across all three ERP systems (Customer with ID 1 represents Customer “Jack” in all ERP systems).

You are right with the tenant id. As we want to have an overview across all ERP systems for customer I would assume we use the tenant “Default” or “Enterprise” and for the other tables it would be something like “Company 1”, “Company 2“ etc. As we have only ERP systems as our sources for the moment the BKCC would be “Default”.

All the mentioned tables (Base Calendar, Customer Bank Account,…) can be seen as reference data. But we have a relationship from “Customer” to “Contact” as well, which is a Business Key but not relevant for this use case. Can we to store the Contact ID as descriptive data in the customer satellite or DO we have to model this a HUB_Contact?
For me it’s the question of how many Entities we have to model in the first iteration.

Replacing numeric values with 0 is not really a good idea, right. So only checking the BKs für potential null values (empty string ’’) and use the NULL Key Concept.



For 1: You need source SMEs involved to profile business keys and relationships, your first go at the DV model might need refactoring although I consider this to be a dirty word. Operational reality dictates what happens here! Have you tried mob modelling? You need executive buy in, you need business involved otherwise you’re building a legacy platform!

For 2: don’t change what you’re getting from source! If they give you 1753 as the date then so be it. When it comes to interpreting that data that falls into the Business Vault and/or Information mart domains!

Hope that helps!