To subtype or not

Hello,

I have the following case in our company. Our business users talk about orders, invoices, credit notes, returns, etc. and so I am tempted to model these as individual entities (HUBs).
However, all these entities come from two sources (Sales Header/Line), distinguishable via a Document Type field. When the documents are posted, new entries are created in additional tables. For example, posted invoices are created in Sales Invoice Header/Line, posted credit notes in Sales Cr.Memo Header/Line and posted returns in Return Receipt Header/Line.
In the RAW Vault we are not supposed to create super and subtypes. Would this be a case where new hubs are created in the Business Vault, depending on the type?

For example:
HUB_ORDERS - including Sales Header values with Type ā€œOrderā€.
HUB_INVOICES - including Sales Invoice Header values and additionally as Business Rule values from Sales Header with Type ā€œInvoiceā€.

Kind regards

Are you creating new business keys in the DV? If not they are not BV hubs.
Split the content by type into your multiple hubs before loading to RV. Ensure the mapping is also built to handle where new types are found and you have not seen before. Load to Hubs + links.

I do not create new business keys, just taking the values out of the original source, split by Document Type.
Does the document type then become a descriptive attribute in the satellite?

Do you mean by handling some kind of notification? Because that is one of the reasons why I had thought to load the original table (Sales Header/Line) as is (a HUB) in order not to lose any new types and then copy them into the separate HUBs as a business rule. But this is apparently not the correct approach.

And there is a special aspect to be mentioned about the Sales Header and Sales Line tables. Each of these also has an archive table. If necessary, users can create a new version of a document in the frontend and this will be created in the archive table. Likewise, if a document is posted, the entries from the Sales Header and Sales Line are moved to the corresponding archive tables.
For example:
An invoice in the Sales Header/Line table is posted and a new document is created in the Sales Invoice Header/Line. The document from the Sales Header/Line is moved to the corresponding archive tables (deleted in Sales Header/Line and created as a new version in the archive tables).

Sales Header

  • Document Tpe
  • No

Sales Header Archive

  • Document Type
  • No
  • Doc No of Occurrence
  • Version No

How would one deal with such archive tables? One idea would be to split them up and attach them to the same hubs and make them into multi-active satellites in which the ā€œDoc No of Occurrenceā€ and ā€œVersion Noā€ are child dependent keys.
But with such a construct I couldnā€™t link the the Sales Header Archive to Sales Line Archive, because those archived Sales Lines have an version Number as well.

Sales Line

  • Document Tpe
  • No
  • Line No

Sales Line Archive

  • Document Type
  • No
  • Doc No of Occurrence
  • Version No
  • Line No

In order to be able to see the totality of all orders placed, the entries from the Sales Header/Line would have to be combined with the entries from the Sales Header Archive/Line Archive. This would probably have to be created via a Business Satellite.

Which ERP are you using? is it Sage? For some reason that seems Sage-y.

I donā€™t understand the statement ā€˜In order to be able to see the totality of all orders placed, the entries from the Sales Header/Line would have to be combined with the entries from the Sales Header Archive/Line Archiveā€™ - usually, an archive table in an ERP is simply a history of changes, while the current version lives in the header/line table.

So if you want to see all the orders placed, you can query the current version in the ERP, but if you need to see all the changes to the orders placed, you can query the archive table in the ERP - or the you can query the RV to give you the current version of each sales order header/line or to get all the changes to the sales order header/line as long as the extract period is short enough.

I guess I donā€™t really understand why you need the archive data in the data vault, but then I also donā€™t understand your ERP :smiley: It doesnā€™t seem like itā€™s necessary. Of course your ERP could be doing some very weird things, this does happen often.

Itā€™s Microsoft dynamics nav. Itā€™s sometimes a hassle to work with. No defined relationships (foreign keys) in the DB layer, only in the application layer. Logical relationships from one column to another table depending on value in another column. Think column ā€œidā€ contain item references, if colum ā€œtypeā€ has value 1, but if the type column has value 2 the ā€œidā€ column references resources.

And regarding the sales order: an order gets physically deleted from the sales table if the invoice is posted (and some other conditions) and a new version is created in the archive tables ( so you could think of moved to the archive tables). To get an overview about all orders placed we need to query the sales header/line tables for the open orders and the archive tables for the closed orders.
In the RV we create the history in satellite tables, but we would miss out the orders created in the past, if we donā€™t add the archive tables as well.

Yes and split before load.

  1. First prize: data is provided split already, get the source to do it
  2. Second prize: you have to split the data in just before staging
  3. Third prize: (wooden spoon), your suggestion, split them out in BV

PS: there are no BV hubs unless you create the business key in DV, which you are not, you simply pull them out. Load those keys to the Hub table. Hub table is the passive integration point between source systems and RV & BV. It is the Shared Kernel in DDD parlance. 1. Data Vault and Domain Driven Design | by Patrick Cuba | Snowflake | Medium

Iā€™m going to stir the pot a little bit here and suggest that the structure of the hubs is not determined by any one source system. It is the product of your business analysis. Then you need to map the source onto it.
Iā€™m working with MS Dynamics currently. I understand the pain. But the level of abstraction is determined by the businessā€™ architecture and language. What matters to the business, how do they identify each object (business key) and then look for that in the source tables

1 Like

itā€™s not stirring any pot mateā€¦ it is correct what you say. Hubs represent the business ontology; no source system dictates that, source systems map to that

1 Like