To subtype or not

jbo · 9 November 2022 15:55

Hello,

I have the following case in our company. Our business users talk about orders, invoices, credit notes, returns, etc. and so I am tempted to model these as individual entities (HUBs).
However, all these entities come from two sources (Sales Header/Line), distinguishable via a Document Type field. When the documents are posted, new entries are created in additional tables. For example, posted invoices are created in Sales Invoice Header/Line, posted credit notes in Sales Cr.Memo Header/Line and posted returns in Return Receipt Header/Line.
In the RAW Vault we are not supposed to create super and subtypes. Would this be a case where new hubs are created in the Business Vault, depending on the type?

For example:
HUB_ORDERS - including Sales Header values with Type “Order”.
HUB_INVOICES - including Sales Invoice Header values and additionally as Business Rule values from Sales Header with Type “Invoice”.

Kind regards

patrickcuba · 10 November 2022 02:27

Are you creating new business keys in the DV? If not they are not BV hubs.
Split the content by type into your multiple hubs before loading to RV. Ensure the mapping is also built to handle new types that you have not seen before. Load to hubs + links.
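A minimal sketch of that split, assuming staged rows arrive as dictionaries with `document_type` and `no` fields. The type values, hub names, and the `split_by_document_type` helper are hypothetical illustrations, not names from the actual source system. Rows with a type the mapping does not know are kept aside instead of being dropped:

```python
from collections import defaultdict

# Hypothetical mapping from Document Type values to target hubs.
# Neither the type values nor the hub names come from the real source.
TYPE_TO_HUB = {
    "Order": "HUB_ORDERS",
    "Invoice": "HUB_INVOICES",
    "Credit Memo": "HUB_CREDIT_NOTES",
    "Return Order": "HUB_RETURNS",
}

def split_by_document_type(rows):
    """Group business keys by target hub; collect rows with unmapped types."""
    by_hub = defaultdict(list)
    unmapped = []
    for row in rows:
        hub = TYPE_TO_HUB.get(row["document_type"])
        if hub is None:
            # New, unseen type: keep the row so it can be reviewed and mapped.
            unmapped.append(row)
        else:
            by_hub[hub].append(row["no"])
    return dict(by_hub), unmapped

rows = [
    {"document_type": "Order", "no": "SO-1001"},
    {"document_type": "Invoice", "no": "SI-2001"},
    {"document_type": "Blanket Order", "no": "BO-3001"},
]
by_hub, unmapped = split_by_document_type(rows)
```

What happens to the unmapped rows (alert, quarantine table, failed load) is a design choice the thread does not settle.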

jbo · 10 November 2022 10:47

I do not create new business keys; I just take the values out of the original source, split by Document Type.
Does the document type then become a descriptive attribute in the satellite?

By handling new types, do you mean some kind of notification? That is one of the reasons why I had thought to load the original table (Sales Header/Line) as is (a HUB), in order not to lose any new types, and then copy them into the separate HUBs as a business rule. But this is apparently not the correct approach.

jbo · 10 November 2022 12:49

And there is a special aspect to be mentioned about the Sales Header and Sales Line tables. Each of these also has an archive table. If necessary, users can create a new version of a document in the frontend and this will be created in the archive table. Likewise, if a document is posted, the entries from the Sales Header and Sales Line are moved to the corresponding archive tables.
For example:
An invoice in the Sales Header/Line table is posted and a new document is created in the Sales Invoice Header/Line. The document from the Sales Header/Line is moved to the corresponding archive tables (deleted in Sales Header/Line and created as a new version in the archive tables).

Sales Header

Document Type
No

Sales Header Archive

Document Type
No
Doc No of Occurrence
Version No

How would one deal with such archive tables? One idea would be to split them up and attach them to the same hubs and make them into multi-active satellites in which the “Doc No of Occurrence” and “Version No” are child dependent keys.
But with such a construct I couldn’t link the Sales Header Archive to the Sales Line Archive, because those archived Sales Lines have a version number as well.

Sales Line

Document Type
No
Line No

Sales Line Archive

Document Type
No
Doc No of Occurrence
Version No
Line No

In order to be able to see the totality of all orders placed, the entries from the Sales Header/Line would have to be combined with the entries from the Sales Header Archive/Line Archive. This would probably have to be created via a Business Satellite.

Nat · 10 November 2022 20:50

Which ERP are you using? Is it Sage? For some reason that seems Sage-y.

I don’t understand the statement ‘In order to be able to see the totality of all orders placed, the entries from the Sales Header/Line would have to be combined with the entries from the Sales Header Archive/Line Archive’ - usually, an archive table in an ERP is simply a history of changes, while the current version lives in the header/line table.

So if you want to see all the orders placed, you can query the current version in the ERP, but if you need to see all the changes to the orders placed, you can query the archive table in the ERP - or you can query the RV to give you the current version of each sales order header/line or to get all the changes to the sales order header/line as long as the extract period is short enough.

I guess I don’t really understand why you need the archive data in the data vault, but then I also don’t understand your ERP. It doesn’t seem like it’s necessary. Of course your ERP could be doing some very weird things; this does happen often.

jbo · 10 November 2022 22:48

It’s Microsoft Dynamics NAV. It’s sometimes a hassle to work with. No defined relationships (foreign keys) in the DB layer, only in the application layer. Logical relationships from one column to another table depend on the value in another column. Think of a column “id” that contains item references if column “type” has value 1, but references resources if the “type” column has value 2.
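To make that type-dependent reference concrete, here is a small sketch. It assumes the illustrative codes from the example above (1 for items, 2 for resources). The target table names and the `resolve_reference` helper are hypothetical:

```python
# Assumption from the example above: type 1 means "id" points at an item,
# type 2 means it points at a resource. Table names are hypothetical.
TYPE_TO_TABLE = {1: "item", 2: "resource"}

def resolve_reference(row):
    """Return (target_table, id) for a row with a type-dependent reference."""
    table = TYPE_TO_TABLE.get(row["type"])
    if table is None:
        raise ValueError(f"unknown type {row['type']!r} for id {row['id']!r}")
    return table, row["id"]

# The same id value points at different tables depending on the type column.
lines = [{"type": 1, "id": "1000"}, {"type": 2, "id": "1000"}]
resolved = [resolve_reference(line) for line in lines]
```

Raising on an unknown type is one option; routing such rows to a review queue would work as well.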

And regarding the sales order: an order gets physically deleted from the sales table if the invoice is posted (and some other conditions are met) and a new version is created in the archive tables (so you could think of it as moved to the archive tables). To get an overview of all orders placed we need to query the sales header/line tables for the open orders and the archive tables for the closed orders.
In the RV we create the history in satellite tables, but we would miss the orders created in the past if we don’t add the archive tables as well.
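The combination of both sources could be sketched like this. It is only an illustration under assumptions: documents are identified by (Document Type, No), the latest archived version is the one with the highest (Doc No of Occurrence, Version No), and a document still present in the current table wins over its archived versions. The field names and the `all_orders` helper are hypothetical:

```python
def all_orders(current_headers, archive_headers):
    """Combine open documents (current table) with closed ones (archive)."""
    latest = {}
    for row in archive_headers:
        key = (row["document_type"], row["no"])
        rank = (row["doc_no_occurrence"], row["version_no"])
        # Keep only the highest-ranked archived version per document.
        if key not in latest or rank > latest[key][0]:
            latest[key] = (rank, row)

    result = {key: {"status": "closed", **row} for key, (_, row) in latest.items()}
    for row in current_headers:
        key = (row["document_type"], row["no"])
        # Assumption: a document still in the current table is open.
        result[key] = {"status": "open", **row}
    return result

current = [{"document_type": "Order", "no": "SO-2"}]
archive = [
    {"document_type": "Order", "no": "SO-1", "doc_no_occurrence": 1, "version_no": 1},
    {"document_type": "Order", "no": "SO-1", "doc_no_occurrence": 1, "version_no": 2},
    {"document_type": "Order", "no": "SO-2", "doc_no_occurrence": 1, "version_no": 1},
]
orders = all_orders(current, archive)
```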

patrickcuba · 11 November 2022 00:03

Yes and split before load.

First prize: data is provided split already, get the source to do it
Second prize: you have to split the data just before staging
Third prize: (wooden spoon), your suggestion, split them out in BV

PS: there are no BV hubs unless you create the business key in DV, which you are not; you simply pull them out. Load those keys to the Hub table. The Hub table is the passive integration point between source systems and RV & BV. It is the Shared Kernel in DDD parlance. See: Data Vault and Domain Driven Design | by Patrick Cuba | Snowflake | Medium

Rusty · 15 November 2022 11:48

I’m going to stir the pot a little bit here and suggest that the structure of the hubs is not determined by any one source system. It is the product of your business analysis. Then you need to map the source onto it.
I’m working with MS Dynamics currently. I understand the pain. But the level of abstraction is determined by the business’ architecture and language. Work out what matters to the business and how they identify each object (business key), and then look for that in the source tables.

patrickcuba · 16 November 2022 21:39

It’s not stirring any pot, mate… what you say is correct. Hubs represent the business ontology; no source system dictates that, source systems map to that.
