Hi All,
In our organisation we are in the process of adding a raw Data Vault layer to our architecture. However, different teams still have questions about its necessity and the complexity it adds. One of the primary concerns is that our data engineering team has high turnover, and engineers find it hard to understand and appreciate the Data Vault method of modelling.
So we are considering an architecture where teams can choose either of the paths below:
- SOURCE → STAGING → RAW DATA VAULT → DATA MART (Star Schema)
- SOURCE → STAGING → RAW DATA VAULT and DATA MART (loaded in parallel)
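To make the second path concrete, here is a minimal sketch of one load cycle feeding both legs from the same staging rows: an insert-only hub/satellite pair for the raw vault, and a star-schema dimension loaded directly from staging. All table and column names (stg_customer, hub_customer, sat_customer, dim_customer) are hypothetical and chosen only for illustration; a real implementation would live in your warehouse/ELT tooling, not in-process SQLite.

```python
# Illustrative sketch of the parallel path: staging feeds the raw vault
# (history-preserving, insert-only) and the data mart (latest state)
# independently. All names here are made up for the example.
import hashlib
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Staging: raw customer rows as landed from the source.
cur.execute("CREATE TABLE stg_customer (customer_id TEXT, name TEXT, city TEXT)")
cur.executemany("INSERT INTO stg_customer VALUES (?, ?, ?)",
                [("C1", "Alice", "Oslo"), ("C2", "Bob", "Bergen")])

# Raw vault: hub holds the business key; satellite holds descriptive history.
cur.execute("CREATE TABLE hub_customer (hk TEXT PRIMARY KEY, customer_id TEXT, load_ts TEXT)")
cur.execute("CREATE TABLE sat_customer (hk TEXT, name TEXT, city TEXT, load_ts TEXT)")

# Star-schema mart: a plain dimension, loaded directly from staging.
cur.execute("CREATE TABLE dim_customer (customer_id TEXT PRIMARY KEY, name TEXT, city TEXT)")

load_ts = datetime.now(timezone.utc).isoformat()
for customer_id, name, city in cur.execute("SELECT * FROM stg_customer").fetchall():
    # Hash key derived from the business key, as is conventional in Data Vault.
    hk = hashlib.md5(customer_id.encode()).hexdigest()
    # Vault leg: insert-only, so every load cycle preserves history.
    cur.execute("INSERT OR IGNORE INTO hub_customer VALUES (?, ?, ?)",
                (hk, customer_id, load_ts))
    cur.execute("INSERT INTO sat_customer VALUES (?, ?, ?, ?)",
                (hk, name, city, load_ts))
    # Mart leg: loaded straight from staging, keeps only the latest state.
    cur.execute("INSERT OR REPLACE INTO dim_customer VALUES (?, ?, ?)",
                (customer_id, name, city))
```

The trade-off the sketch surfaces: the mart and the vault each maintain their own load logic against staging, so business-key handling and deduplication rules exist in two places instead of one.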
Has anyone used a setup like this, where the data mart is loaded directly from staging while a raw vault still preserves historic data for later AI/ML use cases? Are there any obvious problems with this approach?