Hello everyone,
I’m working on a Data Vault implementation that currently includes Hubs, Links, and Satellites, including multi-active satellites and satellite links. The model is still in the development phase, so we’re focusing on relatively simple use cases.
I collaborate with a Java team that needs to extract data from the Raw Vault to present it on a dashboard. The data they require comes from:
- A few Hubs
- Two Satellites
I’m relatively new to Data Vault modeling and would like some guidance on the best practices for this scenario. Specifically:
- Should I create PIT and Bridge tables?
- While I understand these structures improve query performance and simplify data consumption, I’m wondering if they are necessary at this early stage, given that the data requirements are straightforward.
- Would it be acceptable to create a view with joins on the Raw Vault tables?
- My idea is to define a view to join the required Hubs, Links, and Satellites for the requested data. This would keep the logic in the database and provide a single source of truth for the Java team.
- Should I implement a Stored Procedure for parameterized data retrieval?
- I’m considering creating a Stored Procedure that accepts parameters (e.g., specific keys or date ranges) and performs the necessary filtering. The Java team could call this procedure to fetch the data dynamically.
Since we’re still building the Data Vault and don’t yet have complex scenarios, I’m looking for recommendations on how to approach this:
- Is it better to keep things simple with views and procedures for now, or should I start implementing PIT and Bridge tables early?
- Are there any other best practices I should consider?
Thank you in advance for your suggestions!