Now I've built my datavault - how do I prove that it is right?

I’ve built a realtime DV with the 1st data source being a new D365 app. What does testing a DV look like?

  • Are there established frameworks and tools?
  • Do you all roll your own?

I have built a pilot and secured a budget, and it's time to build a team. Testing is a blind spot for me.

Thanks
Russell

A little bit like this: "Data Vault Test Automation" by Patrick Cuba (Snowflake, on Medium).

Reconstituting the original load tables from your data vault objects is a very reliable way to test that your model is working as designed. For example, if you've ingested a table that has become a hub and a satellite, join the hub and satellite and do set comparisons against the load table.
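The hub-plus-satellite comparison described above could be sketched roughly like this. It is only an illustration: the table and column names (load_customer, hub_customer, sat_customer, customer_hk) are made up, SQLite stands in for whatever platform you run on, and it assumes one satellite row per key (with history you would first pick the current row).

```python
import sqlite3


def reconcile(conn, load_sql, vault_sql):
    """Set-compare two queries: rows only in the load, rows only in the vault."""
    missing = conn.execute(f"{load_sql} EXCEPT {vault_sql}").fetchall()
    extra = conn.execute(f"{vault_sql} EXCEPT {load_sql}").fetchall()
    return missing, extra


# Hypothetical tables and data, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE load_customer (customer_id TEXT, name TEXT, city TEXT);
    CREATE TABLE hub_customer (customer_hk TEXT PRIMARY KEY, customer_id TEXT);
    CREATE TABLE sat_customer (customer_hk TEXT, load_dts TEXT, name TEXT, city TEXT);
    INSERT INTO load_customer VALUES ('C1', 'Ann', 'Leeds'), ('C2', 'Bob', 'York');
    INSERT INTO hub_customer VALUES ('hk1', 'C1'), ('hk2', 'C2');
    INSERT INTO sat_customer VALUES
        ('hk1', '2024-01-01', 'Ann', 'Leeds'),
        ('hk2', '2024-01-01', 'Bob', 'Hull');
""")

LOAD_SQL = "SELECT customer_id, name, city FROM load_customer"

# Rebuild the load table shape by joining the hub to its satellite.
VAULT_SQL = """
    SELECT h.customer_id, s.name, s.city
    FROM hub_customer h
    JOIN sat_customer s ON s.customer_hk = h.customer_hk
"""

missing, extra = reconcile(conn, LOAD_SQL, VAULT_SQL)
print("in load but not in vault:", missing)
print("in vault but not in load:", extra)
```

The sample data deliberately holds the wrong city for C2 in the satellite, so both lists come back non-empty; a passing test is two empty lists. Note that EXCEPT is a set operation, so duplicate rows in the load table would need a separate row-count check.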


Thanks @patrickcuba. I’ve read your chapter on automated testing - thanks for a great book
We’re managing the data integrity of the DV with DB controls for now. I do not see performance or scalability challenges in our circumstances in the foreseeable future, so I’ve decided to rely on prevention rather than detection.

What I’m looking for is something like a test-driven development approach. I want to create tests for specific complex scenarios that can be repeated and automated in the dev and test environments.

Are there known tools for this? (I’m only now discovering dbtvault and wondering if I can incorporate it into our toolkit.)

Thanks

Ah, there are many; I guess it depends on what tools you’re using. Great Expectations, for example, seems very popular.

dbt has very good test features you can use to develop a test harness, and with Jinja support you can template repeatable tests. There is even a Great Expectations port available as a package.
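To give a feel for what "templating repeatable tests" means, here is a rough sketch in plain Python. It is not dbt or Jinja code: string.Template stands in for the templating, SQLite stands in for the warehouse, and every table, column, and check name is made up. The idea is the same, though: write each check once and stamp it out per hub/satellite pair.

```python
import sqlite3
from string import Template

# Each template returns the number of offending rows; zero means the check passes.
ORPHAN_SAT = Template(
    "SELECT COUNT(*) FROM $sat s "
    "LEFT JOIN $hub h ON h.$hk = s.$hk WHERE h.$hk IS NULL"
)
DUPLICATE_KEY = Template(
    "SELECT COUNT(*) FROM (SELECT $bk FROM $hub GROUP BY $bk HAVING COUNT(*) > 1)"
)
CHECKS = (("orphan_sat", ORPHAN_SAT), ("duplicate_key", DUPLICATE_KEY))

# Hypothetical hub/satellite pairs the checks are applied to.
PAIRS = [
    {"hub": "hub_customer", "sat": "sat_customer", "hk": "customer_hk", "bk": "customer_id"},
    {"hub": "hub_order", "sat": "sat_order", "hk": "order_hk", "bk": "order_id"},
]


def run_checks(conn, pairs):
    """Render every check for every pair and collect the ones that find bad rows."""
    failures = []
    for pair in pairs:
        for name, template in CHECKS:
            (bad_rows,) = conn.execute(template.substitute(pair)).fetchone()
            if bad_rows:
                failures.append((pair["hub"], name, bad_rows))
    return failures


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hub_customer (customer_hk TEXT, customer_id TEXT);
    CREATE TABLE sat_customer (customer_hk TEXT, name TEXT);
    CREATE TABLE hub_order (order_hk TEXT, order_id TEXT);
    CREATE TABLE sat_order (order_hk TEXT, status TEXT);
    INSERT INTO hub_customer VALUES ('hk1', 'C1');
    INSERT INTO sat_customer VALUES ('hk1', 'Ann');
    -- Deliberate defects: a duplicated business key and an orphaned satellite row.
    INSERT INTO hub_order VALUES ('ok1', 'O1'), ('ok2', 'O1');
    INSERT INTO sat_order VALUES ('ok1', 'open'), ('ok9', 'closed');
""")

failures = run_checks(conn, PAIRS)
for hub, check, bad_rows in failures:
    print(f"{hub}: {check} found {bad_rows} bad row(s)")
```

With the seeded defects, only hub_order should be reported, once for each check. In dbt the equivalent would be a generic test written once in SQL and Jinja and attached to each model in its YAML file.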
