AutomateDV vs dataVault4dbt

kasia · 7 July 2024 15:40

Dear Data-Vaulters,
I’m about to choose which data vault package for dbt my company will use. Do you have experience with AutomateDV and/or datavault4dbt? Which one is better?
To make things more complex on my side, I have hundreds of small data sources, some of which are created manually. So data quality is a big problem for us. This probably makes configurability and flexibility my top priorities in terms of features.

Any thoughts and experiences will be greatly appreciated!

patrickcuba · 7 July 2024 23:10

datavault4dbt is a fork of AutomateDV.

mrcool4 · 11 July 2024 11:35

We are using automatedv

kasia · 11 July 2024 12:13

Are you satisfied? Any obstacles? Do you need to use any workarounds? Or do you often consult the community?

patrickcuba · 11 July 2024 23:04

anything using dbt needs workarounds… but that’s the beauty of using dbt packages, building your own macros

JFP · 15 July 2024 13:01

@kasia: I don’t have an answer for you about the tool. I do have questions about the complexity you described.

Do the hundreds of small data sources exist to support the same process and solve the same business problem?
Do the systematically created ones share a common schema?
What percentage of your source inputs are manually created?
Are the manual data sources the root cause of the data quality issues?

You mentioned that data quality is a big problem. If that is the case, your choice of ETL tool is not as important as addressing governance and stewardship of your data sources.

I would love to hear your take on my questions.
Best regards,
JF

kasia · 15 July 2024 13:29

Hi,

All the sources support the same business process. They don’t share a schema (each source has a different one and there is nothing we can do about it). About 5-10% of the sources are manually created, but every source system is prone to human error (imagine invoicing systems where you enter data manually: the output has a consistent schema and data types, but the content can still be incorrect).

Yes I know that data governance is important and we’re working on it, but there are things we cannot really change. My research shows that we’re trying to solve an issue that basically doesn’t exist anywhere else. Currently nobody deals with manually created data to this extent.

Btw. I’m not working with invoices. It was just an example.

Thanks for your questions. I’m wondering what your thoughts are now that I’ve answered them.

Nat · 23 July 2024 18:18

I’ve tried both. I found that although datavault4dbt forked AutomateDV, it has now gone quite a bit further than AutomateDV did. But I may not be up to date on the latest AutomateDV.

You can easily write some Python code to generate the YAML files (which are basically Python dicts) for both packages, passing in different parameters (they are quite similar but use slightly different names), and automate the hell out of it. If you have hundreds of small sources you’ll want lots of automation. I did try using the best of both, which is OK until you get to PITs: the datavault4dbt PITs were better last time I looked, but they didn’t work on AutomateDV hubs for me (a different hash key (HK) data format, I think).
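
A minimal sketch of that idea, not a tested integration with either package: one source description is turned into hub metadata for each package by swapping the parameter names, then written out as simple YAML. The parameter names in PARAM_NAMES, the hub_metadata and to_yaml helpers, and the stg_/_hk naming are all assumptions made for illustration, so check them against each package's documentation before relying on them.

```python
# Hypothetical mapping from a neutral description to each package's
# parameter names. These names are assumptions; verify them in the docs.
PARAM_NAMES = {
    "automate_dv": {
        "hash_key": "src_pk",
        "business_keys": "src_nk",
        "source": "source_model",
    },
    "datavault4dbt": {
        "hash_key": "hashkey",
        "business_keys": "business_keys",
        "source": "source_models",
    },
}


def hub_metadata(package, hash_key, business_keys, source):
    """Build the hub metadata dict using the chosen package's parameter names."""
    names = PARAM_NAMES[package]
    return {
        names["hash_key"]: hash_key,
        names["business_keys"]: business_keys,
        names["source"]: source,
    }


def to_yaml(metadata):
    """Emit block-style YAML for a flat dict of plain strings and string lists.

    Deliberately tiny: no quoting or nesting, which is enough for
    identifier-like values such as column and model names.
    """
    lines = []
    for key, value in metadata.items():
        if isinstance(value, list):
            lines.append(f"{key}:")
            lines.extend(f"  - {item}" for item in value)
        else:
            lines.append(f"{key}: {value}")
    return "\n".join(lines)


# Hypothetical source list; with hundreds of sources this would be read
# from a catalogue or spreadsheet instead of being typed in.
SOURCES = [
    {"name": "customer", "business_keys": ["customer_id"]},
    {"name": "order", "business_keys": ["order_id"]},
]

if __name__ == "__main__":
    for package in PARAM_NAMES:
        for src in SOURCES:
            meta = hub_metadata(
                package,
                hash_key=f"{src['name']}_hk",
                business_keys=src["business_keys"],
                source=f"stg_{src['name']}",
            )
            print(f"# {package}: hub_{src['name']}")
            print(to_yaml(meta))
```

Keeping the per-package differences in a single lookup table means the source list stays package-neutral, so switching or comparing packages only touches PARAM_NAMES.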

Nat · 23 July 2024 18:18

Also, I’m not sure we are allowed to pitch, but I provide services around this, so if you’re interested send me a DM. (Putting this separately in case the mods want to delete this post.)
