AutomateDV vs dataVault4dbt

Dear Data-Vaulters,
I’m about to choose which data vault package for dbt my company will use. Do you have experience with AutomateDV and/or datavault4dbt? Which one is better?
To make things more complex on my side, I have hundreds of small data sources, some of which are manually created, so data quality is a major problem for us. This probably makes configurability and flexibility my top priorities in terms of features.

Any thoughts and experiences will be greatly appreciated!

datavault4dbt is a fork of AutomateDV.

We are using AutomateDV.

Are you satisfied? Any obstacles? Do you need to use any workarounds? Or do you consult the community often? 🙂

Anything using dbt needs workarounds… but that’s the beauty of using dbt packages: you can build your own macros.

@kasia: I don’t have an answer for you about the tool, but I do have questions about the complexity you described.

  • Do the hundreds of small data sources exist to support the same process and solve the same business problem?

  • Do the systematically created ones share a common schema?

  • What percentage of your source inputs are manually created?

  • Are manual data sources the root cause of data quality issues?

You mentioned that data quality is a major problem. If that is the case, your choice of ETL tool is not as important as addressing governance and stewardship of your data sources.

I would love to hear your take on my questions.
Best regards,
JF

Hi,

All the sources support the same business process, but they don’t share the same schema (each source has a different schema and there is nothing we can do about it). About 5-10% of the sources are manually created, but every source system is prone to human error (imagine invoicing systems where you enter data manually: the output has a consistent schema and data types, but the content can still be incorrect).

Yes I know that data governance is important and we’re working on it, but there are things we cannot really change. My research shows that we’re trying to solve an issue that basically doesn’t exist anywhere else. Currently nobody deals with manually created data to this extent.

Btw. I’m not working with invoices. It was just an example.

Thanks for your questions. I’m wondering what your thoughts are now that I’ve answered them 🙂