POLL: Which of the following Data Vault books have you read?

Which of the following three Data Vault books have you read?

  • Building a Scalable Data Warehouse with Data Vault 2.0 (Linstedt and Olschimke)
  • The Data Vault Guru (Cuba)
  • The Elephant in the Fridge (Giles)

If you have read more than one of the books listed in the poll, which would you recommend first to someone wanting to learn more about Data Vault 2.0?

If you have only read one, would you recommend it to others? Why or why not?

1 Like

I will admit: thus far, I have only read Building a Scalable Data Warehouse with Data Vault 2.0, and I reference it often. I am in the process of reading the other two books right now and am enjoying them both. (It’s pretty sweet that one of the authors, @patrickcuba, participates actively in this forum.)

As for a recommendation, it’s difficult to go wrong with promoting the book co-written by the creator of Data Vault himself, Dan Linstedt. If you aren’t yet a certified practitioner and really want to prepare for the exam, this is the book you want to start studying now.

I could go on and on, but I have more reading to do…

I’m interested to know: If you have read more than one of the books listed, what differences are worth noting?

I got them all on Kindle. I read most of Linstedt, but I tend to use Kindle’s search feature to dip into the other two when I need to look something up.

1 Like

I am guilty of the same…

1 Like

These are the 4 books I was aware of:

  1. 2015 - DV2 | Dan Linstedt - Building a Scalable Data Warehouse with Data Vault 2.0

  2. 2020 - DV2 | Patrick Cuba - The Data Vault Guru: a pragmatic guide on building a data vault

  3. 2015 - DV2 | Kent Graziano - Better Data Modeling: An Introduction to Agile Data Engineering Using Data Vault 2.0

  4. 2011 - older intro, it seems? | Dan Linstedt and Kent Graziano - Super Charge Your Data Warehouse: Invaluable Data Modeling Rules to Implement Your Data Vault (Volume 1)

1 is the foundation, even if a little outdated.
2 has a lot of practical advice.
Mandatory: 1, 3, and 4. (4 is old but has good, detailed examples of many of the concepts, such as effectivity satellites, same-as links, etc.; you just need to ignore its use of sequence PKs and updates.)

–the 5th element
I will read all 4 above.
And thanks for sharing a 5th I wasn’t aware of:
5. The Elephant in the Fridge (Giles)

Kind regards ~

1 Like

@emanueol - Thank you for sharing this one. I too was unaware this one exists. I have read and listened to Kent quite a bit, so I’m looking forward to adding this book to my library.

I’ve read Dan’s book, Building a Scalable Data Warehouse with Data Vault 2.0. I get it through the Safari subscription that comes with my ACM membership, a high-value 99 USD investment. While it might have a few gaps relative to current thinking, it’s invaluable as a getting-started guide.

I have Better Data Modeling: An Introduction to Agile Data Engineering Using Data Vault 2.0 (Graziano, Kent) on my Kindle. It’s been a while since I read it, so I’ll be getting back into it soon.

Purchased Patrick’s book this week in hard copy; it’s in transit. I have a new data apprentice starting out with me and want to share it with her, otherwise it might have been on my Kindle. Looking forward to getting my teeth into it.
Going to have to consider getting The Elephant in the Fridge. It seems to have some invaluable advice.

1 Like

I have all of the DV books I can find, including these. I use them primarily as reference material. This is such a large topic that it is a real challenge to cover the entire standard across all of the scenarios you will find in practice.
E.g., I recently chose to use @patrickcuba’s XTS pattern for late-arriving stream data. In the end it was quite easy to deploy in our streaming code: there was only one situation (case #4) where a special update to the SAT was required.

1 Like

The best part about the XTS pattern is that the load to the adjacent SAT and the processing of scenario 4 corrections can be done in a single query :slight_smile:

1 Like

It was solved in a single query. Our load process for each SAT is:

  1. For each incoming record (we’re getting a stream of sequential CUD events arriving (mostly) in order):
    1. Start TXN
      1. insert into XTS (one table per sat - eliminates concurrency concerns)
      2. insert new info into SAT (std DV pattern where HDIFF<>latest.HDIFF)
      3. insert into SAT previous record as next SAT record per scenario #4
      4. update the audit log with the successful result of this txn
    2. COMMIT
  2. Next record

works a treat :slight_smile:
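
For anyone following along, here is a minimal sketch of the per-record transaction described above, written in Python against an in-memory SQLite database. It is not the poster's actual code, and it uses several statements rather than one query. The table layouts, column names, and the `make_db` / `load_record` helpers are all hypothetical. The scenario #4 condition is an assumed reading of the steps above: a late record differs from its predecessor, while the record after it carries the predecessor's hashdiff and so was never written to the SAT.

```python
import sqlite3


def make_db():
    """Create hypothetical XTS, SAT and audit tables (illustrative layout only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE xts (hk TEXT, load_ts TEXT, hashdiff TEXT);
        CREATE TABLE sat (hk TEXT, load_ts TEXT, hashdiff TEXT, payload TEXT,
                          PRIMARY KEY (hk, load_ts));
        CREATE TABLE audit_log (hk TEXT, load_ts TEXT, status TEXT);
    """)
    return conn


def load_record(conn, hk, load_ts, hashdiff, payload):
    """Load one incoming record inside a single transaction."""
    with conn:  # commits on success, rolls back if anything raises
        # Neighbours of the incoming record on the XTS timeline.
        # Strict < and > mean the result does not depend on whether
        # XTS or SAT is written first.
        prev = conn.execute(
            "SELECT hashdiff, load_ts FROM xts WHERE hk = ? AND load_ts < ? "
            "ORDER BY load_ts DESC LIMIT 1", (hk, load_ts)).fetchone()
        nxt = conn.execute(
            "SELECT hashdiff, load_ts FROM xts WHERE hk = ? AND load_ts > ? "
            "ORDER BY load_ts ASC LIMIT 1", (hk, load_ts)).fetchone()

        # Step 1: always record the arrival in XTS.
        conn.execute("INSERT INTO xts VALUES (?, ?, ?)", (hk, load_ts, hashdiff))

        # Step 2: insert into SAT only when the hashdiff changed.
        if prev is None or prev[0] != hashdiff:
            conn.execute("INSERT INTO sat VALUES (?, ?, ?, ?)",
                         (hk, load_ts, hashdiff, payload))

            # Step 3 (assumed scenario #4): the next record matches the
            # previous one, so copy the previous SAT row forward to the
            # next record's timestamp to restore the timeline.
            if prev is not None and nxt is not None and nxt[0] == prev[0]:
                conn.execute(
                    "INSERT OR IGNORE INTO sat "
                    "SELECT hk, ?, hashdiff, payload FROM sat "
                    "WHERE hk = ? AND hashdiff = ? AND load_ts <= ? "
                    "ORDER BY load_ts DESC LIMIT 1",
                    (nxt[1], hk, prev[0], prev[1]))

        # Step 4: audit the successful transaction.
        conn.execute("INSERT INTO audit_log VALUES (?, ?, 'ok')", (hk, load_ts))
```

To try it, load a record for day 1 and an identical one for day 3, then a different one for day 2. The SAT should end up with three rows (day 1, day 2, and a copy of the day 1 record at day 3), while XTS and the audit log each hold one row per arrival.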

1 Like

Just remember: you don’t need to load XTS before the SAT, and you don’t need to enforce a load sequence to make this work.
This is because XTS is about the records before and after the one you’re trying to load, so in fact you can load XTS and its adjacent SAT at the same time; it doesn’t matter which goes first.

1 Like

I am doing the same.