In learning about the data vault methodology, I see hard rules and soft rules mentioned as a way to identify “WHAT” specific transformations need to be completed on the data before it is loaded into the raw vault. I understand the reasoning behind their existence, but I haven’t come across any material that indicates “HOW” to implement them. What are my options, and is there material somewhere that explains best practices for accomplishing the objectives of the rules?
Thanks for your time.
Soft rules are the T in ETL
Thanks, Patrick – I knew that LOL. My question had to do with the DV methodology of implementing the rules. For example, are there unwritten guidelines to develop these rules to simplify their implementation? Do you use generically written functions that follow a specific pattern? Are there already libraries written? I realize that I’m spit-balling here for a good reason. I love the methodology of how the data is organized to address scaling and growth. If what I’m asking exists, I’d like to know what others have done to accomplish the same benefits in coding for the DV that also address scaling and growth.
I don’t think DV prescribes a way of defining Soft rules because in reality they could be anything. BV captures the outcome of those soft rules as Links and/or Satellites.
That holds whether the outcome is part of:
- DQ efforts and measures
- Plugging gaps in the source, a tech debt that will never be solved
- Deriving more value where there is no desire to do it in the source
- ML/AI algorithm outcomes
RV merely captures the outcome of automated Business Rules applied by the source systems; I think you know where I am going with this. The practice of how these rules are captured and managed can be covered by existing standards.
- RV: BR outcomes from source
- BV: BR outcomes from within DV. A BV can be based on RV, RV+BV, or just other BV; you design it
Then there are functional rules, rules needed to make the data presentable to the Business User, i.e. they are used to build Information Marts.
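To make the RV/BV split above a bit more concrete, here is a minimal sketch in Python, not a DV-prescribed pattern. It assumes a soft rule can be written as a plain function over raw vault satellite rows, with its outcome stored as a separate business vault satellite row tagged with the rule's name and version. All names here (RawSatRow, BvSatRow, apply_soft_rule, customer_tier, annual_spend, the 10,000 threshold) are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class RawSatRow:
    """Hypothetical raw vault satellite row: attributes as the source delivered them."""
    hub_key: str
    load_ts: datetime
    attrs: dict


@dataclass(frozen=True)
class BvSatRow:
    """Hypothetical business vault satellite row: the outcome of one soft rule."""
    hub_key: str
    load_ts: datetime
    rule_name: str
    rule_version: int
    attrs: dict


# A soft rule reads one raw row and returns derived attributes,
# or None when it has nothing to say about that row.
SoftRule = Callable[[RawSatRow], Optional[dict]]


def apply_soft_rule(
    rows: Iterable[RawSatRow],
    rule: SoftRule,
    rule_name: str,
    rule_version: int,
    now: Optional[datetime] = None,
) -> List[BvSatRow]:
    """Run one soft rule over raw rows; the raw rows themselves are never modified."""
    now = now or datetime.now(timezone.utc)
    out = []
    for row in rows:
        derived = rule(row)
        if derived is not None:
            out.append(BvSatRow(row.hub_key, now, rule_name, rule_version, derived))
    return out


def customer_tier(row: RawSatRow) -> Optional[dict]:
    """Example soft rule (assumed business logic): classify a customer by annual spend."""
    spend = row.attrs.get("annual_spend")
    if spend is None:
        return None
    return {"tier": "gold" if spend >= 10_000 else "standard"}


if __name__ == "__main__":
    ts = datetime(2024, 1, 1, tzinfo=timezone.utc)
    raw = [
        RawSatRow("a1", ts, {"annual_spend": 12_000}),
        RawSatRow("b2", ts, {"annual_spend": 500}),
    ]
    for bv_row in apply_soft_rule(raw, customer_tier, "customer_tier", 1):
        print(bv_row.hub_key, bv_row.rule_name, bv_row.attrs)
```

Keeping the rule name and version on every derived row is one way to let a rule change over time without touching the RV; whether that fits depends on how your BV is designed.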
Some guidance here: “Data Vault Mysteries… Business Vault” by Patrick Cuba (Medium)
Your explanation was a fantastic segue to your Data Vault Mysteries… Business Vault article. It cleared up a misunderstanding I had about the relationship between the RV and the BV. My takeaway from the material you provided in answer to my question is the famous answer: IT DEPENDS. The various business needs for the information determine how the rules are applied. Our job is to use the best approaches to provide that information from the contents of the RV and, optionally, one or more BVs. If the process can be shared, so be it, but don’t knock yourself out looking for a one-size-fits-all solution.