More Salesforce woes when it comes to DV…
There are many instances where a record in Salesforce gets “touched” without an actual data change and, because of the nature of Salesforce, the LastModifiedDate gets updated to the latest touch. Since we are following an ELT methodology to populate the DV for our customer, MuleSoft does not apply a “hashdiff”-like comparison to the records it brings over, so our SOURCE gets populated with a new record/loaddate/lastmodifieddate combination even though the rest of the data is the same. When DBTVault picks up the record and moves it into the Hub and Satellite (we do not include LASTMODIFIEDDATE in the hashdiff), it creates a new record with the same HASHDIFF.
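To make that concrete, here is a minimal sketch (my own illustration, with hypothetical Opportunity column names) of why a touch-only update produces an identical hashdiff: the hash is computed over the payload columns only, and LASTMODIFIEDDATE is deliberately left out.

```python
import hashlib

# Hypothetical payload columns for an Opportunity; LASTMODIFIEDDATE is
# deliberately excluded from the hash.
PAYLOAD_COLS = ["NAME", "STAGENAME", "AMOUNT"]

def hashdiff(record: dict) -> str:
    """Concatenate the payload columns and hash them (MD5 here for brevity)."""
    concat = "||".join(str(record.get(col, "")) for col in PAYLOAD_COLS)
    return hashlib.md5(concat.encode("utf-8")).hexdigest()

before = {"NAME": "Acme Renewal", "STAGENAME": "Contracting",
          "AMOUNT": 100, "LASTMODIFIEDDATE": "2024-01-01"}
# A "touch": only LASTMODIFIEDDATE changes, the payload is untouched.
touched = dict(before, LASTMODIFIEDDATE="2024-01-02")

same = hashdiff(before) == hashdiff(touched)  # True: payload unchanged
```

So from the Satellite's point of view, the touched record is indistinguishable from the original, which is exactly what causes the duplicate-looking rows.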
Thinking through all of that, I guess I really have a few questions:
- How should this be handled? Should the record be allowed to persist, or should it not make it into the DV?
- For anyone more familiar with DBTVault than I am: I thought it was able to look at the HASHDIFF and cleanly avoid inserting a new record when the HASHDIFF is the same?
- Should I set up something to automate deleting duplicate HASHDIFFs?
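On the second and third questions: the standard Satellite load pattern (and what I would expect from DBTVault, though this sketch is my own illustration, not its actual code) compares each incoming record's hashdiff against the latest Satellite row for that hub key and only inserts on a change, so touch-only rows never land and there is nothing to delete afterwards:

```python
def load_satellite(satellite: list[dict], incoming: list[dict]) -> int:
    """Append an incoming row only when its HASHDIFF differs from the
    latest Satellite row for the same HUB_KEY. Rows are assumed to be
    ordered by load date. Returns the number of rows inserted."""
    latest = {row["HUB_KEY"]: row["HASHDIFF"] for row in satellite}
    inserted = 0
    for row in incoming:
        if latest.get(row["HUB_KEY"]) != row["HASHDIFF"]:
            satellite.append(row)
            latest[row["HUB_KEY"]] = row["HASHDIFF"]
            inserted += 1
    return inserted

sat = [{"HUB_KEY": "opp1", "HASHDIFF": "aaa", "LOAD_DATE": 1}]
# A touch-only record arrives: same payload, new LASTMODIFIEDDATE,
# hence the same HASHDIFF -> it is skipped, not inserted.
count = load_satellite(sat, [{"HUB_KEY": "opp1", "HASHDIFF": "aaa", "LOAD_DATE": 2}])
```

If duplicates are already in the Satellite, the cleanup rule would be the mirror image: delete rows whose hashdiff equals the immediately preceding row for the same key, ordered by load date.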
Now, all of that said, there are some instances where I would expect duplicate HASHDIFFs, but I’m not sure how to handle them. Situation: an Opportunity in the sales pipeline moves from Contracting to Contract Review. The contract gets rejected for some reason, so the Opportunity moves back to Contracting in order to be worked. This is one example of a workflow for the customer, but they have many similar workflows. When that happens, the move back to Contracting is a true data change with a different LASTMODIFIEDDATE, but it creates a duplicate HASHDIFF (though I feel this is an acceptable duplicate).
The question is: is it okay to have an “acceptable duplicate” and how should it be handled?
As an overarching question covering all of this topic: it is possible every object we sync over could fall into the “acceptable duplicate” situation, since select boxes and formula fields can change and then revert back. In broad strokes, should I just allow duplicate hashdiffs and safely assume that duplicates with different LASTMODIFIEDDATEs are instances where a value was moved forward and then reverted back (which was 100% the case for all of the duplicates, after a lot of manual data verification and audit log tracking)?
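For what it's worth, the latest-row comparison above is exactly what makes the “acceptable duplicate” safe: a reverted value has the same hashdiff as an older, non-adjacent row, so it still gets inserted, while back-to-back duplicates (pure touches) do not. A toy sequence for the Contracting → Contract Review → Contracting example (my own sketch, with made-up hashdiff values):

```python
def accept(history: list[str], incoming: str) -> bool:
    """Insert only when the incoming hashdiff differs from the most
    recent one; non-consecutive repeats (reverts) are accepted."""
    return not history or history[-1] != incoming

history: list[str] = []
for hd in ["contracting", "contracting",   # second one is a touch: skipped
           "contract_review",
           "contracting"]:                 # genuine revert: inserted
    if accept(history, hd):
        history.append(hd)

# history == ["contracting", "contract_review", "contracting"]
```

Under that rule, allowing non-consecutive duplicate hashdiffs is not just acceptable but necessary, or the revert back to Contracting would be lost from the history.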