Change in Primary Key at source

One of our hubs is central to rest of our Data Vault. Meaning, 80% of our satellites are hanging off this hub.

Consultants who came in first to build this 1st one of our hubs had little idea as to what data field is ideal to use for Hub hash, so they ended up choosing something that could change once in say 2 years for unknown reasons. We plan to remodel the hub in near future and use a more stable column for Hub hash.

For historical data, Given that our datasets are small ~ 50K rows per satellite and wondering if it is advisable to simply update the satellite data to point to the new key? (May be not, but thought I’d ask)

If not, is there a simpler way to point historical data to new key without undertaking a herculean effort.

I think refactoring will help you in the long run — imagine having to write queries to handle data differently at a point in time just based on a legacy DV

Hi @patrickcuba

Thank you very much for responding. I wanted to make sure I’m on the right track. Our plan is to :

Create a new hub with a stable primary key and a refactor existing data to use the new hub and key.

–Brief plan

  1. Hub called h_property with PropertyNumber as the current business key, and we want to change it to use propertyid as the new business key.

  2. ##Analyze Impact##:

    • Identify all Links and Satellites that reference h_property.
  3. ##Create a New Hub##:
    – SQL
    CREATE TABLE h_property_New (
    propertyhash char(32),
    propertyid VARCHAR(36) PRIMARY KEY,
    LoadDateTime DATETIME,
    RecordSource VARCHAR(50)
    );

  4. ##Data Migration##:
    – SQL
    INSERT INTO h_property_New (propertyhash,propertyid, LoadDateTime, RecordSource)
    SELECT new_hash_from_new_key, propertyid, LoadDateTime, RecordSource
    FROM h_property;

  5. ##Update Links##:
    – SQL
    ALTER TABLE l_property_customer ADD propertyid VARCHAR(36);

    UPDATE l_property_customer OL
    SET propertyid = PHN.propertyid
    FROM h_property PH
    JOIN h_property_New PHN ON PH.PropertyNumber = PHN.PropertyNumber
    WHERE OL.PropertyNumber = PH.PropertyNumber;

  6. ##Update Satellites##:

    ALTER TABLE property_satellite ADD propertyid VARCHAR(36);

    UPDATE property_satellite CS
    SET propertyid = PHN.propertyid
    FROM h_property PH
    JOIN h_property_New PHN ON PH.PropertyNumber = PHN.propertyid
    WHERE CS.PropertyNumber = PH.PropertyNumber;

  7. ##Update ETL Processes##:

    • Modify ETL to load propertyid in the new Hub.
  8. ##Deprecate the Old Hub##:

    • Ensure old Hub is no longer in use.