Modeling Dictionaries/Hash tables (Key/Value Pairs)

Nicruzer · 2 November 2022 16:43

In one of our source systems we have several “attribute” tables that essentially extend the primary tables to which they are linked. This system tracks access control across the corporation.

For example, our Application table, which keeps common metadata about applications together, links to an ApplicationAttribute table, which extends or customizes each application in the parent table. The attribute table is essentially a key/value pair table (dictionary, hash table) with a unique surrogate key and the foreign key.

AppAttributeId - surrogate key
AppId - foreign key
DataKey
DataValue

There are other attribute tables modeled in much the same way for other concepts within the system…RoleAttribute, AccountAttribute, ApplicationAccountXrefAttribute, etc. The structure is essentially constant: a surrogate key, the foreign key, the DataKey and the DataValue.

To avoid influencing the discussion with my ideas, I’d like to hear some initial thoughts on how to model this in the data vault without modeling the source system (although, I’m not aware of any other system that models data this way).

FrenkLoonen · 3 November 2022 10:35

I’m assuming the Business Key is AppId + DataKey? In that case I would create an MSAT with DataKey as Dependent Child Key and AppAttributeId as a plain attribute.

If not, then I think you’d have to use AppAttributeId as DCK.

Not sure what happens when a KVP gets deleted? Is it a hard delete? Or did you leave out the DeletedFlag for simplicity’s sake? In any case, if it is hard delete, you capture it because of the set-based pattern of the MSAT.

Not sure how many different DataKey values you can expect and how much you actually need towards your information mart… I currently use a business rule to pivot some of the frequently used DataKey values into a BDV satellite that’s on the same grain as (in your case) AppId.

Nicruzer · 3 November 2022 14:38

@FrenkLoonen - Excellent suggestions! Thank you.

And no…I didn’t leave out a deleted flag. None of the KVP’s are ever deleted, only updated…at least in the case of the application attributes. We do have a (poorly implemented) table in the source system that tracks role questions, and it allows users to create their own KVPs. Those can be deleted. Thankfully, we aren’t pulling that information into the DW yet.

One problem at a time…

patrickcuba · 4 November 2022 04:03

Hello,

Are you loading semi-structured content into a variant column? I do have an upcoming blog covering this.

Alternatively Ensemble Data Vault has a Satellite table pattern called Name-Pair Satellites: a “Key” column and a “Value” column.

Nicruzer · 4 November 2022 12:07

Thank you for the response, @patrickcuba.

I suppose the data, in some of the cases, can be considered semi-structured. I would be interested to read your blog post. I assume on Medium, correct?

patrickcuba · 5 November 2022 00:01

It will be posted to here: Patrick Cuba - Snowflake

And eventually on Medium

Topic		Replies	Views
Need help with a data vault model Data Vault 2.0	1	118	18 September 2024
What would be the data vault of this data model? Data Vault 2.0 link , hub	2	553	8 February 2023
How to model list attribute Data Vault 2.0	2	58	12 August 2024
Link DCK or MA satellite for modelling role/type information Data Vault 2.0	1	37	7 November 2024
Modelling Question on Business Key Updating in source system Data Vault 2.0	9	427	19 September 2022

Modeling Dictionaries/Hash tables (Key/Value Pairs)

Related topics