Dealing with NVARCHAR(MAX) Columns in Hash Calculations Azure Synapse DB

Bigguy365 · 11 April 2023 18:41

In a satellite table, I have a column that contains descriptive information in the form of string paragraphs. The longest value in this column is 39,467 characters. I’m stuck on how to include the value in a hash calculation when the largest fixed NVARCHAR length is 4000. When I specify 4000, it passes validation but then fails with a truncation warning because the string exceeds that length, and processing stops. When I specify MAX, it also passes validation but fails with an error indicating that the column type is not allowed in a columnstore index. What do you recommend?

Thank you

Clay

Nicruzer · 11 April 2023 20:23

Clay, when you say “hash calculation,” do you mean hash distribution, or are you using the column as part of a HashDiff in a satellite?

We use a dedicated pool in Azure Synapse as well and have also run into this limitation. Unfortunately, it is exactly that: a limitation.

To maintain data integrity, keep NVARCHAR(MAX) on the column and forgo the CLUSTERED COLUMNSTORE INDEX on the table. Opt for either a rowstore CLUSTERED INDEX or a HEAP if your table will have fewer than 60 million rows (which is less likely if it’s a satellite).

Keep in mind that using MAX does not prevent you from using a hash distribution on the table, if you so choose.
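
As a rough illustration of the above, here is a minimal sketch of what such a table definition might look like in a dedicated SQL pool. The table name, column names, and data types (for example BINARY(32) for the hash columns) are hypothetical and only there to show where the HEAP and DISTRIBUTION options go; adjust them to your own model.

```sql
-- Hypothetical satellite; all names and types are illustrative assumptions.
CREATE TABLE dbo.sat_item_description
(
    hub_item_hk    BINARY(32)     NOT NULL,  -- parent hub hash key (assumed SHA-256 width)
    load_dts       DATETIME2(7)   NOT NULL,
    record_source  NVARCHAR(100)  NOT NULL,
    hash_diff      BINARY(32)     NOT NULL,  -- HashDiff over the descriptive columns
    description    NVARCHAR(MAX)  NULL       -- long text, not allowed in a columnstore index
)
WITH
(
    DISTRIBUTION = HASH(hub_item_hk),  -- hash distribution still works with a MAX column
    HEAP                               -- explicit, because the pool defaults to CLUSTERED COLUMNSTORE INDEX
);
```

To use a rowstore clustered index instead, the HEAP option could be swapped for something like CLUSTERED INDEX (hub_item_hk, load_dts).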

Bigguy365 · 11 April 2023 20:36

Thanks for the feedback. It is for the HashDiff calculation. We are also using a dedicated pool in Azure Synapse. I’ll give the Heap a go and let you know.

Thanks again.

Bigguy365 · 11 April 2023 21:43

Heap worked!!! Again — thank you.

AHenning · 11 April 2023 22:08

You could also consider having your nvarchar(max) columns in a separate satellite.
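
A minimal sketch of that split, assuming hypothetical table and column names: the regular descriptive attributes stay in a satellite that keeps the columnstore index, and only the long text moves to its own heap satellite with its own HashDiff.

```sql
-- Hypothetical names throughout; a sketch of the split, not a definitive design.

-- Satellite for the ordinary attributes: can keep the columnstore index.
CREATE TABLE dbo.sat_item_details
(
    hub_item_hk    BINARY(32)     NOT NULL,
    load_dts       DATETIME2(7)   NOT NULL,
    record_source  NVARCHAR(100)  NOT NULL,
    hash_diff      BINARY(32)     NOT NULL,
    item_name      NVARCHAR(200)  NULL,
    item_status    NVARCHAR(50)   NULL
)
WITH (DISTRIBUTION = HASH(hub_item_hk), CLUSTERED COLUMNSTORE INDEX);

-- Satellite holding only the long text: stored as a heap.
CREATE TABLE dbo.sat_item_long_text
(
    hub_item_hk    BINARY(32)     NOT NULL,
    load_dts       DATETIME2(7)   NOT NULL,
    record_source  NVARCHAR(100)  NOT NULL,
    hash_diff      BINARY(32)     NOT NULL,
    description    NVARCHAR(MAX)  NULL
)
WITH (DISTRIBUTION = HASH(hub_item_hk), HEAP);
```

Distributing both satellites on the same hub hash key is an assumption made here so that rows for the same parent land on the same distribution.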

Bigguy365 · 12 April 2023 14:28

That’s a good idea!! Thanks
