Dealing with NVARCHAR(MAX) Columns in Hash Calculations in Azure Synapse

In a satellite table, I have a column containing descriptive information in the form of string paragraphs. The longest value in this column is 39,467 characters, but the largest NVARCHAR length I can declare is 4,000, so I'm stuck on how to feed the value into a hash calculation. When I declare the column as NVARCHAR(4000), it passes validation but fails at runtime with a truncation warning for strings that exceed that length, and processing stops. When I declare NVARCHAR(MAX), it also passes validation but fails with an error that the column type is not allowed in a columnstore index. What do you recommend?

Thank you


Clay, when you say “hash calculation,” do you mean hash distribution, or are you using the column as part of a HashDiff in a satellite?

We use a dedicated pool in Azure Synapse as well and have also run into this limitation. Unfortunately, it is exactly that: a limitation.

To maintain data integrity, keep the NVARCHAR(MAX) on the column and forgo the CLUSTERED COLUMNSTORE INDEX on the table. Opt instead for a rowstore CLUSTERED INDEX, or a HEAP if your table will hold fewer than 60 million rows (which is less likely if it's a satellite).
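A minimal sketch of that workaround, assuming a hypothetical satellite `sat_product_description` (all table and column names here are illustrative, not from your model):

```sql
-- Hypothetical satellite keeping the NVARCHAR(MAX) column.
-- HEAP (or a rowstore CLUSTERED INDEX) sidesteps the columnstore
-- restriction on (MAX) types in a dedicated SQL pool.
CREATE TABLE dbo.sat_product_description
(
    product_hk      BINARY(32)     NOT NULL,  -- parent hub hash key
    load_dts        DATETIME2(7)   NOT NULL,
    record_source   NVARCHAR(100)  NOT NULL,
    hash_diff       BINARY(32)     NOT NULL,
    description     NVARCHAR(MAX)  NOT NULL   -- the long descriptive text
)
WITH
(
    DISTRIBUTION = HASH (product_hk),
    HEAP  -- alternative: CLUSTERED INDEX (product_hk, load_dts)
);
```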

Keep in mind that using MAX does not prevent you from using a hash distribution on the table, if you so choose.
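For completeness, the HashDiff itself can still be computed in-database over the (MAX) column, for example with HASHBYTES. This is only a sketch with hypothetical column names, and it assumes your pool's HASHBYTES accepts LOB input (the old 8,000-byte input cap was removed in SQL Server 2016):

```sql
-- Sketch: HashDiff over a row that includes a NVARCHAR(MAX) column.
-- Names are illustrative; delimiter and null handling follow a common
-- Data Vault convention, not anything prescribed by Synapse.
SELECT
    product_hk,
    HASHBYTES(
        'SHA2_256',
        CONCAT(
            UPPER(ISNULL(record_source, '')), N'||',
            ISNULL(description, N'')  -- NVARCHAR(MAX) participates directly
        )
    ) AS hash_diff
FROM dbo.stg_product_description;
```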

Thanks for the feedback. It is for the HashDiff calculation. We are also using a dedicated pool in Azure Synapse. I’ll give the Heap a go and let you know.

Thanks again.

Heap worked!!! Thank you again.


You could also consider having your nvarchar(max) columns in a separate satellite.
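One way to sketch that split, again with hypothetical names: the narrow satellite keeps its CLUSTERED COLUMNSTORE INDEX, and only the overflow satellite carrying the (MAX) column gives it up:

```sql
-- Narrow satellite: no (MAX) columns, so columnstore remains possible.
CREATE TABLE dbo.sat_product
(
    product_hk    BINARY(32)     NOT NULL,
    load_dts      DATETIME2(7)   NOT NULL,
    hash_diff     BINARY(32)     NOT NULL,
    product_name  NVARCHAR(200)  NOT NULL
)
WITH (DISTRIBUTION = HASH (product_hk), CLUSTERED COLUMNSTORE INDEX);

-- Overflow satellite: only this table pays the rowstore price.
CREATE TABLE dbo.sat_product_long_text
(
    product_hk   BINARY(32)     NOT NULL,
    load_dts     DATETIME2(7)   NOT NULL,
    hash_diff    BINARY(32)     NOT NULL,
    description  NVARCHAR(MAX)  NOT NULL
)
WITH (DISTRIBUTION = HASH (product_hk), HEAP);
```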


That’s a good idea!! Thanks
