Business Key Size in Hubs

With passive integration, the business key that we’re encountering can be either numeric or character. The size of the business key needs to accommodate the largest known size during the initial development to avoid revisiting it if another business key comes in from a different source that exceeds the current size. What I’m proposing is to go with a size (nvarchar(128)) upfront. One of the pushbacks that I’ve received is if querying the business key, the result will be extremely wide in the majority of the cases. My position is it’s much easier to use the Trim function than it is to extend the size of the column. From a performance and pragmatic perspective, which is the best answer?

if the varchar gets that wide… i mean to extend a varchar column isn’t a painful exercise. To change its data type is painful — migration, refactoring.

trim would have already been applied when the bkeys are loaded, so applying that on the way out wouldn’t see any benefit… right? Or are you thinking of the information mart that ingests the keys? Which in that case, and if it is physical no trimming would be required. If the IM were a view then every query on that view would be executing a trim…

Your assumption is correct. My focus is on pulling the data out of the DV. The size of 128 is what is indicated as being the length of the column name when executing a query that uses Information_Schema. Information_Schema also has a column that indicated the maximum column size (with MAX being a -1). My initial plan was to use that value until another source was added to the hub that had a size much larger (not 128 but still larger). Since you indicated that the column expansion is not a painful exercise, I’ll abandon my approach and go with expanding the column instead. Thank you for your insight.

Clay

yea no worries, Snowflake compresses every column — it sounds like you’re using Snowflake.
On another plaform this might hurt you but on Snowflake it shouldn’t

1 Like

Thanks Patrick for your insight.