Friday, 17 January 2014

Compare data size definition and actual data for string field

Most of the time we import data from file with default field length of 255. But once it is imported you might want to set it to correct size because of many reasons such as

  • For Constraint and keys. 
  • Indexing strategy implementation.
  • Improve string comparison.
  • Maintaining data base field sizing standards 
and more many other reason.

To do this we have to get the max length of current data and the compare it with database definition. We might have to do this for any existing table.

I have written one script which give the comparison of field size defined in table and actual data size. You can use this comparison  and write script to make the necessary correction.

You can also use this script with some modification to generate ALTER script if you have defined rule for field size.



DECLARE @query VARCHAR(MAX)
DECLARE @tableName VARCHAR(255) = '[dbo].[SomeTable]'
SELECT @query = 'SELECT ' + SUBSTRING(A, 4, LEN(A)) + ' FROM ' + @tableName
FROM
       (
              SELECT
                     (
                     SELECT ' , MAX(LEN(['+ name +'])) AS ['+ name +']'
                     FROM sys.columns
                     WHERE [object_id] = OBJECT_ID(@tableName)
                           AND TYPE_NAME(system_type_id) IN ('CHAR', 'NCHAR', 'VARCHAR', 'NVARCHAR')
                     FOR XML PATH ('')
                     ) AS A
       ) AS B

SELECT @query = ' SELECT A.ColumnName, UPPER(TYPE_NAME(B.system_type_id)) AS DataType, CASE WHEN TYPE_NAME(B.system_type_id) IN (''NCHAR'', ''NVARCHAR'') THEN B.max_length/2 ELSE B.max_length END AS MaxSizeByDefinition, A.MaxSizeByActualData
--Add here your script for ALTER query
       FROM ( ' + @query + ' ) AS SourceTable UNPIVOT (MaxSizeByActualData FOR ColumnName IN ( ' + SUBSTRING(A, 4, LEN(A)) + ' )) AS A
       INNER JOIN sys.columns AS B ON B.name = A.ColumnName AND B.[object_id] = OBJECT_ID(''' + @tableName + ''') ORDER BY B.column_id'
FROM
       (
              SELECT
                     (
                     SELECT ' , ['+ name + ']' FROM sys.columns C WHERE [object_id] = OBJECT_ID(@tableName) AND TYPE_NAME(system_type_id) IN ('CHAR', 'NCHAR', 'VARCHAR', 'NVARCHAR') FOR XML PATH ('')
                     ) AS A
       ) AS B

EXECUTE(@query)



1 comment: