SQL_Variant – Real world research

I came across a table for custom data fields that went something like this…

Columns:

PrimaryKey int
EmployeeKey int
Datatype varchar(20)
intData int
varcharData varchar(250)
boolData bit
numericData float (I laughed as well)
dataData datetime

(Ugh.)

I’m looking at archiving this data for long-term storage in a data mart I’ve been asked to build. So, I figured I’d take a look at the SQL_Variant datatype to see if I generated any space savings.

I created two temp tables – one with the data from the table, with no alternations and the same structure as the existing table (I wanted to discount the size of the indexes).  The second was the data in this structure:

PrimaryKey int
EmployeeKey int
Data SQL_Variant

With a small data set (22k records), the SQL_Variant table outperformed the current format by 1,088kb to 1,360kb.

With a larger data set (1,807k records), the SQL_Variant table outperformed the current format by 86,864 KB to 107,760 KB.

At this point, I’m thinking to myself that this is looking like a pretty good research, and I turn to Google to look for best practices and the like. I find a lot of interesting thoughts on the topic – with the general opinion being “don’t bother.”

I tried another path, this time with a table that looks like this:

PrimaryKey int
EmployeeKey int
Data varchar(250) (250 is the largest field in the source data)

I repeated my test at the larger data set (1,807k records) and varchar(250) beat out SQL_Variant and Native formats by coming in at 75,088 KB.

I’m fairly certain that there’s a point where SQL_Variant will beat out a varchar approach – perhaps varchar(1000)? Someday I’ll take a look at it – but for the time being this meets my needs.

This entry was posted in SQL Server Development and tagged , , . Bookmark the permalink.

Leave a Reply

Your email address will not be published. Required fields are marked *