Skip navigation

Row-Concatenation Solutions Aren't All Equal

Downloads
96784.zip

 Executive Summary:

There are many ways to concatenate row values in Microsoft SQL Server databases. With the hope of finding the most efficient row-concatenation solution, a SQL Server Magazine reader tested two common approaches: using a self-reference variable within a SELECT clause and using FOR XML. He found that although these solutions produce the same result, the time it takes to get those results varies significantly.

Let's say you have a table with a column named color_name and you want to store all the row values as a list of comma-delimited values. For example, if the column contains the row values of Yellow, Blue, and Red, the final result would look like: @cv = \{Yellow,Blue,Red\}

There are many ways to concatenate row values to get results like this. Although the various solutions produce the same result, the time it takes to get those results can vary significantly. With the hope of finding the most efficient row-concatenation solution, I tested two common approaches: using a self-reference variable within a SELECT clause and using FOR XML. See also, "Using T-SQL to Concatenate Data Into a Variable."

I began by running the SampleData.sql script in Listing 1 to create a table named T1 and populate it with 50,000 rows. (If you run this code so that you can test the solutions on your system, note that SampleData.sql takes a few minutes to run.)

Next, I turned on the Discard results after execution option in SQL Server Management Studio (SSMS) so that the time it took to generate the output wouldn't be taken into consideration. After enabling that option, I cleared the data cache and ran Solution1.sql, which Listing 2 shows, to concatenate rows by using a self-reference variable within a SELECT clause. It took 10 seconds for this code to run on my system.

It's worth noting that some people prefer to use the following two system-defined functions to concatenate rows:

DECLARE @cv varchar(MAX)
SELECT @cv =
COALESCE(@cv + ',', ') +
ISNULL(col1,'<NULL>') FROM T1
SELECT @cv

This code took even longer to run on my system (16 seconds) because of the functions' overhead.

I then tested Solution2.sql, which Listing 3 shows, to concatenate rows by using FOR XML. It took less than 1 second for this code to run on my system, which is 10 times faster than the first solution.

The first solution is not only slower but also leads to undefined results. This is due to the kind of assignment used within the SELECT clause. As Microsoft's Eugene Zabokritski notes, "The results of an assignment in the SELECT list... depend on what plan the optimizer chooses to execute the query, so technically speaking, the results are undefined." (For more information about undefined results, go to http://connect.microsoft.com/SQLServer/feedback/ViewFeed back.aspx?FeedbackID=126129.)

When you have a small number of row values, it probably doesn't matter which row-concatenation solution you choose. However, when you're working with tables that contain a large amount of data, consider using FOR XML to concatenate row values. See also, "Custom Aggregations: The Pivoting Technique."

TAGS: SQL
Hide comments

Comments

  • Allowed HTML tags: <em> <strong> <blockquote> <br> <p>

Plain text

  • No HTML tags allowed.
  • Web page addresses and e-mail addresses turn into links automatically.
  • Lines and paragraphs break automatically.
Publish