COALESCE vs. ISNULL

The COALESCE and ISNULL T-SQL functions are used to return the first nonnull expression among the input arguments. SQL Server practitioners often wonder what the difference is between the two functions. Several differences exist; some are straightforward and are common knowledge, whereas others are less straightforward and are less well-known.

Some of the examples that I employ in this article use a sample database called TSQL2012. Download the code to create and populate the sample database.

I'd like to thank Brad Schulz, Erland Sommarskog, Paul White, and Umachandar Jayachandran (UC), who were the source of and inspiration for some of the information covered in this article.

COALESCE and ISNULL

According to SQL Server Books Online, COALESCE "returns the first nonnull expression among its arguments," and ISNULL "replaces NULL with the specified replacement value." As a simple example, the following code demonstrates using the two functions:

SET NOCOUNT ON;
USE TSQL2012; -- this database is used in later examples

DECLARE
  @x AS INT = NULL,
  @y AS INT = 1759,
  @z AS INT = 42;

SELECT COALESCE(@x, @y, @z);
SELECT ISNULL(@x, @y);

When you run this code, both functions return the integer 1759.

One apparent advantage that COALESCE has over ISNULL is that it supports more than two inputs, whereas ISNULL supports only two. Another advantage of COALESCE is that it's a standard function (namely, defined by the ISO/ANSI SQL standards), whereas ISNULL is T-SQL–specific. These differences between the two functions are fairly straightforward.

Next, I discuss some additional differences, some of which are more subtle. In some respects, ISNULL has advantages over COALESCE.

Data Type of Expression

COALESCE and ISNULL differ in how they determine the data type of the resulting expression.

The data type of a COALESCE expression is the data type of the input argument with the highest data type precedence. If all inputs are the untyped NULL literal, you get an error.

The data type of an ISNULL expression is the data type of the first input. If the first input is an untyped NULL literal, the data type of the result is the type of the second input. If both inputs are the untyped literal, the type of the output is INT.

As an example, consider the following code and its output, which Table 1 shows.

DECLARE
  @x AS VARCHAR(3) = NULL,
  @y AS VARCHAR(10) = '1234567890';

SELECT
  COALESCE(@x, @y) AS COALESCExy, COALESCE(@y, @x)
    AS COALESCEyx,
  ISNULL(@x, @y) AS ISNULLxy, ISNULL(@y, @x)
    AS ISNULLyx;

COALESCExy	COALESCEyx	ISNULLxy	ISNULLyx
----------	----------	--------	----------
1234567890	1234567890	123	    1234567890

Notice that with COALESCE, regardless of which input is specified first, the type of the output is VARCHAR(10)—the one with the higher precedence. However, with ISNULL, the type of the output is determined by the first input. So when the first input is of a VARCHAR(3) data type (the expression aliased as ISNULLxy), the output is VARCHAR(3). As a result, the returned value that originated in the input @y is truncated after three characters.

In the last example, all inputs are character strings, just of different lengths. What if the inputs are of different data type families? The same rules still apply. For example, the following COALESCE expression accepts a character string as the first input and an integer as a second input:

SELECT COALESCE('abc', 1);

An integer has a higher data type precedence than a character string; therefore, SQL Server tries to convert the value 'abc' to an integer and naturally fails. This code generates the following conversion error:

Msg 245, Level 16, State 1, Line 1
Conversion failed when converting the varchar value
'abc' to data type int.

Try a similar expression, using ISNULL instead:

SELECT ISNULL('abc', 1);

Now the returned type is based on the first input, and therefore SQL Server doesn't try to convert the character string. So you get the output 'abc'.

As for untyped NULL literal inputs, as I mentioned, if all inputs to COALESCE are the untyped NULL literal, you get an error. To demonstrate this, run the following code:

SELECT COALESCE(NULL, NULL);

You get the following error:

Msg 4127, Level 16, State 1, Line 1
     At least one of the arguments to COALESCE must be an
     expression that is not the NULL constant.

As long as there's a typed NULL in the input, the code succeeds:

SELECT COALESCE(CAST(NULL AS INT), NULL);

You get a NULL typed as an integer as the output.

The ISNULL function does allow both inputs to be untyped NULLs, in which case it returns a NULL typed as an integer as the output. To demonstrate this, run the following code:

SELECT ISNULL(NULL, NULL);

You get a NULL typed asan integer as the output. To prove this, run the following code:

SELECT ISNULL(NULL, NULL) AS col1
INTO dbo.T1;

SELECT TYPE_NAME(user_type_id)
FROM sys.columns
WHERE object_id = OBJECT_ID('dbo.T1')
  AND name = N'col1';

You get 'int' as output.

When you're done, run the following code for cleanup:

DROP TABLE dbo.T1;

Nullability of Expression

If you need to store the result of a COALESCE or ISNULL expression in a table, you probably care whether the resulting column is nullable or not. The way nullability of the expression is determined is different for the two functions. To demonstrate the difference, I'll use a table called T0, which the following code creates and populates:

SELECT CAST(NULL AS INT) AS col1 INTO dbo.T0;

With COALESCE, the resulting column is defined as NOT NULL only if all expressions are nonnullable and NULL otherwise. With ISNULL, the resulting column is defined as NOT NULL if any expression is nonnullable and NULL if both are nullable. As an example, run the following code:

SELECT
  COALESCE(1, 2) AS COALESCE_1_2,
  COALESCE(col1, 0) AS COALESCE_col1_0,
  ISNULL(col1, 0) AS ISNULL_col1_0
INTO dbo.T1
FROM dbo.T0;

Then, check the nullability of the resulting columns by running the following query:

SELECT
  COLUMNPROPERTY(OBJECT_ID('dbo.T1'), 'COALESCE_1_2',
    'AllowsNull') AS COALESCE_1_2,
  COLUMNPROPERTY(OBJECT_ID('dbo.T1'), 'COALESCE_col1_0',
    'AllowsNull') AS COALESCE_col1_0,
  COLUMNPROPERTY(OBJECT_ID('dbo.T1'), 'ISNULL_col1_0',
    'AllowsNull') AS ISNULL_col1_0;

You get the output shown in Table 2.

COALESCE_1_2	COALESCE_col1_0	ISNULL_col1_0
------------	---------------	-------------
0	    1	    0

The expression COALESCE(1, 2) resulted in a nonnullable column because both inputs were nonnullable. The expression COALESCE(col1, 0) resulted in a nullable column because one of the expressions was nullable (col1). The expression ISNULL(col1, 0) resulted in a nonnullable column because one of the expressions was nonnullable (0).

When you're done, run the following code for cleanup:

DROP TABLE dbo.T0, dbo.T1;

Used with Subqueries

The ISNULL function has an important advantage over COALESCE in that internally it doesn't evaluate an input expression more than once. In accordance with standard SQL, COALESCE(v1, v2) is simply internally translated to CASE WHEN v1 IS NOT NULL THEN v1 ELSE v2 END. As a result, SQL Server might evaluate the expression v1 more than once, which can lead to all kinds of surprising results. This section and the next section describe a couple of interesting examples that are affected by this behavior.

Suppose that you need to write an expression that returns the result of a subquery when it's not NULL and returns zero otherwise. Before trying different solutions, run the following code in order to request I/O statistics for queries in your session:

SET STATISTICS IO ON;

Run the following query (call it Query 1):

SELECT SUM(qty) FROM Sales.OrderDetails;

Observe the plan for Query 1 in Figure 1 (the first of the three plans in the figure).

COALESCE vs. ISNULL with Subqueries

Figure 1: COALESCE vs. ISNULL with Subqueries

The clustered index of the OrderDetails table is scanned once, resulting in a scan count of 1 and a logical reads count of 11. Next, try using COALESCE to implement the task at hand by running the following query (call it Query 2):

SELECT COALESCE( (SELECT SUM(qty) FROM Sales.OrderDetails), 0 );

Examine the plan for Query 2 in Figure 1 and observe that the clustered index of the OrderDetails table is scanned twice. This results in a scan count of 2 and a logical reads count of 22. This is a direct result of the fact that COALESCE was internally translated to a CASE expression that refers to the subquery twice—once to check if its result is NULL and again to return the result because it isn't NULL.

Implement a solution using ISNULL by running the following query (call it Query 3):

SELECT ISNULL( (SELECT SUM(qty) FROM Sales.OrderDetails), 0 );

Examine the plan for Query 3 in Figure 1 and observe that the clustered index of the OrderDetails table is scanned only once. This results in a scan count of 1 and a logical reads count of 11.

In this particular case, you could first store the result of the subquery in a variable and use the variable in the COALESCE expression, but that's not always possible. For example, consider a case where the subquery is a correlated one appearing in some outer query against another table. So from this perspective, ISNULL is preferred to COALESCE.

When you're done, run the following code to turn off reporting I/O statistics in the session:

SET STATISTICS IO OFF;

Atomicity vs. Isolation

This section covers another example for the implications of SQL Server's conversion of a COALESCE expression to a CASE expression. Before I present the example, try to think of a scenario in which the expression COALESCE( , 0 ) returns a NULL.

Now let's look at the example. Open two connections to SQL Server (call them connection 1 and connection 2). Run the following code in connection 1:

CREATE TABLE dbo.T1(col1 INT);
INSERT INTO dbo.T1 VALUES(42);

BEGIN TRAN
  INSERT INTO dbo.T1 VALUES(1759);

Then, run the following code in connection 2:

SELECT COALESCE( (SELECT SUM(col1) FROM dbo.T1), 0 );

Back in connection 1, run the following code:

  DELETE FROM dbo.T1;
COMMIT TRAN

DROP TABLE dbo.T1;

Connection 2 returns a NULL. Can you explain how this could happen?

In order to explain what happened, you need to examine the execution plan for the query in connection 2 shown in Figure 2.

COALESCE and Atomicity

Figure 2: COALESCE and Atomicity

The COALESCE expression in the query in connection 2 was internally translated to the following CASE expression:

CASE
  WHEN (SELECT SUM(col1) FROM dbo.T1) IS NOT NULL
    THEN (SELECT SUM(col1) FROM dbo.T1)
  ELSE 0
END

The upper branch of the query plan calculates the result of the query in the WHEN clause of the CASE expression. It reads the row with the value 42 and then blocks, because the row with the value 1759 is exclusively locked. Then, connection 1 deletes all rows from the table and commits, allowing the query to complete. The resulting sum 42 is assigned to the variable Expr1004. Next, the bottom branch of the query plan calculates the result of the query in the THEN clause of the CASE expression. The query returns a NULL because there are no rows in the table at this point, and the NULL result is assigned to the variable Expr1010. Now the CASE expression looks like this:

CASE WHEN Expr1004 IS NOT NULL THEN Expr1010 ELSE 0 END

Or, with constants:

CASE WHEN Expr1004 IS NOT NULL THEN Expr1010 ELSE 0 END

You might think that the atomicity of the SELECT transaction is broken here, but it isn't—you still get an all-or-nothing behavior. Under the default isolation level read committed, there's no guarantee that the code will interact with a frozen state of the data. To achieve such behavior, you need to request to work with the serializable isolation level. For example, repeat the same test that you just ran, but this time request to use the serializable isolation level in the query in connection 2, like so:

SELECT COALESCE( (SELECT SUM(col1) FROM dbo.T1 WITH
  (SERIALIZABLE)), 0 );

This time you get the output 0. Examine the plan for the query in connection 2 shown in Figure 3.

COALESCE and Serializable Isolation

Figure 3: COALESCE and Serializable Isolation

This time the query in the WHEN clause blocks before any rows are scanned waiting for the delete transaction to complete. Consequently, both branches of the plan find zero rows in the table, and result in a NULL. Therefore, the CASE expression looks like this:

CASE WHEN NULL IS NOT NULL THEN NULL ELSE 0 END

Try again, but this time using ISNULL instead of COALESCE:

SELECT ISNULL( (SELECT SUM(col1) FROM dbo.T1), 0 );

This time you get the output 42. By examining the plan in Figure 4, you can understand why.

ISNULL and Atomicity

Figure 4: ISNULL and Atomicity

The query was executed only once. It blocked after reading the row with the value 42. After the delete transaction committed, there were no more rows to read; hence the sum was computed as 42. Because that sum wasn't NULL, the value 42 was returned.

Significant Differences

On the surface, it might seem that there are only a couple of minor differences between the COALESCE and ISNULL functions. In practice, there are quite a few differences—some of which are significant. This article describes not only the minor and straightforward differences but also the more complex and significant ones. Some of the differences make the COALESCE function preferable, whereas others make ISNULL preferable. Armed with this knowledge, you can make an educated decision about which function to use based on your priorities.

Comments

Plain text