Divide and Conquer Halloween

The classic algorithm design paradigm known as Divide and Conquer (D&C) takes a problem and reduces it by dividing it into two or more sub-problems in a recursive/iterative manner. A variation called Decrease and Conquer reduces the original problem into one sub-problem—again, in a recursive/iterative manner.

There's a common recursive/iterative solution pattern used in SQL Server to handle graph-related tasks that could be considered as being based on a Decrease and Conquer algorithm. Examples of such tasks include returning a subgraph (find subordinates of a manager, perform a parts explosion in a Bill of Materials, or BOM), returning a path (find managers of an employee), and others. The common solution pattern involves iterating through the graph levels, and with the help of a temporary table or a table variable, handling one level at a time.

Recently I observed that query plans of solutions based on this pattern contain extra work to deal with Halloween protection. (As background, I recommend reading Paul White's excellent series on the topic of Halloween protection.) When the source and target of a modification are the same and there's potential for the same rows to be handled more than once (e.g., updated multiple times, read and inserted multiple times), the optimizer uses Halloween protection to avoid the problem. Such Halloween protection can sometimes involve extra work added to the plan that otherwise wouldn't be necessary.

I wondered whether I could come up with an alternative generic pattern that removes the need for Halloween protection, and I managed to find one. I split the handling of each level into two parts, alternating between two temporary tables/table variables—thus the source and target are always different. Before I wrote this article that explains the technique, I described it to my wife and asked for her advice regarding how to name the pattern. Interestingly, she intuitively said "Divide and Conquer," without any prior knowledge of the algorithm design pattern of the same name. I liked her suggestion; therefore, I'll refer to the technique as Divide and Conquer Halloween.

The Problem

I'll use a generic representation of a graph for my sample data. Use the code in Listing 1 to create a helper function called GetNums that returns a sequence of integers in a requested range. This function is used to help create the sample data.

SET NOCOUNT ON;
USE tempdb;
GO

IF OBJECT_ID(N'dbo.GetNums', N'IF') IS NOT NULL DROP FUNCTION dbo.GetNums;
GO

CREATE FUNCTION dbo.GetNums(@low AS BIGINT, @high AS BIGINT) RETURNS TABLE
AS
RETURN
  WITH
    L0   AS (SELECT c FROM (SELECT 1 UNION ALL SELECT 1) AS D(c)),
    L1   AS (SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),
    L2   AS (SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),
    L3   AS (SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),
    L4   AS (SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B),
    L5   AS (SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B),
    Nums AS (SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS rownum
         FROM L5)
  SELECT TOP(@high - @low + 1) @low + rownum - 1 AS n
  FROM Nums
  ORDER BY rownum;
GO

Run the code in Listing 2 to create a table called Graph and populate it with 10,000,000 nodes. The degree of this graph (maximum depth) is 7.

IF OBJECT_ID(N'dbo.Graph', N'U') IS NOT NULL DROP TABLE dbo.Graph;
GO

CREATE TABLE dbo.Graph
(
  nodeid   INT NOT NULL,
  parentid INT NULL,
  val      NUMERIC(12, 2)
);

INSERT INTO dbo.Graph WITH(TABLOCK) (nodeid, parentid, val)
SELECT
  n AS nodeid,
  NULLIF((n-1) / 10, 0) AS parentid,
  CAST(1 + ABS(CHECKSUM(NEWID())) % 100 AS NUMERIC(12, 2)) AS val
FROM dbo.GetNums(1, 10000000) AS Nums;

CREATE UNIQUE CLUSTERED INDEX idx_pid_nid ON dbo.Graph(parentid, nodeid);
ALTER TABLE dbo.Graph ADD CONSTRAINT PK_Graph PRIMARY KEY NONCLUSTERED(nodeid);

Suppose you need to implement a solution for a subgraph request. Given an input node @root, return the nodes in the subgraph of @root. For our purposes, in order not to generate very large outputs, instead of returning all individual nodes in the subgraph, compute the sum of the values in their val column and their count. Note that the sample data I populated in the table Graph represents a graph that's a tree, meaning that only one path can lead to each node; however, the techniques I describe in the article are just as applicable to the more generic case known as a directed acyclic graph (DAG), meaning that multiple paths can lead to a node. A typical way to handle such a subgraph request is to use a recursive query such as the one shown in Listing 3.

DECLARE @root AS INT = 5;

WITH C AS
(
  SELECT nodeid, parentid, val
  FROM dbo.Graph
  WHERE nodeid = @root

  UNION ALL

  SELECT CUR.nodeid, CUR.parentid, CUR.val
  FROM C AS PRV
    INNER JOIN dbo.Graph AS CUR
  ON CUR.parentid = PRV.nodeid
)
SELECT COUNT(*) AS cnt, SUM(val) AS total_val
FROM C;

Figure 1 shows the execution plan for this query. You'll notice in the plan that SQL Server creates an Index Spool (a B-tree-based work table) where it stores the intermediate result sets. Using a Nested Loops operator, the plan iteratively processes one level of nodes at a time. In each iteration, it reads from the spool to collect the nodes from the previous level, joining to the Graph table to collect the nodes from the current level, and inserts those nodes into the spool.

Figure 1: Plan for Solution in Listing 3

The problem with this solution is that you can control only the optimization aspects of the user table Graph in terms of indexing, statistics, and so on. You have no say in terms of the work table's indexing and statistics, or any other aspect of its handling. The end result is a slow-running solution. The solution took 19 seconds to complete on my system for the input @root = 5, performing a total of 13,337,269 logical reads (measured with an Extended Events session).

Despite the elegance and conciseness of a recursive query, you'll tend to get better performance implementing the same algorithm with a loop and a temporary table or a table variable, as the code in Listing 4 demonstrates. SQL Server maintains distribution statistics (histograms) on temporary tables but not on table variables; therefore, estimates tend to be better with temporary tables.

DECLARE @root AS INT = 5;

CREATE TABLE #T
(
  lvl      INT        NOT NULL,
  nodeid   INT        NOT NULL,
  parentid INT        NULL,
  val      NUMERIC(12, 2) NOT NULL,
  UNIQUE CLUSTERED(lvl, nodeid)
);

DECLARE @lvl AS INT = 0;

-- insert root node
INSERT INTO #T(lvl, nodeid, parentid, val)
  SELECT @lvl, nodeid, parentid, val
  FROM dbo.Graph
  WHERE nodeid = @root;

WHILE @@ROWCOUNT > 0
BEGIN
  SET @lvl += 1;

  -- insert children of nodes in prev level
  INSERT INTO #T(lvl, nodeid, parentid, val)
    SELECT @lvl, CUR.nodeid, CUR.parentid, CUR.val
    FROM #T AS PRV
  INNER JOIN dbo.Graph AS CUR
    ON PRV.lvl = @lvl - 1
    AND CUR.parentid = PRV.nodeid;
END

SELECT COUNT(*) AS cnt, SUM(val) AS total_val
FROM #T;

DROP TABLE #T;

Figure 2 shows the plans for the queries in the solution.

Figure 2: Plan for Solution in Listing 4

The first plan represents the insertion of the first node into the temp table. The next seven plans represent the iterative insertions of the "next level." Finally, the last plan represents the query computing the sum and the count. Observe the highlighted Table Spool (Eager Spool) operators in the plans for the first six iterations of the INSERT statement. The Eager Spool is a work table that eagerly (completely) consumes all input rows before passing them on to the operator to the left of it. It's a blocking (stop-and-go) operator.

SQL Server uses a blocking operator for Halloween protection purposes—to ensure that data that was read from and written to the same table won't be read more than once. The Eager Spool naturally adds work, mainly I/O activity in tempdb, latches, and so on—resources that often tend to be in contention in SQL Server. The plan for the seventh iteration of the INSERT statement doesn't include an Eager Spool operator for Halloween protection purposes. That’s because the Hash Match operator is a blocking operator for the build input, and hence is already providing the needed protection.

This solution ran for 8 seconds on my system, performing 4,232,753 logical reads. The work done by the Eager Spool operators for Halloween protection purposes accounts for a few hundreds of thousands of those reads. I was intrigued by the challenge to try and come up with a generic solution pattern for such tasks, but one that somehow avoids the need for Halloween protection and its associated overhead.

The Solution: Divide and Conquer Halloween

As I mentioned, the potential for Halloween protection is only there when the source and target are one and the same. To avoid the need for Halloween protection, I use two temporary tables rather than one, alternating between them in each iteration of the loop. Based on the parity of the level counter (@lvl), I know which of the two to use as the source and which as the target. Then the final query that computes the sum and count simply applies the calculations to the unified sets from both tables. Listing 5 contains the complete solution.

DECLARE @root AS INT = 5;

CREATE TABLE #T1
(
  lvl      INT        NOT NULL,
  nodeid   INT        NOT NULL,
  parentid INT        NULL,
  val      NUMERIC(12, 2) NOT NULL,
  UNIQUE CLUSTERED(lvl, nodeid)
);

CREATE TABLE #T2
(
  lvl      INT        NOT NULL,
  nodeid   INT        NOT NULL,
  parentid INT        NULL,
  val      NUMERIC(12, 2) NOT NULL,
  UNIQUE CLUSTERED(lvl, nodeid)
);

DECLARE @lvl AS INT = 0;

-- insert root node
INSERT INTO #T1(lvl, nodeid, parentid, val)
  SELECT @lvl, nodeid, parentid, val
  FROM dbo.Graph
  WHERE nodeid = @root;

WHILE @@ROWCOUNT > 0
BEGIN
  SET @lvl += 1;

  -- insert children of nodes in prev level
  IF @lvl % 2 = 1
    INSERT INTO #T2(lvl, nodeid, parentid, val)
  SELECT @lvl, CUR.nodeid, CUR.parentid, CUR.val
  FROM #T1 AS PRV
    INNER JOIN dbo.Graph AS CUR
      ON PRV.lvl = @lvl - 1
      AND CUR.parentid = PRV.nodeid;
  ELSE
    INSERT INTO #T1(lvl, nodeid, parentid, val)
    SELECT @lvl, CUR.nodeid, CUR.parentid, CUR.val
    FROM #T2 AS PRV
  INNER JOIN dbo.Graph AS CUR
    ON PRV.lvl = @lvl - 1
    AND CUR.parentid = PRV.nodeid;
END

SELECT COUNT(*) AS cnt, SUM(val) AS total_val
FROM (SELECT * FROM #T1 UNION ALL SELECT * FROM #T2) AS U;

DROP TABLE #T1, #T2;

Figure 3 shows the execution plans for the queries in this solution. Observe that Halloween protection with an Eager Spool was completely removed from all plans. This solution ran for 6 seconds on my system, reducing the number of logical reads by over half a million to 3,651,072.

Figure 3: Plan for Solution in Listing 5

Avoid Unnecessary Costs

In this article, I described a common pattern used in iterative T-SQL solutions to handle graph-related tasks and other tasks. The problem with this pattern is that you iteratively read from and write to the same temporary table, requiring SQL Server to employ Halloween protection techniques, which come at a cost. I provided a modified pattern that I referred to as Divide and Conquer Halloween, in which I alternate between two temporary tables, using a different source and target in each iteration. Because the solution never uses the same table as both the source and the target, SQL Server doesn't need to add any Halloween protection controls, thereby avoiding unnecessary costs.

Comments

Plain text