When you use the HIERARCHYID data type to represent graphs, the values can become long in certain cases. With very deep graphs this is natural, since a HIERARCHYID value represents the path of all nodes leading to the current node, starting with the root node. However, in certain cases the path can become long even when the graph is not very deep. First I’ll explain the circumstances in which this can happen, and then I’ll provide a solution for normalizing the values, making them shorter.
HIERARCHYID values can become long when you keep adding new nodes between two existing nodes that have consecutive values in the last numeric element of their canonical paths. As an example, say you have two nodes with the canonical paths /1/ and /2/, and you add a node between them. You get a new value whose canonical path is /1.1/. Now add a value between /1.1/ and /2/ and you get /1.2/. Now add a value between /1.1/ and /1.2/ and you get /1.1.1/. As you can see, if you keep adding nodes between two existing nodes in this manner, you can end up with lengthy paths, and therefore lengthy HIERARCHYID values, even when the graph is not deep.
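The sequence just described can be reproduced with a short T-SQL batch. This is only a sketch: the variable names (@root, @n1, @n2, @a, @b, @c) are illustrative and do not come from the code that follows in this article, and the two starting nodes are parsed directly from their canonical paths instead of being read from a table.

```sql
-- Illustrative sketch; variable names are assumptions, not part of the article's code.
DECLARE @root AS HIERARCHYID = HIERARCHYID::GetRoot();
DECLARE @n1   AS HIERARCHYID = HIERARCHYID::Parse('/1/');
DECLARE @n2   AS HIERARCHYID = HIERARCHYID::Parse('/2/');

-- New child of the root between /1/ and /2/
DECLARE @a AS HIERARCHYID = @root.GetDescendant(@n1, @n2);
-- New child between the node just created and /2/
DECLARE @b AS HIERARCHYID = @root.GetDescendant(@a, @n2);
-- New child between the two nodes just created
DECLARE @c AS HIERARCHYID = @root.GetDescendant(@a, @b);

-- Canonical paths as described in the text: /1.1/, /1.2/, /1.1.1/
SELECT @a.ToString() AS a, @b.ToString() AS b, @c.ToString() AS c,
       @c.GetLevel() AS lvl_of_c; -- still level 1: the path grew, the depth did not
```

Note that all three new nodes remain direct children of the root; only the last element of the path gets longer.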
If order among siblings is not important, you can always add child nodes after the last child, or always before the first, and this way the paths will be more economical. But when order among siblings matters, you can’t control this. If you frequently add new nodes between existing ones, you may end up with very long HIERARCHYID values. In such a case, you can periodically run the procedure I provide here, which normalizes the HIERARCHYID values for the whole graph, making them shorter.
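For contrast, here is a minimal sketch of the economical case, where every new child is added after the current last child. The variable names are again illustrative assumptions; passing NULL as the second argument of GetDescendant asks for a child positioned after the first argument.

```sql
-- Illustrative sketch; variable names are assumptions, not part of the article's code.
DECLARE @root AS HIERARCHYID = HIERARCHYID::GetRoot();

-- First child: no existing siblings on either side
DECLARE @c1 AS HIERARCHYID = @root.GetDescendant(NULL, NULL);
-- Each later child goes after the current last child
DECLARE @c2 AS HIERARCHYID = @root.GetDescendant(@c1, NULL);
DECLARE @c3 AS HIERARCHYID = @root.GetDescendant(@c2, NULL);

-- Paths stay at a single numeric element: /1/, /2/, /3/
SELECT @c1.ToString() AS c1, @c2.ToString() AS c2, @c3.ToString() AS c3;
```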
Use the following code to create a table called Employees and populate it with sample data:
USE tempdb;
CREATE TABLE dbo.Employees
(
  empid   INT         NOT NULL,
  hid     HIERARCHYID NOT NULL,
  lvl AS hid.GetLevel() PERSISTED,
  empname VARCHAR(25) NOT NULL
);
CREATE UNIQUE CLUSTERED INDEX idx_depth_first ON dbo.Employees(hid);
CREATE UNIQUE INDEX idx_breadth_first ON dbo.Employees(lvl, hid);
CREATE UNIQUE INDEX idx_empid ON dbo.Employees(empid);
GO
CREATE PROC dbo.AddEmp
  @empid      AS INT,
  @mgrid      AS INT,
  @leftempid  AS INT,
  @rightempid AS INT,
  @empname    AS VARCHAR(25)
AS
DECLARE @hid AS HIERARCHYID;
IF @mgrid IS NULL
  SET @hid = HIERARCHYID::GetRoot();
ELSE
  SET @hid = (SELECT hid FROM dbo.Employees WHERE empid = @mgrid).GetDescendant
    ( (SELECT hid FROM dbo.Employees WHERE empid = @leftempid),
      (SELECT hid FROM dbo.Employees WHERE empid = @rightempid) );
INSERT INTO dbo.Employees(empid, hid, empname)