First Look at System-Versioned Temporal Tables-Part 2: Querying Data and Optimization Considerations

This article is the second part in a two-part series about system-versioned temporal tables—a new feature introduced in Microsoft SQL Server 2016. Part 1 covered what system-versioned temporal tables are, how to create them and how to modify data in them. This article focuses on querying data and optimization considerations.

I’d like to thank several people who participated in discussions on the topic for their insights. From Microsoft: Borko Novakovic (the feature’s PM), Carl Rabeler and Conor Cunningham. Fellow SQL Server MVPs: Hugo Kornelis, Simon Sabin, Matija Lah, Phil Brammer, Louis Davidson, Jeffrey Moden and Erland Sommarskog.

Note: At the date of this writing the latest available public build of SQL Server 2016 is CTP2. Make sure you check the official product documentation for information about any changes and additions in later builds.

Recreating sample data from Part 1

If you want to run the queries from the article and get the same results as in my examples, you need your tables to have the same sample data as in mine. Use the code in Listing 1 to create the current and history tables (Employees and EmployeesHistory), and populate them with the sample data.

Listing 1: Create sample data

-- Create TemporalDB database and drop tables if exist
SET NOCOUNT ON;
IF DB_ID(N'TemporalDB') IS NULL CREATE DATABASE TemporalDB;
GO
USE TemporalDB;
GO
IF OBJECT_ID(N'dbo.Employees', N'U') IS NOT NULL
BEGIN
  IF OBJECTPROPERTY(OBJECT_ID(N'dbo.Employees', N'U'), N'TableTemporalType') = 2
    ALTER TABLE dbo.Employees SET ( SYSTEM_VERSIONING = OFF );
  IF OBJECT_ID(N'dbo.EmployeesHistory', N'U') IS NOT NULL
    DROP TABLE dbo.EmployeesHistory;
  DROP TABLE dbo.Employees;
END;
GO

-- Create and populate Employees table
CREATE TABLE dbo.Employees
(
  empid INT NOT NULL
    CONSTRAINT PK_Employees PRIMARY KEY NONCLUSTERED,
  mgrid INT NULL
    CONSTRAINT FK_Employees_mgr_emp REFERENCES dbo.Employees,
  empname VARCHAR(25) NOT NULL,
  sysstart DATETIME2(0) NOT NULL,
  sysend DATETIME2(0) NOT NULL
);

CREATE UNIQUE CLUSTERED INDEX ix_Employees
  ON dbo.Employees(empid, sysstart, sysend);

INSERT INTO dbo.Employees(empid, mgrid, empname, sysstart, sysend) VALUES
  (1 , NULL, 'David'  , '2015-06-01 19:54:04', '9999-12-31 23:59:59'),
  (2 , 1   , 'Eitan'  , '2015-06-01 19:54:04', '9999-12-31 23:59:59'),
  (3 , 1   , 'Ina'    , '2015-06-01 20:01:41', '9999-12-31 23:59:59'),
  (4 , 2   , 'Seraph' , '2015-06-01 19:54:20', '9999-12-31 23:59:59'),
  (5 , 2   , 'Jiru'   , '2015-06-01 19:54:20', '9999-12-31 23:59:59'),
  (6 , 3   , 'Steve'  , '2015-06-01 21:32:20', '9999-12-31 23:59:59'),
  (7 , 4   , 'Aaron'  , '2015-06-01 21:32:20', '9999-12-31 23:59:59'),
  (8 , 5   , 'Lilach' , '2015-06-01 20:01:41', '9999-12-31 23:59:59'),
  (9 , 4   , 'Rita'   , '2015-06-01 21:32:20', '9999-12-31 23:59:59'),
  (10, 5   , 'Sean'   , '2015-06-01 20:01:41', '9999-12-31 23:59:59'),
  (11, 6   , 'Gabriel', '2015-06-01 21:32:20', '9999-12-31 23:59:59');

-- Create and populate EmployeesHistory table
CREATE TABLE dbo.EmployeesHistory
(
  empid INT NOT NULL,
  mgrid INT NULL,
  empname VARCHAR(25) NOT NULL,
  sysstart DATETIME2(0) NOT NULL,
  sysend DATETIME2(0) NOT NULL
);

CREATE CLUSTERED INDEX ix_EmployeesHistory
  ON dbo.EmployeesHistory(empid, sysstart, sysend)
  WITH (DATA_COMPRESSION = PAGE);

INSERT INTO dbo.EmployeesHistory(empid, mgrid, empname, sysstart, sysend) VALUES
(6 , 2, 'Steve'  , '2015-06-01 19:54:20', '2015-06-01 21:32:20'),
(7 , 3, 'Aaron'  , '2015-06-01 20:01:41', '2015-06-01 21:32:20'),
(9 , 7, 'Rita'   , '2015-06-01 20:01:41', '2015-06-01 20:11:01'),
(9 , 3, 'Rita'   , '2015-06-01 20:11:01', '2015-06-01 21:32:20'),
(11, 7, 'Gabriel', '2015-06-01 20:01:41', '2015-06-01 20:11:01'),
(11, 3, 'Gabriel', '2015-06-01 20:11:01', '2015-06-01 21:32:20'),
(12, 9, 'Emilia' , '2015-06-01 20:01:41', '2015-06-01 21:32:20'),
(13, 9, 'Michael', '2015-06-01 20:01:41', '2015-06-01 20:11:01'),
(14, 9, 'Didi'   , '2015-06-01 20:01:41', '2015-06-01 20:11:01');

-- Enable system versioning
ALTER TABLE dbo.Employees ADD
  PERIOD FOR SYSTEM_TIME (sysstart, sysend);

ALTER TABLE dbo.Employees
  SET ( SYSTEM_VERSIONING = ON ( HISTORY_TABLE = dbo.EmployeesHistory ) );

Run the following code to show the contents of the current table:

SELECT *
FROM dbo.Employees;

Table 1 shows the output of this query.

Table 1: Contents of Employees table

empid      mgrid      empname  sysstart           sysend
---------- ---------- -------- ---------------------- ----------------------
1      NULL       David    2015-06-01 19:54:04    9999-12-31 23:59:59
2      1      Eitan    2015-06-01 19:54:04    9999-12-31 23:59:59
3      1      Ina      2015-06-01 20:01:41    9999-12-31 23:59:59
4      2      Seraph   2015-06-01 19:54:20    9999-12-31 23:59:59
5      2      Jiru     2015-06-01 19:54:20    9999-12-31 23:59:59
6      3      Steve    2015-06-01 21:32:20    9999-12-31 23:59:59
7      4      Aaron    2015-06-01 21:32:20    9999-12-31 23:59:59
8      5      Lilach   2015-06-01 20:01:41    9999-12-31 23:59:59
9      4      Rita     2015-06-01 21:32:20    9999-12-31 23:59:59
10     5      Sean     2015-06-01 20:01:41    9999-12-31 23:59:59
11     6      Gabriel  2015-06-01 21:32:20    9999-12-31 23:59:59

Run the following code to show the contents of the history table:

SELECT *
FROM dbo.EmployeesHistory;

Table 2 shows the output of this query.

Table 2: Contents of EmployeesHistory table

empid      mgrid      empname  sysstart           sysend
---------- ---------- -------- ---------------------- ----------------------
6      2      Steve    2015-06-01 19:54:20    2015-06-01 21:32:20
7      3      Aaron    2015-06-01 20:01:41    2015-06-01 21:32:20
9      7      Rita     2015-06-01 20:01:41    2015-06-01 20:11:01
9      3      Rita     2015-06-01 20:11:01    2015-06-01 21:32:20
11     7      Gabriel  2015-06-01 20:01:41    2015-06-01 20:11:01
11     3      Gabriel  2015-06-01 20:11:01    2015-06-01 21:32:20
12     9      Emilia   2015-06-01 20:01:41    2015-06-01 21:32:20
13     9      Michael  2015-06-01 20:01:41    2015-06-01 20:11:01
14     9      Didi     2015-06-01 20:01:41    2015-06-01 20:11:01

Now that your tables have the same sample data as mine, we can get started with querying them. Just keep in mind that since I executed the modification examples in Part 1 of the series over a short period of time (a couple of hours), the sample data accordingly spans such a short period. In reality, such changes would normally take place over multiple years.

Querying data and optimization considerations

The data of a temporal table is stored in two tables (current and history), but this fact can be transparent to the application. The application should interact with only one table. If you need to query only the current data, you just query the current table, without using any special clauses. For example, suppose you want to query the current state of all employees. You issue the following query:

SELECT *
FROM dbo.Employees;

The plan for this query is shown in Figure 1.

Figure 1: Plan for query without the FOR SYSTEM_TIME clause

The plan scans the clustered index of the current table, and, of course, doesn’t need to involve the history table.

Suppose, though, that you need to query the state of the employees as it was at a past point in time provided in a parameter called @datetime, or even a period defined by the inputs @start and @end. You could technically issue two queries: one against the current table and another against the history table, with the appropriate filter predicates to capture the right versions, and unify the results with a UNION ALL operator. But, instead, you get an elegant clause called FOR SYSTEM_TIME that you specify in a query against a temporal table, or a view based on temporal tables, and in it indicate the validity time or period that you’re interested in. SQL Server then translates your short and elegant query to more complex queries with the right predicates against the current and history tables behind the scenes.

Before I demonstrate using the FOR SYSTEM_TIME clause, I want to make sure you have an accurate understanding of the interval that the system period start and end columns represent. The interval they represent is what’s called in mathematics a closed-open interval and is represented as [sysstart, sysend). A square bracket means a closed edge (inclusive), and an angle bracket represents an open edge (exclusive). So in our case, sysstart is included and sysend isn’t included.

You specify the FOR SYSTEM_TIME clause right after the table or view name, before any table alias, as in:

SELECT ... FROM  FOR SYSTEM_TIME  AS ;

The subclause you will use most often is the AS OF subclause. You pass a date and time literal, variable or parameter as input, such as FOR SYSTEM_TIME AS OF @datetime. You get back the rows that were considered valid at the input point in time, keeping in mind the fact that the period columns represent a closed-open interval. With our Employees temporal table, qualifying rows are ones satisfying the predicates: sysstart <= @datetime AND sysend > @datetime. For example, the following query returns the state of the employee rows that were valid as of 2015-06-01 20:11:01:

DECLARE @datetime AS DATETIME2(0) = '2015-06-01 20:11:01';

SELECT *
FROM dbo.Employees FOR SYSTEM_TIME AS OF @datetime;

Feel free to look at Table 1 and Table 2, which were provided earlier, and try to figure out which rows are supposed to be returned before you run the query.

Like I said, you can achieve the same with two queries and a UNION ALL operator, like so:



SELECT *
FROM dbo.Employees
WHERE sysstart <= @datetime
  AND sysend > @datetime
  
UNION ALL

SELECT *
FROM dbo.EmployeesHistory
WHERE sysstart <= @datetime
  AND sysend > @datetime;

You get the same meaning, and in fact the same query plan, as shown in Figure 2.

Figure 2: Plan for query with FOR SYSTEM_TIME AS OF @datetime

The code (either solution) generates the following output (see if you figured out correctly which rows should be returned):

empid      mgrid      empname    sysstart           sysend
---------- ---------- ---------- ---------------------- ----------------------
1      NULL       David      2015-06-01 19:54:04    9999-12-31 23:59:59
10     5      Sean       2015-06-01 20:01:41    9999-12-31 23:59:59
2      1      Eitan      2015-06-01 19:54:04    9999-12-31 23:59:59
3      1      Ina    2015-06-01 20:01:41    9999-12-31 23:59:59
4      2      Seraph     2015-06-01 19:54:20    9999-12-31 23:59:59
5      2      Jiru       2015-06-01 19:54:20    9999-12-31 23:59:59
8      5      Lilach     2015-06-01 20:01:41    9999-12-31 23:59:59
11     3      Gabriel    2015-06-01 20:11:01    2015-06-01 21:32:20
12     9      Emilia     2015-06-01 20:01:41    2015-06-01 21:32:20
6      2      Steve      2015-06-01 19:54:20    2015-06-01 21:32:20
7      3      Aaron      2015-06-01 20:01:41    2015-06-01 21:32:20
9      3      Rita       2015-06-01 20:11:01    2015-06-01 21:32:20

Clearly, using the FOR SYSTEM_TIME clause allows a much simpler and more natural solution compared to not using it. This can go a long way toward improving the readability and maintainability of your code.

Note: The following discussion is relevant only if you’re relying on rowstore indexing.

If you’re wondering why the plan performs scans of the clustered indexes instead of seeks, that’s because if you recall from Part 1, the current clustered indexes on the tables have the empid column as the leading key, followed by systart and sysend, and our query doesn’t filter by the empid column. Here’s an example where you filter a specific employee:

DECLARE @datetime AS DATETIME2(0) = '2015-06-01 20:11:01', @empid AS INT = 9;

SELECT *
FROM dbo.Employees FOR SYSTEM_TIME AS OF @datetime
WHERE empid = @empid;

The plan for this query is shown in Figure 3.

Figure 3: Plan with seeks

This time the plan does show seeks in the indexes. If you expect to frequently issue queries without filtering a specific primary key value, you may want to consider also creating an index on the system period columns without the leading primary key column.

Another important thing to note regarding indexing is that when using multiple range predicates, only one of them can be used as a seek predicate (see Seek Predicates property in the plan); the rest are used as residual predicates (see Predicate property in the plan). This means that for optimal performance (to scan fewer pages in the index leaf), it’s advisable that you form the index key list like so: (, , ). With the current table it’s fine to place the sysstart column before the sysend column in the key list since sysstart will appear in the more selective range predicate (all rows will satisfy the predicate sysend > @datetime, whereas a subset of the rows will satisfy the predicate sysstart <= @datetime). But with the history table, if you tend to query recent data, the range predicate based on the sysend column will tend to be more selective than the one based on the sysstart column. Therefore, especially if you’re expecting many versions for each row, you want to make sure that when you do performance testing, you try an index with sysend before sysstart in the key list.

In case you need more information about such indexing considerations, I explain those in detail in the articles: Optimization Tips for Multiple Range Predicates, Part 1 and Optimization Tips for Multiple Range Predicates, Part 2.

FOR SYSTEM_TIME subclauses

The AS OF subclause is only one of four subclauses that the FOR SYSTEM_TIME clause supports. Here’s the full list of supported subclauses:

AS OF @datetime
FROM @start TO @end
BETWEEN @start AND @end
CONTAINED IN(@start, @end)

The inputs can be literals, variables or parameters. The first three subclauses are standard, whereas the fourth is a Microsoft extension to the standard.

Table 3 provides the predicates that represent the qualifying rows for the different subclauses.

Table 3: Qualifying rows for FOR SYSTEM_TIME subclauses

Figure 4 provides illustrations that depict the qualifying rows for the different subclauses graphically.

Figure 4: Illustrations of FOR SYSTEM_TIME subclauses

The green blocks represent the input date and time values in the FOR SYSTEM_TIME subclauses based on the type’s precision. Our system period columns and input variables use the type DATETIME2(0) and hence have a precision of one second. In the AS OF case, the green block represents the input @datetime value. In the rest of the subclauses, the leftmost green block represents the input @start value and the rightmost green block represents the input @end value.

Naturally, there may be additional supported date and time values between @start and @end, and those are represented by the green block in the middle. Using a block to represent a supported value in the type makes it easy for me to convey when a value is or isn’t included. The red and blue arrows represent which ranges of values in the systart and sysend columns, respectively, would make a row a qualifying one. Also keep in mind that there’s an implied condition, which is enforced by SQL Server, that says that sysend >= systart.

Let’s go over examples for all four cases. Starting with the AS OF @datetime subclause, the following query returns the row for the employee with the employee ID stored in @empid that was valid as of the date and time value stored in @datetime (qualifying row: sysstart <= @datetime AND sysend > @datetime):

DECLARE @datetime AS DATETIME2(0) = '2015-06-01 20:11:01', @empid AS INT = 9;
-- also try with @datetime = '2015-06-01 20:11:00'

SELECT *
FROM dbo.Employees FOR SYSTEM_TIME AS OF @datetime
WHERE empid = @empid;

DECLARE @datetime AS DATETIME2(0) = '2015-06-01 20:11:01', @empid AS INT = 9;

-- also try with @datetime = '2015-06-01 20:11:00'

SELECT *

FROM dbo.Employees FOR SYSTEM_TIME AS OF @datetime

WHERE empid = @empid;



SELECT *
FROM dbo.Employees
WHERE empid = @empid
  AND sysstart <= @datetime
  AND sysend > @datetime

UNION ALL

SELECT *
FROM dbo.EmployeesHistory
WHERE empid = @empid
  AND sysstart <= @datetime
  AND sysend > @datetime;

Here’s the output that you got for @datetime = '2015-06-01 20:11:01':

empid      mgrid      empname  sysstart           sysend
---------- ---------- -------- ---------------------- ----------------------
9      3      Rita     2015-06-01 20:11:01    2015-06-01 21:32:20

And here’s the output that you get for @datetime = '2015-06-01 20:11:00':

empid      mgrid      empname  sysstart           sysend
---------- ---------- -------- ---------------------- ----------------------
9      7      Rita     2015-06-01 20:01:41    2015-06-01 20:11:01

The FOR SYSTEM_TIME FROM @start TO @end subclause gives you all row versions where sysstart is before the input @end, and sysend is after the input @start (qualifying rows: sysstart < @end AND sysend > @start). Consider the following example:

DECLARE
  @start AS DATETIME2(0) = '2015-06-01 19:00:00',
  @end AS DATETIME2(0) = '2015-06-01 21:32:20',
  @empid AS INT = 9;

SELECT *
FROM dbo.Employees FOR SYSTEM_TIME FROM @start TO @end
WHERE empid = @empid;

This query is equivalent to the following:



SELECT *
FROM dbo.Employees
WHERE empid = @empid
  AND sysstart < @end
  AND sysend > @start

UNION ALL

SELECT *
FROM dbo.EmployeesHistory
WHERE empid = @empid
  AND sysstart < @end
  AND sysend > @start;

You get the following output showing two row versions for the same employee during the period of interest:

empid      mgrid      empname  sysstart           sysend
---------- ---------- -------- ---------------------- ----------------------
9      7      Rita     2015-06-01 20:01:41    2015-06-01 20:11:01
9      3      Rita     2015-06-01 20:11:01    2015-06-01 21:32:20

The FOR SYSTEM_TIME BETWEEN @start AND @end subclause is very similar to the previous, only the predicate against systart is less than or equal to the input @end (as opposed to being just less than). The qualifying rows are the ones satisfying the conjunction of predicates: sysstart <= @end AND sysend > @start. Here’s an example using this subclause:

DECLARE
  @start AS DATETIME2(0) = '2015-06-01 19:00:00',
  @end AS DATETIME2(0) = '2015-06-01 21:32:20',
  @empid AS INT = 9;

SELECT *
FROM dbo.Employees FOR SYSTEM_TIME BETWEEN @start AND @end
WHERE empid = @empid;

This query is equivalent to the following code:



SELECT *
FROM dbo.Employees
WHERE empid = @empid
  AND sysstart <= @end
  AND sysend > @start

UNION ALL

SELECT *
FROM dbo.EmployeesHistory
WHERE empid = @empid
  AND sysstart <= @end
  AND sysend > @start;

You get the following output, which this time includes three row versions for the employee, compared to two row versions that you got with the previous subclause using the same inputs:

empid      mgrid      empname  sysstart           sysend
---------- ---------- -------- ---------------------- ----------------------
9      4      Rita     2015-06-01 21:32:20    9999-12-31 23:59:59
9      7      Rita     2015-06-01 20:01:41    2015-06-01 20:11:01
9      3      Rita     2015-06-01 20:11:01    2015-06-01 21:32:20

Finally, the fourth subclause is FOR SYSTEM_TIME CONTAINED IN(@start, @end). Unlike the others, this one is a Microsoft extension to the standard. It considers as qualifying rows ones where sysstart is on or after the input @start and sysend is on or before the input @end (qualifying rows: sysstart >= @start AND sysend <= @end). In other words, the system period has to be contained in the input period. Here’s an example using this subclause:

DECLARE
  @start AS DATETIME2(0) = '2015-06-01 20:01:41',
  @end AS DATETIME2(0) = '2015-06-01 21:32:20',
  @empid AS INT = 9;

SELECT *
FROM dbo.Employees FOR SYSTEM_TIME CONTAINED IN (@start, @end)
WHERE empid = @empid;

This query is equivalent to the following code:



SELECT *
FROM dbo.Employees
WHERE empid = @empid
  AND sysstart >= @start
  AND sysend <= @end

UNION ALL

SELECT *
FROM dbo.EmployeesHistory
WHERE empid = @empid
  AND sysstart >= @start
  AND sysend <= @end;

The query generates the following output, showing two qualifying row versions for the input employee:

empid      mgrid      empname  sysstart           sysend
---------- ---------- -------- ---------------------- ----------------------
9      7      Rita     2015-06-01 20:01:41    2015-06-01 20:11:01
9      3      Rita     2015-06-01 20:11:01    2015-06-01 21:32:20

Querying table expressions

As mentioned, you’re allowed to use the FOR SYSTEM_TIME clause against a system-versioned temporal table and against a view. (SQL Server will propagate the clause to inner system-versioned temporal tables.) As of SQL Server 2016 CTP2 (may change later), SQL Server doesn’t support the clause against other types of table expressions (derived tables, CTEs and inline table-valued functions). With those, you will need to specify the clause in the inner references to the temporal tables.

For example, suppose that you create an inline table-valued function (TVF) called ShowHierarchy, which uses a recursive CTE to return the hierarchy of employees with ancestor paths and level indication. You want to be able to query the function requesting to capture the hierarchy at a given point in time, like so:

DECLARE @datetime AS DATETIME2(0) = '2015-06-01 20:11:01';

SELECT *
FROM dbo.ShowHierarchy() FOR SYSTEM_TIME AS OF @datetime AS F;

But currently SQL Server doesn’t support specifying the FOR SYSTEM_TIME clause against the inline TVF and propagating it to the underlying tables like it does with views. For now, as a workaround, you can pass the date and time value as an input to the function, and specify it as input to the FOR SYSTEM_TIME AS OF clause directly in the inner queries against the Employees table. Here’s an example for such a solution defining a function called ShowHierarchyAt:

IF OBJECT_ID(N'dbo.ShowHierarchyAt', 'IF') IS NOT NULL
  DROP FUNCTION dbo.ShowHierarchyAt;
GO
CREATE FUNCTION dbo.ShowHierarchyAt(@datetime AS DATETIME2(0)) RETURNS TABLE
AS
RETURN
  WITH EmpsCTE AS
  (
    SELECT empid, mgrid, empname, 0 AS lvl,
  CAST('.' + CAST(empid AS VARCHAR(10)) + '.' AS VARCHAR(900)) AS path
    FROM dbo.Employees FOR SYSTEM_TIME AS OF @datetime
    WHERE mgrid IS NULL

UNION ALL

SELECT S.empid, S.mgrid, S.empname, M.lvl + 1 AS lvl,
  CAST(M.path + CAST(S.empid AS VARCHAR(10)) + '.' AS VARCHAR(900)) AS path
    FROM EmpsCTE AS M
  INNER JOIN dbo.Employees FOR SYSTEM_TIME AS OF @datetime AS S
    ON S.mgrid = M.empid
  )
  SELECT empid, mgrid, empname, lvl, path
  FROM EmpsCTE;
GO

Use the following query against the function, passing the date and time value 2015-06-01 20:01:41 as input:

SELECT
  REPLICATE(' | ', lvl) + '(' + CAST(empid AS VARCHAR(10)) + ') ' + empname
    AS emp
FROM dbo.ShowHierarchyAt('2015-06-01 20:01:41') AS F
ORDER BY path;

You will get the following state of the hierarchy as it was correct at the requested point in time:

(1) David
 | (2) Eitan
 |  | (4) Seraph
 |  | (5) Jiru
 |  |  | (10) Sean
 |  |  | (8) Lilach
 |  | (6) Steve
 | (3) Ina
 |  | (7) Aaron
 |  |  | (11) Gabriel
 |  |  | (9) Rita
 |  |  |  | (12) Emilia
 |  |  |  | (13) Michael
 |  |  |  | (14) Didi

Show the hierarchy at 2015-06-01 20:11:01:

SELECT
  REPLICATE(' | ', lvl) + '(' + CAST(empid AS VARCHAR(10)) + ') ' + empname
    AS emp
FROM dbo.ShowHierarchyAt('2015-06-01 20:11:01') AS F
ORDER BY path;

You will get the following output:

(1) David
 | (2) Eitan
 |  | (4) Seraph
 |  | (5) Jiru
 |  |  | (10) Sean
 |  |  | (8) Lilach
 |  | (6) Steve
 | (3) Ina
 |  | (11) Gabriel
 |  | (7) Aaron
 |  | (9) Rita
 |  |  | (12) Emilia

Show the hierarchy at 2015-06-01 20:11:01:

SELECT
  REPLICATE(' | ', lvl) + '(' + CAST(empid AS VARCHAR(10)) + ') ' + empname
    AS emp
FROM dbo.ShowHierarchyAt('2015-06-01 21:32:20') AS F
ORDER BY path;

You will get the following output:

(1) David
 | (2) Eitan
 |  | (4) Seraph
 |  |  | (7) Aaron
 |  |  | (9) Rita
 |  | (5) Jiru
 |  |  | (10) Sean
 |  |  | (8) Lilach
 | (3) Ina
 |  | (6) Steve
 |  |  | (11) Gabriel

Another capability that is not yet available is to specify a direct correlation in the FOR SYSTEM_TIME clause as an input. This could be handy if you want to return multiple states of the data using the APPLY operator (or a correlated subquery) based on input date and time values stored in a table or a table valued parameter (TVP). Here’s an example demonstrating this need attempting to return the managers of an employee at different points in time. (Don’t actually run this since this code is not supported, yet.):

DECLARE @PointsInTime AS TABLE
(
  p VARCHAR(10) NOT NULL PRIMARY KEY,
  dt DATETIME2(0) NOT NULL UNIQUE
);

INSERT INTO @PointsInTime(p, dt)
  VALUES('T1', '2015-06-01 19:54:04'),
    ('T2', '2015-06-01 19:54:20'),
    ('T3', '2015-06-01 20:01:41'),
    ('T4', '2015-06-01 20:11:01'),
    ('T5', '2015-06-01 21:32:20');

SELECT P.p, P.dt, E.mgrid
FROM @PointsInTime AS P
  OUTER APPLY ( SELECT *
        FROM dbo.Employees
          FOR SYSTEM_TIME AS OF P.dt AS E
        WHERE E.empid = 9 ) AS E;

For now there is a fairly simple workaround. Create an inline TVF that accepts the date and time value as an input parameter, queries the temporal table, and specifies the input parameter in the FOR SYSTEM_TIME clause, like so:

IF OBJECT_ID(N'dbo.EmployeeAt', 'IF') IS NOT NULL
  DROP FUNCTION dbo.EmployeeAt;
GO
CREATE FUNCTION dbo.EmployeeAt(@empid AS INT, @datetime AS DATETIME2(0)) RETURNS TABLE
AS
RETURN
  SELECT empid, mgrid, empname, sysstart, sysend
  FROM dbo.Employees FOR SYSTEM_TIME AS OF @datetime
  WHERE empid = @empid;
GO

Then instead of using a query with a correlation as the applied table expression, use the inline TVF and pass the date and time column from the TVP as input to the function, like so:

DECLARE @PointsInTime AS TABLE
(
  p VARCHAR(10) NOT NULL PRIMARY KEY,
  dt DATETIME2(0) NOT NULL UNIQUE
);

INSERT INTO @PointsInTime(p, dt)
  VALUES('T1', '2015-06-01 19:54:04'),
    ('T2', '2015-06-01 19:54:20'),
    ('T3', '2015-06-01 20:01:41'),
    ('T4', '2015-06-01 20:11:01'),
    ('T5', '2015-06-01 21:32:20');

SELECT P.p, P.dt, E.mgrid
FROM @PointsInTime AS P
  OUTER APPLY dbo.EmployeeAt( 9, P.dt ) AS E;

This execution of this code generates the following output:

p      dt              mgrid
---------- --------------------------- ----------
T1     2015-06-01 19:54:04     NULL
T2     2015-06-01 19:54:20     NULL
T3     2015-06-01 20:01:41     7
T4     2015-06-01 20:11:01     3
T5     2015-06-01 21:32:20     4

For fun, here’s code showing the data pivoted into one row:



WITH C AS
(
  SELECT P.p, E.mgrid
  FROM @PointsInTime AS P
    OUTER APPLY dbo.EmployeeAt( 9, P.dt ) AS E
)
SELECT *
FROM C PIVOT( MAX(mgrid) FOR p IN (T1, T2, T3, T4, T5) ) AS PVT;

This code generates the following output:

T1     T2     T3     T4     T5
---------- ---------- ---------- ---------- ----------
NULL       NULL       7      3      4

Conclusion

SQL Server 2016 introduces support for system-versioned temporal tables, which allow you to capture current and history states of the data based on the time modifications are executed. Hopefully in the future we’ll also see support for application-time period tables to allow the application (not the change time) to define the effective periods of the data, including recording future effective periods.

Since the feature is so new, we haven’t yet gathered a lot of performance-related information. Microsoft does provide some indexing guidelines, but there are currently no specialized indexes specifically for temporal data. (For details on what such indexes might look like, see: Interval Queries in SQL Server.) Depending on how many versions you will keep per row and what your queries will look like, perhaps the generic columnstore and rowstore indexes will do well. For example, columnstore segment elimination may end up doing a decent job at eliminating most of the noninteresting data. With time we’ll gain more experience and see.

Another small feature that perhaps we’ll see in the future is the ability to define the system period columns as implicitly hidden. DB2, for example, currently supports this feature. The idea is that when you issue a SELECT * against the temporal table, those columns won’t be returned, rather only when explicitly referred to. I also mentioned some other small things that are missing related to table expressions and correlations. I don’t want to focus on the half-empty glass and seem unappreciative of the efforts made by Microsoft. It is truly great to see the beginning of an investment in SQL Server in such an important area as temporal data.

Comments

Plain text