This article is the second part in a two-part series about system-versioned temporal tables—a new feature introduced in Microsoft SQL Server 2016. Part 1 covered what system-versioned temporal tables are, how to create them and how to modify data in them. This article focuses on querying data and optimization considerations.
I’d like to thank several people who participated in discussions on the topic for their insights. From Microsoft: Borko Novakovic (the feature’s PM), Carl Rabeler and Conor Cunningham. Fellow SQL Server MVPs: Hugo Kornelis, Simon Sabin, Matija Lah, Phil Brammer, Louis Davidson, Jeffrey Moden and Erland Sommarskog.
Note: At the date of this writing the latest available public build of SQL Server 2016 is CTP2. Make sure you check the official product documentation for information about any changes and additions in later builds.
Recreating sample data from Part 1
If you want to run the queries from the article and get the same results as in my examples, you need your tables to have the same sample data as in mine. Use the code in Listing 1 to create the current and history tables (Employees and EmployeesHistory), and populate them with the sample data.
Listing 1: Create sample data
-- Create TemporalDB database and drop tables if exist
SET NOCOUNT ON;
IF DB_ID(N'TemporalDB') IS NULL CREATE DATABASE TemporalDB;
GO
USE TemporalDB;
GO
IF OBJECT_ID(N'dbo.Employees', N'U') IS NOT NULL
BEGIN
IF OBJECTPROPERTY(OBJECT_ID(N'dbo.Employees', N'U'), N'TableTemporalType') = 2
ALTER TABLE dbo.Employees SET ( SYSTEM_VERSIONING = OFF );
IF OBJECT_ID(N'dbo.EmployeesHistory', N'U') IS NOT NULL
DROP TABLE dbo.EmployeesHistory;
DROP TABLE dbo.Employees;
END;
GO
-- Create and populate Employees table
CREATE TABLE dbo.Employees
(
empid INT NOT NULL
CONSTRAINT PK_Employees PRIMARY KEY NONCLUSTERED,
mgrid INT NULL
CONSTRAINT FK_Employees_mgr_emp REFERENCES dbo.Employees,
empname VARCHAR(25) NOT NULL,
sysstart DATETIME2(0) NOT NULL,
sysend DATETIME2(0) NOT NULL
);
CREATE UNIQUE CLUSTERED INDEX ix_Employees
ON dbo.Employees(empid, sysstart, sysend);
INSERT INTO dbo.Employees(empid, mgrid, empname, sysstart, sysend) VALUES
(1 , NULL, 'David' , '2015-06-01 19:54:04', '9999-12-31 23:59:59'),
(2 , 1 , 'Eitan' , '2015-06-01 19:54:04', '9999-12-31 23:59:59'),
(3 , 1 , 'Ina' , '2015-06-01 20:01:41', '9999-12-31 23:59:59'),
(4 , 2 , 'Seraph' , '2015-06-01 19:54:20', '9999-12-31 23:59:59'),
(5 , 2 , 'Jiru' , '2015-06-01 19:54:20', '9999-12-31 23:59:59'),
(6 , 3 , 'Steve' , '2015-06-01 21:32:20', '9999-12-31 23:59:59'),
(7 , 4 , 'Aaron' , '2015-06-01 21:32:20', '9999-12-31 23:59:59'),
(8 , 5 , 'Lilach' , '2015-06-01 20:01:41', '9999-12-31 23:59:59'),
(9 , 4 , 'Rita' , '2015-06-01 21:32:20', '9999-12-31 23:59:59'),
(10, 5 , 'Sean' , '2015-06-01 20:01:41', '9999-12-31 23:59:59'),
(11, 6 , 'Gabriel', '2015-06-01 21:32:20', '9999-12-31 23:59:59');
-- Create and populate EmployeesHistory table
CREATE TABLE dbo.EmployeesHistory
(
empid INT NOT NULL,
mgrid INT NULL,
empname VARCHAR(25) NOT NULL,
sysstart DATETIME2(0) NOT NULL,
sysend DATETIME2(0) NOT NULL
);
CREATE CLUSTERED INDEX ix_EmployeesHistory
ON dbo.EmployeesHistory(empid, sysstart, sysend)
WITH (DATA_COMPRESSION = PAGE);
INSERT INTO dbo.EmployeesHistory(empid, mgrid, empname, sysstart, sysend) VALUES
(6 , 2, 'Steve' , '2015-06-01 19:54:20', '2015-06-01 21:32:20'),
(7 , 3, 'Aaron' , '2015-06-01 20:01:41', '2015-06-01 21:32:20'),
(9 , 7, 'Rita' , '2015-06-01 20:01:41', '2015-06-01 20:11:01'),
(9 , 3, 'Rita' , '2015-06-01 20:11:01', '2015-06-01 21:32:20'),
(11, 7, 'Gabriel', '2015-06-01 20:01:41', '2015-06-01 20:11:01'),
(11, 3, 'Gabriel', '2015-06-01 20:11:01', '2015-06-01 21:32:20'),
(12, 9, 'Emilia' , '2015-06-01 20:01:41', '2015-06-01 21:32:20'),
(13, 9, 'Michael', '2015-06-01 20:01:41', '2015-06-01 20:11:01'),
(14, 9, 'Didi' , '2015-06-01 20:01:41', '2015-06-01 20:11:01');
-- Enable system versioning
ALTER TABLE dbo.Employees ADD
PERIOD FOR SYSTEM_TIME (sysstart, sysend);
ALTER TABLE dbo.Employees
SET ( SYSTEM_VERSIONING = ON ( HISTORY_TABLE = dbo.EmployeesHistory ) );
Run the following code to show the contents of the current table:
SELECT *
FROM dbo.Employees;
Table 1 shows the output of this query.
Table 1: Contents of Employees table
empid mgrid empname sysstart sysend
---------- ---------- -------- ---------------------- ----------------------
1 NULL David 2015-06-01 19:54:04 9999-12-31 23:59:59
2 1 Eitan 2015-06-01 19:54:04 9999-12-31 23:59:59
3 1 Ina 2015-06-01 20:01:41 9999-12-31 23:59:59
4 2 Seraph 2015-06-01 19:54:20 9999-12-31 23:59:59
5 2 Jiru 2015-06-01 19:54:20 9999-12-31 23:59:59
6 3 Steve 2015-06-01 21:32:20 9999-12-31 23:59:59
7 4 Aaron 2015-06-01 21:32:20 9999-12-31 23:59:59
8 5 Lilach 2015-06-01 20:01:41 9999-12-31 23:59:59
9 4 Rita 2015-06-01 21:32:20 9999-12-31 23:59:59
10 5 Sean 2015-06-01 20:01:41 9999-12-31 23:59:59
11 6 Gabriel 2015-06-01 21:32:20 9999-12-31 23:59:59
Run the following code to show the contents of the history table:
SELECT *
FROM dbo.EmployeesHistory;
Table 2 shows the output of this query.
Table 2: Contents of EmployeesHistory table
empid mgrid empname sysstart sysend
---------- ---------- -------- ---------------------- ----------------------
6 2 Steve 2015-06-01 19:54:20 2015-06-01 21:32:20
7 3 Aaron 2015-06-01 20:01:41 2015-06-01 21:32:20
9 7 Rita 2015-06-01 20:01:41 2015-06-01 20:11:01
9 3 Rita 2015-06-01 20:11:01 2015-06-01 21:32:20
11 7 Gabriel 2015-06-01 20:01:41 2015-06-01 20:11:01
11 3 Gabriel 2015-06-01 20:11:01 2015-06-01 21:32:20
12 9 Emilia 2015-06-01 20:01:41 2015-06-01 21:32:20
13 9 Michael 2015-06-01 20:01:41 2015-06-01 20:11:01
14 9 Didi 2015-06-01 20:01:41 2015-06-01 20:11:01
Now that your tables have the same sample data as mine, we can get started with querying them. Just keep in mind that since I executed the modification examples in Part 1 of the series over a short period of time (a couple of hours), the sample data accordingly spans such a short period. In reality, such changes would normally take place over multiple years.
Querying data and optimization considerations
The data of a temporal table is stored in two tables (current and history), but this fact can be transparent to the application. The application should interact with only one table. If you need to query only the current data, you just query the current table, without using any special clauses. For example, suppose you want to query the current state of all employees. You issue the following query:
SELECT *
FROM dbo.Employees;
The plan for this query is shown in Figure 1.
Figure 1: Plan for query without the FOR SYSTEM_TIME clause
The plan scans the clustered index of the current table, and, of course, doesn’t need to involve the history table.
Suppose, though, that you need to query the state of the employees as it was at a past point in time provided in a parameter called @datetime, or even a period defined by the inputs @start and @end. You could technically issue two queries: one against the current table and another against the history table, with the appropriate filter predicates to capture the right versions, and unify the results with a UNION ALL operator. But, instead, you get an elegant clause called FOR SYSTEM_TIME that you specify in a query against a temporal table, or a view based on temporal tables, and in it indicate the validity time or period that you’re interested in. SQL Server then translates your short and elegant query to more complex queries with the right predicates against the current and history tables behind the scenes.
Before I demonstrate using the FOR SYSTEM_TIME clause, I want to make sure you have an accurate understanding of the interval that the system period start and end columns represent. The interval they represent is what’s called in mathematics a closed-open interval and is represented as [sysstart, sysend). A square bracket means a closed edge (inclusive), and an angle bracket represents an open edge (exclusive). So in our case, sysstart is included and sysend isn’t included.
You specify the FOR SYSTEM_TIME clause right after the table or view name, before any table alias, as in:
SELECT ... FROM FOR SYSTEM_TIME AS ;
The subclause you will use most often is the AS OF subclause. You pass a date and time literal, variable or parameter as input, such as FOR SYSTEM_TIME AS OF @datetime. You get back the rows that were considered valid at the input point in time, keeping in mind the fact that the period columns represent a closed-open interval. With our Employees temporal table, qualifying rows are ones satisfying the predicates: sysstart <= @datetime AND sysend > @datetime. For example, the following query returns the state of the employee rows that were valid as of 2015-06-01 20:11:01:
DECLARE @datetime AS DATETIME2(0) = '2015-06-01 20:11:01';
SELECT *
FROM dbo.Employees FOR SYSTEM_TIME AS OF @datetime;
Feel free to look at Table 1 and Table 2, which were provided earlier, and try to figure out which rows are supposed to be returned before you run the query.
Like I said, you can achieve the same with two queries and a UNION ALL operator, like so:
SELECT *
FROM dbo.Employees
WHERE sysstart <= @datetime
AND sysend > @datetime
UNION ALL
SELECT *
FROM dbo.EmployeesHistory
WHERE sysstart <= @datetime
AND sysend > @datetime;
You get the same meaning, and in fact the same query plan, as shown in Figure 2.
Figure 2: Plan for query with FOR SYSTEM_TIME AS OF @datetime
The code (either solution) generates the following output (see if you figured out correctly which rows should be returned):
empid mgrid empname sysstart sysend
---------- ---------- ---------- ---------------------- ----------------------
1 NULL David 2015-06-01 19:54:04 9999-12-31 23:59:59
10 5 Sean 2015-06-01 20:01:41 9999-12-31 23:59:59
2 1 Eitan 2015-06-01 19:54:04 9999-12-31 23:59:59
3 1 Ina 2015-06-01 20:01:41 9999-12-31 23:59:59
4 2 Seraph 2015-06-01 19:54:20 9999-12-31 23:59:59
5 2 Jiru 2015-06-01 19:54:20 9999-12-31 23:59:59
8 5 Lilach 2015-06-01 20:01:41 9999-12-31 23:59:59
11 3 Gabriel 2015-06-01 20:11:01 2015-06-01 21:32:20
12 9 Emilia 2015-06-01 20:01:41 2015-06-01 21:32:20
6 2 Steve 2015-06-01 19:54:20 2015-06-01 21:32:20
7 3 Aaron 2015-06-01 20:01:41 2015-06-01 21:32:20
9 3 Rita 2015-06-01 20:11:01 2015-06-01 21:32:20
Clearly, using the FOR SYSTEM_TIME clause allows a much simpler and more natural solution compared to not using it. This can go a long way toward improving the readability and maintainability of your code.
Note: The following discussion is relevant only if you’re relying on rowstore indexing.
If you’re wondering why the plan performs scans of the clustered indexes instead of seeks, that’s because if you recall from Part 1, the current clustered indexes on the tables have the empid column as the leading key, followed by systart and sysend, and our query doesn’t filter by the empid column. Here’s an example where you filter a specific employee:
DECLARE @datetime AS DATETIME2(0) = '2015-06-01 20:11:01', @empid AS INT = 9;
SELECT *
FROM dbo.Employees FOR SYSTEM_TIME AS OF @datetime
WHERE empid = @empid;
The plan for this query is shown in Figure 3.
Figure 3: Plan with seeks
This time the plan does show seeks in the indexes. If you expect to frequently issue queries without filtering a specific primary key value, you may want to consider also creating an index on the system period columns without the leading primary key column.
Another important thing to note regarding indexing is that when using multiple range predicates, only one of them can be used as a seek predicate (see Seek Predicates property in the plan); the rest are used as residual predicates (see Predicate property in the plan). This means that for optimal performance (to scan fewer pages in the index leaf), it’s advisable that you form the index key list like so: (
In case you need more information about such indexing considerations, I explain those in detail in the articles: Optimization Tips for Multiple Range Predicates, Part 1 and Optimization Tips for Multiple Range Predicates, Part 2.
FOR SYSTEM_TIME subclauses
The AS OF subclause is only one of four subclauses that the FOR SYSTEM_TIME clause supports. Here’s the full list of supported subclauses:
- AS OF @datetime
- FROM @start TO @end
- BETWEEN @start AND @end
- CONTAINED IN(@start, @end)
The inputs can be literals, variables or parameters. The first three subclauses are standard, whereas the fourth is a Microsoft extension to the standard.
Table 3 provides the predicates that represent the qualifying rows for the different subclauses.
Table 3: Qualifying rows for FOR SYSTEM_TIME subclauses
Figure 4 provides illustrations that depict the qualifying rows for the different subclauses graphically.
Figure 4: Illustrations of FOR SYSTEM_TIME subclauses
The green blocks represent the input date and time values in the FOR SYSTEM_TIME subclauses based on the type’s precision. Our system period columns and input variables use the type DATETIME2(0) and hence have a precision of one second. In the AS OF case, the green block represents the input @datetime value. In the rest of the subclauses, the leftmost green block represents the input @start value and the rightmost green block represents the input @end value.
Naturally, there may be additional supported date and time values between @start and @end, and those are represented by the green block in the middle. Using a block to represent a supported value in the type makes it easy for me to convey when a value is or isn’t included. The red and blue arrows represent which ranges of values in the systart and sysend columns, respectively, would make a row a qualifying one. Also keep in mind that there’s an implied condition, which is enforced by SQL Server, that says that sysend >= systart.
Let’s go over examples for all four cases. Starting with the AS OF @datetime subclause, the following query returns the row for the employee with the employee ID stored in @empid that was valid as of the date and time value stored in @datetime (qualifying row: sysstart <= @datetime AND sysend > @datetime):
DECLARE @datetime AS DATETIME2(0) = '2015-06-01 20:11:01', @empid AS INT = 9;
-- also try with @datetime = '2015-06-01 20:11:00'
SELECT *
FROM dbo.Employees FOR SYSTEM_TIME AS OF @datetime
WHERE empid = @empid;
DECLARE @datetime AS DATETIME2(0) = '2015-06-01 20:11:01', @empid AS INT = 9;
-- also try with @datetime = '2015-06-01 20:11:00'
SELECT *
FROM dbo.Employees FOR SYSTEM_TIME AS OF @datetime
WHERE empid = @empid;
SELECT *
FROM dbo.Employees
WHERE empid = @empid
AND sysstart <= @datetime
AND sysend > @datetime
UNION ALL
SELECT *
FROM dbo.EmployeesHistory
WHERE empid = @empid
AND sysstart <= @datetime
AND sysend > @datetime;
Here’s the output that you got for @datetime = '2015-06-01 20:11:01':
empid mgrid empname sysstart sysend
---------- ---------- -------- ---------------------- ----------------------
9 3 Rita 2015-06-01 20:11:01 2015-06-01 21:32:20
And here’s the output that you get for @datetime = '2015-06-01 20:11:00':
empid mgrid empname sysstart sysend
---------- ---------- -------- ---------------------- ----------------------
9 7 Rita 2015-06-01 20:01:41 2015-06-01 20:11:01
The FOR SYSTEM_TIME FROM @start TO @end subclause gives you all row versions where sysstart is before the input @end, and sysend is after the input @start (qualifying rows: sysstart < @end AND sysend > @start). Consider the following example:
DECLARE
@start AS DATETIME2(0) = '2015-06-01 19:00:00',
@end AS DATETIME2(0) = '2015-06-01 21:32:20',
@empid AS INT = 9;
SELECT *
FROM dbo.Employees FOR SYSTEM_TIME FROM @start TO @end
WHERE empid = @empid;
This query is equivalent to the following:
SELECT *
FROM dbo.Employees
WHERE empid = @empid
AND sysstart < @end
AND sysend > @start
UNION ALL
SELECT *
FROM dbo.EmployeesHistory
WHERE empid = @empid
AND sysstart < @end
AND sysend > @start;
You get the following output showing two row versions for the same employee during the period of interest:
empid mgrid empname sysstart sysend
---------- ---------- -------- ---------------------- ----------------------
9 7 Rita 2015-06-01 20:01:41 2015-06-01 20:11:01
9 3 Rita 2015-06-01 20:11:01 2015-06-01 21:32:20
The FOR SYSTEM_TIME BETWEEN @start AND @end subclause is very similar to the previous, only the predicate against systart is less than or equal to the input @end (as opposed to being just less than). The qualifying rows are the ones satisfying the conjunction of predicates: sysstart <= @end AND sysend > @start. Here’s an example using this subclause:
DECLARE
@start AS DATETIME2(0) = '2015-06-01 19:00:00',
@end AS DATETIME2(0) = '2015-06-01 21:32:20',
@empid AS INT = 9;
SELECT *
FROM dbo.Employees FOR SYSTEM_TIME BETWEEN @start AND @end
WHERE empid = @empid;
This query is equivalent to the following code:
SELECT *
FROM dbo.Employees
WHERE empid = @empid
AND sysstart <= @end
AND sysend > @start
UNION ALL
SELECT *
FROM dbo.EmployeesHistory
WHERE empid = @empid
AND sysstart <= @end
AND sysend > @start;
You get the following output, which this time includes three row versions for the employee, compared to two row versions that you got with the previous subclause using the same inputs:
empid mgrid empname sysstart sysend
---------- ---------- -------- ---------------------- ----------------------
9 4 Rita 2015-06-01 21:32:20 9999-12-31 23:59:59
9 7 Rita 2015-06-01 20:01:41 2015-06-01 20:11:01
9 3 Rita 2015-06-01 20:11:01 2015-06-01 21:32:20
Finally, the fourth subclause is FOR SYSTEM_TIME CONTAINED IN(@start, @end). Unlike the others, this one is a Microsoft extension to the standard. It considers as qualifying rows ones where sysstart is on or after the input @start and sysend is on or before the input @end (qualifying rows: sysstart >= @start AND sysend <= @end). In other words, the system period has to be contained in the input period. Here’s an example using this subclause:
DECLARE
@start AS DATETIME2(0) = '2015-06-01 20:01:41',
@end AS DATETIME2(0) = '2015-06-01 21:32:20',
@empid AS INT = 9;
SELECT *
FROM dbo.Employees FOR SYSTEM_TIME CONTAINED IN (@start, @end)
WHERE empid = @empid;
This query is equivalent to the following code:
SELECT *
FROM dbo.Employees
WHERE empid = @empid
AND sysstart >= @start
AND sysend <= @end
UNION ALL
SELECT *
FROM dbo.EmployeesHistory
WHERE empid = @empid
AND sysstart >= @start
AND sysend <= @end;
The query generates the following output, showing two qualifying row versions for the input employee:
empid mgrid empname sysstart sysend
---------- ---------- -------- ---------------------- ----------------------
9 7 Rita 2015-06-01 20:01:41 2015-06-01 20:11:01
9 3 Rita 2015-06-01 20:11:01 2015-06-01 21:32:20
Querying table expressions
As mentioned, you’re allowed to use the FOR SYSTEM_TIME clause against a system-versioned temporal table and against a view. (SQL Server will propagate the clause to inner system-versioned temporal tables.) As of SQL Server 2016 CTP2 (may change later), SQL Server doesn’t support the clause against other types of table expressions (derived tables, CTEs and inline table-valued functions). With those, you will need to specify the clause in the inner references to the temporal tables.
For example, suppose that you create an inline table-valued function (TVF) called ShowHierarchy, which uses a recursive CTE to return the hierarchy of employees with ancestor paths and level indication. You want to be able to query the function requesting to capture the hierarchy at a given point in time, like so:
DECLARE @datetime AS DATETIME2(0) = '2015-06-01 20:11:01';
SELECT *
FROM dbo.ShowHierarchy() FOR SYSTEM_TIME AS OF @datetime AS F;
But currently SQL Server doesn’t support specifying the FOR SYSTEM_TIME clause against the inline TVF and propagating it to the underlying tables like it does with views. For now, as a workaround, you can pass the date and time value as an input to the function, and specify it as input to the FOR SYSTEM_TIME AS OF clause directly in the inner queries against the Employees table. Here’s an example for such a solution defining a function called ShowHierarchyAt:
IF OBJECT_ID(N'dbo.ShowHierarchyAt', 'IF') IS NOT NULL
DROP FUNCTION dbo.ShowHierarchyAt;
GO
CREATE FUNCTION dbo.ShowHierarchyAt(@datetime AS DATETIME2(0)) RETURNS TABLE
AS
RETURN
WITH EmpsCTE AS
(
SELECT empid, mgrid, empname, 0 AS lvl,
CAST('.' + CAST(empid AS VARCHAR(10)) + '.' AS VARCHAR(900)) AS path
FROM dbo.Employees FOR SYSTEM_TIME AS OF @datetime
WHERE mgrid IS NULL
UNION ALL
SELECT S.empid, S.mgrid, S.empname, M.lvl + 1 AS lvl,
CAST(M.path + CAST(S.empid AS VARCHAR(10)) + '.' AS VARCHAR(900)) AS path
FROM EmpsCTE AS M
INNER JOIN dbo.Employees FOR SYSTEM_TIME AS OF @datetime AS S
ON S.mgrid = M.empid
)
SELECT empid, mgrid, empname, lvl, path
FROM EmpsCTE;
GO
Use the following query against the function, passing the date and time value 2015-06-01 20:01:41 as input:
SELECT
REPLICATE(' | ', lvl) + '(' + CAST(empid AS VARCHAR(10)) + ') ' + empname
AS emp
FROM dbo.ShowHierarchyAt('2015-06-01 20:01:41') AS F
ORDER BY path;
You will get the following state of the hierarchy as it was correct at the requested point in time:
(1) David
| (2) Eitan
| | (4) Seraph
| | (5) Jiru
| | | (10) Sean
| | | (8) Lilach
| | (6) Steve
| (3) Ina
| | (7) Aaron
| | | (11) Gabriel
| | | (9) Rita
| | | | (12) Emilia
| | | | (13) Michael
| | | | (14) Didi
Show the hierarchy at 2015-06-01 20:11:01:
SELECT
REPLICATE(' | ', lvl) + '(' + CAST(empid AS VARCHAR(10)) + ') ' + empname
AS emp
FROM dbo.ShowHierarchyAt('2015-06-01 20:11:01') AS F
ORDER BY path;
You will get the following output:
(1) David
| (2) Eitan
| | (4) Seraph
| | (5) Jiru
| | | (10) Sean
| | | (8) Lilach
| | (6) Steve
| (3) Ina
| | (11) Gabriel
| | (7) Aaron
| | (9) Rita
| | | (12) Emilia
Show the hierarchy at 2015-06-01 20:11:01:
SELECT
REPLICATE(' | ', lvl) + '(' + CAST(empid AS VARCHAR(10)) + ') ' + empname
AS emp
FROM dbo.ShowHierarchyAt('2015-06-01 21:32:20') AS F
ORDER BY path;
You will get the following output:
(1) David
| (2) Eitan
| | (4) Seraph
| | | (7) Aaron
| | | (9) Rita
| | (5) Jiru
| | | (10) Sean
| | | (8) Lilach
| (3) Ina
| | (6) Steve
| | | (11) Gabriel
Another capability that is not yet available is to specify a direct correlation in the FOR SYSTEM_TIME clause as an input. This could be handy if you want to return multiple states of the data using the APPLY operator (or a correlated subquery) based on input date and time values stored in a table or a table valued parameter (TVP). Here’s an example demonstrating this need attempting to return the managers of an employee at different points in time. (Don’t actually run this since this code is not supported, yet.):
DECLARE @PointsInTime AS TABLE
(
p VARCHAR(10) NOT NULL PRIMARY KEY,
dt DATETIME2(0) NOT NULL UNIQUE
);
INSERT INTO @PointsInTime(p, dt)
VALUES('T1', '2015-06-01 19:54:04'),
('T2', '2015-06-01 19:54:20'),
('T3', '2015-06-01 20:01:41'),
('T4', '2015-06-01 20:11:01'),
('T5', '2015-06-01 21:32:20');
SELECT P.p, P.dt, E.mgrid
FROM @PointsInTime AS P
OUTER APPLY ( SELECT *
FROM dbo.Employees
FOR SYSTEM_TIME AS OF P.dt AS E
WHERE E.empid = 9 ) AS E;
For now there is a fairly simple workaround. Create an inline TVF that accepts the date and time value as an input parameter, queries the temporal table, and specifies the input parameter in the FOR SYSTEM_TIME clause, like so:
IF OBJECT_ID(N'dbo.EmployeeAt', 'IF') IS NOT NULL
DROP FUNCTION dbo.EmployeeAt;
GO
CREATE FUNCTION dbo.EmployeeAt(@empid AS INT, @datetime AS DATETIME2(0)) RETURNS TABLE
AS
RETURN
SELECT empid, mgrid, empname, sysstart, sysend
FROM dbo.Employees FOR SYSTEM_TIME AS OF @datetime
WHERE empid = @empid;
GO
Then instead of using a query with a correlation as the applied table expression, use the inline TVF and pass the date and time column from the TVP as input to the function, like so:
DECLARE @PointsInTime AS TABLE
(
p VARCHAR(10) NOT NULL PRIMARY KEY,
dt DATETIME2(0) NOT NULL UNIQUE
);
INSERT INTO @PointsInTime(p, dt)
VALUES('T1', '2015-06-01 19:54:04'),
('T2', '2015-06-01 19:54:20'),
('T3', '2015-06-01 20:01:41'),
('T4', '2015-06-01 20:11:01'),
('T5', '2015-06-01 21:32:20');
SELECT P.p, P.dt, E.mgrid
FROM @PointsInTime AS P
OUTER APPLY dbo.EmployeeAt( 9, P.dt ) AS E;
This execution of this code generates the following output:
p dt mgrid
---------- --------------------------- ----------
T1 2015-06-01 19:54:04 NULL
T2 2015-06-01 19:54:20 NULL
T3 2015-06-01 20:01:41 7
T4 2015-06-01 20:11:01 3
T5 2015-06-01 21:32:20 4
For fun, here’s code showing the data pivoted into one row:
WITH C AS
(
SELECT P.p, E.mgrid
FROM @PointsInTime AS P
OUTER APPLY dbo.EmployeeAt( 9, P.dt ) AS E
)
SELECT *
FROM C PIVOT( MAX(mgrid) FOR p IN (T1, T2, T3, T4, T5) ) AS PVT;
This code generates the following output:
T1 T2 T3 T4 T5
---------- ---------- ---------- ---------- ----------
NULL NULL 7 3 4
Conclusion
SQL Server 2016 introduces support for system-versioned temporal tables, which allow you to capture current and history states of the data based on the time modifications are executed. Hopefully in the future we’ll also see support for application-time period tables to allow the application (not the change time) to define the effective periods of the data, including recording future effective periods.
Since the feature is so new, we haven’t yet gathered a lot of performance-related information. Microsoft does provide some indexing guidelines, but there are currently no specialized indexes specifically for temporal data. (For details on what such indexes might look like, see: Interval Queries in SQL Server.) Depending on how many versions you will keep per row and what your queries will look like, perhaps the generic columnstore and rowstore indexes will do well. For example, columnstore segment elimination may end up doing a decent job at eliminating most of the noninteresting data. With time we’ll gain more experience and see.
Another small feature that perhaps we’ll see in the future is the ability to define the system period columns as implicitly hidden. DB2, for example, currently supports this feature. The idea is that when you issue a SELECT * against the temporal table, those columns won’t be returned, rather only when explicitly referred to. I also mentioned some other small things that are missing related to table expressions and correlations. I don’t want to focus on the half-empty glass and seem unappreciative of the efforts made by Microsoft. It is truly great to see the beginning of an investment in SQL Server in such an important area as temporal data.