OFFSET/FETCH, Part 1

Author’s Note: This article covers information about a prereleased version of SQL Server 11, code-named Denali (CTP1). In the final release, Microsoft might decide to change the implementation of this feature or not to include it altogether.

SQL Server code-named “Denali” introduces a new filter called OFFSET/FETCH that allows filtering a requested range of rows based on given ordering, a number of rows to skip, and a number of rows to return. This filter can be used for several purposes, including paging. The OFFSET/FETCH filter is conceptually similar to TOP; still, in some ways it’s more flexible than TOP, and in other ways it’s less flexible. If you’re wondering why Microsoft bothered to add another filter that resembles TOP, you should be aware that TOP isn’t a standard feature, whereas OFFSET/FETCH is. For this reason, it’s more likely that Microsoft will focus future efforts around OFFSET/FETCH—and I therefore recommend that you use it going forward.

This article is the first in a two-part series covering the OFFSET/FETCH filter. In this article I introduce the feature and its fundamentals, and I demonstrate its use with examples. I then discuss how SQL Server optimizes the feature and how you can ensure that you get optimal plans. Next month, I’ll cover additional aspects of the OFFSET/FETCH feature.

For sample data, run the code in Listing 1. This code creates two tables called Customers and Orders in the tempdb database (for test purposes); it populates Customers with 50,000 rows and Orders with 1,000,000 rows. Don’t worry if you’re not familiar with parts of the code in Listing 1 (below); it uses a couple of new SQL Server 11 T-SQL features, such as a sequence and the OFFSET/FETCH filter itself. Remember that the purpose of this code is just to create the sample data, so it’s not really important for the purposes of this article to understand how the code works.

SET NOCOUNT ON;
USE tempdb;

IF OBJECT_ID('dbo.Orders'     , 'U' ) IS NOT NULL DROP TABLE dbo.Orders;
IF OBJECT_ID('dbo.Customers'  , 'U' ) IS NOT NULL DROP TABLE dbo.Customers;
IF OBJECT_ID('dbo.SeqOrderIDs', 'SO') IS NOT NULL DROP SEQUENCE dbo.SeqOrderIDs;
IF OBJECT_ID('dbo.GetNums'    , 'IF') IS NOT NULL DROP FUNCTION dbo.GetNums;
GO

CREATE FUNCTION dbo.GetNums(@n AS BIGINT) RETURNS TABLE
AS
RETURN
  WITH
  L0   AS(SELECT 1 AS c UNION ALL SELECT 1),
  L1   AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),
  L2   AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),
  L3   AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),
  L4   AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B),
  L5   AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B),
  Nums AS(SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS n FROM L5)
  SELECT n FROM Nums ORDER BY n OFFSET 0 ROWS FETCH FIRST @n ROWS ONLY;
GO

CREATE TABLE dbo.Customers
(
  custid    INT     NOT NULL,
  CONSTRAINT PK_Customers PRIMARY KEY(custid)
);

INSERT INTO dbo.Customers WITH (TABLOCK) (custid)
  SELECT n AS custid FROM dbo.GetNums(50000) AS Nums;

CREATE SEQUENCE dbo.SeqOrderIDs AS INT START WITH 1 INCREMENT BY 1;

CREATE TABLE dbo.Orders
(
  orderid   INT     NOT NULL DEFAULT (NEXT VALUE FOR dbo.SeqOrderIDs),
  custid    INT     NOT NULL,
  orderdate DATE    NOT NULL,
  filler    BINARY(200) NOT NULL DEFAULT (0x43112609),
  CONSTRAINT PK_Orders PRIMARY KEY(orderid),
  CONSTRAINT FK_Orders_Customers
    FOREIGN KEY(custid) REFERENCES dbo.Customers(custid)
);

CREATE UNIQUE INDEX idx_unc_od_oid ON dbo.Orders(orderdate, orderid);

INSERT INTO dbo.Orders WITH (TABLOCK) (custid, orderdate)
  SELECT
    CAST(RAND(n*101) * 50000 AS INT) + 1 AS custid,
    DATEADD(day, RAND(n*211)*365*3, '20090101') AS orderdate
  FROM dbo.GetNums(1000000) AS Nums;

Getting Started with OFFSET/FETCH

The OFFSET/FETCH filter appears right after the ORDER BY clause of a query, and, in fact, in the SQL Server documentation it’s considered part of the ORDER BY clause specification. The general syntax of this filter is

[ ORDER BY 
  [ OFFSET  { ROW | ROWS } 
    [ FETCH { FIRST | NEXT }  { ROW | ROWS } ONLY ] ] ]

Based on the ordering defined by the ORDER BY clause, OFFSET dictates how many rows to skip and FETCH dictates how many rows to return after skipping. For example, the following query defines ordering based on orderdate DESC and orderid DESC, skips 50 rows, and fetches the next 10 rows; in other words, based on the given ordering, it returns rows 51 through 60:

SELECT orderid, orderdate, custid, filler
FROM dbo.Orders
ORDER BY orderdate DESC, orderid DESC
OFFSET 50 ROWS FETCH NEXT 10 ROWS ONLY;

Note that contrary to standard SQL, in SQL Server the FETCH clause can’t be specified without the OFFSET clause. So even if you don’t want to skip any rows, you still need to indicate OFFSET 0. For example, the following query returns the first 10 rows based on the given ordering:

SELECT orderid, orderdate, custid, filler
FROM dbo.Orders
ORDER BY orderdate DESC, orderid DESC
OFFSET 0 ROWS FETCH FIRST 10 ROWS ONLY;

The reasoning behind this requirement has to do with a little bit of trivia that I learned from Microsoft’s Tobias Ternstrom, who, among other things, owned this feature. The way the SQL Server parser was designed is that when processing a given token, it can see only one token ahead. So suppose that an ORDER BY clause is followed by the keywords FETCH NEXT. The parser can’t tell whether it’s the FETCH command that fetches a record from a cursor or the new FETCH filtering clause. With the OFFSET keyword appearing after the ORDER BY clause, there’s no ambiguity with other commands, and it’s clear to the parser that it’s the OFFSET/FETCH clause.

As you might have guessed, the keywords FIRST and NEXT are interchangeable; similarly, the keywords ROW and ROWS are interchangeable as well. The flexibility in the use of keywords allows intuitive English-like coding. So in the first query where I needed to skip 50 rows and return the next 10, it was intuitive for me to use the keyword NEXT. In the second query where I didn’t need to skip any rows, it was intuitive for me to use the keyword FIRST. Similarly, if you need to skip or fetch only one row, you’d probably prefer to use the singular form ROW instead of the plural form ROWS, even though both are technically supported regardless of the value to be skipped or fetched.

I mentioned earlier that in SQL Server, FETCH can’t be specified without OFFSET. However, OFFSET can be specified without FETCH—meaning skip the specified number of rows, and return all other qualifying rows without an upper cap. For example, the following query skips 50 rows and returns all other qualifying rows:

SELECT orderid, orderdate, custid, filler
FROM dbo.Orders
ORDER BY orderdate DESC, orderid DESC
OFFSET 50 ROWS;

Both offset_value and fetch_value (the inputs to OFFSET and FETCH) can be either constants or expressions. You’re allowed to refer to variables or parameters, and you’re even allowed to use self-contained subqueries. When you use a subquery, it has to appear in parentheses; otherwise, the use of parentheses is optional (e.g., when referring to a constant or to a parameter).

I mentioned and demonstrated that offset_value can be 0, but fetch_value can’t—at least not when specified as a constant or an expression based on constants. So an attempt to run the following code will fail:

SELECT orderid, orderdate, custid, filler
FROM dbo.Orders
ORDER BY orderdate DESC, orderid DESC
OFFSET 0 ROWS FETCH FIRST 0 ROWS ONLY;

You’ll get the error message that Figure 1 shows.

Msg 10744, Level 15, State 1, Line 4
The number of rows provided for a FETCH clause must be greater then zero.

This restriction complies with standard SQL, but I’m not sure why the restriction is in the standard, because an empty set is still a set. At any rate, there are a couple of workarounds if you find yourself with the need to return an empty set using the OFFSET/FETCH feature. One is to use a variable assigned with 0, like so:

DECLARE @f AS BIGINT = 0;

SELECT orderid, orderdate, custid, filler
FROM dbo.Orders
ORDER BY orderdate DESC, orderid DESC
OFFSET 0 ROWS FETCH FIRST @f ROWS ONLY;

The other is to use a subquery, like so:

SELECT orderid, orderdate, custid, filler
FROM dbo.Orders
ORDER BY orderdate DESC, orderid DESC
OFFSET 0 ROWS FETCH FIRST (SELECT 0) ROWS ONLY;

As you can see so far, conceptually, the OFFSET/FETCH filter is similar to the TOP filter. The OFFSET/FETCH filter is more flexible than TOP in the sense that it allows skipping rows. The TOP filter is more flexible than OFFSET/FETCH in the sense that TOP is allowed in modification statements, whereas OFFSET/FETCH isn’t allowed. In next month’s article, I’ll provide a workaround to this restriction. And in case it wasn’t obvious, TOP and OFFSET/FETCH can’t be combined in the same query.

As I mentioned in the beginning of the article, one of the practical uses of the OFFSET/FETCH filter is for paging purposes. But as you can imagine, if the underlying data changes between page requests, the results can be unstable. There are a few options for you to choose from in case you need to guarantee stable paging results. One option is to create a static snapshot of the results that you need to page through (e.g., in a temporary table), then issue the page requests against that static snapshot. Another option is to submit all page requests in the same transaction and use either the SERIALIZABLE or the SNAPSHOT isolation levels.

Optimization

There are several aspects to consider in regards to optimization of queries with OFFSET/FETCH. I’ll start with indexing considerations and then also compare OFFSET/FETCH with an alternative approach based on row numbers.

Indexing recommendations. The optimization of a query with the OFFSET/FETCH filter depends on whether a supporting index exists—and if an index exists, whether it’s a covering one. Let’s start with the worse-case scenario: No good index exists to support the query (meaning no index defined on the query sort columns as the index keys). To arrange such an environment for our test, run the following code, which drops the existing index on orderdate, orderid:

DROP INDEX dbo.Orders.idx_unc_od_oid;

Then run the following query and examine its execution plan, which Figure 2 shows.

SELECT orderid, orderdate, custid, filler
FROM dbo.Orders
ORDER BY orderdate DESC, orderid DESC
OFFSET 50 ROWS FETCH NEXT 10 ROWS ONLY;

As you can see, the plan performs a full clustered index, which in our case involves scanning 1,000,000 rows. The STATISTICS IO option reports an I/O cost of 27,882 logical reads for this scan. The plan then performs a Top N Sort, where N in this case (as reported by the Top Rows property of this operator) is 60 (50 + 10). The sort operation is by far the most significant part of the cost in this plan, amounting to 94 percent of the entire plan cost. Finally, the plan applies the Top operator to do the final filtering. You can observe in the properties of the Top operator that the Offset Expression property is 50 and the Top Expression property is 10. These two properties correspond to the OFFSET and FETCH input values, respectively. In short, this plan is very inefficient.

The key to an efficient plan is an index on the query sort columns as the index keys. Often, covering indexes are ideal to support good query optimization (never mind for now whether that’s also true in our case), so let’s test this approach. Create a covering index by running the following code:

CREATE UNIQUE INDEX idx_unc_od_oid_i_cid_filler
  ON dbo.Orders(orderdate, orderid)
  INCLUDE(custid, filler);

Note that the key list isn’t very wide (8 bytes for the orderdate column and 4 bytes for orderid), but the covered columns are fairly wide (4 bytes for custid and 200 bytes for filler to mimic a typical order row). So in total, each index row is over 200 bytes long.

Now, run the previous query again (with OFFSET 50 ROWS FETCH NEXT 10 ROWS ONLY) and examine the execution plan that Figure 3 shows.

Observe in the plan that the covering index is scanned in an ordered fashion, and the Top operator makes it stop after 60 rows are scanned. Out of the 60 rows scanned, the Top operator returns only the last 10. The number of logical reads reported for this plan is 7. This is a very efficient plan, provided that the offset value is very small—which in reality is often the case because users tend to ask for only the first few pages from the result. However, this plan isn’t necessarily the best possible plan when the offset value is large. For example, consider the following query:

SELECT orderid, orderdate, custid, filler
FROM dbo.Orders
ORDER BY orderdate DESC, orderid DESC
OFFSET 500000 ROWS FETCH NEXT 10 ROWS ONLY;

Recall that in our covering index the rows are quite wide—over 200 bytes. And even though in this query you care about only the last 10 rows, you still end up scanning the first half million wide rows. With about only 35 rows that fit in a page, the I/O cost of this plan is 13,546 logical reads. This plan might be much better than one without a good index in place, but is it the best you can get? Remember that typically there are multiple page requests involved in each paging session.

Covering indexes are intuitively perceived as better than noncovering indexes with the same key list, but here’s a case in which there’s at least the theoretical potential for the reverse to be true. Consider a nonclustered noncovering index defined on the keys (orderdate, orderid) and with no included columns. The index row size in such an index is less than one tenth of that in the covering index, and therefore SQL Server can fit over ten times the number of rows in each page. Then, one would hope that the optimizer will scan the first fetch_value + offset_value rows (500,000 + 10 in the last query’s case) in the index, and perform lookups only for the last offset_value rows (10 in our case). The I/O cost of such a strategy should be significantly lower than with a covering index—again, when offset_value is large and fetch_value is small. Let’s see if the optimizer does in fact apply this logic. First, run the following code to drop the covering index and create the noncovering one:

DROP INDEX dbo.Orders.idx_unc_od_oid_i_cid_filler;
CREATE UNIQUE INDEX idx_unc_od_oid ON dbo.Orders(orderdate, orderid);

Run the last query again (with OFFSET 500000 ROWS FETCH 10 ROWS ONLY), and examine the execution plan that Figure 4 shows.

Observe in the plan that the Top operator is applied after the Nested Loops operator that applied the Key Lookups. This means that the Key Lookup operator is applied 500,010 times before the filtering of the last 10, and this strategy results in 1,532,100 logical reads. A more efficient plan would be one in which the Top operator is applied after the index scan and before the lookups. This way, the lookups would be applied only fetch_value times (10 in our case) and not offset_value + fetch_value times (500,010), as is currently the case.

As I understand, enhancing the optimizer to support the suggested efficient plan isn’t a trivial thing, so the likelihood for it to happen isn’t high. Moreover, as mentioned, this is only an issue when offset_value is large; the more common case in paging is that it’s small.

In the meanwhile, if you do need to support a large offset_value, there’s a workaround. You can define a table expression based on a query that returns only the keys and applies the OFFSET/FETCH filtering. Then, in the outer query, you can use the CROSS APPLY operator to go after the rest of the attributes. The benefit of this solution is that the optimizer will apply the Key Lookups (clustered index seek operations) after the OFFSET/FETCH filtering. Here’s the complete query:

WITH CLKeys AS
(
  SELECT orderid, orderdate
  FROM dbo.Orders
  ORDER BY orderdate DESC, orderid DESC
  OFFSET 500000 ROWS FETCH FIRST 10 ROWS ONLY
)
SELECT K.*, O.*
FROM CLKeys AS K
  CROSS APPLY (SELECT custid, filler
       FROM dbo.Orders AS A
       WHERE A.orderid = K.orderid) AS O
ORDER BY orderdate DESC, orderid DESC;

Figure 5 shows the execution plan for this query. Observe that the Top operator is applied before the Nested Loops operator and that the Clustered Index Seek operator is invoked only 10 times. The I/O cost of this plan is reduced to 841 logical reads. That’s quite large savings compared with more than 1,500,000 reads. Let’s hope that the optimizer will be enhanced to produce this plan for the original query form without you needing to use the workaround.

Alternative to OFFSET/FETCH. The alternative to filtering rows using the OFFSET/FETCH filter is to assign rows with row numbers and filter the requested range of rows. For example, consider the following query:

SELECT orderid, orderdate, custid, filler
FROM dbo.Orders
ORDER BY orderdate DESC, orderid DESC
OFFSET 50 ROWS FETCH NEXT 10 ROWS ONLY;

Here’s the alternative to this query using row numbers:

WITH C AS
(
  SELECT orderid, orderdate, custid, filler,
    ROW_NUMBER() OVER(ORDER BY orderdate DESC, orderid DESC) AS rownum
  FROM dbo.Orders
)
SELECT *
FROM C
WHERE rownum BETWEEN 51 AND 60
ORDER BY rownum;

At the moment, there’s a nonclustered noncovering index on the sort columns. You already saw in Figure 4 how SQL Server optimized the query that uses OFFSET/FETCH. Figure 6 shows the plan for the query using the row numbers. As you can see, besides the addition of the Segment and Sequence Project operators that calculate the row numbers and involve very minor cost, the rest of the work is almost the same. It’s just that with OFFSET/FETCH, the Top operator handles both the offset (50) and the fetch (10) parts. With the row number filtering, the Top operator handles the filtering of the 50 + 10 = 60 rows, and an additional Filter operator handles the filtering of the requested range of row numbers 51 through 60. All in all, the work is very similar, so you shouldn’t expect a performance gain from the OFFSET/FETCH option. The apparent benefit of OFFSET/FETCH is that it doesn’t require the extra layer of the table expression that the solution that uses row numbers does; therefore, the query using OFFSET/FETCH is more concise.

Figure 6: Plan for query with row numbers

However, when the request is to apply the filtering in a partitioned manner (e.g., return orders 6 through 10—based on orderdate desc, orderid desc ordering) for each customer, there’s a difference between the two approaches, and each performs better in different circumstances. The approach using row numbers works best when the partitioning column has low density (lots of small partitions), and the approach with OFFSET/FETCH excels with dense partitions (few large partitions); examples will follow shortly.

First, the recommendation here is to create a supporting covering index. Run the following code to create one:

CREATE UNIQUE INDEX idx_unc_cid_odD_oidD
  ON dbo.Orders(custid, orderdate DESC, orderid DESC)
  INCLUDE(filler);

If there’s a large number of partitions with a small number of rows in each (i.e., the density of the partitioning column is low), the solution that relies on the ROW_NUMBER function is the optimal one because it scans the base data only once. Such is our case because the code in Listing 1 populated the Orders table with 50,000 customers, each with an average of 20 rows. Here’s the solution code (Figure 7 shows the plan):

WITH C AS
(
  SELECT orderid, orderdate, custid, filler,
    ROW_NUMBER() OVER(PARTITION BY custid
          ORDER BY orderdate DESC, orderid DESC) AS rownum
  FROM dbo.Orders
)
SELECT *
FROM C
WHERE rownum BETWEEN 6 AND 10
ORDER BY custid, rownum;

Figure 7: Plan for partitioned query with row numbers

Most of the cost of this plan is associated with the scan of the leaf level of the covering index. The total I/O cost here is 27,097 reads, which is the number of pages in the leaf of the index. As I mentioned, this strategy works best for nondense partitions.

With dense partitions, a better strategy would be to query the Customers table and apply (using the CROSS APPLY operator) a query with an OFFSET/FETCH filter to each customer, like so (Figure 8 shows the plan):

SELECT C.custid, A.*
FROM dbo.Customers AS C
  CROSS APPLY (SELECT orderid, orderdate, filler
       FROM dbo.Orders AS O
       WHERE O.custid = C.custid
       ORDER BY orderdate DESC, orderid DESC
       OFFSET 5 ROWS FETCH NEXT 5 ROWS ONLY) AS A
ORDER BY custid, orderdate DESC, orderid DESC;

Figure 8: Plan for partitioned query with APPLY

As you can see, the plan scans the Customers table, and for each customer it performs a seek in the covering index on Orders to the beginning of the customer partition in the index, scans the first 10 rows, and returns the last 5 of those. Assuming there were only a small number of customers (not our case at the moment), this plan would generate much less I/O than the one for the solution based on row numbers, which scans the entire leaf of the covering index on Orders.

What’s Next?

As you probably noticed, the OFFSET/FETCH feature is pretty straightforward to use, which is probably its greatest benefit. I described the fundamentals of this feature and covered its optimization. I provided recommendations regarding indexing, and I explained the different plans you’d get depending on whether the index is a covering one or not. I also covered an alternative method to filter the desired range of rows using row numbers and showed that it’s more verbose than code using OFFSET/FETCH. But I also mentioned that the approach using row numbers performs better when you need to apply the filter in a partitioned manner (such as per customer) and the partitioning element has low density.

There are several other interesting aspects of working with OFFSET/FETCH that I haven’t covered in this article; I’ll cover those next month.

Comments

Plain text