SQL Server 2016 SP1 Standard Edition Enhancements

I seldom get excited about service packs but the changes released with SQL Server 2016 SP1 are the most significant I’ve seen in a SQL Server service pack in 20+ years. Microsoft announced this week at the Microsoft Connect(); developer’s conference that SQL Server 2016 SP1, which is available for download immediately, allows features previously available only in Enterprise/Developer Editions to be used in the lesser Standard, Web, Express, and LocalDB Editions too. Features like table partitioning, In-Memory OLTP, and columnstore are now options for developers and DBAs using SQL Server Standard Edition and even the free Express Edition in production. See SQL Server 2016 Service Pack 1 (SP1) released !!! for the complete matrix of programmability features by edition along with other cool SP1 information.

The implications are huge now that SQL Server has the same programmability surface area among editions. The choice of production edition can be made independently based on operational needs rather than programmability features. Developers can use a free edition (i.e. LocalDB, Express or Developer) without fear that a feature won’t be available in production as long as prod is running SQL Server 2016 SP1 or greater. DBAs can now choose the appropriate edition for production based on other considerations like advanced high availability, TDE, and Auditing as well as performance features like higher memory limits, more cores, and advanced scanning. This separation of concerns avoids the need to lock in the production edition early in the application lifecycle, making development easier and production implementation more flexible.

Real World Use Case Scenario
I work with an ISV with hundreds of customers running a mix of Standard and Enterprise Edition. Their needs vary widely and SQL Server Enterprise Edition is not an option for some due to budget constraints. Some tables are quite large, so partitioning is required for manageability and, for their reporting workload, partitioning also improves performance of large scans due to partition elimination. The ugly part, though, was that table partitioning (and/or columnstore) is the right tool for the job but was not an option for customers on Standard Edition.

The ISV initially compromised and used view partitioning instead of table partitioning so that the same code would run regardless of edition. Although that provided the expected manageability benefits, there were some downsides. Compilation time and query plan complexity increased significantly as the number of partitioned view member tables grew. This sometimes resulted in poor query plans against very large tables and especially impacted the larger and most valued customers, most of whom were running Enterprise Edition.

To address the problem before SQL Server 2016 SP1, the ISV added conditional code to the application so that either view or table partitioning could be used depending on the SQL Server edition. This wasn’t ideal as it added code complexity and doubled the number of QA test cases for application features that performed partition maintenance. However, since the resultant benefits for their larger customers on Enterprise Edition were quite significant, the additional costs of development and testing were well-justified.

Now that table partitioning is available in SQL Server 2016 SP1 Standard Edition, they plan to require SQL Server 2016 SP1 (or later) going forward, use table partitioning unconditionally, and perhaps introduce usage of other features like columnstore that were previously Enterprise only. Not only will this simplify the code base and test cases, but customers on Standard Edition will also be happier with their experience and can upgrade to Enterprise if they so choose without reinstalling or reconfiguring the application. It will of course take some time before all their customers upgrade to the latest product version and SQL 2016 SP1+, but the future is much brighter now.
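
As an illustration of such a version check, the query below is a minimal sketch (not the ISV’s actual code) that an installer or startup routine could run to confirm the minimum build and report the edition before relying on SP1 programmability features:

SELECT
      SERVERPROPERTY('ProductVersion') AS ProductVersion --13.0.4001.0 or later indicates SQL Server 2016 SP1
    , SERVERPROPERTY('Edition') AS Edition;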

Perform Due Diligence
If you are new to features previously available only in Enterprise Edition, I suggest you perform due diligence before using these features. Memory-optimized features like columnstore and In-Memory OLTP require additional physical memory and insufficient memory with memory-optimized features will be a production show-stopper. Make sure your hardware is sized appropriately regardless of edition and, in the case of editions other than Enterprise or Developer, that memory requirements don’t exceed the maximum capacity limits for that edition. Although very powerful, In-Memory OLTP is a fundamentally different paradigm than you might be accustomed to regarding transactional behavior and isolation levels. Be sure you fully understand these features before using them in development or production.

Summary
I hope these changes are enough motivation for you to consider upgrading to SQL Server 2016 SP1, especially if you are running Standard Edition or are currently on an older SQL Server version. Together with the fact that SQL Server 2016 just runs faster, the time and effort spent in upgrading is a solid investment that will pay dividends regardless of edition.

SQL Server 2016 and Azure SQL Database V12 Breaking Change

This post is to get the word out about a breaking change to datetime conversion and comparison behavior in SQL Server 2016 and Azure SQL Database V12. This change hasn’t been documented as of this writing in the Breaking Changes to Database Engine Features in SQL Server 2016 topic in the SQL Server Books Online.

In short, conversion from datetime to a higher precision temporal data type (datetime2, datetimeoffset, or time) may yield a different, but more accurate, time value than in prior versions. Also, predicates involving datetime consider the full precision of the raw datetime internal value instead of the time value rounded to the nearest millisecond. These changes in conversion and comparison behavior may affect existing applications and are not intuitive unless one understands the underlying datetime data type implementation.

Background
You may be aware that the accuracy of datetime is limited to 1/300 of a second. This is because values are internally an 8-byte structure consisting of 2 separate 32-bit integers, one with the number of day units since 1900-01-01 and the other with the number of 1/300 second interval units since midnight. The 1/300 second unit interval limits the time accuracy to 3.33333… milliseconds and the milliseconds value will be a repeating decimal when time interval units are not evenly divisible by 3. The raw decimal value is rounded to a scale of 3 in accordance with the fixed datetime precision of 3, resulting in a displayed millisecond value that always ends in 0, 3, or 7.
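
The internal structure is easy to see by casting a datetime value to binary(8). The query below is a quick illustration (the values in the comments are what I expect for this example); the first 4 bytes are the day count and the last 4 bytes are the count of 1/300 second intervals since midnight:

DECLARE @DateTime datetime = '2016-01-01T00:00:00.003';
SELECT
      CAST(@DateTime AS binary(8)) AS RawValue --0x0000A58000000001
    , CAST(SUBSTRING(CAST(@DateTime AS binary(8)), 1, 4) AS int) AS DaysSince19000101 --42368
    , CAST(SUBSTRING(CAST(@DateTime AS binary(8)), 5, 4) AS int) AS ThirdOfSecondIntervals; --1 interval = 1/300 second, about 3.333ms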

Pre-SQL Server 2016 Behavior
Before SQL Server 2016, conversion from datetime to another temporal type used the source datetime value after it was rounded to the nearest millisecond, which truncated repeating decimal fractional milliseconds. The rounded value was then rounded again according to the target type precision. When the target type precision was greater than 3, the time was extended to the target type precision with insignificant trailing zeros, resulting in zero for the sub-millisecond value.

DECLARE @DateTime datetime = '2016-01-01T00:00:00.007';
SELECT CAST(@DateTime AS datetime2(0)); --2016-01-01 00:00:00
SELECT CAST(@DateTime AS datetime2(1)); --2016-01-01 00:00:00.0
SELECT CAST(@DateTime AS datetime2(2)); --2016-01-01 00:00:00.01
SELECT CAST(@DateTime AS datetime2(3)); --2016-01-01 00:00:00.007
SELECT CAST(@DateTime AS datetime2(4)); --2016-01-01 00:00:00.0070
SELECT CAST(@DateTime AS datetime2(5)); --2016-01-01 00:00:00.00700
SELECT CAST(@DateTime AS datetime2(6)); --2016-01-01 00:00:00.007000
SELECT CAST(@DateTime AS datetime2(7)); --2016-01-01 00:00:00.0070000

Also, when datetime was compared to another temporal type, the rounded value was used. The scripts below show that the equality predicate is true after the datetime value is converted to datetime2.

--This script prints EQUAL predicate is true
DECLARE @DateTime datetime = '2016-01-01T00:00:00.003';
DECLARE @DateTime2 datetime2(7) = @DateTime;
IF @DateTime = @DateTime2 PRINT 'EQUAL predicate is true' ELSE PRINT 'EQUAL predicate is not true';
IF @DateTime < @DateTime2 PRINT 'LESS THAN predicate is true' ELSE PRINT 'LESS THAN predicate is not true';
IF @DateTime > @DateTime2 PRINT 'GREATER THAN predicate is true' ELSE PRINT 'GREATER THAN predicate is not true';
GO
--This script prints EQUAL predicate is true
DECLARE @DateTime datetime = '2016-01-01T00:00:00.007';
DECLARE @DateTime2 datetime2(7) = @DateTime;
IF @DateTime = @DateTime2 PRINT 'EQUAL predicate is true' ELSE PRINT 'EQUAL predicate is not true';
IF @DateTime < @DateTime2 PRINT 'LESS THAN predicate is true' ELSE PRINT 'LESS THAN predicate is not true';
IF @DateTime > @DateTime2 PRINT 'GREATER THAN predicate is true' ELSE PRINT 'GREATER THAN predicate is not true';
GO

SQL Server 2016 Behavior Change
SQL Server 2016 and Azure SQL Database V12 use the raw datetime internal value without rounding during conversion to another temporal type. The value is rounded only once during conversion, to the target type precision. The end result will be the same as before SQL Server 2016 when the target type precision is 3 or less. However, the converted value will be different when the target type precision is greater than 3 and the internal time unit interval is not evenly divisible by 3 (i.e. the rounded source datetime millisecond value ends in 3 or 7). Note the non-zero microseconds and nanoseconds in the script results below and that rounding is based on the target type precision rather than the source.

DECLARE @DateTime datetime = '2016-01-01T00:00:00.003';
SELECT CAST(@DateTime AS datetime2(0)); --2016-01-01 00:00:00
SELECT CAST(@DateTime AS datetime2(1)); --2016-01-01 00:00:00.0
SELECT CAST(@DateTime AS datetime2(2)); --2016-01-01 00:00:00.03
SELECT CAST(@DateTime AS datetime2(3)); --2016-01-01 00:00:00.003
SELECT CAST(@DateTime AS datetime2(4)); --2016-01-01 00:00:00.0033
SELECT CAST(@DateTime AS datetime2(5)); --2016-01-01 00:00:00.00333
SELECT CAST(@DateTime AS datetime2(6)); --2016-01-01 00:00:00.003333
SELECT CAST(@DateTime AS datetime2(7)); --2016-01-01 00:00:00.0033333
GO
DECLARE @DateTime datetime = '2016-01-01T00:00:00.007';
SELECT CAST(@DateTime AS datetime2(0)); --2016-01-01 00:00:00
SELECT CAST(@DateTime AS datetime2(1)); --2016-01-01 00:00:00.0
SELECT CAST(@DateTime AS datetime2(2)); --2016-01-01 00:00:00.01
SELECT CAST(@DateTime AS datetime2(3)); --2016-01-01 00:00:00.007
SELECT CAST(@DateTime AS datetime2(4)); --2016-01-01 00:00:00.0067
SELECT CAST(@DateTime AS datetime2(5)); --2016-01-01 00:00:00.00667
SELECT CAST(@DateTime AS datetime2(6)); --2016-01-01 00:00:00.006667
SELECT CAST(@DateTime AS datetime2(7)); --2016-01-01 00:00:00.0066667
GO

This behavior change provides a more accurate converted value but may break applications that expect the converted value to be the same as the rounded datetime value as was the case before SQL Server 2016.

Be aware that the full raw datetime precision (instead of the rounded value) is also used when evaluating predicates involving a datetime type. The full precision of both arguments is used, causing the equality predicate to evaluate to false in both scripts below. The greater than predicate is true in the first script and the less than predicate is true in the second:

--This script prints GREATER THAN predicate is true
DECLARE @DateTime datetime = '2016-01-01T00:00:00.003';
DECLARE @DateTime2 datetime2(7) = @DateTime;
IF @DateTime = @DateTime2 PRINT 'EQUAL predicate is true' ELSE PRINT 'EQUAL predicate is not true';
IF @DateTime < @DateTime2 PRINT 'LESS THAN predicate is true' ELSE PRINT 'LESS THAN predicate is not true';
IF @DateTime > @DateTime2 PRINT 'GREATER THAN predicate is true' ELSE PRINT 'GREATER THAN predicate is not true';
GO
--This script prints LESS THAN predicate is true
DECLARE @DateTime datetime = '2016-01-01T00:00:00.007';
DECLARE @DateTime2 datetime2(7) = @DateTime;
IF @DateTime = @DateTime2 PRINT 'EQUAL predicate is true' ELSE PRINT 'EQUAL predicate is not true';
IF @DateTime < @DateTime2 PRINT 'LESS THAN predicate is true' ELSE PRINT 'LESS THAN predicate is not true';
IF @DateTime > @DateTime2 PRINT 'GREATER THAN predicate is true' ELSE PRINT 'GREATER THAN predicate is not true';
GO

To provide insight into why the comparisons result in greater than and less than respectively, the script below shows the nanoseconds value of the compared data types:

DECLARE @DateTime datetime = '2016-01-01T00:00:00.003';
DECLARE @DateTime2 datetime2(7) = @DateTime;
SELECT DATEPART(nanosecond, @DateTime); --3333333
SELECT DATEPART(nanosecond, @DateTime2); --3333300
GO
DECLARE @DateTime datetime = '2016-01-01T00:00:00.007';
DECLARE @DateTime2 datetime2(7) = @DateTime;
SELECT DATEPART(nanosecond, @DateTime); --6666666
SELECT DATEPART(nanosecond, @DateTime2); --6666700
GO

The datetime2 type is accurate only to 100 nanoseconds, whereas the raw datetime value extends to the nanosecond (and beyond) because the theoretical precision of a repeating decimal is unlimited. The implication is that a datetime type with a repeating decimal value will never compare equally with any temporal type except datetime.

Datetime conversion and comparison behavior is controlled by the database compatibility level. Databases at the SQL Server 2016 compatibility level (130) use the new behavior; the legacy behavior is used at lower levels.
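
For example, the statement below (the database name is hypothetical) reverts a database to the SQL Server 2014 compatibility level to retain the legacy conversion and comparison behavior:

ALTER DATABASE MyDatabase SET COMPATIBILITY_LEVEL = 120; --SQL Server 2014 level uses the legacy datetime behavior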

Summary
These datetime behavior changes have the benefit of improved accuracy and performance of datetime conversion/comparison. Affected applications can use a pre-SQL Server 2016 database compatibility level until they can be remediated.

I recommend avoiding direct comparison of datetime to other temporal types going forward. Instead, convert the datetime value to the type being compared and use the converted value in the predicate. It’s generally best practice to match data types whenever possible for performance and to avoid ambiguity.
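
As a sketch of that recommendation, this variation of the earlier script casts the datetime side explicitly so that both sides of the predicate are the same type and precision:

DECLARE @DateTime datetime = '2016-01-01T00:00:00.003';
DECLARE @DateTime2 datetime2(7) = @DateTime;
--convert explicitly so both sides of the predicate are datetime2(7)
IF CAST(@DateTime AS datetime2(7)) = @DateTime2
    PRINT 'EQUAL predicate is true'
ELSE
    PRINT 'EQUAL predicate is not true';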

SQL Server TVP Performance Gotchas

Table-valued parameters (TVPs) have important considerations that developers and DBAs need to be aware of. It is essential that application code specify the proper data type and length for TVP columns in order to achieve optimal performance and reduce unnecessary overhead. Additionally, a trace (Extended Events, server-side SQL Trace, Profiler) that captures RPC events of an inappropriately defined TVP can not only exacerbate performance issues but also affect the stability of the SQL Server instance in some cases.

The graph below summarizes the impact the application code max column length specification can have on performance, without and with a trace running. All tests used the same table type of 10 varchar(50) columns and a 10,000 row TVP passed via a DataTable object of 10 string columns. The only variables were the max column length specified by the app code and whether a trace was running on the database server. The client application was run on a different machine than the database server and elapsed time was measured by the application.

TVP Performance Comparison
Figure 1: Impact of TVP maximum column length specification with and without tracing

The average duration was 113ms without a trace running when the app code used the default -1 max column length. However, when max column length 50 was specified (matching the varchar(50) column of the table type), the average duration dropped significantly to 75ms. The trivial code change of specifying the string column max length of 50 improved performance by 33%.

I then ran the same pair of tests while a trace captured the RPC completed events. The average duration of the default max length test increased from 113ms to 9,324ms with the trace running, over an 80x degradation in performance! The test with the explicit 50 character max length was not impacted nearly as much, increasing from 75ms to 89ms (which is tolerable, IMHO).

TVP Internals
Under the hood, TVP data are passed to SQL Server over the Tabular Data Stream (TDS) protocol. The client API sends TVP column meta-data (data type, length, and so on) to SQL Server, followed by data rows with each column in the native format matching the preceding meta-data. SQL Server uses the provided column meta-data to prepare and fill buffers for efficient processing on the server side, leveraging native types to eliminate parsing overhead similarly to other parameterized queries.

Before a query or stored proc with a TVP starts executing, SQL Server creates a table in tempdb with the same schema as the parameter table type and uses bulk insert internally to efficiently load the table with TVP rows streamed by the client application. The size of a TVP is constrained only by available tempdb storage. SQL Server executes the query/proc after the TVP temp table is loaded and the parameterized T-SQL query/proc can then use the TVP data.
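
For readers who haven’t used TVPs before, below is a minimal server-side sketch of a table type and a stored procedure with a TVP parameter. The object names are hypothetical and this is not the 10-column type used in my tests:

CREATE TYPE dbo.MyTableType AS TABLE(
      Name varchar(50) NOT NULL
    , Amount int NOT NULL
);
GO

CREATE PROCEDURE dbo.MyProcWithTvp
      @MyTvp dbo.MyTableType READONLY --TVP parameters must be declared READONLY
AS
SET NOCOUNT ON;
--the TVP can be queried like any other table once it is loaded
SELECT COUNT(*) AS TvpRowCount FROM @MyTvp;
GO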

The TVP columns provided by the client application do not have to match the schema of the target table type; SQL Server implicitly converts TVP values to match the target table type column when data types differ. Although not optimal, implicit conversion is generally not a major factor in overall TVP performance.

The application-specified TVP max column length can impact performance significantly, and in some cases dramatically, as illustrated by the performance tests shown earlier. SQL Server prepares to receive TVP data up to the max length specified by the client application rather than the defined size of the target table type column. When the specified max length of a variable length column exceeds the 8000 byte tipping point, SQL Server uses a different code path to allow for large object (LOB) values up to 2GB. Unless the table type actually contains LOB columns (varchar(MAX), nvarchar(MAX)), database server resources are wasted unnecessarily when an inappropriate max column length is specified.

LOB values are especially problematic when a trace captures the RPC completed event of a TVP query. Tracing uses memory from the OBJECTSTORE_LBSS memory pool to build trace records that contain TVP LOB values. From my observations of the sys.dm_os_memory_clerks DMV, each LOB cell of a TVP requires about 8K during tracing regardless of the actual value length. This memory adds up very quickly when many rows and LOB columns are passed via a TVP with a trace running. For example, the 10,000 row TVP with 10 LOB columns used in the earlier test required over 800MB of memory for a single trace record. A large number of TVP LOB cells and/or concurrent TVP queries can cause queries to fail with insufficient memory errors. In extreme cases, the entire instance can become unstable and even crash due to tracing of TVP queries.

Specifying Proper TVP Parameter Column Meta-Data
A SQL Server development best practice has long been to use strongly-typed parameters with attention to detail regarding the parameter data type and length such that they are consistent with the types on the server. This practice improves performance by avoiding implicit data type conversions, promotes sargable expressions, avoids unnecessary procedure cache bloat, and inherently validates data for proper typing on the client before it is sent to the database server. With scalar parameters, one need only specify the correct SqlDbType along with the proper length and precision/scale (avoiding the AddWithValue method to add parameters) and all is well in the world.

TVP parameters require additional column meta-data not applicable to scalar parameters. The parameter data type of a TVP in .NET is always SqlDbType.Structured. The additional TVP column meta-data is inferred from the supplied parameter value, which may be a DbDataReader, IEnumerable<SqlDataRecord>, or DataTable object. These objects inherently contain column meta-data and methods to enumerate rows, which the SqlClient API uses to send the TVP to SQL Server.

DataTable objects are most commonly used as TVP values. DataTables are easy to use and can serve as containers for data beyond just TVP usage. But unlike DbDataReader and IEnumerable<SqlDataRecord> objects, a big gotcha with a DataTable is that the default column data type is String with a maximum length of -1 (a 2GB LOB). This is the .NET equivalent of the SQL Server nvarchar(MAX) data type and has many insidious and negative implications with a TVP. First, values of types other than string (DateTime, Integer, GUID, etc.) that are added to a DataTable string column will be converted to string. Consequently, using the default DataTable column string data type for non-string types will:

• increase client memory requirements compared to more compact native types
• incur conversion overhead
• prevent strong-typed data validation on client side
• require format-aware conversion of date and datetime values to strings
• require a period as the decimal separator
• increase network usage compared with smaller native types

When String is the proper column data type, developers must be especially mindful of the max length specification when the DataTable is used as the TVP value. Strings in Windows and .NET are Unicode, requiring 2 bytes per character. This means a max length of over 4000 characters will cross the 8000 byte threshold for LOB data on the server side regardless of the table type on the server. Avoid using long string columns in TVPs when many rows are passed and never use the default -1 length unless a MAX type is actually intended.
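
The C# sketch below (using the hypothetical dbo.MyTableType and dbo.MyProcWithTvp from the earlier server-side sketch) shows a DataTable with explicitly typed columns and a max length matching the table type column, along with the Structured parameter and its TypeName:

using System.Data;
using System.Data.SqlClient;

class TvpMetaDataExample
{
    static void Main()
    {
        var table = new DataTable();

        //specify the exact column type and max length instead of the default String/-1 (nvarchar(MAX))
        var nameColumn = table.Columns.Add("Name", typeof(string));
        nameColumn.MaxLength = 50; //matches varchar(50) in the table type

        table.Columns.Add("Amount", typeof(int)); //native type, no string conversion

        table.Rows.Add("Widget", 1);

        using (var connection = new SqlConnection(@"Data Source=MyServer;Initial Catalog=MyDatabase;Integrated Security=SSPI"))
        using (var command = new SqlCommand("dbo.MyProcWithTvp", connection))
        {
            command.CommandType = CommandType.StoredProcedure;

            var tvpParameter = command.Parameters.Add("@MyTvp", SqlDbType.Structured);
            tvpParameter.TypeName = "dbo.MyTableType"; //table type on the server
            tvpParameter.Value = table;

            connection.Open();
            command.ExecuteNonQuery();
        }
    }
}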

DBAs who support applications that use TVPs should be aware of the repercussions tracing can have on performance and SQL Server memory. Avoid capturing TVP RPC completed events of large TVP requests, if possible. When tracing TVP RPC completed events, monitor the OBJECTSTORE_LBSS memory pool for excessive memory usage.
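
A query along these lines is a simple way to watch that memory clerk while a trace is running (the pages_kb column is available in SQL Server 2012 and later):

SELECT type, name, pages_kb
FROM sys.dm_os_memory_clerks
WHERE type = N'OBJECTSTORE_LBSS';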

Acknowledgements
I’d like to thank SQL Server MVP Ola Hallengren for suggesting that I write this article.

ORDER BY Is Required

I often see folks assume rows will be returned in order by the clustered index when ORDER BY is not specified in the query. This is despite the fact that the SQL Server Books Online ORDER BY topic specifically states “The order in which rows are returned in a result set are not guaranteed unless an ORDER BY clause is specified.”

For those who want to save a few keystrokes and omit the needed ORDER BY, here’s one example that shows rows may be returned in an order other than the clustered index key order.

USE tempdb;
GO

CREATE DATABASE Demo
	ON
		  (NAME='Demo_Primary1', FILENAME='D:\SqlDataFiles\Demo_Primary1.mdf', SIZE=10MB)
		, (NAME='Demo_Primary2', FILENAME='D:\SqlDataFiles\Demo_Primary2.ndf', SIZE=10MB)
	LOG ON
		  (NAME='Demo_Log', FILENAME='L:\SqlLogFiles\Demo_Log.ldf', SIZE=1MB);
GO

USE Demo;
GO

CREATE TABLE OrderTest
(
	ID int,
	Name VARCHAR(50),
	OtherData char(5000)
);
CREATE CLUSTERED INDEX cdx_OrderTest ON OrderTest(ID);
GO

--load 512 odd numbers
WITH 
	 t4 AS (SELECT n FROM (VALUES(0),(0),(0),(0)) t(n))
	,t1K AS (SELECT ROW_NUMBER() OVER (ORDER BY (a.n)) AS num
		FROM t4 AS a CROSS JOIN t4 AS b CROSS JOIN t4 AS c CROSS JOIN t4 AS d CROSS JOIN t4 AS e)
INSERT INTO OrderTest WITH(TABLOCKX)
SELECT num, 'test', 'test'
FROM t1K
WHERE num % 2 = 1;
GO

----load 512 even numbers
WITH 
	 t4 AS (SELECT n FROM (VALUES(0),(0),(0),(0)) t(n))
	,t1K AS (SELECT ROW_NUMBER() OVER (ORDER BY (a.n)) AS num
		FROM t4 AS a CROSS JOIN t4 AS b CROSS JOIN t4 AS c CROSS JOIN t4 AS d CROSS JOIN t4 AS e)
INSERT INTO OrderTest WITH(TABLOCKX)
SELECT num, 'test', 'test'
FROM t1K
WHERE num % 2 = 0;
GO

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
GO

SELECT * FROM dbo.OrderTest;
/*
ID	Name	OtherData
2	test	test
4	test	test
6	test	test
8	test	test
17	test	test
19	test	test
21	test	test
23	test	test
25	test	test
27	test	test
...
*/
SELECT * FROM dbo.OrderTest ORDER BY ID;
/*
ID	Name	OtherData
1	test	test
2	test	test
3	test	test
4	test	test
5	test	test
6	test	test
7	test	test
8	test	test
9	test	test
10	test	test
...
*/
GO

The reason the query without the ORDER BY returns rows in a different order than the clustered index logical order is that SQL Server chose to scan the table in physical allocation order using the IAM (Index Allocation Map) instead of following the clustered index linked list. For details of the allocation and page linkage of this table, you can use the undocumented (don’t use in prod) sys.dm_db_database_page_allocations table-valued function in SQL 2012 and later:

SELECT * 
FROM sys.dm_db_database_page_allocations(DB_ID(), OBJECT_ID(N'OrderTest'), 1, 1, 'DETAILED')
WHERE page_type_desc = 'DATA_PAGE';

The IAM scan was done here because this query has no ordering requirement, the allocation order scan was deemed more efficient, and it is permissible under the READ UNCOMMITTED isolation level because transactional read consistency isn’t required. Other factors can also affect the ordering of results, including available indexes, execution plan operators, parallelism, and other concurrent activity.

Summary
Remember that SQL Server strives to execute queries as efficiently as possible as long as it adheres to the data contract. The chosen plan and storage engine internals that vary by SQL Server version and edition will influence ordering of results. The ordering of rows is by happenstance unless ORDER BY is specified.

Tiered Storage Partition Copy

In my last tiered storage sliding window post, I shared a sliding window script that also moves an older partition to a different filegroup as part of a tiered storage strategy. This approach allows one to keep actively used partitions on the fastest storage available while keeping older less frequently used data on less expensive storage. That version of the script moves data efficiently using a staging table and CREATE INDEX…DROP EXISTING but the downside is the data being moved is unavailable for querying from the time the partition is switched out of the main table until it is switched back in. Consequently, the maintenance needs to be scheduled during a window where data in the partition being moved isn’t needed.

This follow-up article shows an alternative copy technique instead. Data in the partition being moved is left in the main table until after the copy completes. Once data are copied to the different filegroup, partition maintenance is performed to switch out the original partition and switch in the copied data, now on the older data filegroup. This allows data to remain online except during the final partition switches, which are fast meta-data operations. It is assumed that the data in the partition being moved is read-only during this process.

Copy Data to Slower Storage
Unfortunately, neither ALTER INDEX nor CREATE INDEX…DROP EXISTING provide the capability to repartition an individual partition of a table/index. ALTER INDEX allows one to target a specific partition during a REORGANIZE or REBUILD but not change the filegroup or partition scheme in the process. CREATE INDEX…DROP EXISTING allows a filegroup or partition scheme specification but applies to the entire index; individual partitions cannot be specified. Consequently, DML (INSERT…SELECT) must be used instead of DDL in order to keep data online during the copy process.

I’ll assume you’ve already slid the window by purging expired data and preparing for new data as detailed in my last article with the only task remaining to move an older partition to a slower storage filegroup. The following steps will keep the data being moved online during the potentially long copy process. The offline operations performed in steps 3-6 are metadata operations that will complete quickly, once an exclusive table lock (to avoid deadlocks) can be granted. Note that all partition functions and schemes are identical and the staging table is empty before these steps are performed.

1) Move the empty staging table partition to the older data filegroup
2) Load data into the moved partition of the staging table
3) Switch the copied data out of the main table into a second staging table
4) Switch the adjacent partition out of the main table into the first staging table
5) Move the main table partition to the older data filegroup
6) Switch the partitions from the first staging table back into the main table

The complete DDL and script are at the end of this article. The script differs from my last article (which I’ll assume you’ve already perused) by the addition of a second staging table and this alternative move technique. Below are the individual steps.

Move the empty staging table partition to the older data filegroup
Remember, the staging table is partitioned using the secondary partition function/scheme so that it can be repartitioned independently of the main table. The first task is to move the partition of the unused secondary partition scheme to the slower storage filegroup using MERGE and SPLIT:

1) MERGE the secondary partition function boundary of the moved month to remove it from the secondary partition function and scheme:

ALTER PARTITION FUNCTION PF_12MonthSlidingWindow_Secondary()
    MERGE RANGE (@MonthStartDateToMoveFromCurrentToOld);

2) Set the secondary partition scheme NEXT USED to slower storage filegroup:

ALTER PARTITION SCHEME PS_12MonthSlidingWindow_Secondary
    NEXT USED FG_OlderData;

3) SPLIT the secondary partition function for the moved month to recreate the partition on the slower storage filegroup:

ALTER PARTITION FUNCTION PF_12MonthSlidingWindow_Secondary()
    SPLIT RANGE (@MonthStartDateToMoveFromCurrentToOld);

Load the moved partition of the staging table
Now that the staging table partition is on the FG_OlderData filegroup, copy data for the partition to be moved into a staging table using INSERT…SELECT. Note that the WHERE clause specifies the source partition boundaries:

INSERT INTO dbo.SlidingWindowTable_Staging WITH(TABLOCKX) (PartitioningColumn, OtherKeyColumn, OtherData)
    SELECT
          PartitioningColumn
        , OtherKeyColumn
        , OtherData
    FROM dbo.SlidingWindowTable
    WHERE
            PartitioningColumn >= @MonthStartDateToMoveFromCurrentToOld
        AND PartitioningColumn < DATEADD(month, 1, @MonthStartDateToMoveFromCurrentToOld);

Switch the copied data out of the main table into a second staging table
The second staging table is used to permanently remove data from the original partition, still on the NewerData filegroup.

TRUNCATE TABLE dbo.SlidingWindowTable_Staging2;
ALTER TABLE dbo.SlidingWindowTable
    SWITCH PARTITION $PARTITION.PF_12MonthSlidingWindow(@MonthStartDateToMoveFromCurrentToOld)
    TO dbo.SlidingWindowTable_Staging2 PARTITION $PARTITION.PF_12MonthSlidingWindow(@MonthStartDateToMoveFromCurrentToOld);

TRUNCATE TABLE dbo.SlidingWindowTable_Staging2;

Switch the adjacent partition out of the main table into the first staging table
Before we move the main table partition (now empty) to the OlderData filegroup, the adjacent partition is also switched out. This is technically not required but is done to follow Microsoft's recommended best practice of merging only empty partitions.

ALTER TABLE dbo.SlidingWindowTable
    SWITCH PARTITION $PARTITION.PF_12MonthSlidingWindow(DATEADD(month, -1, @MonthStartDateToMoveFromCurrentToOld))
    TO dbo.SlidingWindowTable_Staging PARTITION $PARTITION.PF_12MonthSlidingWindow_Secondary(DATEADD(month, -1, @MonthStartDateToMoveFromCurrentToOld));

Move the main table partition to the older data filegroup
The main table partition is moved from the NewerData filegroup to OlderData with MERGE and SPLIT. Keep in mind that the adjacent partition on the OlderData filegroup is empty. Data for both of these partitions are in the staging table.

--remove partition from main table
ALTER PARTITION FUNCTION PF_12MonthSlidingWindow()
    MERGE RANGE (@MonthStartDateToMoveFromCurrentToOld);

--create new empty partition on old data filegroup for moved data
ALTER PARTITION SCHEME PS_12MonthSlidingWindow
    NEXT USED FG_OlderData;
ALTER PARTITION FUNCTION PF_12MonthSlidingWindow()
    SPLIT RANGE (@MonthStartDateToMoveFromCurrentToOld);

Switch the partitions from the first staging table back into the main table
Both main and staging table partition schemes are now identical. The partitions can now be switched back into the main table.

ALTER TABLE dbo.SlidingWindowTable_Staging
    SWITCH PARTITION $PARTITION.PF_12MonthSlidingWindow_Secondary(DATEADD(month, -1, @MonthStartDateToMoveFromCurrentToOld))
    TO dbo.SlidingWindowTable PARTITION $PARTITION.PF_12MonthSlidingWindow(DATEADD(month, -1, @MonthStartDateToMoveFromCurrentToOld));
ALTER TABLE dbo.SlidingWindowTable_Staging
    SWITCH PARTITION $PARTITION.PF_12MonthSlidingWindow_Secondary(@MonthStartDateToMoveFromCurrentToOld)
    TO dbo.SlidingWindowTable PARTITION $PARTITION.PF_12MonthSlidingWindow(@MonthStartDateToMoveFromCurrentToOld);

Scripts
Below are the complete example scripts to implement this copy sliding window technique. My last tiered storage sliding window post also includes the CREATE DATABASE, including filegroups.

--main table partition function (before start of new month)
CREATE PARTITION FUNCTION PF_12MonthSlidingWindow (datetime2(0))
AS RANGE RIGHT FOR VALUES (
	  '2014-11-01T00:00:00' --current month -12 on OlderData filegroup
	, '2014-12-01T00:00:00' --current month -11 on OlderData filegroup
	, '2015-01-01T00:00:00' --current month -10 on OlderData filegroup
	, '2015-02-01T00:00:00' --current month -9 on OlderData filegroup
	, '2015-03-01T00:00:00' --current month -8 on OlderData filegroup
	, '2015-04-01T00:00:00' --current month -7 on OlderData filegroup
	, '2015-05-01T00:00:00' --current month -6 on OlderData filegroup
	, '2015-06-01T00:00:00' --current month -5 on OlderData filegroup
	, '2015-07-01T00:00:00' --current month -4 on OlderData filegroup
	, '2015-08-01T00:00:00' --current month -3 on NewerData filegroup
	, '2015-09-01T00:00:00' --current month -2 on NewerData filegroup
	, '2015-10-01T00:00:00' --current month -1 on NewerData filegroup
	, '2015-11-01T00:00:00' --current month on NewerData filegroup
	, '2015-12-01T00:00:00' --next month on NewerData filegroup
	, '2016-01-01T00:00:00' --future month
)
GO

--main table partition scheme
CREATE PARTITION SCHEME PS_12MonthSlidingWindow
AS PARTITION PF_12MonthSlidingWindow
TO (
	  FG_OlderData --older than 12 months
	, FG_OlderData --current month -12 on OlderData
	, FG_OlderData --current month -11 on OlderData
	, FG_OlderData --current month -10 on OlderData
	, FG_OlderData --current month -9 on OlderData
	, FG_OlderData --current month -8 on OlderData
	, FG_OlderData --current month -7 on OlderData
	, FG_OlderData --current month -6 on OlderData
	, FG_OlderData --current month -5 on OlderData
	, FG_OlderData --current month -4 on OlderData
	, FG_NewerData --current month -3 on NewerData
	, FG_NewerData --current month -2 on NewerData
	, FG_NewerData --current month -1 on NewerData
	, FG_NewerData --current month on NewerData
	, FG_NewerData --next month on NewerData
	, FG_NewerData --future month on NewerData
      );
GO

--secondary partition function identical to main function
CREATE PARTITION FUNCTION PF_12MonthSlidingWindow_Secondary (datetime2(0))
AS RANGE RIGHT FOR VALUES (
	  '2014-11-01T00:00:00' --current month -12 on OlderData filegroup
	, '2014-12-01T00:00:00' --current month -11 on OlderData filegroup
	, '2015-01-01T00:00:00' --current month -10 on OlderData filegroup
	, '2015-02-01T00:00:00' --current month -9 on OlderData filegroup
	, '2015-03-01T00:00:00' --current month -8 on OlderData filegroup
	, '2015-04-01T00:00:00' --current month -7 on OlderData filegroup
	, '2015-05-01T00:00:00' --current month -6 on OlderData filegroup
	, '2015-06-01T00:00:00' --current month -5 on OlderData filegroup
	, '2015-07-01T00:00:00' --current month -4 on OlderData filegroup
	, '2015-08-01T00:00:00' --current month -3 on NewerData filegroup
	, '2015-09-01T00:00:00' --current month -2 on NewerData filegroup
	, '2015-10-01T00:00:00' --current month -1 on NewerData filegroup
	, '2015-11-01T00:00:00' --current month on NewerData filegroup
	, '2015-12-01T00:00:00' --next month on NewerData filegroup
	, '2016-01-01T00:00:00' --future month
)
GO

--secondary partition scheme identical to main scheme
CREATE PARTITION SCHEME PS_12MonthSlidingWindow_Secondary
AS PARTITION PF_12MonthSlidingWindow_Secondary
TO (
	  FG_OlderData --older than 12 months
	, FG_OlderData --current month -12 on OlderData
	, FG_OlderData --current month -11 on OlderData
	, FG_OlderData --current month -10 on OlderData
	, FG_OlderData --current month -9 on OlderData
	, FG_OlderData --current month -8 on OlderData
	, FG_OlderData --current month -7 on OlderData
	, FG_OlderData --current month -6 on OlderData
	, FG_OlderData --current month -5 on OlderData
	, FG_OlderData --current month -4 on OlderData
	, FG_NewerData --current month -3 on NewerData
	, FG_NewerData --current month -2 on NewerData
	, FG_NewerData --current month -1 on NewerData
	, FG_NewerData --current month on NewerData
	, FG_NewerData --next month on NewerData
	, FG_NewerData --future month on NewerData
      );
GO

--main partitioned table
CREATE TABLE dbo.SlidingWindowTable(
	  PartitioningColumn datetime2(0)
	, OtherKeyColumn int NOT NULL
	, OtherData int NULL
	, CONSTRAINT PK_SlidingWindowTable PRIMARY KEY
		CLUSTERED (PartitioningColumn, OtherKeyColumn)
		ON PS_12MonthSlidingWindow(PartitioningColumn)
	) ON PS_12MonthSlidingWindow(PartitioningColumn);
GO

--The staging table initially uses the same scheme as the main table
--except when maintenance is in progress.
CREATE TABLE dbo.SlidingWindowTable_Staging(
	  PartitioningColumn datetime2(0)
	, OtherKeyColumn int NOT NULL
	, OtherData int NULL
	, CONSTRAINT PK_SlidingWindowTable_Staging PRIMARY KEY
		CLUSTERED (PartitioningColumn, OtherKeyColumn)
		ON PS_12MonthSlidingWindow(PartitioningColumn)
	) ON PS_12MonthSlidingWindow(PartitioningColumn);
GO

--This second staging table uses the same scheme as the main table and
--is used to switch data out of the main table after a partition is copied.
CREATE TABLE dbo.SlidingWindowTable_Staging2(
	  PartitioningColumn datetime2(0)
	, OtherKeyColumn int NOT NULL
	, OtherData int NULL
	, CONSTRAINT PK_SlidingWindowTable_Staging2 PRIMARY KEY
		CLUSTERED (PartitioningColumn, OtherKeyColumn)
		ON PS_12MonthSlidingWindow(PartitioningColumn)
	) ON PS_12MonthSlidingWindow(PartitioningColumn);
GO

WITH
	t4 AS (SELECT n FROM (VALUES(0),(0),(0),(0)) t(n))
	,t256 AS (SELECT 0 AS n FROM t4 AS a CROSS JOIN t4 AS b CROSS JOIN t4 AS c CROSS JOIN t4 AS d)
	,t16M AS (SELECT ROW_NUMBER() OVER (ORDER BY (a.n)) - 1 AS num FROM t256 AS a CROSS JOIN t256 AS b CROSS JOIN t256 AS c)
INSERT INTO dbo.SlidingWindowTable WITH (TABLOCKX) (PartitioningColumn, OtherKeyColumn, OtherData) 
SELECT DATEADD(month, num/1000000, '20141101'), num, num
FROM t16M
WHERE num <= 13000000;
GO

CREATE PROCEDURE dbo.PerformMonthlySlidingWindowMaintenance
	  @ExpiredMonthStartDate datetime2(0) --purge data for this month
	, @MonthStartDateToMoveFromCurrentToOld datetime2(0) --move data from this month from current data filegroup to old data filegroup
	, @FutureMonthStartDate datetime2(0) --create new partition for this date on current data filegroup
/*
	This proc performs partition maintenance needed to maintain a monthly sliding
	window with filegroups on tiered storage.  The objective is to keep recent frequently
	used data on filegroup FG_NewerData and less often accessed data on filegroup FG_OlderData.
*/
AS

SET NOCOUNT ON;
SET XACT_ABORT ON;

DECLARE @result int;

BEGIN TRY

	BEGIN TRAN;

	--acquire exclusive lock on main table to prevent deadlocking during partition maintenance
	SET @result = (SELECT TOP (0) 1 FROM dbo.SlidingWindowTable WITH (TABLOCKX));

	--**************************
	--*** Purge expired data ***
	--**************************
	--purge oldest partition from main table (including partition for data older than first boundary)
	TRUNCATE TABLE dbo.SlidingWindowTable_Staging;
	ALTER TABLE dbo.SlidingWindowTable
		SWITCH PARTITION $PARTITION.PF_12MonthSlidingWindow(DATEADD(month, -1, @ExpiredMonthStartDate))
		TO dbo.SlidingWindowTable_Staging PARTITION $PARTITION.PF_12MonthSlidingWindow_Secondary(DATEADD(month, -1, @ExpiredMonthStartDate));
	ALTER TABLE dbo.SlidingWindowTable
		SWITCH PARTITION $PARTITION.PF_12MonthSlidingWindow(@ExpiredMonthStartDate)
		TO dbo.SlidingWindowTable_Staging PARTITION $PARTITION.PF_12MonthSlidingWindow_Secondary(@ExpiredMonthStartDate);
	TRUNCATE TABLE dbo.SlidingWindowTable_Staging;

	--remove expired partition boundary from both functions after purge
	ALTER PARTITION FUNCTION PF_12MonthSlidingWindow()
		MERGE RANGE (@ExpiredMonthStartDate);
	ALTER PARTITION FUNCTION PF_12MonthSlidingWindow_Secondary()
		MERGE RANGE (@ExpiredMonthStartDate);

	--*******************************
	--*** Prepare for future data ***
	--*******************************
	--add new partition on new data filegroup for future data to both functions/schemes
	ALTER PARTITION SCHEME PS_12MonthSlidingWindow
			  NEXT USED FG_NewerData;
	ALTER PARTITION FUNCTION PF_12MonthSlidingWindow()
		SPLIT RANGE (@FutureMonthStartDate);
	ALTER PARTITION SCHEME PS_12MonthSlidingWindow_Secondary
			  NEXT USED FG_NewerData;
	ALTER PARTITION FUNCTION PF_12MonthSlidingWindow_Secondary()
		SPLIT RANGE (@FutureMonthStartDate);

	--this will release the exclusive table lock
	COMMIT;

	--********************************************************
	--*** move partition from FG_NewerData to FG_OlderData ***
	--********************************************************
	--alter secondary partition function and scheme to move partition from FG_NewerData to FG_OlderData
	ALTER PARTITION FUNCTION PF_12MonthSlidingWindow_Secondary()
		MERGE RANGE (@MonthStartDateToMoveFromCurrentToOld);
	ALTER PARTITION SCHEME PS_12MonthSlidingWindow_Secondary
		NEXT USED FG_OlderData;
	ALTER PARTITION FUNCTION PF_12MonthSlidingWindow_Secondary()
		SPLIT RANGE (@MonthStartDateToMoveFromCurrentToOld);

	--rebuild staging table using new partition scheme to move partition to filegroup FG_OlderData
	CREATE UNIQUE CLUSTERED INDEX PK_SlidingWindowTable_Staging
		ON dbo.SlidingWindowTable_Staging(PartitioningColumn, OtherKeyColumn)
		WITH(DROP_EXISTING=ON)
		ON PS_12MonthSlidingWindow_Secondary(PartitioningColumn);

	--copy data from main table into corresponding staging table partition, now on FG_OlderData
	INSERT INTO dbo.SlidingWindowTable_Staging WITH(TABLOCKX) (PartitioningColumn, OtherKeyColumn, OtherData)
		SELECT
			  PartitioningColumn
			, OtherKeyColumn
			, OtherData
		FROM dbo.SlidingWindowTable
		WHERE
			PartitioningColumn >= @MonthStartDateToMoveFromCurrentToOld
			AND PartitioningColumn < DATEADD(month, 1, @MonthStartDateToMoveFromCurrentToOld);

	BEGIN TRAN;

	--acquire exclusive lock on main table to prevent deadlocking during partition maintenance
	SET @result = (SELECT TOP (0) 1 FROM dbo.SlidingWindowTable WITH (TABLOCKX));

	--make sure the second staging table is empty
	TRUNCATE TABLE dbo.SlidingWindowTable_Staging2;

	--switch out partition to be moved on FG_NewerData into second staging table
	ALTER TABLE dbo.SlidingWindowTable
		SWITCH PARTITION $PARTITION.PF_12MonthSlidingWindow(@MonthStartDateToMoveFromCurrentToOld)
		TO dbo.SlidingWindowTable_Staging2 PARTITION $PARTITION.PF_12MonthSlidingWindow(@MonthStartDateToMoveFromCurrentToOld);

	--remove redundant copy of data
	TRUNCATE TABLE dbo.SlidingWindowTable_Staging2;

	--switch the latest partition of main table on the FG_OlderData filegroup to the staging table
	--so that adjacent partitions of the main table are empty for the MERGE and SPLIT needed to 
	--physically move the main table partition to FG_OlderData
	ALTER TABLE dbo.SlidingWindowTable
		SWITCH PARTITION $PARTITION.PF_12MonthSlidingWindow(DATEADD(month, -1, @MonthStartDateToMoveFromCurrentToOld))
		TO dbo.SlidingWindowTable_Staging PARTITION $PARTITION.PF_12MonthSlidingWindow_Secondary(DATEADD(month, -1, @MonthStartDateToMoveFromCurrentToOld));

	--remove empty partition
	ALTER PARTITION FUNCTION PF_12MonthSlidingWindow()
		MERGE RANGE (@MonthStartDateToMoveFromCurrentToOld);

	--create new empty partition on old data filegroup for moved data
	ALTER PARTITION SCHEME PS_12MonthSlidingWindow
			  NEXT USED FG_OlderData;
	ALTER PARTITION FUNCTION PF_12MonthSlidingWindow()
		SPLIT RANGE (@MonthStartDateToMoveFromCurrentToOld);

	--now that both partition functions/schemes are identical, switch partitions from staging table back into main table
	ALTER TABLE dbo.SlidingWindowTable_Staging
		SWITCH PARTITION $PARTITION.PF_12MonthSlidingWindow_Secondary(DATEADD(month, -1, @MonthStartDateToMoveFromCurrentToOld))
		TO dbo.SlidingWindowTable PARTITION $PARTITION.PF_12MonthSlidingWindow(DATEADD(month, -1, @MonthStartDateToMoveFromCurrentToOld));
	ALTER TABLE dbo.SlidingWindowTable_Staging
		SWITCH PARTITION $PARTITION.PF_12MonthSlidingWindow_Secondary(@MonthStartDateToMoveFromCurrentToOld)
		TO dbo.SlidingWindowTable PARTITION $PARTITION.PF_12MonthSlidingWindow(@MonthStartDateToMoveFromCurrentToOld);

	COMMIT;

END TRY
BEGIN CATCH

	IF @@TRANCOUNT > 0 ROLLBACK;

	THROW;

END CATCH;
GO

CHECKPOINT
DBCC DROPCLEANBUFFERS
--when scheduling after the start of a new month, set @RunMonth to the first day of the current month
DECLARE
	--@RunMonth datetime2(0) = DATEADD(day, 1, DATEADD(month, -1, EOMONTH(GETDATE())));
	  @RunMonth datetime2(0) = '20151201';

--calculate boundary dates based on @RunMonth
DECLARE
	  @ExpiredMonthStartDate datetime2(0) = DATEADD(month, -13, @RunMonth)
	, @MonthStartDateToMoveFromCurrentToOld datetime2(0) = DATEADD(month, -4, @RunMonth)
	, @FutureMonthStartDate datetime2(0) = DATEADD(month, 2, @RunMonth);

SELECT
	  @ExpiredMonthStartDate AS ExpiredMonthStartDate
	, @MonthStartDateToMoveFromCurrentToOld AS MonthStartDateToMoveFromCurrentToOld
	, @FutureMonthStartDate AS FutureMonthStartDate;

EXEC dbo.PerformMonthlySlidingWindowMaintenance
	  @ExpiredMonthStartDate = @ExpiredMonthStartDate
	, @MonthStartDateToMoveFromCurrentToOld = @MonthStartDateToMoveFromCurrentToOld
	, @FutureMonthStartDate = @FutureMonthStartDate;
GO

SQL Server File Naming Standards

Attention to detail in naming SQL Server physical files and logical file names will make a DBA’s life easier. This is especially important when using RESTORE or attach. I suggest one establish and follow a naming standard for physical and logical database file names.

SQL Server does not enforce any particular naming standard for files.  SQL Server is perfectly happy with a data file named “readme.txt” or a log file named “word.doc”.  Of course, such an inappropriate name and extension will lead to confusion so the best practice is to name files appropriately.  An appropriate SQL Server database file name is essentially one that is self-documenting; one should be able to determine the exact purpose of a file simply by examining the name.  I recommend a naming standard that includes the associated database name, filegroup name, and file type.

Physical File Names
I propose adopting a physical file naming convention of “<database-name>_<filegroup-name>_<uniqueifier>.<file-type>” where:

  • <database-name> is the name of the associated database
  • <filegroup-name> is the name of the filegroup containing the file, or the literal “Log” for log files
  • <uniqueifier> is an integer to ensure the file name is unique
  • <file-type> is the standard SQL Server extension for the file type (“mdf”, “ndf”, or “ldf”) as documented in the SQL Server Books Online Database Files and Filegroups topic

Personally, I use an underscore to separate the name components but a dash will also suffice.  Ideally, the separator character should never be used in database or filegroup names to avoid ambiguity.  I recommend one avoid using special characters in database and filegroup names (e.g. use proper case) and ensure database and filegroup names conform to the rules for regular identifiers as outlined in the Books Online Database Identifiers topic.

There is some wiggle room as to how strictly one adheres to this naming convention.  One could omit the filegroup name and uniqueifier components for the primary data file (mdf) because this file is implicitly in the PRIMARY filegroup and there can be only one primary data file in the filegroup/database.  Similarly, the uniqueifier could be omitted for the first or only file within a filegroup.  That being said, stricter adherence to this naming convention provides better consistency and makes naming files more of a no-brainer.

Note that the uniqueifier in the name need not imply a sequential value with no gaps.  For example, consider a database named MyDatabase with filegroup DataFG containing 3 files named ‘MyDatabase_DataFG_1.ndf’, ‘MyDatabase_DataFG_2.ndf’, and ‘MyDatabase_DataFG_3.ndf’.  If the second file is removed, there is no requirement to rename file ‘MyDatabase_DataFG_3.ndf’ to ‘MyDatabase_DataFG_2.ndf’.  However, if one is anal about such things, there is no harm in doing so other than the unavailability of the database during the maintenance.

The physical file naming convention described above will guarantee physical file names are unique within a SQL Server instance and facilitate relocation to different drives/folders without naming conflicts.  On a server with multiple instances, I suggest placing files in separate folders for each instance.  This practice will better organize database files while avoiding file name conflicts when a database with the same name exists on different instances.

Logical File Names
Although I commonly see database names embedded within logical file names, I suggest one avoid that practice.  The scope of a logical file name is the context database so including the database name is redundant.  More importantly, the original logical file name is retained when a database is subsequently restored or attached with a different name so the name becomes out-of-sync with the actual database name unless one explicitly changes the names afterward to match the new database name.  This extra administrative work is often overlooked and can be avoided entirely by not including the database name in the logical file name when the initial database is created or altered.
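
If a database with database-name-embedded logical file names has already been restored or attached under a new name, the out-of-sync logical names can be changed afterward with ALTER DATABASE…MODIFY FILE. The sketch below assumes a database originally named Foo was restored as Bar with the default logical names:

ALTER DATABASE Bar MODIFY FILE (NAME = 'Foo', NEWNAME = 'PRIMARY_1');
ALTER DATABASE Bar MODIFY FILE (NAME = 'Foo_log', NEWNAME = 'Log_1');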

I suggest one use the same naming convention for the logical file name as the physical file name but without the database name.  The logical name will therefore match the last part of the physical name (<filegroup-name>_<uniqueifier>).

Unfortunately, SQL Server likes to include the database name in the logical names of the primary data file and log file during initial creation.  Executing a minimal create database T-SQL statement like “CREATE DATABASE Foo;” will result in the files below being created in the default data and log folders for the instance.  This also applies to the logical names suggested by the SSMS GUI, although one can specify different names as desired.

Logical Name Physical Name
Foo D:\SqlDataFiles\Foo.mdf
Foo_log L:\SqlLogFiles\Foo_log.ldf

I recommend using the expanded form of CREATE DATABASE so that you have complete control over names, locations, and sizes.  Consider creating a custom SSMS template or code snippet to facilitate creating databases with the proper names.

CREATE DATABASE ExampleDatabase
ON PRIMARY
    (NAME = PRIMARY_1,
    FILENAME = N'D:\SqlDataFiles\ExampleDatabase_PRIMARY_1.mdf',
    SIZE = 10MB,
    MAXSIZE = 50MB,
    FILEGROWTH = 10%)
LOG ON
    ( NAME = Log_1,
    FILENAME = N'L:\SqlLogFiles\ExampleDatabase_Log_1.ldf',
    SIZE = 10MB,
    MAXSIZE = 50MB,
    FILEGROWTH = 10MB);

Examples

Below are examples of the naming conventions detailed in this article.

Simple database with only a primary data and log file:

Logical Name Physical Name
PRIMARY_1 D:\SqlDataFiles\ExampleDatabase_PRIMARY_1.mdf
Log_1 L:\SqlLogFiles\ExampleDatabase_Log_1.ldf

Database with 2 files in PRIMARY filegroup, 2 secondary filegroups containing 3 files each, and 2 log files:

Logical Name Physical Name
PRIMARY_1 D:\SqlDataFiles\ExampleDatabase_PRIMARY_1.mdf
PRIMARY_2 E:\SqlDataFiles\ExampleDatabase_PRIMARY_2.ndf
DataFG_1 F:\SqlDataFiles\ExampleDatabase_DataFG_1.ndf
DataFG_2 G:\SqlDataFiles\ExampleDatabase_DataFG_2.ndf
DataFG_3 H:\SqlDataFiles\ExampleDatabase_DataFG_3.ndf
IndexFG_1 I:\SqlDataFiles\ExampleDatabase_IndexFG_1.ndf
IndexFG_2 J:\SqlDataFiles\ExampleDatabase_IndexFG_2.ndf
IndexFG_3 K:\SqlDataFiles\ExampleDatabase_IndexFG_3.ndf
Log_1 L:\SqlLogFiles\ExampleDatabase_Log_1.ldf
Log_2 M:\SqlLogFiles\ExampleDatabase_Log_2.ldf

Improving Uniqueidentifier Performance

A common anti-pattern I run into is the random primary key, commonly a GUID. This design is insidious because the performance implications of random access aren’t immediately obvious and are exacerbated when the primary key index is clustered. It is often only after the table grows to a larger size that the performance problems become apparent. Symptoms include slowly degrading performance over time, with increased blocking and deadlocking as a side effect.

Figure 1 shows the performance profile of random inserts with a random GUID (SQL Server uniqueidentifier data type) clustered primary key. The red line indicates the rate of batch requests per second (inserts) while the blue line shows the total number of rows in the table, scaled such that the top of the graph represents 3M rows. Only about 700,000 rows could be inserted during this 15 minute single-threaded random key insert test, even though the insert rate was fast initially.

Figure 1: Random key insert performance

Incremental Primary Keys

As you might guess, the cure for the random primary key anti-pattern is an incremental key pattern. With a uniqueidentifier data type, a sequential value can be assigned by SQL Server using the NEWSEQUENTIALID function (in a default constraint expression) or in application code using the UuidCreateSequential Win32 API call along with some byte swapping (code example below). Alternatively, one can use an integral data type (int, bigint, etc.) along with a value generated by an IDENTITY property or a SEQUENCE object. The advantage of an integral type is the reduced space requirements compared to a 16-byte uniqueidentifier. The advantage of a uniqueidentifier is that it can easily be generated in application code before database persistence without a database round trip, which is desirable for distributed applications and when keys of related tables are assigned in application code before writing to the database.
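
As a sketch of the NEWSEQUENTIALID alternative (the table name here is hypothetical so it doesn’t collide with the test table in Listing 1), the key can be assigned by SQL Server in a default constraint so the application need not generate it at all:

CREATE TABLE dbo.TestTableSequentialDefault(
	TestKey uniqueidentifier NOT NULL
		CONSTRAINT DF_TestTableSequentialDefault_TestKey DEFAULT NEWSEQUENTIALID()
		CONSTRAINT PK_TestTableSequentialDefault PRIMARY KEY CLUSTERED
	,TestData char(8000) NOT NULL
);

Note that after a server restart, NEWSEQUENTIALID values may start from a lower range, but they remain globally unique and still avoid the fully random insert pattern.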

Figure 2 shows the same test using a sequential key value. Over 2.2M rows were inserted in 15 minutes. As you can see, significant performance improvement is achieved with this trivial application change.

Figure 2: Incremental key insert performance

Listing 1 shows the T-SQL code I used for these performance tests and listing 2 contains the C# code (with the random GUID commented out). I generated the uniqueidentifier value via application code in the tests but performance with NEWID() is comparable to the first test and NEWSEQUENTIALID() is similar to the second test.

Listing 1: T-SQL scripts for test table and stored procedure

CREATE TABLE dbo.TestTable(
	TestKey uniqueidentifier NOT NULL
		CONSTRAINT PK_TestTable PRIMARY KEY CLUSTERED
	,TestData char(8000) NOT NULL
);
GO

CREATE PROC dbo.InsertTestTable
	@TestKey uniqueidentifier
	,@TestData char(8000)
AS
SET NOCOUNT ON;
DECLARE @TotalRows int;

--insert row
INSERT INTO dbo.TestTable (TestKey, TestData) 
	VALUES(@TestKey, @TestData);

--update perfmon counter for rowcount
SELECT @TotalRows = rows 
FROM sys.partitions 
WHERE object_id = OBJECT_ID(N'TestTable') AND index_id = 1;
EXEC sys.sp_user_counter1 @TotalRows; --for perfmon row count

RETURN @@ERROR;
GO

Listing 2: C# insert test console application

using System;
using System.Data;
using System.Data.SqlClient;
using System.Runtime.InteropServices;

namespace UniqueIdentifierPerformanceTest
{
    class Program
    {
        [DllImport("rpcrt4.dll", CharSet = CharSet.Auto)]
        public static extern int UuidCreateSequential(ref Guid guid);

        static string connectionString = @"Data Source=MyServer;Initial Catalog=MyDatabase;Integrated Security=SSPI";
        static int rowsToInsert = 10000000;
        static SqlConnection connection;
        static SqlCommand command;

        static void Main(string[] args)
        {

            int rowsInserted = 0;

            using (connection = new SqlConnection(connectionString))
            {
                using (command = new SqlCommand("dbo.InsertTestTable", connection))
                {
                    command.Connection = connection;
                    command.CommandType = CommandType.StoredProcedure;
                    command.Parameters.Add("@TestKey", SqlDbType.UniqueIdentifier);
                    command.Parameters.Add("@TestData", SqlDbType.Char, 8000);

                    connection.Open();

                    while (rowsInserted < rowsToInsert)
                    {
                        //random guid
                        //command.Parameters["@TestKey"].Value = Guid.NewGuid();

                        //sequential guid
                        command.Parameters["@TestKey"].Value = NewSequentialGuid();
                        command.Parameters["@TestData"].Value = "Test";
                        command.ExecuteNonQuery();
                        ++rowsInserted;

                        //display progress every 1000 rows
                        if (rowsInserted % 1000 == 0)
                        {
                            Console.WriteLine(string.Format(
                                "{0} of {1} rows inserted"
                                , rowsInserted.ToString("#,##0")
                                , rowsToInsert.ToString("#,##0")));
                        }
                    }
                }

                connection.Close();
            }

        }

        /// <summary>
        /// Call UuidCreateSequential and swap bytes for SQL Server format
        /// </summary>
        /// <returns>Sequential GUID for SQL Server</returns>
        private static Guid NewSequentialGuid()
        {
            const int S_OK = 0;
            const int RPC_S_UUID_LOCAL_ONLY = 1824;

            Guid oldGuid = Guid.Empty;

            int result = UuidCreateSequential(ref oldGuid);
            if (result != S_OK && result != RPC_S_UUID_LOCAL_ONLY)
            {
                throw new ExternalException("UuidCreateSequential call failed", result);
            }

            byte[] oldGuidBytes = oldGuid.ToByteArray();
            byte[] newGuidBytes = new byte[16];
            oldGuidBytes.CopyTo(newGuidBytes, 0);

            // swap low timestamp bytes (0-3)
            newGuidBytes[0] = oldGuidBytes[3];
            newGuidBytes[1] = oldGuidBytes[2];
            newGuidBytes[2] = oldGuidBytes[1];
            newGuidBytes[3] = oldGuidBytes[0];

            // swap middle timestamp bytes (4-5)
            newGuidBytes[4] = oldGuidBytes[5];
            newGuidBytes[5] = oldGuidBytes[4];

            // swap high timestamp bytes (6-7)
            newGuidBytes[6] = oldGuidBytes[7];
            newGuidBytes[7] = oldGuidBytes[6];

            //remaining 8 bytes are unchanged (8-15) 

            return new Guid(newGuidBytes);

        }

    }
}

Why Random Keys Are Bad

I think it’s important for one to understand why random keys have such a negative impact on performance against large tables. DBAs often cite fragmentation and page splits as the primary causes of poor performance with random keys. Although it is true random inserts do cause fragmentation and splits, the primary cause of bad performance with random keys is poor temporal reference locality (http://en.wikipedia.org/wiki/Locality_of_reference), which I’ll detail shortly. Note that there were no real page splits in these insert performance tests because the nearly 8K row size allowed only one row per page. Although significant extent fragmentation occurred, this didn’t impact these single-row requests; extent fragmentation is mostly an issue with sequential scans against spinning media. So neither splits nor fragmentation explain the poor performance of the random inserts.
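For what it's worth, this is easy to corroborate on your own test run with the index physical stats DMV. The sketch below assumes the dbo.TestTable from Listing 1; SAMPLED mode reports both logical fragmentation and page density.

--check fragmentation and page density of the test table after the insert test
SELECT index_id, avg_fragmentation_in_percent, page_count, avg_page_space_used_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'dbo.TestTable'), NULL, NULL, 'SAMPLED');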

Temporal reference locality basically means that once data is used (e.g. inserted or touched in any way), it is likely to be used again in the near future. This is why SQL Server uses a LRU-2 algorithm to manage the buffer cache; data most recently touched will remain in memory while older, less often referenced data are aged out. The impact of random key values on temporal locality (i.e. buffer efficiency) is huge. Consider that inserts are basically rewrites of existing pages. When a new row is inserted into a table, SQL Server first reads the page where the row belongs (by key value if the table has a clustered index) and then either adds the row to the existing data page or allocates a new one if there’s not enough space available in the existing page for the new row. With a random key value, the new key value is unlikely to be adjacent to the last one inserted (which is probably still in memory) so the needed page often must be read from storage.

All things being equal, single-row performance will be roughly the same with both sequential and random keys as long as data are memory resident. This is why the random and sequential key insert tests show the same good performance initially. But once the table size exceeded the size of the buffer pool, the random key test showed a precipitous drop in throughput and steady degradation thereafter. In short, random keys diminish temporal reference locality because there is no correlation between time (most recently accessed data) and the key value.
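You can watch this effect directly by counting how many of the table's pages are resident in the buffer pool as the test runs. Below is a rough sketch for the dbo.TestTable from Listing 1 (run it in the test database); once this count stops keeping pace with the total row count, most random key inserts start paying a physical read.

--approximate number of the test table's data pages currently in the buffer pool
SELECT COUNT(*) AS cached_pages
FROM sys.dm_os_buffer_descriptors AS bd
JOIN sys.allocation_units AS au ON au.allocation_unit_id = bd.allocation_unit_id
JOIN sys.partitions AS p ON p.hobt_id = au.container_id
WHERE bd.database_id = DB_ID()
	AND au.type IN (1, 3) --IN_ROW_DATA and ROW_OVERFLOW_DATA
	AND p.object_id = OBJECT_ID(N'dbo.TestTable');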

Why Incremental Keys Are Good

An incremental key value naturally improves temporal reference locality; the next key value is adjacent to the last one inserted and is likely still in memory. An incremental key therefore provides excellent insert performance regardless of table size, as the insert performance test shows. Also, applications typically use recently inserted data more often than older data. This allows the same amount of work to be done with much less physical I/O than with a random key value.

Random Notes about GUIDs

According to the Globally unique identifier Wikipedia article (http://en.wikipedia.org/wiki/Globally_unique_identifier), the 122 random bits of a GUID can generate 2^122 unique values. That's an incomprehensibly large 5.3 x 10^36 (or 5,300,000,000,000,000,000,000,000,000,000,000,000) number of unique values.

The value returned by NEWSEQUENTIALID and UuidCreateSequential is guaranteed to be unique on a given computer. Furthermore, it is globally unique if the computer has a network card because the MAC address is used as part of the GUID generation algorithm.

Secrets of Foreign Key Index Binding

You might be surprised to learn that foreign keys bind to physical indexes when they are created. Furthermore, a foreign key does not necessarily bind to the primary key index of the referenced table; SQL Server allows a foreign key to refer to any column(s) that are guaranteed to be unique as enforced by a primary key constraint, unique constraint or unique index.

In this post, I’ll discuss the undocumented rules SQL Server uses to bind foreign key constraints to referenced table indexes so that you can achieve performance goals and protect yourself against unexpected errors in DDL modification scripts.

Background

Typically, one references the primary key in foreign key relationships. I’ve seen a foreign key (deliberately) reference columns other than the primary key only a couple of times in my career. The foreign key referenced an alternate key with a unique constraint in those cases. Why one would create such a relationship is an exercise for the reader. I’ll focus on the primary key here, although the same considerations apply to foreign keys referencing alternate keys.

As I mentioned earlier, SQL Server binds a foreign key to a physical unique index. This binding has performance implications because it determines the index SQL Server uses to enforce referential integrity as child table rows are inserted or updated. Also, SQL Server will not allow the index bound to a foreign key to be dropped, since that could allow duplicate rows in the parent table and thus break the unique side of the relationship. This must be considered when developing scripts that drop unique indexes (including primary key and unique constraints) that may be bound to foreign keys.

A foreign key referencing the primary key will always be bound to the primary key index when that is the only unique index on the foreign key column(s). However, you might have additional unique indexes on the primary key column(s) for performance reasons. For example, consider the case of a clustered primary key. Performance of a frequently executed query may be improved with a covering non-clustered index:

--create parent table
CREATE TABLE dbo.ParentTable(
       ParentTableID int NOT NULL IDENTITY
              CONSTRAINT PK_ParentTable PRIMARY KEY CLUSTERED
       ,Column1 int NOT NULL
       ,Column2 varchar(100) NOT NULL
       );
GO

--create a non-clustered covering index
CREATE UNIQUE NONCLUSTERED INDEX idx_ParentTable_ParentTableID
       ON dbo.ParentTable(ParentTableID) INCLUDE(Column1);
GO

INSERT INTO dbo.ParentTable VALUES(1, 'some data');
INSERT INTO dbo.ParentTable VALUES(2, 'some data');
INSERT INTO dbo.ParentTable VALUES(3, 'some data');
GO

--create child table
CREATE TABLE dbo.ChildTable(
       ChildTableID int NOT NULL IDENTITY
            CONSTRAINT PK_ChildTable PRIMARY KEY CLUSTERED
       ,ParentTableID int NOT NULL
              CONSTRAINT FK_ChildTable_ParentTable
                     FOREIGN KEY REFERENCES dbo.ParentTable(ParentTableID)
       );
GO

INSERT INTO dbo.ChildTable VALUES(1);
INSERT INTO dbo.ChildTable VALUES(1);
INSERT INTO dbo.ChildTable VALUES(1);
INSERT INTO dbo.ChildTable VALUES(1);
INSERT INTO dbo.ChildTable VALUES(2);
INSERT INTO dbo.ChildTable VALUES(2);
INSERT INTO dbo.ChildTable VALUES(2);
INSERT INTO dbo.ChildTable VALUES(2);
INSERT INTO dbo.ChildTable VALUES(3);
INSERT INTO dbo.ChildTable VALUES(3);
INSERT INTO dbo.ChildTable VALUES(3);
INSERT INTO dbo.ChildTable VALUES(3);
GO

UPDATE STATISTICS dbo.ParentTable;
UPDATE STATISTICS dbo.ChildTable;
GO

--show the foreign key index binding
SELECT
    fki.name
FROM sys.foreign_keys AS f
JOIN sys.indexes AS fki ON
      fki.object_id = f.referenced_object_id
      AND fki.index_id = f.key_index_id
WHERE
      f.object_id = OBJECT_ID(N'dbo.FK_ChildTable_ParentTable');
GO

--this query uses the covering index instead of the clustered PK index
SELECT p.ParentTableID, p.Column1
FROM dbo.ParentTable AS p
WHERE p.ParentTableID IN(1,2,3);
GO


The SELECT query in the above script uses the covering idx_ParentTable_ParentTableID index. While this is good for performance, it introduces ambiguity in the index binding of the foreign key. Again, any primary key constraint, unique constraint, or unique index on the referenced column(s) may be referenced by a foreign key. With two candidate unique indexes (PK_ParentTable and idx_ParentTable_ParentTableID), you have little control over which index is bound to the foreign key.

SQL Server chooses the index binding based on rules that vary by version, so you will get different bindings depending on your version of SQL Server. SQL Server 2005 chooses the clustered index when possible and, if no suitable clustered index exists, the first (lowest index_id) unique non-clustered index on the referenced column(s) is used. The sample script above binds the foreign key to the PK_ParentTable index under SQL Server 2005 because it is the clustered index, not because it is the primary key.

In later versions (SQL 2008, SQL 2008R2 and SQL 2012), the foreign key is bound to the unique non-clustered index on the referenced column(s) with the lowest index_id when possible. Only when no suitable unique non-clustered index exists is the unique clustered index chosen. So the foreign key in the above script is bound to idx_ParentTable_ParentTableID in SQL 2008 and later versions instead of the primary key index as one might expect.

Why Foreign Key Index Binding is Important

There are two reasons why it is important to control the index bound to a foreign key. One is performance. As I mentioned earlier, the index bound to the foreign key constraint is used at execution time to enforce the constraint as child table rows are inserted or the foreign key column(s) updated. If the parent table is large and not queried often but rows are inserted into the child table heavily, a unique non-clustered index that “covers” the referential integrity check may be more desirable than the clustered index. This can improve buffer efficiency and page life expectancy compared to using a clustered index (e.g. primary key). My assumption is that this is why SQL Server 2008 and later versions prefer the unique non-clustered index over the clustered index for constraint enforcement.

Another reason one should control the index bound to the foreign key is to facilitate index changes. If you try to drop an index bound to a foreign key, you'll get an error like "An explicit DROP INDEX is not allowed on index 'dbo.ParentTable.idx_ParentTable_ParentTableID'. It is being used for FOREIGN KEY constraint enforcement." You'll need to drop the foreign key first and recreate it after dropping the index.

Since one can’t specify the bound foreign key index declaratively, the only guaranteed way to control the binding is to create the foreign key when only the desired unique index exists and create additional indexes afterward. This isn’t to say you can’t rely on the rules described earlier but you need to be aware that such rules vary depending on the SQL Server version and could change in the future. 
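As a sketch of that approach using the tables from the script above: drop the foreign key, drop the covering index, recreate the foreign key while the primary key is the only unique index on the column, and only then recreate the covering index. The foreign key remains bound to PK_ParentTable afterward.

ALTER TABLE dbo.ChildTable DROP CONSTRAINT FK_ChildTable_ParentTable;
DROP INDEX idx_ParentTable_ParentTableID ON dbo.ParentTable;
GO

--recreate the foreign key while PK_ParentTable is the only unique index on ParentTableID
ALTER TABLE dbo.ChildTable ADD CONSTRAINT FK_ChildTable_ParentTable
	FOREIGN KEY (ParentTableID) REFERENCES dbo.ParentTable(ParentTableID);
GO

--the covering index can now be recreated without affecting the binding
CREATE UNIQUE NONCLUSTERED INDEX idx_ParentTable_ParentTableID
	ON dbo.ParentTable(ParentTableID) INCLUDE(Column1);
GO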

 

Internal SQL Server Database Version Numbers

A database created by a more recent version of SQL Server cannot be attached or restored to an earlier version. This restriction is simply because an older version cannot know about file format changes that were introduced in the newer release. 
If you attempt to attach a database to an earlier version, you will get SQL Server error 948 with the internal version numbers listed in the error message text. For example, the following error occurs if you try to attach a SQL Server 2008 R2 database to a SQL Server 2008 server:

The database 'MyDatabase' cannot be opened because it is version 665. This server supports version 661 and earlier. A downgrade path is not supported.

Sample text from SQL Server error 948
The cryptic version numbers in the error message refer to the internal database version. These internal version numbers are undocumented but are (at least currently) the same value reported by the DATABASEPROPERTYEX function ‘Version’ property of the source database. If you are unsure of the source database version, the table below maps the internal version numbers to SQL Server versions so you can determine the minimum version you need for the attach to succeed:
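For example, the following query run on the source server returns that property for the database you intend to attach ('MyDatabase' is a placeholder name):

SELECT DATABASEPROPERTYEX(N'MyDatabase', 'Version') AS InternalDatabaseVersion;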

SQL Server Version                              Internal Database Version
SQL Server 2008 R2                              665
SQL Server 2008                                 661
SQL Server 2005 SP2+ with vardecimal enabled    612
SQL Server 2005                                 611
SQL Server 2000                                 539
SQL Server 7                                    515

SQL Server versions and internal database versions
Below are the allowable SQL Server upgrade paths for a database attach or restore. The internal database version will be as above after a successful attach or restore.

Target SQL Server Version    Source SQL Server Version                    Internal Database Version
SQL Server 2008 R2           SQL Server 2008 R2                           665
                             SQL Server 2008                              661
                             SQL Server 2005 with vardecimal enabled      612
                             SQL Server 2005                              611
                             SQL Server 2000                              539
SQL Server 2008              SQL Server 2008                              661
                             SQL Server 2005 with vardecimal enabled      612
                             SQL Server 2005                              611
                             SQL Server 2000                              539
SQL Server 2005 SP2+         SQL Server 2005 with vardecimal enabled      612
                             SQL Server 2005                              611
                             SQL Server 2000                              539
                             SQL Server 7                                 515
SQL Server 2005              SQL Server 2005                              611
                             SQL Server 2000                              539
                             SQL Server 7                                 515
SQL Server 2000              SQL Server 2000                              539
                             SQL Server 7                                 515
SQL Server 7                 SQL Server 7                                 515

Database File Versions and Upgrade Paths
As I mentioned earlier, downgrades are not supported. You’ll need to copy objects and data from the newer source database to the older target if you need to downgrade; attach or restore is not an option to copy a database to an earlier version.

Secret of SQL Trace Duration Column

Why would a trace of long-running queries not show all queries that exceeded the specified duration filter?  We have a server-side SQL Trace that includes RPC:Completed and SQL:BatchCompleted events with a filter on Duration >= 100000.  Nearly all of the queries on this busy OLTP server run in under this 100 millisecond threshold so any that appear in the trace are candidates for root cause analysis and/or performance tuning opportunities.
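For context, the filter itself is nothing exotic. Below is a minimal sketch of how such a duration filter is added to a server-side trace; the trace id of 1 is a placeholder, Duration is trace column 13, and on SQL Server 2005 and later it is reported in microseconds, so 100000 equals 100 milliseconds.

DECLARE @duration bigint = 100000; --microseconds
EXEC sp_trace_setfilter
	 @traceid = 1             --placeholder trace id
	,@columnid = 13            --Duration
	,@logical_operator = 0     --AND
	,@comparison_operator = 4  --greater than or equal
	,@value = @duration;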

After an application experienced query timeouts, the DBA looked at the trace data to corroborate the problem.  Surprisingly, he found no long-running queries in the trace from the application that experienced the timeouts even though the application’s error log clearly showed detail of the problem (query text, duration, start time, etc.).  The trace did show, however, that there were hundreds of other long-running queries from different applications during the problem timeframe.  We later determined those queries were blocked by a large UPDATE query against a critical table that was inadvertently run during this busy period.

So why didn’t the trace include all of the long-running queries?  The reason is because the SQL Trace event duration doesn’t include the time a request was queued while awaiting a worker thread.  Remember that the server was under considerable stress at the time due to the severe blocking episode.  Most of the worker threads were in use by blocked queries and new requests were queued awaiting a worker to free up (a DMV query on the DAC connection will show this queuing: “SELECT scheduler_id, work_queue_count FROM sys.dm_os_schedulers;”).  Technically, those queued requests had not started.  As worker threads became available, queries were dequeued and completed quickly.  These weren’t included in the trace because the duration was under the 100ms duration filter.  The duration reflected the time it took to actually run the query but didn’t include the time queued waiting for a worker thread.
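As an aside, here is a slightly expanded sketch of that DAC query; runnable_tasks_count and work_queue_count together give a feel for how backed up the schedulers are during such an episode.

SELECT scheduler_id, current_tasks_count, runnable_tasks_count, work_queue_count
FROM sys.dm_os_schedulers
WHERE status = N'VISIBLE ONLINE';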

The important point here is that duration is not end-to-end response time.  Duration of RPC:Completed and SQL:BatchCompleted events doesn’t include time before a worker thread is assigned nor does it include the time required to return the last result buffer to the client.  In other words, duration only includes time after the worker thread is assigned until the last buffer is filled.  But be aware that duration does include the time need to return intermediate result set buffers back to the client, which is a factor when large query results are returned.  Clients that are slow in consuming results sets can increase the duration value reported by the trace “completed” events.