SQL Server 2016 and Azure SQL Database V12 Breaking Change

This post is to get the word out about a breaking change to datetime conversion and comparison behavior in SQL Server 2016 and Azure SQL Database V12. This change hasn’t been documented as of this writing in the Breaking Changes to Database Engine Features in SQL Server 2016 topic in the SQL Server Books Online.

In short, conversion from datetime to a higher precision temporal data type (datetime2, datetimeoffset, or time) may yield a different, but more accurate, time value than in prior versions. Also, predicates involving datetime now consider the full precision of the raw datetime internal value instead of the time value rounded to the nearest millisecond. These changes in conversion and comparison behavior may affect existing applications and are not intuitive unless one understands the underlying datetime data type implementation.

Background
You may be aware that the accuracy of datetime is limited to 1/300 of a second. This is because values are stored internally as an 8-byte structure consisting of two separate 32-bit integers: one with the number of day units since 1900-01-01 and the other with the number of 1/300-second interval units since midnight. The 1/300-second unit interval limits time accuracy to 3.33333… milliseconds, and the millisecond value will be a repeating decimal when the time interval units are not evenly divisible by 3. The raw decimal value is rounded to a scale of 3 in accordance with the fixed datetime precision of 3, so the displayed millisecond value of a datetime always ends in 0, 3, or 7.
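A quick way to see this tick-snapping in action (a sketch, not exhaustive) is to cast string literals with various millisecond values:

```sql
--Sketch: input milliseconds snap to the nearest 1/300 second tick,
--so the stored millisecond value always ends in 0, 3, or 7
SELECT CAST('2016-01-01T00:00:00.001' AS datetime); --2016-01-01 00:00:00.000
SELECT CAST('2016-01-01T00:00:00.002' AS datetime); --2016-01-01 00:00:00.003
SELECT CAST('2016-01-01T00:00:00.005' AS datetime); --2016-01-01 00:00:00.007
SELECT CAST('2016-01-01T00:00:00.009' AS datetime); --2016-01-01 00:00:00.010
```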

Pre-SQL Server 2016 Behavior
Before SQL Server 2016, conversion from datetime to another temporal type used the source datetime value after it was rounded to the nearest millisecond, which truncated repeating decimal fractional milliseconds. The rounded value was then rounded again according to the target type precision. When the target type precision was greater than 3, the time was extended to the target type precision with insignificant trailing zeros, resulting in zero for the sub-millisecond value.

DECLARE @DateTime datetime = '2016-01-01T00:00:00.007';
SELECT CAST(@DateTime AS datetime2(0)); --2016-01-01 00:00:00
SELECT CAST(@DateTime AS datetime2(1)); --2016-01-01 00:00:00.0
SELECT CAST(@DateTime AS datetime2(2)); --2016-01-01 00:00:00.01
SELECT CAST(@DateTime AS datetime2(3)); --2016-01-01 00:00:00.007
SELECT CAST(@DateTime AS datetime2(4)); --2016-01-01 00:00:00.0070
SELECT CAST(@DateTime AS datetime2(5)); --2016-01-01 00:00:00.00700
SELECT CAST(@DateTime AS datetime2(6)); --2016-01-01 00:00:00.007000
SELECT CAST(@DateTime AS datetime2(7)); --2016-01-01 00:00:00.0070000

Also, when datetime was compared to another temporal type, the rounded value was used. The scripts below show that the equality predicate evaluates to true after the datetime value is converted to datetime2.

--This script prints EQUAL predicate is true
DECLARE @DateTime datetime = '2016-01-01T00:00:00.003';
DECLARE @DateTime2 datetime2(7) = @DateTime;
IF @DateTime = @DateTime2 PRINT 'EQUAL predicate is true' ELSE PRINT 'EQUAL predicate is not true';
IF @DateTime < @DateTime2 PRINT 'LESS THAN predicate is true' ELSE PRINT 'LESS THAN predicate is not true';
IF @DateTime > @DateTime2 PRINT 'GREATER THAN predicate is true' ELSE PRINT 'GREATER THAN predicate is not true';
GO
--This script prints EQUAL predicate is true
DECLARE @DateTime datetime = '2016-01-01T00:00:00.007';
DECLARE @DateTime2 datetime2(7) = @DateTime;
IF @DateTime = @DateTime2 PRINT 'EQUAL predicate is true' ELSE PRINT 'EQUAL predicate is not true';
IF @DateTime < @DateTime2 PRINT 'LESS THAN predicate is true' ELSE PRINT 'LESS THAN predicate is not true';
IF @DateTime > @DateTime2 PRINT 'GREATER THAN predicate is true' ELSE PRINT 'GREATER THAN predicate is not true';
GO

SQL Server 2016 Behavior Change
SQL Server 2016 and Azure SQL Database V12 use the raw datetime internal value without rounding during conversion to another temporal type. The value is rounded only once during conversion, to the target type precision. The end result will be the same as before SQL Server 2016 when the target type precision is 3 or less. However, the converted value will be different when the target type precision is greater than 3 and the internal time unit interval is not evenly divisible by 3 (i.e. rounded source datetime millisecond value is 3 or 7). Note the non-zero microseconds and nanoseconds in the script results below and that rounding is based on the target type precision rather than the source.

DECLARE @DateTime datetime = '2016-01-01T00:00:00.003';
SELECT CAST(@DateTime AS datetime2(0)); --2016-01-01 00:00:00
SELECT CAST(@DateTime AS datetime2(1)); --2016-01-01 00:00:00.0
SELECT CAST(@DateTime AS datetime2(2)); --2016-01-01 00:00:00.03
SELECT CAST(@DateTime AS datetime2(3)); --2016-01-01 00:00:00.003
SELECT CAST(@DateTime AS datetime2(4)); --2016-01-01 00:00:00.0033
SELECT CAST(@DateTime AS datetime2(5)); --2016-01-01 00:00:00.00333
SELECT CAST(@DateTime AS datetime2(6)); --2016-01-01 00:00:00.003333
SELECT CAST(@DateTime AS datetime2(7)); --2016-01-01 00:00:00.0033333
GO
DECLARE @DateTime datetime = '2016-01-01T00:00:00.007';
SELECT CAST(@DateTime AS datetime2(0)); --2016-01-01 00:00:00
SELECT CAST(@DateTime AS datetime2(1)); --2016-01-01 00:00:00.0
SELECT CAST(@DateTime AS datetime2(2)); --2016-01-01 00:00:00.01
SELECT CAST(@DateTime AS datetime2(3)); --2016-01-01 00:00:00.007
SELECT CAST(@DateTime AS datetime2(4)); --2016-01-01 00:00:00.0067
SELECT CAST(@DateTime AS datetime2(5)); --2016-01-01 00:00:00.00667
SELECT CAST(@DateTime AS datetime2(6)); --2016-01-01 00:00:00.006667
SELECT CAST(@DateTime AS datetime2(7)); --2016-01-01 00:00:00.0066667
GO

This behavior change provides a more accurate converted value but may break applications that expect the converted value to be the same as the rounded datetime value as was the case before SQL Server 2016.

Be aware that the full raw datetime precision (instead of the rounded value) is also used when evaluating predicates involving a datetime type. The full precision of both arguments is used, causing the equality predicate to evaluate to false in both scripts below. The greater than predicate is true in the first script and the less than predicate is true in the second:

--This script prints GREATER THAN predicate is true
DECLARE @DateTime datetime = '2016-01-01T00:00:00.003';
DECLARE @DateTime2 datetime2(7) = @DateTime;
IF @DateTime = @DateTime2 PRINT 'EQUAL predicate is true' ELSE PRINT 'EQUAL predicate is not true';
IF @DateTime < @DateTime2 PRINT 'LESS THAN predicate is true' ELSE PRINT 'LESS THAN predicate is not true';
IF @DateTime > @DateTime2 PRINT 'GREATER THAN predicate is true' ELSE PRINT 'GREATER THAN predicate is not true';
GO
--This script prints LESS THAN predicate is true
DECLARE @DateTime datetime = '2016-01-01T00:00:00.007';
DECLARE @DateTime2 datetime2(7) = @DateTime;
IF @DateTime = @DateTime2 PRINT 'EQUAL predicate is true' ELSE PRINT 'EQUAL predicate is not true';
IF @DateTime < @DateTime2 PRINT 'LESS THAN predicate is true' ELSE PRINT 'LESS THAN predicate is not true';
IF @DateTime > @DateTime2 PRINT 'GREATER THAN predicate is true' ELSE PRINT 'GREATER THAN predicate is not true';
GO

To provide insight into why the comparisons result in greater than and less than respectively, the script below shows the nanoseconds value of the compared data types:

DECLARE @DateTime datetime = '2016-01-01T00:00:00.003';
DECLARE @DateTime2 datetime2(7) = @DateTime;
SELECT DATEPART(nanosecond, @DateTime); --3333333
SELECT DATEPART(nanosecond, @DateTime2); --3333300
GO
DECLARE @DateTime datetime = '2016-01-01T00:00:00.007';
DECLARE @DateTime2 datetime2(7) = @DateTime;
SELECT DATEPART(nanosecond, @DateTime); --6666666
SELECT DATEPART(nanosecond, @DateTime2); --6666700
GO

The datetime2 type is accurate only to 100 nanoseconds, whereas datetime includes values to the nanosecond (and beyond) because the theoretical precision of a repeating decimal is unlimited. The implication is that a datetime value with a repeating decimal fraction will never compare as equal to any temporal type except another datetime.

Datetime conversion and comparison behavior is controlled by the database compatibility level. Databases at the SQL Server 2016 compatibility level (130) use the new behavior; the legacy behavior is used at lower compatibility levels.
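For example, one can revert an affected database to the legacy behavior (the database name below is a placeholder):

```sql
--Sketch: use a pre-2016 compatibility level to restore legacy
--datetime conversion/comparison behavior (YourDatabase is a placeholder)
ALTER DATABASE YourDatabase SET COMPATIBILITY_LEVEL = 120;
```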

Summary
These datetime behavior changes have the benefit of improved accuracy and performance of datetime conversion/comparison. Affected applications can use a pre-SQL Server 2016 database compatibility level until they can be remediated.

Going forward, I recommend avoiding direct comparisons between datetime and other temporal types. Instead, convert the datetime value to the type being compared and use the converted value in the predicate. It’s generally a best practice to match data types whenever possible, both for performance and to avoid ambiguity.
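A sketch of that recommendation: convert the datetime argument explicitly so both sides of the predicate go through the same conversion:

```sql
--Sketch: explicitly convert the datetime side so both predicate
--arguments are compared as datetime2(7)
DECLARE @DateTime datetime = '2016-01-01T00:00:00.003';
DECLARE @DateTime2 datetime2(7) = @DateTime;
IF CAST(@DateTime AS datetime2(7)) = @DateTime2
	PRINT 'EQUAL predicate is true';
```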

SQL Server TVP Performance Gotchas

Table-valued parameters have important considerations that developers and DBAs need to be aware of. It is essential that application code specify the proper data type and length for TVP columns in order to achieve optimal performance and reduce unnecessary overhead. Additionally, a trace (Extended Events, server-side SQL Trace, Profiler) that captures RPC events of an inappropriately defined TVP can not only exacerbate performance issues, but also affect the stability of the SQL Server instance in some cases.

The graph below summarizes the impact the application code max column length specification can have on performance, without and with a trace running. All tests used the same table type of 10 varchar(50) columns and 10,000 TVP rows passed via a DataTable object of 10 string columns. The only variables were the max column length specified by the app code and whether a trace was running on the database server. The client application was run on a different machine than the database server, and elapsed time was measured by the application.

Figure 1: Impact of TVP maximum column length specification with and without tracing

The average duration was 113ms without a trace running when the app code used the default -1 max column length. However, when max column length 50 was specified (matching the varchar(50) column of the table type), the average duration dropped significantly to 75ms. The trivial code change of specifying the string column max length of 50 improved performance by 33%.

I then ran the same pair of tests while a trace captured the RPC completed events. The average duration of the default max length test increased from 113ms to 9,324ms with the trace running, an over 80x degradation in performance! The test with the explicit 50 character max length was not nearly as impacted, increasing from 75ms to 89ms (which is tolerable, IMHO).

TVP Internals
Under the hood, TVP data are passed to SQL Server over the Tabular Data Stream (TDS) protocol. The client API sends TVP column meta-data to SQL Server describing the data type, length, and other meta-data for each TVP column followed by data rows with each column in native format matching the preceding data type specification. SQL Server uses the provided column meta-data to prepare and fill buffers for efficient processing on the server side, leveraging native types to eliminate parsing overhead similarly to other parameterized queries.

Before a query or stored proc with a TVP starts executing, SQL Server creates a table in tempdb with the same schema as the parameter table type and uses bulk insert internally to efficiently load the table with TVP rows streamed by the client application. The size of a TVP is constrained only by available tempdb storage. SQL Server executes the query/proc after the TVP temp table is loaded and the parameterized T-SQL query/proc can then use the TVP data.

The TVP columns provided by the client application do not have to match the schema of the target table type; SQL Server implicitly converts TVP values to match the target table type column when data types differ. Although not optimal, implicit conversion is generally not a major factor in overall TVP performance.
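For reference, a minimal TVP round trip in T-SQL looks like this (the type and proc names are hypothetical, not from the tests above):

```sql
--Sketch with hypothetical names: create a table type, a proc that
--accepts it, and pass a TVP from T-SQL
CREATE TYPE dbo.OrderItemList AS TABLE (ItemName varchar(50) NOT NULL);
GO
CREATE PROCEDURE dbo.ProcessOrderItems @Items dbo.OrderItemList READONLY
AS
SELECT COUNT(*) AS ItemCount FROM @Items;
GO
DECLARE @Items dbo.OrderItemList;
INSERT INTO @Items VALUES ('Widget'), ('Gadget');
EXEC dbo.ProcessOrderItems @Items = @Items;
```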

The application-specified TVP max column length can impact performance significantly, and in some cases dramatically, as illustrated by the performance tests shown earlier. SQL Server prepares to receive TVP data up to the max length specified by the client application rather than the defined size of the target table type column. When the specified max length of a variable-length column exceeds the 8000-byte tipping point, SQL Server uses a different code path to allow for large object (LOB) values up to 2GB. Unless the table type actually contains LOB types (varchar(MAX), nvarchar(MAX)), database server resources are wasted unnecessarily when an inappropriate max column length is specified.

LOB values are especially problematic when a trace captures the RPC completed event of a TVP query. Tracing uses memory from the OBJECTSTORE_LBSS memory pool to build trace records that contain TVP LOB values. From my observations of the sys.dm_os_memory_clerks DMV, each LOB cell of a TVP requires about 8K during tracing regardless of the actual value length. This memory adds up very quickly when many rows and LOB columns are passed via a TVP with a trace running. For example, the 10,000 row TVP with 10 LOB columns used in the earlier test required over 800MB of memory for a single trace record. A large number of TVP LOB cells and/or concurrent TVP queries can cause queries to fail with insufficient memory errors. In extreme cases, the entire instance can become unstable and even crash due to tracing of TVP queries.

Specifying Proper TVP Parameter Column Meta-Data
A SQL Server development best practice has long been to use strongly-typed parameters with attention to detail regarding the parameter data type and length such that it is consistent with the types on the server. This practice improves performance by avoiding implicit data type conversions, promotes sargable expressions, avoids unnecessary procedure cache bloat, and inherently validates data for proper typing on the client before it is sent to the database server. With scalar parameters, one need only specify the correct SqlDbType along with the proper length or precision/scale (avoiding the AddWithValue method to add parameters) and all is well in the world.

TVP parameters require additional column meta-data not applicable to scalar parameters. The parameter data type of a TVP in .NET is always SqlDbType.Structured. The additional TVP column meta-data is inferred from the supplied parameter value, which may be a DbDataReader, IEnumerable, or DataTable object. These objects inherently contain column meta-data and methods to enumerate rows, which the SqlClient API uses to send the TVP to SQL Server.

DataTable objects are most commonly used as TVP values. DataTables are easy to use and can serve as containers for data beyond just TVP usage. But unlike DbDataReader and IEnumerable objects, a big gotcha with a DataTable is that the default column data type is String with a maximum length of -1 (a 2GB LOB). This is the .NET equivalent of the SQL Server nvarchar(MAX) data type and has many insidious and negative implications with a TVP. First, values of types other than string that are added to a DataTable string column will be converted to string (DateTime, Integer, GUID, etc.). Consequently, using the default DataTable column string data type for non-string types will:

• increase client memory requirements compared to more compact native types
• incur conversion overhead
• prevent strong-typed data validation on client side
• require date format aware formatting of date and datetime values
• require using a period as the decimal separator
• increase network usage compared with smaller native types

When String is the proper column data type, developers must be especially mindful of the max length specification when the DataTable is used as the TVP value. Strings in Windows and .NET are Unicode, requiring 2 bytes per character. This means a max length of over 4000 characters will cross the 8000 byte threshold for LOB data on the server side regardless of the table type on the server. Avoid using long string columns in TVPs when many rows are passed and never use the default -1 length unless a MAX type is actually intended.

DBAs who support applications that use TVPs should be aware of the repercussions tracing can have on performance and SQL Server memory. Avoid capturing TVP RPC completed events of large TVP requests, if possible. When tracing TVP RPC completed events, monitor the OBJECTSTORE_LBSS memory pool for excessive memory usage.
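A sketch of such monitoring against the memory clerks DMV:

```sql
--Sketch: check memory consumed by the trace LOB memory pool
SELECT [type], name, pages_kb
FROM sys.dm_os_memory_clerks
WHERE [type] = N'OBJECTSTORE_LBSS';
```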

Acknowledgements
I’d like to thank SQL Server MVP Ola Hallengren for suggesting that I write this article.

SSMS is Free

The generally available (GA) release of SQL Server Management Studio 2016 is free. No SQL Server license is required to install and use it for either development or production purposes. This GA version supports SQL Server versions 2008 through 2016 plus Azure SQL Database. You might be able to use it against earlier SQL Server versions too, but that’s not officially supported and features will be limited. SSMS 2016, now based on the Visual Studio 2015 isolated shell, can be downloaded from the SSMS Download Page.

Prior to SQL Server 2016, one typically installed SSMS from the SQL Server installation media and it was upgraded only by CUs, service packs, or SQL Server version upgrades. This changed with the release of SQL Server 2016; SSMS is now a separate download. After installation, regular updates are made available (monthly) and can be installed from the SSMS Tools–>Check for Updates… menu option. These frequent updates allow the tooling to stay aligned with both on-prem and Azure SQL Database while delivering enhancements and bug fixes to SSMS much more quickly. I am quite impressed by the responsiveness of the SSMS development team in this regard, having used it since the initial preview versions of the VS 2015 isolated shell were made available.

An important change in this GA version compared to earlier versions of SSMS is that SET XACT_ABORT ON is now the default session setting. That’s a good thing, in my opinion, because it helps ensure one doesn’t accidentally leave a transaction open in a query window following an error (I have many horror stories about that). But it does change script execution behavior because queries, with or without error handling, will generally not continue after an error with XACT_ABORT ON. Be aware of this when running scripts from SSMS. This and other SET options can be specified from SSMS under Query–>Query options–>Execution–>Advanced. I strongly recommend specifying SET XACT_ABORT ON unless you have a specific reason not to.
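A sketch of the behavior difference:

```sql
--Sketch: with XACT_ABORT ON, a runtime error aborts the batch and
--rolls back the open transaction rather than leaving it open
SET XACT_ABORT ON;
BEGIN TRAN;
SELECT 1 / 0; --divide-by-zero error aborts the batch here
COMMIT;       --never reached; @@TRANCOUNT is 0 afterward
```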

Kudos to Microsoft for providing SSMS as a free tool and to the SSMS development team for their ongoing commitment to SQL Server tooling improvements and timely delivery.

Add or Remove IDENTITY Property From an Existing Column Efficiently

Introduction
Refactoring is often needed to improve schema design or address changes in requirements. Occasionally, these schema modifications require removing the IDENTITY property from an existing non-empty table while retaining the existing values. Unfortunately, there is no DDL command to add or remove IDENTITY from an existing column. Traditionally, one must go through the pain of either recreating the table or jumping through hoops by adding a new column, updating the new column value with the old column value, and dropping the original column. This is especially problematic with large tables and short maintenance windows.

This article details how to use SWITCH as an alternative technique to add or remove IDENTITY from an existing column. I have been using SWITCH for many years but learned only recently that the identical schema restriction for SWITCH does not include the IDENTITY column property; one can SWITCH data from one table to another even though the IDENTITY column property differs. This behavior allows the IDENTITY property to be added or removed efficiently.

Overview
All tables have been partitioned from a database storage engine perspective since SQL Server 2005, although multiple partitions require Enterprise Edition. The implication of this storage architecture is that one can use ALTER TABLE…SWITCH regardless of SQL Server edition to move the entire contents of one table to another without physically moving rows, as long as the source and target tables have an identical schema and are physically aligned (indexes and tables on the same filegroups). SWITCH performs fast storage meta-data changes, so the operation typically takes less than a second regardless of table size. The target table must be empty before SWITCH. After SWITCH, the target table will contain the data and the source table will be empty (similar to TRUNCATE).

Since the identical schema restriction does not include the IDENTITY column property, SWITCH is a handy technique to add or remove IDENTITY from an existing column of a non-empty table. For an empty table, a simple drop and create is easier and more efficient.

Remove IDENTITY from an Existing Column
The script below uses a staging table without the IDENTITY column property along with SWITCH to remove the IDENTITY property from the column and retain previously assigned identity values. I drop the empty original table and rename the staging table to the original name along with constraints and indexes after the operation.

--create sample table and data
CREATE TABLE dbo.RemoveIdentityExample(
	  IncrementalColumn int IDENTITY NOT NULL
		CONSTRAINT PK_RemoveIdentityExample PRIMARY KEY
	, OtherData varchar(20) NULL
	);
CREATE INDEX idx_RemoveIdentityExample_OtherData
	ON dbo.RemoveIdentityExample(OtherData);
INSERT INTO dbo.RemoveIdentityExample VALUES
	  ('Sample data one')
	, ('Sample data two')
	, ('Sample data three');
GO

SET XACT_ABORT ON;
BEGIN TRY
	BEGIN TRAN;
	--create staging table with same schema, indexes, and constraints
	CREATE TABLE dbo.RemoveIdentityExampleStaging(
		  IncrementalColumn int NOT NULL --IDENTITY column property removed
			CONSTRAINT PK_RemoveIdentityExampleStaging PRIMARY KEY
		, OtherData varchar(20) NULL
		);
	CREATE INDEX idx_RemoveIdentityExampleStaging_OtherData
		ON dbo.RemoveIdentityExampleStaging(OtherData);

	ALTER TABLE dbo.RemoveIdentityExample
		SWITCH TO dbo.RemoveIdentityExampleStaging;
	DROP TABLE dbo.RemoveIdentityExample;
	--rename table
	EXEC sp_rename
		  @objname = N'dbo.RemoveIdentityExampleStaging'
		, @newname = N'RemoveIdentityExample'
		, @objtype = 'OBJECT';
	--rename constraints
	EXEC sp_rename
		  @objname = N'dbo.PK_RemoveIdentityExampleStaging'
		, @newname = N'PK_RemoveIdentityExample'
		, @objtype = 'OBJECT';
	--rename indexes
	EXEC sp_rename
		  @objname = N'dbo.RemoveIdentityExample.idx_RemoveIdentityExampleStaging_OtherData'
		, @newname = N'idx_RemoveIdentityExample_OtherData'
		, @objtype = N'INDEX';
	COMMIT;
END TRY
BEGIN CATCH
	IF @@TRANCOUNT > 0 ROLLBACK;
	THROW;
END CATCH;
GO

Add IDENTITY to an Existing Column
This script uses a staging table with the IDENTITY column property along with SWITCH to add the IDENTITY property while retaining the existing column values. After the SWITCH operation, DBCC CHECKIDENT seeds the next IDENTITY value greater than the current value in the column.

Note that an IDENTITY column must not allow NULL values, so a nullable column must first be altered to NOT NULL before using this method.
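A sketch of that preliminary step (the table name below is hypothetical):

```sql
--Sketch with a hypothetical table: change a nullable column to NOT NULL
--before applying the SWITCH technique (the column must contain no NULLs)
ALTER TABLE dbo.SomeTable
	ALTER COLUMN IncrementalColumn int NOT NULL;
```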

--create sample table and data
CREATE TABLE dbo.AddIdentityExample(
	  IncrementalColumn int NOT NULL
		CONSTRAINT PK_AddIdentityExample PRIMARY KEY
	, OtherData varchar(20) NULL
	);
CREATE INDEX idx_AddIdentityExample_OtherData
	ON dbo.AddIdentityExample(OtherData);
INSERT INTO dbo.AddIdentityExample VALUES
	  (1, 'Sample data one')
	, (2, 'Sample data two')
	, (3, 'Sample data three');
GO

SET XACT_ABORT ON;
BEGIN TRY
	BEGIN TRAN;
	--create staging table with same schema, indexes, and constraints
	CREATE TABLE dbo.AddIdentityExampleStaging(
		  IncrementalColumn int IDENTITY NOT NULL --IDENTITY column property added
			CONSTRAINT PK_AddIdentityExampleStaging PRIMARY KEY
		, OtherData varchar(20) NULL
		);
	CREATE INDEX idx_AddIdentityExampleStaging_OtherData
		ON dbo.AddIdentityExampleStaging(OtherData);

	ALTER TABLE dbo.AddIdentityExample
		SWITCH TO dbo.AddIdentityExampleStaging;
	DROP TABLE dbo.AddIdentityExample;
	--rename table
	EXEC sp_rename
		  @objname = N'dbo.AddIdentityExampleStaging'
		, @newname = N'AddIdentityExample'
		, @objtype = 'OBJECT';
	--rename constraints
	EXEC sp_rename
		  @objname = N'dbo.PK_AddIdentityExampleStaging'
		, @newname = N'PK_AddIdentityExample'
		, @objtype = 'OBJECT';
	--rename indexes
	EXEC sp_rename
		  @objname = N'dbo.AddIdentityExample.idx_AddIdentityExampleStaging_OtherData'
		, @newname = N'idx_AddIdentityExample_OtherData'
		, @objtype = N'INDEX';
	--seed IDENTITY with current max column value
	DBCC CHECKIDENT(N'dbo.AddIdentityExample');
	COMMIT;
END TRY
BEGIN CATCH
	IF @@TRANCOUNT > 0 ROLLBACK;
	THROW;
END CATCH;
GO

ORDER BY Is Required

I often see folks assume rows will be returned in order by the clustered index when ORDER BY is not specified in the query. This is despite the fact that the SQL Server Books Online ORDER BY topic specifically states “The order in which rows are returned in a result set are not guaranteed unless an ORDER BY clause is specified.”

For those that want to save a few keystrokes and omit the needed ORDER BY, here’s one example that shows rows may be returned in an order other than the clustered index.

USE tempdb;
GO

CREATE DATABASE Demo
	ON
		  (NAME='Demo_Primary1', FILENAME='D:\SqlDataFiles\Demo_Primary', SIZE=10MB)
		, (NAME='Demo_Primary2', FILENAME='D:\SqlDataFiles\Demo_Primary2', SIZE=10MB)
	LOG ON
		  (NAME='Demo_Log', FILENAME='L:\SqlLogFiles\Demo_Log.ldf', SIZE=1MB);
GO

USE Demo;
GO

CREATE TABLE OrderTest
(
	ID int,
	Name VARCHAR(50),
	OtherData char(5000)
);
CREATE CLUSTERED INDEX cdx_OrderTest ON OrderTest(ID);
GO

--load 512 odd numbers
WITH 
	 t4 AS (SELECT n FROM (VALUES(0),(0),(0),(0)) t(n))
	,t1K AS (SELECT ROW_NUMBER() OVER (ORDER BY (a.n)) AS num
		FROM t4 AS a CROSS JOIN t4 AS b CROSS JOIN t4 AS c CROSS JOIN t4 AS d CROSS JOIN t4 AS e)
INSERT INTO OrderTest WITH(TABLOCKX)
SELECT num, 'test', 'test'
FROM t1K
WHERE num % 2 = 1;
GO

----load 512 even numbers
WITH 
	 t4 AS (SELECT n FROM (VALUES(0),(0),(0),(0)) t(n))
	,t1K AS (SELECT ROW_NUMBER() OVER (ORDER BY (a.n)) AS num
		FROM t4 AS a CROSS JOIN t4 AS b CROSS JOIN t4 AS c CROSS JOIN t4 AS d CROSS JOIN t4 AS e)
INSERT INTO OrderTest WITH(TABLOCKX)
SELECT num, 'test', 'test'
FROM t1K
WHERE num % 2 = 0;
GO

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
GO

SELECT * FROM dbo.OrderTest;
/*
ID	Name	OtherData
2	test	test
4	test	test
6	test	test
8	test	test
17	test	test
19	test	test
21	test	test
23	test	test
25	test	test
27	test	test
...
*/
SELECT * FROM dbo.OrderTest ORDER BY ID;
/*
ID	Name	OtherData
1	test	test
2	test	test
3	test	test
4	test	test
5	test	test
6	test	test
7	test	test
8	test	test
9	test	test
10	test	test
...
*/
GO

The reason the query without the ORDER BY returns rows in a different order than the clustered index logical order is that SQL Server chose to scan the table in physical allocation order using the IAM (Index Allocation Map) instead of following the clustered index linked list. For details of the allocation and page linkage of this table, you can use the undocumented (don’t use it in prod) sys.dm_db_database_page_allocations table-valued function in SQL Server 2012 and later:

SELECT * 
FROM sys.dm_db_database_page_allocations(DB_ID(), OBJECT_ID(N'OrderTest'), 1, 1, 'DETAILED')
WHERE page_type_desc = 'DATA_PAGE';

The IAM scan was done here because this query has no ordering requirement, the allocation order scan was deemed more efficient, and it is permissible in the READ UNCOMMITTED isolation level because read consistency isn’t guaranteed at that level. Other factors can also affect the ordering of results, including available indexes, execution plan operators, parallelism, and other concurrent activity.

Summary
Remember that SQL Server strives to execute queries as efficiently as possible as long as it adheres to the data contract. The chosen plan and storage engine internals that vary by SQL Server version and edition will influence ordering of results. The ordering of rows is by happenstance unless ORDER BY is specified.

QUERYTRACEON Permissions

I recently had a SQL Server 2014 stored procedure that ran for several hours with high CPU usage. I narrowed the problem down to a single query with dozens of joins that was stuck in compilation. Suspecting an issue with the new cardinality estimator (CE), I changed the database compatibility level from 120 (SQL Server 2014) to 110 (SQL Server 2012) so that the legacy CE would be used. The query then compiled immediately and executed quickly, which confirmed the new CE as the culprit of the excessive compilation time.

OPTION QUERYTRACEON (9481) to the Rescue
Changing the database compatibility level was a heavy-handed option I didn’t want to pursue because some queries benefited from the new CE behavior. Instead, I specified trace flag 9481 using the OPTION (QUERYTRACEON 9481) query hint so the legacy CE is used for the problem query only. This is an acceptable workaround for me until the bug is addressed. However, you might not be so lucky. A more common use case for OPTION (QUERYTRACEON 9481) is performance regression with the new CE. Although not documented as far as I know, the QUERYTRACEON query hint requires sysadmin permissions unless executed from within a stored procedure. Consequently, QUERYTRACEON won’t be an option for those who need the hint for queries run by non-sysadmins outside stored procs. I found SQL Server MVP Ola Hallengren’s Connect suggestion to allow QUERYTRACEON with no additional permissions. I upvoted it and encourage you to do so too.
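For illustration, the hint is simply appended to the problem query (the object names below are hypothetical):

```sql
--Sketch with hypothetical object names: use the legacy CE for this query only
SELECT o.OrderID, c.CustomerName
FROM dbo.Orders AS o
JOIN dbo.Customers AS c ON c.CustomerID = o.CustomerID
OPTION (QUERYTRACEON 9481);
```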

Tiered Storage Partition Copy

In my last tiered storage sliding window post, I shared a sliding window script that also moves an older partition to a different filegroup as part of a tiered storage strategy. This approach allows one to keep actively used partitions on the fastest storage available while keeping older less frequently used data on less expensive storage. That version of the script moves data efficiently using a staging table and CREATE INDEX…DROP EXISTING but the downside is the data being moved is unavailable for querying from the time the partition is switched out of the main table until it is switched back in. Consequently, the maintenance needs to be scheduled during a window where data in the partition being moved isn’t needed.

This follow-up article shows an alternative copy technique instead. Data in the partition being moved is left in the main table until after the copy completes. Once data are copied to the different filegroup, partition maintenance is performed to switch out the original partition and switch in the copied data, now on the older data filegroup. This allows data to remain online except during the final partition switches, which are fast meta-data operations. It is assumed that the data in the partition being moved is read-only during this process.

Copy Data to Slower Storage
Unfortunately, neither ALTER INDEX nor CREATE INDEX…DROP EXISTING provides the capability to repartition an individual partition of a table/index. ALTER INDEX allows one to target a specific partition during a REORGANIZE or REBUILD but not to change the filegroup or partition scheme in the process. CREATE INDEX…DROP EXISTING allows a filegroup or partition scheme specification but applies to the entire index; individual partitions cannot be specified. Consequently, DML (INSERT…SELECT) rather than DDL must be used to keep data online during the copy process.
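The copy itself can be sketched as an INSERT…SELECT filtered to the partition being moved (object and column names below are hypothetical placeholders, not the actual script at the end of this article):

```sql
--Sketch with hypothetical names: copy one partition's rows into the
--staging table already partitioned on the slower-storage filegroup
DECLARE @PartitionNumber int = 2;
INSERT INTO dbo.SlidingWindowTable_Staging WITH (TABLOCKX)
SELECT *
FROM dbo.SlidingWindowTable
WHERE $PARTITION.PF_12MonthSlidingWindow(PartitioningDate) = @PartitionNumber;
```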

I’ll assume you’ve already slid the window by purging expired data and preparing for new data as detailed in my last article, with the only remaining task being to move an older partition to a slower storage filegroup. The following steps will keep the data being moved online during the potentially long copy process. The offline operations performed in steps 3-6 are metadata operations that will complete quickly once an exclusive table lock (taken to avoid deadlocks) can be granted. Note that all partition functions and schemes are identical and the staging table is empty before these steps are performed.

1) Move the empty staging table partition to the older data filegroup
2) Load data into the moved partition of the staging table
3) Switch the copied data out of the main table into a second staging table
4) Switch the adjacent partition out of the main table into the first staging table
5) Move the main table partition to the older data filegroup
6) Switch the partitions from the first staging table back into the main table

The complete DDL and script is at the end of this article. It differs from my last article (which I’ll assume you’ve already perused) by the addition of a second staging table and this alternative move technique. Below are the individual steps.
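Since several of these steps depend on partitions being exactly where expected, it’s handy to verify the partition-to-filegroup mapping between steps. The query below is a sketch built on the standard catalog views; the object names assume the DDL at the end of this article:

--map each partition of the main table to its filegroup and lower boundary
SELECT
	  p.partition_number
	, fg.name AS filegroup_name
	, prv.value AS lower_boundary
	, p.rows
FROM sys.partitions AS p
JOIN sys.indexes AS i
	ON i.object_id = p.object_id AND i.index_id = p.index_id
JOIN sys.partition_schemes AS ps
	ON ps.data_space_id = i.data_space_id
JOIN sys.destination_data_spaces AS dds
	ON dds.partition_scheme_id = ps.data_space_id
	AND dds.destination_id = p.partition_number
JOIN sys.filegroups AS fg
	ON fg.data_space_id = dds.data_space_id
LEFT JOIN sys.partition_range_values AS prv
	ON prv.function_id = ps.function_id
	AND prv.boundary_id = p.partition_number - 1 --RANGE RIGHT: lower boundary
WHERE p.object_id = OBJECT_ID(N'dbo.SlidingWindowTable')
	AND i.index_id IN (0, 1)
ORDER BY p.partition_number;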

Move the empty staging table partition to the older data filegroup
Remember, the staging table is partitioned using the secondary partition function/scheme so that it can be repartitioned independently of the main table. The first task is to move the partition of the unused secondary partition scheme to the slower storage filegroup using MERGE and SPLIT:

1) MERGE the secondary partition function boundary of the moved month to remove it from the secondary partition function and scheme:

ALTER PARTITION FUNCTION PF_12MonthSlidingWindow_Secondary()
    MERGE RANGE (@MonthStartDateToMoveFromCurrentToOld);

2) Set the secondary partition scheme NEXT USED to slower storage filegroup:

ALTER PARTITION SCHEME PS_12MonthSlidingWindow_Secondary
    NEXT USED FG_OlderData;

3) SPLIT the secondary partition function for the moved month to recreate the partition on the slower storage filegroup:

ALTER PARTITION FUNCTION PF_12MonthSlidingWindow_Secondary()
    SPLIT RANGE (@MonthStartDateToMoveFromCurrentToOld);

Load the moved partition of the staging table
Now that the staging table partition is on the FG_OlderData filegroup, copy the data for the partition to be moved into the staging table using INSERT…SELECT. Note that the WHERE clause specifies the source partition boundaries:

INSERT INTO dbo.SlidingWindowTable_Staging WITH(TABLOCKX) (PartitioningColumn, OtherKeyColumn, OtherData)
SELECT
	  PartitioningColumn
	, OtherKeyColumn
	, OtherData
FROM dbo.SlidingWindowTable
WHERE
		PartitioningColumn >= @MonthStartDateToMoveFromCurrentToOld
	AND PartitioningColumn < DATEADD(month, 1, @MonthStartDateToMoveFromCurrentToOld);
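Before the original partition is switched out, it’s worth confirming that the staged copy is complete. A simple sanity check (shown here as a sketch; it relies on the partition being read-only during the copy, as assumed above) is to compare row counts:

--source partition and staged copy row counts must match before switching
SELECT
	  (SELECT COUNT_BIG(*)
	   FROM dbo.SlidingWindowTable
	   WHERE PartitioningColumn >= @MonthStartDateToMoveFromCurrentToOld
		 AND PartitioningColumn < DATEADD(month, 1, @MonthStartDateToMoveFromCurrentToOld)) AS source_rows
	, (SELECT COUNT_BIG(*) FROM dbo.SlidingWindowTable_Staging) AS staged_rows;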

Switch the copied data out of the main table into a second staging table
The second staging table is used to permanently remove the copied data from the original partition, which is still on the FG_NewerData filegroup.

TRUNCATE TABLE dbo.SlidingWindowTable_Staging2;
ALTER TABLE dbo.SlidingWindowTable
    SWITCH PARTITION $PARTITION.PF_12MonthSlidingWindow(@MonthStartDateToMoveFromCurrentToOld)
    TO dbo.SlidingWindowTable_Staging2 PARTITION $PARTITION.PF_12MonthSlidingWindow(@MonthStartDateToMoveFromCurrentToOld);

TRUNCATE TABLE dbo.SlidingWindowTable_Staging2;

Switch the adjacent partition out of the main table into the first staging table
Before we move the main table partition (now empty) to the OlderData filegroup, the adjacent partition is also switched out. This is technically not required but is done to follow Microsoft’s recommended best practice of merging only empty partitions.

ALTER TABLE dbo.SlidingWindowTable
    SWITCH PARTITION $PARTITION.PF_12MonthSlidingWindow(DATEADD(month, -1, @MonthStartDateToMoveFromCurrentToOld))
    TO dbo.SlidingWindowTable_Staging PARTITION $PARTITION.PF_12MonthSlidingWindow_Secondary(DATEADD(month, -1, @MonthStartDateToMoveFromCurrentToOld));

Move the main table partition to the older data filegroup
The main table partition is moved from the FG_NewerData filegroup to FG_OlderData with MERGE and SPLIT. Keep in mind that the adjacent partition on the FG_OlderData filegroup is empty. Data for both of these partitions are in the staging table.

--remove partition from main table
ALTER PARTITION FUNCTION PF_12MonthSlidingWindow()
    MERGE RANGE (@MonthStartDateToMoveFromCurrentToOld);

--create new empty partition on old data filegroup for moved data
ALTER PARTITION SCHEME PS_12MonthSlidingWindow
    NEXT USED FG_OlderData;
ALTER PARTITION FUNCTION PF_12MonthSlidingWindow()
    SPLIT RANGE (@MonthStartDateToMoveFromCurrentToOld);

Switch the partitions from the first staging table back into the main table
Both main and staging table partition schemes are now identical. The partitions can now be switched back into the main table.

ALTER TABLE dbo.SlidingWindowTable_Staging
    SWITCH PARTITION $PARTITION.PF_12MonthSlidingWindow_Secondary(DATEADD(month, -1, @MonthStartDateToMoveFromCurrentToOld))
    TO dbo.SlidingWindowTable PARTITION $PARTITION.PF_12MonthSlidingWindow(DATEADD(month, -1, @MonthStartDateToMoveFromCurrentToOld));
ALTER TABLE dbo.SlidingWindowTable_Staging
    SWITCH PARTITION $PARTITION.PF_12MonthSlidingWindow_Secondary(@MonthStartDateToMoveFromCurrentToOld)
    TO dbo.SlidingWindowTable PARTITION $PARTITION.PF_12MonthSlidingWindow(@MonthStartDateToMoveFromCurrentToOld);

Scripts
Below are the complete example scripts to implement this copy sliding window technique. My last tiered storage sliding window post also includes the CREATE DATABASE, including filegroups.

--main table partition function (before start of new month)
CREATE PARTITION FUNCTION PF_12MonthSlidingWindow (datetime2(0))
AS RANGE RIGHT FOR VALUES (
	  '2014-11-01T00:00:00' --current month -12 on OlderData filegroup
	, '2014-12-01T00:00:00' --current month -11 on OlderData filegroup
	, '2015-01-01T00:00:00' --current month -10 on OlderData filegroup
	, '2015-02-01T00:00:00' --current month -9 on OlderData filegroup
	, '2015-03-01T00:00:00' --current month -8 on OlderData filegroup
	, '2015-04-01T00:00:00' --current month -7 on OlderData filegroup
	, '2015-05-01T00:00:00' --current month -6 on OlderData filegroup
	, '2015-06-01T00:00:00' --current month -5 on OlderData filegroup
	, '2015-07-01T00:00:00' --current month -4 on OlderData filegroup
	, '2015-08-01T00:00:00' --current month -3 on NewerData filegroup
	, '2015-09-01T00:00:00' --current month -2 on NewerData filegroup
	, '2015-10-01T00:00:00' --current month -1 on NewerData filegroup
	, '2015-11-01T00:00:00' --current month on NewerData filegroup
	, '2015-12-01T00:00:00' --next month on NewerData filegroup
	, '2016-01-01T00:00:00' --future month
)
GO

--main table partition scheme
CREATE PARTITION SCHEME PS_12MonthSlidingWindow
AS PARTITION PF_12MonthSlidingWindow
TO (
	  FG_OlderData --older than 12 months
	, FG_OlderData --current month -12 on OlderData
	, FG_OlderData --current month -11 on OlderData
	, FG_OlderData --current month -10 on OlderData
	, FG_OlderData --current month -9 on OlderData
	, FG_OlderData --current month -8 on OlderData
	, FG_OlderData --current month -7 on OlderData
	, FG_OlderData --current month -6 on OlderData
	, FG_OlderData --current month -5 on OlderData
	, FG_OlderData --current month -4 on OlderData
	, FG_NewerData --current month -3 on NewerData
	, FG_NewerData --current month -2 on NewerData
	, FG_NewerData --current month -1 on NewerData
	, FG_NewerData --current month on NewerData
	, FG_NewerData --next month on NewerData
	, FG_NewerData --future month on NewerData
      );
GO

--secondary partition function identical to main function
CREATE PARTITION FUNCTION PF_12MonthSlidingWindow_Secondary (datetime2(0))
AS RANGE RIGHT FOR VALUES (
	  '2014-11-01T00:00:00' --current month -12 on OlderData filegroup
	, '2014-12-01T00:00:00' --current month -11 on OlderData filegroup
	, '2015-01-01T00:00:00' --current month -10 on OlderData filegroup
	, '2015-02-01T00:00:00' --current month -9 on OlderData filegroup
	, '2015-03-01T00:00:00' --current month -8 on OlderData filegroup
	, '2015-04-01T00:00:00' --current month -7 on OlderData filegroup
	, '2015-05-01T00:00:00' --current month -6 on OlderData filegroup
	, '2015-06-01T00:00:00' --current month -5 on OlderData filegroup
	, '2015-07-01T00:00:00' --current month -4 on OlderData filegroup
	, '2015-08-01T00:00:00' --current month -3 on NewerData filegroup
	, '2015-09-01T00:00:00' --current month -2 on NewerData filegroup
	, '2015-10-01T00:00:00' --current month -1 on NewerData filegroup
	, '2015-11-01T00:00:00' --current month on NewerData filegroup
	, '2015-12-01T00:00:00' --next month on NewerData filegroup
	, '2016-01-01T00:00:00' --future month
)
GO

--secondary partition scheme identical to main scheme
CREATE PARTITION SCHEME PS_12MonthSlidingWindow_Secondary
AS PARTITION PF_12MonthSlidingWindow_Secondary
TO (
	  FG_OlderData --older than 12 months
	, FG_OlderData --current month -12 on OlderData
	, FG_OlderData --current month -11 on OlderData
	, FG_OlderData --current month -10 on OlderData
	, FG_OlderData --current month -9 on OlderData
	, FG_OlderData --current month -8 on OlderData
	, FG_OlderData --current month -7 on OlderData
	, FG_OlderData --current month -6 on OlderData
	, FG_OlderData --current month -5 on OlderData
	, FG_OlderData --current month -4 on OlderData
	, FG_NewerData --current month -3 on NewerData
	, FG_NewerData --current month -2 on NewerData
	, FG_NewerData --current month -1 on NewerData
	, FG_NewerData --current month on NewerData
	, FG_NewerData --next month on NewerData
	, FG_NewerData --future month on NewerData
      );
GO

--main partitioned table
CREATE TABLE dbo.SlidingWindowTable(
	  PartitioningColumn datetime2(0)
	, OtherKeyColumn int NOT NULL
	, OtherData int NULL
	, CONSTRAINT PK_SlidingWindowTable PRIMARY KEY
		CLUSTERED (PartitioningColumn, OtherKeyColumn)
		ON PS_12MonthSlidingWindow(PartitioningColumn)
	) ON PS_12MonthSlidingWindow(PartitioningColumn);
GO

--The staging table initially uses the same scheme as the main table
--except while maintenance is in progress.
CREATE TABLE dbo.SlidingWindowTable_Staging(
	  PartitioningColumn datetime2(0)
	, OtherKeyColumn int NOT NULL
	, OtherData int NULL
	, CONSTRAINT PK_SlidingWindowTable_Staging PRIMARY KEY
		CLUSTERED (PartitioningColumn, OtherKeyColumn)
		ON PS_12MonthSlidingWindow(PartitioningColumn)
	) ON PS_12MonthSlidingWindow(PartitioningColumn);
GO

--This second staging table uses the same scheme as the main table and
--is used to switch data out of the main table after a partition is copied.
CREATE TABLE dbo.SlidingWindowTable_Staging2(
	  PartitioningColumn datetime2(0)
	, OtherKeyColumn int NOT NULL
	, OtherData int NULL
	, CONSTRAINT PK_SlidingWindowTable_Staging2 PRIMARY KEY
		CLUSTERED (PartitioningColumn, OtherKeyColumn)
		ON PS_12MonthSlidingWindow(PartitioningColumn)
	) ON PS_12MonthSlidingWindow(PartitioningColumn);
GO

WITH
	t4 AS (SELECT n FROM (VALUES(0),(0),(0),(0)) t(n))
	,t256 AS (SELECT 0 AS n FROM t4 AS a CROSS JOIN t4 AS b CROSS JOIN t4 AS c CROSS JOIN t4 AS d)
	,t16M AS (SELECT ROW_NUMBER() OVER (ORDER BY (a.n)) - 1 AS num FROM t256 AS a CROSS JOIN t256 AS b CROSS JOIN t256 AS c)
INSERT INTO dbo.SlidingWindowTable WITH (TABLOCKX) (PartitioningColumn, OtherKeyColumn, OtherData) 
SELECT DATEADD(month, num/1000000, '20141101'), num, num
FROM t16M
WHERE num <= 13000000;
GO

CREATE PROCEDURE dbo.PerformMonthlySlidingWindowMaintenace
	  @ExpiredMonthStartDate datetime2(0) --purge data for this month
	, @MonthStartDateToMoveFromCurrentToOld datetime2(0) --move data from this month from current data filegroup to old data filegroup
	, @FutureMonthStartDate datetime2(0) --create new partition for this date on current data filegroup
/*
	This proc performs partition maintenance needed to maintain a monthly sliding
	window with filegroups on tiered storage.  The objective is to keep recent frequently
	used data on filegroup FG_NewData and less often accessed data on PS_12MonthSlidingWindow_Secondary.
*/
AS

SET NOCOUNT ON;
SET XACT_ABORT ON;

DECLARE @result int;

BEGIN TRY

	BEGIN TRAN;

	--acquire exclusive lock on main table to prevent deadlocking during partition maintenance
	SET @result = (SELECT TOP (0) 1 FROM dbo.SlidingWindowTable WITH (TABLOCKX));

	--**************************
	--*** Purge expired data ***
	--**************************
	--purge oldest partition from main table (including partiton for data older than first boundary)
	TRUNCATE TABLE dbo.SlidingWindowTable_Staging;
	ALTER TABLE dbo.SlidingWindowTable
		SWITCH PARTITION $PARTITION.PF_12MonthSlidingWindow(DATEADD(month, -1, @ExpiredMonthStartDate))
		TO dbo.SlidingWindowTable_Staging PARTITION $PARTITION.PF_12MonthSlidingWindow_Secondary(DATEADD(month, -1, @ExpiredMonthStartDate));
	ALTER TABLE dbo.SlidingWindowTable
		SWITCH PARTITION $PARTITION.PF_12MonthSlidingWindow(@ExpiredMonthStartDate)
		TO dbo.SlidingWindowTable_Staging PARTITION $PARTITION.PF_12MonthSlidingWindow_Secondary(@ExpiredMonthStartDate);
	TRUNCATE TABLE dbo.SlidingWindowTable_Staging;

	--remove expired partition boundary from both functions after purge
	ALTER PARTITION FUNCTION PF_12MonthSlidingWindow()
		MERGE RANGE (@ExpiredMonthStartDate);
	ALTER PARTITION FUNCTION PF_12MonthSlidingWindow_Secondary()
		MERGE RANGE (@ExpiredMonthStartDate);

	--*******************************
	--*** Prepare for future data ***
	--*******************************
	--add new partition on new data filegroup for future data to both functions/schemes
	ALTER PARTITION SCHEME PS_12MonthSlidingWindow
			  NEXT USED FG_NewerData;
	ALTER PARTITION FUNCTION PF_12MonthSlidingWindow()
		SPLIT RANGE (@FutureMonthStartDate);
	ALTER PARTITION SCHEME PS_12MonthSlidingWindow_Secondary
			  NEXT USED FG_NewerData;
	ALTER PARTITION FUNCTION PF_12MonthSlidingWindow_Secondary()
		SPLIT RANGE (@FutureMonthStartDate);

	--this will release the exclusive table lock
	COMMIT;

	--********************************************************
	--*** move partition from FG_NewerData to FG_OlderData ***
	--********************************************************
	--alter secondary partition function and scheme to move partition from FG_NewerData to FG_OlderData
	ALTER PARTITION FUNCTION PF_12MonthSlidingWindow_Secondary()
		MERGE RANGE (@MonthStartDateToMoveFromCurrentToOld);
	ALTER PARTITION SCHEME PS_12MonthSlidingWindow_Secondary
		NEXT USED FG_OlderData;
	ALTER PARTITION FUNCTION PF_12MonthSlidingWindow_Secondary()
		SPLIT RANGE (@MonthStartDateToMoveFromCurrentToOld);

	--rebuild staging table using new partition scheme to move partition to filegroup FG_OlderData
	CREATE UNIQUE CLUSTERED INDEX PK_SlidingWindowTable_Staging
		ON dbo.SlidingWindowTable_Staging(PartitioningColumn, OtherKeyColumn)
		WITH(DROP_EXISTING=ON)
		ON PS_12MonthSlidingWindow_Secondary(PartitioningColumn);

	--copy data from main table into corresponding staging table partition, now on FG_OlderData
	INSERT INTO dbo.SlidingWindowTable_Staging WITH(TABLOCKX) (PartitioningColumn, OtherKeyColumn, OtherData)
		SELECT
			  PartitioningColumn
			, OtherKeyColumn
			, OtherData
		FROM dbo.SlidingWindowTable
		WHERE
			PartitioningColumn >= @MonthStartDateToMoveFromCurrentToOld
			AND PartitioningColumn < DATEADD(month, 1, @MonthStartDateToMoveFromCurrentToOld);

	BEGIN TRAN;

	--acquire exclusive lock on main table to prevent deadlocking during partition maintenance
	SET @result = (SELECT TOP (0) 1 FROM dbo.SlidingWindowTable WITH (TABLOCKX));

	--make sure the second staging table is empty
	TRUNCATE TABLE dbo.SlidingWindowTable_Staging2;

	--switch out the partition to be moved, still on FG_NewerData, into the second staging table
	ALTER TABLE dbo.SlidingWindowTable
		SWITCH PARTITION $PARTITION.PF_12MonthSlidingWindow(@MonthStartDateToMoveFromCurrentToOld)
		TO dbo.SlidingWindowTable_Staging2 PARTITION $PARTITION.PF_12MonthSlidingWindow(@MonthStartDateToMoveFromCurrentToOld);

	--remove redundant copy of data
	TRUNCATE TABLE dbo.SlidingWindowTable_Staging2;

	--switch the latest partition of main table on the FG_OlderData filegroup to the staging table
	--so that adjacent partitions of the main table are empty for the MERGE and SPLIT needed to 
	--physically move the main table partition to FG_OlderData
	ALTER TABLE dbo.SlidingWindowTable
		SWITCH PARTITION $PARTITION.PF_12MonthSlidingWindow(DATEADD(month, -1, @MonthStartDateToMoveFromCurrentToOld))
		TO dbo.SlidingWindowTable_Staging PARTITION $PARTITION.PF_12MonthSlidingWindow_Secondary(DATEADD(month, -1, @MonthStartDateToMoveFromCurrentToOld));

	--remove empty partition
	ALTER PARTITION FUNCTION PF_12MonthSlidingWindow()
		MERGE RANGE (@MonthStartDateToMoveFromCurrentToOld);

	--create new empty partition on old data filegroup for moved data
	ALTER PARTITION SCHEME PS_12MonthSlidingWindow
			  NEXT USED FG_OlderData;
	ALTER PARTITION FUNCTION PF_12MonthSlidingWindow()
		SPLIT RANGE (@MonthStartDateToMoveFromCurrentToOld);

	--now that both partition functions/schemes are identical, switch partitions from staging table back into main table
	ALTER TABLE dbo.SlidingWindowTable_Staging
		SWITCH PARTITION $PARTITION.PF_12MonthSlidingWindow_Secondary(DATEADD(month, -1, @MonthStartDateToMoveFromCurrentToOld))
		TO dbo.SlidingWindowTable PARTITION $PARTITION.PF_12MonthSlidingWindow(DATEADD(month, -1, @MonthStartDateToMoveFromCurrentToOld));
	ALTER TABLE dbo.SlidingWindowTable_Staging
		SWITCH PARTITION $PARTITION.PF_12MonthSlidingWindow_Secondary(@MonthStartDateToMoveFromCurrentToOld)
		TO dbo.SlidingWindowTable PARTITION $PARTITION.PF_12MonthSlidingWindow(@MonthStartDateToMoveFromCurrentToOld);

	COMMIT;

END TRY
BEGIN CATCH

	IF @@TRANCOUNT > 0 ROLLBACK;

	THROW;

END CATCH;
GO

CHECKPOINT
DBCC DROPCLEANBUFFERS
--when scheduling after the start of a new month, set @RunMonth to the first day of the current month
DECLARE
	--@RunMonth datetime2(0) = DATEADD(day, 1, DATEADD(month, -1, EOMONTH(GETDATE())));
	  @RunMonth datetime2(0) = '20151201';

--calculate boundary dates based on @RunMonth
DECLARE
	  @ExpiredMonthStartDate datetime2(0) = DATEADD(month, -13, @RunMonth)
	, @MonthStartDateToMoveFromCurrentToOld datetime2(0) = DATEADD(month, -4, @RunMonth)
	, @FutureMonthStartDate datetime2(0) = DATEADD(month, 2, @RunMonth);

SELECT
	  @ExpiredMonthStartDate AS ExpiredMonthStartDate
	, @MonthStartDateToMoveFromCurrentToOld AS MonthStartDateToMoveFromCurrentToOld
	, @FutureMonthStartDate AS FutureMonthStartDate;

EXEC dbo.PerformMonthlySlidingWindowMaintenace
	  @ExpiredMonthStartDate = @ExpiredMonthStartDate
	, @MonthStartDateToMoveFromCurrentToOld = @MonthStartDateToMoveFromCurrentToOld
	, @FutureMonthStartDate = @FutureMonthStartDate;
GO

Partitioned Tables and Tiered Storage

A sliding window has long been a mainstay use case for table partitioning. A table partitioning sliding window pattern provides an efficient way to purge data according to retention needs and allows individual partitions to be placed on segregated storage. Well-planned data placement can improve performance of parallel scans of related data, meet RTO objectives with piecemeal restores, and help meet performance SLAs with tiered storage. One can place more actively used partitions on the fastest storage available while keeping older, less often used data on slower storage, which is typically much less expensive and more abundant.

I’ve posted articles on this site that show how to automate retention of temporal data using partitioned tables and T-SQL scripts. In this post, I’ll combine those building blocks to implement a tiered storage strategy with an automated sliding window. This technique will keep the most current data on the fastest storage, move data to slower storage as temporal partitions age, purge the oldest partition, and finally prepare for new data. I’ll assume you’re already familiar with partitioning concepts, objects (PARTITION FUNCTION and PARTITION SCHEME), DDL (SWITCH, SPLIT, MERGE), and partition alignment. Alignment is required in order to use the methods described here.

Tiered Storage Planning
Partitioned tables or not, remember that SQL Server provides control of data placement only at the filegroup level. Without partitioning, an entire table or index is placed on a single filegroup, and SQL Server distributes data among the underlying files according to its proportional fill algorithm. Table partitioning provides more granular control over data placement by allowing individual partitions of a table or index to be placed on different filegroups when it is advantageous to do so to meet the objectives I mentioned earlier.

In order to control placement of individual partitions on different storage, at least one filegroup is needed for each storage tier, each with at least one file on that tier. Avoid mixing files that reside on different types of storage within a filegroup if you want maximum control over performance. Consider that there may be multiple levels of storage abstraction (OS, virtualization, SAN, NAS, etc.) that ultimately determine where data are physically stored, so be mindful of those implications when placing files within a storage tier filegroup.
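A quick way to confirm file placement per tier is to list each database file with its filegroup and physical path. This query is a sketch using the standard catalog views (the log file shows NULL for filegroup_name since log files don’t belong to a filegroup):

--list each file with its filegroup and physical location to verify tier placement
SELECT
	  fg.name AS filegroup_name
	, df.name AS logical_name
	, df.physical_name
	, df.size * 8 / 1024 AS size_mb
FROM sys.database_files AS df
LEFT JOIN sys.filegroups AS fg
	ON fg.data_space_id = df.data_space_id;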

I’ll walk through the implementation with this scenario. Let’s assume 1TB of solid state storage is available, plus an additional 5TB on a spinning media SAN. Although I use only 2 storage tiers here, this technique can be extended to additional tiers if needed.

A 12 month sliding window is planned with a single partitioned table. We’ll keep the current month plus 3 prior months on solid state storage and the remaining 9 months of historical data on the SAN. This necessitates a monthly maintenance script that performs the following tasks:

1) Purge expired data (older than 12 months)
2) Prepare for new data (on solid state storage)
3) Move data to slower storage (move the 4-month-old partition from solid state to spinning media storage)
4) Cleanup

I start by creating the partition function, partition scheme, partitioned table, and a staging table partitioned using the same partition scheme (complete DDL at the end of this article). The partitioned staging table avoids the need to create check constraints on the staging table in order to switch partitions back into the main table. I also create a secondary partition function and partition scheme, identical to the others but initially unused. These objects are used to move the partition and data, which I’ll discuss in detail later.

Note that I use a RANGE RIGHT partition function, which I consider to be more natural for incremental temporal data than RANGE LEFT. I also follow the recommended practice of a SPLIT and MERGE of only empty partitions in order to avoid data movement and excessive logging (about 4 times that of normal DML).
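To see why RANGE RIGHT is the natural choice for incremental temporal data, consider that a RANGE RIGHT boundary value belongs to the partition on its right, so the first instant of a month falls in that month’s partition. Here is a minimal illustration (PF_Demo is a throwaway function, not part of the sliding window objects):

CREATE PARTITION FUNCTION PF_Demo (datetime2(0))
AS RANGE RIGHT FOR VALUES ('2015-11-01T00:00:00');
GO
--the boundary value maps to partition 2 (the partition to its right);
--values before the boundary map to partition 1
SELECT
	  $PARTITION.PF_Demo('2015-10-31T23:59:59') AS prior_month_partition
	, $PARTITION.PF_Demo('2015-11-01T00:00:00') AS new_month_partition;
GO
DROP PARTITION FUNCTION PF_Demo;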

Monthly maintenance tasks are performed by a stored procedure run from a SQL Agent job scheduled after the start of each new month. The proc takes 3 date parameters:

    @ExpiredMonthStartDate: partition boundary of expired data to purge.
    @MonthStartDateToMoveFromCurrentToOld: partition boundary of the month to move to slower storage.
    @FutureMonthStartDate: partition boundary for future data to be created on fast storage.

Below are the details of the maintenance tasks.

Purge Expired data
The first task is the same as with any sliding window, where the oldest data are purged:

1) Truncate the staging table, which is partitioned using a partition scheme identical to the main table:

TRUNCATE TABLE dbo.SlidingWindowTable_Staging;

2) Switch the expired data partitions from the main table into the corresponding staging table partitions using SWITCH:

ALTER TABLE dbo.SlidingWindowTable
	SWITCH PARTITION $PARTITION.PF_12MonthSlidingWindow(DATEADD(month, -1, @ExpiredMonthStartDate))
	TO dbo.SlidingWindowTable_Staging PARTITION $PARTITION.PF_12MonthSlidingWindow_Secondary(DATEADD(month, -1, @ExpiredMonthStartDate));
ALTER TABLE dbo.SlidingWindowTable
	SWITCH PARTITION $PARTITION.PF_12MonthSlidingWindow(@ExpiredMonthStartDate)
	TO dbo.SlidingWindowTable_Staging PARTITION $PARTITION.PF_12MonthSlidingWindow_Secondary(@ExpiredMonthStartDate);

3) Truncate staging table again:

TRUNCATE TABLE dbo.SlidingWindowTable_Staging;

4) Remove the first partition boundary with MERGE of the main table partition function:

ALTER PARTITION FUNCTION PF_12MonthSlidingWindow()
	MERGE RANGE (@ExpiredMonthStartDate);
ALTER PARTITION FUNCTION PF_12MonthSlidingWindow_Secondary()
	MERGE RANGE (@ExpiredMonthStartDate);

The secondary partition function is also merged to keep it identical to the main partition. Again, no objects yet use the secondary partition function/scheme.

Prepare for New Data
The second task is also similar to the typical sliding window pattern, except the new partition is specifically placed on the solid state filegroup and the secondary partition function/scheme is also modified:

1) Set NEXT USED partition schemes to solid state filegroup

ALTER PARTITION SCHEME PS_12MonthSlidingWindow
	NEXT USED FG_NewerData;
ALTER PARTITION SCHEME PS_12MonthSlidingWindow_Secondary
	NEXT USED FG_NewerData;

2) SPLIT main table partition function to create future boundary

ALTER PARTITION FUNCTION PF_12MonthSlidingWindow()
	SPLIT RANGE (@FutureMonthStartDate);
ALTER PARTITION FUNCTION PF_12MonthSlidingWindow_Secondary()
	SPLIT RANGE (@FutureMonthStartDate);

Move Data to Slower Storage
The final task is to move the 4-month-old partition from solid state to spinning media storage. Unlike the previous MERGE and SPLIT operations, which are metadata-only due to the empty partitions, moving the non-empty partition requires physical movement of a potentially large amount of data. Although it is possible to do this with MERGE and SPLIT alone, that would be very inefficient. Instead, I use the DROP_EXISTING clause of CREATE INDEX after switching the partition to the staging table. The DROP_EXISTING clause leverages the existing index order to avoid an expensive sort operation.

Note that this technique switches the partition to be moved out of the main table so data is not available for querying during the move process. In my next post, I’ll show how to keep read-only data online during the move using INSERT…SELECT with detailed attention to performance.

1) SWITCH the partition of the month to be moved into the staging table (partitioned using the same partition scheme as the main table to ensure alignment):

ALTER TABLE dbo.SlidingWindowTable
	SWITCH PARTITION $PARTITION.PF_12MonthSlidingWindow(DATEADD(month, -1, @MonthStartDateToMoveFromCurrentToOld))
	TO dbo.SlidingWindowTable_Staging PARTITION $PARTITION.PF_12MonthSlidingWindow (DATEADD(month, -1, @MonthStartDateToMoveFromCurrentToOld));

2) Move the partition of the unused secondary partition scheme to the slower storage filegroup:

a. MERGE the secondary partition function boundary of the moved month to remove it from the secondary partition function and scheme:

ALTER PARTITION FUNCTION PF_12MonthSlidingWindow_Secondary()
	MERGE RANGE (@MonthStartDateToMoveFromCurrentToOld);

b. Set the secondary partition scheme NEXT USED to slower storage filegroup:

ALTER PARTITION SCHEME PS_12MonthSlidingWindow_Secondary
	NEXT USED FG_OlderData;

c. SPLIT the secondary partition function for the moved month to recreate the partition on the slower storage filegroup:

ALTER PARTITION FUNCTION PF_12MonthSlidingWindow_Secondary()
	SPLIT RANGE (@MonthStartDateToMoveFromCurrentToOld);

3) Rebuild the staging table on the secondary partition scheme. This will physically move the partition and data of the month to move (the only non-empty partition in the staging table) to the slower storage filegroup:

CREATE UNIQUE CLUSTERED INDEX PK_SlidingWindowTable_Staging
	ON dbo.SlidingWindowTable_Staging(PartitioningColumn, OtherKeyColumn)
	WITH(DROP_EXISTING=ON)
	ON PS_12MonthSlidingWindow_Secondary(PartitioningColumn);

4) SWITCH the partition prior to the moved month into the staging table so that the adjacent partitions of the main table are empty:

ALTER TABLE dbo.SlidingWindowTable
	SWITCH PARTITION $PARTITION.PF_12MonthSlidingWindow(DATEADD(month, -1, @MonthStartDateToMoveFromCurrentToOld))
	TO dbo.SlidingWindowTable_Staging PARTITION $PARTITION.PF_12MonthSlidingWindow_Secondary(DATEADD(month, -1, @MonthStartDateToMoveFromCurrentToOld));

5) MERGE the main table partition function to remove the moved month boundary:

ALTER PARTITION FUNCTION PF_12MonthSlidingWindow()
	MERGE RANGE (@MonthStartDateToMoveFromCurrentToOld);

6) Recreate the main table moved month partition on slower storage:

a. Set the main table partition scheme NEXT USED to the slower storage filegroup:

ALTER PARTITION SCHEME PS_12MonthSlidingWindow
	NEXT USED FG_OlderData;

b. SPLIT the main table partition scheme to recreate the moved month partition on the slower storage filegroup:

ALTER PARTITION FUNCTION PF_12MonthSlidingWindow()
	SPLIT RANGE (@MonthStartDateToMoveFromCurrentToOld);

7) SWITCH the moved and prior month partitions back into the main table:

ALTER TABLE dbo.SlidingWindowTable_Staging
	SWITCH PARTITION $PARTITION.PF_12MonthSlidingWindow_Secondary(DATEADD(month, -1, @MonthStartDateToMoveFromCurrentToOld))
	TO dbo.SlidingWindowTable PARTITION $PARTITION.PF_12MonthSlidingWindow(DATEADD(month, -1, @MonthStartDateToMoveFromCurrentToOld));
ALTER TABLE dbo.SlidingWindowTable_Staging
	SWITCH PARTITION $PARTITION.PF_12MonthSlidingWindow_Secondary(@MonthStartDateToMoveFromCurrentToOld)
	TO dbo.SlidingWindowTable PARTITION $PARTITION.PF_12MonthSlidingWindow(@MonthStartDateToMoveFromCurrentToOld);

Cleanup
1) Rebuild the staging table specifying the main table partition scheme to prepare for next month's maintenance. The secondary partition function and scheme are again unused and identical to the main table partition function/scheme:

CREATE UNIQUE CLUSTERED INDEX PK_SlidingWindowTable_Staging
	ON dbo.SlidingWindowTable_Staging(PartitioningColumn, OtherKeyColumn)
	WITH(DROP_EXISTING=ON)
	ON PS_12MonthSlidingWindow(PartitioningColumn);

Complete Script
Below is a complete script to create the initial database, partition function, partition scheme, table DDL, sample data, partition maintenance stored procedure, and example usage. The D drive is on solid state storage; the E and L drives are on a spinning disk SAN, with the L drive dedicated to the log.

Note that the stored procedure acquires an exclusive table lock on the partitioned table to prevent deadlocking during partition maintenance. However, the duration of the transaction should be very short since only metadata operations are performed.

If you have multiple partition schemes (e.g. indexes on separate filegroups) using the same partition functions, you’ll need to create secondary schemes for those too and set the NEXT USED accordingly prior to SPLIT. I can’t think of a reason to also have different partition functions in this aligned table scenario since the functions must be identical anyway to provide alignment. If you also have different functions for the index partition schemes, those will need to be split and merged at the same time as the data partition scheme.
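For example, if nonclustered indexes were partitioned with their own scheme on separate index filegroups, a secondary scheme over the same secondary function would need to be kept in lockstep with it. The sketch below is hypothetical; FG_IndexNewer and FG_IndexOlder are illustrative filegroup names not used elsewhere in this article:

--hypothetical secondary scheme for indexes on separate filegroups
CREATE PARTITION SCHEME PS_12MonthSlidingWindow_Index_Secondary
AS PARTITION PF_12MonthSlidingWindow_Secondary
ALL TO (FG_IndexOlder); --simplified; a real scheme lists one filegroup per partition

--set NEXT USED before each SPLIT, mirroring the data partition scheme
ALTER PARTITION SCHEME PS_12MonthSlidingWindow_Index_Secondary
	NEXT USED FG_IndexNewer;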

CREATE DATABASE SlidingWindowWithTieredStorage ON
	PRIMARY 
		( NAME = N'SlidingWindowWithTieredStorage'
		, FILENAME = N'D:\SqlDataFiles\MSSQLSERVER\SlidingWindowWithTieredStorage.mdf'
		, SIZE = 10MB
		, MAXSIZE = UNLIMITED
		, FILEGROWTH = 10MB ),
    FILEGROUP FG_NewerData 
		( NAME = N'NewerData1'
		, FILENAME = N'D:\SqlDataFiles\MSSQLSERVER\SlidingWindowWithTieredStorage_NewerData1.ndf'
		, SIZE = 1TB
		, MAXSIZE = UNLIMITED
		, FILEGROWTH = 10GB ),
    FILEGROUP FG_OlderData --multiple files; a filegroup name may appear only once
		( NAME = N'OlderData2'
		, FILENAME = N'E:\SqlDataFiles\MSSQLSERVER\SlidingWindowWithTieredStorage_OlderData2.ndf' 
		, SIZE = 1TB
		, MAXSIZE = UNLIMITED
		, FILEGROWTH = 10GB ),
		( NAME = N'OlderData3'
		, FILENAME = N'E:\SqlDataFiles\MSSQLSERVER\SlidingWindowWithTieredStorage_OlderData3.ndf' 
		, SIZE = 1TB
		, MAXSIZE = UNLIMITED
		, FILEGROWTH = 10GB ),
		( NAME = N'OlderData4'
		, FILENAME = N'E:\SqlDataFiles\MSSQLSERVER\SlidingWindowWithTieredStorage_OlderData4.ndf' 
		, SIZE = 1TB
		, MAXSIZE = UNLIMITED
		, FILEGROWTH = 10GB ),
		( NAME = N'OlderData5'
		, FILENAME = N'E:\SqlDataFiles\MSSQLSERVER\SlidingWindowWithTieredStorage_OlderData5.ndf' 
		, SIZE = 1TB
		, MAXSIZE = UNLIMITED
		, FILEGROWTH = 10GB )
    LOG ON 
	( NAME = N'SlidingWindowWithTieredStorage_Log'
	, FILENAME = N'L:\SqlLogFiles\MSSQLSERVER\SlidingWindowWithTieredStorage_Log.ldf' 
	, SIZE = 1024KB 
	, MAXSIZE = 2048GB 
	, FILEGROWTH = 1GB);
GO
GO

USE SlidingWindowWithTieredStorage;
GO

--main table partition function (before start of next month)
CREATE PARTITION FUNCTION PF_12MonthSlidingWindow (datetime2(0))
AS RANGE RIGHT FOR VALUES (
	  '2014-11-01T00:00:00' --current month -12 on OlderData filegroup
	, '2014-12-01T00:00:00' --current month -11 on OlderData filegroup
	, '2015-01-01T00:00:00' --current month -10 on OlderData filegroup
	, '2015-02-01T00:00:00' --current month -9 on OlderData filegroup
	, '2015-03-01T00:00:00' --current month -8 on OlderData filegroup
	, '2015-04-01T00:00:00' --current month -7 on OlderData filegroup
	, '2015-05-01T00:00:00' --current month -6 on OlderData filegroup
	, '2015-06-01T00:00:00' --current month -5 on OlderData filegroup
	, '2015-07-01T00:00:00' --current month -4 on OlderData filegroup
	, '2015-08-01T00:00:00' --current month -3 on NewerData filegroup
	, '2015-09-01T00:00:00' --current month -2 on NewerData filegroup
	, '2015-10-01T00:00:00' --current month -1 on NewerData filegroup
	, '2015-11-01T00:00:00' --current month on NewerData filegroup
	, '2015-12-01T00:00:00' --next month on NewerData filegroup
	, '2016-01-01T00:00:00' --future month
)
GO

--main table partition scheme
CREATE PARTITION SCHEME PS_12MonthSlidingWindow
AS PARTITION PF_12MonthSlidingWindow
TO (
	  FG_OlderData --older than 12 months
	, FG_OlderData --current month -12 on OlderData
	, FG_OlderData --current month -11 on OlderData
	, FG_OlderData --current month -10 on OlderData
	, FG_OlderData --current month -9 on OlderData
	, FG_OlderData --current month -8 on OlderData
	, FG_OlderData --current month -7 on OlderData
	, FG_OlderData --current month -6 on OlderData
	, FG_OlderData --current month -5 on OlderData
	, FG_OlderData --current month -4 on OlderData
	, FG_NewerData --current month -3 on NewerData
	, FG_NewerData --current month -2 on NewerData
	, FG_NewerData --current month -1 on NewerData
	, FG_NewerData --current month on NewerData
	, FG_NewerData --next month on NewerData
	, FG_NewerData --future month on NewerData
      );
GO
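As an aside, the RANGE RIGHT semantics above determine which partition a given date lands in: each boundary value belongs to the partition on its right. The following Python sketch of this mapping is my own illustration (the `partition_number` helper is hypothetical, not a SQL Server API) and mirrors what the $PARTITION function returns:

```python
from bisect import bisect_right
from datetime import date

def partition_number(boundaries, value):
    # RANGE RIGHT: a boundary belongs to the partition on its right, so the
    # 1-based partition number is 1 + the count of boundaries <= value.
    return 1 + bisect_right(sorted(boundaries), value)

# Abbreviated boundary list from PF_12MonthSlidingWindow
boundaries = [date(2014, 11, 1), date(2014, 12, 1), date(2015, 1, 1)]
```

With the full 15-boundary list, any value earlier than '2014-11-01' maps to partition 1 (the "older than 12 months" slot) and '2016-01-01' or later maps to partition 16.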

--secondary partition function identical to main function
CREATE PARTITION FUNCTION PF_12MonthSlidingWindow_Secondary (datetime2(0))
AS RANGE RIGHT FOR VALUES (
	  '2014-11-01T00:00:00' --current month -12 on OlderData filegroup
	, '2014-12-01T00:00:00' --current month -11 on OlderData filegroup
	, '2015-01-01T00:00:00' --current month -10 on OlderData filegroup
	, '2015-02-01T00:00:00' --current month -9 on OlderData filegroup
	, '2015-03-01T00:00:00' --current month -8 on OlderData filegroup
	, '2015-04-01T00:00:00' --current month -7 on OlderData filegroup
	, '2015-05-01T00:00:00' --current month -6 on OlderData filegroup
	, '2015-06-01T00:00:00' --current month -5 on OlderData filegroup
	, '2015-07-01T00:00:00' --current month -4 on OlderData filegroup
	, '2015-08-01T00:00:00' --current month -3 on NewerData filegroup
	, '2015-09-01T00:00:00' --current month -2 on NewerData filegroup
	, '2015-10-01T00:00:00' --current month -1 on NewerData filegroup
	, '2015-11-01T00:00:00' --current month on NewerData filegroup
	, '2015-12-01T00:00:00' --next month on NewerData filegroup
	, '2016-01-01T00:00:00' --future month
)
GO

--secondary partition scheme identical to main scheme
CREATE PARTITION SCHEME PS_12MonthSlidingWindow_Secondary
AS PARTITION PF_12MonthSlidingWindow_Secondary
TO (
	  FG_OlderData --older than 12 months
	, FG_OlderData --current month -12 on OlderData
	, FG_OlderData --current month -11 on OlderData
	, FG_OlderData --current month -10 on OlderData
	, FG_OlderData --current month -9 on OlderData
	, FG_OlderData --current month -8 on OlderData
	, FG_OlderData --current month -7 on OlderData
	, FG_OlderData --current month -6 on OlderData
	, FG_OlderData --current month -5 on OlderData
	, FG_OlderData --current month -4 on OlderData
	, FG_NewerData --current month -3 on NewerData
	, FG_NewerData --current month -2 on NewerData
	, FG_NewerData --current month -1 on NewerData
	, FG_NewerData --current month on NewerData
	, FG_NewerData --next month on NewerData
	, FG_NewerData --future month on NewerData
      );
GO

--this is the main partitioned table
CREATE TABLE dbo.SlidingWindowTable(
	  PartitioningColumn datetime2(0)
	, OtherKeyColumn int NOT NULL
	, OtherData int NULL
	, CONSTRAINT PK_SlidingWindowTable PRIMARY KEY
		CLUSTERED (PartitioningColumn, OtherKeyColumn)
		ON PS_12MonthSlidingWindow(PartitioningColumn)
	) ON PS_12MonthSlidingWindow(PartitioningColumn);
GO

--The staging table initially uses the same partition scheme as the main table,
--except while maintenance is in progress
CREATE TABLE dbo.SlidingWindowTable_Staging(
	  PartitioningColumn datetime2(0)
	, OtherKeyColumn int NOT NULL
	, OtherData int NULL
	, CONSTRAINT PK_SlidingWindowTable_Staging PRIMARY KEY
		CLUSTERED (PartitioningColumn, OtherKeyColumn)
		ON PS_12MonthSlidingWindow(PartitioningColumn)
	) ON PS_12MonthSlidingWindow(PartitioningColumn);
GO

--load main table with 14M rows of sample data
WITH
	t4 AS (SELECT n FROM (VALUES(0),(0),(0),(0)) t(n))
	,t256 AS (SELECT 0 AS n FROM t4 AS a CROSS JOIN t4 AS b CROSS JOIN t4 AS c CROSS JOIN t4 AS d)
	,t16M AS (SELECT ROW_NUMBER() OVER (ORDER BY (a.n)) AS num FROM t256 AS a CROSS JOIN t256 AS b CROSS JOIN t256 AS c)
INSERT INTO dbo.SlidingWindowTable WITH (TABLOCKX) (PartitioningColumn, OtherKeyColumn, OtherData) 
SELECT DATEADD(month, num/1000000, '20141101'), num, num
FROM t16M
WHERE num <= 14000000;
GO
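Note how the sample data expression DATEADD(month, num/1000000, '20141101') distributes rows: T-SQL integer division yields month offsets 0 through 14, so roughly one million rows land in each month from November 2014 through January 2016. A quick Python sketch of the offset calculation (the helper name is mine):

```python
def sample_month_offset(num):
    # Mirrors the T-SQL expression num/1000000, which is integer
    # division because both operands are integers.
    return num // 1_000_000
```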

CREATE PROCEDURE dbo.PerformMonthlySlidingWindowMaintenace
	  @ExpiredMonthStartDate datetime2(0) --purge data for this month
	, @MonthStartDateToMoveFromCurrentToOld datetime2(0) --move data from this month from current data filegroup to old data filegroup
	, @FutureMonthStartDate datetime2(0) --create new partition for this date on current data filegroup
/*
	This proc performs the partition maintenance needed to maintain a monthly sliding
	window with filegroups on tiered storage. The objective is to keep recent, frequently
	used data on filegroup FG_NewerData and less often accessed data on FG_OlderData.
*/
AS

SET XACT_ABORT ON;

BEGIN TRY

	BEGIN TRAN;

	--acquire exclusive lock on main table to prevent deadlocking during partition maintenance
	DECLARE @result int = (SELECT TOP (0) 1 FROM dbo.SlidingWindowTable WITH (TABLOCKX));

	--purge the oldest partition from the main table (including the partition for data older than the first boundary)
	TRUNCATE TABLE dbo.SlidingWindowTable_Staging;
	ALTER TABLE dbo.SlidingWindowTable
		SWITCH PARTITION $PARTITION.PF_12MonthSlidingWindow(DATEADD(month, -1, @ExpiredMonthStartDate))
		TO dbo.SlidingWindowTable_Staging PARTITION $PARTITION.PF_12MonthSlidingWindow_Secondary(DATEADD(month, -1, @ExpiredMonthStartDate));
	ALTER TABLE dbo.SlidingWindowTable
		SWITCH PARTITION $PARTITION.PF_12MonthSlidingWindow(@ExpiredMonthStartDate)
		TO dbo.SlidingWindowTable_Staging PARTITION $PARTITION.PF_12MonthSlidingWindow_Secondary(@ExpiredMonthStartDate);
	TRUNCATE TABLE dbo.SlidingWindowTable_Staging;

	--remove expired partition boundary from both functions after purge
	ALTER PARTITION FUNCTION PF_12MonthSlidingWindow()
		MERGE RANGE (@ExpiredMonthStartDate);
	ALTER PARTITION FUNCTION PF_12MonthSlidingWindow_Secondary()
		MERGE RANGE (@ExpiredMonthStartDate);

	--add new partition on new data filegroup for future data to both functions/schemes
	ALTER PARTITION SCHEME PS_12MonthSlidingWindow
			  NEXT USED FG_NewerData;
	ALTER PARTITION FUNCTION PF_12MonthSlidingWindow()
		SPLIT RANGE (@FutureMonthStartDate);
	ALTER PARTITION SCHEME PS_12MonthSlidingWindow_Secondary
			  NEXT USED FG_NewerData;
	ALTER PARTITION FUNCTION PF_12MonthSlidingWindow_Secondary()
		SPLIT RANGE (@FutureMonthStartDate);
	
	--switch the partition to be moved from FG_NewerData to FG_OlderData into the staging table
	ALTER TABLE dbo.SlidingWindowTable
		SWITCH PARTITION $PARTITION.PF_12MonthSlidingWindow(@MonthStartDateToMoveFromCurrentToOld)
		TO dbo.SlidingWindowTable_Staging PARTITION $PARTITION.PF_12MonthSlidingWindow(@MonthStartDateToMoveFromCurrentToOld);

	--alter secondary partition function and scheme to move partition from FG_NewerData to FG_OlderData
	ALTER PARTITION FUNCTION PF_12MonthSlidingWindow_Secondary()
		MERGE RANGE (@MonthStartDateToMoveFromCurrentToOld);
	ALTER PARTITION SCHEME PS_12MonthSlidingWindow_Secondary
		NEXT USED FG_OlderData;
	ALTER PARTITION FUNCTION PF_12MonthSlidingWindow_Secondary()
		SPLIT RANGE (@MonthStartDateToMoveFromCurrentToOld);

	--this will release the exclusive table lock, but the data in the staging table is temporarily unavailable
	COMMIT;

	--rebuild staging table using new partition scheme to move switched out data to old filegroup
	CREATE UNIQUE CLUSTERED INDEX PK_SlidingWindowTable_Staging
		ON dbo.SlidingWindowTable_Staging(PartitioningColumn, OtherKeyColumn)
		WITH(DROP_EXISTING=ON)
		ON PS_12MonthSlidingWindow_Secondary(PartitioningColumn);

	BEGIN TRAN;

	--acquire exclusive lock on main table to prevent deadlocking during partition maintenance
	SET @result = (SELECT TOP (0) 1 FROM dbo.SlidingWindowTable WITH (TABLOCKX));

	--switch the latest partition of main table on the FG_OlderData filegroup to the staging table
	--so that adjacent partitions of the main table are empty for the MERGE and SPLIT needed to 
	--physically move the main table partition
	ALTER TABLE dbo.SlidingWindowTable
		SWITCH PARTITION $PARTITION.PF_12MonthSlidingWindow(DATEADD(month, -1, @MonthStartDateToMoveFromCurrentToOld))
		TO dbo.SlidingWindowTable_Staging PARTITION $PARTITION.PF_12MonthSlidingWindow_Secondary(DATEADD(month, -1, @MonthStartDateToMoveFromCurrentToOld));

	--remove empty partition
	ALTER PARTITION FUNCTION PF_12MonthSlidingWindow()
		MERGE RANGE (@MonthStartDateToMoveFromCurrentToOld);

	--create new empty partition on old data filegroup for moved data
	ALTER PARTITION SCHEME PS_12MonthSlidingWindow
			  NEXT USED FG_OlderData;
	ALTER PARTITION FUNCTION PF_12MonthSlidingWindow()
		SPLIT RANGE (@MonthStartDateToMoveFromCurrentToOld);

	--now that both partition functions/schemes are identical, switch partitions from staging table back into main table
	ALTER TABLE dbo.SlidingWindowTable_Staging
		SWITCH PARTITION $PARTITION.PF_12MonthSlidingWindow_Secondary(DATEADD(month, -1, @MonthStartDateToMoveFromCurrentToOld))
		TO dbo.SlidingWindowTable PARTITION $PARTITION.PF_12MonthSlidingWindow(DATEADD(month, -1, @MonthStartDateToMoveFromCurrentToOld));
	ALTER TABLE dbo.SlidingWindowTable_Staging
		SWITCH PARTITION $PARTITION.PF_12MonthSlidingWindow_Secondary(@MonthStartDateToMoveFromCurrentToOld)
		TO dbo.SlidingWindowTable PARTITION $PARTITION.PF_12MonthSlidingWindow(@MonthStartDateToMoveFromCurrentToOld);

	--rebuild staging table using same partition scheme as main table to prepare for next month
	CREATE UNIQUE CLUSTERED INDEX PK_SlidingWindowTable_Staging
		ON dbo.SlidingWindowTable_Staging(PartitioningColumn, OtherKeyColumn)
		WITH(DROP_EXISTING=ON)
		ON PS_12MonthSlidingWindow(PartitioningColumn);

	COMMIT;

END TRY
BEGIN CATCH

	IF @@TRANCOUNT > 0 ROLLBACK;

	THROW;

END CATCH;
GO
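The MERGE and SPLIT sequence in the procedure can be pictured as simple list operations on the boundary values. This Python sketch is my own simplified model (it ignores filegroup placement and data movement) showing how one monthly run slides the boundary window forward:

```python
from datetime import date

def merge_range(boundaries, boundary):
    # ALTER PARTITION FUNCTION ... MERGE RANGE: removes an existing boundary,
    # combining the two adjacent partitions into one.
    boundaries.remove(boundary)

def split_range(boundaries, boundary):
    # ALTER PARTITION FUNCTION ... SPLIT RANGE: adds a new boundary, splitting
    # an existing partition in two (the new partition lands on the NEXT USED filegroup).
    boundaries.append(boundary)
    boundaries.sort()

# The 15 boundaries as created above: 2014-11-01 through 2016-01-01
boundaries = [date(2014, 11, 1), date(2014, 12, 1)] + \
             [date(2015, m, 1) for m in range(1, 13)] + [date(2016, 1, 1)]

merge_range(boundaries, date(2014, 11, 1))  # remove the expired month's boundary
split_range(boundaries, date(2016, 2, 1))   # add a boundary for the future month
```

The boundary count stays constant at 15, which is what keeps the window a fixed 12 months (plus the current, next, and future partitions) from run to run.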

--when scheduling after the start of a new month, set @RunMonth to the first day of the current month
DECLARE
	--@RunMonth datetime2(0) = DATEADD(day, 1, DATEADD(month, -1, EOMONTH(GETDATE())));
	  @RunMonth datetime2(0) = '20151201';

--calculate boundary dates based on @RunMonth
DECLARE
	  @ExpiredMonthStartDate datetime2(0) = DATEADD(month, -13, @RunMonth)
	, @MonthStartDateToMoveFromCurrentToOld datetime2(0) = DATEADD(month, -4, @RunMonth)
	, @FutureMonthStartDate datetime2(0) = DATEADD(month, 2, @RunMonth);

SELECT
	  @ExpiredMonthStartDate
	, @MonthStartDateToMoveFromCurrentToOld
	, @FutureMonthStartDate;

EXEC dbo.PerformMonthlySlidingWindowMaintenace
	  @ExpiredMonthStartDate = @ExpiredMonthStartDate
	, @MonthStartDateToMoveFromCurrentToOld = @MonthStartDateToMoveFromCurrentToOld
	, @FutureMonthStartDate = @FutureMonthStartDate;
GO
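The parameter dates computed above follow directly from @RunMonth. A small Python equivalent of the DATEADD(month, ...) arithmetic makes the offsets easy to check; the add_months helper is my own, since Python's datetime module has no built-in month arithmetic:

```python
from datetime import date

def add_months(d, months):
    # Equivalent of T-SQL DATEADD(month, months, d) for first-of-month dates
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, 1)

run_month = date(2015, 12, 1)
expired_month_start = add_months(run_month, -13)    # month to purge
month_to_move_to_old = add_months(run_month, -4)    # month to move to FG_OlderData
future_month_start = add_months(run_month, 2)       # new future partition boundary
```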

SQL Server and Hyper-threading

I had a client ask whether SQL Server could take advantage of all 32 logical processors if they enabled hyper-threading on their server. They were running SQL Server 2012 Standard Edition on a server with 4 sockets, each with 4 cores. As with nearly all things related to SQL Server, the answer is “it depends”.

The compute capacity of SQL Server 2012 Standard Edition (and later versions as of this writing) is the lesser of 4 sockets or 16 cores regardless of the number of logical processors. With hyper-threading enabled and running SQL Server on bare metal, the number of logical processors doubles from 16 to 32. SQL Server 2012 SE can use all 32 of these logical processors because the compute limit is based on physical sockets/cores, not logical processors.

However, SQL Server 2012 SE can use only 16 of the 32 logical processors when running virtualized on the same physical server. This is because the physical processor architecture isn’t exposed to the guest OS. Logical processors available to the virtualization host are mapped to guest VMs and these appear as virtual sockets and cores to both the OS and SQL Server with no notion of hyper-threading. The end result is that SQL Server 2012 SE will use no more than 16 logical processors when virtualized with or without hyper-threading enabled.
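The licensing arithmetic described above can be sketched as follows. This is an illustrative Python model of the rules as I understand them (the function and parameter names are my own, not any Microsoft API):

```python
def se_usable_logical_processors(physical_sockets, cores_per_socket,
                                 ht_enabled, virtualized):
    # Total logical processors exposed by the hardware
    logical = physical_sockets * cores_per_socket * (2 if ht_enabled else 1)
    if virtualized:
        # The guest sees only virtual sockets/cores with no notion of
        # hyper-threading, so the 16-core cap applies to logical processors directly.
        return min(logical, 16)
    # Bare metal: the Standard Edition cap is the lesser of 4 sockets or 16
    # physical cores, and all logical processors on those cores are usable.
    usable_cores = min(min(physical_sockets, 4) * cores_per_socket, 16)
    return usable_cores * (2 if ht_enabled else 1)
```

For the client's 4-socket, 4-core-per-socket server, the model gives 32 usable logical processors on bare metal with HT enabled, but only 16 when virtualized.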

Other Hyper-threading Considerations
Regardless of the SQL Server edition, one should enable hyper-threading only if the OS and SQL Server can utilize the additional logical processors; otherwise, leaving it disabled guarantees that logical processors and physical cores are one and the same, providing the best performance possible. When the additional logical processors provided by hyper-threading can be used, the performance benefit, or even detriment, is very workload-dependent. The only way to accurately ascertain HT performance is with your actual production workload.

My past experience is that OLTP workloads benefit the most from hyper-threading. In the early days of HT technology (before Windows and SQL Server were HT-aware), I observed a 15-20% performance improvement for an OLTP workload. Unfortunately, reporting workload performance could worsen with HT enabled. Slava Oks’s blog post may explain why.

The world has changed a lot since then. SQL Server is now both HT and NUMA aware, with improved locking primitives and memory-optimized structures available that avoid locking entirely. With current SQL Server versions running bare metal on modern hardware, I suggest you enable HT unless experience with your actual workload dictates otherwise. Premature optimization is evil; don't disable HT preemptively without justification from your own workload, or you may be wasting free CPU cycles. Importantly, limit MAXDOP per this support article.

See Linchi Shea's post and also this one for reporting workload tests with and without HT enabled. Also see Joe Chang's post. The first graph is especially interesting in that it shows an actual production OLTP workload running with and without HT. The remainder of the post shows results of TPC-H benchmark tests, a synthetic decision support (reporting) workload.

Again, your workload is unique so only you can ascertain if HT is actually beneficial.

Not Before Service Pack 1 Redux

Microsoft announced a new Modern Servicing Model for SQL Server a few months ago. That reminded me of a blog post of mine from several years ago, Not Before Service Pack 1, when I explained why the “not before the first service pack” policy is flawed and unnecessarily delays many organizations from taking advantage of powerful new SQL Server features. That post is even more relevant today considering there will be no more SQL Server service packs going forward, only Cumulative Updates (CUs) and General Distribution Releases (GDRs).

The rationale for postponing adoption until the first service pack stems from the days when testing was mostly manual, often incomplete, and at a time when speed to market sometimes trumped release quality. Unless you've been asleep for the last dozen years or so, you've probably noticed the technology world has changed, and is changing faster every day. A great deal of investment in automated unit testing has facilitated more rapid delivery of production-quality software, not just by vendors like Microsoft, but by many other organizations as well. We live in a different world today, with much more thorough automated regression testing, and it makes little sense to abide by rules based on rationale that is no longer relevant.

The truth is that only you can ascertain the production readiness of your application or vendor software. There is risk with any change, but the decision to adopt and deploy a new version should be based on priorities specific to your application and organization, considering factors like cost, the value of new features, vendor lifecycle alignment (including support), application mission criticality, the risk of falling behind in security patches, and the level of testing you are able to perform. How Microsoft decides to label a release (RTM, CU, or SP) shouldn't be a consideration. If you waited to install SQL Server 2014 until SP1 was released, I think Microsoft erred and should have just labeled CU1 as SP1 to get you on board earlier.