SQL Server 2016 and Azure SQL Database V12 Breaking Change

This post is to get the word out about a breaking change to datetime conversion and comparison behavior in SQL Server 2016 and Azure SQL Database V12. This change hasn’t been documented as of this writing in the Breaking Changes to Database Engine Features in SQL Server 2016 topic in the SQL Server Books Online.

In short, conversion from datetime to a higher precision temporal data type (datetime2, datetimeoffset, or time) may yield a different, but more accurate, time value than in prior versions. Also, predicates involving datetime consider the full precision of raw datetime internal value instead of the time value rounded to the nearest millisecond. These changes in conversion and comparison behavior may affect existing applications and are not intuitive unless one understands the underlying datetime data type implementation.

Background
You may be aware that the accuracy of datetime is limited to 1/300 of a second. This is because values are internally an 8-byte structure consisting of 2 separate 32-bit integers, one with the number of day units since 1900-01-01 and the other with the number of 1/300 second interval units since midnight. The 1/300 second unit interval limits the time accuracy to 3.33333… milliseconds and the milliseconds value will be a repeating decimal when time interval units are not evenly divisible by 3. The raw decimal value is rounded to a scale of 3 in accordance with the fixed datetime precision of 3, resulting in a millisecond value of 0, 3, or 7 for all datetime values.

Pre-SQL Server 2016 Behavior
Before SQL Server 2016, conversion from datetime to another temporal type used the source datetime value after it was rounded to the nearest millisecond, which truncated repeating decimal fractional milliseconds. The rounded value was then rounded again according to the target type precision. When the target type precision was greater than 3, the time was extended to the target type precision with insignificant trailing zeros, resulting in zero for the sub-millisecond value.

DECLARE @DateTime datetime = '2016-01-01T00:00:00.007';
SELECT CAST(@DateTime AS datetime2(0)); --2016-01-01 00:00:00
SELECT CAST(@DateTime AS datetime2(1)); --2016-01-01 00:00:00.0
SELECT CAST(@DateTime AS datetime2(2)); --2016-01-01 00:00:00.01
SELECT CAST(@DateTime AS datetime2(3)); --2016-01-01 00:00:00.007
SELECT CAST(@DateTime AS datetime2(4)); --2016-01-01 00:00:00.0070
SELECT CAST(@DateTime AS datetime2(5)); --2016-01-01 00:00:00.00700
SELECT CAST(@DateTime AS datetime2(6)); --2016-01-01 00:00:00.007000
SELECT CAST(@DateTime AS datetime2(7)); --2016-01-01 00:00:00.0070000

Also, when datetime was compared to another temporal type, the rounded value was used. This script shows the result of the equality predicate is true after the datetime value is converted to datetime2.

--This script prints EQUAL predicate is true
DECLARE @DateTime datetime = '2016-01-01T00:00:00.003';
DECLARE @DateTime2 datetime2(7) = @DateTime;
IF @DateTime = @DateTime2 PRINT 'EQUAL predicate is true' ELSE PRINT 'EQUAL predicate is not true';
IF @DateTime < @DateTime2 PRINT 'LESS THAN predicate is true' ELSE PRINT 'LESS THAN predicate is not true';
IF @DateTime > @DateTime2 PRINT 'GREATER THAN predicate is true' ELSE PRINT 'GREATER THAN predicate is not true';
GO
--This script prints EQUAL predicate is true
DECLARE @DateTime datetime = '2016-01-01T00:00:00.007';
DECLARE @DateTime2 datetime2(7) = @DateTime;
IF @DateTime = @DateTime2 PRINT 'EQUAL predicate is true' ELSE PRINT 'EQUAL predicate is not true';
IF @DateTime < @DateTime2 PRINT 'LESS THAN predicate is true' ELSE PRINT 'LESS THAN predicate is not true';
IF @DateTime > @DateTime2 PRINT 'GREATER THAN predicate is true' ELSE PRINT 'GREATER THAN predicate is not true';
GO

SQL Server 2016 Behavior Change
SQL Server 2016 and Azure SQL Database V12 use the raw datetime internal value without rounding during conversion to another temporal type. The value is rounded only once during conversion, to the target type precision. The end result will be the same as before SQL Server 2016 when the target type precision is 3 or less. However, the converted value will be different when the target type precision is greater than 3 and the internal time unit interval is not evenly divisible by 3 (i.e. rounded source datetime millisecond value is 3 or 7). Note the non-zero microseconds and nanoseconds in the script results below and that rounding is based on the target type precision rather than the source.

DECLARE @DateTime datetime = '2016-01-01T00:00:00.003';
SELECT CAST(@DateTime AS datetime2(0)); --2016-01-01 00:00:00
SELECT CAST(@DateTime AS datetime2(1)); --2016-01-01 00:00:00.0
SELECT CAST(@DateTime AS datetime2(2)); --2016-01-01 00:00:00.03
SELECT CAST(@DateTime AS datetime2(3)); --2016-01-01 00:00:00.003
SELECT CAST(@DateTime AS datetime2(4)); --2016-01-01 00:00:00.0033
SELECT CAST(@DateTime AS datetime2(5)); --2016-01-01 00:00:00.00333
SELECT CAST(@DateTime AS datetime2(6)); --2016-01-01 00:00:00.003333
SELECT CAST(@DateTime AS datetime2(7)); --2016-01-01 00:00:00.0033333
GO
DECLARE @DateTime datetime = '2016-01-01T00:00:00.007';
SELECT CAST(@DateTime AS datetime2(0)); --2016-01-01 00:00:00
SELECT CAST(@DateTime AS datetime2(1)); --2016-01-01 00:00:00.0
SELECT CAST(@DateTime AS datetime2(2)); --2016-01-01 00:00:00.01
SELECT CAST(@DateTime AS datetime2(3)); --2016-01-01 00:00:00.007
SELECT CAST(@DateTime AS datetime2(4)); --2016-01-01 00:00:00.0067
SELECT CAST(@DateTime AS datetime2(5)); --2016-01-01 00:00:00.00667
SELECT CAST(@DateTime AS datetime2(6)); --2016-01-01 00:00:00.006667
SELECT CAST(@DateTime AS datetime2(7)); --2016-01-01 00:00:00.0066667
GO

This behavior change provides a more accurate converted value but may break applications that expect the converted value to be the same as the rounded datetime value as was the case before SQL Server 2016.

Be aware than the full raw datetime precision (instead of the rounded value) is also used when evaluating predicates involving a datetime type. The full precision of both arguments are used, resulting in the equality compare predicate to evaluate to false in both scripts below. The greater than predicate is true in the first script and the less than predicate is true in the second:

--This script prints GREATER THAN predicate is true
DECLARE @DateTime datetime = '2016-01-01T00:00:00.003';
DECLARE @DateTime2 datetime2(7) = @DateTime;
IF @DateTime = @DateTime2 PRINT 'EQUAL predicate is true' ELSE PRINT 'EQUAL predicate is not true';
IF @DateTime < @DateTime2 PRINT 'LESS THAN predicate is true' ELSE PRINT 'LESS THAN predicate is not true';
IF @DateTime > @DateTime2 PRINT 'GREATER THAN predicate is true' ELSE PRINT 'GREATER THAN predicate is not true';
GO
--This script prints LESS THAN predicate is true
DECLARE @DateTime datetime = '2016-01-01T00:00:00.007';
DECLARE @DateTime2 datetime2(7) = @DateTime;
IF @DateTime = @DateTime2 PRINT 'EQUAL predicate is true' ELSE PRINT 'EQUAL predicate is not true';
IF @DateTime < @DateTime2 PRINT 'LESS THAN predicate is true' ELSE PRINT 'LESS THAN predicate is not true';
IF @DateTime > @DateTime2 PRINT 'GREATER THAN predicate is true' ELSE PRINT 'GREATER THAN predicate is not true';
GO

To provide insight into why the comparisons result in greater than and less than respectively, the script below shows the nanoseconds value of the compared data types:

DECLARE @DateTime datetime = '2016-01-01T00:00:00.003';
DECLARE @DateTime2 datetime2(7) = @DateTime;
SELECT DATEPART(nanosecond, @DateTime); --3333333
SELECT DATEPART(nanosecond, @DateTime2); --3333300
GO
DECLARE @DateTime datetime = '2016-01-01T00:00:00.007';
DECLARE @DateTime2 datetime2(7) = @DateTime;
SELECT DATEPART(nanosecond, @DateTime); --6666666
SELECT DATEPART(nanosecond, @DateTime2); --6666700
GO

The datetime2 type is accurate only to 100 nanosecond whereas datetime includes values to the nanosecond (and beyond) because the theoretical precision of repeating decimal values is unlimited. The implication is that a datetime type with a repeating decimal value will never compare equally with any temporal type except datetime.

Datetime conversion and comparison behavior is controlled by the database compatibility level. Databases in SQL Server 2016 level (130) use the new behavior and the legacy behavior is used with other levels.

Summary
These datetime behavior changes have the benefit of improved accuracy and performance of datetime conversion/comparison. Affected applications can use a pre-SQL Server 2016 database compatibility level until they can be remediated.

I recommend one avoid comparing datetime directly to other temporal types going forward. Instead convert the datetime value to the type being compared and use the converted value in the predicate. It’s generally best practice to match data types whenever possible for performance and to avoid ambiguity.

Add or Remove IDENTITY Property From an Existing Column Efficiently

Introduction
Refactoring is often needed to improve schema design or address changes in requirements. Occasionally, these schema modifications require removing the IDENTITY property from an existing non-empty table while retaining the existing values. Unfortunately, there is no DDL command to add or remove IDENTITY from an existing column. Traditionally, one must go through the pain of either recreating the table or jumping through hoops by adding a new column, updating the new column value with the old column value, and dropping the original column. This is especially problematic with large tables and short maintenance windows.

This article details how to use SWITCH as an alternative technique to add or remove IDENTITY from an existing column. I have been using SWITCH for many years but learned only recently that the identical schema restriction for SWITCH does not include IDENTITY column property; one can SWITCH data from one table to another even though the IDENTITY column property differs. This behavior allows the IDENTITY property to be added or removed.

Overview
All tables are partitioned from a database storage engine perspective since SQL Server 2005, although multiple partitions require Enterprise Edition. The implication of this storage architecture is that one can use ALTER TABLE…SWITCH regardless of SQL Server edition to move the entire contents of one table to another without physically moving rows as long as the source and target tables have an identical schema and are physically aligned (indexes and tables on same filegroup). SWITCH performs fast storage meta-data changes so the operation typically takes less than a second regardless of table size. The target table must be empty before SWITCH. After SWITCH, the target table will contain the data and the source table will be empty (similar to TRUNCATE).

Since the identical schema restriction does not include the IDENTITY column property, SWITCH is a handy technique to add or remove IDENTITY from an existing column of a non-empty table. For an empty table, a simple drop and create is easier and more efficient.

Remove IDENTITY from an Existing Column
The script below uses a staging table without the IDENTITY column property along with SWITCH to remove the IDENTITY property from the column and retain previously assigned identity values. I drop the empty original table and rename the staging table to the original name along with constraints and indexes after the operation.

--create sample table and data
CREATE TABLE dbo.RemoveIdentityExample(
	  IncrementalColumn int IDENTITY NOT NULL
		CONSTRAINT PK_RemoveIdentityExample PRIMARY KEY
	, OtherData varchar(20) NULL
	);
CREATE INDEX idx_RemoveIdentityExample_OtherData
	ON dbo.RemoveIdentityExample(OtherData);
INSERT INTO dbo.RemoveIdentityExample VALUES
	  ('Sample data one')
	, ('Sample data two')
	, ('Sample data three');
GO

SET XACT_ABORT ON;
BEGIN TRY
	BEGIN TRAN;
	--create staging table with same schema, indexes, and constraints
	CREATE TABLE dbo.RemoveIdentityExampleStaging(
		  IncrementalColumn int NOT NULL --IDENTITY column property removed
			CONSTRAINT PK_RemoveIdentityExampleStaging PRIMARY KEY
		, OtherData varchar(20) NULL
		);
	CREATE INDEX idx_RemoveIdentityExampleStaging_OtherData
		ON dbo.RemoveIdentityExampleStaging(OtherData);

	ALTER TABLE dbo.RemoveIdentityExample
		SWITCH TO dbo.RemoveIdentityExampleStaging;
	DROP TABLE dbo.RemoveIdentityExample;
	--rename table
	EXEC sp_rename
		  @objname = N'dbo.RemoveIdentityExampleStaging'
		, @newname = N'RemoveIdentityExample'
		, @objtype = 'OBJECT';
	--rename constraints
	EXEC sp_rename
		  @objname = N'dbo.PK_RemoveIdentityExampleStaging'
		, @newname = N'PK_RemoveIdentityExample'
		, @objtype = 'OBJECT';
	--rename indexes
	EXEC sp_rename
		  @objname = N'dbo.RemoveIdentityExample.idx_RemoveIdentityExampleStaging_OtherData'
		, @newname = N'idx_RemoveIdentityExampleStaging_OtherData'
		, @objtype = N'INDEX';
	COMMIT;
END TRY
BEGIN CATCH
	IF @@TRANCOUNT > 0 ROLLBACK;
	THROW;
END CATCH;
GO

Add IDENTITY to an Existing Column
This script uses a staging table with the IDENTITY column property along with SWITCH to add the IDENTITY property while retaining the existing column values. After the SWITCH operation, DBCC CHECKIDENT seeds the next IDENTITY value greater than the current value in the column.

Note that an IDENTITY column must now allow NULL values so, in the case of a nullable column, the column must first be altered to NOT NULL before using this method.

--create sample table and data
CREATE TABLE dbo.AddIdentityExample(
	  IncrementalColumn int NOT NULL
		CONSTRAINT PK_AddIdentityExample PRIMARY KEY
	, OtherData varchar(20) NULL
	);
CREATE INDEX idx_AddIdentityExample_OtherData
	ON dbo.AddIdentityExample(OtherData);
INSERT INTO dbo.AddIdentityExample VALUES
	  (1, 'Sample data one')
	, (2, 'Sample data two')
	, (3, 'Sample data three');
GO

SET XACT_ABORT ON;
BEGIN TRY
	BEGIN TRAN;
	--create staging table with same schema, indexes, and constraints
	CREATE TABLE dbo.AddIdentityExampleStaging(
		  IncrementalColumn int IDENTITY NOT NULL --IDENTITY column property added
			CONSTRAINT PK_AddIdentityExampleStaging PRIMARY KEY
		, OtherData varchar(20) NULL
		);
	CREATE INDEX idx_AddIdentityExampleStaging_OtherData
		ON dbo.AddIdentityExampleStaging(OtherData);

	ALTER TABLE dbo.AddIdentityExample
		SWITCH TO dbo.AddIdentityExampleStaging;
	DROP TABLE dbo.AddIdentityExample;
	--rename table
	EXEC sp_rename
		  @objname = N'dbo.AddIdentityExampleStaging'
		, @newname = N'AddIdentityExample'
		, @objtype = 'OBJECT';
	--rename constraints
	EXEC sp_rename
		  @objname = N'dbo.PK_AddIdentityExampleStaging'
		, @newname = N'PK_AddIdentityExample'
		, @objtype = 'OBJECT';
	--rename indexes
	EXEC sp_rename
		  @objname = N'dbo.AddIdentityExample.idx_AddIdentityExampleStaging_OtherData'
		, @newname = N'idx_AddIdentityExampleStaging_OtherData'
		, @objtype = N'INDEX';
	--seed IDENTITY with current max column value
	DBCC CHECKIDENT(N'dbo.AddIdentityExample');
	COMMIT;
END TRY
BEGIN CATCH
	IF @@TRANCOUNT > 0 ROLLBACK;
	THROW;
END CATCH;
GO

Always Use Semicolon Statement Terminators

ANSI-standard semicolon statement terminators are often omitted in T-SQL queries and many developers are unaware that this is syntax is deprecated.  Omitting statement terminators is a dangerous practice because, even if the batch compiles, you may get unexpected results.  Consider the insidious examples below pointed out by SQL Server MVP Erland Sommarskog:

BEGIN TRY
	BEGIN TRAN
	SELECT 1/0 AS CauseAnException
	COMMIT
END TRY
BEGIN CATCH
	SELECT ERROR_MESSAGE()
	THROW
END CATCH

Few of us will catch (no pun intended) the bug in the above script.  What results do you expect after running the above script under SQL Server 2012 or later versions?  Rather than leave this as an exercise for the reader, I’ll spoil the fun and mention that no run-time error is raised at all.  Instead, the THROW statement is interpreted as a column alias for the ERROR_MESSAGE() column.  This sort of coding error is especially nasty because catch blocks are rarely unit tested and this catch block coding mistake hides the run-time error entirely without raising an exception.

Similarly, the absence of statement terminators in the script below causes another problem.  Can you spot it?

BEGIN TRY
	BEGIN TRAN
	SELECT 1/0 AS CauseAnException
	COMMIT
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK TRAN
    THROW
END CATCH

At least an error is raised in this case, albeit not the one you might expect.  The resultant error is “Cannot roll back THROW. No transaction or savepoint of that name was found”.  This coding bug obfuscates the preceding divide by zero error and prevents the THROW statement from being executed.

Below is another example where the absence of the semi-colon terminator obfuscates the root cause of the error. As you may know, GO is not a T-SQL statement but a batch terminator command recognized by SSMS and other SQL Server tools and utilities. This script executes as expected from an SSMS query window because SSMS parses the script and executes each batch individually when GO commands are encountered:

SELECT 'Creating view'
GO
CREATE VIEW dbo.foo AS SELECT 1 AS bar

However, running the same script with PowerShell (or any other client application) fails with the error “CREATE VIEW must be the first statement in a query batch”:

try
{
    $connectionString = "Data Source=.;Initial Catalog=tempdb;Integrated Security=SSPI"
    $connection = New-Object System.Data.SqlClient.SqlConnection($ConnectionString)
    $Script = 
@"
SELECT 'Creating view'
GO
CREATE VIEW dbo.foo AS SELECT 1 AS bar
"@

    $command = New-Object System.Data.SqlClient.SqlCommand($Script, $connection)
    $connection.Open()
    $result = $command.ExecuteNonQuery()
    $connection.Close()
}
catch [Exception]
{
    throw;
}

In this case, SQL Server interprets the GO as a column alias in the first SELECT query and the batch errs on the CREATE VIEW statement during compilation. If you a semi-colon is added to the end of the first SELECT statement, the correct error message results: “Incorrect syntax near ‘GO'”.

As a side note, one can execute scripts containing GO terminators programmatically using the SMO API, which is also used by some SQL Server tools. See this Stackoverflow answer. Another approach I’ve used is to parse scripts in code using the Transact-SQL script DOM and execute each batch individually. I’ll follow up with a separate article detailing that method and add the link here.

Semicolons Will Become Mandatory
Microsoft announced with the SQL Server 2008 release that semicolon statement terminators will become mandatory in a future version so statement terminators other than semicolons (whitespace) are currently deprecated.  This deprecation announcement means that you should always use semicolon terminators in new development.  I honestly don’t expect SQL Server to strictly enforce mandatory semicolons in the near future but it is still a best practice to use semicolon statement to avoid issues like those mentioned earlier as well as facilitate code maintainability.  I suggest specifying statement terminators in all new development and perhaps adding terminators to existing code as you perform maintenance.

Transact-SQL does not currently enforce the ANSI semicolon statement terminator requirement.  Instead, semicolon statement terminators are optional and any whitespace (spaces, tabs, newline) may be used instead.  The exception to this rule is that many of the statements introduced in SQL Server 2005 and later require the preceding statement to be properly terminated in order for the batch to compile.

Below are some guidelines I suggest on when to, and when not to, use semicolon statement terminators.

Suggested Guidelines
The Transact-SQL parser is quite lax, allowing any whitespace (e.g. space, tab, newline) to be used.  This laxness results in ambiguity like the examples at the beginning of this article demonstrate.  Similarly, statement terminators may not only be omitted, they may also be used in inappropriately.  I strongly suggest you adhere to the T-SQL syntax documented in the Books Online even if the parser allows otherwise.  This practice will help future-proof your code since relying on undocumented behavior is inherently risky.

Don’t precede a statement with a semicolon
Remember that the purpose of semicolons is to terminate SQL statements, not begin them.  A common mistake I see is throwing a semicolon in front of statements in order to get a batch of statements to compile, especially with newer statements like WITH (CTE expression) that require previous statement termination.  Although the T-SQL parser currently ignores extraneous and misplaced semi-colons, I suggest they be specified in the appropriate place according statement syntax documented in the SQL Server Books Online.

Specify semicolons at the end of each stand-alone SQL statement
Not only will this conform to the ANSI standard, your intent will be clearer and the code easier to read.

Terminate control-of-flow statement blocks at the end of the control-of-flow scope
Control-of-flow statements are not covered by the ANSI SQL standard because these are proprietary SQL extensions.  The SQL Server Books Online is sketchy on the subject and many of the examples (as of this writing) are inconsistent and do not always include statement terminators.  Furthermore, control-of-flow statement blocks are confusing due to the many variations, nesting, and optional BEGIN/END specifications.

Below are examples illustrating what I believe to be proper use of statement terminators control-of-flow block terminators using IF statements in SQL 2008 and later versions.  The same concepts apply to other control-of-flow constructs like WHILE and TRY/CATCH blocks.  I should add that this batch example will not compile under SQL 2005 because an explicit BEGIN/END block is required to execute a common table expression conditionally in that version.  T-SQL parser enhancements eliminated that requirement in SQL 2008 and later.

IF 1 = 1
  WITH a AS (SELECT 1 a) SELECT a FROM a; --terminate statement and IF

IF 1 = 1
  WITH a AS (SELECT 1 a) SELECT a FROM a; --terminate statement
ELSE
  WITH a AS (SELECT 1 a) SELECT a FROM a; --terminate statement and IF

IF 1 = 1
  BEGIN
    WITH a AS (SELECT 1 a) SELECT a FROM a; --terminate statement
  END; --terminate IF

IF 1 = 1
  BEGIN
    WITH a AS (SELECT 1 a) SELECT a FROM a; --terminate statement
  END
ELSE
  BEGIN
    WITH a AS (SELECT 1 a) SELECT a FROM a; --terminate statement
  END; --terminate IF

IF 1 = 1
  WITH a AS (SELECT 1 a) SELECT a FROM a; --terminate statement
ELSE
  BEGIN
    WITH a AS (SELECT 1 a) SELECT a FROM a; --terminate statement
  END; --terminate IF

IF 1 = 1
  BEGIN
    WITH a AS (SELECT 1 a) SELECT a FROM a; --terminate statement
  END
ELSE
  WITH a AS (SELECT 1 a) SELECT a FROM a; --terminate statement and IF

IF 1 = 1
  BEGIN
    WITH a AS (SELECT 1 a) SELECT a FROM a; --terminate statement
	IF 1 = 1
		WITH a AS (SELECT 1 a) SELECT a FROM a; --terminate statement
	ELSE
	BEGIN
		WITH a AS (SELECT 1 a) SELECT a FROM a; --terminate statement
	END; --terminate inner nested IF
	WITH a AS (SELECT 1 a) SELECT a FROM a; --terminate statement
  END
ELSE
  WITH a AS (SELECT 1 a) SELECT a FROM a; --terminate statement and outer IF

Summary
Consistent user of semicolons helps avoid bugs in code that might otherwise go undetected.  Code with statement terminators can also be more easily modified without introducing compile errors and make code easier to maintain because the end of each statement is readily apparent to subsequent developers.  Importantly, you’ll be better positioned for future SQL Server versions by consistently using semicolon statement terminators.