The Curious Case of Undetected SQL Exceptions

Undetected database errors are insidious. It can be really bad when an error gets dropped on the floor, resulting in incomplete or wrong results. Consider that this simple SELECT query returns an empty result set instead of raising a SqlException for the divide by zero error:

string sqlBatchText = @"
BEGIN TRY

    SELECT 1/0 AS DivideByZeroError;

END TRY
BEGIN CATCH

    THROW;

END CATCH;
";

int rowCount = 0;

try
{

    using (SqlConnection connection = new SqlConnection(@"Data Source=.;Initial Catalog=tempdb;Integrated Security=SSPI"))
    using (SqlCommand command = new SqlCommand(sqlBatchText, connection))
    {
        connection.Open();
        using (SqlDataReader reader = command.ExecuteReader())
        {
            while (reader.Read()) { ++rowCount; };
        }
        Console.WriteLine("{0} rows read", rowCount);
    }
}
catch
{
    throw;
}

One can generally assume SQL errors raised during batch execution will also raise ADO.NET database exceptions. However, there are cases involving multi-statement batches and stored procedures where exceptions are not raised by ADO.NET automatically following SQL errors, as the above example shows. These scenarios can be distilled as:

1) An error is caught in T-SQL while executing a row-returning statement.
2) An error occurs after a row-returning statement successfully executes.

To ensure database exceptions are raised, one must either code T-SQL to avoid these scenarios entirely or ensure the ADO.NET data access layer invokes the data reader NextResult method to consume all results returned even when no more than one result set is expected. Row-returning statements include SELECT (not variable assignment), OUTPUT clause in an INSERT/UPDATE/DELETE/MERGE statement that returns rows to the client, as well as some specialized commands like RESTORE FILELISTONLY, DBCC commands with TABLE_RESULTS, etc.

The defensive programming technique example below executes a command that is expected to return a single result set but still calls NextResult afterwards to ensure database exceptions are raised when SQL errors are subsequently raised in the T-SQL batch.

string sqlBatchText = @"
BEGIN TRY

    SELECT 1/0 AS DivideByZeroError;

END TRY
BEGIN CATCH

    THROW;

END CATCH;
";

int rowCount = 0;

try
{

    using (SqlConnection connection = new SqlConnection(@"Data Source=.;Initial Catalog=tempdb;Integrated Security=SSPI"))
    using (SqlCommand command = new SqlCommand(sqlBatchText, connection))
    {
        connection.Open();
        using (SqlDataReader reader = command.ExecuteReader())
        {
            while (reader.Read()) { ++rowCount; };
            //get any subsequent errors returned
            while (reader.NextResult()) { };
        }
        Console.WriteLine("{0} rows read", rowCount);
    }
}
catch
{
    throw;
}

I use System.Data.SqlClient namespace objects (.NET Framework Data Provider for SQL Server) in this article but the same considerations apply to other ADO.NET managed providers (System.Data.Odbc and System.Data.OleDb namespace) when accessing SQL Server.

Regardless of the method one uses to execute SQL Server queries, ADO.NET uses a data reader to return command results even when higher-level objects (e.g. Dataset) or ORMs (e.g. Entity Framework) are used. The low-level ADO.NET command ExecuteReader method exposes the data reader whereas ExecuteScalar and ExecuteNonQuery do not expose the internal reader.

ExecuteScalar returns the first column of the first row returned as a scalar value but doesn’t call NextResult on the internal data reader to retrieve subsequent results. Consequently, errors may go undetected with ExecuteScalar. ExecuteScalar will not raise an exception if a T-SQL error occurs after the first row is returned. Also, if no rows are returned because the row-returning statement erred and the error was caught in T-SQL, ExecuteScalar returns a null object without raising an exception.

ExecuteNonQuery executes the entire batch of statements and returns the accumulated count of affected rows as a scalar value, discarding rows returned (if any). The returned value will be -1 if SET NOCOUNT ON is specified. Because ExecuteNonQuery internaly consumes all results in the process, errors will be raised without additional ADO.NET programming, albeit one doesn’t typically use ExecuteNonQuery to execute a batch that returns rows. Again, the ADO.NET error detection issue only applies to row-returning statements.

The remainder of this article discusses T-SQL error handling and ADO.NET defensive programming techniques in more detail and discusses techniques to avoid undetected database errors in ADO.NET.

T-SQL Error Handling Objectives
T-SQL and ADO.NET data access code must work in concert with one another to ensure SQL errors are detected in application code. The T-SQL constructs used in multi-statement batches can affect if and how when errors are reported by ADO.NET during batch execution. I’ll start by citing core T-SQL error handling objectives, which can be summarized as:

1) Ensure a multi-statement T-SQL batch doesn’t continue after an error occurs.
2) Rollback transaction after errors.
3) Raise error so that the client application is aware a problem occurred.

The T-SQL building blocks used to achieve these objectives are:
1) SET XACT_ABORT ON
2) Structured error handling (SHE) (a.k.a. TRY/CATCH)
3) Control-of-flow (e.g. IF @@ERROR GOTO ErrorHandler)

T-SQL Behavior Without SEH and XACT_ABORT ON
When a runtime error occurs with the XACT_ABORT session setting ON outside a TRY block, SQL Server will stop batch execution immediately, rollback the transaction (if any), and raise the error. Consequently, a single SET XACT_ABORT ON statement will meet all aforementioned error handling objectives without T-SQL procedural code. However, the XACT_ABORT setting is not considered when user-defined errors are raised with RAISERROR so control-of-flow statements are required to meet objectives #2 and #3 when RAISERROR is employed.

SET XACT_ABORT ON also rolls back open transactions following an attention event like an explicit cancel or query timeout, which would otherwise leave the transaction open. This is one reason why I strongly recommend using SET XACT_ABORT ON, especially in procs that include BEGIN TRAN, regardless of whether or not SEH is also used.

T-SQL Behavior Without SEH and XACT_ABORT OFF
When an error occurs with the SET XACT_ABORT session setting OFF and SEH is not used, SQL Server will raise the error immediately but, depending on the error and severity, batch execution might continue and the transaction not rolled back. The T-SQL batch must use control-of-flow statements after each statement to avoid continuing after errors and roll back the transaction (objectives #1 and #2).

T-SQL Behavior With T-SQL Structured Error Handling
When an error occurs during statement execution with a T-SQL structured error handler is in scope, the CATCH block error handler is entered, after marking the transaction uncommittable if SET XACT_ABORT is ON. SEH meets the first error handling objective by skipping subsequent statements in the TRY block after an error. It is the responsibility of the error handling code in the CATCH BLOCK to roll back the transaction if needed and raise the error. The simple T-SQL handler below achieves objectives #2 and #3 in Azure SQL Database and SQL Server 2012 and later:

BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK;
    THROW;
END;

THROW is not available In SQL 2008 R2 and earlier so one must use RAISERROR instead in older versions. The error handler below provides similar functionality a THROW, although RAISERROR obfuscates the original error as a user error with message number 50000+.

BEGIN CATCH
    DECLARE
          @ErrorNumber int = ERROR_NUMBER()
        , @ErrorMessage nvarchar(2048) = ERROR_MESSAGE()
        , @ErrorSeverity int = ERROR_SEVERITY()
        , @ErrorState int = ERROR_STATE()
        , @ErrorLine int = ERROR_LINE();

    RAISERROR('Error %d caught at line %d: %s'
        , @ErrorSeverity
        , @ErrorState
        , @ErrorNumber
        , @ErrorLine
        , @ErrorMessage);
END CATCH;

T-SQL Interaction with ADO.NET
ADO.NET will reliably detect errors in batches without T-SQL SEH when no result sets are returned or only the last statement in the batch returns rows. When a row returning statement is not the last statement in the batch, ADO.NET code must call NextResult to ensure ADO.NET raises errors for subsequent statements in the batch that may have erred. Avoid row returning statements that are not the last one in a batch unless you are certain the calling code consumes all results with NextResult.

Using T-SQL SEH when result sets are returned has implications on ADO.NET data access code to ensure database exceptions are raised after SQL errors. As illustrated with the ExecuteReader example at the beginning of this article, when an error is caught in T-SQL during a row-returning statement, ADO.NET will not raise the database exception immediately and instead return the partial or empty result set generated by the failed statement. The next data reader Read method call will return false after the error. The error raised in the CATCH block by the T-SQL THROW or RAISERROR statement is considered a separate result by ADO.NET and will not be raised as a database exception until NextResult is called.

Unless you have a specific reason to use SEH in row-returning batches, I suggest instead using SET XACT_ABORT ON alone as this will address core T-SQL error handling objectives and allow ADO.NET to detect errors without calling NextResult. However, as mentioned earlier, ADO.NET code will still need to call NextResult if the row-returning statement is not the last statement in the batch.

Summary
I hope this information will help you ensure database errors in multi-statement batches are detected. The interaction between T-SQL and ADO.NET isn’t as intuitive as it could be.

Deprecated SQL Server Data Access Technologies

I hope the warning excerpt below from the SQL Server Books Online is not a surprise to you (emphasis mine):

Warning:
SQL Server Native Client (SNAC) is not supported beyond SQL Server 2012
. Avoid using SNAC in new development work, and plan to modify applications that currently use it. The Microsoft ODBC Driver for SQL Server provides native connectivity from Windows to Microsoft SQL Server and Microsoft Azure SQL Database.

Let me first mention that the Books Online should have made it clear that this warning applies only to SNAC data access by applications. The warning does not apply to Microsoft SQL Server features and products that use SNAC as part of the OLE DB stack, like Linked Servers, SSIS, and SSAS. Microsoft SQL Server 2012 Native Client is still installed and used as a component dependency to support these features in SQL Server 2014 and later versions, although that may change in the future.

More concerning is OLE DB in general. Even before SQL Server 2012 was released, Microsoft announced in August, 2011 that ODBC was the preferred technology for relational database access and that OLE DB was deprecated. Below is an excerpt from the Microsoft is Aligning with ODBC for Native Relational Data Access FAQ which should be alarming to folks who continue to use Microsoft OLE DB providers to access SQL Server 2014 and later databases (note Denali is the code name for SQL Server 2012 and, again, emphasis mine):

Question6: If I have an OLE DB application that I write for Denali, will it be supported on a post Denali version of SQL Server that is released during the life of Denali?

Answer: No, in fact we may explicitly block the OLE DB applications on post-Denali versions of SQL Server. It is recommended that you plan your migration soon to ODBC, if you want to start using newer versions of SQL Server as soon as they release.

The explicit block of OLE DB data access has yet not happened (as of this writing) but I think it’s prudent to heed the announcement and warning, at least for applications that currently use SQL Server 2014 and later versions as well as SQL Azure Database, or may need to use newer SQL versions in the future. Microsoft has a history of going out of their way to provide backwards compatibility in the SQL Server space but it’s been 3 SQL Server versions (soon to be 4 versions) since the deprecation announcement.

The bottom line is that native Windows applications (e.g. unmanaged C++, VB6, VBA, Classic ASP, etc.) should generally use ODBC and the latest Microsoft ODBC Driver for SQL Server. This stand-alone ODBC driver is a free download from https://docs.microsoft.com/en-us/sql/connect/odbc/download-odbc-driver-for-sql-server. Not only should SNAC not be used against SQL Server 2014 and later versions, the deprecated SQL Server OLE DB provider and ODBC driver that ship with Windows should also be avoided. This practice will facilitate more seamless upgrades to new versions of SQL Server and Azure SQL Database.

Preparing for the Future
The message is loud and clear that ODBC is the supported and preferred path for native applications going forward. The Data Access Technologies Road Map provides an overview and history of Microsoft data access technologies, which I recommend you peruse to ensure you are not inadvertently using deprecated or unsupported technologies for new development and, for existing applications, consider moving from legacy data access technologies to current ones when practical.

The current Microsoft ODBC Driver for SQL Server as of this writing is ODBC Driver 13 for SQL Server. Note that that both the 13.0 and 13.1 versions of this driver have the same “ODBC Driver 13 for SQL Server” display name listed under installed programs and ODBC Data Source Administrator. If installed, the driver will be listed under installed programs along with the corresponding driver version (when viewed detail mode). The 13.1 version adds support for the Always Encrypted feature. These ODBC Drivers are available from the link I mentioned earlier.

The remainder of this article reviews common data access technologies for native (unmanaged) Windows applications that are explicitly identified as unsupported, deprecated, or may have a limited future along with remediation considerations. This isn’t to say these technologies won’t work, just that using them may block upgrades to new versions of SQL Server and Azure SQL Database as well as prohibit using new SQL Server features. Applications that are aligned with current technologies are much better positioned for seamless SQL Server upgrades, both on-prem and in the cloud.

Deprecated Windows Data Access Components
Windows Data Access Components (WDAC), which was formally known as Microsoft Data Access Components (MDAC) in older Windows versions, is included with Windows to provide data access infrastructure for ODBC, OLE DB, ADO Classic, and managed ADO.NET out-of-the box. WDAC includes a mix of deprecated, mature, and modern components. WDAC is part of the operating system and maintained by Windows Update.

Both the SQL Server ODBC driver and OLE DB provider included with WDAC are deprecated; these are provided only for legacy application backwards compatibility and should not be used for new application development. The WDAC ODBC driver named “SQL Server” and OLE DB provider named “Microsoft OLE DB Provider for SQL Server” (SQLOLEDB) were both deprecated in favor of SQL Server Native Client when SQL Server 2005 was released for over 10 years ago. Not only do these old components not support new data types introduced after SQL Server 2000 directly, newer features like Availability Groups, MARS, Always Encrypted, and idle connection resiliency are not supported either. Native applications should install and use a separately installed SQL Server driver rather instead of the WDAC “SQL Server ODBC driver, SQLOLEDB provider, or SNAC ODBC driver/OLE DB provider.

WDAC also includes the Microsoft OLE DB Provider for ODBC Drivers (MSDASQL), which acts as a bridge to allow OLE DB applications to use an ODBC driver instead of an OLE DB provider for low-level data access. The main purpose of MSDASQL was to allow OLE DB programs to access RDBMS products other than SQL Server that had an ODBC driver but no OLE DB provider, with the expectation that vendors would eventually jump on the OLE DB bandwagon. However, that didn’t happen widely because OLE DB relies on Windows-only COM interfaces and many vendors preferred to support only ODBC interfaces. Microsoft recognized the value of ODBC for cross-platform SQL Server data access, which is one of the reasons called out for OLE DB deprecation. Although MSDASQL is an OLE DB provider, it is not deprecated for SQL Server relational data access because it uses ODBC to access the data store. MSDASQL can facilitate transitioning to ODBC in existing OLE DB applications. I’ll discuss this in more detail shortly.

ADO (ActiveX Data Objects, not to be confused with ADO.NET) components are included with WDAC. Although not deprecated, ADO a mature OLE DB technology that hasn’t been enhanced since ADO 6.0 was released with WDAC in Windows Vista over 10 years ago. ADO types do not directly support data types introduced after SQL Server 2000, regardless of the underlying provider/driver used. I’ll leave speculation on the future of ADO as an exercise for the reader.

Changing Existing ODBC Applications
Changing an existing ODBC application (ADO or direct ODBC function calls) to use the Microsoft ODBC Driver for SQL Server is often simply a matter of installing the driver and changing the connection string or DSN to use it. Unmanaged C++ applications that use the ODBC call level interface directly may need to be recompiled using the header files included with the SDK install of the ODBC driver. It’s often trivial to switch from SNAC or the WDAC SQL Server ODBC driver. Application testing should be done to ensure compatibility but ODBC driver upgrades are usually transparent.

Changing from OLE DB to ODBC in ADO Classic
There is still quite a bit of OLE DB SQL Server access in the wild in my experience, varying from C++, VBA, Classic ASP, and even in VB.NET apps migrated from VB6 that were never retrofitted to use to SqlClient (which should be done as it provides high-performance managed data access for .NET applications). From unmanaged code, ADO provides an easy to use object-oriented COM interface on top of the 100+ complex OLE DB interfaces. Low-level data access is performed by the OLE DB provider specified in the connection string or connection object. MSDASQL (Microsoft OLE DB Provider for ODBC Drivers) is the default provider in ADO so ODBC is used when no OLE DB provider is specified. When MSDASQL is used explicitly or by default, the connection string must specify either a DSN or ODBC driver.

Below are DSN-less ADO ODBC connection string examples that use the Microsoft ODBC Driver for SQL Server with and without a trusted connection. “Provider=MSDASQL” could have been specified in the connection string but is not technically required since it’s the default provider:

DataSource=YourServer;Driver={ODBC Driver 13 for SQL Server};Database=YourDatabase;Trusted_Connection=Yes
DataSource=YourServer;Driver={ODBC Driver 13 for SQL Server};Database=YourDatabase;Trusted_Connection=Yes;UID=YourUser;PWD=YourPassword

The examples below use an ODBC DSN with the default database specified in the DSN configuration. The first uses Integrated Windows authentication (specified in the DSN configuration) and the second example is functionally identical except using SQL Server authentication. Be aware that the Microsoft ODBC Driver for SQL Server does not store user credentials in the DSN configuration so those must be specified by the application:

DSN=YourOdbcDataSource
DSN=YourOdbcDataSource;UID=YourUser;PWD=YourPassword

ADO applications use a handful of objects to interact with SQL Server (mostly connection, command, recordset, record, transaction, parameter, and field objects). These objects abstract the implementation details such that it is possible to use ADO with any DBMS product, OLE DB provider, or ODBC driver as long as the low-level driver/provider supports the requested functionality. One can theoretically transition from OLE DB to ODBC with only a connection string change. However, the devil is in the details when switching from OLE DB to ODBC via MSDASQL.

MSDASQL converts OLE DB method calls into their equivalent ODBC function calls. Commonly used fast-forward read-only client-side cursors (a.k.a. firehose cursor, same as the only type ADO.NET uses) tend to work well in my experience. However, ADO provides a plethora of options for cursor location, execution, cursor types, and locking modes that ADO and MSDASQL may or may not translate well to ODBC equivalents. If you run into problems during testing with ODBC, make sure the requested ADO options are appropriate for the task at hand. It is common for ADO applications to use advanced options and pessimistic locking inappropriately. A firehose cursor with optimistic locking is often best for the task at hand and will perform better too.

Like ADO, MSDASQL is a mature technology that hasn’t gotten much love lately. I could be wrong but I don’t expect patches that are not security related. If you run into issues with MSDASQL, you may be better off developing a work-around rather than wait for a fix.

Changing OLE DB Interface Calls to ODBC
C++ applications that use OLE DB interfaces directly rather than the higher-level ADO API can be converted to ODBC by using the MSDASQL provider similarly as discussed in the ADO topic. The same considerations apply.

Note that MSDASQL does add another layer, which may be an issue for applications that are especially performance-sensitive. The best approach for C++ applications that are expected to be around for a while might be to move from OLE DB to ODBC function calls (or via MFC). This will provide the highest level of performance and alignment with ODBC.

Remediation Summary
Inasmuch as I recommend staying aligned with product technology lifecycles, whether or not one should modify an existing application to use currently data access technologies is ultimately a business decision. Business benefits include not only supportability but also agility in making functional enhancements that leverage newer features as well as non-functional ones, such as support for Availability Groups. The costs of the development and testing needed to use the Microsoft ODBC Driver for SQL Server will vary considerably depending on application design. Those costs need to be weighed against the benefits of the effort.

That being said, I always recommend one avoid deprecated technologies in new development work even if that requires moving outside one’s comfort zone. Technical debt is a hidden cost that often needs to be paid back with interest.

p.s. For history buffs, this post by Hal Berenson, retired Microsoft Distinguished Engineer and General Manager, provides additional background on OLE DB and data access technologies.

SQL Server 2016 SP1 Standard Edition Enhancements

I seldom get excited about service packs but the changes released with SQL Server 2016 SP1 are the most significant I’ve seen in a SQL Server service pack in 20+ years. Microsoft announced this week at the Microsoft Connect(); developer’s conference that SQL Server 2016 SP1, which is available for download immediately, allows features previously available only in Enterprise/Developer Editions to be used in lessor Standard, Web, Express, and LocalDB Editions too. Features like table partitioning, In-Memory OLTP, and columnstore are now options for developers and DBAs using SQL Server Standard Edition and even the free Express Edition in production. See SQL Server 2016 Service Pack 1 (SP1) released !!! for the complete matrix of programmability features by edition along with other cool SP1 information.

The implications are huge now that SQL Server has the same programmability surface area among editions. The choice of the production edition can be made independently based on operational needs rather than programmability features. Developers can use a free edition (i.e. LocalDB, Express or Developer) without fear a feature won’t be available in production as long as prod is running SQL Server 2016 SP1 or greater. DBAs can now choose the appropriate edition for production based on other considerations like advanced high availability, TDE, Auditing as well as performance features like higher supported memory, more number of cores, and advanced scanning. This separation of concerns avoids the need to lock in the production edition early in the application lifecycle, making development easier and production implementation more flexible.

Real World Use Case Scenario
I work with an ISV with hundreds of customers running a mix of Standard and Enterprise Edition. Their needs vary widely and SQL Server Enterprise Edition is not an option for some due to budget constraints. Some tables are often quite large so partitioning is required for manageability and, for their reporting workload, partitioning also improves performance of large scans due to partition elimination. The ugliness though, is that table partitioning (and/or columnstore) is the right tool for the job but was not an option for customers on Standard Edition.

The ISV initially compromised and used view partitioning instead of table partitioning so that the same code would run regardless of edition. Although that provided the expected manageability benefits, there were some downsides. Compilation time increased significantly as the number of partitioned view member tables increased as did the query plan complexity. This sometimes resulted in poor query plans against very large tables and especially impacted larger and most valued customers, most of which were running Enterprise Edition.

To address the problem before SQL Server 2016 SP1, the ISV added conditional code to the application so that either view or table partitioning could be used depending on the SQL Server edition. This wasn’t ideal as it added code complexity and doubled the number of QA test cases for application features that performed partition maintenance. However, since the resultant benefits for their larger customers on Enterprise Edition were quite significant; the additional costs of development and testing were well-justified.

Now that table partitioning is available in SQL Server 2016 SP1 Standard Edition, they plan to require SQL Server 2016 SP1 (or later) going forward, use table partitioning unconditionally, and perhaps introduce usage of other features like columnstore that were previously Enterprise only. Not only will this simplify the code base and test cases, customers on Standard Edition will be happier with their experience and can upgrade to Enterprise if they so choose without reinstalling or reconfiguring the application. It will of course take some time before all their customers upgrade to the latest product version and SQL 2016 SP1+ but the future is much brighter now.

Perform Due Diligence
If you are new to features previously available only in Enterprise Edition, I suggest you perform due diligence before using these features. Memory-optimized features like columnstore and In-Memory OLTP require additional physical memory and insufficient memory with memory-optimized features will be a production show-stopper. Make sure your hardware is sized appropriately regardless of edition and, in the case of editions other than Enterprise or Developer, memory requirements don’t exceed the maximum capacity limits for that edition. Although very powerful, In-Memory OLTP is a fundamentally different paradigm that you might be accustomed to regarding transactional behavior and isolation levels. Be sure you fully understand these features before using it in development or production.

Summary
I hope these changes are enough motivation for you to consider upgrading to SQL Server 2016 SP1, especially if you are running Standard Edition or are currently on an older SQL Server version. Together with the fact that SQL Server 2016 just runs faster, the time and effort spend in upgrading is a solid investment that will pay dividends regardless of edition.

SQL Server 2016 and Azure SQL Database V12 Breaking Change

This post is to get the word out about a breaking change to datetime conversion and comparison behavior in SQL Server 2016 and Azure SQL Database V12. This change hasn’t been documented as of this writing in the Breaking Changes to Database Engine Features in SQL Server 2016 topic in the SQL Server Books Online.

In short, conversion from datetime to a higher precision temporal data type (datetime2, datetimeoffset, or time) may yield a different, but more accurate, time value than in prior versions. Also, predicates involving datetime consider the full precision of raw datetime internal value instead of the time value rounded to the nearest millisecond. These changes in conversion and comparison behavior may affect existing applications and is not intuitive unless one understands the underlying datetime data type implementation.

Background
You may be aware that the accuracy of datetime is limited to 1/300 of a second. This is because values are internally an 8-byte structure consisting of 2 separate 32-bit integers, one with the number of day units since 1900-01-01 and the other with the number of 1/300 second interval units since midnight. The 1/300 second unit interval limits the time accuracy to 3.33333… milliseconds and the milliseconds value will be a repeating decimal when time interval units are not evenly divisible by 3. The raw decimal value is rounded to a scale of 3 in accordance with the fixed datetime precision of 3, resulting in a millisecond value of 0, 3, or 7 for all datetime values.

Pre-SQL Server 2016 Behavior
Before SQL Server 2016, conversion from datetime to another temporal type used the source datetime value after it was rounded to the nearest millisecond, which truncated repeating decimal fractional milliseconds. The rounded value was then rounded again according to the target type precision. When the target type precision was greater than 3, the time was extended to the target type precision with insignificant trailing zeros, resulting in zero for the sub-millisecond value.

DECLARE @DateTime datetime = '2016-01-01T00:00:00.007';
SELECT CAST(@DateTime AS datetime2(0)); --2016-01-01 00:00:00
SELECT CAST(@DateTime AS datetime2(1)); --2016-01-01 00:00:00.0
SELECT CAST(@DateTime AS datetime2(2)); --2016-01-01 00:00:00.01
SELECT CAST(@DateTime AS datetime2(3)); --2016-01-01 00:00:00.007
SELECT CAST(@DateTime AS datetime2(4)); --2016-01-01 00:00:00.0070
SELECT CAST(@DateTime AS datetime2(5)); --2016-01-01 00:00:00.00700
SELECT CAST(@DateTime AS datetime2(6)); --2016-01-01 00:00:00.007000
SELECT CAST(@DateTime AS datetime2(7)); --2016-01-01 00:00:00.0070000

Also, when datetime was compared to another temporal type, the rounded value was used. This script shows the result of the equality predicate is true after the datetime value is converted to datetime2.

--This script prints EQUAL predicate is true
DECLARE @DateTime datetime = '2016-01-01T00:00:00.003';
DECLARE @DateTime2 datetime2(7) = @DateTime;
IF @DateTime = @DateTime2 PRINT 'EQUAL predicate is true' ELSE PRINT 'EQUAL predicate is not true';
IF @DateTime < @DateTime2 PRINT 'LESS THAN predicate is true' ELSE PRINT 'LESS THAN predicate is not true';
IF @DateTime > @DateTime2 PRINT 'GREATER THAN predicate is true' ELSE PRINT 'GREATER THAN predicate is not true';
GO
--This script prints EQUAL predicate is true
DECLARE @DateTime datetime = '2016-01-01T00:00:00.007';
DECLARE @DateTime2 datetime2(7) = @DateTime;
IF @DateTime = @DateTime2 PRINT 'EQUAL predicate is true' ELSE PRINT 'EQUAL predicate is not true';
IF @DateTime < @DateTime2 PRINT 'LESS THAN predicate is true' ELSE PRINT 'LESS THAN predicate is not true';
IF @DateTime > @DateTime2 PRINT 'GREATER THAN predicate is true' ELSE PRINT 'GREATER THAN predicate is not true';
GO

SQL Server 2016 Behavior Change
SQL Server 2016 and Azure SQL Database V12 use the raw datetime internal value without rounding during conversion to another temporal type. The value is rounded only once during conversion, to the target type precision. The end result will be the same as before SQL Server 2016 when the target type precision is 3 or less. However, the converted value will be different when the target type precision is greater than 3 and the internal time unit interval is not evenly divisible by 3 (i.e. rounded source datetime millisecond value is 3 or 7). Note the non-zero microseconds and nanoseconds in the script results below and that rounding is based on the target type precision rather than the source.

DECLARE @DateTime datetime = '2016-01-01T00:00:00.003';
SELECT CAST(@DateTime AS datetime2(0)); --2016-01-01 00:00:00
SELECT CAST(@DateTime AS datetime2(1)); --2016-01-01 00:00:00.0
SELECT CAST(@DateTime AS datetime2(2)); --2016-01-01 00:00:00.03
SELECT CAST(@DateTime AS datetime2(3)); --2016-01-01 00:00:00.003
SELECT CAST(@DateTime AS datetime2(4)); --2016-01-01 00:00:00.0033
SELECT CAST(@DateTime AS datetime2(5)); --2016-01-01 00:00:00.00333
SELECT CAST(@DateTime AS datetime2(6)); --2016-01-01 00:00:00.003333
SELECT CAST(@DateTime AS datetime2(7)); --2016-01-01 00:00:00.0033333
GO
DECLARE @DateTime datetime = '2016-01-01T00:00:00.007';
SELECT CAST(@DateTime AS datetime2(0)); --2016-01-01 00:00:00
SELECT CAST(@DateTime AS datetime2(1)); --2016-01-01 00:00:00.0
SELECT CAST(@DateTime AS datetime2(2)); --2016-01-01 00:00:00.01
SELECT CAST(@DateTime AS datetime2(3)); --2016-01-01 00:00:00.007
SELECT CAST(@DateTime AS datetime2(4)); --2016-01-01 00:00:00.0067
SELECT CAST(@DateTime AS datetime2(5)); --2016-01-01 00:00:00.00667
SELECT CAST(@DateTime AS datetime2(6)); --2016-01-01 00:00:00.006667
SELECT CAST(@DateTime AS datetime2(7)); --2016-01-01 00:00:00.0066667
GO

This behavior change provides a more accurate converted value but may break applications that expect the converted value to be the same as the rounded datetime value as was the case before SQL Server 2016.

Be aware than the full raw datetime precision (instead of the rounded value) is also used when evaluating predicates involving a datetime type. The full precision of both arguments are used, resulting in the equality compare predicate to evaluate to false in both scripts below. The greater than predicate is true in the first script and the less than predicate is true in the second:

--This script prints GREATER THAN predicate is true
DECLARE @DateTime datetime = '2016-01-01T00:00:00.003';
DECLARE @DateTime2 datetime2(7) = @DateTime;
IF @DateTime = @DateTime2 PRINT 'EQUAL predicate is true' ELSE PRINT 'EQUAL predicate is not true';
IF @DateTime < @DateTime2 PRINT 'LESS THAN predicate is true' ELSE PRINT 'LESS THAN predicate is not true';
IF @DateTime > @DateTime2 PRINT 'GREATER THAN predicate is true' ELSE PRINT 'GREATER THAN predicate is not true';
GO
--This script prints LESS THAN predicate is true
DECLARE @DateTime datetime = '2016-01-01T00:00:00.007';
DECLARE @DateTime2 datetime2(7) = @DateTime;
IF @DateTime = @DateTime2 PRINT 'EQUAL predicate is true' ELSE PRINT 'EQUAL predicate is not true';
IF @DateTime < @DateTime2 PRINT 'LESS THAN predicate is true' ELSE PRINT 'LESS THAN predicate is not true';
IF @DateTime > @DateTime2 PRINT 'GREATER THAN predicate is true' ELSE PRINT 'GREATER THAN predicate is not true';
GO

To provide insight into why the comparisons result in greater than and less than respectively, the script below shows the nanoseconds value of the compared data types:

DECLARE @DateTime datetime = '2016-01-01T00:00:00.003';
DECLARE @DateTime2 datetime2(7) = @DateTime;
SELECT DATEPART(nanosecond, @DateTime); --3333333
SELECT DATEPART(nanosecond, @DateTime2); --3333300
GO
DECLARE @DateTime datetime = '2016-01-01T00:00:00.007';
DECLARE @DateTime2 datetime2(7) = @DateTime;
SELECT DATEPART(nanosecond, @DateTime); --6666666
SELECT DATEPART(nanosecond, @DateTime2); --6666700
GO

The datetime2 type is accurate only to 100 nanosecond whereas datetime includes values to the nanosecond (and beyond) because the theoretical precision of repeating decimal values is unlimited. The implication is that a datetime type with a repeating decimal value will never compare equally with any temporal type except datetime.

Datetime conversion and comparison behavior is controlled by the database compatibility level. Databases in SQL Server 2016 level (130) use the new behavior and the legacy behavior is used with other levels.

Summary
These datetime behavior changes have the benefit of improved accuracy and performance of datetime conversion/comparison. Affected applications can use a pre-SQL Server 2016 database compatibility level until they can be remediated.

I recommend one avoid comparing datetime directly to other temporal types going forward. Instead convert the datetime value to the type being compared and use the converted value in the predicate. It’s generally best practice to match data types whenever possible for performance and to avoid ambiguity.

SQL Saturday – St. Louis 2016

Please join us for SQL Saturday, St. Louis 2016. We have 25 sessions scheduled covering a wide range of topics on the Microsoft Data Platform stack.

As in the past few years, the event will be held at the SLU Center for Workforce & Organizational Development, 3545 Lindell Blvd , Saint Louis, Missouri 63103. Visit the event home page for registration, directions and session details.

See you there!

SQL Server TVP Performance Gotchas

Table-valued parameters have important considerations that developers and DBAs need to be aware of. It is essential that application code specify the proper data type and length for TVP columns in order to achieve optimal performance and reduce unnecessary overhead. Additionally, a trace (Extended Events, server-side SQL Trace, Profiler) that captures RPC events of an inappropriately defined TVP can not only exacerbate performance issues, but affect stability of the SQL Server instance in some cases.

The graph below summarizes the impact the application code max column length specification can have on performance, without and with a trace running. All tests used the same table type of 10 varchar(50) columns and a 10,000 row TVP rows passed via a DataTable object of 10 string columns. The only variables were the max column length specified by the app code and a trace running on the database server. The client application was run on a different machine than the database server and elapsed time measured by the application.

TVP Performance Comparison
Figure 1: Impact of TVP maximum column length specification with and without tracing

The average duration was 113ms without a trace running when the app code used the default -1 max column length. However, when max column length 50 was specified (matching the varchar(50) column of the table type), the average duration dropped significantly to 75ms. The trivial code change of specifying the string column max length of 50 improved performance by 33%.

I then ran the same pair of tests while a trace captured the RPC completed events. The average duration of the default max length test increased from 113ms to 9.324ms with the trace running, an over 80x degradation in performance! The test with the explicit 50 character max length was not nearly impacted as much, increasing from 75ms to 89ms (which is tolerable, IMHO).

TVP Internals
Under the hood, TVP data are passed to SQL Server over the Tabular Data Stream (TDS) protocol. The client API sends TVP column meta-data to SQL Server describing the data type, length, and other meta-data for each TVP column followed by data rows with each column in native format matching the preceding data type specification. SQL Server uses the provided column meta-data to prepare and fill buffers for efficient processing on the server side, leveraging native types to eliminate parsing overhead similarly to other parameterized queries.

Before a query or stored proc with a TVP starts executing, SQL Server creates a table in tempdb with the same schema as the parameter table type and uses bulk insert internally to efficiently load the table with TVP rows streamed by the client application. The size of a TVP is constrained only by available tempdb storage. SQL Server executes the query/proc after the TVP temp table is loaded and the parameterized T-SQL query/proc can then use the TVP data.

The TVP columns provided by the client application do not have to match the schema of the target table type; SQL Server implicitly converts TVP values to match the target table type column when data types differ. Although not optimal, implicit conversion is generally not a major factor in overall TVP performance.

The application-specified TVP max column length can impact performance significantly, and in some cases dramatically, as illustrated by the performance tests shown earlier. SQL Server prepares to receive TVP data up to the max length specified by the client application rather than the defined size of the target table type column. When the specified max length of variable length columns exceed the 8000 byte tipping point, SQL Server uses a different code path to allow for large object (LOB) values up to 2GB. Unless the table type actually contains LOB values (varchar(MAX), nvarchar(MAX)), database server resources are wasted unnecessarily when an inappropriate max column length is specified.

LOB values are especially problematic when a trace captures the RPC completed event of a TVP query. Tracing uses memory from the OBJECTSTORE_LBSS memory pool to build trace records that contain TVP LOB values. From my observations of the sys.dm_os_memory_clerks DMV, each LOB cell of a TVP requires about 8K during tracing regardless of the actual value length. This memory adds up very quickly when many rows and lob columns are passed via a TVP with a trace running. For example, the 10,000 row TVP with 10 LOB columns used in the earlier test required over 800MB memory for a single trace record. Consider that a large number of TVP LOB cells and/or concurrent TVP queries can cause queries to fail with insufficient memory errors. In extreme cases, the entire instance can become unstable and even crash under due to tracing of TVP queries.

Specifying Proper TVP Parameter Column Meta-Data
A SQL Server development best practice has long been to use strongly-typed parameters with attention to detail regarding the parameter data type and length such that it is consistent with the types on the server. This practice improves performance by avoiding implicit data type conversions, promotes sargable expressions, avoids unnecessary procedure cache bloat, and inherently validates data for proper typing on the client before it is sent to the database server. With scalar parameters, one need only specify the correct SqlDbType along with the proper length, precision/scale (avoiding AddWithValue method to add parameters) and all is well in the world.

TVP parameters require additional column meta-data not applicable to scalar parameters. The parameter data type of a TVP in .NET is always SqlDbType.Structured. The additional TVP column meta-data is inferred from the supplied parameter value, which may be a DbDataReader, IEnumerable, or DataTable object. These objects inherently contain column meta-data and methods enumerate rows, which the SqlClient API uses to send the TVP to SQL Server.

DataTable objects are most commonly used as TVP values. DataTables are easy to use and can serve as containers for data beyond just TVP usage. But unlike DbDataReader and IEnumerable objects, a big gotcha with a DataTable is that the default data type String with maximum length of -1 (2GB LOB). This is the .NET equivalent of the SQL Server nvarchar(MAX) data type and has many insidious and negative implications with a TVP. First, values of types other than string that are added to a DataTable string column will be converted to string (DateTime, Integer, GUID, etc.). Consequently, using the default DataTable column string data type for non-string types will:

• increase client memory requirements compared to more compact native types
• incur conversion overhead
• prevent strong-typed data validation on client side
• require date format aware formatting of date and datetime values
• require using a period as decimal separators
• increase network usage compared with smaller native types

When String is the proper column data type, developers must be especially mindful of the max length specification when the DataTable is used as the TVP value. Strings in Windows and .NET are Unicode, requiring 2 bytes per character. This means a max length of over 4000 characters will cross the 8000 byte threshold for LOB data on the server side regardless of the table type on the server. Avoid using long string columns in TVPs when many rows are passed and never use the default -1 length unless a MAX type is actually intended.

DBAs who support applications that use TVPs should be aware of the repercussions tracing can have on performance and SQL Server memory. Avoid capturing TVP RPC completed events of large TVP requests, if possible. When tracing TVP RPC completed events, monitor the OBJECTSTORE_LBSS memory pool for excessive memory usage.

Acknowledgements
I’d like the thank SQL Server MVP Ola Hallengren for his suggestion for me to write this article.

SSMS 2016 is Free

The general available release of SQL Server Management Studio 2016 is free. No SQL Server license is required to install and use it for either development or production purposes. This GA version supports SQL Server versions 2008 through 2016 plus Azure SQL Database. You might be able to use it against earlier SQL versions too but that’s not officially supported and features will be limited. SSMS 2016, now based on the Visual Studio 2015 isolated shell, can be downloaded from the SSMS Download Page.

Prior to SQL Server 2016, one typically installed SSMS from the SQL Server installation media and was upgraded only by CUs, service packs, or SQL Server version upgrades. This changed with the release of SQL Server 2016; SSMS is a now a separate download. After installation, regular updates are made available (monthly) and can be installed from the SSMS Tools–>Check for Updates… menu option. These frequent updates allow the tooling to stay aligned with both on-prem and Azure SQL Database while delivering enhancements and bug fixes to SSMS much more quickly. I am quite impressed by the responsiveness of the SSMS development team in this regard, having used it since the initial preview versions of the VS 2015 isolated shell were made available.

An important change in this GA version compared to earlier versions of SSMS is that SET XACT_ABORT ON is now the default session setting. That’s a good thing, in my opinion, because it helps ensure one doesn’t accidentally leave a transaction open in a query window following an error (I have many horror stories about that). But it does change script execution behavior because queries, with or without error handling, will generally not continue after an error with XACT_ABORT ON. Be aware of this when running scripts from SSMS. This and other SET options can be specified from the SSMS Query–>Query options–>Execution–>Advanced. I strongly recommend specifying SET XAT_ABORT ON unless you have a specific reason not to do so.

Kudos to the Microsoft for providing SSMS as a free tool and to the SSMS development team for their ongoing commitment to SQL Server tooling improvements and timely delivery.

Add or Remove IDENTITY Property From an Existing Column Efficiently

Introduction
Refactoring is often needed to improve schema design or address changes in requirements. Occasionally, these schema modifications require removing the IDENTITY property from an existing non-empty table while retaining the existing values. Unfortunately, there is no DDL command to add or remove IDENTITY from an existing column. Traditionally, one must go through the pain of either recreating the table or jumping through hoops by adding a new column, updating the new column value with the old column value, and dropping the original column. This is especially problematic with large tables and short maintenance windows.

This article details how to use SWITCH as an alternative technique to add or remove IDENTITY from an existing column. I have been using SWITCH for many years but learned only recently that the identical schema restriction for SWITCH does not include IDENTITY column property; one can SWITCH data from one table to another even though the IDENTITY column property differs. This behavior allows the IDENTITY property to be added or removed.

Overview
All tables are partitioned from a database storage engine perspective since SQL Server 2005, although multiple partitions require Enterprise Edition. The implication of this storage architecture is that one can use ALTER TABLE…SWITCH regardless of SQL Server edition to move the entire contents of one table to another without physically moving rows as long as the source and target tables have an identical schema and are physically aligned (indexes and tables on same filegroup). SWITCH performs fast storage meta-data changes so the operation typically takes less than a second regardless of table size. The target table must be empty before SWITCH. After SWITCH, the target table will contain the data and the source table will be empty (similar to TRUNCATE).

Since the identical schema restriction does not include the IDENTITY column property, SWITCH is a handy technique to add or remove IDENTITY from an existing column of a non-empty table. For an empty table, a simple drop and create is easier and more efficient.

Remove IDENTITY from an Existing Column
The script below uses a staging table without the IDENTITY column property along with SWITCH to remove the IDENTITY property from the column and retain previously assigned identity values. I drop the empty original table and rename the staging table to the original name along with constraints and indexes after the operation.

--create sample table and data
CREATE TABLE dbo.AddIdentityExample(
	  IncrementalColumn int NOT NULL
		CONSTRAINT PK_AddIdentityExample PRIMARY KEY
	, OtherData varchar(20) NULL
	);
CREATE INDEX idx_AddIdentityExample_OtherData
	ON dbo.AddIdentityExample(OtherData);
INSERT INTO dbo.AddIdentityExample VALUES
	  (1, 'Sample data one')
	, (2, 'Sample data two')
	, (3, 'Sample data three');
GO

SET XACT_ABORT ON;
BEGIN TRY
	BEGIN TRAN;
	--create staging table with same schema, indexes, and constraints
	CREATE TABLE dbo.AddIdentityExampleStaging(
		  IncrementalColumn int IDENTITY NOT NULL --IDENTITY column property added
			CONSTRAINT PK_AddIdentityExampleStaging PRIMARY KEY
		, OtherData varchar(20) NULL
		);
	CREATE INDEX idx_AddIdentityExampleStaging_OtherData
		ON dbo.AddIdentityExampleStaging(OtherData);

	ALTER TABLE dbo.AddIdentityExample
		SWITCH TO dbo.AddIdentityExampleStaging;
	DROP TABLE dbo.AddIdentityExample;
	--rename table
	EXEC sp_rename
		  @objname = N'dbo.AddIdentityExampleStaging'
		, @newname = N'AddIdentityExample'
		, @objtype = 'OBJECT';
	--rename constraints
	EXEC sp_rename
		  @objname = N'dbo.PK_AddIdentityExampleStaging'
		, @newname = N'PK_AddIdentityExample'
		, @objtype = 'OBJECT';
	--rename indexes
	EXEC sp_rename
		  @objname = N'dbo.AddIdentityExample.idx_AddIdentityExampleStaging_OtherData'
		, @newname = N'idx_AddIdentityExampleStaging_OtherData'
		, @objtype = N'INDEX';
	--seed IDENTITY with current max column value
	DBCC CHECKIDENT(N'dbo.AddIdentityExample');
	COMMIT;
END TRY
BEGIN CATCH
	IF @@TRANCOUNT > 0 ROLLBACK;
	THROW;
END CATCH;
GO

Add IDENTITY to an Existing Column
This script uses a staging table with the IDENTITY column property along with SWITCH to add the IDENTITY property while retaining the existing column values. After the SWITCH operation, DBCC CHECKIDENT seeds the next IDENTITY value greater than the current value in the column.

Note that an IDENTITY column must now allow NULL values so, in the case of a nullable column, the column must first be altered to NOT NULL before using this method.

--create sample table and data
CREATE TABLE dbo.RemoveIdentityExample(
	  IncrementalColumn int IDENTITY NOT NULL
		CONSTRAINT PK_RemoveIdentityExample PRIMARY KEY
	, OtherData varchar(20) NULL
	);
CREATE INDEX idx_RemoveIdentityExample_OtherData
	ON dbo.RemoveIdentityExample(OtherData);
INSERT INTO dbo.RemoveIdentityExample VALUES
	  ('Sample data one')
	, ('Sample data two')
	, ('Sample data three');
GO

SET XACT_ABORT ON;
BEGIN TRY
	BEGIN TRAN;
	--create staging table with same schema, indexes, and constraints
	CREATE TABLE dbo.RemoveIdentityExampleStaging(
		  IncrementalColumn int NOT NULL --IDENTITY column property removed
			CONSTRAINT PK_RemoveIdentityExampleStaging PRIMARY KEY
		, OtherData varchar(20) NULL
		);
	CREATE INDEX idx_RemoveIdentityExampleStaging_OtherData
		ON dbo.RemoveIdentityExampleStaging(OtherData);

	ALTER TABLE dbo.RemoveIdentityExample
		SWITCH TO dbo.RemoveIdentityExampleStaging;
	DROP TABLE dbo.RemoveIdentityExample;
	--rename table
	EXEC sp_rename
		  @objname = N'dbo.RemoveIdentityExampleStaging'
		, @newname = N'RemoveIdentityExample'
		, @objtype = 'OBJECT';
	--rename constraints
	EXEC sp_rename
		  @objname = N'dbo.PK_RemoveIdentityExampleStaging'
		, @newname = N'PK_RemoveIdentityExample'
		, @objtype = 'OBJECT';
	--rename indexes
	EXEC sp_rename
		  @objname = N'dbo.RemoveIdentityExample.idx_RemoveIdentityExampleStaging_OtherData'
		, @newname = N'idx_RemoveIdentityExampleStaging_OtherData'
		, @objtype = N'INDEX';
	COMMIT;
END TRY
BEGIN CATCH
	IF @@TRANCOUNT > 0 ROLLBACK;
	THROW;
END CATCH;
GO



--create sample table and data
CREATE TABLE dbo.AddIdentityExample(
	  IncrementalColumn int NOT NULL
		CONSTRAINT PK_AddIdentityExample PRIMARY KEY
	, OtherData varchar(20) NULL
	);
CREATE INDEX idx_AddIdentityExample_OtherData
	ON dbo.AddIdentityExample(OtherData);
INSERT INTO dbo.AddIdentityExample VALUES
	  (1, 'Sample data one')
	, (2, 'Sample data two')
	, (3, 'Sample data three');
GO

SET XACT_ABORT ON;
BEGIN TRY
	BEGIN TRAN;
	--create staging table with same schema, indexes, and constraints
	CREATE TABLE dbo.AddIdentityExampleStaging(
		  IncrementalColumn int IDENTITY NOT NULL --IDENTITY column property added
			CONSTRAINT PK_AddIdentityExampleStaging PRIMARY KEY
		, OtherData varchar(20) NULL
		);
	CREATE INDEX idx_AddIdentityExampleStaging_OtherData
		ON dbo.AddIdentityExampleStaging(OtherData);

	ALTER TABLE dbo.AddIdentityExample
		SWITCH TO dbo.AddIdentityExampleStaging;
	DROP TABLE dbo.AddIdentityExample;
	--rename table
	EXEC sp_rename
		  @objname = N'dbo.AddIdentityExampleStaging'
		, @newname = N'AddIdentityExample'
		, @objtype = 'OBJECT';
	--rename constraints
	EXEC sp_rename
		  @objname = N'dbo.PK_AddIdentityExampleStaging'
		, @newname = N'PK_AddIdentityExample'
		, @objtype = 'OBJECT';
	--rename indexes
	EXEC sp_rename
		  @objname = N'dbo.AddIdentityExample.idx_AddIdentityExampleStaging_OtherData'
		, @newname = N'idx_AddIdentityExampleStaging_OtherData'
		, @objtype = N'INDEX';
	DBCC CHECKIDENT(N'dbo.AddIdentityExample');
	COMMIT;
	UPDATE STATISTICS dbo.AddIdentityExample;
END TRY
BEGIN CATCH
	IF @@TRANCOUNT > 0 ROLLBACK;
	THROW;
END CATCH;
GO

ORDER BY Is Required

I often see folks assume rows will be returned in order by the clustered index when ORDER BY is not specified in the query. This is despite the fact that the SQL Server Books Online ORDER BY topic specifically states “The order in which rows are returned in a result set are not guaranteed unless an ORDER BY clause is specified.”

For those that want to save a few keystrokes and omit the needed ORDER BY, here’s one example that shows rows may be returned in an order other than the clustered index.

USE tempdb;
GO

CREATE DATABASE Demo
	ON
		  (NAME='Demo_Primary1', FILENAME='D:\SqlDataFiles\Demo_Primary', SIZE=10MB)
		, (NAME='Demo_Primary2', FILENAME='D:\SqlDataFiles\Demo_Primary2', SIZE=10MB)
	LOG ON
		  (NAME='Demo_Log', FILENAME='L:\SqlLogFiles\Demo_Log.ldf', SIZE=1MB);
GO

USE Demo;
GO

CREATE TABLE OrderTest
(
	ID int,
	Name VARCHAR(50),
	OtherData char(5000)
);
CREATE CLUSTERED INDEX cdx_OrderTest ON OrderTest(ID);
GO

--load 512 odd numbers
WITH 
	 t4 AS (SELECT n FROM (VALUES(0),(0),(0),(0)) t(n))
	,t1K AS (SELECT ROW_NUMBER() OVER (ORDER BY (a.n)) AS num
		FROM t4 AS a CROSS JOIN t4 AS b CROSS JOIN t4 AS c CROSS JOIN t4 AS d CROSS JOIN t4 AS e)
INSERT INTO OrderTest WITH(TABLOCKX)
SELECT num, 'test', 'test'
FROM t1K
WHERE num % 2 = 1;
GO

----load 512 even numbers
WITH 
	 t4 AS (SELECT n FROM (VALUES(0),(0),(0),(0)) t(n))
	,t1K AS (SELECT ROW_NUMBER() OVER (ORDER BY (a.n)) AS num
		FROM t4 AS a CROSS JOIN t4 AS b CROSS JOIN t4 AS c CROSS JOIN t4 AS d CROSS JOIN t4 AS e)
INSERT INTO OrderTest WITH(TABLOCKX)
SELECT num, 'test', 'test'
FROM t1K
WHERE num % 2 = 0;
GO

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
GO

SELECT * FROM dbo.OrderTest;
/*
ID	Name	OtherData
2	test	test
4	test	test
6	test	test
8	test	test
17	test	test
19	test	test
21	test	test
23	test	test
25	test	test
27	test	test
...
*/
SELECT * FROM dbo.OrderTest ORDER BY ID;
/*
ID	Name	OtherData
1	test	test
2	test	test
3	test	test
4	test	test
5	test	test
6	test	test
7	test	test
8	test	test
9	test	test
10	test	test
...
*/
GO

The reason the query without the ORDER BY returns rows in a different order than the clustered index logical order is because SQL Server chose to scan the table in physical allocation order using the IAM (Index Allocation Map) instead of following the clustered index linked list. For details of the allocation and page linkage of this table, you can use the undocumented (don’t use in in prod) sys.dm_db_database_page_allocations table-valued function in SQL 2012 and later:

SELECT * 
FROM sys.dm_db_database_page_allocations(DB_ID(), OBJECT_ID(N'OrderTest'), 1, 1, 'DETAILED')
WHERE page_type_desc = 'DATA_PAGE';

The IAM scan was done here due to no ordering requirement for this query and the allocation order scan was deemed more efficient, and permissible in the READ UNCOMMITTED isolation level because data integrity isn’t needed. Other factors can also affect the ordering of results, including available indexes, execution plan operators, parallelism, and other concurrent activity.

Summary
Remember that SQL Server strives to execute queries as efficiently as possible as long as it adheres to the data contract. The chosen plan and storage engine internals that vary by SQL Server version and edition will influence ordering of results. The ordering of rows is by happenstance unless ORDER BY is specified.

QUERYTRACEON Permissions

I recently had a SQL Server 2014 stored procedure that ran for several hours with high CPU usage. I narrowed the problem down to a single query with dozens of joins was stuck in compilation. Suspecting an issue with the new cardinality estimator, I changed the database compatibility level from 120 (SQL Server 2014) to 110 (SQL Server 2012) so that the legacy CE would be used. The query then compiled immediately and executed quickly, which confirmed the new CE as the culprit of the excessive compilation times.

OPTION QUERYTRACEON (9481) to the Rescue
Changing the database compatibility level was a heavy-handed option I didn’t want to pursue because some queries benefited from the new CE behavior. Instead, I specified trace flag 4891 using the OPTION QUERYTRACEON(9481) query hint to use the legacy CE for the problem query. This is an acceptable workaround for me until the bug is addressed. However, you might not be so lucky. A more common use case for OPTION QUERYTRACEON(9481) is performance regression with the new CE. Although not documented as far as I know, the QUERYTRACEON query hint requires sysadmin permissions unless executed from within a stored procedure. Consequently, QUERYTRACEON won’t be an option for those who need the hint for queries run by non-sysadmins outside stored procs. I found SQL Server MVP Ola Hallengren’s Connect suggestion to allow QUERYTRACEON with no additional permissions. I upvoted and encourage you to do so too.