Improper SET Option Errors

SQL Server backwards compatibility SET options are hidden land mines that explode when one tries to use a feature that requires proper session settings, such as a filtered index, indexed view, etc. The QUOTED_IDENTIFIER, ANSI_NULLS, and ANSI_PADDING settings are especially problematic. These are persisted as meta-data with view, stored procedure, function, trigger, and column definitions and, since persisted settings override current session settings, a nasty runtime error like “…failed because the following SET options have incorrect settings: ‘QUOTED_IDENTIFIER…’” occurs when a persisted setting is OFF even though the current session settings are set properly. “Sticky” OFF settings are a common problem and are usually accidental rather than required by T-SQL code that doesn’t follow ISO SQL standards.

This article reviews the QUOTED_IDENTIFIER, ANSI_NULLS, and ANSI_PADDING settings, how they are persisted as database object meta-data, and considerations with SQL Server tools. I’ll discuss how to identify and remediate problem objects, along with considerations for ensuring proper ON settings going forward.

Background
Microsoft SQL Server, along with its Sybase ancestor, is a bit long in the tooth nowadays. The original code base was developed decades ago before ISO (previously ANSI) SQL standards were formalized. As SQL Server evolved to respect ISO SQL standards, SET options were introduced to avoid breaking existing applications that expected non-ISO behavior. This allows legacy T-SQL code to run on newer SQL Server versions without changes while supporting ISO SQL standards for new development. The SQL Server product team goes to great effort, albeit sometimes to a fault, to provide backwards compatibility so as not to block upgrades.

Some SQL Server features require ISO SQL standard settings, plus other session settings, to be set properly in order to be used and to avoid runtime errors. Features that require these settings include:

  • Filtered Indexes
  • Indexed Views
  • Indexes on computed columns
  • XML indexes
  • Query notifications (a.k.a. SqlDependency)

The below session settings, sometimes called the “magic 7 settings”, must be set properly when using these features (the sketch after the list shows all seven set explicitly); otherwise, the above indexes will not be used or a runtime error will be raised when data are modified:

  • QUOTED_IDENTIFIER ON
  • ANSI_NULLS ON
  • ANSI_PADDING ON
  • ANSI_WARNINGS ON
  • ARITHABORT ON
  • CONCAT_NULL_YIELDS_NULL ON
  • NUMERIC_ROUNDABORT OFF
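
For reference, this is a minimal sketch of the seven SET statements spelled out explicitly. Modern drivers already apply these defaults, so this is mainly useful at the top of scripts run through older connections or tools:

-- explicitly apply the "magic 7" settings for the current session
SET QUOTED_IDENTIFIER ON;
SET ANSI_NULLS ON;
SET ANSI_PADDING ON;
SET ANSI_WARNINGS ON;
SET ARITHABORT ON;
SET CONCAT_NULL_YIELDS_NULL ON;
SET NUMERIC_ROUNDABORT OFF;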

Modern Microsoft SQL Server drivers (e.g. ODBC, OLE DB, SqlClient, JDBC) connect with all of these session settings properly set by default. The lone exception is ARITHABORT, which is OFF by default in the context of a database at the SQL Server 2000 compatibility level (80). ARITHABORT will not be an issue for most since the unsupported SQL Server 2008 version was the last to include the SQL Server 2000 database compatibility level; ARITHABORT is ON by default in SQL Server 2012 and later versions.

QUOTED_IDENTIFIER, ANSI_NULLS, and ANSI_PADDING Session Settings
Given that these session settings are ON by default, an OFF setting is the result of one or more of the following:

• An explicit T-SQL SET OFF statement after the connection is made
• Non-default OFF setting specified by API connection (e.g. connection string keyword, DSN property, or set in app code programmatically)
• Overridden by a persisted object meta-data setting

Improper SET OFF statements most often leak into T-SQL code due to inattention to detail. The same is true for OFF API connection settings and persisted meta-data settings. T-SQL code that adheres to ISO SQL standard single-quote literal enclosures and IS NULL/IS NOT NULL comparison operators can run with these settings either ON or OFF with the same outcome. Consequently, there’s no reason not to use the proper ON default settings when these ISO SQL practices are followed. One can simply remove unintentional T-SQL SET OFF statements, fix API settings, and correct persisted meta-data (discussed later) to future-proof code so that it can run with or without features that require the proper settings.

Legacy T-SQL code that requires OFF settings due to non-ISO compliant constructs needs to be remediated to leverage the aforementioned features before changing settings to ON. Below is a summary of these settings and considerations for changing to use the ON setting.

QUOTED_IDENTIFIER
QUOTED_IDENTIFIER OFF is required only when T-SQL code uses double-quotes instead of single quotes to enclose literals. It’s typically a trivial effort to fix non-conformant code to use single quotes instead of double quotes with a search/replace. A caveat is that single quotes embedded within literals must be escaped with two consecutive single quotes. The code will then run regardless of the QUOTED_IDENTIFIER setting and follow ISO SQL standards. Only minimal testing is usually needed after remediation since the setting is evaluated at parse time; parsing errors will occur immediately if double-quotes are used to enclose literals. An exception is dynamic SQL, where errors won’t be found until one tries to execute an invalid dynamic SQL statement containing double-quote literal enclosures.
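
As a hypothetical before/after illustration of such a search/replace fix (dbo.Customer and the literal value are made up for this example):

-- requires QUOTED_IDENTIFIER OFF because double-quotes enclose the literal
SELECT CustomerID FROM dbo.Customer WHERE LastName = "O'Brien";

-- runs with QUOTED_IDENTIFIER ON or OFF: single-quote enclosures with the embedded quote escaped
SELECT CustomerID FROM dbo.Customer WHERE LastName = 'O''Brien';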

The current session QUOTED_IDENTIFIER setting is persisted as meta-data when a view, stored procedure, function, or trigger is created or altered. To change the persisted OFF setting to ON, recreate the object from a session with QUOTED_IDENTIFIER and ANSI_NULLS ON. Be mindful to ensure the session setting is ON when executing DDL scripts to prevent improper settings going forward. See the considerations topic later in this article for gotchas with SQL Server tools.
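
Below is a minimal sketch of recreating a stored procedure so the proper settings are persisted with its meta-data (dbo.usp_Example is a placeholder name):

-- the parse-time settings in effect for this batch are persisted with the object
SET QUOTED_IDENTIFIER ON;
SET ANSI_NULLS ON;
GO
ALTER PROCEDURE dbo.usp_Example
AS
    SELECT 1 AS ExampleColumn;
GO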

ANSI_NULLS
ANSI_NULLS OFF has long been deprecated. The OFF setting allows code to test for NULL values using equality/inequality predicates instead of the ISO SQL standard “IS NULL” and “IS NOT NULL” operators. For example, the predicate “ColumnName = NULL” will evaluate to TRUE instead of UNKNOWN for NULL column values under the ANSI_NULLS OFF setting. Such code should be changed to “ColumnName IS NULL” to follow ISO SQL standards and provide the same behavior regardless of the session ANSI_NULLS setting. Changes made for ANSI_NULLS compliance necessitate more extensive testing than QUOTED_IDENTIFIER remediation because they affect runtime behavior rather than surfacing as parse-time errors.
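
A hypothetical example of this remediation (dbo.SalesOrder and ShippedDate are made-up names):

-- relies on deprecated ANSI_NULLS OFF behavior to match NULL rows
SELECT SalesOrderID FROM dbo.SalesOrder WHERE ShippedDate = NULL;

-- ISO SQL standard NULL test; same result regardless of the ANSI_NULLS setting
SELECT SalesOrderID FROM dbo.SalesOrder WHERE ShippedDate IS NULL;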

Like QUOTED_IDENTIFIER, the current session ANSI_NULLS setting is persisted as meta-data when a view, stored procedure, function, or trigger is created or altered. Recreate the object from a session with QUOTED_IDENTIFIER and ANSI_NULLS ON to change the persisted OFF setting to ON, and take care to ensure the setting is ON when executing DDL scripts.

ANSI_PADDING
ANSI_PADDING OFF has also been deprecated for quite some time and the SQL Server documentation specifically calls out “ANSI_PADDING should always be set to on.” In summary, a column-level ANSI_PADDING OFF setting causes nullable fixed-length char(n) and binary(n) columns to behave like variable-length varchar(n) and varbinary(n) columns. Furthermore, SQL Server automatically trims trailing blank characters from character data and trailing binary zeros from binary data and stores the values as variable length instead of storing the provided value as-is during inserts and updates. Varchar(n)/varbinary(n) columns with ANSI_PADDING OFF are similarly trimmed. Note that it is the persisted ANSI_PADDING column meta-data setting that determines the storage and trimming behavior, not the current session ANSI_PADDING setting. The session ANSI_PADDING setting must still be ON when using features that require proper settings.
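
A small sketch of the trimming behavior, using a hypothetical dbo.PaddingDemo table:

-- ANSI_PADDING is evaluated at execute time, so the OFF setting applies to the CREATE TABLE below
SET ANSI_PADDING OFF;
CREATE TABLE dbo.PaddingDemo (CharCol char(10) NULL);
SET ANSI_PADDING ON;

INSERT INTO dbo.PaddingDemo (CharCol) VALUES ('abc   ');

-- returns 3 because the persisted column setting is OFF: trailing blanks were trimmed and the
-- value stored as variable length; with a persisted ANSI_PADDING ON column it would return 10
SELECT DATALENGTH(CharCol) AS StoredLength FROM dbo.PaddingDemo;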

The current session ANSI_PADDING setting is persisted as column-level meta-data when tables are created and new columns are added to an existing table. This setting affects only char(n) NULL, binary(n) NULL, and varchar(n)/varbinary(n) columns regardless of nullability. The setting doesn’t apply to varchar(MAX), varbinary(MAX), char(n) NOT NULL, binary(n) NOT NULL, and other data types.

Since SQL Server comparison operators ignore trailing blanks when comparing strings, as well as trailing binary zeros when comparing binary values, there isn’t usually an impact from a T-SQL perspective when changing ANSI_PADDING from OFF to ON (aside from storage when the provided values aren’t already trimmed). However, application code might consider trailing blanks and trailing binary zeros significant, resulting in differences when comparing trimmed and non-trimmed values. Testing is needed depending on how data are used in the app code.

To change a persisted column ANSI_PADDING setting from OFF to ON, execute ALTER TABLE…ALTER COLUMN from an ANSI_PADDING ON session, specifying the same definition as the existing column. Note that this technique will only change ANSI_PADDING from OFF to ON. Altering an existing ANSI_PADDING ON column from an ANSI_PADDING OFF session will not change the persisted setting.
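
Continuing the hypothetical dbo.PaddingDemo example, the remediation is a sketch like this:

SET ANSI_PADDING ON;
-- re-specify the existing column definition to persist ANSI_PADDING ON for the column
ALTER TABLE dbo.PaddingDemo ALTER COLUMN CharCol char(10) NULL;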

Considerations to Ensure Proper Settings
All “magic 7 settings” are set properly by default with current drivers so one might think ensuring proper settings is easy. This is largely true for application code but, sadly, not with SQL Server tools due to backward compatibility behavior and inattention to detail when using them.

SQL Server Agent uses ODBC (which sets all “magic 7 settings” properly) but then explicitly sets QUOTED_IDENTIFIER OFF after connecting for backwards compatibility. The implication is one needs to explicitly add a SET QUOTED_IDENTIFIER ON statement to T-SQL scripts executed by SQL Server Agent T-SQL job steps. This is optional when executing stored procedures because the sticky QUOTED_IDENTIFIER ON setting will override the session setting.
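
For example, a T-SQL job step script might begin as follows (the UPDATE is a hypothetical statement against a table with a filtered index):

-- compensate for SQL Server Agent setting QUOTED_IDENTIFIER OFF after connecting
SET QUOTED_IDENTIFIER ON;

UPDATE dbo.WorkQueue
SET IsProcessed = 1
WHERE WorkQueueID = 42;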

SQLCMD similarly uses ODBC and explicitly sets QUOTED_IDENTIFIER OFF. This is a common cause of inadvertent persisted QUOTED_IDENTIFIER OFF leaking into databases as meta-data when SQLCMD is used to deploy database changes. One must specify the SQLCMD -I (uppercase eye) argument to opt-in for QUOTED_IDENTIFIER ON. Additionally, deployment scripts should either explicitly include SET QUOTED_IDENTIFIER ON, SET ANSI_NULLS ON, and SET ANSI_PADDING ON statements or omit these set statements entirely so session settings are used. Avoid including SET OFF statements in scripts for these options.
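
One way to guard against this is a deployment script header along these lines (a sketch; the -I argument alone covers QUOTED_IDENTIFIER):

-- explicitly set the options that are persisted as object and column meta-data
SET QUOTED_IDENTIFIER ON;
SET ANSI_NULLS ON;
SET ANSI_PADDING ON;
GO
-- CREATE/ALTER statements follow...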

BCP also uses ODBC, but as you might have guessed, explicitly sets QUOTED_IDENTIFIER OFF too. One needs to opt-in for QUOTED_IDENTIFIER ON by specifying the -q (lower case queue) BCP argument to avoid runtime errors.

SSMS, which uses SqlClient, is nicer in that it sets all options properly by default and doesn’t set QUOTED_IDENTIFIER OFF behind your back. But be aware that SSMS will honor customized SET options specified for the current window (Query->Query Options->ANSI) and new windows (Tools->Options->Query Execution->SQL Server->ANSI). Make sure the magic 7 settings are properly set in SSMS options.

Be mindful that SET statements executed in an SSMS query window change the session settings for the lifetime of the query window connection. Take care when executing DDL scripts in a reused query window and, when in doubt, check current session settings using DBCC USEROPTIONS to verify proper settings.

SSMS (and SMO) scripting tools have a terrible habit of including an extraneous SET ANSI_PADDING OFF at the end of CREATE TABLE scripts. Either remove the statement after scripting or set the ANSI PADDING generation option to False (Tools->Options->SQL Server Object Explorer->Scripting->Generate SET ANSI PADDING commands). This will help avoid unintentionally creating ANSI_PADDING OFF columns.

How to Identify Improper Persisted Settings
It’s common to find wrong settings in existing databases for the reasons mentioned earlier. The catalog view queries below will identify objects with problem persisted OFF settings.

--stored procedures, views, functions, triggers with QUOTED_IDENTIFIER or ANSI_NULLS OFF
SELECT
      OBJECT_SCHEMA_NAME(o.object_id) AS SchemaName
    , OBJECT_NAME(o.object_id) AS ObjectName
    , o.type_desc AS ObjectType
FROM sys.objects AS o
WHERE
    0 IN (
          OBJECTPROPERTY(o.object_id, 'ExecIsQuotedIdentOn')
        , OBJECTPROPERTY(o.object_id, 'ExecIsAnsiNullsOn')
    )
ORDER BY
      SchemaName
    , ObjectName;

--columns with ANSI_PADDING OFF
SELECT
      OBJECT_SCHEMA_NAME(t.object_id) AS SchemaName
    , OBJECT_NAME(t.object_id) AS ObjectName
    , c.name AS ColumnName
FROM sys.tables AS t
JOIN sys.columns AS c ON c.object_id = t.object_id
JOIN sys.types AS ty ON ty.system_type_id = c.system_type_id AND ty.user_type_id = c.user_type_id
WHERE
    c.is_ansi_padded = 0
    AND (
        (ty.name IN ('varbinary','varchar') AND c.max_length <> -1)
        OR (ty.name IN ('binary','char') AND c.is_nullable = 1)
    );

Summary
Attention to session settings, API connection settings, and persisted object meta-data will allow you to use SQL Server features like filtered indexes and indexed views without surprise runtime errors, and will future-proof your T-SQL code going forward.

Performance testing with DBCC DROPCLEANBUFFERS

Executing DBCC DROPCLEANBUFFERS to clear the buffer cache is a common practice when unit testing SQL Server performance on an isolated test instance. This allows one to evaluate different candidates for query, stored procedure, and index tuning based on execution times in a worst-case cold buffer cache scenario and provides better test repeatability by leveling the playing field before each test. However, clearing the cache this way has considerations one should be aware of.

An important detail sometimes overlooked is that one must first execute a CHECKPOINT command in the context of the database(s) to be tested before executing DBCC DROPCLEANBUFFERS. DBCC DROPCLEANBUFFERS only frees pages that are not dirty (cached version same as on disk version) so modified pages will remain in cache when CHECKPOINT isn’t first executed. Overlooking the CHECKPOINT can result in non-repeatable test timings. One should always run CHECKPOINT before DBCC DROPCLEANBUFFERS.
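
The typical pattern, executed in the context of the database under test, looks like this:

-- flush dirty pages to disk so DBCC DROPCLEANBUFFERS can remove them from the buffer cache
CHECKPOINT;
DBCC DROPCLEANBUFFERS;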

One can make the argument that DBCC DROPCLEANBUFFERS might not be particularly valuable for testing. First, the storage engine in SQL Server Enterprise Edition (or Developer Edition, which is often used when testing) behaves differently with a cold cache versus a warm one. With a warm cache, a page not already in cache (e.g. index seek by primary key) will be fetched from disk using a single 8K page IO request as one expects. However, when the cache isn’t fully warmed up (Buffer Manager’s Target Pages not yet met), the entire 64K extent (8 contiguous 8K pages) is read for the single page request regardless of whether the adjacent pages are actually needed by the query. This has the benefit of warming the cache much more quickly than would otherwise occur, but given that the normal steady state of a production SQL Server is a warm cache, testing with a cold cache isn’t a fair comparison of different plans. More data than normal will be transferred from storage so timings may not be indicative of actual performance.

The storage engine also behaves differently during scans when data are not already cached, regardless of the SQL Server edition. During sequential scans, read-ahead prefetches multiple extents from storage at a time so that data is in cache by the time it is actually needed by the query. This greatly reduces the time needed for large scans because fewer IOPS are required and sequential access reduces costly seek time against spinning media. Furthermore, Enterprise and Developer editions perform read-ahead more aggressively than lesser editions, up to 4MB (512 pages) in a single scatter-gather IO in later SQL Server versions.

The implication with cold cache performance testing is that both full extent reads and read-ahead prefetches are much more likely to occur, such that test timings of different execution plans are not fairly comparable. These timings will overemphasize hardware (storage) performance rather than query performance as intended. Given hardware differences on a test system and that a cold cache is not the typical production state, cold cache testing isn’t a good indicator of the query performance and resource usage one will experience in a production system.

I recommend using logical reads as the primary performance measure when evaluating query and index tuning candidates. The logical read count is the number of pages touched by the query regardless of whether the data were read from storage or already cached, making it a better comparative indicator of data access resource utilization. The number of logical reads can be determined by running the query or procedure with SET STATISTICS IO ON and will be consistent regardless of whether physical IO was needed or not. Query times may be used as a secondary measure by running the query more than once, discarding the results of the first run, and taking the average of subsequent executions. This is not to say these logical read measurements and timings will predict actual production performance, but they will allow one to more accurately evaluate the resource usage of different execution plans.
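
For example, here is a sketch of capturing logical reads for a query under test (dbo.SalesOrder is a placeholder table):

SET STATISTICS IO ON;

-- logical reads for each table referenced by the query are reported in the messages output
SELECT CustomerID, SUM(OrderTotal) AS TotalSales
FROM dbo.SalesOrder
GROUP BY CustomerID;

SET STATISTICS IO OFF;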

AddWithValue is Evil

AddWithValue is the root cause of many SQL Server performance and concurrency problems. Although the SqlParameterCollection.AddWithValue method is a slightly more convenient way to add parameters and values to a SqlCommand, the implications are insidious because the high cost in SQL Server resource utilization is not apparent during development. It’s often not until the application is in production and under load that one realizes queries are slow due to full scans, excessive resource usage, lock escalation, blocking, cache bloat, and many deadlocks.

A best practice in SQL Server development has long been strongly-typed parameters. Not only does this practice thwart SQL injection (barring non-parameterized dynamic SQL statements), performance is generally improved due to caching of queries that differ only by values, avoiding the cost of compiling the query each time it is executed. Importantly, matching parameter definitions with referenced columns promotes efficiency by avoiding implicit conversions that can preclude efficient index use.

To follow best practices, avoid AddWithValue and instead add parameters and values using a method that allows specification of the desired SqlDbType along with the appropriate length, precision, and scale. Although slightly more verbose than AddWithValue, the performance benefits are huge and worth a few keystrokes.

The Problem with AddWithValue

The nastiness with AddWithValue is that ADO.NET infers the parameter definition from the supplied object value. Parameters in SQL Server are inherently strongly-typed, including the SQL Server data type, length, precision, and scale. Types in .NET don’t always map precisely to SQL Server types, and are sometimes ambiguous, so AddWithValue has to make guesses about the intended parameter type.

The guesses AddWithValue makes can have huge implications when wrong because SQL Server uses well-defined data type precedence rules when expressions involve unlike data types; the value with the lower precedence is implicitly converted to the higher type. The implicit conversion itself isn’t particularly costly but becomes a major performance concern when it is the column value rather than the parameter value that must be converted, especially in a WHERE or JOIN clause predicate. The implicit column value conversion can prevent indexes on the column from being used with an index seek (i.e. a non-sargable expression), resulting in a full scan of every row in the table or index.
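
As a rough T-SQL illustration of the sargability concern (dbo.Customer, its LastName varchar(30) column, and the supporting index are hypothetical):

DECLARE @LastNameVarchar varchar(30) = 'Smith';
DECLARE @LastNameNVarchar nvarchar(30) = N'Smith';

-- parameter type matches the varchar column; an index seek on LastName is expected
SELECT CustomerID FROM dbo.Customer WHERE LastName = @LastNameVarchar;

-- nvarchar has higher precedence, so the varchar column side may be implicitly converted,
-- which can prevent an efficient index seek (particularly with a legacy SQL collation)
SELECT CustomerID FROM dbo.Customer WHERE LastName = @LastNameNVarchar;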

Below are some of the most common pitfalls with AddWithValue.

.NET Strings

Strings are arguably the single biggest problem with AddWithValue. AddWithValue infers SqlDbType.NVarChar when a string object is provided because strings in .NET are Unicode. The resultant SQL Server parameter type is nvarchar(n) or nvarchar(MAX), depending on the length of the string value provided. The type is nvarchar(n) when the string length is 4000 characters or less and nvarchar(MAX) when over 4000 characters.

Performance problems due to AddWithValue inferring nvarchar are quite common in the wild when the parameter value is compared to varchar columns in queries. Remember that nvarchar has a higher precedence than varchar, which can prevent indexes from being used efficiently. Although SQL Server can still produce a sargable expression in this case when the indexed varchar column has a Windows collation (but not a legacy SQL collation), it is still best to specify the proper varchar parameter type as described below.

Another issue with strings and AddWithValue is cache bloat and lack of plan reuse. AddWithValue uses the actual length of the supplied value as the nvarchar parameter length. This results in a separate cached plan for queries that differ only by parameter length. For example, executing a query with 20 different @LastName parameter value lengths results in 20 different cached plans. Consider that the number of plans is greatly exacerbated with many string parameters (e.g. @LastName, @FirstName, @Address) and is especially an issue with large parameterized IN clauses.

A better method to add string parameters is with the SqlParameterCollection.Add method. In addition to the desired name and data type, this overload allows one to specify the length, which should match the column maximum length (-1 for MAX types). The value can be easily assigned to the returned SqlParameter instance Value property in a one-liner:

command.Parameters.Add("@LastName", SqlDbType.VarChar, 30).Value = "Smith";
command.Parameters.Add("@GenderCode", SqlDbType.Char, 1).Value = "M";

.NET DateTime

AddWithValue infers the SqlDbType.DateTime parameter type when a DateTime object value is provided. Although a .NET DateTime aligns more closely with SQL Server datetime2(7), AddWithValue uses datetime for backwards compatibility.

The SQL Server datetime type has a fixed precision of 3 with accuracy to 1/300 of a second, so ADO.NET will round the value accordingly and sub-millisecond values will be lost, resulting in a value that ends with 0, 3, or 7 milliseconds. This behavior is compatible with a SQL Server datetime column value but loses accuracy and precision when used with a datetime2 column of precision 3 or greater. Values will be rounded to the lesser datetime precision and sub-millisecond values will be lost.

One can explicitly specify SqlDbType.DateTime2 to avoid loss of accuracy when working with the SQL Server datetime2 type. Note that the parameter Scale property is used to specify the datetime2 fractional seconds precision in ADO.NET rather than the Precision property as one might expect. The default Scale is 7, the same as the default datetime2 precision in SQL Server. Although the value will automatically be rounded when the target column precision is less, I recommend matching the column precision regardless.

This example uses Add instead of AddWithValue, specifying Scale of 3 to match a target column datetime2(3), which is more precise than datetime with 1 byte less storage.

command.Parameters.Add(new SqlParameter("@InsertTimestamp", SqlDbType.DateTime2) { Scale = 3, Value = DateTime.Now });

.NET Decimal

AddWithValue infers a SqlDbType.Decimal with the same precision and scale as the supplied object value. Varying precision and scale properties can bloat cache similarly to string lengths as described earlier so it’s best to explicitly specify the precision and scale matching the column.

Here’s an example that adds a decimal value without AddWithValue, specifying precision and scale (decimal(18,4)):

command.Parameters.Add( new SqlParameter("@InvoiceTotal", SqlDbType.Decimal) { Precision = 18, Scale = 4, Value = 1234.56m });

Summary

I hope this article helps you understand the importance of the strongly-typed parameter best practice. Not only will this help avoid performance and concurrency problems, the attention to detail will save compute costs in the cloud.