The Curious Case of Undetected SQL Exceptions

Undetected database errors are insidious. It can be really bad when an error gets dropped on the floor, resulting in incomplete or wrong results. Consider that this simple SELECT query returns an empty result set instead of raising a SqlException for the divide by zero error:

One generally assumes SQL errors raised during batch execution will also raise a SQL exception in the application. However, there are cases involving multi-statement batches and stored procedures where errors that occur might not be raised in the client application, as the above example shows. These scenarios can be distilled as:

1) An error is caught by T-SQL TRY/CATCH while executing a row-returning statement.
2) An error occurs after a row-returning statement successfully executes.

To ensure SQL exceptions are raised as expected, one must either code T-SQL to avoid these scenarios entirely or ensure the client application data access layer consumes all subsequent results returned by SQL Server even when only a single result set is expected. Row-returning statements include SELECT (not variable assignment), OUTPUT clause in an INSERT/UPDATE/DELETE/MERGE statement that returns rows to the client, as well as some specialized commands like RESTORE FILELISTONLY, DBCC commands with TABLE_RESULTS, etc.

Below is a C# ADO.NET example of this defensive programming technique. Even though a single result set is expected, the code still invokes NextResult afterwards to process the entire result stream and ensure SQL exceptions are raised in the app when errors occur.

This consideration applies to all SQL Server API and programming languages although I focus on C# and ADO.NET with System.Data.SqlClient (.NET Framework Data Provider for SQL Server) in this article. The specific methods for consuming all results will vary depending on the API (e.g. getMoreResults() in JDBC) but the basic concept is the same.

Regardless of the method one uses to execute SQL Server queries, ADO.NET uses a data reader to return command results even when higher-level objects (e.g. Dataset) or ORMs (e.g. Entity Framework) are used. The low-level ADO.NET command ExecuteReader method exposes the data reader whereas ExecuteScalar and ExecuteNonQuery do not expose the internal reader.

ExecuteScalar returns the first column of the first row returned as a scalar value but doesn’t call NextResult on the internal data reader to retrieve subsequent results. Consequently, errors may go undetected with ExecuteScalar. ExecuteScalar will not raise an exception if a T-SQL error occurs after the first row is returned. Also, if no rows are returned because the row-returning statement erred and the error was caught in T-SQL, ExecuteScalar returns a null object without raising an exception.

ExecuteNonQuery executes the entire batch of statements and returns the accumulated count of affected rows as a scalar value, discarding rows returned (if any). The returned value will be -1 if SET NOCOUNT ON is specified. Because ExecuteNonQuery internaly consumes all results in the process, errors will be raised without additional ADO.NET programming, albeit one doesn’t typically use ExecuteNonQuery to execute a batch that returns rows. Again, the ADO.NET error detection issue only applies to row-returning statements.

The remainder of this article discusses T-SQL error handling and ADO.NET defensive programming techniques in more detail and discusses techniques to avoid undetected database errors in ADO.NET.

T-SQL Error Handling Objectives
T-SQL and ADO.NET data access code must work in concert with one another to ensure SQL errors are detected in application code. The T-SQL constructs used in multi-statement batches can affect if and how when errors are reported by ADO.NET during batch execution. I’ll start by citing core T-SQL error handling objectives, which can be summarized as:

1) Ensure a multi-statement T-SQL batch doesn’t continue after an error occurs.
2) Rollback transaction after errors.
3) Raise error so that the client application is aware a problem occurred.

The T-SQL building blocks used to achieve these objectives are:
2) Structured error handling (SEH) (a.k.a. TRY/CATCH)
3) Control-of-flow (e.g. IF @@ERROR GOTO ErrorHandler)

T-SQL Behavior Without SEH and XACT_ABORT ON
When a runtime error occurs with the XACT_ABORT session setting ON outside a TRY block, SQL Server will stop batch execution immediately, rollback the transaction (if any), and raise the error. Consequently, a single SET XACT_ABORT ON statement will meet all aforementioned error handling objectives without T-SQL procedural code. However, the XACT_ABORT setting is not considered when user-defined errors are raised with RAISERROR so control-of-flow statements are required to meet objectives #2 and #3 when RAISERROR is employed.

SET XACT_ABORT ON also rolls back open transactions following an attention event like an explicit cancel or query timeout, which would otherwise leave the transaction open. This is one reason why I strongly recommend using SET XACT_ABORT ON, especially in procs that include BEGIN TRAN, regardless of whether or not SEH is also used.

T-SQL Behavior Without SEH and XACT_ABORT OFF
When an error occurs with the SET XACT_ABORT session setting OFF and SEH is not used, SQL Server will raise the error immediately but, depending on the error and severity, batch execution might continue and the transaction not rolled back. The T-SQL batch must use control-of-flow statements after each statement to avoid continuing after errors and roll back the transaction (objectives #1 and #2).

T-SQL Behavior With T-SQL Structured Error Handling
When an error occurs during statement execution with a T-SQL structured error handler is in scope, the CATCH block error handler is entered, after marking the transaction uncommittable if SET XACT_ABORT is ON. SEH meets the first error handling objective by skipping subsequent statements in the TRY block after an error. It is the responsibility of the error handling code in the CATCH BLOCK to roll back the transaction if needed and raise the error. The simple T-SQL handler below achieves objectives #2 and #3 in Azure SQL Database and SQL Server 2012 and later:

THROW is not available In SQL 2008 R2 and earlier so one must use RAISERROR instead in older versions. The error handler below provides similar functionality a THROW, although RAISERROR obfuscates the original error as a user error with message number 50000+.

T-SQL Interaction with ADO.NET
ADO.NET will reliably detect errors in batches without T-SQL SEH when no result sets are returned or only the last statement in the batch returns rows. When a row returning statement is not the last statement in the batch, ADO.NET code must call NextResult to ensure ADO.NET raises errors for subsequent statements in the batch that may have erred. Avoid row returning statements that are not the last one in a batch unless you are certain the calling code consumes all results with NextResult.

Using T-SQL SEH when result sets are returned has implications on ADO.NET data access code to ensure database exceptions are raised after SQL errors. As illustrated with the ExecuteReader example at the beginning of this article, when an error is caught in T-SQL during a row-returning statement, ADO.NET will not raise the database exception immediately and instead return the partial or empty result set generated by the failed statement. The next data reader Read method call will return false after the error. The error raised in the CATCH block by the T-SQL THROW or RAISERROR statement is considered a separate result by ADO.NET and will not be raised as a database exception until NextResult is called.

Unless you have a specific reason to use SEH in row-returning batches, I suggest instead using SET XACT_ABORT ON alone as this will address core T-SQL error handling objectives and allow ADO.NET to detect errors without calling NextResult. However, as mentioned earlier, ADO.NET code will still need to call NextResult if the row-returning statement is not the last statement in the batch.

I hope this information will help you ensure database errors in multi-statement batches are detected. The interaction between T-SQL and ADO.NET isn’t as intuitive as it could be.

Deprecated SQL Server Data Access Technologies

I hope the warning excerpt below from the SQL Server Books Online is not a surprise to you (emphasis mine):

SQL Server Native Client (SNAC) is not supported beyond SQL Server 2012
. Avoid using SNAC in new development work, and plan to modify applications that currently use it. The Microsoft ODBC Driver for SQL Server provides native connectivity from Windows to Microsoft SQL Server and Microsoft Azure SQL Database.

The above warning was removed from the SQL Server documentation shortly after this article was first published. However, I still think it’s prudent to use the stand-alone ODBC driver instead of OLE DB for new native applications. The OLE DB story continues to evolve and I hope we see a roadmap and guidance from Microsoft soon.

The OLE DB story at last has clarity. Microsoft has decided to undeprecate (a new word) OLE DB and released a new OLE DB provider, Microsoft OLE DB Driver for SQL Server (MSOLEDBSQL). This new driver will maintained out-of-band with the SQL Server database engine lifecycle, following the same release model as other SQL Server client drivers, facilitating incremental enhancements. The initial release of MSOLEDBSQL does not support column level encryption with Always Encrypted so one must currently use SqlClient, ODBC, or JDBC when AE column encryption is required.

Keep in mind the legacy Microsoft OLE DB Provider for SQL Server (SQLOLEDB) remains deprecated. If you continue down the OLE DB path for data access in native code, use MSOLEDBSQL going forward.

Let me first mention that the Books Online should have made it clear that this warning applies only to SNAC data access by applications. The warning does not apply to Microsoft SQL Server features and products that use SNAC as part of the OLE DB stack, like Linked Servers, SSIS, and SSAS. Microsoft SQL Server 2012 Native Client is still installed and used as a component dependency to support these features in SQL Server 2014 and later versions, although that may change in the future.

More concerning is OLE DB in general. Even before SQL Server 2012 was released, Microsoft announced in August, 2011 that ODBC was the preferred technology for relational database access and that OLE DB was deprecated. Below is an excerpt from the Microsoft is Aligning with ODBC for Native Relational Data Access FAQ which should be alarming to folks who continue to use Microsoft OLE DB providers to access SQL Server 2014 and later databases (note Denali is the code name for SQL Server 2012 and, again, emphasis mine):

Question6: If I have an OLE DB application that I write for Denali, will it be supported on a post Denali version of SQL Server that is released during the life of Denali?

Answer: No, in fact we may explicitly block the OLE DB applications on post-Denali versions of SQL Server. It is recommended that you plan your migration soon to ODBC, if you want to start using newer versions of SQL Server as soon as they release.

The explicit block of OLE DB data access has yet not happened (as of this writing) but I think it’s prudent to heed the announcement and warning, at least for applications that currently use SQL Server 2014 and later versions as well as SQL Azure Database, or may need to use newer SQL versions in the future. Microsoft has a history of going out of their way to provide backwards compatibility in the SQL Server space but it’s been 3 SQL Server versions (soon to be 4 versions) since the deprecation announcement.

The bottom line is that native Windows applications (e.g. unmanaged C++, VB6, VBA, Classic ASP, etc.) should generally use ODBC and the latest Microsoft ODBC Driver for SQL Server. This stand-alone ODBC driver is a free download from Not only should SNAC not be used against SQL Server 2014 and later versions, the deprecated SQL Server OLE DB provider and ODBC driver that ship with Windows should also be avoided. This practice will facilitate more seamless upgrades to new versions of SQL Server and Azure SQL Database.

Preparing for the Future
The message is loud and clear that ODBC is the supported and preferred path for native applications going forward. The Data Access Technologies Road Map provides an overview and history of Microsoft data access technologies, which I recommend you peruse to ensure you are not inadvertently using deprecated or unsupported technologies for new development and, for existing applications, consider moving from legacy data access technologies to current ones when practical.

The current Microsoft ODBC Driver for SQL Server as of this writing is ODBC Driver 13 for SQL Server. Note that that both the 13.0 and 13.1 versions of this driver have the same “ODBC Driver 13 for SQL Server” display name listed under installed programs and ODBC Data Source Administrator. If installed, the driver will be listed under installed programs along with the corresponding driver version (when viewed detail mode). The 13.1 version adds support for the Always Encrypted feature. These ODBC Drivers are available from the link I mentioned earlier.

The remainder of this article reviews common data access technologies for native (unmanaged) Windows applications that are explicitly identified as unsupported, deprecated, or may have a limited future along with remediation considerations. This isn’t to say these technologies won’t work, just that using them may block upgrades to new versions of SQL Server and Azure SQL Database as well as prohibit using new SQL Server features. Applications that are aligned with current technologies are much better positioned for seamless SQL Server upgrades, both on-prem and in the cloud.

Deprecated Windows Data Access Components
Windows Data Access Components (WDAC), which was formally known as Microsoft Data Access Components (MDAC) in older Windows versions, is included with Windows to provide data access infrastructure for ODBC, OLE DB, ADO Classic, and managed ADO.NET out-of-the box. WDAC includes a mix of deprecated, mature, and modern components. WDAC is part of the operating system and maintained by Windows Update.

Both the SQL Server ODBC driver and OLE DB provider included with WDAC are deprecated; these are provided only for legacy application backwards compatibility and should not be used for new application development. The WDAC ODBC driver named “SQL Server” and OLE DB provider named “Microsoft OLE DB Provider for SQL Server” (SQLOLEDB) were both deprecated in favor of SQL Server Native Client when SQL Server 2005 was released for over 10 years ago. Not only do these old components not support new data types introduced after SQL Server 2000 directly, newer features like Availability Groups, MARS, Always Encrypted, and idle connection resiliency are not supported either. Native applications should install and use a separately installed SQL Server driver rather instead of the WDAC “SQL Server ODBC driver, SQLOLEDB provider, or SNAC ODBC driver/OLE DB provider.

WDAC also includes the Microsoft OLE DB Provider for ODBC Drivers (MSDASQL), which acts as a bridge to allow OLE DB applications to use an ODBC driver instead of an OLE DB provider for low-level data access. The main purpose of MSDASQL was to allow OLE DB programs to access RDBMS products other than SQL Server that had an ODBC driver but no OLE DB provider, with the expectation that vendors would eventually jump on the OLE DB bandwagon. However, that didn’t happen widely because OLE DB relies on Windows-only COM interfaces and many vendors preferred to support only ODBC interfaces. Microsoft recognized the value of ODBC for cross-platform SQL Server data access, which is one of the reasons called out for OLE DB deprecation. Although MSDASQL is an OLE DB provider, it is not deprecated for SQL Server relational data access because it uses ODBC to access the data store. MSDASQL can facilitate transitioning to ODBC in existing OLE DB applications. I’ll discuss this in more detail shortly.

ADO (ActiveX Data Objects, not to be confused with ADO.NET) components are included with WDAC. Although not deprecated, ADO a mature OLE DB technology that hasn’t been enhanced since ADO 6.0 was released with WDAC in Windows Vista over 10 years ago. ADO types do not directly support data types introduced after SQL Server 2000, regardless of the underlying provider/driver used. I’ll leave speculation on the future of ADO as an exercise for the reader.

Changing Existing ODBC Applications
Changing an existing ODBC application (ADO or direct ODBC function calls) to use the Microsoft ODBC Driver for SQL Server is often simply a matter of installing the driver and changing the connection string or DSN to use it. Unmanaged C++ applications that use the ODBC call level interface directly may need to be recompiled using the header files included with the SDK install of the ODBC driver. It’s often trivial to switch from SNAC or the WDAC SQL Server ODBC driver. Application testing should be done to ensure compatibility but ODBC driver upgrades are usually transparent.

Changing from OLE DB to ODBC in ADO Classic
There is still quite a bit of OLE DB SQL Server access in the wild in my experience, varying from C++, VBA, Classic ASP, and even in VB.NET apps migrated from VB6 that were never retrofitted to use to SqlClient (which should be done as it provides high-performance managed data access for .NET applications). From unmanaged code, ADO provides an easy to use object-oriented COM interface on top of the 100+ complex OLE DB interfaces. Low-level data access is performed by the OLE DB provider specified in the connection string or connection object. MSDASQL (Microsoft OLE DB Provider for ODBC Drivers) is the default provider in ADO so ODBC is used when no OLE DB provider is specified. When MSDASQL is used explicitly or by default, the connection string must specify either a DSN or ODBC driver.

Below are DSN-less ADO ODBC connection string examples that use the Microsoft ODBC Driver for SQL Server with and without a trusted connection. “Provider=MSDASQL” could have been specified in the connection string but is not technically required since it’s the default provider:

The examples below use an ODBC DSN with the default database specified in the DSN configuration. The first uses Integrated Windows authentication (specified in the DSN configuration) and the second example is functionally identical except using SQL Server authentication. Be aware that the Microsoft ODBC Driver for SQL Server does not store user credentials in the DSN configuration so those must be specified by the application:

ADO applications use a handful of objects to interact with SQL Server (mostly connection, command, recordset, record, transaction, parameter, and field objects). These objects abstract the implementation details such that it is possible to use ADO with any DBMS product, OLE DB provider, or ODBC driver as long as the low-level driver/provider supports the requested functionality. One can theoretically transition from OLE DB to ODBC with only a connection string change. However, the devil is in the details when switching from OLE DB to ODBC via MSDASQL.

MSDASQL converts OLE DB method calls into their equivalent ODBC function calls. Commonly used fast-forward read-only client-side cursors (a.k.a. firehose cursor, same as the only type ADO.NET uses) tend to work well in my experience. However, ADO provides a plethora of options for cursor location, execution, cursor types, and locking modes that ADO and MSDASQL may or may not translate well to ODBC equivalents. If you run into problems during testing with ODBC, make sure the requested ADO options are appropriate for the task at hand. It is common for ADO applications to use advanced options and pessimistic locking inappropriately. A firehose cursor with optimistic locking is often best for the task at hand and will perform better too.

Like ADO, MSDASQL is a mature technology that hasn’t gotten much love lately. I could be wrong but I don’t expect patches that are not security related. If you run into issues with MSDASQL, you may be better off developing a work-around rather than wait for a fix.

Changing OLE DB Interface Calls to ODBC
C++ applications that use OLE DB interfaces directly rather than the higher-level ADO API can be converted to ODBC by using the MSDASQL provider similarly as discussed in the ADO topic. The same considerations apply.

Note that MSDASQL does add another layer, which may be an issue for applications that are especially performance-sensitive. The best approach for C++ applications that are expected to be around for a while might be to move from OLE DB to ODBC function calls (or via MFC). This will provide the highest level of performance and alignment with ODBC.

Remediation Summary
Inasmuch as I recommend staying aligned with product technology lifecycles, whether or not one should modify an existing application to use currently data access technologies is ultimately a business decision. Business benefits include not only supportability but also agility in making functional enhancements that leverage newer features as well as non-functional ones, such as support for Availability Groups. The costs of the development and testing needed to use the Microsoft ODBC Driver for SQL Server will vary considerably depending on application design. Those costs need to be weighed against the benefits of the effort.

That being said, I always recommend one avoid deprecated technologies in new development work even if that requires moving outside one’s comfort zone. Technical debt is a hidden cost that often needs to be paid back with interest.

p.s. For history buffs, this post by Hal Berenson, retired Microsoft Distinguished Engineer and General Manager, provides additional background on OLE DB and data access technologies.

SQL Server TVP Performance Gotchas

Table-valued parameters have important considerations that developers and DBAs need to be aware of. It is essential that application code specify the proper data type and length for TVP columns in order to achieve optimal performance and reduce unnecessary overhead. Additionally, a trace (Extended Events, server-side SQL Trace, Profiler) that captures RPC events of an inappropriately defined TVP can not only exacerbate performance issues, but affect stability of the SQL Server instance in some cases.

The graph below summarizes the impact the application code max column length specification can have on performance, without and with a trace running. All tests used the same table type of 10 varchar(50) columns and a 10,000 row TVP rows passed via a DataTable object of 10 string columns. The only variables were the max column length specified by the app code and a trace running on the database server. The client application was run on a different machine than the database server and elapsed time measured by the application.

TVP Performance Comparison
Figure 1: Impact of TVP maximum column length specification with and without tracing

The average duration was 113ms without a trace running when the app code used the default -1 max column length. However, when max column length 50 was specified (matching the varchar(50) column of the table type), the average duration dropped significantly to 75ms. The trivial code change of specifying the string column max length of 50 improved performance by 33%.

I then ran the same pair of tests while a trace captured the RPC completed events. The average duration of the default max length test increased from 113ms to 9.324ms with the trace running, an over 80x degradation in performance! The test with the explicit 50 character max length was not nearly impacted as much, increasing from 75ms to 89ms (which is tolerable, IMHO).

TVP Internals
Under the hood, TVP data are passed to SQL Server over the Tabular Data Stream (TDS) protocol. The client API sends TVP column meta-data to SQL Server describing the data type, length, and other meta-data for each TVP column followed by data rows with each column in native format matching the preceding data type specification. SQL Server uses the provided column meta-data to prepare and fill buffers for efficient processing on the server side, leveraging native types to eliminate parsing overhead similarly to other parameterized queries.

Before a query or stored proc with a TVP starts executing, SQL Server creates a table in tempdb with the same schema as the parameter table type and uses bulk insert internally to efficiently load the table with TVP rows streamed by the client application. The size of a TVP is constrained only by available tempdb storage. SQL Server executes the query/proc after the TVP temp table is loaded and the parameterized T-SQL query/proc can then use the TVP data.

The TVP columns provided by the client application do not have to match the schema of the target table type; SQL Server implicitly converts TVP values to match the target table type column when data types differ. Although not optimal, implicit conversion is generally not a major factor in overall TVP performance.

The application-specified TVP max column length can impact performance significantly, and in some cases dramatically, as illustrated by the performance tests shown earlier. SQL Server prepares to receive TVP data up to the max length specified by the client application rather than the defined size of the target table type column. When the specified max length of variable length columns exceed the 8000 byte tipping point, SQL Server uses a different code path to allow for large object (LOB) values up to 2GB. Unless the table type actually contains LOB values (varchar(MAX), nvarchar(MAX)), database server resources are wasted unnecessarily when an inappropriate max column length is specified.

LOB values are especially problematic when a trace captures the RPC completed event of a TVP query. Tracing uses memory from the OBJECTSTORE_LBSS memory pool to build trace records that contain TVP LOB values. From my observations of the sys.dm_os_memory_clerks DMV, each LOB cell of a TVP requires about 8K during tracing regardless of the actual value length. This memory adds up very quickly when many rows and lob columns are passed via a TVP with a trace running. For example, the 10,000 row TVP with 10 LOB columns used in the earlier test required over 800MB memory for a single trace record. Consider that a large number of TVP LOB cells and/or concurrent TVP queries can cause queries to fail with insufficient memory errors. In extreme cases, the entire instance can become unstable and even crash under due to tracing of TVP queries.

Specifying Proper TVP Parameter Column Meta-Data
A SQL Server development best practice has long been to use strongly-typed parameters with attention to detail regarding the parameter data type and length such that it is consistent with the types on the server. This practice improves performance by avoiding implicit data type conversions, promotes sargable expressions, avoids unnecessary procedure cache bloat, and inherently validates data for proper typing on the client before it is sent to the database server. With scalar parameters, one need only specify the correct SqlDbType along with the proper length, precision/scale (avoiding AddWithValue method to add parameters) and all is well in the world.

TVP parameters require additional column meta-data not applicable to scalar parameters. The parameter data type of a TVP in .NET is always SqlDbType.Structured. The additional TVP column meta-data is inferred from the supplied parameter value, which may be a DbDataReader, IEnumerable, or DataTable object. These objects inherently contain column meta-data and methods enumerate rows, which the SqlClient API uses to send the TVP to SQL Server.

DataTable objects are most commonly used as TVP values. DataTables are easy to use and can serve as containers for data beyond just TVP usage. But unlike DbDataReader and IEnumerable objects, a big gotcha with a DataTable is that the default data type String with maximum length of -1 (2GB LOB). This is the .NET equivalent of the SQL Server nvarchar(MAX) data type and has many insidious and negative implications with a TVP. First, values of types other than string that are added to a DataTable string column will be converted to string (DateTime, Integer, GUID, etc.). Consequently, using the default DataTable column string data type for non-string types will:

• increase client memory requirements compared to more compact native types
• incur conversion overhead
• prevent strong-typed data validation on client side
• require date format aware formatting of date and datetime values
• require using a period as decimal separators
• increase network usage compared with smaller native types

When String is the proper column data type, developers must be especially mindful of the max length specification when the DataTable is used as the TVP value. Strings in Windows and .NET are Unicode, requiring 2 bytes per character. This means a max length of over 4000 characters will cross the 8000 byte threshold for LOB data on the server side regardless of the table type on the server. Avoid using long string columns in TVPs when many rows are passed and never use the default -1 length unless a MAX type is actually intended.

DBAs who support applications that use TVPs should be aware of the repercussions tracing can have on performance and SQL Server memory. Avoid capturing TVP RPC completed events of large TVP requests, if possible. When tracing TVP RPC completed events, monitor the OBJECTSTORE_LBSS memory pool for excessive memory usage.

I’d like the thank SQL Server MVP Ola Hallengren for his suggestion for me to write this article.

Help Stop SQL Injection Madness

SQL injection continues to be the most exploited security vulnerability on the internet. A SQL injection vulnerability can put companies out of business and expose innocent people to sensitive data disclosure and damage, all due to developer ignorance or carelessness. Sadly, although there is much information available on SQL injection exploits and mitigation, SQL injection exploits continue because developers use flawed database access techniques. This article details the best, and simplest, method to prevent SQL injection in SQL Server: parameterized SQL statements.

Parameterized SQL queries and stored procedure calls (with CommandType.StoredProcedure and no dynamic SQL within proc code) prevent unintended SQL from being executed. I’ll discuss exactly why shortly. Parameterized SQL also provides many benefits besides security. Notably:

  • allows quotes in data without programmatic escaping
  •  avoids the need to format date strings according to DATEFORMAT session settings
  • handles decimal separators automatically
  •  improves performance due to plan reuse
  •  provides cleaner code

Most articles on SQL injection focus on validating user input rather than parameterized queries.  Input validation certainly has benefits related to data integrity and user-friendly validation messages but it is at best a secondary defense against injection compared to parameterized queries.  Input validation using a blacklist of prohibited keywords and characters is especially weak.

SQL Injection Overview
I am puzzled by the number of examples in articles and forum posts that show building SQL statements with literals using a string concatenation technique.  I suspect the main reason for this practice is that developers write code that generates the exact same ad-hoc SQL statement they would write using a query tool like SQL Server Management Studio or Visual Studio.  This leads to the nasty habit of not using parameters.  For example, a developer might develop and test a query like this in SSMS:

After testing the query, the C# developer includes the SQL statement in the application code but, instead of a hard-coded value for CustomerID, the desired value is concatenated with the remainder of the SQL statement as a literal.

The query runs perfectly well and returns the expected result in testing.  It is not until after production deployment that a script kiddie or hacker finds the SQL injection vulnerability.  A malicious person can easily manipulate this SQL statement in ways unintended by the developer.  For example, a hacker could enter CustomerID value “1 OR 1 = 1” in the text box to harvest the names and email addresses of all customers in the database.  This vulnerability can also be exploited to execute additional SQL statements in the same batch, which is often used to update data with malicious html script that is subsequently returned and executed by unsuspecting client browsers.  There are many ways to exploit a SQL injection vulnerability that result in sensitive data disclosure and/or database manipulation.  I won’t detail those here but rather recommend a very simple defense; use a parameterized statement instead of literals.

Mitigating SQL Injection
Below is a functionally identical parameterized query.

This query can be incorporated into the C# application code with the parameter and corresponding value added to the parameterized SQL command as a strongly-typed parameter:

As you can see, the effort needed to do the same job with a parameterized command is about the same as the string concatenation method, yet the security benefits are immeasurable.

Why Parameters Prevent SQL Injection
Parameterized SQL is more secure for a couple of reasons.  In the case of an ad-hoc query, the SQL statement with parameter markers is hard-coded in the application (or generated via an ORM framework) and passed to SQL Server as-is.  Parameter values are passed separately rather than inside the SQL statement itself.  This ensures the SQL statement cannot be modified by the values provided.

Similarly, a stored procedure call (using CommandType.StoredProcedure) sends only the stored procedure name to SQL Server with separately passed parameter values.  Using stored procedures also allow one to more strictly adhere to the security principal of least privilege.  By granting only stored procedure execute permissions, permissions on indirectly referenced objects are not needed as long as the ownership chain is unbroken.  This practices limits ad-hoc access to underlying data should an attacker find a back door into the database under the application security context.  Personally, I’m a staunch advocate of stored procedures for this and other reasons too.  Stored procedures provide a well-defined database interface, allowing queries and database schema to be refactored without breaking changes.

I should add that it is perfectly acceptable to build a SQL statement dynamically in either application code or stored procedures, but only if parameterized and actually needed for the task at hand.  A common use case is a dynamic search query that is constructed based on user criteria. Instead of coding a separate query for each possible search permutation, one can add parameterized WHERE clauses as needed.  Under no circumstances should a SQL statement string be built with concatenation of untrusted values.

Additional Precautions
Use Windows Authentication for SQL Server database access.  This practice eliminates the need to store security credentials in application configuration files.  If SQL authentication must be used, protect the credentials with encryption.  This is easily done in .NET by encrypting the connectionStrings section of the app.config file.  See

Adhere to Principal of least privilege.  Never use a privileged account, such as a sysadmin role member, for routine application access.  Grant permissions only on database objects which are used directly by applications.  Use stored procedures so that one can leverage ownership chaining, eliminating the need to grant permissions on tables directly.

Specify strongly-typed parameters.  The parameter data type can be inferred (e.g. using SqlCommand.Parameters.AddWithValue method) but a best practice is to explicitly specify a parameter data type that matches the underlying column data type and maximum length.  These practices will minimize the size of the SQL Server procedure cache and avoid implicit conversions that can negatively affect performance.

Maximizing Performance with Table-Valued Parameters

I’ve seen TVPs improve performance by orders of magnitude when used appropriately.  This feature isn’t used as often as it should be so I’ll show how to implement a TVP using a C# application example and discuss TVP considerations.

Why Table-Valued Parameters Improve Performance
TVPs allow one to pass multiple rows or values at once to a query or stored procedure.  By doing so, the query can leverage set-based operations to greatly improve performance compared to executing the query/proc many times.  TVPs decrease network latency by reducing network round trips.  Prior to TVPs, one had to employ workarounds like passing a string containing a delimited list of records (or XML) and parsing on the SQL side.  Those techniques are at best a kludge.  Instead TVPs are the right tool for the job in SQL Server 2008 and later.

TVPs (and table variables) are temp tables.  TVPs and table variables are persisted in tempdb much the same way as regular temp tables (# prefix).  The big differences are that TVPs and table variables are limited in scope to the current batch and do not have statistics.  Also, table parameters/variables allow only indexes declared via primary key nor unique constraints; non-unique indexes are not supported.

When a TVP Is passed to SQL Server from a client application, the data is bulk-inserted into tempdb where it is made available to the SQL statement or stored procedure using the declared TVP.  This bulk insert is handled by the client API transparently to the application and allows a large number of rows to be passed to SQL Server efficiently.  In T-SQL, one can declare and load a table variable, where it can be used directly in a SQL statement or passed as a TVP to a stored procedure or query (invoked with sp_executesql).

The exact threshold where TVPs outperform single-row operations is much lower than you might expect.  Your mileage may vary but, at least in my experience, it’s only a few rows.  The performance tests I ran here show that the threshold where a TVP outperforms individual requests with a simple list of customers is only 4 rows.  That said, I would not recommend TVPs for single-row operations due to the start-up overhead but it’s a good choice when the most common use case is multiple rows.

How to Use TVPs
In order to pass a TVP to a SQL statement or stored procedure, one must first create a user-defined table type in SQL Server.  The table type defines the schema of the table SQL Server expects for the TVP.  Listing 1 shows the CREATE DDL for the table type I use in the AdventureWorks2012 database to pass a list of customers to the stored procedure in Listing 2 that returns all orders for these customers.  Note that the stored procedure TVP must be declared as READONLY.

Listing 1: Table type DDL

Listing 2: Stored procedure to return orders for multiple customers

TVPs are passed from a .NET application using parameter data type DbType.Structured.  The actual parameter value can be an object of type DataTable, DbDataReader or IEnumberable<SqlDataRecord>.  A DataTable is appropriate when the required TVP data is already in a DataTable object needed for other purposes.  A DbDataReader is a good choice when the source data is the result of a query (note that any DbDataReader may be used, not just a SqlDataReader).   For other cases, I recommend an IEnumerable<SqlDataRecord>.  It is very easy to implement IEnumerable<SqlDataRecord>, especially if you already have an enumerable object like an array or collection containing the values you need to pass.  See the code at the end of this article for an example class that implements IEnumerable<SqlDataRecord> for a TVP.

A benefit with DbDataReader and IEnumerable<SqlDataRecord> is that the source data can be streamed to SQL server without first loading all the TVP data into memory in either the client application or in SQL Server.  This is a big consideration when a lot of data needs to be passed because you are not constrained by memory on ether the client or SQL Server side.

Listing 3 shows the stored procedure I use to compare performance with the TVP method.  This proc takes only a single @CustomerID parameter so it must be called once for each customer.

Listing 3: Stored procedure to return orders for a single customer

The complete C# console application I used for the performance tests is in Listing 4.  The code demonstrates the 3 ways to pass a TVP from C# and logs timings for each invocation.  As you can see in Figure 1, the TVP outperformed individual calls once more than 3 customers were passed.  Note that I had to use a logarithmic y-axis scale because of the huge performance difference.  For example, 10,000 individual calls averaged about 1.5 seconds whereas passing 10,000 customers via a TVP took less to 100 milliseconds.

Figure 1:  TVP versus individual call performance
TVP performance graph

Listing 4: Passing TVPs in C#


Microsoft decided to undeprecate OLE DB 6 years after the deprecation announcement mentioned in this post. A new OLE DB provider, Microsoft OLE DB Driver for SQL Server (MSOLEDBSQL), is now available. Keep in mind the legacy Microsoft OLE DB Provider for SQL Server (SQLOLEDB) that ships with Windows remains deprecated. If you continue down the OLE DB path for data access in native code, download and use MSOLEDBSQL going forward.


I was very surprised when Microsoft announced deprecation of OLE DB provider for SQL Server data access last week on the Data Access Blog and MSDN Forums Announcement. The next release of SQL Server, code-named “Denali”, will be the last to ship a new SQL Server Native Client OLE DB provider. The SQL Server Native Client OLE DB driver will continue to be supported for 7 years after the Denali release so we have plenty of time to plan accordingly.

The other Microsoft-supplied OLE DB driver for SQL Server, SQLOLEDB, has been deprecated for many years now. The deprecated SQLOLEDB driver (and deprecated SQLSRV32.DLL ODBC driver) is part of the older MDAC package and is currently included in Windows operating systems as part of Windows Data Access Components for backwards compatibility. Windows 7 is the last Windows version that will include a SQL Server OLE DB and ODBC driver out of the box. Microsoft recommends that we use the SQL Server Native Client ODBC driver as the SQL Server data access technology of choice from native code going forward.

What This Means to You

Avoid using OLE DB for new SQL Server application development. Update your technology roadmap to move towards migrating existing SQL Server applications that use the SQLNCLI, SQLNCLI10, SQLNCLI11 or SQLOLEDB OLE DB providers to the SQL Server Native Client ODBC driver.

Note that much is still unknown since current versions of SQL Server rely heavily on OLE DB. Although this is purely speculation on my part, it stands to reason that we will see improved ODBC support across all Microsoft products and SQL Server features that currently rely on OLE DB for relational data access.

New SQL Server Development

Use one of the following SQL Server relational database access technologies for new development:

·         Managed code (e.g. C#, VB.NET, managed C++): Use Sysem.Data SqlClient. SqlClient is part of the .NET framework and is the preferred way to access SQL Server from managed code (C#, VB.NET, managed C++). The only reason I can think why not to use SqlClient from managed code is if an application needs to also support other DBMS products using the same interface without coding an additional abstraction layer. In that case accessing different database products Sysem.Data.Odbc is an alternative.

·         Native code (e.g. unmanaged C++): Use ODBC with the Server Native Client driver. The ODBC call-level interface can be used directly or via the higher-level ADO API. The SQL Server Native Client ODBC driver is included with SQL Server and also available as a separate download. 

Migrating Existing Applications

I sometimes see existing managed applications use ADO (e.g. ADODB.Connection) instead of SqlClient. ADO is a COM-based API primarily intended to be used from native code rather than managed code. Typically, these applications were either converted from VB 6 or the developer used ADO instead of ADO.NET due to unfamiliarity with the ADO.NET object model.  This is a good opportunity to convert such code to use System.Data.SqlClient, which will perform better than OLE DB or ODBC from managed code. 

If you have an ADO application where performance is not a concern or the conversion is not worth the effort, an alternative is to simply change the provider to MSDASQL (OLE DB Provider for ODBC Drivers) and add the SQL Server Native Client ODBC driver specification. This can be done with a simple connection string change and the MSDASQL provider will translate the ADO OLE DB calls to ODBC. For example, to use the SQL Server 2008 SNAC ODBC driver:

Old OLE DB connection string: “Provider=SQLNCLI10.1;Data Source=MyServer;Integrated Security=SSPI”

New ODBC connection string: “Provider=MSDASQL;Driver={SQL Server Native Client 10.0};Server=MyServer;Trusted_Connection=Yes”


The same connection string change can be used for any ADO application, including ASP classic, legacy VB 6 or unmanaged C++.

Perhaps the biggest challenge will be native code that uses the OLE DB COM interfaces directly instead of going through higher level APIs like ADO. I’ve seen this most commonly done for performance sensitive applications in C++. The best approach here will be to convert the application to use the ODBC call-level interface directly. This will provide the highest SQL Server data access performance from native code. The difficulty of such a change will depend much on the application object model and design. Ideally, data access libraries are shared and abstracted so that low-level data access code changes only need to be made in one place.

Why SQLOLEDB and SQLNCLI Was Deprecated

If you’ve used SQL Server for a long time like me, you’ve seen a number of APIs come and go ( APIs are largely driven by changes in development and platform technologies that change over time. It is possible for Microsoft to support legacy APIs indefinitely but doing so would waste precious development resources on maintenance instead of adding new features that are important to us. COM-based APIs like OLE DB are complex and it just doesn’t make sense to have many APIs that basically do the same thing. 

So we now have the short list of SQL Server relational data access APIs going forward:

·         SqlClient (managed code)

·         JDBC (Java)

·         ODBC (for native code)


I’m a big fan of open, cross-platform standards so I’m glad that Microsoft chose ODBC over OLE DB for relational database access. ODBC is an implementation of the SQL call-level interface standard (ISO/IEC 9075-3). In contrast, the COM-based OLE DB SQL Server provider relies on proprietary Microsoft Windows COM technology. The SNAC ODBC driver is a truly native driver and provides the fastest SQL Server database access from native code.


SQL Server Connection Strings

This is the first of a series of posts on SQL Server connection strings. I don’t think connection strings are all that complicated but I often see developers have problems because they simply cloned an existing connection string (or found one on the internet) and tweaked it for the task at hand without really understanding what the keywords and values mean. This often results in run-time errors that can be tricky to diagnose. 
In this post, I’ll provide a connection string overview and discuss SqlClient connection strings and examples. I’ll discuss OLE DB and ODBC (used via ADO or ADO.NET) and JDBC in more detail the future articles.
SQL Server can be accessed using several technologies, each of which has different connection string particulars. Connection strings are provider/driver specific so one first needs to decide on a client API before formulating the proper string can be created. 
All connection strings share the same basic format, name/value pairs separated by semicolons, but the actual connection string keywords may vary by provider. Which keywords are required or optional also vary by provider and providers often share the same keywords (or provide synonyms) to minimize the connection string changes when switching between different providers. Most connection string keywords are optional and need to be specified only when the default is not appropriate. Connection string values should be enclosed in single or double quotes when the value may include a semicolon or equal sign (e.g. Password=”a&==b=;1@23″)
The purpose of a connection string is to supply a SQL Server provider/driver with the information needed to establish a connection to a SQL Server instance and may also be used to specify other configuration values, such as whether connection pooling is used. At the end of the day, the provider/driver needs to know at least:
·         SQL Server name (or address)
·         Authentication method (Windows or SQL Server)
·         Login credentials (login and password for SQL Server authentication)
One typically uses the .Net Framework Provider for SQL Server (abbreviated to SqlClient here) in managed code and a SQL Server OLE DB provider or ODBC driver from unmanaged code. It is possible to use OLE DB or ODBC for SQL Server data access in managed code but there is seldom a reason to do so since SqlClient offers high-performance access to SQL Server natively.
The authoritative reference for SqlClient connection strings is My goal is not to rehash all of the keywords or illustrate the many combinations here but rather show the ones most commonly used along with best practices. I use the primary keywords rather than synonyms or equivalent keywords in the examples.
The SqlConnectionStringBuilder class provides a programmatic way to build connection strings needed by SqlConnection class. The nice thing about SqlConnectionStringBuilder is that it provides IntelliSense and avoids connection string typos. It should always be used when constructing connection strings based in user input (e.g. user id and password prompt). But you still need to know which connection string properties (keywords) you need to set along with the default values. The examples here apply regardless of whether or not you use yjr SqlConnectionStringBuilder class.
SqlClient Connection String Keyword Examples
Unlike other providers, there is no “Provider” or “Driver” connection string keyword in a SqlClient connection string.  The .Net Framework Provider for SQL Server is implicit with a SqlConnection class so it is redundant to also specify the provider.
I’ll start with the minimal keyword(s) needed. The minimal SqlClient connection string need only specify the authentication method.  The example below specifies Windows authentication using “Integrated Security=SSPI”. This connection string will connect the default instance on the local machine using Windows authentication under the current process Windows security credentials. 

Integrated Security=SSPI
Listing 1: Connect to local default instance using Windows authentication
To connect to the local default instance using SQL authentication, just specify the credentials using the “User ID” and “Password” keywords instead of “Integrated Security=SSPI” keyword. SQL authentication is the default when “Integrated Security” or “Trused_Connection” keyword is not specified. Although I commonly see “Persist Security Info=False” also specified (a best practice from a security perspective), that is the default setting and may be omitted. Be aware that you should encrypt connection strings (or passwords in general) stored in configuration files when using SQL authentication.

User ID=MyLogin;Password=MiP@ssw0rd
Listing 2: Connect to local default instance using SQL authentication
One often connects to a remote SQL Server. Along with the authentication method, add the Data Source keyword to specify the desired SQL Server name or network address.

Data Source=SQLSERVERNAME;Integrated Security=SSPI
Listing 3: Connect to default instance on host SQLSERVERNAME using Windows authentication

Data Source=SQLSERVERNAME;User ID=MyLogin;Password=MiP@ssw0rd
Listing 4: Connect to instance on host SQLSERVERNAME using SQL authentication
Note that these same connection strings may be used to connect locally or remotely. Personally, I recommend always specifying the Data Source even when connecting locally. This makes it easy to move the application to another machine using with the same configuration and helps avoid oversights.
It is usually best to let SqlClient determine the appropriate network library to use rather than an explicit specification. SqlClient will figure out the appropriate network library based on the specified Data Source value. When you connect to a local instance using an unqualified name (or the value “(local)”), Shared Memory is used by default. SqlClient will use TCP/IP if a FQDN (e.g. SQLSERVERNAME.MyDOMAIN.COM) or IP address is specified regardless of whether the instance is local or remote. Since TCP/IP is most commonly used nowadays, I’ll focus on TCP/IP in this article and use a FQDN in the subsequent examples to avoid ambiguity.
It is often desirable to specify the initial database context in the connection sting. If omitted, the default database of the authenticated account is used. This is accomplished using either the “Initial Catalog” or “Database” keyword. I suggest always including the “Initial Catalog” keyword.

Data Source=SQLSERVERNAME.MYDOMAIN.COM;Integrated Security=SSPI;Initial Catalog=MyDatabase
Listing 4: Connect to default instance on host SQLSERVERNAME using Windows authentication with initial database context of MyDatabase
Named Instances
The connection strings I’ve shown so far assume the target is a default SQL Server instance listening on port 1433. One can run multiple instances of SQL Server on the same host using the named instance feature. If your target database instance is a named instance, SqlClient will also need to know the instance name or instance port number. The instance name can be specified by appending a backslash and instance name to the Data Source value:

Data Source=SQLSERVERNAME.MYDOMAIN.COM\MYINSTANCE;Integrated Security=SSPI;Initial Catalog=MyDatabase
Listing 5: Connect to named instance on host SQLSERVERNAME using Windows authentication with initial database context of MyDatabase
As an aside, I often see connectivity problems with named instances due to oversights in the SQL Server configuration. When an instance name is specified, SqlClient interrogates the SQL Server Brower service on the SQL Server host to determine the instance port (or named pipe name). The SQL Server Brower service is disabled by default so you need to enable and start it in order to connect by the instance name. This can be done using the SQL Server Configuration Manager tool. Also, since the SQL Server Brower service communicates over UDP port 1434, that port must be allowed through firewalls.
You can specify a port number instead of instance name to directly to a named instance (or to a default instance listing on a non-standard port). The port may be specified by appending a comma and port number to the data source value. The needed port number can be ascertained from the SQL Server Configuration Manager tool.

Data Source=SQLSERVERNAME.MYDOMAIN.COM,60086;Integrated Security=SSPI;Initial Catalog=MyDatabase
Listing 6: Connect to instance on host SQLSERVERNAME listening on port 60086 using Windows authentication with initial database context of MyDatabase
Additional Keywords

In addition to the “Data Source”, “Initial Catalog” and “Integrated Security” (or “User Id” and “Password”) keywords I’ve discussed so far, I recommend that “Application Name” also be specified. The specified string is helps identify the application when monitoring activity on the database server. This is especially useful when an application server or client hosts multiple applications.

Data Source=SQLSERVERNAME.MYDOMAIN.COM;Integrated Security=SSPI;Initial Catalog=MyDatabase;Application Name=Connection String Example
Listing 7: Connect to default instance on host SQLSERVERNAME using Windows authentication with initial database context of MyDatabase with application name specification
In my opinion, the many other keywords are noise unless the default values are inappropriate for your environment. 

You can get by nicely in most cases with only the 4 or 5 SqlClient connection string keywords I’ve discussed here. I suggest you establish a connection string standard that includes the “Data Source”, “Initial Catalog”, “Application Name” keywords plus the authentication method, “Integrated Security=SSPI” or “User Id” and “Password”.

Why Parameters are a Best Practice

Parameterized queries are arguably the single most important database development best practice. Parameters, with proper SQL data types, provide many benefits. These include:

– improve security by preventing injection of unwanted SQL (barring non-parameterized server-side dynamic SQL)
– eliminate need to enclose literals in quotes (or not) depending on data type
– eliminate need prefix Unicode literal strings with N
– allow single quotes within strings without the need to escape them
– avoid the need to format date/time/datetime string literals in a particular way (which vary by culture)
– do not require decimal separators (which vary by culture)
– improve performance by reducing compilation costs and promoting execution plan cache reuse
– allow WHERE clause predicates with Always Encrypted columns
– code that’s cleaner and easier to maintain

Despite these many benefits, it is all too common to see applications build SQL statements by concatenating program variable values as SQL literals with values that vary by execution. I believe this simply due to ignorance because one often learns SQL using an ad-hoc tool like SSMS in order to focus on SQL syntax and database concepts, overlooking the important database programming aspects needed in practice. Developers unwittingly build SQL queries in application code as if they were typing SQL statements into an SSMS query window. Sadly, instructors sometimes teach database programming using string concatenation instead of parameterized queries and many online examples use improper techniques too, spreading this poor practice. When one posts questions with non-parameterized query code in online help forums, the egregious practice often distracts answerers from addressing the real question, attracts downvotes, and in some cases, a parameterized query would have avoided the need for the question to begin with.

This article discusses parameter basics in commonly used programming languages and Microsoft SQL Server APIs, along with code examples and best practices. Additionally, internals of how parameters are actually sent to SQL Server are described to provide insight into why parameterized queries are so important. This article focuses specifically on Microsoft SQL Server but parameters and their benefits apply to other DBMS products as well.


Here’s a C# example of the common mistake of building an ad-hoc query with string concatenation:

A common misconception is that using stored procedures inherently prevent SQL injection. Stored procedures can help mitigate security risks but must also be called properly to release parameterization benefits. The same problems as the previous ad-hoc query exists with the stored procedure example below because parameter values are passed as literals.

Techniques for building parameterized SQL statements vary by client API, driver, programming language, and/or framework. The underlying concept is the same, though. Instead of including literals in SQL statements for values that differ for each execution, specify parameter tokens in their place and add parameters with values to the command separately. The SQL statement itself need not change between executions but parameter values can differ for each execution without modifying the SQL statement itself.

The parameter creation techniques used here are not the only ones available to create parameters. I’ve included documentation links that show other parameter creation methods and guidance as to which should be avoided.

Most ORM frameworks (Entity Framwork, Hibernate, build parameterized queries on your behalf. These are beyond the focus of this article but I suggest you peruse the Parameters Under the Covers section of this article to ensure parameters are indeed being passed as expected as an RPC request by the framework you are using.

Parameters with ADO.NET

I’ll first mention one should use an ADO.NET and a managed ADO.NET provider in .NET applications whenever possible. ADO.NET does allow one to use unmanaged ODBC and OLE DB drivers via the System.Data.Odbc and System.Data.OleDb namespace objects but those are intended to be used only when no suitable managed ADO.NET provider is available. For Microsoft SQL Server, the managed providers are System.Data.SqlClient (included with the .NET framework) or the newer Microsoft.Data.SqlClient (.NET Core package, currently in preview). Managed providers provide the best performance in .NET applications by avoiding the cost of interop with unmanaged code, which is especially costly with large result sets.

Also, it’s not uncommon to see VB.NET applications sometimes use the legacy ADODB (ADO classic) API instead of ADO.NET because they started life in the VB6 world and were never changed to use ADO.NET. I’ll discuss ADO classic later in the context of unmanaged code. I suggest such lacy code be remediated to use ADO.NET if possible.

Below are C# parameterized equivalents of the earlier queries. These same objects and techniques can be used in VB.NET and managed C++ with only linguistic differences. The only change to the original SELECT statement is the parameter tokens within the query text plus adding parameters to the command’s Parameters collection (instead of concatenating values as SQL literal text). Parameter names (‘@’ prefix) are used as tokens for parameter values in SqlClient queries and allow for mnemonic names to make it easy to identify individual parameters within the query.

Importantly, note that parameter tokens within a query are never enclosed in quotes because these are not literals. Also, the same parameter may be specified more than once within a query with SqlClient without the need to add an additional parameter with the same value.

It’s generally best to specify CommandType.StoredProcedure when calling a stored procedure because this reduces server-side parsing and enforces that stored procedure parameter arguments be passed as client parameters instead of literals. With SqlClient and CommandType.StoredProcedure, only the stored procedure name is specified as the command text (i.e. no parameter tokens). The procedure name should be schema-qualified for maximum performance, a consideration in high-volume OLTP workloads. Parameters are added to the Parameters collection and implicitly mapped to the actual stored procedure parameters by name rather than by ordinal so the parameter collection order does not matter when multiple parameters are involved.

One can also execute a stored procedure as a parameterized query using a T-SQL EXECUTE statement and CommandType.Text similarly to a parameterized query. There is no benefit in doing so compared to CommandType.StoredProcedure for a single proc call (and it does slightly bloat the request). However, CommandType.Text allows one to execute a parameterized batch of multiple statements in a single round trip. The batch may be a mix of EXECUTE and other DML statements.

With an EXECUTE statement, stored procedure parameters can be specified using either named (i.e. @stored_proc_parameter_name1 = @command_parameter_name1, @stored_proc_parameter_name2 = @command_parameter_name2, …) or positional (@command_parameter_name1, @command_parameter_name2, …) arguments. Stating the obvious, positional syntax necessitates the parameter argument order on the EXECUTE statement match the stored procedure definition order. However, the order of parameters within the parameters collection isn’t significant with either named or positional arguments because the parameters collection is mapped to the parameters within the SQL text by name with SqlClient.

Below is a named parameter example. The stored procedure parameter name and the client parameter name are the same in this example.

Note that the EXECUTE keyword could have be omitted in this example because SQL Server assumes the first statement of a batch is EXECUTE. I recommend the EXECUTE be explicitly specified even when the first statement in a batch.

Parameters with ADO.NET System.Data.OleDb

Again, .NET applications should always use a managed provider when one is available (SqlClient in the case of SQL Server). That being said, I use unmanaged SQL Server drivers in these examples to illustrate how queries can be parameterized in ADO.NET for other DBMS products that don’t have a managed .NET provider. The only difference would be the connection string.

Here are the equivalent OLE DB parameterized queries in ADO.NET using the latest SQL Server OLE DB provider. The differences compared to same SqlClient query are the connection string, question mark parameter tokens, and the System.Data.OleDb objects.

Parameters with ADO.NET System.Data.Odbc

Below is the ODBC parameterized query in ADO.NET. The only difference compared to the earlier OLE DB version is the connection string and System.Data.Odbc namespace objects. Although this example uses a DSN-less connection string, one could also specify an ODBC DSN in the connection string (e.g. “DSN=YourDSN”) and omit the attributes that are redundant with the DSN configuration.

Similarly, here is the ODBC version of the same previous stored procedure calls. However, with ODBC and CommandType.StoredProcedure, the command text specifies an ODBC procedure call escape sequence with parameter marker(s) rather than just the stored procedure name.

Parameters with ADO Classic

ADO “Classic” (not to be confused with ADO.NET) is a mature legacy technology that hasn’t been enhanced for about 20 years. An unmanaged COM-based API, ADO classic is still used today in classic ASP web applications, VBA (Visual Basic for Applications), legacy VB 6 code, and VB.NET code that was converted from VB 6 without being refactored to use modern managed ADO.NET APIs.

It is best to use an OLE DB driver with ADO classic because these unmanaged APIs natively use COM-based technologies. That being said, one can use ODBC (which is ultimately a call-level interface rather than COM) with ADO via the Microsoft OLE DB Provider for ODBC drivers (MSDASQL, which is the default OLE DB provider when no provider is specified). This is needed when no OLE DB provider available for a non SQL Server DBMS but should otherwise be avoided since ADO must go through the extra MSDASQL layer. A SQL Server OLE DB provider/driver (i.e. SQLOLEDB, SQLNCLI, and MSOLEDBSQL) should be used directly to access SQL Server from ADO classic.

Regardless of the underlying driver, ADO programming and parameterization is similar with either OLE DB or ODBC. The ADO classic object model is different than its ADO.NET cousin but parameterization concepts and techniques are similar. I’ve included C# examples here only to illustrate the ADO parameter object model differences compared to the previous ADO.NET examples. In practice, one would not typically use ADO classic in managed code.

Parameterized ADO query example:

Parameterized ADO proc call example:

ADO Parameters with VBScript

Below is an example stand-alone VBScript example of the query and proc call.

ADO Parameters with ASP classic

The ADO code for classic ASP is the same as the preceding stand-alone VBScript example except that the objects are instantiated using the IIS Server.CreateObject method:

Parameters Under the Covers

The job of all SQL Server client APIs regardless of programming language is to manage the Tabular Data Stream (TDS) protocol session used for SQL Server communication. This involves establishing the session (network connection and authentication), converting application requests into internal TDS request messages (e.g. batch and RPC requests), sending those to SQL Server, and receiving the results returned by SQL Server over TDS. Although details of the TDS protocol session are an implementation detail transparent to applications, delving into the way TDS requests are sent to the database engine provides valuable insight into the concepts, benefits, and use of parameters in the SQL Server world.

When a non-parameterized query is executed (i.e. an empty parameter collection and CommandType.Text in ADO.NET), the client API sends the command text to SQL Server as a SQL batch request over TDS. There is no provision for parameters in a SQL batch request. The database engine parses the batch text (which may contain multiple statements), generates an execution plan, and executes the batch. If the batch text (including literal values) exactly matches that of a previously executed query (or an auto-parameterized query) that is still cached, the plan is reused. Because a batch request has no parameters, the application must include T-SQL literals within the text for values that vary by execution. This requires the app code to build the query by concatenating SQL language text with literal strings and for SQL Server to parse those literals.

The onus is on the application to construct literal strings embedded within the SQL batch text properly so that they are parsed as desired by SQL Server. This necessitates enclosing literal values in single quotes when required and formatting the string value according to the data type it represents. The app code must double up single quotes within string literal values so that it’s interpreted as a quote instead of literal enclosure. Date/time literals should be formatted as ISO 8601 literal values to ensure the intended value is used regardless of session the DATEFORMAT setting, a common mistake with non-parameterized queries. Decimal literals must specify the ANSI/ISO standard period as the decimal separator instead of a comma used by some cultures. Source values must be validated to ensure adherence to the intended data type and not interpreted as SQL language elements (injecting unintended SQL).

Parameterized queries (e.g. non-empty parameters collection or CommandType.StoredProcedure) are processed differently. The client API sends the command to SQL Server as a remote procedure call (RPC batch) TDS request instead of a SQL batch request. This RPC batch request includes the batch text with the supplied parameter tokens (e.g. CommandType.Text) or stored procedure (e.g. CommandType.StoredProcedure). The actual parameters, containing parameter definition meta-data and values as specified by the client application, are included in separate structures as part the TDS RPC request stream sent to SQL Server. Most importantly, the actual parameter values as passed as native data types as specified by the client application. This has a many positive implications.

The database engine need not parse parameter values since they are already in the native structure of the specified parameter data type, such as a 32-bit integer, datetime structure, decimal structure, Unicode string, etc. SQL Server only checks that parameter values adhere to the TDS protocol spec and then uses the supplied parameter values directly without parsing. There is no notion of datetime formatting, string escape sequences, decimal separators, etc. because no parsed occurs. Unwanted SQL cannot be injected via parameters (barring server-side dynamic SQL) because the values are not part of a T-SQL statement.

I’ll add that the query text you see in a server-side trace of an RPC request is not necessarily what SQL Server actually received by the client API. The query text in trace data is actually a reversed-engineered rendering of the TDS protocol request in familiar T-SQL format. For example, when you pass a table-valued parameter, you’ll see an RPC batch request with batch text containing table variable declared, a bunch of INSERT statements into the variable, followed by it’s use as a parameter in the query or stored procedure. In actuality, the TVP column meta-data tokens were passed over TDS followed by the stream of TVP rows. The database engine bulk-inserted the TVP data into a tempdb structure matching the column meta-data, not individual inserts as the trace suggested.

Similarly, scalar parameter values will show as literal representations in the trace instead of the actual binary structure passed. For example, a datetime value will show as a ‘yyyy-MM-dd HH:mm:ss.fff’ literal (not an unambiguous ISO 8601 datetime format) rather than the actual binary datetime structure passed to SQL Server.

Much of this is implementation details but something you should be aware of when interpreting RPC requests in traces.

Best Practices

Parameter definitions should specify the desired data type, maximum length, precision, and scale of each parameter. Attention to these parameter definition details will help maximize performance by avoiding implicit conversions, promoting sargable expressions, and reusable cached plans.

A common problem (i.e. mistake) is using the AddWithValue method to create parameters. AddWithValue infers the data type and length based on the client variable value and may result in non-sargable expressions and cache bloat as I detailed in this post. The bottom line is that one should use a method that allows specification of the desired SqlDbType, length, etc. such as Add.

Security with Ownership Chains

Security with Ownership Chains


Ownership chaining is one of my favorite SQL Server security features.  I like ownership chaining because it allows me to confine data access to specific stored procedures, views and functions while preventing direct access to the underlying tables.  This approach provides a well-defined application interface that insulates applications from the physical implementation and allows me to change things on the back-end as long as I don’t change the interface.  Also, thanks to ownership chaining, I have many options to vertically and horizontally partition data to meet security objectives while ensuring tables are not accessed directly.  

Ignorance is not bliss when it comes to ownership chaining security, though.  I’ve seen cases where data access was inadvertently allowed because ownership chaining wasn’t considered.  It is imperative that SQL Server DBAs understand ownership chaining in the SQL Server security model so that data is properly secured.  I want to point out a couple of important security concepts as it relates to ownership chaining with the hope that it will help SQL Server DBAs implement a security scheme that is appropriate for their environment and also easy to manage.

Ownership Chain Behavior

SQL Server always checks permissions on directly referenced objects using the current connection security context.  However, permissions are evaluated differently when objects are accessed indirectly, such as those used in a trigger, view or stored procedure.  The behavior of ownership chaining is that object permissions are not evaluated when the calling and referenced object owners are the same.  Also, because object permission checking is short-circuited in an unbroken chain, ownership chaining takes precedence over an impersonated security context (EXECUTE AS clause) as well as denied object permissions.   Consequently, permissions on indirectly referenced objects are irrelevant in an unbroken ownership chain.  This ownership chaining behavior gives a DBA control over exactly who can use which objects while preventing ad-hoc access to other objects.

Ownership chaining behavior might not be intuitive at first.  I like to think of it from the perspective that the purpose of granting object permissions is so the object can actually be used.  For example, it is of no use to grant a user SELECT permissions on a view unless data can be retrieved from the view.  The same applies even when a stored procedure or function impersonates with EXECUTE AS; the impersonated principal must be able to use objects referenced by the proc in order for the procedure to be useful.  I keep in mind that with ownership chaining, I effectively grant permissions on exposed object functionality regardless of how the functionality is implemented.

Ownership chaining is limited to a single owner in order to prevent escalation of privileges in environments where non-privileged users can create objects.  Ownership chaining also does not apply to dynamic SQL executed from a stored procedure; the invoking (or impersonated) user needs permissions on objects referenced by dynamic SQL statements.  I should add that a one of the useful purposes of EXECUTE AS and signing modules with certificates is to provide needed permissions in a broken ownership chain or granting statement permissions without granting user permissions directly.  See SQL Server MVP Erland Sommarskog’s article Giving Permissions through Stored Procedures for a discussion of these techniques.

Ownership Best Practices

Ownership is obviously central to ownership chaining.  In the vast majority of environments, it’s probably best to simply inherit the owner (a.k.a. AUTHORIZATION) from the object’s schema instead of overriding on individual objects.  It is possible to assign ownership to individual objects (using ALTER AUTHORIZATION) in order to implement chaining that is independent of schema but I think this is overly complex for most environments.  One reason I can think of why one might specify a different owner at the object level is in the case where a table contains particularly sensitive data and one wants to deliberately prevent inadvertent access via ownership chaining.  I think such cases are rare, though. 

I described some of the practices I follow for schema and object ownership in Keep Schema and Ownership Simple.  In summary, I specify dbo as the schema authorization (owner) unless I want to break the ownership chain between schemas to create a security boundary.  When a security boundary is desired, I create a role (or user without login) with the same name as the schema for the schema authorization.

Keep Schema and Ownership Simple

I like to keep things simple because simplicity is easier to manage and less prone to error.  When I’m faced with schema design decisions, I pick the selection with the least complexity that meets my objectives.  Here are some of my thoughts regarding schema and ownership in SQL 2005.


A schema is basically a container that categorizes database objects and simplifies security administration.  As a namespace, schemas logically organize objects without the need for special object naming rules.  Different objects can have the same name as long as they exist in different schemas because the schema name is essentially an extension of the object name that will “uniqueify” the name within a database. 

Categorizing objects by schema is particularly useful in complex databases with many objects.  There is some subjectivity on exactly how one might draw schema boundaries but the basic concept is the same; group related objects into different schema to provide organization to an otherwise unwieldy schema.  Classifying related objects by schema makes complex databases easier to understand and manage.

Schema also simplifies security administration because permissions can be granted en mass at the schema level.  For example, I can grant EXECUTE permissions on all objects in a schema with a single statement like “GRANT EXECUTE ON SCHEMA::Sales TO SalesRole”.  I can grant CONTROL permissions on a schema to allow privileged users to full control over a specific schema but not others in the same database.

Even with the option to use multiple schemas, I tend to use the built-in dbo schema.  I do this because most of the applications I maintain were developed before SQL 2005 and all objects are already in the dbo schema.  Some of those legacy systems could benefit from multiple schemas but I’ll continue to use dbo for those applications to be consistent until I need to add a group of new objects that are appropriate for a separate schema.  The new SQL 2005 databases I’ve developed thus far have been fairly simple and haven’t warranted using multiple schemas for either classification or security purposes.


The owner is important for two main reasons:  1) the owner has powerful CONTROL permissions over all owned objects and 2) the owner determines whether or not the ownership chain is broken between objects.  Schema-contained objects will inherit the schema owner unless explicitly overridden using ALTER AUTHORIZATION.  Personally, I think it best for objects to inherit the schema owner in the vast majority of cases; if an object warrants a different owner than the schema, the object probably belongs in a different schema.

I use the built-in dbo principal for ownership unless I have a reason to do otherwise.  This approach is perfect in environments where only db-owner role members can create objects and schemas are used solely as a namespace rather than a security boundary.  The dbo principal exists in all databases so there is no a need to create a user or role for ownership purposes.  Simple is good.

Different schema owners provide a security boundary between objects in different schema because this breaks the ownership chain.  With a broken chain, explicit permissions on indirectly referenced objects are needed by the end user or the impersonated principal.  Different schema owners ensure that I don’t inadvertently provide access to data in different schema via ownership chaining.

Note that an owner can be any database principal and does not necessarily need to be associated with a login.  I find this feature especially handy in situations where I want to specify an owner other than dbo.  Since I can specify an owner that is not a real person, I won’t need to change ownership if the owner leaves the company or moves on to other roles in the organization. 

It’s probably best to create a principal (role or a user without login) with the same name as the schema for ownership purposes when different owners are desired.  The only case I can think of where it might be appropriate for a real person to own a schema (at least initially) is in a development environment when non-dbo users might create schemas and objects.