Bulk Load Batch Size Considerations in SQL Server 2016

Bulk load has long been the fastest way to mass insert rows into a SQL Server table, providing orders of magnitude better performance compared to traditional INSERTs. SQL Server database engine bulk load capabilities are leveraged by T-SQL BULK INSERT, INSERT…SELECT, and MERGE statements as well as by SQL Server client APIs like ODBC, OLE DB, ADO.NET, and JDBC. SQL Server tools like BCP and components like SSIS leverage these client APIs to optimize insert performance.

SQL Server 2016 and later improves performance further by turning on bulk load context and minimal logging by default when bulk loading into SIMPLE and BULK LOGGED recovery model databases, which previously required turning on trace flags as detailed in this blog post by Parikshit Savjani of the MSSQL Tiger team. That post also includes links to other great resources that thoroughly cover minimal logging and data loading performance, which I recommend you peruse if you use bulk load often. I won’t repeat all that information here but do want to call attention to the fact that these new bulk load optimizations can result in much more unused space when a small batch size is used compared to SQL Server 2014 and older versions.

Bulk Load Batch Size Implications
An important consideration in SQL 2016 and later with bulk load context and minimal logging is that each batch allocates new extent(s) (8 contiguous 8K pages, 64K) rather than using existing unused pages in existing extents. This improves concurrency and space allocation performance but possibly with the cost of significantly higher unused space than previous versions when few rows are loaded per batch.

The optimal batch size for bulk load operations involves trade-offs between throughput, data space utilization, concurrency, and transaction log space (FULL recovery model). To avoid excessive unused space with bulk load context, adjust the batch size when possible such that data are loaded in multiples of the 64K extent size.

There are cases when one must load with small batches, such as when loading small files. One solution to mitigate unused space in this scenario is to not use TABLOCK so that bulk load context isn’t used. This isn’t much of a performance concern with small batches anyway. Another method us to turn on trace flag 692 to disable the default bulk load context in SQL Server 2016, effectively reverting to pre-SQL 2016 behavior.

Don’t confuse batch size with the rows per batch hint. The rows per batch hint is used by the SQL Server optimizer to help optimize the load process because the number of rows that will be loaded is otherwise unknown, defaulting to an estimate of 10,000 rows.

SQL Server 2016 SP1 Standard Edition Enhancements

I seldom get excited about service packs but the changes released with SQL Server 2016 SP1 are the most significant I’ve seen in a SQL Server service pack in 20+ years. Microsoft announced this week at the Microsoft Connect(); developer’s conference that SQL Server 2016 SP1, which is available for download immediately, allows features previously available only in Enterprise/Developer Editions to be used in lessor Standard, Web, Express, and LocalDB Editions too. Features like table partitioning, In-Memory OLTP, and columnstore are now options for developers and DBAs using SQL Server Standard Edition and even the free Express Edition in production. See SQL Server 2016 Service Pack 1 (SP1) released !!! for the complete matrix of programmability features by edition along with other cool SP1 information.

The implications are huge now that SQL Server has the same programmability surface area among editions. The choice of the production edition can be made independently based on operational needs rather than programmability features. Developers can use a free edition (i.e. LocalDB, Express or Developer) without fear a feature won’t be available in production as long as prod is running SQL Server 2016 SP1 or greater. DBAs can now choose the appropriate edition for production based on other considerations like advanced high availability, TDE, Auditing as well as performance features like higher supported memory, more number of cores, and advanced scanning. This separation of concerns avoids the need to lock in the production edition early in the application lifecycle, making development easier and production implementation more flexible.

Real World Use Case Scenario
I work with an ISV with hundreds of customers running a mix of Standard and Enterprise Edition. Their needs vary widely and SQL Server Enterprise Edition is not an option for some due to budget constraints. Some tables are often quite large so partitioning is required for manageability and, for their reporting workload, partitioning also improves performance of large scans due to partition elimination. The ugliness though, is that table partitioning (and/or columnstore) is the right tool for the job but was not an option for customers on Standard Edition.

The ISV initially compromised and used view partitioning instead of table partitioning so that the same code would run regardless of edition. Although that provided the expected manageability benefits, there were some downsides. Compilation time increased significantly as the number of partitioned view member tables increased as did the query plan complexity. This sometimes resulted in poor query plans against very large tables and especially impacted larger and most valued customers, most of which were running Enterprise Edition.

To address the problem before SQL Server 2016 SP1, the ISV added conditional code to the application so that either view or table partitioning could be used depending on the SQL Server edition. This wasn’t ideal as it added code complexity and doubled the number of QA test cases for application features that performed partition maintenance. However, since the resultant benefits for their larger customers on Enterprise Edition were quite significant; the additional costs of development and testing were well-justified.

Now that table partitioning is available in SQL Server 2016 SP1 Standard Edition, they plan to require SQL Server 2016 SP1 (or later) going forward, use table partitioning unconditionally, and perhaps introduce usage of other features like columnstore that were previously Enterprise only. Not only will this simplify the code base and test cases, customers on Standard Edition will be happier with their experience and can upgrade to Enterprise if they so choose without reinstalling or reconfiguring the application. It will of course take some time before all their customers upgrade to the latest product version and SQL 2016 SP1+ but the future is much brighter now.

Perform Due Diligence
If you are new to features previously available only in Enterprise Edition, I suggest you perform due diligence before using these features. Memory-optimized features like columnstore and In-Memory OLTP require additional physical memory and insufficient memory with memory-optimized features will be a production show-stopper. Make sure your hardware is sized appropriately regardless of edition and, in the case of editions other than Enterprise or Developer, memory requirements don’t exceed the maximum capacity limits for that edition. Although very powerful, In-Memory OLTP is a fundamentally different paradigm that you might be accustomed to regarding transactional behavior and isolation levels. Be sure you fully understand these features before using it in development or production.

I hope these changes are enough motivation for you to consider upgrading to SQL Server 2016 SP1, especially if you are running Standard Edition or are currently on an older SQL Server version. Together with the fact that SQL Server 2016 just runs faster, the time and effort spend in upgrading is a solid investment that will pay dividends regardless of edition.

SQL Saturday – St. Louis 2016

Please join us for SQL Saturday, St. Louis 2016. We have 25 sessions scheduled covering a wide range of topics on the Microsoft Data Platform stack.

As in the past few years, the event will be held at the SLU Center for Workforce & Organizational Development, 3545 Lindell Blvd , Saint Louis, Missouri 63103. Visit the event home page for registration, directions and session details.

See you there!