Table Partitioning Best Practices

SQL Server table partitioning has a number of gotchas without proper planning.  This article demonstrates those that commonly cause grief and recommends best practices to avoid them.

Implications of the Partition Function Range Specification

One needs a good understanding of how the RANGE LEFT/RIGHT specification affects partition setup and management.  The RANGE specification determines:

  • The partition created by SPLIT
  • The partition removed with MERGE
  • The permanent partition that can never be removed from partition schemes

Below are nuances of the RANGE specification that commonly surprise people.

The Partition Created by SPLIT

New partitions are created by splitting a partition function.  A partition function SPLIT splits an existing partition into 2 separate ones, changing all of the underlying partition schemes, tables, and indexes.  Below are the actions performed when a LEFT or RIGHT partition is SPLIT, with important the differences in bold:

The actions performed by a SPLIT of a RANGE LEFT partition function:

  • Identify existing partition to be split, which is the one that contains the new boundary (or the last partition if no existing boundaries are higher than the one being added)
  • Add the new boundary to the partition function, maintaining boundary order and incrementing subsequent partition numbers
  • Create a new partition to the left of the existing one on the NEXT USED filegroup of each partition scheme that uses the function
  • For each table/index using the affected partition scheme(s), move rows from the existing split partition that are less than or equal to the new boundary into the newly created partition on the left

The actions performed by a SPLIT of a RANGE RIGHT partition function:

  • Identify existing partition to be split, which is the one that contains the new boundary (or the first partition if no existing boundaries are less than the one being added)
  • Add the new boundary to the partition function, maintaining boundary order and incrementing subsequent partition numbers
  • Create a new partition to the right of the existing one on the NEXT USED filegroup for each partition scheme that uses the function
  • For each table/index using the affected partition scheme(s), move rows from the existing split partition that are greater than or equal to the new boundary into the newly create partition on the right

I generally suggest one use RANGE RIGHT instead because it is more natural, and helps avoid common pitfalls when adding incremental partition boundaries.  RANGE LEFT behavior is not intuitive and trips up many DBAs but I will discuss for completeness.  Consider this example of a RANGE LEFT partition function on a datetime column, where the initial setup is for 2 years of data (2013 and 2014):

This initial setup results in 2 partitions and data properly mapped to the 2 yearly filegroups of the scheme.  Now, we need to prepare for year 2015 so we add a new filegroup, set the NEXT USED filegroup, SPLIT the function for year 2015, and insert data for 2015:

This SPLIT results in adding the new boundary to the function as expected:

However, the partition scheme is not as desired.  The resultant partition scheme is:

As you can see, the filegroup for 2015 was inserted into the scheme before the 2014 one, resulting in 2014 data in the FG_2015 filegroup and 2015 data in the FG_2014 filegroup.  Furthermore, not only is the filegroup mapping now wrong, data movement was required to move the entire year of 2014 data into the new partition.  No big deal here since only 3 rows were moved by this demo script but in a production table, this movement could be a show stopper.  Logging during SWITCH/MERGE data movement during is about 4 times that of normal DML, which is especially costly when working with large tables containing millions or billions of rows as is commony used in table partitioning.

Remember that the new partition by a SPLIT is the one that includes the specified boundary, which is to the left of the existing split partition with a RANGE LEFT function (the 2014 partition here).  Data are moved from the existing SPLIT partition into the newly created partition according to the boundaries of the new partition (greater than ‘2013-12-31T23:59:59.997’ and less than or equal to ‘2014-12-31T23:59:59.997’).

Again, I recommend using a RANGE RIGHT function to avoid this non-intuitive behavior.  Below is the equivalent RANGE RIGHT script for yearly partitions, which results in the desired filegroup mappings as well as no data movement.  Also, note that the datetime boundaries are exact date specifications with RANGE RIGHT, which is also more intuitive when working with temporal datetime, datetime2, and datetimeoffset data types that include a time component.

The resultant partition function and scheme definitions are as desired with this RANGE RIGHT partition function:

The Partition Removed by MERGE

When a partition is removed with MERGE, the dropped partition is the one that includes the specified boundary.  If the dropped partition is not empty, all data will be moved into the adjacent remaining partition.  Like SPLIT, costly data movement during partition maintenance should be avoided so it is best to plan such than only empty partitions are removed.  A MERGE should typically done after a purge/archive of data using SWITCH.

Below is a summary of a RANGE LEFT partition function MERGE:

  • Identify existing partition to be removed, which is the one that includes (to the left of) the specified existing boundary
  • For each table/index using the affected partition scheme(s), move rows from this partition into the adjacent right partition
  • Remove boundary from the partition function, maintaining boundary order and decrementing subsequent partition numbers
  • Remove the filegroup from each partition scheme that uses the function, unless the filegroup is mapped to another partition or is the NEXT USED filegroup

Below is a summary of a RANGE RIGHT partition function MERGE:

  • Identify existing partition to be removed, which is the one that includes (to the right of) the specified existing boundary
  • For each table/index using the affected partition scheme(s), move rows from this partition into the adjacent left partition
  • Remove boundary from the partition function, maintaining boundary order and decrementing subsequent partition numbers
  • Remove the filegroup from each partition scheme that uses the function, unless the filegroup is mapped to another partition or is the NEXT USED filegroup

I suggest explicit partition boundaries for expected data ranges to facilitate using both MERGE and SPLIT, and an additional one for the permanent partition (which I’ll detail shortly).  This practice helps ensure data are both logically and physically aligned, providing more natural partition management.

The Permanent Partition

You might not be aware that each partition scheme has a permanent partition that can never be removed.  This is the first partition of a RANGE RIGHT function and the last partition of a RANGE LEFT one.  Be mindful of this permanent partition when creating a new partition scheme when multiple filegroups are involved because the filegroup on which this permanent partition is created is determined when the partition scheme is created and cannot be removed from the scheme.

My recommendation is that one create explicit partition boundaries for all expected data ranges plus a lower and upper boundary for data outside the expected range, and map these partitions to appropriately named filegroups.  This practice will make the purpose of each partition/filegroup clear and help avoid accidentally placing data on the wrong filegroup.

Consider mapping partitions containing data outside the expected range to a dummy filegroup with no underlying files.  This will guarantee data integrity much like a check constraint because data outside the allowable range cannot be inserted.  If you must accommodate errant data rather than rejecting it outright, instead map these partitions to a generalized filegroup like DEFAULT or one designated specifically for that purpose.

I suggest specifying a NULL value for the first boundary of a RANGE RIGHT partition function.  This NULL boundary serves as the upper boundary of the permanent first partition as well as the lower boundary for the second partition containing data outside the expected range.  No rows are less than NULL so the first partition will always be empty.  It is therefore safe to map the first partition to the previously mentioned dummy filegroup even if you need to house data outside the expected range.  That being said, there is no harm in mapping the first partition to another filegroup other than lack of clarity.

For the last boundary of a RANGE RIGHT function, I suggest specifying the lowest value outside the expected range and also mapping the partition to either the dummy filegroup, or one designated to contain unexpected data.  The boundaries between the first boundary (NULL) and this one designate partitions for expected data.

Summary

In summary, I recommend a RANGE RIGHT function with the following setup:

  • First boundary value NULL
  • Subsequent boundary values for expected data partitions
  • A final boundary value of greater than the expected range
  • Map first, second, and last partitions to either a dummy filegroup or one designated for unexpected data
  • Map remaining expected data partitions to appropriately named filegroups

A similar RANGE LEFT function can be setup as follows.  I’m including this for only for completeness as the RANGE RIGHT setup above is a best practice, in my humble opinion.

  • First boundary for data less than the expected range
  • Subsequent boundaries for expected data partitions
  • A final boundary value of the maximum allowable value for the partitioning data type (which is another kludge that bolsters the case for RANGE RIGHT)
  • Map first, second from last, and last partitions to either a dummy filegroup or one designated for unexpected data
  • Map remaining expected data partitions to appropriately named filegroups

Below is an example script of applying these techniques with a RANGE RIGHT function, including adding an incremental partition for a new year.  Partitions for data outside the expected data range are mapped to the FG_NoData filegroup (which contains no files) so any attempt to insert data outside the expected range will fail.