Remediating a Table Partitioning Mess

Partitioning often involves exceptionally large tables, which are quite unforgiving when it comes to remediating a mess due to improper partition design or maintenance.  In this post, I’ll discuss the common design mistake of using a RANGE LEFT partition function containing exact date boundaries with a date type that may include a time component (datetime, datetime2 or datetimeoffset) and show how to change it to RANGE RIGHT with minimal data movement.

Background
The most common use case for table partitioning is an incremental value.  Partitioning on a date, datetime, datetime2 or datetimeoffset value that occurs naturally in the data allows incremental load using SPLIT and SWITCH operations and purge/archive using SWITCH and MERGE.  SWITCH is nearly instantaneous regardless of the amount of underlying data involved because only meta-data changes are needed to move partitions from one table to another.  SPLIT and MERGE are also very fast meta-data only operations as long as no data movement is required.  For this reason, it is critical to setup boundaries correctly and plan as to perform SPLIT and MERGE against empty partitions whenever possible.

Note that RANGE LEFT or RANGE RIGHT specification determines:

  • The partition that contains exact boundary value matches
  • The partition is created by SPLIT
  • The partition is removed by MERGE

A common design mistake is using exact date boundaries in a RANGE LEFT function with datetime, datetime2 or datetimeoffset data type.  This error can result in rows in the wrong partition because these data types include a time component and the implications are not obvious, especially to partitioning newbies.  Rows that exactly match the date boundary (i.e. midnight on boundary date) are inserted into the lower partition instead of the higher one as desired.  For example, consider a RANGE LEFT function is a monthly sliding window scenario and boundaries of ‘2013-01-01T00:00:00’ and ‘2013-02-01T00:00:00’:

CREATE PARTITION FUNCTION PF_Monthly(datetime2(0))
    AS RANGE LEFT FOR VALUES('2013-01-01T00:00:00', '2013-02-01T00:00:00');

CREATE PARTITION SCHEME PS_Monthly AS
    PARTITION PF_Monthly TO (December2012, January2013, February2013);

This setup looks reasonable at first glance but consider the actual partitions when a table is created on the partition scheme:

  • Partition 1 (December2012 filegroup):  <= ‘2013-01-01T00:00:00’
  • Partition 2 (January2013 filegroup):  > ‘2013-01-01T00:00:00’ AND <= ‘2013-01-02T00:00:00’
  • Partition 3 (February2013 filegroup):  > ‘2013-02-01T00:00:00’

A row with the January datetime of ‘2013-01-01T00:00:00’ will end up in the first partition (December2012 filegroup).  Sliding window maintenance, which purges the oldest month (December2012), will include the ‘2013-01-01T00:00:00’ row(s) inadvertently.  Similarly, a row with value ‘2013-02-01T00:00:00’ will end up in the second partition (January2013 fielgroup) and require both the second and third partitions to be accessed when data from all of February are queried.

Remediation
There are two ways to address this problem.  One is to change the boundary values of the existing RANGE LEFT function to the maximum inclusive value instead of an exact date. For example, instead of ‘2013-01-01T00:00:00’ for January, 2013, change the boundary values to the maximum value according to the partition function data type as follows:

  • *datetime: ‘2013-01-31T23:59:59.997’
  • datetime2(0): ‘2013-01-31T23:59:59’
  • datetime2(1): ‘2013-01-31T23:59:59.9’
  • datetime2(2): ‘2013-01-31T23:59:59.99’
  • datetime2(3): ‘2013-01-31T23:59:59.999’
  • datetime2(4): ‘2013-01-31T23:59:59.9999’
  • datetime2(5): ‘2013-01-31T23:59:59.99999’
  • datetime2(6): ‘2013-01-31T23:59:59.999999’
  • datetime2(7): ‘2013-01-31T23:59:59.9999999’
  • **datetimeoffset (0): ‘2013-01-31T23:59:59-05:00’’
  • **datetimeoffset (1): ‘2013-01-31T23:59:59.9-05:00’’
  • **datetimeoffset (2): ‘2013-01-31T23:59:59.99-05:00’’
  • **datetimeoffset (3): ‘2013-01-31T23:59:59.999-05:00’’
  • **datetimeoffset (4): ‘2013-01-31T23:59:59.9999-05:00’’
  • **datetimeoffset (5): ‘2013-01-31T23:59:59.99999-05:00’’
  • **datetimeoffset (6): ‘2013-01-31T23:59:59.999999-05:00’’
  • **datetimeoffset(7): ‘2013-01-31T23:59:59.9999999-05:00’’

*Importantly, one needs to specify 997 as the milliseconds value for datetime because a higher millisecond value will be rounded to the next exact date (e.g. ‘2013-02-01T00:00:00.000’)

**The actual offset for datetimeoffset partitioning depends on your environment but is typically the same for each boundary.

Rather than changing the boundary specification, I suggest instead changing the function from LEFT to RIGHT and continue using an exact date boundary.  The inclusive date boundary is more intuitive (in my opinion) and allows the same date boundaries to be used regardless of data type.  The same amount of work is required for a mass change of existing boundaries as is required to change from RANGE LEFT to RANGE RIGHT.

The script below converts a table partitioned using a RANGE LEFT function with exact date boundaries to one partitioned RANGE RIGHT as follows:

  1. create a staging table exactly like the source table and partitioned using the same scheme
  2. create a non-partitioned holding table with the same schema source but without constraints and indexes
  3. create a new RANGE RIGHT function with the same exact date boundaries as the existing RANGE LEFT scheme
  4. create a new partition scheme with the same filegroups as the source table partition scheme but based on the new RANGE RIGHT function
  5. create a new table exactly like the source but partitioned using the new RANGE RIGHT scheme
  6. for each existing partition
    1. switch source partition to staging table
    2. move rows with exact boundary matches to holding table
    3. create check constraint on staging table matching the target right boundaries
    4. switch staging partition to new table
    5. drop check constraint
    6. after all partitions are switched to the new table, insert exact boundary matching rows to new table
    7. drop the old table and staging tables
    8. rename the new table, constraints and indexes to the old names

Since this script uses SWITCH to accomplish the bulk of the work, the table must be aligned; data and indexes must be partitioned using the same (or functionally identical) partition function.  The SWITCH operations are nearly instantaneous and the only data movement required is moving rows that exactly match the boundary values to/from the holding table, which shouldn’t take long unless you have a lot of matching rows.  The long pole in the tent from a performance perspective is likely creating the needed check constraint on the staging table because all rows must be scanned.  However, this method is still considerably faster and requires less space than repartitioning the table by recreating indexes using the new partition scheme.

Here’s the setup script for the problem table and sample data:

--existing RANGE LEFT function with exact date boundaries
CREATE PARTITION FUNCTION PF_Left(datetime)
       AS RANGE LEFT FOR VALUES('20130101','20130201');
--existing scheme based on RANGE LEFT
CREATE PARTITION SCHEME PS_Left
       AS PARTITION PF_Left ALL TO ([DEFAULT]);
--existing table, constraint and index partitioned with RANGE LEFT scheme
CREATE TABLE dbo.OldTable(
       PartitioningColumn datetime NOT NULL
       ,DataColumn int NOT NULL
       ) ON PS_Left(PartitioningColumn);
ALTER TABLE dbo.OldTable
	ADD CONSTRAINT PK_OldTable PRIMARY KEY CLUSTERED (PartitioningColumn, DataColumn) 
	ON PS_Left(PartitioningColumn);
CREATE INDEX idx_OldTable_DataColumn ON dbo.OldTable(DataColumn) 
	ON PS_Left(PartitioningColumn);
--sample data
INSERT INTO dbo.OldTable VALUES
       ('20121231',1)
       ,('20130101',1)
       ,('20130102',2)
       ,('20130103',5)
       ,('20130131',2)
       ,('20130201',4)
       ,('20130202',3);
GO

And the remediation script to change to RANGE RIGHT:

--new RANGE RIGHT function with same boundaries as existing RANGE LEFT function
CREATE PARTITION FUNCTION PF_Right(datetime)
       AS RANGE RIGHT FOR VALUES('20130101','20130201');

--new scheme based on RANGE RIGHT with same filegroups as RANGE LEFT scheme
CREATE PARTITION SCHEME PS_Right
       AS PARTITION PF_Right ALL TO ([DEFAULT]);

--new staging table exactly line existing source table
CREATE TABLE dbo.OldTableStaging(
       PartitioningColumn datetime NOT NULL
	   ,DataColumn int NOT NULL
       ) ON PS_Left(PartitioningColumn);
ALTER TABLE dbo.OldTableStaging
	ADD CONSTRAINT PK_OldTableStaging PRIMARY KEY CLUSTERED (PartitioningColumn, DataColumn) 
	ON PS_Left(PartitioningColumn);
CREATE INDEX idx_OldTableStaging_DataColumn ON dbo.OldTableStaging(DataColumn) 
	ON PS_Left(PartitioningColumn);

--new holding table for exact boundary matches (no constraints or indexes needed)
CREATE TABLE dbo.OldTableStagingExactBoundaryMatches(
       PartitioningColumn datetime NOT NULL
	   ,DataColumn int NOT NULL
       ) ON [DEFAULT];

--new table exactly like existing table but partitioned with RANGE RIGHT scheme
CREATE TABLE dbo.NewTable(
       PartitioningColumn datetime NOT NULL
	   ,DataColumn int NOT NULL
       ) ON PS_Right(PartitioningColumn);
ALTER TABLE dbo.NewTable
	ADD CONSTRAINT PK_NewTable PRIMARY KEY CLUSTERED (PartitioningColumn, DataColumn) 
	ON PS_Right(PartitioningColumn);
CREATE INDEX idx_NewTable_DataColumn ON dbo.NewTable(DataColumn) 
	ON PS_Right(PartitioningColumn);
GO

--script to convert RANGE LEFT to RANGE RIGHT
SET NOCOUNT ON;
DECLARE 
       @LowerBoundary datetime
       ,@UpperBoundary datetime
       ,@PartitionNumber int
       ,@Sql nvarchar(MAX)
       ,@PartitionFunctionName sysname = N'PF_Right';

DECLARE PartitionBoundaries CURSOR LOCAL FAST_FORWARD FOR
WITH 
       PartitionNumbers AS (
              SELECT 1 AS PartitionNumber
              UNION ALL
              SELECT boundary.boundary_id + 1
              FROM sys.partition_functions AS pf
              JOIN sys.partition_range_values AS boundary ON pf.function_id = boundary.function_id
              WHERE pf.name = @PartitionFunctionName
              )
       ,PartitionBoundaries AS (
              SELECT
                     (SELECT CAST(boundary.value AS datetime)
                     FROM sys.partition_functions AS pf
                     JOIN sys.partition_range_values AS boundary ON pf.function_id = boundary.function_id
                     WHERE
                           pf.name = @PartitionFunctionName
                           AND boundary.boundary_id = PartitionNumbers.PartitionNumber - 1
                     ) AS LowerBoundary
                     ,(SELECT CAST(boundary.value AS datetime)
                     FROM sys.partition_functions AS pf
                     JOIN sys.partition_range_values AS boundary ON pf.function_id = boundary.function_id
                     WHERE
                           pf.name = @PartitionFunctionName
                           AND boundary.boundary_id = PartitionNumbers.PartitionNumber
                     ) AS UpperBoundary
                     ,PartitionNumbers.PartitionNumber
              FROM PartitionNumbers
              )
SELECT LowerBoundary, UpperBoundary, PartitionNumber
FROM PartitionBoundaries
ORDER BY PartitionNumber;

OPEN PartitionBoundaries;
WHILE 1 = 1
BEGIN
       --get next partition boundary
       FETCH NEXT FROM PartitionBoundaries INTO @LowerBoundary, @UpperBoundary, @PartitionNumber;
       IF @@FETCH_STATUS = -1 BREAK;

       --move partition from partitioned left table to staging
       RAISERROR('Switching out partition %d', 0, 1, @PartitionNumber) WITH NOWAIT;
       ALTER TABLE dbo.OldTable
              SWITCH PARTITION @PartitionNumber TO dbo.OldTableStaging PARTITION @PartitionNumber;

       --move exact boundary matches to holding table
       DELETE FROM dbo.OldTableStaging WITH(TABLOCKX)
       OUTPUT deleted.* INTO dbo.OldTableStagingExactBoundaryMatches
       WHERE PartitioningColumn = @UpperBoundary;
       RAISERROR('Moved %d exact boundary matches to holding table', 0, 1, @@ROWCOUNT) WITH NOWAIT;

       --create check constraint to match RANGE RIGHT boundary
       SET @Sql = N'ALTER TABLE dbo.OldTableStaging
              ADD CONSTRAINT CK_TestStaging CHECK (';
       IF @LowerBoundary IS NOT NULL
       BEGIN
              SET @Sql = @Sql + N'PartitioningColumn >= ''' + CONVERT(char(8), @LowerBoundary,112) + N'''';
              IF @UpperBoundary IS NOT NULL
              BEGIN
                     SET @Sql = @Sql + N' AND ';
              END;
       END;
       IF @UpperBoundary IS NOT NULL
       BEGIN
              SET @Sql = @Sql + N'PartitioningColumn < ''' + CONVERT(char(8), @UpperBoundary,112) + N'''';
       END;
       SET @Sql = @Sql + ');';
       RAISERROR('Creating check constraint: %s', 0, 1, @Sql) WITH NOWAIT;
       EXEC sp_executesql @Sql;

       --move staging table partition to partitioned right table
       RAISERROR('Switching in partition %d', 0, 1, @PartitionNumber) WITH NOWAIT;
       ALTER TABLE dbo.OldTableStaging
              SWITCH PARTITION @PartitionNumber TO dbo.NewTable PARTITION @PartitionNumber;

       --remove staging table constraint
       RAISERROR('Dropping check constraint', 0, 1, @PartitionNumber) WITH NOWAIT;
       ALTER TABLE dbo.OldTableStaging
              DROP CONSTRAINT CK_TestStaging;

END;
CLOSE PartitionBoundaries;
DEALLOCATE PartitionBoundaries;

--insert exact boundary matches
INSERT INTO dbo.NewTable WITH (TABLOCKX) (PartitioningColumn, DataColumn) 
       SELECT PartitioningColumn, DataColumn FROM dbo.OldTableStagingExactBoundaryMatches WITH (TABLOCKX);
RAISERROR('Inserted %d exact boundary matches from holding table', 0, 1, @@ROWCOUNT) WITH NOWAIT;

IF NOT EXISTS(SELECT * FROM OldTable)
BEGIN
	--drop old table
	DROP TABLE dbo.OldTable;
	--rename table, constraints and indexes to same names as original table
	EXEC sp_rename 'dbo.NewTable.idx_NewTable_DataColumn', 'idx_OldTable_DataColumn', 'INDEX';
	EXEC sp_rename 'dbo.PK_NewTable', 'PK_OldTable';
	EXEC sp_rename 'dbo.NewTable', 'OldTable';
END;
--drop staging tables
DROP TABLE dbo.OldTableStaging;
DROP TABLE dbo.OldTableStagingExactBoundaryMatches;
GO