Microsoft SQL Server Script DOM

I once heard someone refer to the Microsoft SQL Server Script DOM library as the crown jewels of SQL Server tools. That may be a bit of an overstatement but reliably parsing and analyzing T-SQL scripts programmatically is an exceedingly complex task due to the many variations of T-SQL constructs and contexts in which T-SQL language elements may occur. Albeit one can parse T-SQL code using string functions, regex expressions, et.al., such parsing is fragile at best due to the difficulty in considering variants of multi-line statements, multi-statement lines, whitespace, comments that contain T-SQL code, etc.

Whether you’re a DBA needing to identify problem code in existing objects or a database developer wanting to incorporate robust code analysis rules in a CI/CD pipeline, the T-SQL Script Document Object Model (DOM) library is the right tool for these jobs and more.

The script DOM library official documentation is rather scant, including auto-generated class documentation without usage examples and behavior details. This introductory article reviews script DOM concepts and programming while showing how leverage the power of this library using PowerShell (version 5 or later), which is especially handy for ad-hoc needs. One can also use the library in any .NET language (e.g. C#, VB.NET) for a variety of purposes such as database code analysis rules in SSDT. Expertise in .NET programming or PowerShell is not required but a cursory knowledge of object-oriented programming and .NET is helpful to understand and extend the example code included here.

Script DOM Overview

The T-SQL script DOM library, Microsoft.SqlServer.TransactSql.ScriptDom.dll, includes classes and methods to parse, process, and format T-SQL scripts. The library is part of the Microsoft SQL Server Data-Tier Application Framework (DacFx) and, as a stand-alone client library, it can be used without installing SQL Server or connecting to a database engine. The library is available for .NET Framework and .NET Core versions so that it will run on Windows, macOS, or Linux.

The Windows PowerShell script examples in this article use the .NET Framework version installed with the sqlpackage MSI. Alternatively, one can use the assembly included in the DacFx NuGet package, or the assembly distributed with an existing SQL Server tools installation, or a .NET Core version from the sqlpackage download page. The PowerShell scripts in this article will run on any machine (or container) with PowerShell or PowerShell Core installed simply by specifying the location of the assembly Add-Type PowerShell command. One can similarly add a reference to the NuGet package or assembly for use in compiled .NET Framework applications.

Note that the T-SQL script DOM library parses only T-SQL; it does not recognize SQLCMD commands or scripting variables understood by SQL Server tools like SQLCMD, SSMS (in SQLCMD mode), and Azure Data Studio. The library can, however, parse scripts containing GO batch terminators even though GO is not technically a T-SQL statement.

Parsing Overview

Scripts are parsed by invoking the Parse method of T-SQL script DOM library TSqlParser class. The parser understands the complex T-SQL abstract syntax tree and splits T-SQL source into atomic TSqlParserTokens of TSqlTokenTypes that represent keywords, identifiers, punctuation, literals, whitespace, etc. These low-level tokens are grouped into more meaningful TSqlFragment objects that represent language elements of the script DOM, such as batches, statements, clauses, etc. Fragments, rather than the low-level parser tokens, are most often used in practice, although the underlying tokens are available for specialized requirements

The Parse method returns a TSqlFragment object of type TSqlScript containing all fragments within the script. This top-level fragment of the DOM hierarchy provides programmatic access to all language element fragments in the script. Nearly 1,000 different fragment types exist today due to the many granular T-SQL language elements.

Each fragment includes properties applicable to the specific fragment type it represents plus the properties common to all fragments (inherited from TSqlFragment) listed below as extracted from the documentation.

Property Description
FirstTokenIndex Gets or sets the first index of the token.
FragmentLength Defines the number of characters the fragment takes up in the script it was parsed.
LastTokenIndex Gets or sets the last index of the token.
ScriptTokenStream Gets or sets the script token stream.
StartColumn Gets the start column.
StartLine Gets the start line.
StartOffset Defines the character offset of fragments starting location in the script it was parsed.

The script source can be passed to Parse as a TextReader or as an IList<TSqlParserToken>. All the examples in this article use the TextReader method overload with a StringReader object containing the source script.

Parsing Examples

Here is an example that parses trigger code to ensure it is syntactically valid for each SQL Server version from SQL Server 2000 to SQL Server 2019. This trigger uses RAISERROR syntax that was deprecated decades ago (SQL Server 2000?) and removed from the product entirely in SQL Server 2012. Parsing will fail with an invalid syntax error when a TSqlParser110 or later parser version is used.

# trigger with deprecated/obsolete RAISERROR syntax
$script = @"
CREATE TRIGGER dbo.tr_OrderHeader
ON dbo.OrderHeader
FOR DELETE
AS
IF EXISTS(SELECT 1 FROM dbo.OrderDetail AS od WHERE od.OrderID IN(SELECT d.OrderID FROM deleted AS d))
BEGIN
	RAISERROR 50001 'OrderDetail rows exist for order. Delete order detail before order header.'
	ROLLBACK
END
"@

# parse script with specified parser and display success or parsing errors
Function Parse-TSqlScript($parser, $script) {

    # create an ParseError collection for any errors returned by parser
    $parseErrors = New-Object System.Collections.Generic.List[Microsoft.SqlServer.TransactSql.ScriptDom.ParseError]

    # create a StringReader for the script for parsing
    $stringReader = New-Object System.IO.StringReader($script)

    # parse the script
    $tSqlFragment = $parser.Parse($stringReader, [ref]$parseErrors)

    # display parse success or error
    if($parseErrors.Count -eq 0) {
        Write-Host "$($parser.GetType().Name): No parsing errors" -ForegroundColor Green
    }
    else {
        Write-Host "$($parser.GetType().Name): $($parseErrors.Count) parsing error(s): $(($parseErrors | ConvertTo-Json))" -ForegroundColor Yellow
    }

}

# ############
# ### MAIN ###
# ############
try {

    # load Script DOM assembly for use by this PowerShell session
    Add-Type -Path "C:\Program Files\Microsoft SQL Server\150\DAC\bin\Microsoft.SqlServer.TransactSql.ScriptDom.dll"

    # SQL Server 2000
    $parser = New-Object Microsoft.SqlServer.TransactSql.ScriptDom.TSql80Parser($true)
    Parse-TSqlScript -parser $parser -script $script

    # SQL Server 2005
    $parser = New-Object Microsoft.SqlServer.TransactSql.ScriptDom.TSql90Parser($true)
    Parse-TSqlScript -parser $parser -script $script

    # SQL Server 2008
    $parser = New-Object Microsoft.SqlServer.TransactSql.ScriptDom.TSql100Parser($true)
    Parse-TSqlScript -parser $parser -script $script

    # SQL Server 2012
    $parser = New-Object Microsoft.SqlServer.TransactSql.ScriptDom.TSql110Parser($true)
    Parse-TSqlScript -parser $parser -script $script

    # SQL Server 2014
    $parser = New-Object Microsoft.SqlServer.TransactSql.ScriptDom.TSql120Parser($true)
    Parse-TSqlScript -parser $parser -script $script

    # SQL Server 2016
    $parser = New-Object Microsoft.SqlServer.TransactSql.ScriptDom.TSql130Parser($true)
    Parse-TSqlScript -parser $parser -script $script

    # SQL Server 2017
    $parser = New-Object Microsoft.SqlServer.TransactSql.ScriptDom.TSql140Parser($true)
    Parse-TSqlScript -parser $parser -script $script

    # SQL Server 2019
    $parser = New-Object Microsoft.SqlServer.TransactSql.ScriptDom.TSql150Parser($true)
    Parse-TSqlScript -parser $parser -script $script

}
catch {

    throw

}

Script output:

TSql80Parser: No parsing errors
TSql90Parser: No parsing errors
TSql100Parser: No parsing errors
TSql110Parser: 1 parsing error(s): {
    "Number":  46010,
    "Offset":  193,
    "Line":  7,
    "Column":  12,
    "Message":  "Incorrect syntax near 50001."
}
TSql120Parser: 1 parsing error(s): {
    "Number":  46010,
    "Offset":  193,
    "Line":  7,
    "Column":  12,
    "Message":  "Incorrect syntax near 50001."
}
TSql130Parser: 1 parsing error(s): {
    "Number":  46010,
    "Offset":  193,
    "Line":  7,
    "Column":  12,
    "Message":  "Incorrect syntax near 50001."
}
TSql140Parser: 1 parsing error(s): {
    "Number":  46010,
    "Offset":  193,
    "Line":  7,
    "Column":  12,
    "Message":  "Incorrect syntax near 50001."
}
TSql150Parser: 1 parsing error(s): {
    "Number":  46010,     "Offset":  193,
    "Line":  7,
    "Column":  12,
    "Message":  "Incorrect syntax near 50001."
} 

I have seen invalid code like the above trigger in existing production databases due to past database upgrades. These objects can no longer be executed and should be dropped or remediated. The script below extends the previous example, parsing all existing stored procedures, triggers, and functions in an existing database according to the desired target SQL Server version (SQL 2019) and the existing object quoted identifier setting passed as the TSqlParser constructor argument.

$connectionString = "Data Source=.;Initial Catalog=AdventureWorks2012;Integrated Security=SSPI"

# parse script with specified parser and display success or parsing errors
Function Parse-TSqlScript($parser, $script) {

    # create an ParseError collection for any errors returned by parser
    $parseErrors = New-Object System.Collections.Generic.List[Microsoft.SqlServer.TransactSql.ScriptDom.ParseError]

    # create a StringReader for the script for parsing
    $stringReader = New-Object System.IO.StringReader($script)

    # parse the script
    $tSqlFragment = $parser.Parse($stringReader, [ref]$parseErrors)

    # display parse success or error
    if($parseErrors.Count -eq 0) {
        Write-Host "$($parser.GetType().Name): No parsing errors" -ForegroundColor Green
    }
    else {
        Write-Host "$($parser.GetType().Name): $($parseErrors.Count) parsing error(s): $(($parseErrors | ConvertTo-Json))" -ForegroundColor Yellow
    }

}

# ############
# ### MAIN ###
# ############
try {

    # load Script DOM assembly for use by this PowerShell session
    Add-Type -Path "C:\Program Files\Microsoft SQL Server\150\DAC\bin\Microsoft.SqlServer.TransactSql.ScriptDom.dll"

    $query = @"
SELECT 
	  QUOTENAME(OBJECT_SCHEMA_NAME(object_id)) + N'.' + QUOTENAME(OBJECT_NAME(object_id)) AS ObjectName
	, OBJECTPROPERTY(object_id, 'ExecIsQuotedIdentOn') AS ExecIsQuotedIdentOn
	, definition 
FROM sys.sql_modules;
"@

    $connection = New-Object Data.SqlClient.SqlConnection($connectionString)
    $command = New-Object Data.SqlClient.SqlCommand($query, $connection)
    $connection.Open()
    $reader = $command.ExecuteReader()
    while ($reader.Read()) {

        $parser = New-Object Microsoft.SqlServer.TransactSql.ScriptDom.TSql150Parser($reader["ExecIsQuotedIdentOn"])

        Write-Host "Parsing $($reader["ObjectName"]) ..."
        Parse-TSqlScript -parser $parser -script $($reader["definition"])

    }
    $connection.Close()
}    
catch {

    throw

} 

Below are the first few lines output if output. No parsing errors were found in this case.

Parsing [Sales].[vStoreWithContacts] ...
TSql150Parser: No parsing errors
Parsing [Sales].[vStoreWithAddresses] ...
TSql150Parser: No parsing errors
Parsing [Purchasing].[vVendorWithContacts] ...
TSql150Parser: No parsing errors
Parsing [Purchasing].[vVendorWithAddresses] ...
TSql150Parser: No parsing errors
Parsing [dbo].[ufnGetAccountingStartDate] ...
TSql150Parser: No parsing errors 

The same technique as above can parse version-controlled T-SQL scripts or any scripts stored in files. This example parses all scripts in a given directory and subdirectories.

# parse script with specified parser and display success or parsing errors
Function Parse-TSqlScript($parser, $script) {

    # create an ParseError collection for any errors returned by parser
    $parseErrors = New-Object System.Collections.Generic.List[Microsoft.SqlServer.TransactSql.ScriptDom.ParseError]

    # create a StringReader for the script for parsing
    $stringReader = New-Object System.IO.StringReader($script)

    # parse the script
    $tSqlFragment = $parser.Parse($stringReader, [ref]$parseErrors)

    # display parse success or error
    if($parseErrors.Count -eq 0) {
        Write-Host "$($parser.GetType().Name): No parsing errors" -ForegroundColor Green
    }
    else {
        Write-Host "$($parser.GetType().Name): $($parseErrors.Count) parsing error(s): $(($parseErrors | ConvertTo-Json))" -ForegroundColor Yellow
    }

}

# ############
# ### MAIN ###
# ############
try {

    # load Script DOM assembly for use by this PowerShell session
    Add-Type -Path "C:\Program Files\Microsoft SQL Server\150\DAC\bin\Microsoft.SqlServer.TransactSql.ScriptDom.dll"
    $scriptFiles = Get-ChildItem "C:\SqlScripts\*.sql" -Recurse
    # parse with quoted identifier on
    $parser = New-Object Microsoft.SqlServer.TransactSql.ScriptDom.TSql150Parser($true)
    foreach ($scriptFile in $scriptFiles) {

        $script = [IO.File]::ReadAllText($scriptFile.FullName)
        Write-Host "Parsing $($scriptFile.FullName) ..."
        Parse-TSqlScript -parser $parser -script $script

    }

}    
catch {

    throw

}

Partial output from the script above:

Parsing C:\SqlScripts\AdventureWorks2012\AdventureWorks2012.Database.sql ...
TSql150Parser: No parsing errors
Parsing C:\SqlScripts\AdventureWorks2012\dbo.AccountNumber.UserDefinedDataType.sql ...
TSql150Parser: No parsing errors
Parsing C:\SqlScripts\AdventureWorks2012\dbo.AWBuildVersion.Table.sql ...
TSql150Parser: No parsing errors
Parsing C:\SqlScripts\AdventureWorks2012\dbo.DatabaseLog.Table.sql ...
TSql150Parser: No parsing errors
Parsing C:\SqlScripts\AdventureWorks2012\dbo.ErrorLog.Table.sql ...
TSql150Parser: No parsing errors
Parsing C:\SqlScripts\AdventureWorks2012\dbo.Flag.UserDefinedDataType.sql ...
TSql150Parser: No parsing errors
Parsing C:\SqlScripts\AdventureWorks2012\dbo.Name.UserDefinedDataType.sql ...
TSql150Parser: No parsing errors 

Analyzing T-SQL Scripts

Parsing code for syntax errors is only a small subset of T-SQL Script DOM capability. One can also analyze T-SQL code after parsing using the TSqlScript fragment returned by the Parse method.

One way to examine scripts is by traversing the parsed DOM hierarchy of TSqlScript, TSqlBatch, and TSqlStatement fragments. This technique has shortcomings as illustrated with this example, which iterates over the TSqlBatch Statements collection looking for “SELECT *” constructs. The script displays a warning message and location within the script where a SelectStarExpression fragment is found.

# this is the script to parse
$script = @"
SELECT * FROM dbo.SomeTable;
IF 1 = 1
BEGIN
    SELECT * FROM dbo.SomeOtherTable;
END;
"@

try {

    # load Script DOM assembly for use by this PowerShell session
    Add-Type -Path "C:\Program Files\Microsoft SQL Server\150\DAC\bin\Microsoft.SqlServer.TransactSql.ScriptDom.dll"
    # create a TSql150Parser parser (SQL Server 2019)
    $parser = New-Object Microsoft.SqlServer.TransactSql.ScriptDom.TSql150Parser($true)

    # create an ParseError collection for any errors returned by parser
    $parseErrors = New-Object System.Collections.Generic.List[Microsoft.SqlServer.TransactSql.ScriptDom.ParseError]

    # create a StringReader for the script for parsing
    $stringReader = New-Object System.IO.StringReader($script)

    # parse the script
    $tSqlFragment = $parser.Parse($stringReader, [ref]$parseErrors)

    # raise an exception if any parsing errors occur
    if($parseErrors.Count -gt 0) {
        throw "$($parseErrors.Count) parsing error(s): $(($parseErrors | ConvertTo-Json))"
    }

    # show number of batches in this script
    Write-Host "Script starts at line $($tSqlFragment.StartLine), column $($tSqlFragment.StartColumn), length $($tSqlFragment.FragmentLength), and contains $($tSqlFragment.Batches.Count) batches"

    # iterate over batches in this script
    foreach ($batchFragment in $tSqlFragment.Batches) {

        # show starting line number of this batch
        Write-Host "`tBatch at line $($batchFragment.StartLine), column $($batchFragment.StartColumn), length $($batchFragment.FragmentLength), and contains $($batchFragment.Statements.Count) statements"

        # iterate over statements in this batch
        foreach ($statementFragment in $batchFragment.Statements) {

            # show statement starting line number and statement type
            Write-Host "`t`tStatement at line $($statementFragment.StartLine), column $($statementFragment.StartColumn), length $($statementFragment.FragmentLength) and is a $($statementFragment.GetType().Name)"

            # for SELECT statements, look for SELECT * expression and display warning if found
            if($statementFragment.GetType().Name -eq "SelectStatement") {
                foreach($selectElementFragment in $statementFragment.QueryExpression.SelectElements) {
                    if($selectElementFragment.GetType().Name -eq "SelectStarExpression") {
                        Write-Host "`t`tWARNING: 'SELECT *' found at line $($selectElementFragment.StartLine), column $($selectElementFragment.StartColumn), length $($selectElementFragment.FragmentLength)" -ForegroundColor Yellow
                    }
                }
                
            }
        }
    }
}
catch {
    throw
} 

Output:

Script starts at line 1, column 1, length 90, and contains 1 batches
	Batch at line 1, column 1, length 90, and contains 2 statements
		Statement at line 1, column 1, length 28 and is a SelectStatement
		WARNING: 'SELECT *' found at line 1, column 8, length 1
		Statement at line 2, column 1, length 60 and is a IfStatement 

You might have noticed the second SELECT * statement in the script, which is conditional, was not identified. This is because the code examined only SELECT statements that are immediate children of a batch and the unidentified SELECT statement is a child of an IfStatement fragment (grandchild of batch). The missed statement could have been identified by adding code to evaluate IfStatement fragments too, examining its ThenStatement property and descendants to see if a SelectStatement with a SelectStarExpression exists somewhere under the DOM tree. But as you can imagine, the code would become quite unwieldly and one would also need to also need to examine the ElseStatement property of IfStatement fragments, consider nested constructs, BEGIN/END blocks, etc., where a SELECT * might occur.

Navigating the T-SQL DOM with the above technique may be useful for some specialized use cases but too cumbersome with scripts containing statements of varying types and complex constructs. A more robust solution for this, and most other code analysis needs, is a visitor. A visitor class allows one to more easily process desired TSqlFragment types regardless of where they occur in the script while ignoring others. The remainder of the article focuses on fragment visitors.

Fragment Visitors

A custom fragment visitor class allows one to use a visitor pattern to more easily process relevant T-SQL fragment types with concise code while ignoring fragment types unneeded for the task at hand. A visitor class can be created in any .NET programming language that supports object-oriented inheritance, such as C#, VB.NET, and PowerShell version 5 or later. The class simply derives from the T-SQL Script DOM library TSqlConcreteFragmentVisitor type (or TSqlFragmentVisitor for specialized needs) and overrides the Visit base class method for fragment type(s) of interest. The base class includes plumbing to invoke the overridden Visit methods for the accepted fragment and its descendants without verbose code.

A custom visitor is used by invoking the Accept method on a fragment instance with the custom visitor object as the argument. Accept invokes the visitor’s Visit method for the current fragment type, which will be the overridden method if defined by the custom visitor or the base class. Each child of the fragment is then visited similarly, invoking the overridden Visit method for the fragment’s type if one exists. Consequently, Accept visits not only the current fragment, but all of its descendants and executes the overridden Visit method for each fragment along the way. This allows one to invoke Accept on the TSqlScript fragment returned by parse to visit all overridden Visit methods of the custom visitor class.

Here’s a PowerShell visitor example that identifies SelectStarExpression fragments anywhere in a T-SQL script using a visitor that derives from TSqlConcreateFragmentVisitor. Be aware than in PowerShell, the T-SQL Script DOM assembly must be added with Add-Type before running a script because the visitor base class type is defined in the external T-SQL Script DOM assembly. I’ll discuss more about this and other PowerShell considerations after reviewing visitors.

# this is the script to parse
$script = @"
SELECT * FROM dbo.SomeTable;
IF 1 = 1
BEGIN
    SELECT * FROM dbo.SomeOtherTable;
END; 
"@

try {


    class MyVisitor: Microsoft.SqlServer.TransactSql.ScriptDom.TSqlConcreteFragmentVisitor {

        [void]Visit ([Microsoft.SqlServer.TransactSql.ScriptDom.SelectStarExpression] $fragment) {
            Write-Host "`t`tWARNING: 'SELECT *' found at line $($fragment.StartLine), column $($fragment.StartColumn), length $($fragment.FragmentLength)" -ForegroundColor Yellow
        }
    
    }

    $parser = New-Object Microsoft.SqlServer.TransactSql.ScriptDom.TSql150Parser($true)
    $parseErrors = New-Object System.Collections.Generic.List[Microsoft.SqlServer.TransactSql.ScriptDom.ParseError]
    $stringReader = New-Object System.IO.StringReader($script)

    $tSqlFragment = $parser.Parse($stringReader, [ref]$parseErrors)
    if($parseErrors.Count -gt 0) {
        throw "$($parseErrors.Count) parsing error(s): $(($parseErrors | ConvertTo-Json))"
    }

    $visitor = [MyVisitor]::new()

    $tSqlFragment.Accept($visitor)

}
catch {
    throw
}

Script output:

WARNING: 'SELECT *' found at line 1, column 8, length 1
WARNING: 'SELECT *' found at line 4, column 12, length 1 

The code invoked Accept on only the root TSqlScript fragment returned by Parse. The fragment Accept method did the rest of the work by visiting the fragment and its descendants, invoking the overridden Visit(SelectStarExpression) method whenever a SelectStarExpression fragment was encountered. This behavior can be observed in more detail by also overriding Visit(TSqlFragment) to display information about every fragment visited:

$script = @"
SELECT * FROM dbo.SomeTable;
IF 1 = 1
BEGIN
    SELECT * FROM dbo.SomeOtherTable;
END; 
"@

try {

    class MyVisitor: Microsoft.SqlServer.TransactSql.ScriptDom.TSqlConcreteFragmentVisitor {

        [void]Visit ([Microsoft.SqlServer.TransactSql.ScriptDom.SelectStarExpression] $fragment) {
            Write-Host "WARNING: 'SELECT *' found at line $($fragment.StartLine), column $($fragment.StartColumn), length $($fragment.FragmentLength)" -ForegroundColor Yellow
        }

        [void]Visit ([Microsoft.SqlServer.TransactSql.ScriptDom.TSqlFragment] $fragment) {
            Write-Host "$($fragment.GetType().Name) found at line $($fragment.StartLine), column $($fragment.StartColumn), length $($fragment.FragmentLength)"
        }
    
    }

    $parser = New-Object Microsoft.SqlServer.TransactSql.ScriptDom.TSql150Parser($true)
    $parseErrors = New-Object System.Collections.Generic.List[Microsoft.SqlServer.TransactSql.ScriptDom.ParseError]
    $stringReader = New-Object System.IO.StringReader($script)

    $frament = $parser.Parse($stringReader, [ref]$parseErrors)
    if($parseErrors.Count -gt 0) {
        throw "$($parseErrors.Count) parsing error(s): $(($parseErrors | ConvertTo-Json))"
    }

    $visitor = [MyVisitor]::new()

    $frament.Accept($visitor)

}
catch {

    throw

}

Output showing every visited fragment along with warnings:

TSqlScript found at line 1, column 1, length 91
TSqlBatch found at line 1, column 1, length 90
SelectStatement found at line 1, column 1, length 28
QuerySpecification found at line 1, column 1, length 27
WARNING: 'SELECT *' found at line 1, column 8, length 1
FromClause found at line 1, column 10, length 18
NamedTableReference found at line 1, column 15, length 13
SchemaObjectName found at line 1, column 15, length 13
Identifier found at line 1, column 15, length 3
Identifier found at line 1, column 19, length 9
IfStatement found at line 2, column 1, length 60
BooleanComparisonExpression found at line 2, column 4, length 5
IntegerLiteral found at line 2, column 4, length 1
IntegerLiteral found at line 2, column 8, length 1
BeginEndBlockStatement found at line 3, column 1, length 50
StatementList found at line -1, column -1, length -1
SelectStatement found at line 4, column 5, length 33
QuerySpecification found at line 4, column 5, length 32
WARNING: 'SELECT *' found at line 4, column 12, length 1
FromClause found at line 4, column 14, length 23
NamedTableReference found at line 4, column 19, length 18
SchemaObjectName found at line 4, column 19, length 18
Identifier found at line 4, column 19, length 3
Identifier found at line 4, column 23, length 14 

TSqlFragmentVisitor versus TSqlConcreteFragmentVisitor

I mentioned earlier a custom visitor class derives from either TSqlConcreteFragmentVisitor or TSqlFragmentVisitor. The difference between these base classes is that TSqlFragmentVisitor includes a Visit method for each of the nearly one thousand different TSqlFragment types (984 as of this writing to be precise), including the abstract base types fragments types derive from. In contrast. TSqlConcreteFragmentVisitor, includes an overridable Visit method for only TSqlFragment and concrete derived types (currently 856). The implication is that the same fragment can be visited more than once when TSqlFragmentVisitor is used as a visitor base class, once as TSqlFragment, again as base type(s) when applicable, and finally as the concreate derived type.

This example shows a trivial select statement visited as TSqlFragment, TSqlStatement, and SelectStatement when a visitor derives from TSqlFragmentVisitor:

$script = "SELECT 1;"

try {

class MyVisitor: Microsoft.SqlServer.TransactSql.ScriptDom.TSqlFragmentVisitor {

    [void]Visit ([Microsoft.SqlServer.TransactSql.ScriptDom.TSqlFragment] $fragment) {
        Write-Host "$($fragment.GetType().Name) visited as TSqlFragment at line $($fragment.StartLine), column $($fragment.StartColumn), length $($fragment.FragmentLength)"
    }

    [void]Visit ([Microsoft.SqlServer.TransactSql.ScriptDom.TSqlStatement] $fragment) {
        Write-Host "$($fragment.GetType().Name) visited as TSqlStatement at line $($fragment.StartLine), column $($fragment.StartColumn), length $($fragment.FragmentLength)"
    }

    [void]Visit ([Microsoft.SqlServer.TransactSql.ScriptDom.SelectStatement] $fragment) {
        Write-Host "$($fragment.GetType().Name) visited as SelectStatement at line $($fragment.StartLine), column $($fragment.StartColumn), length $($fragment.FragmentLength)"
    }
    
}

$parser = New-Object Microsoft.SqlServer.TransactSql.ScriptDom.TSql150Parser($true)
$parseErrors = New-Object System.Collections.Generic.List[Microsoft.SqlServer.TransactSql.ScriptDom.ParseError]
$stringReader = New-Object System.IO.StringReader($script)

$frament = $parser.Parse($stringReader, [ref]$parseErrors)
if($parseErrors.Count -gt 0) {
    throw "$($parseErrors.Count) parsing error(s): $(($parseErrors | ConvertTo-Json))"
}

$visitor = [MyVisitor]::new()

$frament.Accept($visitor)

}
catch {
    throw
}

Output showing all visited fragments:

TSqlScript visited as TSqlFragment at line 1, column 1, length 9
TSqlBatch visited as TSqlFragment at line 1, column 1, length 9
SelectStatement visited as TSqlStatement at line 1, column 1, length 9
SelectStatement visited as TSqlFragment at line 1, column 1, length 9
SelectStatement visited as SelectStatement at line 1, column 1, length 9
QuerySpecification visited as TSqlFragment at line 1, column 1, length 8
SelectScalarExpression visited as TSqlFragment at line 1, column 8, length 1
IntegerLiteral visited as TSqlFragment at line 1, column 8, length 1 

One typically derives from TSqlConcreteFragmentVisitor and overrides concrete Visit methods of only relevant concrete types for most use cases. This ensures a fragment is visited only once as the relevant concrete type (plus TSqlFragment if desired). Below is the output from the same script as above except using TSqlConcreteFragmentVisitor as the visitor’s base class showing the select statement fragment is visited only as the concrete SelectStatement type but not as the TSqlStatement abstract type:

TSqlScript visited as TSqlFragment at line 1, column 1, length 9
TSqlBatch visited as TSqlFragment at line 1, column 1, length 9
SelectStatement visited as SelectStatement at line 1, column 1, length 9
QuerySpecification visited as TSqlFragment at line 1, column 1, length 8
SelectScalarExpression visited as TSqlFragment at line 1, column 8, length 1
IntegerLiteral visited as TSqlFragment at line 1, column 8, length 1

Compiled languages like C# provide more type safety and will result in a compilation error if one tries to override a Visit method of a TSqlConcreteFragmentVisitor sealed base fragment type. PowerShell is more forgiving and will silently ignore the override of a TSqlConcreteFragmentVisitor sealed Visit method (i.e. TSqlStatement in this example).

PowerShell Considerations

A PowerShell class that derives from a type defined in an external assembly will not compile unless the assembly with the base type has already been loaded. Consequently, the Microsoft.SqlServer.TransactSql.ScriptDom.dll assembly must first be loaded before a script with a visitor class can be run.

With an interactive PowerShell session, one can run the Add-Type command manually from before running a script with a visitor class. This only needs to be done once and the assembly will be available in the app domain for the duration of the PowerShell session.

Unattended PowerShell script files with visitor classes are more problematic since one cannot execute Add-Type interactively. As a work-around, a wrapper script can be employed to execute the Add-Type command and then “dot source” the script file with the visitor class. For example, after saving any of the previous visitor examples to a file named “VisitorExample.ps1”, the wrapper script code below can be saved to a file named “Execute-VisitorExample.ps1” in the same directory:

Add-Type -Path "C:\Program Files\Microsoft SQL Server\150\DAC\bin\Microsoft.SqlServer.TransactSql.ScriptDom.dll"
.\"VisitorExample.ps1"

The wrapper script can then be run to execute the visitor:

powershell -ExecutionPolicy RemoteSigned -File "Execute-VisitorExample.ps1"

The .NET Framework NuGet package version can used from PowerShell too by registering the NuGet package repo as a trusted source, installing the package (for current user or all users), and adding the assembly from the local source location. Here’s an example of this technique as part of a wrapper script:

# Create trusted NuGet package source, if needed
$packageSource = Get-PackageSource | where { ($_.Location -EQ "https://www.nuget.org/api/v2") -and ($_.ProviderName -eq "NuGet") -and ($_.IsTrusted -eq $true) }
if($packageSource -eq $null) {
    Register-PackageSource NuGetV2 https://www.nuget.org/api/v2 -ProviderName NuGet -Trusted
}

# Install package, if needed. Note scope AllUsers requires admin
$dacFxPackage = Install-Package Microsoft.SqlServer.DacFx.x64 -Source ($packageSource.Name) -Scope CurrentUser
# Get package
$dacFxPackage = Get-Package -Name Microsoft.SqlServer.DacFx.x64

# Load Microsoft.SqlServer.TransactSql.ScriptDom.dll into app domain for use in PS scripts
$packageFolderPath = [System.IO.Path]::GetDirectoryName($dacFxPackage.Source)
Add-Type -LiteralPath "$packageFolderPath\lib\net46\Microsoft.SqlServer.TransactSql.ScriptDom.dll"

# Dot source visitor example
.\"VisitorExample.ps1"

Summary

This introductory article showed T-SQL script DOM library methods to parse and analyze T-SQL code. I hope this helped you gain an understanding of the library’s capabilities and get started for various needs. I’ll follow up with articles discussing more complex use cases, including leveraging visitors to ensure adherence to naming conventions, identify problem T-SQL code constructs, T-SQL code formatting, and other custom tooling.

Improper SET Option Errors

SQL Server backwards compatibility SET options are hidden land mines that explode when one tries to use a feature that requires proper session settings, such as a filtered index, indexed view, etc. The QUOTED_IDENTIFIER, ANSI_NULLS, and ANSI_PADDING settings are especially problematic. These are persisted as meta-data with view, stored procedure, function, trigger, and column definitions and, since persisted settings override current session settings, a nasty runtime error like “…failed because the following SET options have incorrect settings: ‘QUOTED_IDENTIFIER…’” occurs when a persisted setting is set to OFF even though the current session settings are set properly. “Sticky” OFF settings are a common problem and often accidental rather than because the setting is required by T-SQL code that doesn’t follow ISO SQL standards.

This article reviews QUOTED_IDENTIFIER, ANSI_NULLS, and ANSI_PADDING settings, settings persisted as database object meta-data, and considerations with SQL Server tools. I’ll discuss how to identify and remediate problem objects and considerations to ensure proper ON settings going forward.

Background
Microsoft SQL Server, along with its Sybase ancestor, are a bit long in the tooth nowadays. The original code base was developed decades ago before ISO (previously ANSI) SQL standards were formalized. As SQL Server evolved to respect ISO SQL standards, SET options were introduced to avoid breaking existing applications that expected non-ISO behavior. This allows legacy T-SQL code to run on newer SQL Server versions without changes while supporting ISO SQL standards for new development. The SQL Server product team goes through great effort, albeit sometimes to a fault, to provide backwards compatibility as to not block upgrades.

Some SQL Server features require ISO SQL standard settings
plus other session settings to be set properly in order to be used and avoid runtime errors. Features that require these settings include:

  • Filtered Indexes
  • Indexed Views
  • Indexes on computed columns
  • XML indexes
  • Query notifications (a.k.a. SqlDependency)

The below session settings, sometimes called the “magic 7 settings”, must be set properly when using these features. Otherwise, the above
indexes will not be used or a runtime error will be raised when data are
modified:

  • QUOTED_IDENTIFIER ON
  • ANSI_NULLS ON
  • ANSI_PADDING ON
  • ANSI_WARNINGS  ON
  • ARITHABORT ON
  • CONCAT_NULL_YIELDS_NULL ON
  • NUMERIC_ROUNDABORT OFF

Modern Microsoft SQL Server drivers (e.g. ODBC, OLE DB, SqlClient, JDBC) connect with all these session settings properly set by default. The lone exception is that ARITHABORT OFF is set by default in the context of a database in the SQL Server 2000 compatibility level (80). ARITHABORT OFF will not be an issue for most since the unsupported SQL Server 2008 version was the last to include the SQL Server 2000 database compatibility level. ARITHABORT will be set ON by default in SQL Server 2012 and later versions.

QUOTED_IDENTIFIER, ANSI_NULLS, and ANSI_PADDING Session Settings
Given these session settings are initially set properly to ON by default, an OFF setting is a result of one or more of the following:

• An explict T-SQL SET OFF statement after the connection is made
• Non-default OFF setting specified by API connection (e.g. connection string keyword, DSN property, or set in app code programmatically)
• Overridden by a persisted object meta-data setting

Improper SET OFF statements most often leak into T-SQL code due to inattention to detail. The same is true for OFF API connection settings and persisted meta-data settings. T-SQL code that adheres to ISO SQL standard single-quotes literal enclosures and IS NULL/IS NOT NULL comparison operators can run with these settings either ON or OFF with the same outcome. Consequently, there’s no reason not to use the proper ON default settings when these ISO SQL practices are followed. One can simply remove unintentional T-SQL SET OFF statements, fix API settings, and correct persisted meta-data (discussed later) in order to future proof code so that it can run with or without features that require the proper settings.

Legacy T-SQL code that requires OFF settings due to non-ISO compliant constructs needs to be remediated to leverage the aforementioned features before changing settings to ON. Below is a summary of these settings and considerations for changing to use the ON setting.

QUOTED_IDENTIFIER
QUOTED_IDENTIFIER OFF is required only when T-SQL code uses double-quotes instead of single quotes to enclose literals. It’s typically a trivial effort to fix non-conformant code to use single quotes instead of double quotes with a search/replace. A caveat is that single quotes embedded within literals must to be escaped with two consecutive single quotes. The code will then run regardless of the QUOTED_IDENTIFIER setting and follow ISO SQL standards. Only minimal testing is usually needed after remediation since the setting is evaluated at parse time; parsing errors will occur immediately if double-quotes are used to enclose literals. An exception is dynamic SQL where errors won’t be found until one tries to execute an invalid dynamic SQL statement containing double-quote literal enclosures.

The current session QUOTED_IDENTIFER setting is persisted as meta-data when a view, stored procedure, function, or trigger is created or altered. To change the persisted OFF setting to ON, recreate the object from a session with QUOTED_IDENTIFER and ANSI_NULLS ON. Be mindful to ensure the session setting is ON when executing DDL scripts to prevent improper settings going forward. See the considerations topic later in this article for gotchas with SQL Server tools.

ANSI_NULLS
ANSI_NULLS OFF has long been deprecated. The OFF setting allows code to test for NULL values using equality/inequality predicates instead of the ISO SQL standard “IS NULL” and “IS NOT NULL” operators. For example, the “ColumnName = NULL” will evaluate to TRUE instead of UNKNOWN with the ANSI_NULLS OFF setting. Such code should be changed to “ColumnName IS NULL” to follow ISO SQL standards and provide the same behavior regardless of the session ANSI_NULLS setting. Changes made for ANSI_NULLS compliance necessitate more extensive testing because runtime behavior changes rather than parse time errors like QUOTED_IDENTIFIER.

Like QUOTED_IDENTIFER, the current session ANSI_NULLS setting is persisted as meta-data when a view, stored procedure, function, or trigger is created or altered. Recreate the object from a session with QUOTED_IDENTIFER and ANSI_NULLS ON to change the persisted OFF setting to ON and take care to ensure the setting is ON when executing DDL scripts.

ANSI_PADDING
ANSI_PADDING OFF has also been deprecated for quite some time and the SQL Server documentation specifically calls out “ANSI_PADDING should always be set to on.” In summary, a column-level ANSI_PADDING OFF setting causes nullable fixed-length char(n) and binary(n) columns to behave like variable-length varchar(n) and varbinary(n) columns. Furthermore, SQL Server automatically trims trailing blank characters from character data and leading binary zeros from binary data and stores the values as variable length instead of storing the provided value as-is during inserts and updates. Varchar(n)/varbinary(n) columns with ANSI_PADDING OFF are similarly trimmed. Note that it is the persisted ANSI_NULLS column meta-data setting that determines the storage and trimming behavior, not the current session ANSI_PADDING setting. The session ANSI_PADDING must still be ON when using features that require proper settings.

The current session ANSI_PADDING setting is persisted as column-level meta data when tables are created and new columns added to an existing tables. This setting affects only char(n) NULL, binary(n) NULL, and varchar(n)/varbinary(n) columns regardless of nullability. The setting doesn’t apply to varchar(MAX) , varbinary(MAX), char(n) NOT NULL, binary(n) NOT NULL, and other data types.

Since SQL Server comparison operators ignore trailing blanks when comparing strings as well as leading binary zeros when comparing binary values, there isn’t usually an impact from a T-SQL perspective with changing ANSI_PADDING from OFF to ON (aside from storage when the provided values aren’t already trimmed). However, application code might consider training blanks and leading binary zeros, resulting in differences when comparing trimmed and non-trimmed values. Testing is needed depending on how data are used in the app code.

To change a persisted column ANSI_PADDING setting from OFF to ON, execute ALTER TABLE…ALTER COLUMN from an ANSI_PADDING ON session, specifying the same definition as the existing column. Note that this technique will only change ANSI_PADDING from OFF to ON. Altering an existing ANSI_PADDING ON column from an ANSI_PADDING OFF session will not change the persisted setting.

Considerations to Ensure Proper Settings
All “magic 7 settings” are set properly by default with current drivers so one might think ensuring proper settings is easy. This is largely true for application code but, sadly, not with SQL Server tools due to backward compatibility behavior and inattention to detail when using them.

SQL Server Agent uses ODBC (which sets all “magic 7 settings” properly) but then explicitly sets QUOTED_IDENTIFIER OFF after connecting for backwards compatibility. The implication is one needs to explicitly add a SET QUOTED_IDENTIFIER ON statement to T-SQL scripts executed by SQL Server Agent T-SQL job steps. This is optional when executing stored procedures because the sticky QUOTED_IDENTIFIER ON setting will override the session setting.

SQLCMD similarly uses ODBC and explicitly sets QUOTED_IDENTIFIER OFF. This is a common cause of inadvertent persisted QUOTED_IDENTIFIER OFF leaking into databases as meta-data when SQLCMD is used to deploy database changes. One must specify the SQLCMD -I (uppercase eye) argument to opt-in for QUOTED_IDENTIFIER ON. Additionally, deployment scripts should either explicitly include SET QUOTED_IDENTIFIER ON, SET ANSI_NULLS ON, and SET ANSI_PADDING ON statements or omit these set statements entirely so session settings are used. Avoid including SET OFF statements in scripts for these options.

BCP also uses ODBC, but as you might have guessed, explicitly sets QUOTED_IDENTIFIER OFF too. One needs to opt-in for QUOTED_IDENTIFIER ON by specify the -q (lower case queue) BCP argument to avoid runtime errors.

SSMS, which uses SqlClient, is nicer in that it sets all options properly by default and doesn’t turn set QUOTED_IDENTIFIER OFF behind your back. But be aware that SSMS will honor customized SET options specified for the current window (Query–>Query Options–>ANSI) and new windows (Tools–>Options–>Query Execution–>SQL Server–>ANSI). Make sure the Magic 7 settings are properly set in SSMS options.

Be mindful that SET statements executed in an SSMS query window change the session settings for the lifetime of the query window connection. Take care when executing DDL scripts in a reused query window and, when in doubt, check current session settings using DBCC USEROPTIONS to verify proper settings.
SSMS (and SMO) scripting tools have a terrible habit of including an extraneous SET ANSI_PADDING OFF at the end of CREATE TABLE scripts. Either remove the statement after scripting or set the ANSI PADDING generation option to False (Tools–>Options–>SQL Server Object Explorer–>Scripting–>Generate SET ANSI PADDING commands). This will help avoid unintentionally creating ANSI_PADDING OFF columns.

How to Identify Improper Persisted Settings
It’s common to find wrong settings in existing databases for the reasons mentioned earlier. The catalog view queries below will identify objects with problem persisted OFF settings.

--stored procedures, views, functions, triggers with QUOTED_IDENTIFIER or ANSI_NULLS OFF
 SELECT
       OBJECT_SCHEMA_NAME(o.object_id) AS SchemaName
     , OBJECT_NAME(o.object_id) AS ObjectName
     , o.type_desc AS ObjectType
 FROM sys.objects AS o
 WHERE
     0 IN(
           OBJECTPROPERTY(o.object_id, 'ExecIsQuotedIdentOn')
         , OBJECTPROPERTY(o.object_id, 'ExecIsAnsiNullsOn')
 )
 ORDER BY
       SchemaName
     , ObjectName;
 --columns with ANSI_PADDING OFF
 SELECT
       OBJECT_SCHEMA_NAME(t.object_id) AS SchemaName
     , OBJECT_NAME(t.object_id) AS ObjectName
     , c.name AS ColumnName
 FROM sys.tables AS t
 JOIN sys.columns AS c ON c.object_id = t.object_id
 JOIN sys.types AS ty ON ty.system_type_id = c.system_type_id AND ty.user_type_id = c.user_type_id
 WHERE
     c.is_ansi_padded = 0
     AND (
         (ty.name IN ('varbinary','varchar') AND c.max_length <> -1)
         OR (ty.name IN ('binary','char') AND c.is_nullable = 1)
 );

Summary
Attention to connection settings will facilitate using SQL Server features.

Always Use Semicolon Statement Terminators

ANSI-standard semicolon statement terminators are often omitted in T-SQL queries and many developers are unaware that this is syntax is deprecated.  Omitting statement terminators is a dangerous practice because, even if the batch compiles, you may get unexpected results.  Consider the insidious examples below pointed out by SQL Server MVP Erland Sommarskog:

BEGIN TRY
	BEGIN TRAN
	SELECT 1/0 AS CauseAnException
	COMMIT
END TRY
BEGIN CATCH
	SELECT ERROR_MESSAGE()
	THROW
END CATCH

Few of us will catch (no pun intended) the bug in the above script.  What results do you expect after running the above script under SQL Server 2012 or later versions?  Rather than leave this as an exercise for the reader, I’ll spoil the fun and mention that no run-time error is raised at all.  Instead, the THROW statement is interpreted as a column alias for the ERROR_MESSAGE() column.  This sort of coding error is especially nasty because catch blocks are rarely unit tested and this catch block coding mistake hides the run-time error entirely without raising an exception.

Similarly, the absence of statement terminators in the script below causes another problem.  Can you spot it?

BEGIN TRY
	BEGIN TRAN
	SELECT 1/0 AS CauseAnException
	COMMIT
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK TRAN
    THROW
END CATCH

At least an error is raised in this case, albeit not the one you might expect.  The resultant error is “Cannot roll back THROW. No transaction or savepoint of that name was found”.  This coding bug obfuscates the preceding divide by zero error and prevents the THROW statement from being executed.

Below is another example where the absence of the semi-colon terminator obfuscates the root cause of the error. As you may know, GO is not a T-SQL statement but a batch terminator command recognized by SSMS and other SQL Server tools and utilities. This script executes as expected from an SSMS query window because SSMS parses the script and executes each batch individually when GO commands are encountered:

SELECT 'Creating view'
GO
CREATE VIEW dbo.foo AS SELECT 1 AS bar

However, running the same script with PowerShell (or any other client application) fails with the error “CREATE VIEW must be the first statement in a query batch”:

try
{
    $connectionString = "Data Source=.;Initial Catalog=tempdb;Integrated Security=SSPI"
    $connection = New-Object System.Data.SqlClient.SqlConnection($ConnectionString)
    $Script = 
@"
SELECT 'Creating view'
GO
CREATE VIEW dbo.foo AS SELECT 1 AS bar
"@

    $command = New-Object System.Data.SqlClient.SqlCommand($Script, $connection)
    $connection.Open()
    $result = $command.ExecuteNonQuery()
    $connection.Close()
}
catch [Exception]
{
    throw;
}

In this case, SQL Server interprets the GO as a column alias in the first SELECT query and the batch errs on the CREATE VIEW statement during compilation. If you a semi-colon is added to the end of the first SELECT statement, the correct error message results: “Incorrect syntax near ‘GO'”.

As a side note, one can execute scripts containing GO terminators programmatically using the SMO API, which is also used by some SQL Server tools. See this Stackoverflow answer. Another approach I’ve used is to parse scripts in code using the Transact-SQL script DOM and execute each batch individually. I’ll follow up with a separate article detailing that method and add the link here.

Semicolons Will Become Mandatory
Microsoft announced with the SQL Server 2008 release that semicolon statement terminators will become mandatory in a future version so statement terminators other than semicolons (whitespace) are currently deprecated.  This deprecation announcement means that you should always use semicolon terminators in new development.  I honestly don’t expect SQL Server to strictly enforce mandatory semicolons in the near future but it is still a best practice to use semicolon statement to avoid issues like those mentioned earlier as well as facilitate code maintainability.  I suggest specifying statement terminators in all new development and perhaps adding terminators to existing code as you perform maintenance.

Transact-SQL does not currently enforce the ANSI semicolon statement terminator requirement.  Instead, semicolon statement terminators are optional and any whitespace (spaces, tabs, newline) may be used instead.  The exception to this rule is that many of the statements introduced in SQL Server 2005 and later require the preceding statement to be properly terminated in order for the batch to compile.

Below are some guidelines I suggest on when to, and when not to, use semicolon statement terminators.

Suggested Guidelines
The Transact-SQL parser is quite lax, allowing any whitespace (e.g. space, tab, newline) to be used.  This laxness results in ambiguity like the examples at the beginning of this article demonstrate.  Similarly, statement terminators may not only be omitted, they may also be used in inappropriately.  I strongly suggest you adhere to the T-SQL syntax documented in the Books Online even if the parser allows otherwise.  This practice will help future-proof your code since relying on undocumented behavior is inherently risky.

Don’t precede a statement with a semicolon
Remember that the purpose of semicolons is to terminate SQL statements, not begin them.  A common mistake I see is throwing a semicolon in front of statements in order to get a batch of statements to compile, especially with newer statements like WITH (CTE expression) that require previous statement termination.  Although the T-SQL parser currently ignores extraneous and misplaced semi-colons, I suggest they be specified in the appropriate place according statement syntax documented in the SQL Server Books Online.

Specify semicolons at the end of each stand-alone SQL statement
Not only will this conform to the ANSI standard, your intent will be clearer and the code easier to read.

Terminate control-of-flow statement blocks at the end of the control-of-flow scope
Control-of-flow statements are not covered by the ANSI SQL standard because these are proprietary SQL extensions.  The SQL Server Books Online is sketchy on the subject and many of the examples (as of this writing) are inconsistent and do not always include statement terminators.  Furthermore, control-of-flow statement blocks are confusing due to the many variations, nesting, and optional BEGIN/END specifications.

Below are examples illustrating what I believe to be proper use of statement terminators control-of-flow block terminators using IF statements in SQL 2008 and later versions.  The same concepts apply to other control-of-flow constructs like WHILE and TRY/CATCH blocks.  I should add that this batch example will not compile under SQL 2005 because an explicit BEGIN/END block is required to execute a common table expression conditionally in that version.  T-SQL parser enhancements eliminated that requirement in SQL 2008 and later.

IF 1 = 1
  WITH a AS (SELECT 1 a) SELECT a FROM a; --terminate statement and IF

IF 1 = 1
  WITH a AS (SELECT 1 a) SELECT a FROM a; --terminate statement
ELSE
  WITH a AS (SELECT 1 a) SELECT a FROM a; --terminate statement and IF

IF 1 = 1
  BEGIN
    WITH a AS (SELECT 1 a) SELECT a FROM a; --terminate statement
  END; --terminate IF

IF 1 = 1
  BEGIN
    WITH a AS (SELECT 1 a) SELECT a FROM a; --terminate statement
  END
ELSE
  BEGIN
    WITH a AS (SELECT 1 a) SELECT a FROM a; --terminate statement
  END; --terminate IF

IF 1 = 1
  WITH a AS (SELECT 1 a) SELECT a FROM a; --terminate statement
ELSE
  BEGIN
    WITH a AS (SELECT 1 a) SELECT a FROM a; --terminate statement
  END; --terminate IF

IF 1 = 1
  BEGIN
    WITH a AS (SELECT 1 a) SELECT a FROM a; --terminate statement
  END
ELSE
  WITH a AS (SELECT 1 a) SELECT a FROM a; --terminate statement and IF

IF 1 = 1
  BEGIN
    WITH a AS (SELECT 1 a) SELECT a FROM a; --terminate statement
	IF 1 = 1
		WITH a AS (SELECT 1 a) SELECT a FROM a; --terminate statement
	ELSE
	BEGIN
		WITH a AS (SELECT 1 a) SELECT a FROM a; --terminate statement
	END; --terminate inner nested IF
	WITH a AS (SELECT 1 a) SELECT a FROM a; --terminate statement
  END
ELSE
  WITH a AS (SELECT 1 a) SELECT a FROM a; --terminate statement and outer IF

Summary
Consistent user of semicolons helps avoid bugs in code that might otherwise go undetected.  Code with statement terminators can also be more easily modified without introducing compile errors and make code easier to maintain because the end of each statement is readily apparent to subsequent developers.  Importantly, you’ll be better positioned for future SQL Server versions by consistently using semicolon statement terminators.