I once heard someone refer to the Microsoft SQL Server Script DOM library as the crown jewels of SQL Server tools. That may be a bit of an overstatement but reliably parsing and analyzing T-SQL scripts programmatically is an exceedingly complex task due to the many variations of T-SQL constructs and contexts in which T-SQL language elements may occur. Albeit one can parse T-SQL code using string functions, regex expressions, et.al., such parsing is fragile at best due to the difficulty in considering variants of multi-line statements, multi-statement lines, whitespace, comments that contain T-SQL code, etc.
Whether you’re a DBA needing to identify problem code in existing objects or a database developer wanting to incorporate robust code analysis rules in a CI/CD pipeline, the T-SQL Script Document Object Model (DOM) library is the right tool for these jobs and more.
The script DOM library official documentation is rather scant, including auto-generated class documentation without usage examples and behavior details. This introductory article reviews script DOM concepts and programming while showing how leverage the power of this library using PowerShell (version 5 or later), which is especially handy for ad-hoc needs or CI/CD pipeline custom tasks. One can also use the library in any .NET language (e.g. C#, VB.NET) for a variety of purposes such as database code analysis rules in SSDT. Deep expertise in .NET programming or PowerShell is not required but a cursory knowledge of object-oriented programming, PowerShell, and/or .NET is helpful to understand and extend the example code included here.
Script DOM Overview
The T-SQL script DOM library, Microsoft.SqlServer.TransactSql.ScriptDom.dll, includes classes and methods to parse, process, and format T-SQL scripts. Microsoft has open-sourced the script DOM as a project on GitHub and made it available as a NuGet Package. The Microsoft.SqlServer.TransactSql.ScriptDom assembly is a stand-alone client library that can be used without installing SQL Server or connecting to a database engine. The library includes .NET Framework and .NET Core assemblies so that it may be used on Windows, macOS, or Linux.
The PowerShell script examples in this article use the .NET Framework version included in the Microsoft.SqlServer.TransactSql.ScriptDom NuGet Package. The assembly is also included with other components and tools, including the DacFx NuGet package and sqlpackage. These scripts in this article will run on any machine (or container) with PowerShell or PowerShell Core installed by specifying the appropriate assembly .NET version on the Add-Type PowerShell command. One can similarly add a reference to the NuGet package for use in compiled .NET Framework applications.
Note that the T-SQL script DOM library parses only T-SQL; it does not recognize SQLCMD commands or scripting variables understood by SQL Server tools like SQLCMD, SSMS (in SQLCMD mode), and Azure Data Studio. The library can, however, parse scripts containing GO batch terminators even though GO is not technically a T-SQL statement.
Parsing Overview
Scripts are parsed by invoking the Parse method of T-SQL script DOM library TSqlParser class. The parser understands the complex T-SQL abstract syntax tree and splits T-SQL source into atomic TSqlParserTokens of TSqlTokenTypes that represent keywords, identifiers, punctuation, literals, whitespace, etc. These low-level tokens are grouped into more meaningful TSqlFragment objects that represent language elements of the script DOM, such as batches, statements, clauses, etc. Fragments, rather than the low-level parser tokens, are most often used in practice, although the underlying tokens are available for specialized requirements
The Parse method returns a TSqlFragment object of type TSqlScript containing all fragments within the script. This top-level fragment of the DOM hierarchy provides programmatic access to all language element fragments in the script. Nearly 1,000 different fragment types exist today due to the many granular T-SQL language elements.
Each fragment includes properties applicable to the specific fragment type it represents plus the properties common to all fragments (inherited from TSqlFragment) listed below as extracted from the documentation.
Property | Description |
FirstTokenIndex | Gets or sets the first index of the token. |
FragmentLength | Defines the number of characters the fragment takes up in the script it was parsed. |
LastTokenIndex | Gets or sets the last index of the token. |
ScriptTokenStream | Gets or sets the script token stream. |
StartColumn | Gets the start column. |
StartLine | Gets the start line. |
StartOffset | Defines the character offset of fragments starting location in the script it was parsed. |
The script source can be passed to Parse as a TextReader or as an IList<TSqlParserToken>. All the examples in this article use the TextReader method overload with a StringReader object containing the source script.
Parsing Examples
Here is an example that parses trigger code to ensure it is syntactically valid for each SQL Server version from SQL Server 2000 to SQL Server 2019. This trigger uses RAISERROR syntax that was deprecated decades ago (SQL Server 2000?) and removed from the product entirely in SQL Server 2012. Parsing will fail with an invalid syntax error when a TSqlParser110 or later parser version is used.
# trigger with deprecated/obsolete RAISERROR syntax $script = @" CREATE TRIGGER dbo.tr_OrderHeader ON dbo.OrderHeader FOR DELETE AS IF EXISTS(SELECT 1 FROM dbo.OrderDetail AS od WHERE od.OrderID IN(SELECT d.OrderID FROM deleted AS d)) BEGIN RAISERROR 50001 'OrderDetail rows exist for order. Delete order detail before order header.' ROLLBACK END "@ # parse script with specified parser and display success or parsing errors Function Parse-TSqlScript($parser, $script) { # create an ParseError collection for any errors returned by parser $parseErrors = New-Object System.Collections.Generic.List[Microsoft.SqlServer.TransactSql.ScriptDom.ParseError] # create a StringReader for the script for parsing $stringReader = New-Object System.IO.StringReader($script) # parse the script $tSqlFragment = $parser.Parse($stringReader, [ref]$parseErrors) # display parse success or error if($parseErrors.Count -eq 0) { Write-Host "$($parser.GetType().Name): No parsing errors" -ForegroundColor Green } else { Write-Host "$($parser.GetType().Name): $($parseErrors.Count) parsing error(s): $(($parseErrors | ConvertTo-Json))" -ForegroundColor Yellow } } # ############ # ### MAIN ### # ############ try { # load Script DOM assembly for use by this PowerShell session # Create trusted NuGet package source, if needed $packageSource = Get-PackageSource | where { ($_.Location -EQ "https://www.nuget.org/api/v2") -and ($_.ProviderName -eq "NuGet") -and ($_.IsTrusted -eq $true) } if($packageSource -eq $null) { Register-PackageSource NuGetV2 https://www.nuget.org/api/v2 -ProviderName NuGet -Trusted } # Install package, if needed. $tSqlScriptDomPackage = Install-Package Microsoft.SqlServer.TransactSql.ScriptDom -Source ($packageSource.Name) -Scope CurrentUser # Get package $tSqlScriptDomPackage = Get-Package -Name Microsoft.SqlServer.TransactSql.ScriptDom # Load Microsoft.SqlServer.TransactSql.ScriptDom.dll .NET framework assembly into app domain for use in PS scripts $tSqlScriptDomPackageFolderPath = [System.IO.Path]::GetDirectoryName($tSqlScriptDomPackage.Source) Add-Type -LiteralPath "$tSqlScriptDomPackageFolderPath\lib\net462\Microsoft.SqlServer.TransactSql.ScriptDom.dll" # SQL Server 2000 $parser = New-Object Microsoft.SqlServer.TransactSql.ScriptDom.TSql80Parser($true) Parse-TSqlScript -parser $parser -script $script # SQL Server 2005 $parser = New-Object Microsoft.SqlServer.TransactSql.ScriptDom.TSql90Parser($true) Parse-TSqlScript -parser $parser -script $script # SQL Server 2008 $parser = New-Object Microsoft.SqlServer.TransactSql.ScriptDom.TSql100Parser($true) Parse-TSqlScript -parser $parser -script $script # SQL Server 2012 $parser = New-Object Microsoft.SqlServer.TransactSql.ScriptDom.TSql110Parser($true) Parse-TSqlScript -parser $parser -script $script # SQL Server 2014 $parser = New-Object Microsoft.SqlServer.TransactSql.ScriptDom.TSql120Parser($true) Parse-TSqlScript -parser $parser -script $script # SQL Server 2016 $parser = New-Object Microsoft.SqlServer.TransactSql.ScriptDom.TSql130Parser($true) Parse-TSqlScript -parser $parser -script $script # SQL Server 2017 $parser = New-Object Microsoft.SqlServer.TransactSql.ScriptDom.TSql140Parser($true) Parse-TSqlScript -parser $parser -script $script # SQL Server 2019 $parser = New-Object Microsoft.SqlServer.TransactSql.ScriptDom.TSql150Parser($true) Parse-TSqlScript -parser $parser -script $script # SQL Server 2022 $parser = New-Object Microsoft.SqlServer.TransactSql.ScriptDom.TSql160Parser($true) Parse-TSqlScript -parser $parser -script $script } catch { throw }
Script output:
TSql80Parser: No parsing errors TSql90Parser: No parsing errors TSql100Parser: No parsing errors TSql110Parser: 1 parsing error(s): { "Number": 46010, "Offset": 193, "Line": 7, "Column": 12, "Message": "Incorrect syntax near 50001." } TSql120Parser: 1 parsing error(s): { "Number": 46010, "Offset": 193, "Line": 7, "Column": 12, "Message": "Incorrect syntax near 50001." } TSql130Parser: 1 parsing error(s): { "Number": 46010, "Offset": 193, "Line": 7, "Column": 12, "Message": "Incorrect syntax near 50001." } TSql140Parser: 1 parsing error(s): { "Number": 46010, "Offset": 193, "Line": 7, "Column": 12, "Message": "Incorrect syntax near 50001." } TSql150Parser: 1 parsing error(s): { "Number": 46010, "Offset": 193, "Line": 7, "Column": 12, "Message": "Incorrect syntax near 50001." }
I have seen invalid code like the above trigger in existing production databases due to past database upgrades. These objects can no longer be executed and should be dropped or remediated. The script below extends the previous example, parsing all existing stored procedures, triggers, and functions in an existing database according to the desired target SQL Server version (SQL 2019) and the existing object quoted identifier setting passed as the TSqlParser constructor argument.
$connectionString = "Data Source=.;Initial Catalog=AdventureWorks2012;Integrated Security=SSPI" # parse script with specified parser and display success or parsing errors Function Parse-TSqlScript($parser, $script) { # create an ParseError collection for any errors returned by parser $parseErrors = New-Object System.Collections.Generic.List[Microsoft.SqlServer.TransactSql.ScriptDom.ParseError] # create a StringReader for the script for parsing $stringReader = New-Object System.IO.StringReader($script) # parse the script $tSqlFragment = $parser.Parse($stringReader, [ref]$parseErrors) # display parse success or error if($parseErrors.Count -eq 0) { Write-Host "$($parser.GetType().Name): No parsing errors" -ForegroundColor Green } else { Write-Host "$($parser.GetType().Name): $($parseErrors.Count) parsing error(s): $(($parseErrors | ConvertTo-Json))" -ForegroundColor Yellow } } # ############ # ### MAIN ### # ############ try { # Create trusted NuGet package source, if needed $packageSource = Get-PackageSource | where { ($_.Location -EQ "https://www.nuget.org/api/v2") -and ($_.ProviderName -eq "NuGet") -and ($_.IsTrusted -eq $true) } if($packageSource -eq $null) { Register-PackageSource NuGetV2 https://www.nuget.org/api/v2 -ProviderName NuGet -Trusted } # Install package, if needed. $tSqlScriptDomPackage = Install-Package Microsoft.SqlServer.TransactSql.ScriptDom -Source ($packageSource.Name) -Scope CurrentUser # Get package $tSqlScriptDomPackage = Get-Package -Name Microsoft.SqlServer.TransactSql.ScriptDom # Load Microsoft.SqlServer.TransactSql.ScriptDom.dll .NET framework assembly into app domain for use in PS scripts $tSqlScriptDomPackageFolderPath = [System.IO.Path]::GetDirectoryName($tSqlScriptDomPackage.Source) Add-Type -LiteralPath "$tSqlScriptDomPackageFolderPath\lib\net462\Microsoft.SqlServer.TransactSql.ScriptDom.dll" $query = @" SELECT QUOTENAME(OBJECT_SCHEMA_NAME(object_id)) + N'.' + QUOTENAME(OBJECT_NAME(object_id)) AS ObjectName , OBJECTPROPERTY(object_id, 'ExecIsQuotedIdentOn') AS ExecIsQuotedIdentOn , definition FROM sys.sql_modules; "@ $connection = New-Object Data.SqlClient.SqlConnection($connectionString) $command = New-Object Data.SqlClient.SqlCommand($query, $connection) $connection.Open() $reader = $command.ExecuteReader() while ($reader.Read()) { $parser = New-Object Microsoft.SqlServer.TransactSql.ScriptDom.TSql160Parser($reader["ExecIsQuotedIdentOn"]) Write-Host "Parsing $($reader["ObjectName"]) ..." Parse-TSqlScript -parser $parser -script $($reader["definition"]) } $connection.Close() } catch { throw }
Below are the first few lines output if output. No parsing errors were found in this case.
Parsing [Sales].[vStoreWithContacts] ... TSql160Parser: No parsing errors Parsing [Sales].[vStoreWithAddresses] ... TSql160Parser: No parsing errors Parsing [Purchasing].[vVendorWithContacts] ... TSql160Parser: No parsing errors Parsing [Purchasing].[vVendorWithAddresses] ... TSql160Parser: No parsing errors Parsing [dbo].[ufnGetAccountingStartDate] ... TSql160Parser: No parsing errors
The same technique as above can parse version-controlled T-SQL scripts or any scripts stored in files. This example parses all scripts in a given directory and subdirectories.
# parse script with specified parser and display success or parsing errors Function Parse-TSqlScript($parser, $script) { # create an ParseError collection for any errors returned by parser $parseErrors = New-Object System.Collections.Generic.List[Microsoft.SqlServer.TransactSql.ScriptDom.ParseError] # create a StringReader for the script for parsing $stringReader = New-Object System.IO.StringReader($script) # parse the script $tSqlFragment = $parser.Parse($stringReader, [ref]$parseErrors) # display parse success or error if($parseErrors.Count -eq 0) { Write-Host "$($parser.GetType().Name): No parsing errors" -ForegroundColor Green } else { Write-Host "$($parser.GetType().Name): $($parseErrors.Count) parsing error(s): $(($parseErrors | ConvertTo-Json))" -ForegroundColor Yellow } } # ############ # ### MAIN ### # ############ try { # Create trusted NuGet package source, if needed $packageSource = Get-PackageSource | where { ($_.Location -EQ "https://www.nuget.org/api/v2") -and ($_.ProviderName -eq "NuGet") -and ($_.IsTrusted -eq $true) } if($packageSource -eq $null) { Register-PackageSource NuGetV2 https://www.nuget.org/api/v2 -ProviderName NuGet -Trusted } # Install package, if needed. $tSqlScriptDomPackage = Install-Package Microsoft.SqlServer.TransactSql.ScriptDom -Source ($packageSource.Name) -Scope CurrentUser # Get package $tSqlScriptDomPackage = Get-Package -Name Microsoft.SqlServer.TransactSql.ScriptDom # Load Microsoft.SqlServer.TransactSql.ScriptDom.dll .NET framework assembly into app domain for use in PS scripts $tSqlScriptDomPackageFolderPath = [System.IO.Path]::GetDirectoryName($tSqlScriptDomPackage.Source) Add-Type -LiteralPath "$tSqlScriptDomPackageFolderPath\lib\net462\Microsoft.SqlServer.TransactSql.ScriptDom.dll" $scriptFiles = Get-ChildItem "C:\SqlScripts\*.sql" -Recurse # parse with quoted identifier on $parser = New-Object Microsoft.SqlServer.TransactSql.ScriptDom.TSql160Parser($true) foreach ($scriptFile in $scriptFiles) { $script = [IO.File]::ReadAllText($scriptFile.FullName) Write-Host "Parsing $($scriptFile.FullName) ..." Parse-TSqlScript -parser $parser -script $script } } catch { throw }
Partial output from the script above:
Parsing C:\SqlScripts\AdventureWorks2012\AdventureWorks2012.Database.sql ... TSql160Parser: No parsing errors Parsing C:\SqlScripts\AdventureWorks2012\dbo.AccountNumber.UserDefinedDataType.sql ... TSql160Parser: No parsing errors Parsing C:\SqlScripts\AdventureWorks2012\dbo.AWBuildVersion.Table.sql ... TSql160Parser: No parsing errors Parsing C:\SqlScripts\AdventureWorks2012\dbo.DatabaseLog.Table.sql ... TSql160Parser: No parsing errors Parsing C:\SqlScripts\AdventureWorks2012\dbo.ErrorLog.Table.sql ... TSql160Parser: No parsing errors Parsing C:\SqlScripts\AdventureWorks2012\dbo.Flag.UserDefinedDataType.sql ... TSql160Parser: No parsing errors Parsing C:\SqlScripts\AdventureWorks2012\dbo.Name.UserDefinedDataType.sql ... TSql160Parser: No parsing errors
Analyzing T-SQL Scripts
Parsing code for syntax errors is only a small subset of T-SQL Script DOM capability. One can also analyze T-SQL code after parsing using the TSqlScript fragment returned by the Parse method.
One way to examine scripts is by traversing the parsed DOM hierarchy of TSqlScript, TSqlBatch, and TSqlStatement fragments. This technique has shortcomings as illustrated with this example, which iterates over the TSqlBatch Statements collection looking for “SELECT *” constructs. The script displays a warning message and location within the script where a SelectStarExpression fragment is found.
# this is the script to parse $script = @" SELECT * FROM dbo.SomeTable; IF 1 = 1 BEGIN SELECT * FROM dbo.SomeOtherTable; END; "@ try { # Create trusted NuGet package source, if needed $packageSource = Get-PackageSource | where { ($_.Location -EQ "https://www.nuget.org/api/v2") -and ($_.ProviderName -eq "NuGet") -and ($_.IsTrusted -eq $true) } if($packageSource -eq $null) { Register-PackageSource NuGetV2 https://www.nuget.org/api/v2 -ProviderName NuGet -Trusted } # Install package, if needed. $tSqlScriptDomPackage = Install-Package Microsoft.SqlServer.TransactSql.ScriptDom -Source ($packageSource.Name) -Scope CurrentUser # Get package $tSqlScriptDomPackage = Get-Package -Name Microsoft.SqlServer.TransactSql.ScriptDom # Load Microsoft.SqlServer.TransactSql.ScriptDom.dll .NET framework assembly into app domain for use in PS scripts $tSqlScriptDomPackageFolderPath = [System.IO.Path]::GetDirectoryName($tSqlScriptDomPackage.Source) Add-Type -LiteralPath "$tSqlScriptDomPackageFolderPath\lib\net462\Microsoft.SqlServer.TransactSql.ScriptDom.dll" # create a TSql160Parser parser (SQL Server 2022) $parser = New-Object Microsoft.SqlServer.TransactSql.ScriptDom.TSql160Parser($true) # create an ParseError collection for any errors returned by parser $parseErrors = New-Object System.Collections.Generic.List[Microsoft.SqlServer.TransactSql.ScriptDom.ParseError] # create a StringReader for the script for parsing $stringReader = New-Object System.IO.StringReader($script) # parse the script $tSqlFragment = $parser.Parse($stringReader, [ref]$parseErrors) # raise an exception if any parsing errors occur if($parseErrors.Count -gt 0) { throw "$($parseErrors.Count) parsing error(s): $(($parseErrors | ConvertTo-Json))" } # show number of batches in this script Write-Host "Script starts at line $($tSqlFragment.StartLine), column $($tSqlFragment.StartColumn), length $($tSqlFragment.FragmentLength), and contains $($tSqlFragment.Batches.Count) batches" # iterate over batches in this script foreach ($batchFragment in $tSqlFragment.Batches) { # show starting line number of this batch Write-Host "`tBatch at line $($batchFragment.StartLine), column $($batchFragment.StartColumn), length $($batchFragment.FragmentLength), and contains $($batchFragment.Statements.Count) statements" # iterate over statements in this batch foreach ($statementFragment in $batchFragment.Statements) { # show statement starting line number and statement type Write-Host "`t`tStatement at line $($statementFragment.StartLine), column $($statementFragment.StartColumn), length $($statementFragment.FragmentLength) and is a $($statementFragment.GetType().Name)" # for SELECT statements, look for SELECT * expression and display warning if found if($statementFragment.GetType().Name -eq "SelectStatement") { foreach($selectElementFragment in $statementFragment.QueryExpression.SelectElements) { if($selectElementFragment.GetType().Name -eq "SelectStarExpression") { Write-Host "`t`tWARNING: 'SELECT *' found at line $($selectElementFragment.StartLine), column $($selectElementFragment.StartColumn), length $($selectElementFragment.FragmentLength)" -ForegroundColor Yellow } } } } } } catch { throw }
Output:
Script starts at line 1, column 1, length 90, and contains 1 batches Batch at line 1, column 1, length 90, and contains 2 statements Statement at line 1, column 1, length 28 and is a SelectStatement WARNING: 'SELECT *' found at line 1, column 8, length 1 Statement at line 2, column 1, length 60 and is a IfStatement
You might have noticed the second SELECT * statement in the script, which is conditional, was not identified. This is because the code examined only SELECT statements that are immediate children of a batch and the unidentified SELECT statement is a child of an IfStatement fragment (grandchild of batch). The missed statement could have been identified by adding code to evaluate IfStatement fragments too, examining its ThenStatement property and descendants to see if a SelectStatement with a SelectStarExpression exists somewhere under the DOM tree. But as you can imagine, the code would become quite unwieldly and one would also need to also need to examine the ElseStatement property of IfStatement fragments, consider nested constructs, BEGIN/END blocks, etc., where a SELECT * might occur.
Navigating the T-SQL DOM with the above technique may be useful for some specialized use cases but too cumbersome with scripts containing statements of varying types and complex constructs. A more robust solution for this, and most other code analysis needs, is a visitor. A visitor class allows one to more easily process desired TSqlFragment types regardless of where they occur in the script while ignoring others. The remainder of the article focuses on fragment visitors.
Fragment Visitors
A custom fragment visitor class allows one to use a visitor pattern to more easily process relevant T-SQL fragment types with concise code while ignoring fragment types unneeded for the task at hand. A visitor class can be created in any .NET programming language that supports object-oriented inheritance, such as C#, VB.NET, and PowerShell version 5 or later. The class simply derives from the T-SQL Script DOM library TSqlConcreteFragmentVisitor type (or TSqlFragmentVisitor for specialized needs) and overrides the Visit base class method for fragment type(s) of interest. The base class includes plumbing to invoke the overridden Visit methods for the accepted fragment and its descendants without verbose code.
A custom visitor is used by invoking the Accept method on a fragment instance with the custom visitor object as the argument. Accept invokes the visitor’s Visit method for the current fragment type, which will be the overridden method if defined by the custom visitor or the base class. Each child of the fragment is then visited similarly, invoking the overridden Visit method for the fragment’s type if one exists. Consequently, Accept visits not only the current fragment, but all of its descendants and executes the overridden Visit method for each fragment along the way. This allows one to invoke Accept on the TSqlScript fragment returned by parse to visit all overridden Visit methods of the custom visitor class.
Be aware of a Catch-22 with PowerShell visitors. The script with the visitor class definition will not compile until the script DOM assembly is loaded and one cannot execute the Add-Type command to load the assembly unless the script compiles. The work-around is to execute the Add-Type command in a separate wrapper script before invoking visitor class scripts, or load the assembly manually in an interactive session beforehand. This is described in more detail later in the PowerShell Considerations section. The visitor examples that follow assume this code snippet was already run in the Powershell session to load the assembly:
# Get Microsoft.SqlServer.TransactSql.ScriptDom NuGet package and Add-Type try { # Create trusted NuGet package source, if needed $packageSource = Get-PackageSource | where { ($_.Location -EQ "https://www.nuget.org/api/v2") -and ($_.ProviderName -eq "NuGet") -and ($_.IsTrusted -eq $true) } if($packageSource -eq $null) { Register-PackageSource NuGetV2 https://www.nuget.org/api/v2 -ProviderName NuGet -Trusted } # Install package, if needed. $dacFxPackage = Install-Package Microsoft.SqlServer.TransactSql.ScriptDom -Source ($packageSource.Name) -Scope CurrentUser # Get package $dacFxPackage = Get-Package -Name Microsoft.SqlServer.TransactSql.ScriptDom # Load Microsoft.SqlServer.TransactSql.ScriptDom.dll into app domain for use in PS scripts $packageFolderPath = [System.IO.Path]::GetDirectoryName($dacFxPackage.Source) Add-Type -LiteralPath "$packageFolderPath\lib\net462\Microsoft.SqlServer.TransactSql.ScriptDom.dll" } catch { throw }
Here’s a PowerShell example that identifies SelectStarExpression fragments anywhere in a T-SQL script using a visitor class deriving from TSqlConcreateFragmentVisitor.
# this is the script to parse $script = @" SELECT * FROM dbo.SomeTable; IF 1 = 1 BEGIN SELECT * FROM dbo.SomeOtherTable; END; "@ try { class MyVisitor: Microsoft.SqlServer.TransactSql.ScriptDom.TSqlConcreteFragmentVisitor { [void]Visit ([Microsoft.SqlServer.TransactSql.ScriptDom.SelectStarExpression] $fragment) { Write-Host "`t`tWARNING: 'SELECT *' found at line $($fragment.StartLine), column $($fragment.StartColumn), length $($fragment.FragmentLength)" -ForegroundColor Yellow } } $parser = New-Object Microsoft.SqlServer.TransactSql.ScriptDom.TSql160Parser($true) $parseErrors = New-Object System.Collections.Generic.List[Microsoft.SqlServer.TransactSql.ScriptDom.ParseError] $stringReader = New-Object System.IO.StringReader($script) $tSqlFragment = $parser.Parse($stringReader, [ref]$parseErrors) if($parseErrors.Count -gt 0) { throw "$($parseErrors.Count) parsing error(s): $(($parseErrors | ConvertTo-Json))" } $visitor = [MyVisitor]::new() $tSqlFragment.Accept($visitor) } catch { throw }
Script output:
WARNING: 'SELECT *' found at line 1, column 8, length 1
WARNING: 'SELECT *' found at line 4, column 12, length 1
The code invoked Accept on only the root TSqlScript fragment returned by Parse. The fragment Accept method did the rest of the work by visiting the fragment and its descendants, invoking the overridden Visit(SelectStarExpression) method whenever a SelectStarExpression fragment was encountered. This behavior can be observed in more detail by also overriding Visit(TSqlFragment) to display information about every fragment visited:
$script = @" SELECT * FROM dbo.SomeTable; IF 1 = 1 BEGIN SELECT * FROM dbo.SomeOtherTable; END; "@ try { class MyVisitor: Microsoft.SqlServer.TransactSql.ScriptDom.TSqlConcreteFragmentVisitor { [void]Visit ([Microsoft.SqlServer.TransactSql.ScriptDom.SelectStarExpression] $fragment) { Write-Host "WARNING: 'SELECT *' found at line $($fragment.StartLine), column $($fragment.StartColumn), length $($fragment.FragmentLength)" -ForegroundColor Yellow } [void]Visit ([Microsoft.SqlServer.TransactSql.ScriptDom.TSqlFragment] $fragment) { Write-Host "$($fragment.GetType().Name) found at line $($fragment.StartLine), column $($fragment.StartColumn), length $($fragment.FragmentLength)" } } $parser = New-Object Microsoft.SqlServer.TransactSql.ScriptDom.TSql160Parser($true) $parseErrors = New-Object System.Collections.Generic.List[Microsoft.SqlServer.TransactSql.ScriptDom.ParseError] $stringReader = New-Object System.IO.StringReader($script) $frament = $parser.Parse($stringReader, [ref]$parseErrors) if($parseErrors.Count -gt 0) { throw "$($parseErrors.Count) parsing error(s): $(($parseErrors | ConvertTo-Json))" } $visitor = [MyVisitor]::new() $frament.Accept($visitor) } catch { throw }
Output showing every visited fragment along with warnings:
TSqlScript found at line 1, column 1, length 91 TSqlBatch found at line 1, column 1, length 90 SelectStatement found at line 1, column 1, length 28 QuerySpecification found at line 1, column 1, length 27 WARNING: 'SELECT *' found at line 1, column 8, length 1 FromClause found at line 1, column 10, length 18 NamedTableReference found at line 1, column 15, length 13 SchemaObjectName found at line 1, column 15, length 13 Identifier found at line 1, column 15, length 3 Identifier found at line 1, column 19, length 9 IfStatement found at line 2, column 1, length 60 BooleanComparisonExpression found at line 2, column 4, length 5 IntegerLiteral found at line 2, column 4, length 1 IntegerLiteral found at line 2, column 8, length 1 BeginEndBlockStatement found at line 3, column 1, length 50 StatementList found at line -1, column -1, length -1 SelectStatement found at line 4, column 5, length 33 QuerySpecification found at line 4, column 5, length 32 WARNING: 'SELECT *' found at line 4, column 12, length 1 FromClause found at line 4, column 14, length 23 NamedTableReference found at line 4, column 19, length 18 SchemaObjectName found at line 4, column 19, length 18 Identifier found at line 4, column 19, length 3 Identifier found at line 4, column 23, length 14
TSqlFragmentVisitor versus TSqlConcreteFragmentVisitor
I mentioned earlier a custom visitor class derives from either TSqlConcreteFragmentVisitor or TSqlFragmentVisitor. The difference between these base classes is that TSqlFragmentVisitor includes a Visit method for each of the nearly one thousand different TSqlFragment types (984 as of this writing to be precise), including the abstract base types fragments types derive from. In contrast. TSqlConcreteFragmentVisitor, includes an overridable Visit method for only TSqlFragment and concrete derived types (currently 856). The implication is that the same fragment can be visited more than once when TSqlFragmentVisitor is used as a visitor base class, once as TSqlFragment, again as base type(s) when applicable, and finally as the concreate derived type.
This example shows a trivial select statement visited as TSqlFragment, TSqlStatement, and SelectStatement when a visitor derives from TSqlFragmentVisitor:
$script = "SELECT 1;" try { class MyVisitor: Microsoft.SqlServer.TransactSql.ScriptDom.TSqlFragmentVisitor { [void]Visit ([Microsoft.SqlServer.TransactSql.ScriptDom.TSqlFragment] $fragment) { Write-Host "$($fragment.GetType().Name) visited as TSqlFragment at line $($fragment.StartLine), column $($fragment.StartColumn), length $($fragment.FragmentLength)" } [void]Visit ([Microsoft.SqlServer.TransactSql.ScriptDom.TSqlStatement] $fragment) { Write-Host "$($fragment.GetType().Name) visited as TSqlStatement at line $($fragment.StartLine), column $($fragment.StartColumn), length $($fragment.FragmentLength)" } [void]Visit ([Microsoft.SqlServer.TransactSql.ScriptDom.SelectStatement] $fragment) { Write-Host "$($fragment.GetType().Name) visited as SelectStatement at line $($fragment.StartLine), column $($fragment.StartColumn), length $($fragment.FragmentLength)" } } $parser = New-Object Microsoft.SqlServer.TransactSql.ScriptDom.TSql160Parser($true) $parseErrors = New-Object System.Collections.Generic.List[Microsoft.SqlServer.TransactSql.ScriptDom.ParseError] $stringReader = New-Object System.IO.StringReader($script) $frament = $parser.Parse($stringReader, [ref]$parseErrors) if($parseErrors.Count -gt 0) { throw "$($parseErrors.Count) parsing error(s): $(($parseErrors | ConvertTo-Json))" } $visitor = [MyVisitor]::new() $frament.Accept($visitor) } catch { throw }
Output showing all visited fragments:
TSqlScript visited as TSqlFragment at line 1, column 1, length 9
TSqlBatch visited as TSqlFragment at line 1, column 1, length 9
SelectStatement visited as TSqlStatement at line 1, column 1, length 9
SelectStatement visited as TSqlFragment at line 1, column 1, length 9
SelectStatement visited as SelectStatement at line 1, column 1, length 9
QuerySpecification visited as TSqlFragment at line 1, column 1, length 8
SelectScalarExpression visited as TSqlFragment at line 1, column 8, length 1
IntegerLiteral visited as TSqlFragment at line 1, column 8, length 1
One typically derives from TSqlConcreteFragmentVisitor and overrides concrete Visit methods of only relevant concrete types for most use cases. This ensures a fragment is visited only once as the relevant concrete type (plus TSqlFragment if desired). Below is the output from the same script as above except using TSqlConcreteFragmentVisitor as the visitor’s base class showing the select statement fragment is visited only as the concrete SelectStatement type but not as the TSqlStatement abstract type:
TSqlScript visited as TSqlFragment at line 1, column 1, length 9
TSqlBatch visited as TSqlFragment at line 1, column 1, length 9
SelectStatement visited as SelectStatement at line 1, column 1, length 9
QuerySpecification visited as TSqlFragment at line 1, column 1, length 8
SelectScalarExpression visited as TSqlFragment at line 1, column 8, length 1
IntegerLiteral visited as TSqlFragment at line 1, column 8, length 1
Compiled languages like C# provide more type safety and will result in a compilation error if one tries to override a Visit method of a TSqlConcreteFragmentVisitor sealed base fragment type. PowerShell is more forgiving and will silently ignore the override of a TSqlConcreteFragmentVisitor sealed Visit method (i.e. TSqlStatement in this example).
PowerShell Considerations
A PowerShell class that derives from a type defined in an external assembly will not compile unless the assembly with the base type has already been loaded. Consequently, the Microsoft.SqlServer.TransactSql.ScriptDom.dll assembly must first be loaded before a script with a visitor class can be run.
With an interactive PowerShell session, one can run the Add-Type command manually from before running scripts with visitor classes as shown earlier in the Fragment Visitors topic. This only needs to be done once and the assembly will be available in the app domain for the duration of the PowerShell session.
Unattended PowerShell script files with visitor classes are more problematic since one cannot execute Add-Type interactively. As a work-around, a wrapper script can be employed to execute the Add-Type command and then “dot source” the script file with the visitor class. For example, after saving any of the previous visitor examples to a file named “VisitorExample.ps1”, the wrapper script code below can be saved to a file named “Execute-VisitorExample.ps1” in the same directory:
# Get Microsoft.SqlServer.TransactSql.ScriptDom NuGet package, Add-Type, and invoke script DOM visitor script try { # Create trusted NuGet package source, if needed $packageSource = Get-PackageSource | where { ($_.Location -EQ "https://www.nuget.org/api/v2") -and ($_.ProviderName -eq "NuGet") -and ($_.IsTrusted -eq $true) } if($packageSource -eq $null) { Register-PackageSource NuGetV2 https://www.nuget.org/api/v2 -ProviderName NuGet -Trusted } # Install package, if needed. $dacFxPackage = Install-Package Microsoft.SqlServer.TransactSql.ScriptDom -Source ($packageSource.Name) -Scope CurrentUser # Get package $dacFxPackage = Get-Package -Name Microsoft.SqlServer.TransactSql.ScriptDom # Load Microsoft.SqlServer.TransactSql.ScriptDom.dll into app domain for use in PS scripts $packageFolderPath = [System.IO.Path]::GetDirectoryName($dacFxPackage.Source) Add-Type -LiteralPath "$packageFolderPath\lib\net462\Microsoft.SqlServer.TransactSql.ScriptDom.dll" # Use "dot source" to Invoke scripts .\"VisitorExample.ps1" } catch { throw }
The wrapper script can then be run to execute visitor scripts:
powershell -ExecutionPolicy RemoteSigned -File "Execute-VisitorExample.ps1"
Summary
This introductory article showed T-SQL script DOM library methods to parse and analyze T-SQL code. I hope this helped you gain an understanding of the library’s capabilities and get started for various needs. I’ll follow up with articles discussing more complex use cases, including leveraging visitors to ensure adherence to naming conventions, identify problem T-SQL code constructs, T-SQL code formatting, and other custom tooling.