In part 3 we looked at the basic operators that describe getting and projecting data.
In this post we will see how basic filters are specified using the LogOp_Select operator, commonly resulting from a SQL WHERE statement.
Operators
Examples
Optimization
Get the Code
Remember This
Operators
LogOp_Select Filter data
The select operator describes filtering - aka applying a predicate - using two child operators. The first is a logical operator describing some relational data to be filtered, and the second is an operator describing the filter itself.
The filter is an expression returning a potentially three valued boolean - true, false or unknown.
Examples of operators that return boolean values include:
- Scalar comparisons, describing =, >, <, <>.
- Scalar comparisons, describing =, >, <, <>.
- Logical comparisons, describing AND, OR
- Relational comparisons, describing SOME, ALL, EXISTS, NOT EXISTS - we will examine these in detail in a later post.
ScaOp_Comp Compare values
The scalar comparison ScaOp_Comp specifies a comparison type (see table below), and has two children specifying the values to be compared.
Operator |
T-SQL |
ScaOp_Comp x_cmpEq |
= |
ScaOp_Comp x_cmpGt |
> |
ScaOp_Comp x_cmpLt |
< |
ScaOp_Comp x_cmpGe |
>= or !< |
ScaOp_Comp x_cmpLe |
<= or !> |
ScaOp_Comp x_cmpNe |
<> or != |
ScaOp_Comp x_cmpIs |
No explicit support |
ScaOp_Comp x_cmpIsNot |
No explicit support |
Uncovered: Null Equality - Not supported in T-SQL, but there under the hood.
In the above list there are two operators - is and is not - that have no explicit equivalent in T-SQL
If we want to test if two values are equal, and return true if both values are null, we either write the somewhat clumsy:
Or, in some set comparison queries, we can use a more elegant EXISTS & INTERSECT construct.
If we want to test if two values are equal, and return true if both values are null, we either write the somewhat clumsy:
WHERE a = b OR a IS NULL AND b IS NULL -- True if a and b are both null
However, in the SQL standard - but not implemented in T-SQL - is the following syntax:
WHERE a IS NOT DISTINCT FROM b -- True if a and b are both null
It is frustrating then to discover that in certain queries (such as the EXISTS & INTERSECT construct) we then see that the engine does in fact include an operator that supports the null equality comparison functionality written into the SQL standard over twenty years ago.
For a deeper look at this limitation, and potential solutions, see this post by Paul White.
For a deeper look at this limitation, and potential solutions, see this post by Paul White.
In addition to the above, SQL LIKE is a special case comparison with it's own operator syntax:
ScaOp_Intrinsic like
LogOp_Logical Apply boolean logic - AND / OR
The LogOp_Logical operator describes boolean logical operations such as AND. Each child operator describes a boolean result, to be compared. Note that unlike SQL, the operator can have more than two child inputs - a sample query demonstrates this later, in the examples section.
Operator |
T-SQL |
ScaOp_Logical x_lopAnd |
AND |
ScaOp_Logical x_lopOr |
OR |
No operator. NOT expressions are rewritten using De Morgan's laws. |
NOT |
Why no operator for NOT?
NOT has no operator in the logical trees, because negation is rewritten to eliminate it. For example:
Can be rewritten as:
NOT (a = b AND x = y)
a <> b OR x <> y
The rewritten expression - seen in the first logical tree (Converted) - is an example of De Morgan's laws for propositional logic. A runnable example is included in the queries below. It is not important to understand the academic theory, but it is worth understanding that rewrites such as these can make it easier for subsequent optimization rules to further simply the query.
ScaOp_Identifier Specify a column
This leaf operator specifies a column - either physical or derived.
QCOL: [{table alias}].{column name}] - Column from a table specified in a child operator
COL: {expression name} - A derived column, where the expression is defined in a child operator.
ScaOp_Const Specify a constant
This leaf operator specifies a constant.
TI ({type information}) - Type specification
XVAR ({value specification}) - Value specification. XVAR is an example of the extensive internal use of a type we know externally as sql_variant. (Paul White).
Whilst the individual elements of the specification have many variations, I won't in this post attempt document them all. However, they are usually sufficiently readable to understand the meaning in the context of the tree.
Examples
The examples use the AdventureWorks database. Any version from 2008R2 onwards will work - available here. The queries specify a set of trace flags that will print the logical trees (and the physical output tree) to the messages window in SSMS. The tree output shown was produced by the SQL Server 2019 optimizer. Older versions may have different behaviour.
Equality Comparison
-- Simple equality comparison, comparing a column with a constant
SELECT p.Name
FROM Production.Product AS p
WHERE p.ProductID = 500
OPTION (RECOMPILE, QUERYTRACEON 8605, QUERYTRACEON 8606, QUERYTRACEON 8607, QUERYTRACEON 3604);
/* *** Converted Tree: ***
LogOp_Project QCOL: [p].Name
LogOp_Select
LogOp_Get TBL: Production.Product(alias TBL: p) Production.Product TableID=482100758 TableReferenceID=0 IsRow: COL: IsBaseRow1000
ScaOp_Comp x_cmpEq
ScaOp_Identifier QCOL: [p].ProductID
ScaOp_Const TI(int,ML=4) XVAR(int,Not Owned,Value=500)
AncOp_PrjList
LogOp_Select has two childen:
A logical operator describing the input data
A scalar comparison operator (ScaOp_Comp), specifying an equality comparison (x_cmpEq)
ScaOp_Comp has two children, describing the two values to be compared
ScaOp_Identifier specifies a column, using the alias declared in the LogOp_Get
ScaOp_Const specifies a constant, with a data type and value.
*/
Boolean Comparison
-- Simple comparison with boolean expression
SELECT p.Name
FROM Production.Product AS p
WHERE p.Color = 'Silver'
AND p.ProductLine = 'M'
OPTION (RECOMPILE, QUERYTRACEON 8605, QUERYTRACEON 8606, QUERYTRACEON 8607, QUERYTRACEON 3604);
/* *** Converted Tree: ***
LogOp_Project QCOL: [p].Name
LogOp_Select
LogOp_Get TBL: Production.Product(alias TBL: p) Production.Product TableID=482100758 TableReferenceID=0 IsRow: COL: IsBaseRow1000
ScaOp_Logical x_lopAnd
ScaOp_Comp x_cmpEq
ScaOp_Identifier QCOL: [p].Color
ScaOp_Const TI(nvarchar collate 872468488,Var,Trim,ML=12) XVAR(nvarchar,Owned,Value=Len,Data = (12,83105108118101114))
ScaOp_Comp x_cmpEq
ScaOp_Identifier QCOL: [p].ProductLine
ScaOp_Const TI(nvarchar collate 872468488,Var,Trim,ML=2) XVAR(nvarchar,Owned,Value=Len,Data = (2,77))
AncOp_PrjList
LogOp_Select has two childen:
A logical operator describing the input data
A logical comparison operator (ScaOp_Logical), specifying an AND comparison (x_lopAnd)
ScaOp_Logical has two children, specifying the two comparisons eacg returning a boolean result, as input to the AND expression.
*/
3 Part Boolean Comparison
-- Simple comparison with 3 part boolean expression
SELECT p.Name
FROM Production.Product AS p
WHERE p.Color = 'Silver'
AND p.ProductLine = 'M'
AND p.FinishedGoodsFlag = 1
OPTION (RECOMPILE, QUERYTRACEON 8605, QUERYTRACEON 8606, QUERYTRACEON 8607, QUERYTRACEON 3604);
/* *** Converted Tree: *** (partial. operators omitted for clarity, indicated by :)
:
ScaOp_Logical x_lopAnd
ScaOp_Comp x_cmpEq
:
ScaOp_Comp x_cmpEq
:
ScaOp_Comp x_cmpEq
:
Note that unlike SQL syntax, the logical operator can take more than 2 inputs.
*/
SQL NOT, eliminated using De Morgan's laws
-- NOT expression rewritten using De Morgan's laws.
SELECT p.ProductID
FROM Production.Product AS p
WHERE NOT (
p.Color = 'Silver'
AND p.ProductLine = 'M'
)
OPTION (RECOMPILE, QUERYTRACEON 8605, QUERYTRACEON 8606, QUERYTRACEON 8607, QUERYTRACEON 3604);
/* *** Converted Tree: *** (partial. operators omitted for clarity, indicated by :)
:
ScaOp_Logical x_lopOr
ScaOp_Comp x_cmpNe
ScaOp_Identifier QCOL: [p].Color
ScaOp_Const TI(nvarchar collate 872468488,Var,Trim,ML=12) XVAR(nvarchar,Owned,Value=Len,Data = (12,83105108118101114))
ScaOp_Comp x_cmpNe
ScaOp_Identifier QCOL: [p].ProductLine
ScaOp_Const TI(nvarchar collate 872468488,Var,Trim,ML=2) XVAR(nvarchar,Owned,Value=Len,Data = (2,77))
:
Tree shows the expression with NOT eliminated, and the AND converted to OR
*/
IN comparison with a set of constants
- *** IN Test against constants
SELECT p.Name
FROM Production.Product AS p
WHERE p.ProductID IN ( 500, 501 )
OPTION (RECOMPILE, QUERYTRACEON 8605, QUERYTRACEON 8606, QUERYTRACEON 8607, QUERYTRACEON 3604);
/* *** Converted Tree: ***
:
ScaOp_Logical x_lopOr
ScaOp_Comp x_cmpEq
ScaOp_Identifier QCOL: [p].ProductID
ScaOp_Const TI(int,ML=4) XVAR(int,Not Owned,Value=501)
ScaOp_Comp x_cmpEq
ScaOp_Identifier QCOL: [p].ProductID
ScaOp_Const TI(int,ML=4) XVAR(int,Not Owned,Value=500)w
:
Described using OR and multiple equality comparisons
*/
Contradiction (Explicit)
In this example, the optimizer recognises a contradiction that will prevent any rows being returned from the relevant table.
-- Optimization: Contradiction (Explicit)
SELECT p.Name
FROM Production.Product AS p
WHERE 1=0
OPTION (RECOMPILE, QUERYTRACEON 8605, QUERYTRACEON 8606, QUERYTRACEON 8607, QUERYTRACEON 3604);
/* *** Converted Tree: ***
LogOp_Project QCOL: [p].Name
LogOp_Select
LogOp_Get TBL: Production.Product(alias TBL: p) Production.Product TableID=482100758 TableReferenceID=0 IsRow: COL: IsBaseRow1000
ScaOp_Const TI(bit,ML=1) XVAR(bit,Not Owned,Value=0)
AncOp_PrjList
Initially, a select operator has a filter defined with a zero constant, i.e. always false
*** Simplified Tree: ***
LogOp_ConstTableGet (0) COL: IsBaseRow1000 QCOL: [p].ProductID QCOL: [p].Name COL: ProductNumber COL: rowguid
In this tree, the 'always false' select has been replaced by LogOp_ConstTableGet.
This includes a rowcount (0), indicating no rows will be returned.
Instead it gives an indication that some table metadata will be needed (e.g. column names, data types).
*/
Contradiction (Query vs. Check Constraint)
If a query specifies a condition that contradicts the requirement of a check constraint, the optimizer can eliminate the redundant part of the tree.
-- Optimization: Contradiction (Check Constraint)
SELECT p.Name
FROM Production.Product AS p
WHERE p.ListPrice < 0
OPTION (RECOMPILE, QUERYTRACEON 8605, QUERYTRACEON 8606, QUERYTRACEON 8607, QUERYTRACEON 3604);
/* *** Converted Tree: ***
:
LogOp_Select
LogOp_Get TBL: Production.Product(alias TBL: p) Production.Product TableID=482100758 TableReferenceID=0 IsRow: COL: IsBaseRow1000
ScaOp_Comp x_cmpLt
ScaOp_Identifier QCOL: [p].ListPrice
ScaOp_Const TI(money,ML=8) XVAR(money,Not Owned,Value=(10000units)=(0))
:
The specified comparison is in the converted tree.
*** Simplified Tree: ***
LogOp_ConstTableGet (0) COL: IsBaseRow1000 QCOL: [p].ProductID QCOL: [p].Name COL: ProductNumber COL: rowguid
In this tree, the comparison that can never return true due to the check constraint, is replaced with LogOp_ConstTableGet (0)
*/
Domain Simplification
Here, the optimizer can determine that two ranges specified with an AND can be rationalized to their intersection; simply p.ProductID = 5
This same optimization logic will also merge overlapping ranges specified with an OR operator.
This can be useful when queries are generated with unexpected range overlaps - e.g. poor queries generated by an ORM framework, or from a user interface that allows users to create redundant criteria. In the latter case, it's important to be aware of this rule, to avoid wasted work attempting to implement the rationalization in the application, when the optimizer is already capable of performing the work.
-- Optimization: Domain Simplification (Rule: SelPredNorm)
SELECT p.ProductID
FROM Production.Product AS p
WHERE p.ProductID
BETWEEN 1 AND 5
AND p.ProductID
BETWEEN 5 AND 10
OPTION (RECOMPILE, QUERYTRACEON 8605, QUERYTRACEON 8606, QUERYTRACEON 8607, QUERYTRACEON 3604);
/* *** Converted Tree: ***
:
ScaOp_Logical x_lopAnd
ScaOp_Comp x_cmpGe
ScaOp_Identifier QCOL: [p].ProductID
ScaOp_Const TI(int,ML=4) XVAR(int,Not Owned,Value=1)
ScaOp_Comp x_cmpLe
ScaOp_Identifier QCOL: [p].ProductID
ScaOp_Const TI(int,ML=4) XVAR(int,Not Owned,Value=5)
ScaOp_Comp x_cmpGe
ScaOp_Identifier QCOL: [p].ProductID
ScaOp_Const TI(int,ML=4) XVAR(int,Not Owned,Value=5)
ScaOp_Comp x_cmpLe
ScaOp_Identifier QCOL: [p].ProductID
ScaOp_Const TI(int,ML=4) XVAR(int,Not Owned,Value=10)
:
Initialy the comparison is expressed in full.
*** Simplified Tree: ***
:
ScaOp_Comp x_cmpEq
ScaOp_Identifier QCOL: [p].ProductID
ScaOp_Const TI(int,ML=4) XVAR(int,Not Owned,Value=5)
:
In the simplified tree, it is rationalised to ProductId = 5
*/
Queries on this page: Logical Trees Part 4 - Filtering.sql
Github Repo for everything in the blog: SQL-Server-Examples
- To best leverage indexes, understand SArgability
- The optimizer has logic that will deal with some 'strange' query constructs e.g. generated by an ORM.
- When considering a poorly performing and unreadable ORM generated query, it is sometimes better to focus on the optimized result (plan in SSMS) rather than be distracted by bad query constructs that the optimizer has coped with and therefore aren't the cause of the problem.
- If your application generates it's own non-ORM queries, be aware of what the optimizer will manage for you - i.e. don't write your own optimizations when they aren't needed. (e.g domain simplification).
In part 5 we will look at joins, including clever tricks the optimizer can do with them to simplify plans.
Very nice and useful series. I'm looking forward to reading Part 5. Alessandro
ReplyDelete