Hi Friends,

SQL Server Cardinality Estimation is a complicated topic until you understand the internals of SQL Server Statistics. Here are some quick notes that will answer some common questions that you generally have:

How does SQL Server Query Optimizer estimate a default cardinality?

Let’s see and example (turn ON Actual Execution Plan or press Ctrl + M)

SQL Server Cardinality Estimation_1

You can observe that optimizer estimated 1.7 rows (approx) vs 1 actual row returned.

Let us take another example:

SQL Server Cardinality Estimation_2

This time optimizer estimates 1.9 rows (approx) vs 21 actual row returned.

What’s going on? Answer is simple: SQL Server estimates the number of rows based on Statistics. Either statistics already exist on TotalDue column or SQL Server will automatically create one if AUTO_CREATE_STATS is turn ON. Either way, SQL Server uses this statistics object and more precisely, the histogram, to compute its estimation. You can read this blog post to learn more about histogram.

Now, let’s make some changes to our code:

SQL Server Cardinality Estimation_3

Estimation: 6.61864 (observe that the same code, same predicate value, gave us an estimation of 1.7)

Let’s change the variable value and run the same code again:

SQL Server Cardinality Estimation_4

Estimation: 6.61864 (observe that the same code, same predicate value, gave us an estimation of 1.9)

Will the estimation be always 6.61864? What’s going on?

In the previous two examples, the optimizer was able to extract the value from the query to compute its estimation. In the next two example, since we are using local variables, the optimizer does not have the value until runtime. So will the estimation always be 6.61864 whenever we use local variable? Yes, if the table data does not change or more precisely, if the statistics behind the table does not change. When does statistics change, etc, etc is another blog post. But let’s come to the main question, where does 6.61864 come from? How does the optimizer compute this default value?

The formula for this default estimation is:

Total Rows * Density (where Density is 1/number of distinct values)

SQL Server, apart from marinating the histogram also maintains density information for the columns.

So let us compute this:

SQL Server Cardinality Estimation_5

Density = 1/4754 = 2.103491796381994e-4

Total Rows * Density = 31465 * 2.103491796381994e-4 = 6.6186….

Or

SQL Server Cardinality Estimation_6

That’s how the default estimation was computed in this case. Hope the quick note was helpful.

Want to watch this in action? Watch here!

 

Amit Bansal
SQL MCM, SQL MVP
Director – eDominer Systems, Peopleware India, SQLMaestros

Follow me on Twitter | Follow me on FaceBook | Connect on LinkedIn | Google+
FB page | Know more about me

SQL Server ALTER TABLE REBUILD
Cardinality Estimation in SQL Server - Quick Notes, Part 2