The SQL Optimizer is heavily dependent on Statistics for the estimates it makes and thereby a cheap execution plan. Remember, the optimizer is cost-based. Column statistics are automatically created when you create an index on a specific column. And even if you are not creating an index, yet you put the column in the predicate, SQL Server will first create stats for the column values (provided AUTO_CREATE_STATISTICS is ON which is the default). Either way, when you have single column stats or multi-column stats, the most critical thing is the histogram for the first column in your stats. Histogram, as the name suggests, is a bucketing technique of how the column data is distributed – the highest value of a bucket, which defines a boundary, how many rows are between two boundary values, how many rows are equal to a specific boundary value, how many unique values are there between a range and, how many rows are there on average per distinct value. All of this helps the optimizer make the right estimates and, of course, this needs to be up-to-date. If not, estimates can go wrong resulting in a poor plan choice. Sounds interesting? Watch the video below.
Get new content directly in your inbox – Click here
You can also visit DataPlatformGeeks (DPG) Video Library to watch more videos and to download video resources (presentation & demo code).
Hope the video was worth your time.
Happy Learning.
Leave A Comment