When you are tuning queries, there are two fundamental concepts you need to be well aware of: Density & Selectivity. Today, let's talk about Selectivity. In simple words, it describes how selective your predicates are. Low selectivity means your query returns a high number of records; high selectivity means it returns a low number of records. Many of the decisions the optimizer takes depend on selectivity. A classic example is Scan vs Seek. Typically, when a given predicate returns a high number of records, the optimizer may deploy a Scan operator to read the data pages. Conversely, if the predicate returns a low number of records, the optimizer may deploy a Seek operator to read the data. The assumption here is that there is a non-clustered index on the column used in the predicate. Given this simple example, if you were ever wondering why you see a scan sometimes and a seek at other times, think Selectivity - it helps the optimizer make a cost-based decision.
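You can see this for yourself by comparing the plans of two queries that differ only in how selective the predicate is. A minimal sketch, assuming the AdventureWorks sample database and its non-clustered index on SalesOrderHeader.CustomerID:

```sql
-- Highly selective: one customer's orders (few rows).
-- The optimizer will likely use an Index Seek.
SELECT SalesOrderID, OrderDate
FROM Sales.SalesOrderHeader
WHERE CustomerID = 11000;

-- Low selectivity: a wide range returning many rows.
-- Despite the same index, the optimizer may prefer a
-- Clustered Index Scan because touching most of the table
-- via the index (plus lookups) would cost more.
SELECT SalesOrderID, OrderDate
FROM Sales.SalesOrderHeader
WHERE CustomerID > 11000;
```

Compare the two actual execution plans to watch the operator flip as selectivity changes.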
Did you know that you can get multiple missing index hints in your execution plan? When you switch over to the execution plan tab, you see one missing index hint at the top (in green). When you right-click the hint -> Missing Index Details, you get the index definition script in a new query window. Most would think that is the only missing index hint for the query. In reality, there could be multiple missing index hints at the same time for your given query/plan. To see all of them, right-click the SELECT operator, click Properties, scroll down to the Missing Index section in the Properties window, and expand all the missing index hints. Not very intuitive, but still helpful.
What is an Index Range Scan? Let's say you have a clustered index on the OrderDate column. When you have a predicate like WHERE OrderDate > '2022-01-01' with ORDER BY OrderDate, the optimizer performs a range scan. It simply scans the logically ordered data from a specific point. Note that the data is always logically ordered by the clustering key via the linked list of pages. The data may or may not be physically ordered; that is another story for another day. The linked list is used to perform the scan on this logically ordered data. But then why does the optimizer show an Index Seek iterator for range scans? Well, there is no separate iterator for range scans, and the Index Scan iterator is not the same thing as a range scan. The Index Seek iterator actually makes sense here: the engine first seeks to the point '2022-01-01' in the B-tree traversal and then starts scanning forward from that point.
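A minimal sketch of the idea, using a hypothetical dbo.Orders table (the table and index names are illustrative, not from a real sample database):

```sql
-- Hypothetical table with a clustered index on OrderDate.
CREATE TABLE dbo.Orders
(
    OrderID   INT IDENTITY NOT NULL,
    OrderDate DATETIME NOT NULL,
    INDEX CIX_Orders_OrderDate CLUSTERED (OrderDate)
);

-- Range scan: the engine seeks to '2022-01-01' in the B-tree,
-- then follows the leaf-level linked list forward. The plan
-- still shows this as a (Clustered) Index Seek iterator.
SELECT OrderID, OrderDate
FROM dbo.Orders
WHERE OrderDate > '2022-01-01'
ORDER BY OrderDate;
```

Notice that no Sort operator is needed: the ORDER BY is satisfied for free because the scan follows the index order.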
Continuing my efforts on penning down some notes on SQL Server IO, here is probably the most critical weapon in your arsenal - the PAGEIOLATCH wait type - for identifying workloads that are waiting on page-read IO completion. There are many IO operations (of different sizes) in SQL Server. One of the most common is reading data pages (8 KB) from disk (when they are not already in memory). This is physical IO. As a data page is read in, a latch is acquired to protect the physical integrity of the page. If your SELECT query is waiting for this operation to complete due to an IO bottleneck, then most likely your query is waiting on a PAGEIOLATCH wait type. In other words, the PAGEIOLATCH wait type indicates a delay in reading data pages from disk. Capturing this wait type can give you a lot of insight into IO slowness. But yes, you need to know which queries are causing this wait type. The cumulative wait stats you get from sys.dm_os_wait_stats might not be very helpful if you want to diagnose the problem deeply. You have to capture query-level wait stats using Extended Events.
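For the instance-wide (cumulative) picture, a quick sketch against sys.dm_os_wait_stats looks like this - remember these numbers accumulate since the last restart (or since the stats were cleared) and cannot attribute waits to individual queries:

```sql
-- Cumulative page-read IO waits for the whole instance.
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms,
       wait_time_ms / NULLIF(waiting_tasks_count, 0) AS avg_wait_ms
FROM sys.dm_os_wait_stats
WHERE wait_type LIKE N'PAGEIOLATCH%'
ORDER BY wait_time_ms DESC;
```

A consistently high avg_wait_ms here is a hint to dig deeper with query-level wait capture.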
Continuing my efforts on penning down some notes on SQL Server IO, here is another weapon in your arsenal - identifying workloads that are causing high IO (reads & writes). The DMV to use here is sys.dm_exec_query_stats. This DMV is overwhelming, with so many attributes to compute. Amongst many metrics, you have a bunch around logical reads and writes, physical reads and writes, and a few variations of them. A straightforward retrieval of these metrics may not be very helpful, so, as usual, we need to do some arithmetic to make the metrics meaningful. For example, you can compute the total physical reads caused by multiple executions of each query, sorted in descending order to get the highest values first - in other words, who is causing the most physical IO. Not logical IO; the focus here is physical IO - reading from the disk. Likewise, you can compute physical writes, logical reads, logical writes, and a lot more.
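One possible shape of that query (a sketch; the TOP value and the statement-offset arithmetic are the usual plan-cache pattern, adjust to taste):

```sql
-- Top physical-IO consumers currently in the plan cache.
SELECT TOP (10)
       qs.total_physical_reads,
       qs.execution_count,
       qs.total_physical_reads / qs.execution_count AS avg_physical_reads,
       -- Extract just the offending statement from the batch text.
       SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
                 ((CASE qs.statement_end_offset
                     WHEN -1 THEN DATALENGTH(st.text)
                     ELSE qs.statement_end_offset
                   END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_physical_reads DESC;
```

Swap total_physical_reads for total_logical_writes, total_logical_reads, etc. to slice the other IO dimensions.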
In one of my previous #ABSQLNotes, I talked about computing latency via the sys.dm_io_virtual_file_stats DMV - a critical metric for determining storage performance. Another way to measure latency is via the Avg. Disk sec/Read Performance Monitor counter. In simple words: how much time does it take to complete one read operation? In practice you will read this in milliseconds, and the lower the better. The easiest way to watch it is the counter LogicalDisk\Avg. Disk sec/Read. There are many guidelines on the internet about preferred values for this counter; ideally it should be less than 10 ms, and with today's fast disks, even lower. You need to create a baseline for this metric to get an average over time, and then slice it by peak and non-peak hours. This value is computed for the entire disk and is not exclusive to SQL Server, but it is an important measurement of how fast the disks are able to complete their read operations.
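If you prefer the command line to the Perfmon UI, the built-in Windows typeperf utility can sample the same counter (Windows only; the interval and sample count below are just illustrative):

```shell
:: Sample Avg. Disk sec/Read for all logical disks combined,
:: every 15 seconds, 8 samples (about two minutes of data).
typeperf "\LogicalDisk(_Total)\Avg. Disk sec/Read" -si 15 -sc 8
```

Replace _Total with a specific drive letter instance (e.g. "D:") to isolate the volume hosting your data files.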
For any database software, IO is critical - the READS and the WRITES, and the amount of time it takes to get to the data: the latency factor. So, in essence, the overall storage strategy plays a critical role. You really want a very fast storage subsystem for your queries to go super fast. When I deal with IO metrics, latency is top of mind. Simply put, latency is the time it takes for an IO request to be completed. The storage custodians have many ways to calculate and monitor it. As a SQL pro, how do you compute latency from inside SQL Server? Answer: leverage the sys.dm_io_virtual_file_stats DMV. This DMV has two attributes - num_of_reads & io_stall_read_ms - for each data & log file in your SQL instance. So, looking at the attributes, does it ring some bells? How can you compute latency now?
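The answer, sketched out: divide the total time stalled on reads by the number of reads to get average read latency per file.

```sql
-- Average read latency per database file since the last restart.
SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.physical_name,
       vfs.num_of_reads,
       vfs.io_stall_read_ms,
       vfs.io_stall_read_ms / NULLIF(vfs.num_of_reads, 0) AS avg_read_latency_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id
 AND mf.file_id = vfs.file_id
ORDER BY avg_read_latency_ms DESC;
```

The same idea with num_of_writes and io_stall_write_ms gives you write latency.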
In one of my previous notes, I talked about the histogram, one of the most critical things in the stats object. Histogram, as the name suggests, is a bucketing technique describing how the column data is distributed - the highest value in each bucket, which defines a boundary; how many rows fall between two boundary values; how many rows are equal to a specific boundary value; how many unique values there are within a range; and how many rows there are on average per distinct value. All of this helps the optimizer make the right estimates. Here is a sample histogram on the TotalDue column of the SalesOrderHeader table (not showing all the columns, for brevity):
RANGE_HI_KEY   RANGE_ROWS   EQ_ROWS
============   ==========   =======
26.2769        40           142
30.1444        17           202
Now, let's write a query:
SELECT * FROM Sales.SalesOrderHeader
WHERE TotalDue = 30.1444
The above query will return 202 rows. If you check the cardinality estimation in the execution plan, you will observe Estimated Number of Rows = 202 and Actual Number of Rows = 202. This is the perfect, best-case scenario for the optimizer, where the estimate and the actual match 100%.
How did this happen? Well, this is called a Histogram Step Hit. The predicate value mentioned in the query (the WHERE condition) has a step representation in the histogram, and the optimizer makes a perfect estimation using EQ_ROWS (the number of rows equal to the step value), which is 202.
The above is just one quick example of how the optimizer leverages the histogram.
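You can inspect a histogram yourself with DBCC SHOW_STATISTICS. The stats name below is an assumption - list the actual names on the table from sys.stats first, since auto-created stats get generated names like _WA_Sys_...:

```sql
-- Step 1: find the statistics objects on the table.
SELECT name
FROM sys.stats
WHERE object_id = OBJECT_ID(N'Sales.SalesOrderHeader');

-- Step 2: dump the histogram for a chosen stats object
-- (replace the name with one returned by the query above).
DBCC SHOW_STATISTICS (N'Sales.SalesOrderHeader', N'IX_SalesOrderHeader_TotalDue')
WITH HISTOGRAM;
```

The RANGE_HI_KEY, RANGE_ROWS, and EQ_ROWS columns in the output correspond to the sample table shown above.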
In the SQL community, if you ask anybody about the Missing Index Hints in SSMS, many will give negative feedback and tell you that they are useless. Some frown upon those index hints. I will tell you otherwise. Yes, hints are hints. Don't follow them blindly, but look at them, evaluate, and take a call. In most cases, what is being recommended is a covering index. Will you go ahead and create covering indexes just because SQL Server is telling you to do so? Not at all. Having a lot of covering indexes on a single table is (kind of) a recipe for disaster. But sometimes those hints make complete sense, and at other times they are good candidates for a review. Listen, not everyone is an expert index tuner, so you need a starting point - and that hint is not a bad one. If I were you and just getting started with index tuning, I would take a lot of those hints (as recommendations), copy/paste them into my workbook, and look for common patterns based on my querying patterns. I am certain some decent index definitions would come out of it. Yeah, that could be a winner. So do not ignore the hints completely. They are a good starting point.
You create indexes. SQL Server creates statistics. So many of these objects get created to support your queries. Stats help with estimations; indexes help the engine get to the data faster. But are these objects really being used? Well, for indexes, you can track their usage using DMVs like sys.dm_db_index_usage_stats, sys.dm_db_index_operational_stats, etc. For stats objects, it isn't so straightforward. You have to depend on a few trace flags like 9204, 9292, 2388, etc. These can tell you which stats objects have been used by the optimizer for estimation purposes.
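For the index side of the question, a quick sketch against the usage DMV (counters reset on instance restart, so interpret them against your uptime):

```sql
-- How often each index in the current database has been used
-- since the last restart. user_updates with near-zero reads
-- flags indexes that cost you writes but help no queries.
SELECT OBJECT_NAME(ius.object_id) AS table_name,
       i.name AS index_name,
       ius.user_seeks,
       ius.user_scans,
       ius.user_lookups,
       ius.user_updates
FROM sys.dm_db_index_usage_stats AS ius
JOIN sys.indexes AS i
  ON i.object_id = ius.object_id
 AND i.index_id = ius.index_id
WHERE ius.database_id = DB_ID()
ORDER BY (ius.user_seeks + ius.user_scans + ius.user_lookups) DESC;
```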
The SQL optimizer is heavily dependent on statistics for the estimates it makes and thereby for the cheap execution plan it chooses. Remember, the optimizer is cost-based. Column statistics are automatically created when you create an index on a specific column. And even if you do not create an index, if you put the column in a predicate, SQL Server will first create stats for the column values (provided AUTO_CREATE_STATISTICS is ON, which is the default). Either way, whether you have single-column stats or multi-column stats, the most critical thing is the histogram, which is built on the first column of the stats. Histogram, as the name suggests, is a bucketing technique describing how the column data is distributed - the highest value in each bucket, which defines a boundary; how many rows fall between two boundary values; how many rows are equal to a specific boundary value; how many unique values there are within a range; and how many rows there are on average per distinct value. All of this helps the optimizer make the right estimates and, of course, the stats need to be up-to-date. If not, estimates can go wrong, resulting in a poor plan choice.
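Two quick checks related to the above - whether auto-create is on, and which stats were auto-created and when they were last updated (Sales.SalesOrderHeader is used here as a familiar example table):

```sql
-- Is AUTO_CREATE_STATISTICS on for the current database?
SELECT name, is_auto_create_stats_on
FROM sys.databases
WHERE database_id = DB_ID();

-- Auto-created (single-column) stats are flagged in sys.stats;
-- STATS_DATE shows when each object was last updated.
SELECT s.name,
       s.auto_created,
       STATS_DATE(s.object_id, s.stats_id) AS last_updated
FROM sys.stats AS s
WHERE s.object_id = OBJECT_ID(N'Sales.SalesOrderHeader');
```

A very old last_updated on a hot column is a candidate for a manual UPDATE STATISTICS.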
You have just moved into a new residential society with 5 huge towers. Each tower has 50 floors and each floor has 2 apartments - a total of 500 families. You want to host a New Year bash at your home and want to invite your neighbors from across all the towers. But you are not sure what kind of cuisine they may prefer, so you decide to go to each tower and randomly ask around about their preferences. Each tower has 100 families/homes, so with your randomness, you end up talking to 30 families in each tower. Of course, talking to all 500 families would have been a lot of work; surveying 150 of them was a great job. And now you have some data - people love all sorts of cuisines, but what stands out is Indian food 😉, followed by Italian and then Mexican. OK, you have some idea of what to do, and you start your party preparations. On the day all your neighbors come over, you are extremely pleased to see that the survey worked out so well and the food choices were just right. Lucky you. SQL Server does the same. When you want data out of a table with a predicate (WHERE clause), it estimates the number of records that may be returned and generates a "possibly good" execution plan. This estimation is based on statistics the optimizer generated by sampling the column data on which you have the predicate - exactly like you surveyed the families.
There are many types of memory allocations in SQL Server. One such memory grant is for query execution. The relational engine needs memory to execute your query, and more memory is needed if you have expensive hash and sort operations. Let's take the example of the SORT operator. This is an expensive in-memory operation, and if enough memory is not available, the operation spills to disk. The memory requirement is directly proportional to the size of the data being sorted: sorting a million rows needs more memory than sorting a thousand rows. Unfortunately, the operator has to sort all the data before it can output the first row, and that is why it is a blocking operator too. In the execution plan, when you hover over the SELECT operator, you can see the memory grant for the query execution, along with the overall query cost. Keep an eye on the memory grant. Sometimes it runs into GBs - one of the reasons for a sudden memory burst (requirement). The optimizer estimates the amount of memory needed based on its estimate of the data size it has to sort. This memory is reserved at the beginning of query execution by a memory clerk called MEMORYCLERK_SQLQERESERVATIONS. You can track this clerk, and many others, using the DMV sys.dm_os_memory_clerks.
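A small sketch of that tracking query - how much memory is currently reserved for query-execution grants:

```sql
-- Memory reserved for query-execution grants (sorts, hashes).
SELECT type,
       SUM(pages_kb) / 1024 AS memory_mb
FROM sys.dm_os_memory_clerks
WHERE type = N'MEMORYCLERK_SQLQERESERVATIONS'
GROUP BY type;
```

Drop the WHERE clause (and ORDER BY SUM(pages_kb) DESC instead) to rank all clerks and see where your instance's memory is actually going.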
In one of my previous notes, I talked about wait types. Summary: when your query is waiting on resources, SQL Server assigns it a specific wait type which you can track and capture. There are two DMVs (Dynamic Management Views) that show you real-time information about waits currently happening in the engine while the workload is running: sys.dm_os_waiting_tasks and sys.dm_exec_requests. While the former (os_waiting_tasks) is quite self-explanatory and shows a lot of detail about the current waits, the latter (exec_requests) is also informative and has a wait_type column that shows the current wait encountered by the session. But there is a catch. Observe the nomenclature of the DMVs: "task" vs "request". Got it? No? sys.dm_os_waiting_tasks shows you task-level wait information, and sys.dm_exec_requests shows you request (session)-level wait information. What is the difference? Well, a request can be broken down into multiple tasks, and thereby multiple threads, and each thread could be waiting on a different wait type (a possible scenario). So sys.dm_os_waiting_tasks will show you one record for each task/thread with its specific wait type, but sys.dm_exec_requests will show you a single record for the overall request and may mislead you with a generic wait type. Underneath the session/request, a lot is going on at the task/thread level. So when you are troubleshooting query slowness, focus on task-level waits, not session/request-level ones.
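To see the difference side by side, a sketch that joins the two DMVs - for a parallel query you will get one row per waiting task, each potentially with a different wait type, next to the single request-level wait:

```sql
-- Request-level vs task-level waits, side by side.
SELECT r.session_id,
       r.status,
       r.wait_type       AS request_wait_type,   -- one generic value
       wt.wait_type      AS task_wait_type,      -- one row per waiting task
       wt.wait_duration_ms
FROM sys.dm_exec_requests AS r
LEFT JOIN sys.dm_os_waiting_tasks AS wt
       ON wt.session_id = r.session_id
WHERE r.session_id <> @@SPID        -- exclude this diagnostic query itself
ORDER BY r.session_id, wt.wait_duration_ms DESC;
```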
Taking a very simplified approach, the query you send to SQL Server is either RUNNING or WAITING for resources. When it's waiting, it is still "running", right? When you are on your way to the office (think non-pandemic days 🙂), you wait for the elevator, you wait for the cab, and so on. While you wait for many of these resources, you are still "on your way to the office". It's the same with your query. In its lifetime of "the run", it encounters many instances where it needs resources and has to wait for them if they are not instantly available. SQL Server records these waits and you can capture them. This helps in understanding why your query was running "slow". Well, the query never runs "slow"; it probably is just waiting for a resource. Let's say it wants access to a data page that is locked by another transaction. On a lighter note, you never reach the office late; it's the obstacles you face on your way ;). If you are able to track and capture what your query was waiting on, query tuning becomes a lot simpler.
There are many ways to troubleshoot a slow-running query. One technique is to capture the wait stats for the query. Waits? What are those? In the most simplistic terms, your query is either running or waiting for resources. When the query is waiting, SQL Server assigns one or more wait type(s) to it depending on what it is waiting for. Maybe it's waiting for CPU, or memory, or IO, or for a lock to be released - there are many reasons why execution can stall while the query waits for something. Thankfully, SQL Server captures that information, but there is no easy way to extract it 'only' for your given query. Extended Events can help here. You can set up an Extended Events session and apply a variety of filters to capture wait stats 'only' for your query - cut down on the noise and extract only what you need. Extended Events (XE) is conceptually similar to the Profiler tool, but more advanced and lightweight. While setting up the session, use the wait_info event.
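A minimal sketch of such a session, filtered down to a single session - the session_id value 53, the session name, and the file name are all assumptions you would replace with your own:

```sql
-- Capture completed waits for one session only.
CREATE EVENT SESSION [QueryWaits] ON SERVER
ADD EVENT sqlos.wait_info (
    ACTION (sqlserver.sql_text)
    WHERE sqlserver.session_id = 53   -- assumption: your target SPID
      AND opcode = 1                  -- 1 = wait completed (End)
      AND duration > 0                -- ignore zero-length waits
)
ADD TARGET package0.event_file (SET filename = N'QueryWaits.xel');
GO
ALTER EVENT SESSION [QueryWaits] ON SERVER STATE = START;
```

Run your query in the target session, stop the XE session, and aggregate the captured events by wait_type to get per-query wait stats.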
Just because you see an Index Seek operation, do not relax. All may not be well. Scan or seek, it all depends on the amount of work SQL Server does to fetch the data. Let's say you have a non-clustered index on a column and the optimizer seeks on the index to fetch the data. If the predicate is highly selective (fewer records are returned), then most likely things are well. If it has low selectivity (more records are returned), then even though it's an index seek, it might be touching a larger portion of the leaf level, increasing the IO. Therefore, seek performance is directly proportional to the number of pages the storage engine touches. On another note, multi-column indexes can be very tricky. Depending on the predicate operators, the optimizer may be seeking on the first column of the index but not the second. In that case, the first column forms the seek predicate while the second is evaluated as a residual predicate. The execution plan is non-intuitive and such nuances are not clearly visible. So, the summary is: when you see an Index Seek operation, look for the finer details like IO cost (reads, etc.), cardinality estimation & predicates.
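A sketch of the multi-column case, using a hypothetical dbo.OrderHistory table (all names here are illustrative):

```sql
-- Hypothetical table with a two-column index.
CREATE TABLE dbo.OrderHistory
(
    OrderID    INT IDENTITY PRIMARY KEY,
    CustomerID INT NOT NULL,
    OrderDate  DATE NOT NULL
);
CREATE NONCLUSTERED INDEX IX_OrderHistory_Cust_Date
    ON dbo.OrderHistory (CustomerID, OrderDate);

-- Equality on the leading column: both conditions can be part of
-- the Seek Predicate, so the seek range is tight.
SELECT OrderID FROM dbo.OrderHistory
WHERE CustomerID = 42 AND OrderDate >= '2022-01-01';

-- Inequality on the leading column: only CustomerID narrows the
-- seek; OrderDate = ... is checked as a residual Predicate on
-- every row returned by the (possibly wide) seek range.
SELECT OrderID FROM dbo.OrderHistory
WHERE CustomerID > 42 AND OrderDate = '2022-01-01';
```

In the second plan, hover over the seek operator and compare the "Seek Predicates" property against the separate "Predicate" (residual) property to see which column did the real work.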
Many SQL folks believe that the execution plan they see in the plan cache is an Actual Execution Plan. Unfortunately, it is an Estimated Execution Plan. An Estimated Execution Plan does not have the actual runtime statistics that are so helpful in query tuning; an Actual Execution Plan has the runtime stats you really need. Well, you can use the Query Store feature, which can automatically persist the actual execution plans. But remember, Query Store is a full-fledged feature that needs careful planning before you implement it. Are there other ways of getting actual execution plans? Extended Events? Again, it needs to be implemented, and capturing plans may turn out to be an expensive affair if you do not filter enough - try to capture only what you want and cut down the noise. So what else? In SQL Server 2019, you can use the DMF sys.dm_exec_query_plan_stats to retrieve the equivalent of the last known actual execution plan for a previously cached query plan. Note that this is an opt-in feature. To enable it at the server level for all databases, use trace flag 2451. To enable it at the database level, turn ON the LAST_QUERY_PLAN_STATS database scoped configuration option.
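The database-level opt-in and a retrieval query, sketched (SQL Server 2019+):

```sql
-- Opt in for the current database.
ALTER DATABASE SCOPED CONFIGURATION SET LAST_QUERY_PLAN_STATS = ON;

-- Last known "actual" plan (with runtime stats) for each cached query.
SELECT qs.sql_handle,
       ps.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_query_plan_stats(qs.plan_handle) AS ps;
```

Click the query_plan XML in SSMS to open it as a graphical plan and inspect the captured runtime counters.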
A thread is the lowest unit of execution. Your SQL query needs at least one thread to run. Depending on the SQL instance configuration and the eligibility of the query, SQL Server may decide to execute your query with multiple threads - this is called parallelism. If too many such workloads are parallelized and they are long-running, your subsequent incoming queries might not even get one thread to run. In other words, they will be waiting for a thread to be assigned to them, because SQL Server has a limited number of worker threads (by default, based on the number of processors on the hardware). This wait is called THREADPOOL. It is not a good situation to be in and may go unnoticed. If you are facing this wait type, consider reducing parallelism (one of the ways out). Increase the Cost Threshold for Parallelism (server config) so that inexpensive workloads do not run in parallel, or change the MAXDOP value to a reasonable number, say 4, especially on hardware with a lot of cores. Maybe you can set MAXDOP to the number of cores per NUMA node. This will reduce parallelism and thereby make more threads available in the thread pool.
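The two server-config changes above, sketched with sp_configure - the values 50 and 4 are illustrative starting points, not recommendations; baseline your workload before changing them:

```sql
-- Raise the cost threshold so cheap queries stay serial,
-- and cap the degree of parallelism.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'cost threshold for parallelism', 50;  -- illustrative
EXEC sp_configure 'max degree of parallelism', 4;        -- illustrative
RECONFIGURE;

-- Check whether the instance has accumulated THREADPOOL waits.
SELECT wait_type, waiting_tasks_count, wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = N'THREADPOOL';
```

Nonzero, growing THREADPOOL numbers are the signal that incoming queries have had to queue for worker threads.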