Hi Friends,

A very common issue that I see on forums is that SQL Server NOT IN clause not working. Here is a quick repro:

Note: These are dummy tables just for a quick demo so please do not try to find the logic between the two tables πŸ™‚

I create a customers table with custid and custname columns. I insert three records in it.

USE tempdb

create table tbl_customers
(custid int, custname varchar(50))

insert into tbl_customers

insert into tbl_customers

insert into tbl_customers
(custid) values(3)

Note that a NULL gets inserted in the third record for custname columns. This is intentional.

This is what the customers table looks like, note the NULL there.

sql_server_not_in_clause_not working

I create another table: Orders table with two columns, ordered and custname and insert three records.

create table tbl_orders
(orderid int, custname varchar(50))

insert into tbl_orders

insert into tbl_orders

insert into tbl_orders

Note that in the third order I insert a customer name β€˜C’ that does not exist in customers table. (I told you not to find the logic, just understand the technical issue)

This is what the Orders table looks like:

sql_server_not_in_clause_not working

Now my final query: Give me all the customer names from Orders table that do not exist in customers table.

select custname from tbl_orders
where custname
NOT IN(select custname from tbl_customers)

And I get an empty result set.

sql_server_not_in_clause_not working

I was expecting β€˜C’ to be returned.

The technical issue is:

NOT IN clause in SQL Server is nothing but a series of NOT EQUAL TO. One of the values from the subquery is a NULL. The result set (custname from tbl_customers) contains A, B & NULL. Every value from the outer query is compared with every value from the inner query. And when the comparison is done with NULL, it returns UNKNOWN because NULL is an unknown value and anything compared with unknown can only result in unknown. And UNKNOWN in this case of Logical comparison is treated as FALSE and an empty result set is being returned.

Solution 1: Put an additional filter.

select custname from tbl_orders
where custname
NOT IN(select custname from tbl_customers
        where custname is not null)

Solution 2: Use Left Join

SELECT t.custname
FROM tbl_orders t LEFT JOIN
tbl_customers c ON t.custname = c.custname
WHERE c.custname IS NULL

Solution 3: Use NOT EXIST clause

select custname from tbl_orders o
NOT EXISTS(select custname from tbl_customers c
            where c.custname = o.custname)

Solution 4: Using EXCEPT operator

select custname from tbl_orders
select custname from tbl_customers

All the solutions give you this output:

sql_server_not_in_clause_not working

Note that EXCEPT operator behaves a little differently. If there were two instances of C, it will only show one instance. It applies a DISTINCT SORT and removes duplication.

Now, if you run all four of them and check their execution plans, the first three seem to perform equally well. However the EXCEPT solution is expensive because of the DISTINCT SORT.

sql_server_not_in_clause_not working_5

So choice is yours and what you expect from the resultset!

And that’s why it is very important to know your data well. You should be knowing whether the target column allows NULL or is it NULLABLE or not πŸ™‚

Want to watch this in action? Watch here!



Data Platform Virtual Summit 2020

Subscribe to SQLMaestros YouTube channel. If you want more learning content in your inbox, subscribe to SQLMaestros Bulletin.

SQLMaestros YouTube | SQLMaestros Bulletin | SQLMaestros Twitter