Logical difference between NOT IN and NOT EXISTS:
Though 'Not in' and 'Not exists' serve the same purpose, there is a logical difference
in their operation other than 'Not Exists' being correlated.
Before I explain the difference let us see an example.
CREATE TABLE #capital (
capital_id INT,
city_name VARCHAR(100))
CREATE TABLE #city (
city_name VARCHAR(100),
country VARCHAR(100))
INSERT INTO #capital
SELECT 1,
'New Delhi'
UNION
SELECT 2,
'Singapore'
UNION
SELECT 3,
'London'
INSERT INTO #city
SELECT 'New Delhi',
'India'
UNION
SELECT 'Singapore',
'Singapore'
UNION
SELECT NULL,
NULL
Two tables are created.. namely #capital,#city. #capital table contains three cities
.. New Delhi,singapore, London. The #city table contains New Delhi,Singapore and a NULL
row.
We would like to write a query to find the city that is present in Capital and not in
city table. Expected result is London.
First let us take 'NOT IN':
SELECT *
FROM #capital
WHERE city_name NOT IN (SELECT city_name
FROM #city)
Straight forward query..
To our surprise, the query returns Zero rows.
Now for NOT EXISTS.
SELECT *
FROM #capital cp
WHERE NOT EXISTS (SELECT cy.city_name
FROM #city cy
WHERE cy.city_name = cp.city_name)
Hey presto..it works!!! London is returned.
Why this behaviour with NOT in?
The reason lies in the way in which 'NOT IN' handles NULL.
First point to note is 'NOT IN' condition can return three possible values.
They are True,False,NULL. But 'NOT EXISTS' always returns either True/False.
Closer look at how NOT IN works:-
NOT IN condition returns TRUE when condition check is successful
EX: 'London' NOT IN ( 'New Delhi','Singapore').
NOT IN condition returns FALSE when condition check FAILS
EX: 'London' NOT IN ( 'London','Singapore').
NOT IN condition returns NULL when condition check is made against NULL.
EX:
'London' NOT IN
( 'New Delhi','Singapore',NULL)
Returns NULL because 'London' not in ( NULL) will always return NULL and not TRUE as expected. As for the row to be successfully returned the result of the condition checked ( in this case 'not in') should be TRUE. But because of NULL returned by NOT in ( instead of TRUE ) it is discarded. So if the sub query contains NULL then for NOT IN condition no rows will be returned for any value.
Though 'IN' also has a similar behavior the anomaly is realized only in NOT IN because
'London' IN
( 'New Delhi','Singapore',NULL)
Returns NULL and not TRUE, so 'London' is not returned.
'London' IN
( 'New Delhi','Singapore','London',NULL)
Returns True as 'London' in ( 'London' ) is TRUE though 'London' in ( NULL ) is NULL.
Note that 'IN' needs just one TRUE.
But NULL comparison using '=' returns FALSE always ( assuming you haven't changed the default behavior using SET ANSI_DEFAULTS option). ie NULL = 'London' is always False. So, Not Exists always works.
As I have stressed before , another reason to correlate queries.
Performance wise as well NOT EXISTS performs better than NOT IN.
Tuesday, November 3, 2009
NOT IN VERSUS NOT EXISTS
Subscribe to:
Post Comments (Atom)
2 comments:
Hey. i was searching for this topic for some time. Thanks for the example.
Welcome :)
Post a Comment