Phil Factor's Phrenetic Phoughts

Simple-Talk columnist
The wilder shores of Transact SQL    Phil on Twitter   Phil on SQL Server Central"

Much ado about Null

Published Monday, December 04, 2006 7:42 PM

One of the best ways to test the knowledge of a database programmer is to ask very simple questions. I was reminded of this the other day whilst trying to put an explanation of a Relational Database into language which the proverbial Man in the Pub could understand. There is always a temptation to scurry for the cover of a thicket of long words to cover up ones wooly thinking.  You can't do that with the Man in the Pub, not if you want him to buy the next round, that is. It gave me pause for thought.

The concept of NULL is one of those subjects that sorts out the sheep from the goats. Despite what you've been told it means 'Unknown'. SQL 92 finally nailed the confusion that slipped in because of its misleading name. It doesn't mean 'Nothing' . The 'propagation of nulls' is nothing more than common sense.

For example if you have an unknown quantity and add 1 to it, it is still unknown. In fact, whatever mathematical operations you do with an unknown quantity are likely to end up with an unknown result

SELECT NULL+1

/*
-----------
NULL
*/

It will be the same with strings. An unknown string, concatenated with the words 'hello world will result in a different unknown string.

SELECT 'A known value concatenated with '+NULL

/* 
---------------------------------
NULL

(1 row(s) affected)*/

And, of course, SQL92 will spit on the idea of two unknowns being equal, or the result of something being compared with null to be anything other than unknown (i.e. NULL)

IF NULL NULL

      
Select 
'his server is not SQL-92 compliant'

ELSE

      Select 
'This server is SQL-92 compliant'

It is when you are doing aggregate functions that it all gets interesting.

CREATE TABLE #test (runner VARCHAR(80),Score INT)

INSERT INTO #test SELECT 'Fred',10
INSERT INTO #test SELECT 'Dave',4
INSERT INTO #test SELECT 'Simon',5
INSERT INTO #test SELECT 'Brian',11
INSERT INTO #test SELECT 'Neil',2
INSERT INTO #test SELECT 'Lionel',6
INSERT INTO #test SELECT 'Bart',7
INSERT INTO #test SELECT 'Helen',9
INSERT INTO #test SELECT 'Kathryn',5
INSERT INTO #test SELECT 'Richard',0
SELECT AVG(scoreFROM #test
/*
-----------
5

(1 row(s) affected)
*/
SELECT SUM(ScoreFROM #test
/*
-----------
59

(1 row(s) affected)
*/
UPDATE #test SET score=NULL WHERE runner='richard'
SELECT AVG(scoreFROM #test
/*
-----------
6
Warning: Null value is eliminated by an aggregate or other SET operation.

(1 row(s) affected)*/
SELECT SUM(scoreFROM #test
/*
-----------
59
Warning: Null value is eliminated by an aggregate or other SET operation.

(1 row(s) affected)
*/

Now, you can see what is happening here. The row with an unknown score is being removed from the calculation so that the sum is divided by the nine known scores rather then the ten. The sum is the sum of all non-null rows.

Well, the sum of a column with a null value in it is surely NULL. so the average is as well. Instead, SQL Server assumes you've made a mistake and whisks out the row with the null value in it.

There is only one thing to be done with this sort of contradiction, and that is to avoid doing any aggregate operations on rows which have a NULL in them.

The behaviour of NULL, especially with NULLIF or COALESCE allows you to do all sorts of tricks in SQL,  which are complicated to achieve any other way. Just as long as you remember that, in SQL 92, NULL means Unknown, not Nothing!

Comments

 

Richard Mitchell said:

But then you also have the fun of if ANSI_NULLS is turned on and off which changes the behaviour of comparisons, when ON
            NULL = NULL returns NULL
when OFF
            NULL = NULL returns TRUE

Oh the joy.
December 5, 2006 4:38 AM
 

Phil Factor said:

If ANSI_NULLS is turned off then all hell breaks loose with the logical simplicity of NULLS. This feature is now deprecated, thank goodness.
When SET ANSI_NULLS is ON, all comparisons against a null value evaluate to UNKNOWN. When SET ANSI_NULLS is OFF, comparisons of all data against a null value evaluate to TRUE if the data value is NULL

SET ANSI_NULLS OFF

SELECT 
'This is '+CASE WHEN NULL=NULL THEN 'not ' ELSE '' END +'SQL-92 compliant'
/*
----------------------------
This is not SQL-92 compliant

(1 row(s) affected)
*/
SET ANSI_NULLS ON
SELECT 
'This is '+CASE WHEN NULL=NULL THEN 'not ' ELSE '' END +'SQL-92 compliant'
/*
-------------------------
This is SQL-92 compliant

(1 row(s) affected)
*/

Now we know that in our harmonious world, NULL=NULL is not true, but what about NOT (NULL=NULL). Because the SQL 92 behaviour specifies that the result of the logical expression is NULL, then both expressions give the same result of 'UNKNOWN', which becomes NOT TRUE if it appears in IF or CASE expressions. It takes a bit of thought to realise that, in a logical universe where we accept the existence of the UNKNOWN, (<logical expression>)  should give the same result as NOT (<logical expression>) 

December 5, 2006 4:56 AM
 

crappy said:

Welll, I'd agree with you if you'd successfully proven that

Nothing != Unknown

Perhaps in SQL93
December 11, 2006 12:25 AM
 

Phil Factor said:

I think that what we're saying is that an unknown result remains unknown whatever logical or mathematical expression one applies to it; which seems reasonable to me. The danger comes because NULL is accepted in conditional expressions such as 'IF', 'WHEN' or 'CASE', rather than triggering an error.
December 11, 2006 2:58 AM
 

rudyx said:

I was always taught that NULL means "unquantifiable". I do believe that it is more accurate and less confusing term than "unknown". This goes back a bit to mathematics where you have lots of "unknowns", a.k.a. variables, which in most cases are "quantifiable" !
December 11, 2006 3:42 PM
 

Phil Factor said:

I agree that 'Unquantifiable' is much more accurate, although probably not a word so commonly used as 'Unknown'. Also, both SQL92 and BOL refer to 'unknown' when discussing NULL, so I haven't the choice.

BOL for SQL Server 2005 makes a pretty good job of explaining how NULL works in SQL Server so I'd refer anyone to it.

SQL Server 2005 Books Online: Null Values
December 13, 2006 9:42 AM
You need to sign in to comment on this blog

















<December 2006>
SuMoTuWeThFrSa
262728293012
3456789
10111213141516
17181920212223
24252627282930
31123456
A SysAdmin's Guide to Change Management
 In the first in a series of monthly articles, ‘Confessions of a Sys Admin’, Matt describes the issues... Read more...

Exchange: Recovery Storage Groups
 It can happen at any time: You get a request, as Admin, from your company, to provide the contents of... Read more...

Build Your Own Virtualized Test Lab
 Desmon explains the fundamentals of building a test lab for Windows servers and Enterprise applications... Read more...

Rendering Hierarchical Data with the Treeview
 It sometimes happens that Web Server controls that visualize data don't quite fit with the way that... Read more...

SQL Server 2008: Performance Data Collector
 With Performance Data Collector in SQL Server 2008, you can now store performance data from a number of... Read more...