Chapter 4. Reading Data Efficiently and Correctly

Originally, this chapter was called "Reading Data Efficiently." Thinking about it for a while actually convinced me that "and Correctly" should be appended to the title. The reason is that this chapter is not just about speed; it is also about getting things right and ensuring that the result can be generated fast. What good is a fast result if it happens to be just plain wrong?

The following issues will be covered:

  • NULL handling
  • Fixing joins
  • Using cursors
  • Synchronized seq scans
  • Prepared queries
  • Indexes and foreign keys

Understanding the power of NULL

Some of you might wonder why a section about NULL is included in this book. Remember that it is all about troubleshooting, and trust me, NULL can be a real source of trouble if its pitfalls are not understood properly. NULL is actually a pretty old thing. E. F. Codd (the father of relational databases) mentioned the concept of NULL in a 1975 paper. The idea of NULL is therefore central to the world of relational systems.

However, NULL has not always been widely understood and accepted among SQL developers. Many people have experienced troubles in the face of NULL. In this section, we will shed some light on this mysterious thing and solve the riddle of NULL.

The first and most important thing to know is that NULL does not mean empty string; it means unknown. Not knowing is not the same as empty. For example, if you know that your wallet is empty, then it means you know that there is nothing in it. Not knowing what is in your wallet is a totally different situation. It is important to keep this in mind. The golden rule is that NULL means undefined.

Seeing NULL in action

After this bit of theoretical introduction, it is time to dive into some real action to see how NULL really behaves. The first important observation is that NULL is not the same as an empty string:

test=# SELECT NULL = '';
 ?column? 
----------
 
(1 row)

Many people expect this query to return false. In fact, the return value is NULL. Why? Imagine you are holding nothing in your left hand. We don't know what is in your right hand. Do both hands contain the same thing? We don't know. The second hand might be empty as well but we don't know for certain. Therefore, the answer to this question is unknown.

To check whether a certain value is NULL or not, IS NULL has to be used. In this case, the answer will be definitely false:

test=# SELECT '' IS NULL;
 ?column? 
----------
 f
(1 row)

The same logic applies to the next example:

test=# SELECT NULL = NULL;
 ?column? 
----------
 
(1 row)

test=# SELECT NULL IS NULL;
 ?column? 
----------
 t
(1 row)

Again, the first query returns NULL, as we don't know if both sides are the same. Remember that the unknown item in your left hand may not be the same as the unknown item in your right hand.

NULL and storage

The good news is that NULL does not need space in the disk. In PostgreSQL, every row contains a so-called NULL bitmap. It indicates whether a field happens to be NULL or not. Therefore, an individual NULL value does not need any additional space in the disk.

This is not quite true for an empty string. A varchar string will need a fair amount of space even if it is empty because it is a perfectly defined value.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.133.132.99