Creating index access methods

So far in this book, you came across examples of creating new data types or user-defined types and operators. What we haven't discussed so far is how to index these types. In PostgreSQL, an index is more of a framework that can be extended or customized for using different strategies. In order to create new index access methods, we have to create an operator class. Let's take a look at a simple example.

Let's consider a scenario where you have to store some special data such as an ID or a social security number in the database. The number may contain non-numeric characters, so it is defined as a text type:

CREATE TABLE test_ssn (ssn text);
 
INSERT INTO test_ssn VALUES ('222-11-020878'),
INSERT INTO test_ssn VALUES ('111-11-020978'),

Let's assume that the correct order for this data is such that it should be sorted on the last six digits and not the ASCII value of the string.

The fact that these numbers need a unique sort order presents a challenge when it comes to indexing the data. This is where PostgreSQL operator classes are useful. An operator allows a user to create a custom indexing strategy.

Creating an indexing strategy is about creating your own operators and using them alongside a normal B-tree.

Let's start by writing a function that changes the order of digits in the value and also gets rid of the non-numeric characters in the string to be able to compare them better:

CREATE OR REPLACE FUNCTION fix_ssn(text)
 RETURNS text AS $$
 BEGIN
 
        RETURN substring($1,8) || replace(substring($1,1,7),'-',''),
 
END; 
$$LANGUAGE 'plpgsql' IMMUTABLE;

Let's run the function and verify that it works:

testdb=# SELECT fix_ssn(ssn) FROM test_ssn;
  fix_ssn   
-------------
 02087822211
 02097811111
(2 rows)

Before an index can be used with a new strategy, we may have to define some more functions depending on the type of index. In our case, we are planning to use a simple B-tree, so we need a comparison function:

CREATE OR REPLACE FUNCTION ssn_compareTo(text, text)
RETURNS int AS
$$
 BEGIN
    IF fix_ssn($1) < fix_ssn($2)
    THEN
        RETURN -1;
    ELSIF fix_ssn($1) > fix_ssn($2)
    THEN
        RETURN +1;
       ELSE
           RETURN 0;
     END IF;
END;
$$ LANGUAGE 'plpgsql' IMMUTABLE;

It's now time to create our operator class:

CREATE OPERATOR CLASS ssn_ops
FOR TYPE text USING btree
AS
OPERATOR        1       <  ,
OPERATOR        2       <= ,
OPERATOR        3       =  ,
OPERATOR        4       >= ,
OPERATOR        5       >  ,
FUNCTION        1       ssn_compareTo(text, text);

You can also overload the comparison operators if you need to compare the values in a special way, and use the functions in the compareTo function as well as provide them in the CREATE OPERATOR CLASS command.

We will now create our first index using our brand new operator class:

CREATE INDEX idx_ssn ON test_ssn (ssn ssn_ops);

We can check whether the optimizer is willing to use our special index, as follows:

testdb=# SET enable_seqscan=off; 
testdb=# EXPLAIN SELECT * FROM test_ssn WHERE ssn = '02087822211';
                                  QUERY PLAN                                  
------------------------------------------------------------------
 Index Only Scan using idx_ssn on test_ssn  (cost=0.13..8.14 rows=1 width=32)
   Index Cond: (ssn = '02087822211'::text)
(2 rows)

Therefore, we can confirm that the optimizer is able to use our new index.

You can read about index access methods in the PostgreSQL documentation at http://www.postgresql.org/docs/current/static/xindex.html.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.223.162.55