So far in this book, you came across examples of creating new data types or user-defined types and operators. What we haven't discussed so far is how to index these types. In PostgreSQL, an index is more of a framework that can be extended or customized for using different strategies. In order to create new index access methods, we have to create an operator class. Let's take a look at a simple example.
Let's consider a scenario where you have to store some special data such as an ID or a social security number in the database. The number may contain non-numeric characters, so it is defined as a text type:
CREATE TABLE test_ssn (ssn text); INSERT INTO test_ssn VALUES ('222-11-020878'), INSERT INTO test_ssn VALUES ('111-11-020978'),
Let's assume that the correct order for this data is such that it should be sorted on the last six digits and not the ASCII value of the string.
The fact that these numbers need a unique sort order presents a challenge when it comes to indexing the data. This is where PostgreSQL operator classes are useful. An operator allows a user to create a custom indexing strategy.
Creating an indexing strategy is about creating your own operators and using them alongside a normal B-tree.
Let's start by writing a function that changes the order of digits in the value and also gets rid of the non-numeric characters in the string to be able to compare them better:
CREATE OR REPLACE FUNCTION fix_ssn(text) RETURNS text AS $$ BEGIN RETURN substring($1,8) || replace(substring($1,1,7),'-',''), END; $$LANGUAGE 'plpgsql' IMMUTABLE;
Let's run the function and verify that it works:
testdb=# SELECT fix_ssn(ssn) FROM test_ssn; fix_ssn ------------- 02087822211 02097811111 (2 rows)
Before an index can be used with a new strategy, we may have to define some more functions depending on the type of index. In our case, we are planning to use a simple B-tree, so we need a comparison function:
CREATE OR REPLACE FUNCTION ssn_compareTo(text, text) RETURNS int AS $$ BEGIN IF fix_ssn($1) < fix_ssn($2) THEN RETURN -1; ELSIF fix_ssn($1) > fix_ssn($2) THEN RETURN +1; ELSE RETURN 0; END IF; END; $$ LANGUAGE 'plpgsql' IMMUTABLE;
It's now time to create our operator class:
CREATE OPERATOR CLASS ssn_ops FOR TYPE text USING btree AS OPERATOR 1 < , OPERATOR 2 <= , OPERATOR 3 = , OPERATOR 4 >= , OPERATOR 5 > , FUNCTION 1 ssn_compareTo(text, text);
You can also overload the comparison operators if you need to compare the values in a special way, and use the functions in the compareTo
function as well as provide them in the CREATE OPERATOR CLASS
command.
We will now create our first index using our brand new operator class:
CREATE INDEX idx_ssn ON test_ssn (ssn ssn_ops);
We can check whether the optimizer is willing to use our special index, as follows:
testdb=# SET enable_seqscan=off; testdb=# EXPLAIN SELECT * FROM test_ssn WHERE ssn = '02087822211'; QUERY PLAN ------------------------------------------------------------------ Index Only Scan using idx_ssn on test_ssn (cost=0.13..8.14 rows=1 width=32) Index Cond: (ssn = '02087822211'::text) (2 rows)
Therefore, we can confirm that the optimizer is able to use our new index.
You can read about index access methods in the PostgreSQL documentation at http://www.postgresql.org/docs/current/static/xindex.html.
18.223.162.55