A hash store, key value store, or associative array is a famous data structure among modern programing languages such as Java, Python, and Node.js. Also, there are dedicated database frameworks to handle this kind of data, such as the redis database.
PostgreSQL has supported hash store—hstore
—since the PostgreSQL version 9.0. The hstore
extension allows developers to leverage the best in different worlds. It increases the developer's agility without sacrificing the powerful features of PostgreSQL. Also, hstore
allows the developer to model semistructured data and sparse arrays in a relational model.
To create the hstore
, one simply needs to execute the following command:
CREATE EXTENSION hstore;
The textual representation of hstore
includes zero or higher key=> value
, followed by a comma. An example of the hstore
data type is as follows:
car_portal=# SELECT 'tires=>"winter tires", seat=>leather'::hstore; hstore ------------------------------------------- "seat"=>"leather", "tires"=>"winter tires" (1 row)
One could also generate a single value hstore
using the hstore(key, value)
function:
car_portal=# SELECT hstore('´Hello', 'World'); hstore ------------------- "´Hello"=>"World" (1 row)
Note that in hstore
, the keys are unique, as shown in the following example:
car_portal=# SELECT 'a=>1, a=>2'::hstore; hstore ---------- "a"=>"1" (1 row)
In the car web portal, let's assume that the developer wants to support several other attributes, such as air bags, air conditioning, power steering, and so on. The developer, in the traditional relational model, should alter the table structure and add new columns. Thanks to hstore
, the developer can store this information using the key value store without having to keep altering the table structure, as follows:
ALTER TABLE car_portal_app.car ADD COLUMN features hstore;
One limitation of the hstore
is that it is not a full document store, so it is difficult to represent nested objects in an hstore
. One advantage of an hstore
is that it can be indexed using the GIN and GiST indexes.
The ->
operator is used to get a value for a certain key. To append an hstore
, the ||
concatenation operator can be used. Furthermore, the minus sign (-
) is used to delete a key value pair. To update an hstore
, the hstore
can be concatenated with another hstore
that contains the updated value. The following example shows how hstore
keys can be inserted, updated, and deleted:
CREATE TABLE car_test_hstore ( car_id INT PRIMARY KEY, features hstore ); INSERT INTO car_test_hstore(car_id, features) VALUES (1, 'Engine=>Diesel'::hstore); -- To add a new key UPDATE car_test_hstore SET features = features || hstore ('Seat', 'Lethear') WHERE car_id = 1; -- To update a key, this is similar to add a key UPDATE car_test_hstore SET features = features || hstore ('Engine', 'Petrol') WHERE car_id = 1; -- To delete a key UPDATE car_test_hstore SET features = features - 'Seat'::TEXT WHERE car_id = 1; SELECT * FROM car_test_hstore WHERE car_id = 1; --- Result car_id | features --------+-------------------- 1 | "Engine"=>"Petrol" (1 row)
The hstore
data type is very rich in functions and operators; there are several operators to compare hstore
content. For example, the ?
, ?&
, and ?|
operators can be used to check whether hstore
contains a key, set of keys, or any of the specified keys, respectively. Also, an hstore
can be cast to arrays, sets, and JSON documents.
As an hstore
data type can be cast to a set using each (hstore)
function, one can use all the relational algebra set operators on an hstore
, such as DISTINCT
, GROUP BY
, and ORDER BY
.
The following example shows how to get distinct hstore
keys; this could be used to validate hstore
keys:
TRUNCATE Table car_test_hstore; INSERT INTO car_test_hstore(car_id, features) VALUES (1, 'Engine=>Diesel'::hstore); INSERT INTO car_test_hstore(car_id, features) VALUES (2, 'engine=>Petrol, seat=>lether'::hstore); car_portal=# SELECT DISTINCT (each(features)).key FROM car_test_hstore; key -------- Engine seat engine (3 rows)
An hstore
data type can be indexed using the GIN and GiST indexes, and picking the right index type depends on several factors, such as the number of rows in the table, available space, index search and update performance, the queries' pattern, and so on.
In general, GIN index lookups are three times fasters than GiST, but the former are more expensive to build and update and require more space. To properly pick up the right index, it is good to perform a benchmarking.
The following example shows the effect of using the GIN index in retrieving a record that has a certain key. The ?
operator returns true
if hstore
contains a key:
CREATE INDEX ON car_test_hstore USING GIN (features); SET enable_seqscan to off; EXPLAIN SELECT car_id, features->'Engine' FROM car_test_hstore WHERE features ? 'Engine'; QUERY PLAN -------------------------------------------------------------------------------------------- Bitmap Heap Scan on car_test_hstore (cost=8.00..12.02 rows=1 width=36) Recheck Cond: (features ? 'Engine'::text) -> Bitmap Index Scan on car_test_hstore_features_idx1 (cost=0.00..8.00 rows=1 width=0) Index Cond: (features ? 'Engine'::text) (4 rows)
Certainly, if an operator is not supported by the GIN index, such as the ->
operator, one can still use the B-tree index as follows:
CREATE INDEX ON car_test_hstore((features->'Engine')); SET enable_seqscan to off; EXPLAIN SELECT car_id, features->'Engine' FROM car_test_hstore WHERE features->'Engine'= 'Diesel'; QUERY PLAN ------------------------------------------------------------------------------------------------- Index Scan using car_test_hstore_expr_idx on car_test_hstore (cost=0.13..8.15 rows=1 width=36) Index Cond: ((features -> 'Engine'::text) = 'Diesel'::text) (2 rows)
3.147.36.61