Some applications require large amounts of data to be read from the database. Usually, two major cases can be distinguished:
In this section, both cases will be covered, because they have slightly different requirements.
Cursors are a fundamental concept provided by all reasonable relational databases. However, cursors are often ignored or not used the way they should be, which leads to problems down the road.
Why would anybody want to use a cursor, and what is it all about anyway? Consider the following example:
SELECT * FROM very_large_table;
For PostgreSQL, this is no problem at all. It will calculate the result and send it to the client. However, the client may instantly crash, because it will try to store all of the data in memory. If the table is really large, the client may simply not be able to cope with all of the data coming from the server: it will consume a lot of memory or even die. The biggest problem is that whether or not your client survives depends on the amount of data in the table, which is a sure recipe for disaster, and one that is also difficult to identify.
In general, your applications should be designed in a way such that they will always work, regardless of the amount of data and regardless of the result sets coming back from the server. A cursor is here to fix this very problem. The idea behind a cursor is to allow the client to keep only as much data around as will be needed at a specific time.
To make use of cursors, the DECLARE command has to be used:
test=# \h DECLARE
Command:     DECLARE
Description: define a cursor
Syntax:
DECLARE name [ BINARY ] [ INSENSITIVE ] [ [ NO ] SCROLL ]
    CURSOR [ { WITH | WITHOUT } HOLD ] FOR query
In our example, the goal is to read 10 rows without having to keep all 10 rows in memory at once. Demo data can be created easily:
test=# CREATE TABLE t_test AS
           SELECT * FROM generate_series(1, 10) AS x;
SELECT 10
In the next step, the data should be read by the application. The most important thing to do here is to wrap the operation inside a transaction (transactions will be covered in detail in a later chapter of this book). The cursor can then be easily defined:
test=# BEGIN;
BEGIN
test=# DECLARE mycur CURSOR FOR SELECT * FROM t_test;
DECLARE CURSOR
Note that creating the cursor does not do anything at this point. It merely means that PostgreSQL is now prepared to send a lot of data to you. The real work is done as soon as FETCH is called:
test=# FETCH NEXT FROM mycur;
 x
---
 1
(1 row)
FETCH NEXT will yield one row from the result set. While it is possible to call FETCH NEXT for every row in the result, the trouble is that doing so is really slow: fetching 1 million rows one by one means a lot of network overhead. Therefore, it is highly recommended to fetch data in larger buckets. In general, buckets of several hundred to several thousand rows make perfect sense, as memory consumption and network overhead are then in reasonable balance. However, don't take this as a fixed guideline; things depend on your type of application, your data, your network, and so on. It can make sense to experiment.
Here is how more than one row can be fetched at a time:
test=# FETCH 3 FROM mycur;
 x
---
 2
 3
 4
(3 rows)
It is now possible to loop through the entire result as needed.
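As a sketch, such a loop typically looks as follows on the client side (the bucket size of 1,000 is an arbitrary choice for illustration; tune it as discussed previously):

```sql
BEGIN;
DECLARE mycur CURSOR FOR SELECT * FROM very_large_table;

-- the application repeats this statement until FETCH
-- returns fewer rows than requested (or no rows at all):
FETCH 1000 FROM mycur;

COMMIT;   -- closes the cursor automatically
```

This way, the client never holds more than one bucket of rows in memory, regardless of how large the table is.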
To finish the operation, it is enough to commit the transaction. All cursors will be closed automatically:
test=# COMMIT;
COMMIT
It is important to note that after a COMMIT statement, the cursor is not there anymore:

test=# FETCH NEXT FROM mycur;
ERROR:  cursor "mycur" does not exist
By default, a cursor is only visible inside the transaction that created it; therefore, very long cursor-related operations can also cause issues such as replication conflicts. If you really want a cursor to survive the transaction, you have to create a so-called WITH HOLD cursor. In this case, however, you have to take care of cleaning up the cursor yourself by calling CLOSE:
test=# \h CLOSE
Command:     CLOSE
Description: close a cursor
Syntax:
CLOSE { name | ALL }
WITH HOLD cursors should be handled with care. The most important observation is that a WITH HOLD cursor has to materialize its result, which can have a significant impact on the amount of storage required on the database server.
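Putting the pieces together, a WITH HOLD cursor can be used as sketched here (the cursor name holdcur is made up for this example):

```sql
BEGIN;
DECLARE holdcur CURSOR WITH HOLD FOR SELECT * FROM t_test;
COMMIT;   -- the result set is materialized at this point

FETCH 3 FROM holdcur;   -- still works after COMMIT

-- no transaction will clean up for us anymore,
-- so the cursor has to be closed explicitly:
CLOSE holdcur;
```

Forgetting the final CLOSE means the materialized result lingers for the lifetime of the session, so this cleanup step should never be skipped.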
Sometimes, reading is not about shipping data to the client but about processing data on the server side. This is exactly where synchronized sequential scans come into play.
Let's imagine the following scenario: a 10 TB table contains some old reporting data. It is stored on a server containing a couple of traditional mechanical disks. Let's consider the following query:
SELECT error_type, count(*) FROM big_table GROUP BY 1;
Assuming that error_type has just a couple of distinct values, PostgreSQL will turn to a sequential scan and process the query. The I/O system might deliver data at a rate of 500 MB per second, with all disks reading nicely in sequence.
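To verify this, EXPLAIN can be used. With no suitable index on big_table, a plan along the following lines is to be expected (costs and row estimates will of course differ on your system):

```sql
EXPLAIN SELECT error_type, count(*)
FROM    big_table
GROUP BY 1;

-- HashAggregate
--   Group Key: error_type
--   ->  Seq Scan on big_table
```

The Seq Scan node is the part that reads the entire 10 TB table from disk; the aggregation on top is comparatively cheap.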
Let's assume now that the query shown here has already processed 2 TB of data. Then somebody comes along and fires up a second, similar query. Even though the data has already been read by the previous job, it has long since fallen out of the cache, so it has to be read once again. But things are even worse: those spinning disks will start jumping back and forth between the first and the second query. The more queries are added, the more random the disk access pattern becomes. After about 10 queries, the total throughput may drop to approximately 60 MB per second, which means just 6 MB per second per database connection. Clearly, this level of performance is unacceptable.
Synchronized sequential scans come to the rescue. Let's assume the first query has already processed 2 TB of the table at the full speed of 500 MB per second. When a second, similar query kicks off on the same table, PostgreSQL will make it start reading at around the 2 TB mark, and the second query will continue from there. The advantage of synchronizing these scans is that one can benefit from the cache entries generated by the other. Reading from the disk once can, under good conditions, feed both (or even more) of the scans with data. Thus, performance is not 500 MB per second divided by two, minus additional random access per scan, but ideally 500 MB per second multiplied by the number of scans. Running scans in parallel can improve speed dramatically by drastically reducing the I/O bottleneck, slowly turning it into a CPU bottleneck, which is a lot easier to scale.
The important thing is that synchronized sequential scans are on by default; in general, it is a good idea to ensure that this setting really is set to on:
test=# SHOW synchronize_seqscans;
 synchronize_seqscans
----------------------
 on
(1 row)
With the default configuration, you can enjoy their full potential. In many cases, it makes more sense to execute such queries in parallel rather than in sequence, because many analytical jobs can then share the same I/O.