Improving performance

Performance is one of the key reasons for considering pgbouncer in the first place. To make sure that performance stays high, a handful of issues have to be taken seriously.

First of all, it is recommended to make sure that all nodes participating in your setup are fairly close to each other. This greatly helps to reduce network round-trip times and thus boosts performance. There is no point in reducing the overhead of calling fork() only to pay for this gain with network time. As in most scenarios, reducing network time and latency is a huge asset.

Basically, pgbouncer can be placed on a dedicated pgbouncer server, directly on the database node, or on the web server. In general, it is recommended to avoid putting database infrastructure on the web server. If you are running a larger setup, a dedicated server might be a good option.

One additional issue, which is often forgotten, relates to pooling itself: as we have stated already, the idea of pgbouncer is to speed up the process of getting a database connection. However, what happens if the pool is short on connections? If no spare database connections are idling around, new ones have to be forked in the backend, and that is exactly the overhead we set out to avoid. To address this problem it is recommended to set min_pool_size to a reasonable value. This is especially important if many connections are created at the same time (if a web server is restarted, for example). Always make sure that your pool is reasonably sized to sustain high performance (in terms of handing out connections).

Tip

The perfect value for min_pool_size will depend on the type of application you are running. However, we have had good experiences with values substantially higher than the default.
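To illustrate, here is a minimal sketch of the relevant pgbouncer.ini settings. The values shown are assumptions chosen for illustration, not recommendations:

[pgbouncer]
listen_port = 6432
pool_mode = session
; keep at least this many server connections open per pool,
; so that sudden connection storms do not pay the fork() price
min_pool_size = 10
default_pool_size = 20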

A simple benchmark

In this chapter we have already outlined that it is very beneficial to use pgbouncer if an application has to create many short-lived connections. To prove our point we have compiled an extreme example. The goal is to run a test doing as little work as possible; we merely want to measure how much time is burned opening connections. To do so we have set up a virtual machine with just one CPU.

The test itself will be performed using pgbench (a contrib module widely used to benchmark PostgreSQL).

We can easily create ourselves a nice and shiny test database:

pgbench -i p1
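
Note that pgbench -i merely populates an existing database; if p1 does not exist yet, it can be created beforehand:

createdb p1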

Then we can write ourselves a nice sample SQL command, which should be executed repeatedly, and store it in a file called select.sql (passed to pgbench with -f later on):

SELECT 1;

Now we can run an extreme test against our standard PostgreSQL installation:

hs@VM:test$ pgbench -t 1000 -c 20 -S p1 -C -f select.sql
starting vacuum...end.
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 20
number of threads: 1
number of transactions per client: 1000
number of transactions actually processed: 20000/20000
tps = 67.540663 (including connections establishing)
tps = 15423.090062 (excluding connections establishing)

We want to run 20 concurrent connections (-c 20), each executing 1000 transactions (-t 1000). -C indicates that after every single transaction the benchmark will close the open connection and create a new one. This is a typical case on a web server without pooling: each page might be a separate connection.

Now, keep in mind that this test has been designed to look ugly. We can observe that keeping the connection alive allows us to execute roughly 15,000 transactions per second on our single VM CPU. If we have to fork a connection for each transaction, we drop to just 67 transactions per second. As we have stated before, this kind of overhead is worth thinking about.
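
Before we can repeat the test through pgbouncer, the bouncer has to know how to reach our test database. Here is a minimal sketch of the relevant pgbouncer.ini entries; the host, port, and authentication setup are assumptions that will vary with your environment:

[databases]
; route the pool named p1 to the local PostgreSQL server
p1 = host=127.0.0.1 port=5432 dbname=p1

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
; trust skips password checks, but users must still appear in auth_file
auth_type = trust
auth_file = /etc/pgbouncer/userlist.txt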

Let us now repeat the test and connect to PostgreSQL through pgbouncer:

hs@VM:test$ pgbench -t 1000 -c 20 -S p1 -C -f select.sql -p 6432
starting vacuum...end.
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 20
number of threads: 1
number of transactions per client: 1000
number of transactions actually processed: 20000/20000
tps = 1013.264853 (including connections establishing)
tps = 2765.711593 (excluding connections establishing)

As you can see, our throughput has risen to 1013 transactions per second (including connection establishment). That is roughly 15 times more than before; indeed a nice gain.

However, we also have to note that for the case in which connections are kept open, the performance level has dropped: excluding connection establishment, we are down from roughly 15,000 to under 3,000 transactions per second. Remember, the bouncer, the benchmark tool, and PostgreSQL are all running on the same single CPU. This does have an impact here (context switches are not too cheap in a virtualized environment).

Keep in mind that this is an extreme example; if you repeat the test with longer transactions, the gap will logically become much smaller. Our example has been designed purely to demonstrate the point.
