If you read the end of the previous recipe, you might assume we don't recommend that you use foreign tables at all. However, we would like to reassure you that foreign tables are not all doom and gloom. To prove it, we're going to use a disarmingly simple technique to optimize them: views.
It's true that PostgreSQL foreign data wrappers cannot combine queries for multiple tables on the same server. Provided we have access to the remote server, we can rectify this situation by creating a view to encapsulate the core of the query we want to perform. We can do this because PostgreSQL only knows the name of remote objects, not their composition. We can take advantage of this and use views to force remote joins.
In this recipe, we will describe how to use a remote view in place of a foreign table.
As we will be using the pgbench_accounts
foreign table in this recipe, please follow all the previous recipes before proceeding.
For this recipe, we will continue to use the pg-primary
and pg-report
database servers. All queries should be performed by the postgres
user in the pgbench
database. Follow these steps to enforce better remote JOIN
performance:
pg-primary
:CREATE VIEW v_pgbench_accounts_self_join AS SELECT a1.aid, a2.bid, a2.abalance FROM pgbench_accounts a1 JOIN pgbench_accounts a2 USING (aid);
bench_user
on the new view on pg-primary
:GRANT SELECT ON v_pgbench_accounts_self_join TO bench_user;
pg-report
:CREATE FOREIGN TABLE pgbench_accounts_self ( aid INTEGER NOT NULL, bid INTEGER, abalance INTEGER ) SERVER primary_db OPTIONS (table_name 'v_pgbench_accounts_self_join'),
bench_user
on the foreign table on pg-report
:GRANT SELECT ON pgbench_accounts_self TO bench_user;
EXPLAIN VERBOSE SELECT aid, bid, abalance FROM pgbench_accounts_self WHERE aid BETWEEN 500000 AND 500005;
For the first step, we create a view named v_pgbench_accounts_self_join
on pg-primary
that uses the same columns and the same self-join we attempted in the previous recipe. Then, we grant access to bench_user
so that the view is usable on the pg-report
server.
Next, we create a foreign table just as we did in the Creating a foreign table recipe, but this time, we name the local foreign table pgbench_accounts_self
even though the view has a much different name. This should illustrate that names do not have to necessarily match and that PostgreSQL doesn't care whether the remote object is a table or a view. Once again, we grant access to the foreign table to the mapped bench_user
user and consider our work complete.
Before we consider this operation a success, let's examine a verbose EXPLAIN
that uses the foreign table. Here's the output from our test system:
This is much better! Now, we can see that the WHERE
clause is being sent to restrict output from the v_pgbench_accounts_self_join
view. As this view is evaluated on the pg-primary
server, the join happens there as well. We have successfully combined two foreign tables into one.
As powerful as this technique might be, its utility is limited by the fact that we're using views to circumvent normal table access methods. This means our foreign table now has the same limitations as views. Unless the view is very simple—which would defeat the purpose of using a view like this—we cannot perform any of the following actions:
However, there is one thing we can do with a foreign table view that we can't do with a local view. As foreign tables can be analyzed to gather statistics, we can analyze foreign table views as well. This produces local statistics that may include correlations that PostgreSQL would normally not find.
In the current state of the PostgreSQL foreign data architecture, this might not mean much. Yet, as techniques and the underlying code improve, what is now merely an interesting fluke might become an advanced optimization approach. Only time will tell.
3.133.156.251