Optimizing foreign table access

If you read the end of the previous recipe, you might assume we don't recommend that you use foreign tables at all. However, we would like to reassure you that foreign tables are not all doom and gloom. To prove it, we're going to use a disarmingly simple technique to optimize them: views.

It's true that PostgreSQL foreign data wrappers cannot combine queries for multiple tables on the same server. Provided we have access to the remote server, we can rectify this situation by creating a view to encapsulate the core of the query we want to perform. We can do this because PostgreSQL only knows the name of remote objects, not their composition. We can take advantage of this and use views to force remote joins.

In this recipe, we will describe how to use a remote view in place of a foreign table.

Getting ready

As we will be using the pgbench_accounts foreign table in this recipe, please follow all the previous recipes before proceeding.

How to do it...

For this recipe, we will continue to use the pg-primary and pg-report database servers. All queries should be performed by the postgres user in the pgbench database. Follow these steps to enforce better remote JOIN performance:

  1. Create a view for the basis of the join on pg-primary:
    CREATE VIEW v_pgbench_accounts_self_join AS
    SELECT a1.aid, a2.bid, a2.abalance
      FROM pgbench_accounts a1
      JOIN pgbench_accounts a2 USING (aid);
  2. Grant access to bench_user on the new view on pg-primary:
    GRANT SELECT ON v_pgbench_accounts_self_join
       TO bench_user;
  3. Create a foreign table that references the view on pg-report:
    CREATE FOREIGN TABLE pgbench_accounts_self
    (
        aid       INTEGER NOT NULL,
        bid       INTEGER,
        abalance  INTEGER
    )
    SERVER primary_db
    OPTIONS (table_name 'v_pgbench_accounts_self_join'),
  4. Grant access to bench_user on the foreign table on pg-report:
    GRANT SELECT ON pgbench_accounts_self
       TO bench_user;
  5. Examine the new query plan on pg-report with this SQL statement:
    EXPLAIN VERBOSE
    SELECT aid, bid, abalance
      FROM pgbench_accounts_self
     WHERE aid BETWEEN 500000 AND 500005;

How it works...

For the first step, we create a view named v_pgbench_accounts_self_join on pg-primary that uses the same columns and the same self-join we attempted in the previous recipe. Then, we grant access to bench_user so that the view is usable on the pg-report server.

Next, we create a foreign table just as we did in the Creating a foreign table recipe, but this time, we name the local foreign table pgbench_accounts_self even though the view has a much different name. This should illustrate that names do not have to necessarily match and that PostgreSQL doesn't care whether the remote object is a table or a view. Once again, we grant access to the foreign table to the mapped bench_user user and consider our work complete.

Before we consider this operation a success, let's examine a verbose EXPLAIN that uses the foreign table. Here's the output from our test system:

How it works...

This is much better! Now, we can see that the WHERE clause is being sent to restrict output from the v_pgbench_accounts_self_join view. As this view is evaluated on the pg-primary server, the join happens there as well. We have successfully combined two foreign tables into one.

There's more...

As powerful as this technique might be, its utility is limited by the fact that we're using views to circumvent normal table access methods. This means our foreign table now has the same limitations as views. Unless the view is very simple—which would defeat the purpose of using a view like this—we cannot perform any of the following actions:

  • We cannot insert into a foreign table view
  • We cannot update records in a foreign table view
  • We cannot delete from a foreign table view

However, there is one thing we can do with a foreign table view that we can't do with a local view. As foreign tables can be analyzed to gather statistics, we can analyze foreign table views as well. This produces local statistics that may include correlations that PostgreSQL would normally not find.

In the current state of the PostgreSQL foreign data architecture, this might not mean much. Yet, as techniques and the underlying code improve, what is now merely an interesting fluke might become an advanced optimization approach. Only time will tell.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.133.156.251