Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

Chapter 10. Data Distribution

In this chapter, we will learn how clever data management can increase uptime even further. We will cover the following recipes in this chapter:

Identifying horizontal candidates
Setting up a foreign PostgreSQL server
Mapping a remote user
Creating a foreign table
Using a foreign table in a query
Optimizing foreign table access
Transforming foreign tables into local tables
Creating a scalable nextval replacement
Building a sharding API
Talking to the right shard
Moving a shard to another server

Introduction

Every business has the goal of being successful. The consequence of having a successful business when there's a database involved is increasingly high volume. This volume can be composed of query activity, data accumulation, or both. A PostgreSQL database that is not prepared for vast amounts of data or transaction load will slowly falter until the platform suffers.

Customers notice bad performance just as readily as outages. If our database is struggling to service queries, we have three options:

Spend time optimizing the platform to reduce database interaction
Buy a more capable database server
Store data on several PostgreSQL servers

Indeed, we should probably always implement step one in any case. Yet, there is a limit to candidates for optimization. If the platform is using an ORM, making query changes can be difficult. Frontend caching can prevent a vast amount of database accesses, but we need to consider cold caches, refreshes, and write volume. Writes must touch the database regardless of the cache state, so we need a solution independent of optimization.

We can also buy a newer, bigger, and better server. We can add CPUs, memory, and storage to a single expensive server until we saturate its available slots and ports. If we've maximized the most expandable server currently manufactured, we have a problem if the database volume continues to increase. What can we do?

A good platform architect will see this potential disaster before it strikes. We must make the assumption that our business and software will be successful beyond our wildest dreams, and act accordingly. If we were Facebook, Instagram, or Skype, we would recognize the necessity of using multiple database servers early, enabling horizontal growth. It just so happens that PostgreSQL has a rich interface for database federation that we can leverage.

That will be the focus of this chapter. A highly available PostgreSQL cluster isn't only online and responding now, it does so in the future as well. Whether we accomplish horizontal distribution through assigned regions, associated groups, or at random, we need the infrastructure in place to facilitate this type of access. We will use PostgreSQL features to split up our data and ensure that the platform can run for years to come for the millions of users that will follow.

Note

The features we will discuss in this chapter rely on the PostgreSQL foreign data wrapper, which wasn't introduced until PostgreSQL 9.3. We strongly recommend that you upgrade any old PostgreSQL clusters to 9.3 when possible if you foresee a future need for widely distributed data. You will not be able to implement many of the ideas discussed here until then.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for 10. Data Distribution

Create new playlist

Sign In

Sign Up

Chapter 10. Data Distribution

Introduction

Note

Table of Contents for
10. Data Distribution