Home Page Icon
Home Page
Table of Contents for
Table of Contents
Close
Table of Contents
by Hans-Jürgen Schönig, Zoltan Böszörmenyi
PostgreSQL Replication
PostgreSQL Replication
Table of Contents
PostgreSQL Replication
Credits
About the Authors
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers and more
Why Subscribe?
Free Access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. Understanding Replication Concepts
The CAP theory and physical limitations
Understanding the CAP theory
Why the speed of light matters
Long distance transmission
Why latency matters
Different types of replication
Synchronous versus asynchronous replication
Understanding replication and data loss
Considering the performance issues
Single-master versus multi-master replication
Logical versus physical replication
When to use physical replication
When to use logical replication
Using sharding and data distribution
Understanding the purpose of sharding
An example of designing a sharded system
An example of querying different fields
Pros and cons of sharding
Choosing between sharding and redundancy
Increasing and decreasing the size of a cluster
Combining sharding and replication
Various sharding solutions
PostgreSQL-based sharding
External frameworks/middleware
Summary
2. Understanding the PostgreSQL Transaction Log
How PostgreSQL writes data
The PostgreSQL disk layout
Looking into the data directory
PG_VERSION – PostgreSQL version number
base – the actual data directory
Growing data files
Performing I/O in chunks
Relation forks
global – the global data
Dealing with standalone data files
pg_clog – the commit log
pg_hba.conf – host-based network configuration
pg_ident.conf – ident authentication
pg_multixact – multi-transaction status data
pg_notify – LISTEN/NOTIFY data
pg_serial – information about committed serializable transactions
pg_snapshot – exported snapshots
pg_stat_tmp – temporary statistics data
pg_subtrans – subtransaction data
pg_tblspc – symbolic links to tablespaces
pg_twophase – information about prepared statements
pg_XLOG – the PostgreSQL transaction log (WAL)
postgresql.conf – the central PostgreSQL configuration file
Writing one row of data
A simple INSERT statement
Crashing during WAL-writing
Crashing after WAL-writing
Read consistency
The purpose of the shared buffer
Mixed reads and writes
The XLOG and replication
Understanding consistency and data loss
All the way to the disk
From memory to memory
From memory to the disk
One word about batteries
Beyond fsync()
PostgreSQL consistency levels
Tuning checkpoints and the XLOG
Understanding the checkpoints
Configuring checkpoints
About segments and timeouts
To write or not to write?
Scenario 1 – Storing stock-market data
Scenario 2 – Bulk loading
Scenario 3 – I/O spikes and throughput considerations
Conclusion
Tweaking WAL buffers
The internal structure of the XLOG
Understanding the XLOG records
Making the XLOG deterministic
Making the XLOG reliable
LSNs and shared buffer interaction
Debugging the XLOG and putting it all together
Summary
3. Understanding Point-In-Time-Recovery
Understanding the purpose of PITR
Moving to the bigger picture
Archiving the transaction log
Taking base backups
Using pg_basebackup
Modifying pg_hba.conf
Signaling the master server
pg_basebackup – basic features
pg_basebackup – self-sufficient backups
Making use of traditional methods to create base backups
Tablespace issues
Keeping an eye on network bandwidth
Replaying the transaction log
Performing a basic recovery
More sophisticated positioning in the XLOG
Cleaning up the XLOG on the way
Switching the XLOG files
Summary
4. Setting up Asynchronous Replication
Setting up streaming replication
Tweaking the config files on the master
Handling pg_basebackup and recovery.conf
Making the slave readable
The underlying protocol
Configuring a cascaded replication
Turning slaves to masters
Mixing streaming and file-based recovery
The master configuration
The slave configuration
Error scenarios
Network connection between the master and slave is dead
Rebooting the slave
Rebooting the master
Corrupted XLOG in the archive
Making the streaming-only replication more robust
Efficient cleanup and the end of recovery
Gaining control over the restart points
Tweaking the end of your recovery
Conflict management
Dealing with the timelines
Summary
5. Setting up Synchronous Replication
Setting up synchronous replication
Understanding the downside of synchronous replication
Understanding the application_name parameter
Making synchronous replication work
Checking replication
Understanding performance issues
Setting synchronous_commit to on
Setting synchronous_commit to remote_write
Setting synchronous_commit to off
Setting synchronous_commit to local
Changing durability settings on the fly
Understanding practical implications and performance
Redundancy and stopping replication
Summary
6. Monitoring Your Setup
Checking your archive
Checking the archive_command
Monitoring the transaction log archive
Checking pg_stat_replication
Relevant fields in pg_stat_replication
Checking for operating system processes
Dealing with monitoring tools
Installing check_postgres
Deciding on a monitoring strategy
Summary
7. Understanding Linux High Availability
Understanding the purpose of high availability
Measuring availability
History of high-availability software
OpenAIS and Corosync
Linux-HA (Heartbeat) and Pacemaker
Terminology and concepts
High availability is all about redundancy
PostgreSQL and high availability
High availability with quorum
High availability with STONITH
Summary
8. Working with pgbouncer
Understanding fundamental pgbouncer concepts
Installing pgbouncer
Configuring your first pgbouncer setup
Writing a simple config file and starting pgbouncer up
Dispatching requests
More basic settings
Authentication
Connecting to pgbouncer
Java issues
Pool modes
Cleanup issues
Improving performance
A simple benchmark
Maintaining pgbouncer
Configuring the admin interface
Using the management database
Extracting runtime information
Suspending and resuming operations
Summary
9. Working with pgpool
Installing pgpool
Installing pgpool-regclass and insert_lock
Understanding pgpool features
Understanding the pgpool architecture
Setting up replication and load balancing
Password authentication
Firing up pgpool and testing the setup
Attaching hosts
Checking replication
Running pgpool with streaming replication
Optimizing pgpool configuration for master/slave mode
Dealing with failovers and high availability
Using PostgreSQL streaming and Linux HA
pgpool mechanisms for high availability and failover
Summary
10. Configuring Slony
Installing Slony
Understanding how Slony works
Dealing with logical replication
The slon daemon
Replicating your first database
Deploying DDLs
Adding tables to replication and managing problems
Performing failovers
Planned failovers
Unplanned failovers
Summary
11. Using Skytools
Installing skytools
Dissecting skytools
Managing pgq-queues
Running pgq
Creating queues and adding data
Adding consumers
Configuring the ticker
Consuming messages
Dropping queues
Using pgq for large projects
Using londiste to replicate data
Replicating our first table
One word about walmgr
Summary
12. Working with Postgres-XC
Understanding the Postgres-XC architecture
Data nodes
GTM – Global Transaction Manager
Coordinators
GTM Proxy
Installing Postgres-XC
Configuring a simple cluster
Creating the GTM
Optimizing for performance
Dispatching the tables
Optimizing the joins
Optimizing for warehousing
Creating a GTM Proxy
Creating the tables and issuing the queries
Adding nodes
Handling failovers and dropping nodes
Handling node failovers
Replacing the nodes
Running a GTM standby
Summary
13. Scaling with PL/Proxy
Understanding the basic concepts
Dealing with the bigger picture
Partitioning the data
Setting up PL/Proxy
A basic example
Partitioned reads and writes
Extending and handling clusters in a clever way
Adding and moving partitions
Increasing the availability
Managing the foreign keys
Upgrading the PL/Proxy nodes
Summary
Index
Search in book...
Toggle Font Controls
Playlists
Add To
Create new playlist
Name your new playlist
Playlist description (optional)
Cancel
Create playlist
Sign In
Email address
Password
Forgot Password?
Create account
Login
or
Continue with Facebook
Continue with Google
Sign Up
Full Name
Email address
Confirm Email Address
Password
Login
Create account
or
Continue with Facebook
Continue with Google
Prev
Previous Chapter
Cover
Next
Next Chapter
PostgreSQL Replication
Table of Contents
PostgreSQL Replication
Credits
About the Authors
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers and more
Why Subscribe?
Free Access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. Understanding Replication Concepts
The CAP theory and physical limitations
Understanding the CAP theory
Why the speed of light matters
Long distance transmission
Why latency matters
Different types of replication
Synchronous versus asynchronous replication
Understanding replication and data loss
Considering the performance issues
Single-master versus multi-master replication
Logical versus physical replication
When to use physical replication
When to use logical replication
Using sharding and data distribution
Understanding the purpose of sharding
An example of designing a sharded system
An example of querying different fields
Pros and cons of sharding
Choosing between sharding and redundancy
Increasing and decreasing the size of a cluster
Combining sharding and replication
Various sharding solutions
PostgreSQL-based sharding
External frameworks/middleware
Summary
2. Understanding the PostgreSQL Transaction Log
How PostgreSQL writes data
The PostgreSQL disk layout
Looking into the data directory
PG_VERSION – PostgreSQL version number
base – the actual data directory
Growing data files
Performing I/O in chunks
Relation forks
global – the global data
Dealing with standalone data files
pg_clog – the commit log
pg_hba.conf – host-based network configuration
pg_ident.conf – ident authentication
pg_multixact – multi-transaction status data
pg_notify – LISTEN/NOTIFY data
pg_serial – information about committed serializable transactions
pg_snapshot – exported snapshots
pg_stat_tmp – temporary statistics data
pg_subtrans – subtransaction data
pg_tblspc – symbolic links to tablespaces
pg_twophase – information about prepared statements
pg_XLOG – the PostgreSQL transaction log (WAL)
postgresql.conf – the central PostgreSQL configuration file
Writing one row of data
A simple INSERT statement
Crashing during WAL-writing
Crashing after WAL-writing
Read consistency
The purpose of the shared buffer
Mixed reads and writes
The XLOG and replication
Understanding consistency and data loss
All the way to the disk
From memory to memory
From memory to the disk
One word about batteries
Beyond fsync()
PostgreSQL consistency levels
Tuning checkpoints and the XLOG
Understanding the checkpoints
Configuring checkpoints
About segments and timeouts
To write or not to write?
Scenario 1 – Storing stock-market data
Scenario 2 – Bulk loading
Scenario 3 – I/O spikes and throughput considerations
Conclusion
Tweaking WAL buffers
The internal structure of the XLOG
Understanding the XLOG records
Making the XLOG deterministic
Making the XLOG reliable
LSNs and shared buffer interaction
Debugging the XLOG and putting it all together
Summary
3. Understanding Point-In-Time-Recovery
Understanding the purpose of PITR
Moving to the bigger picture
Archiving the transaction log
Taking base backups
Using pg_basebackup
Modifying pg_hba.conf
Signaling the master server
pg_basebackup – basic features
pg_basebackup – self-sufficient backups
Making use of traditional methods to create base backups
Tablespace issues
Keeping an eye on network bandwidth
Replaying the transaction log
Performing a basic recovery
More sophisticated positioning in the XLOG
Cleaning up the XLOG on the way
Switching the XLOG files
Summary
4. Setting up Asynchronous Replication
Setting up streaming replication
Tweaking the config files on the master
Handling pg_basebackup and recovery.conf
Making the slave readable
The underlying protocol
Configuring a cascaded replication
Turning slaves to masters
Mixing streaming and file-based recovery
The master configuration
The slave configuration
Error scenarios
Network connection between the master and slave is dead
Rebooting the slave
Rebooting the master
Corrupted XLOG in the archive
Making the streaming-only replication more robust
Efficient cleanup and the end of recovery
Gaining control over the restart points
Tweaking the end of your recovery
Conflict management
Dealing with the timelines
Summary
5. Setting up Synchronous Replication
Setting up synchronous replication
Understanding the downside of synchronous replication
Understanding the application_name parameter
Making synchronous replication work
Checking replication
Understanding performance issues
Setting synchronous_commit to on
Setting synchronous_commit to remote_write
Setting synchronous_commit to off
Setting synchronous_commit to local
Changing durability settings on the fly
Understanding practical implications and performance
Redundancy and stopping replication
Summary
6. Monitoring Your Setup
Checking your archive
Checking the archive_command
Monitoring the transaction log archive
Checking pg_stat_replication
Relevant fields in pg_stat_replication
Checking for operating system processes
Dealing with monitoring tools
Installing check_postgres
Deciding on a monitoring strategy
Summary
7. Understanding Linux High Availability
Understanding the purpose of high availability
Measuring availability
History of high-availability software
OpenAIS and Corosync
Linux-HA (Heartbeat) and Pacemaker
Terminology and concepts
High availability is all about redundancy
PostgreSQL and high availability
High availability with quorum
High availability with STONITH
Summary
8. Working with pgbouncer
Understanding fundamental pgbouncer concepts
Installing pgbouncer
Configuring your first pgbouncer setup
Writing a simple config file and starting pgbouncer up
Dispatching requests
More basic settings
Authentication
Connecting to pgbouncer
Java issues
Pool modes
Cleanup issues
Improving performance
A simple benchmark
Maintaining pgbouncer
Configuring the admin interface
Using the management database
Extracting runtime information
Suspending and resuming operations
Summary
9. Working with pgpool
Installing pgpool
Installing pgpool-regclass and insert_lock
Understanding pgpool features
Understanding the pgpool architecture
Setting up replication and load balancing
Password authentication
Firing up pgpool and testing the setup
Attaching hosts
Checking replication
Running pgpool with streaming replication
Optimizing pgpool configuration for master/slave mode
Dealing with failovers and high availability
Using PostgreSQL streaming and Linux HA
pgpool mechanisms for high availability and failover
Summary
10. Configuring Slony
Installing Slony
Understanding how Slony works
Dealing with logical replication
The slon daemon
Replicating your first database
Deploying DDLs
Adding tables to replication and managing problems
Performing failovers
Planned failovers
Unplanned failovers
Summary
11. Using Skytools
Installing skytools
Dissecting skytools
Managing pgq-queues
Running pgq
Creating queues and adding data
Adding consumers
Configuring the ticker
Consuming messages
Dropping queues
Using pgq for large projects
Using londiste to replicate data
Replicating our first table
One word about walmgr
Summary
12. Working with Postgres-XC
Understanding the Postgres-XC architecture
Data nodes
GTM – Global Transaction Manager
Coordinators
GTM Proxy
Installing Postgres-XC
Configuring a simple cluster
Creating the GTM
Optimizing for performance
Dispatching the tables
Optimizing the joins
Optimizing for warehousing
Creating a GTM Proxy
Creating the tables and issuing the queries
Adding nodes
Handling failovers and dropping nodes
Handling node failovers
Replacing the nodes
Running a GTM standby
Summary
13. Scaling with PL/Proxy
Understanding the basic concepts
Dealing with the bigger picture
Partitioning the data
Setting up PL/Proxy
A basic example
Partitioned reads and writes
Extending and handling clusters in a clever way
Adding and moving partitions
Increasing the availability
Managing the foreign keys
Upgrading the PL/Proxy nodes
Summary
Index
Add Highlight
No Comment
..................Content has been hidden....................
You can't read the all page of ebook, please click
here
login for view all page.
Day Mode
Cloud Mode
Night Mode
Reset