© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2022
R. Reid, Practical CockroachDB, https://doi.org/10.1007/978-1-4842-8224-3_9

9. Production

Rob Reid, Liss, Hampshire, UK
In the last chapter of this book, we’ll cover some practices and tools that will ensure our cluster is ready for whatever a production environment can throw at it:
  • Best practices Learning new technologies can be overwhelming, and knowing how to do things correctly can seem a daunting prospect. This section will explore some best practices to consider when running a production cluster.

  • Upgrading clusters We'll need to keep our nodes up to date to take advantage of the latest CockroachDB features and fixes.

  • Backing up and restoring data While CockroachDB’s philosophy of being “multiactive” negates the need for DR (Disaster Recovery) planning, it’s still good practice to consider taking database snapshots to return to in the event of accidental database corruption (e.g., DELETE FROM customer...).

  • Moving a cluster Having had the privilege of working at the cosmetics retailer Lush, I appreciate that no cloud provider lasts forever.1 This section will show you how to lift and shift a CockroachDB cluster from one place to another in the event of a major infrastructure overhaul.

Best Practices

The best practices documentation2 on the Cockroach Labs website is comprehensive and a great reference for anyone wishing to get the best out of their CockroachDB clusters. We’ll put these best practices to work in this section and see what happens if we don’t.

Unless stated otherwise, many of the examples that follow in this chapter use the three-node, load-balanced cluster I created in Chapter 8.

SELECT Performance

In this section, we’ll look at some performance best practices concerning SELECT statements and what configuration values you can tweak to enjoy the best possible performance.

If you find that SELECT queries are running slowly and you’ve ruled out indexes as the culprit, your database might lack cache memory. By default, each node will have a cache size of 128MiB (~134MB), which stores information required by queries. This size works well as a default for local database development, but you may find that increasing it will make for faster SELECT performance. Cockroach Labs recommends setting the cache size to at least 25% of the machine’s available memory for production deployments.

To update this setting, pass a value to the --cache argument when you start your node:
// Set the cache size to 30% of the node's total memory:
$ cockroach start --cache=.30 ...
$ cockroach start --cache=30% ...
// Set the cache size to a specific value:
$ cockroach start --cache=200MB ...
$ cockroach start --cache=200MiB ...

Another setting to consider changing is the --max-sql-memory argument, which will use 25% of a node’s total memory by default. Increasing this value will allow CockroachDB to allocate more memory for aggregation operations like GROUP BY and ORDER BY and increase the number of possible concurrent connections to the database, easing connection pool contention.

To update this setting, pass a value to the --max-sql-memory argument when you start your node:
// Set the max SQL memory to 30% of the node's total memory:
$ cockroach start --max-sql-memory=.30 ...
$ cockroach start --max-sql-memory=30% ...
// Set the max SQL memory to a specific value:
$ cockroach start --max-sql-memory=200MB ...
$ cockroach start --max-sql-memory=200MiB ...
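Putting the two together, a production start command might reserve a share of memory for both the cache and SQL workloads (the values here are illustrative only):
// Reserve 25% of the node's memory for the cache and 25% for SQL memory:
$ cockroach start --cache=.25 --max-sql-memory=.25 ...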

For a full list of memory recommendations, visit the flags section3 of the start command documentation and the production checklist.4

INSERT Performance

In this section, we’ll look at some performance best practices and why you’ll want to follow them. We’ll start by performing some INSERT statements to see why it’s important to use multirow DML statements over single-row DML statements.

First, we’ll create a database and a table:
CREATE DATABASE best_practices;
USE best_practices;
CREATE TABLE sensor_reading (
    sensor_id UUID PRIMARY KEY,
    reading DECIMAL NOT NULL,
    timestamp TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

Next, we’ll write a script to simulate the bulk insertion of 1000 rows and compare the performance of doing it via single and multirow DML statements.

I’ll use Go for these tests, so my imports and main function look as follows:
package main
import (
      "context"
      "fmt"
      "log"
      "strconv"
      "strings"
      "time"
      "github.com/google/uuid"
      "github.com/jackc/pgx/v4"
      "github.com/jackc/pgx/v4/pgxpool"
      "github.com/lib/pq"
      "golang.org/x/sync/errgroup"
)
func main() {
      connStr := "postgres://root@localhost:26257/best_practices?sslmode=disable"
      db, err := pgxpool.Connect(context.Background(), connStr)
      if err != nil {
            log.Fatalf("error connecting to db: %v", err)
      }
      defer db.Close()
      ...
}
Next, we’ll implement the single-row DML INSERT code. This code will make 1000 separate requests (involving 1000 separate network hops) to insert a row into the sensor_reading table and time how long the complete operation takes:
func insertSingleRowDML(db *pgxpool.Pool, rows int) (time.Duration, error) {
      const stmt = `INSERT INTO sensor_reading (sensor_id, reading) VALUES ($1, $2)`
      start := time.Now()
      for i := 0; i < rows; i++ {
            if _, err := db.Exec(context.Background(), stmt, uuid.New(), 1); err != nil {
                  return 0, fmt.Errorf("inserting row: %w", err)
            }
      }
      return time.Since(start), nil
}
Next, we’ll perform the single-row INSERT statements in batches. This will send multiple INSERT statements to CockroachDB at the same time and execute them in a single transaction. It’s important to consider breaking large inserts into smaller ones when bulk-inserting data because the more data you’re inserting, the longer you’ll lock the table you’re inserting into. The following function allows me to execute batch inserts in small, concurrent chunks:
func insertSingleRowDMLBatched(db *pgxpool.Pool, rows, chunks int) (time.Duration, error) {
      const stmt = `INSERT INTO sensor_reading (sensor_id, reading) VALUES ($1, $2)`
      start := time.Now()
      eg, _ := errgroup.WithContext(context.Background())
      for c := 0; c < chunks; c++ {
            eg.Go(func() error {
                  batch := &pgx.Batch{}
                  for i := 0; i < rows/chunks; i++ {
                        batch.Queue(stmt, uuid.New(), i)
                  }
                  res := db.SendBatch(context.Background(), batch)
                  if _, err := res.Exec(); err != nil {
                        return fmt.Errorf("running batch query insert: %w", err)
                  }
                  // Close the batch results to flush any remaining queued
                  // statements and surface their errors.
                  if err := res.Close(); err != nil {
                        return fmt.Errorf("closing batch results: %w", err)
                  }
                  return nil
            })
      }
      if err := eg.Wait(); err != nil {
            return 0, fmt.Errorf("in batch worker: %w", err)
      }
      return time.Since(start), nil
}

Note that this implementation is still executed as single-row DML in the database. The difference between this implementation and the first, however, is that there are now far fewer network hops, making for faster end-to-end queries.

If we really want to see performance gains from our queries, we’ll need to execute multirow DML statements . The following does just that, with the help of two additional functions. The first helper function is called counter, and its job is to simply return an incremented number every time it’s called:
func counter(start int) func() int {
      return func() int {
            defer func() { start++ }()
            return start
      }
}
The second helper function is called argPlaceholders, and its job is to manually construct the syntax used in a multirow DML statement:
func argPlaceholders(rows, columns, start int) string {
      builder := strings.Builder{}
      counter := counter(start)
      for r := 0; r < rows; r++ {
            builder.WriteString("(")
            for c := 0; c < columns; c++ {
                  builder.WriteString("$")
                  builder.WriteString(strconv.Itoa(counter()))
                  if c < columns-1 {
                        builder.WriteString(",")
                  }
            }
            builder.WriteString(")")
            if r < rows-1 {
                  builder.WriteString(",")
            }
      }
      return builder.String()
}
Here are some examples to help you understand how argPlaceholders works:
fmt.Println(argPlaceholders(1, 1, 1)) // -> ($1)
fmt.Println(argPlaceholders(1, 2, 1)) // -> ($1,$2)
fmt.Println(argPlaceholders(2, 1, 1)) // -> ($1),($2)
fmt.Println(argPlaceholders(3, 2, 1)) // -> ($1,$2),($3,$4),($5,$6)
Putting everything together, we get the multirow DML insert function:
func insertMultiRowDML(db *pgxpool.Pool, rows, chunks int) (time.Duration, error) {
      const stmtFmt = `INSERT INTO sensor_reading (sensor_id, reading) VALUES %s`
      start := time.Now()
      eg, _ := errgroup.WithContext(context.Background())
      for c := 0; c < chunks; c++ {
            eg.Go(func() error {
                  argPlaceholders := argPlaceholders(rows/chunks, 2, 1)
                  stmt := fmt.Sprintf(stmtFmt, argPlaceholders)
                  var args []interface{}
                  for i := 0; i < rows/chunks; i++ {
                        args = append(args, uuid.New(), i)
                  }
                  if _, err := db.Exec(context.Background(), stmt, args...); err != nil {
                        return fmt.Errorf("inserting rows: %w", err)
                  }
                  return nil
            })
      }
      if err := eg.Wait(); err != nil {
            return 0, fmt.Errorf("in batch worker: %w", err)
      }
      return time.Since(start), nil
}
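To produce the timings that follow, the benchmark functions need to be wired into main. Here’s a minimal, hypothetical sketch of that wiring (the runAveraged helper and its run count are my own illustration, not part of the chapter’s code):
// runAveraged calls fn the given number of times and prints the mean duration.
func runAveraged(name string, runs int, fn func() (time.Duration, error)) {
      var total time.Duration
      for i := 0; i < runs; i++ {
            d, err := fn()
            if err != nil {
                  log.Fatalf("%s: %v", name, err)
            }
            total += d
      }
      fmt.Printf("%s: %v\n", name, total/time.Duration(runs))
}
Inside main, each benchmark is then a one-liner, for example:
runAveraged("insertSingleRowDML", 5, func() (time.Duration, error) {
      return insertSingleRowDML(db, 1000)
})
runAveraged("insertMultiRowDML (chunks = 2)", 5, func() (time.Duration, error) {
      return insertMultiRowDML(db, 1000, 2)
})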
So how fast are our queries? I’ll run each function five times and take an average:

Test name                                            | Duration
-----------------------------------------------------+----------
insertSingleRowDML (rows = 1000)                     | 6.82s
insertSingleRowDMLBatched (rows = 1000, chunks = 1)  | 4.89s
insertSingleRowDMLBatched (rows = 1000, chunks = 2)  | 2.77s
insertSingleRowDMLBatched (rows = 1000, chunks = 4)  | 1.87s
insertMultiRowDML (rows = 1000, chunks = 1)          | 49.35ms
insertMultiRowDML (rows = 1000, chunks = 2)          | 35.61ms

It’s clear that by executing multirow DML, our execution times are a fraction of those of the single-row DML equivalents. There’s a good reason multirow DML statements for bulk operations are considered best practice.

UPDATE Performance

If you need to bulk update a table’s data, it’s important to consider the following:
  • Batching As with INSERT statements, large UPDATE statements will result in table locking and poor performance if not executed carefully. Run queries using multirow DML where possible.

  • Filtering Rather than attempting to update data on the fly, consider using a multipass technique, whereby the primary key columns for the rows needing updates are first selected before being used in a subsequent UPDATE statement. This way, CockroachDB is not being asked to run a slow, locking query but instead a nonlocking read query and then a fast, indexed locking query.

Let’s compare the performance of an UPDATE statement that updates data on the fly vs. an UPDATE statement that uses the multipass approach. As with the INSERT examples, I’ll provide code for both scenarios before running them.

Before we begin, we’ll insert some random data into the table again, this time shortcutting the Go code and going straight to CockroachDB with the help of its generate_series function. This statement will insert 100,000 random readings spread over roughly the last 32 years (reaching back to mid-1990) and will be the basis for our UPDATE statements:
INSERT INTO sensor_reading (sensor_id, reading, timestamp)
SELECT
    gen_random_uuid(),
    CAST(random() AS DECIMAL),
    '2022-02-18' - CAST(s * 10000 AS INTERVAL)
FROM generate_series(1, 100000) AS s;

Let’s assume our sensor readings in the 1990s were high by 0.001 (no one said this wouldn’t get hideously contrived) and need fixing. We’ll fix the data using UPDATE statements.

First, we’ll create the update-on-the-fly solution:
func updateOnTheFly(db *pgxpool.Pool) (time.Duration, error) {
      const stmt = `UPDATE sensor_reading
                   SET reading = reading - 0.001
                   WHERE date_trunc('decade', timestamp) = '1990-01-01'`
      start := time.Now()
      if _, err := db.Exec(context.Background(), stmt); err != nil {
            return 0, fmt.Errorf("updating rows: %w", err)
      }
      return time.Since(start), nil
}

Now, we’ll create a multipass solution. There’s more going on in this code than there was in the on-the-fly example, simply because we’re performing more operations to achieve a faster query that results in less index contention. For this example, I present three functions.

First, a function to select IDs out of the table (note that you can select whatever column(s) will allow you to most efficiently fetch your data). This function
  • Creates a SQL statement and injects an additional selection criterion if we’ve already run an iteration and collected an offset ID from which to begin the next batch of IDs to update5

  • Selects the IDs of any rows matching the selection criteria using a time-travel6 query (a query that tolerates slightly stale data in return for reduced transaction contention and increased performance)

  • Creates an array of IDs to return

We’ll name the function updateMultiPassSelect:
func updateMultiPassSelect(db *pgxpool.Pool, lastID string, selectLimit int) ([]string, error) {
      const selectStmtFmt = `SELECT sensor_id FROM sensor_reading
                            AS OF SYSTEM TIME '-5s'
                            WHERE date_trunc('decade', timestamp) = '1990-01-01'
                            %s
                            ORDER BY sensor_id
                            LIMIT $1`
      args := []interface{}{}
      selectStmt := selectStmtFmt
      if lastID != "" {
            selectStmt = fmt.Sprintf(selectStmtFmt, "AND sensor_id > $2")
            args = append(args, selectLimit, lastID)
      } else {
            selectStmt = fmt.Sprintf(selectStmtFmt, "")
            args = append(args, selectLimit)
      }
      // Fetch ids in a batch.
      rows, err := db.Query(context.Background(), selectStmt, args...)
      if err != nil {
            return nil, fmt.Errorf("fetching rows: %w", err)
      }
      defer rows.Close()
      // Read the ids into a collection.
      ids := []string{}
      var id string
      for rows.Next() {
            if err = rows.Scan(&id); err != nil {
                  return nil, fmt.Errorf("scanning row: %w", err)
            }
            ids = append(ids, id)
      }
      // Surface any error encountered during iteration.
      return ids, rows.Err()
}
The next function will update any rows matching the IDs we fetched from the previous query. This function
  • Creates a subset of the IDs passed

  • Runs a query to update rows based on that ID subset

  • Updates the ID subset to point to a different set of IDs

  • Once all IDs have been processed, returns

We’ll name the function updateMultiPassUpdate:
func updateMultiPassUpdate(db *pgxpool.Pool, ids []string, limit int) error {
      const updateStmt = `UPDATE sensor_reading
                         SET reading = reading - 0.001
                         WHERE sensor_id = ANY $1`
      updateIDs := ids
      for {
            idCount := min(len(updateIDs), limit)
            if idCount == 0 {
                  return nil
            }
            if _, err := db.Exec(context.Background(), updateStmt, pq.Array(updateIDs[:idCount])); err != nil {
                  return fmt.Errorf("updating rows: %w", err)
            }
            if idCount < limit {
                  return nil
            }
            updateIDs = updateIDs[idCount:]
      }
}
func min(x, y int) int {
      if x < y {
            return x
      }
      return y
}
And finally, the orchestrating function. This function
  • Fetches a batch of IDs that have not been processed via the select function and passes them to the update function.

  • If at any point we’ve run out of IDs to process, the function returns.

We’ll name this function updateMultiPass:
func updateMultiPass(db *pgxpool.Pool, selectLimit, updateLimit int) (d time.Duration, err error) {
      start := time.Now()
      var lastID string
      for {
            ids, err := updateMultiPassSelect(db, lastID, selectLimit)
            if err != nil {
                  return 0, fmt.Errorf("fetching ids to update: %w", err)
            }
            if len(ids) == 0 {
                  break
            }
            if err = updateMultiPassUpdate(db, ids, updateLimit); err != nil {
                  return 0, fmt.Errorf("updating items: %w", err)
            }
            if len(ids) < selectLimit {
                  break
            }
            lastID = ids[len(ids)-1]
      }
      return time.Since(start), nil
}
Now for the results. The difference between the two methods isn’t as dramatic as in the INSERT example, but on top of the reduced table locking, the additional performance bump is welcome:

Test name                                                    | Duration
-------------------------------------------------------------+----------
updateOnTheFly                                               | 3.69s
updateMultiPass (selectLimit = 10,000, updateLimit = 1,000)  | 2.04s

Check out the Cockroach Labs site for a great example of a multipass bulk-update operation using Python.7

Cluster Maintenance

The cluster you originally deploy to production on day one is unlikely to resemble the cluster you have running after year one (and especially after year five). If you’re planning your CockroachDB cluster years ahead, you
  • Are potentially paying for resources you’re not making full use of

  • Are not taking full advantage of CockroachDB’s scaling capabilities

In this section, we’ll create a small cluster of nodes and perform the following cluster-wide operations :
  • Scale the cluster by adding additional nodes.

  • Increment the version of CockroachDB.

  • Move the cluster onto new nodes.

We’ll start with scaling the cluster. If your CockroachDB deployment is running via Cockroach Cloud, this scenario is handled for you; all cluster scaling is automatic. If your cluster is hosted in Kubernetes, it’s as simple as follows:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cockroachdb
spec:
  serviceName: "cockroachdb"
  replicas: 3 # => 5
  ...
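If you’d rather not edit the manifest, the same change can be made imperatively with kubectl (assuming the StatefulSet is named cockroachdb, as above):
$ kubectl scale statefulset cockroachdb --replicas=5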
For the examples in this chapter, we’ll spin up a cluster manually, so we have full control over each of the nodes. Let’s start three nodes in three separate command-line terminal sessions and then initialize the cluster from a fourth. Note that I’m using different ports and stores in this example because the cluster is running on one machine. In reality, the nodes would run on separate machines, so the only difference between the node start commands would be the join addresses:
$ cockroach start
  --insecure
  --store=node1
  --listen-addr=localhost:26257
  --http-addr=localhost:8080
  --join=localhost:26257,localhost:26258,localhost:26259
$ cockroach start
  --insecure
  --store=node2
  --listen-addr=localhost:26258
  --http-addr=localhost:8081
  --join=localhost:26257,localhost:26258,localhost:26259
$ cockroach start
  --insecure
  --store=node3
  --listen-addr=localhost:26259
  --http-addr=localhost:8082
  --join=localhost:26257,localhost:26258,localhost:26259
$ cockroach init --insecure --host=localhost:26257
Time to scale it. CockroachDB was designed to run anywhere, with scalability and survivability being its raison d'être. As such, scaling a CockroachDB cluster is almost comically simple:
$ cockroach start
  --insecure
  --store=node4
  --listen-addr=localhost:26260
  --http-addr=localhost:8083
  --join=localhost:26257,localhost:26258,localhost:26259

In the preceding command, we start a new node in the exact same way we started the three initial (or “seed”) nodes. Every node in the cluster will use the gossip protocol8 to communicate with one another to organize scaling. The three seed nodes are all it takes to create the basis for this gossip network, and they don’t need to be aware of additional nodes. This allows you to continue to scale your cluster without making any changes to the configuration of the seed nodes.

Running the command starts a fourth node, which immediately joins the cluster, expanding its capacity.

Before moving on to the next example, stop the nodes and remove their data directories (node1 through node4).

Scaling a cluster into a different region requires a little more configuration but is about as straightforward as scaling a single-region cluster. First, we’ll create a cluster in eu-central-1 (Frankfurt) before scaling into eu-west-3 (Paris):
$ cockroach start
  --insecure
  --store=node1
  --listen-addr=localhost:26257
  --http-addr=localhost:8080
  --locality=region=eu-central-1,zone=eu-central-1a
  --join='localhost:26257, localhost:26258, localhost:26259'
$ cockroach start
  --insecure
  --store=node2
  --listen-addr=localhost:26258
  --http-addr=localhost:8081
  --locality=region=eu-central-1,zone=eu-central-1a
  --join='localhost:26257, localhost:26258, localhost:26259'
$ cockroach start
  --insecure
  --store=node3
  --listen-addr=localhost:26259
  --http-addr=localhost:8082
  --locality=region=eu-central-1,zone=eu-central-1a
  --join='localhost:26257, localhost:26258, localhost:26259'
$ cockroach init --insecure --host=localhost:26257
With the nodes in Frankfurt up and running, let’s use the CockroachDB shell to interrogate the regions and zones:
$ cockroach sql --insecure
SHOW regions;
     region    |      zones      | database_names | primary_region_of
---------------+-----------------+----------------+--------------------
  eu-central-1 | {eu-central-1a} | {}             | {}
Now, we’ll spin up the Paris nodes and have them join the existing Frankfurt cluster:
$ cockroach start
  --insecure
  --store=node4
  --listen-addr=localhost:26260
  --http-addr=localhost:8083
  --locality=region=eu-west-3,zone=eu-west-3a
  --join='localhost:26257, localhost:26258, localhost:26259'
$ cockroach start
  --insecure
  --store=node5
  --listen-addr=localhost:26261
  --http-addr=localhost:8084
  --locality=region=eu-west-3,zone=eu-west-3a
  --join='localhost:26257, localhost:26258, localhost:26259'
$ cockroach start
  --insecure
  --store=node6
  --listen-addr=localhost:26262
  --http-addr=localhost:8085
  --locality=region=eu-west-3,zone=eu-west-3a
  --join='localhost:26257, localhost:26258, localhost:26259'
With the nodes in Paris up and running, let’s run the regions query again to see how our cluster looks now:
SHOW regions;
     region    |      zones      | database_names | primary_region_of
---------------+-----------------+----------------+--------------------
  eu-central-1 | {eu-central-1a} | {}             | {}
  eu-west-3    | {eu-west-3a}    | {}             | {}

Next, we’ll upgrade the version of CockroachDB on all of our cluster nodes. Starting with just the nodes of the Frankfurt cluster for simplicity, we’ll update each of the nodes in turn, with zero downtime.

On my machine, the Frankfurt nodes are currently running on v21.2.0 of CockroachDB:
$ cockroach version
Build Tag:        v21.2.0
Build Time:       2021/11/15 14:00:58
Distribution:     CCL
Platform:         darwin amd64 (x86_64-apple-darwin19)
Go Version:       go1.16.6
C Compiler:       Clang 10.0.0
Build Commit ID:  79e5979416cb426092a83beff0be1c20aebf84c6
Build Type:       release
The most up-to-date version of CockroachDB available at the time of writing is v21.2.5. After downloading the new binary, I can confirm this on the command line:
$ cockroach version
Build Tag:        v21.2.5
Build Time:       2022/02/07 21:04:05
Distribution:     CCL
Platform:         darwin amd64 (x86_64-apple-darwin19)
Go Version:       go1.16.6
C Compiler:       Clang 10.0.0
Build Commit ID:  5afb632f77eee9f09f2adfa2943e1979ec4ebedf
Build Type:       release
Let’s apply this version of CockroachDB to each of the nodes. In an environment where you’ve manually configured load balancing, you’ll need to remove each node from the load balancer before stopping it for the upgrade. This will prevent any requests from being routed to a node that’s down. For Kubernetes users, the following is safe to do:
apiVersion: apps/v1
kind: StatefulSet
...
      containers:
      - name: cockroachdb
        image: cockroachdb/cockroach:v21.2.0 # => cockroachdb/cockroach:v21.2.5
        imagePullPolicy: IfNotPresent

Kubernetes will perform a rolling upgrade of your nodes, without any downtime, and will remove each node from the load balancer before replacing it.
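For reference, a manual rollout might look like the following, assuming your StatefulSet manifest lives in a hypothetical cockroachdb-statefulset.yaml file:
$ kubectl apply -f cockroachdb-statefulset.yaml
$ kubectl rollout status statefulset/cockroachdb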

Cockroach Labs has some best practices9 to consider when performing an upgrade between versions (including minor versions like the update we’re about to do). A key thing to be aware of is auto-finalization and whether it’s enabled or not before you upgrade your nodes.

If there’s any chance that by upgrading your cluster nodes you could inadvertently corrupt your database (e.g., when upgrading between versions with breaking changes), it’s important to disable auto-finalization. This can be achieved as follows:
SET CLUSTER SETTING cluster.preserve_downgrade_option = '21.2';
And to reenable auto-finalization:
RESET CLUSTER SETTING cluster.preserve_downgrade_option;
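To check which version the cluster is currently operating at (and, once every node has been upgraded, to confirm that finalization has completed), query the version cluster setting:
SHOW CLUSTER SETTING version;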
Before we begin, I’ll run a command from a previous chapter to get basic information about the nodes in the cluster:
$ cockroach node status --insecure
  id |     address     |   sql_address   |  build  |
-----+-----------------+-----------------+---------+
   1 | localhost:26257 | localhost:26257 | v21.2.0 |
   2 | localhost:26258 | localhost:26258 | v21.2.0 |
   3 | localhost:26259 | localhost:26259 | v21.2.0 |
As we can see, all of our nodes are running v21.2.0 as expected. Let’s perform our rolling upgrade now, starting with node one. First, we’ll stop the node (just ctrl-c the process for now) and check that it has become unavailable. Note that we’ll have to connect to another node to check the status while node one is down:
$ cockroach node status --insecure --url postgres://localhost:26258
id |     address     |   sql_address   |  build  | is_available | is_live
---+-----------------+-----------------+---------+--------------+----------
1  | localhost:26257 | localhost:26257 | v21.2.0 | false        | false
2  | localhost:26258 | localhost:26258 | v21.2.0 | true         | true
3  | localhost:26259 | localhost:26259 | v21.2.0 | true         | true

As there are no breaking changes between v21.2.0 and v21.2.5, we can simply run the start command again once we’ve obtained v21.2.5 of the binary; we don’t need to delete the node’s store directory.

After swapping in the new binary and starting node one again, we can see that it’s available once more and its version has incremented. We can also run the node status command against node one again:
$ cockroach node status --insecure
  id |     address     |   sql_address   |  build  |
-----+-----------------+-----------------+---------+
   1 | localhost:26257 | localhost:26257 | v21.2.5 |
   2 | localhost:26258 | localhost:26258 | v21.2.0 |
   3 | localhost:26259 | localhost:26259 | v21.2.0 |
We’ll repeat the steps for nodes two and three now. Note that because our cluster has only three nodes, it’s critical that we upgrade one node at a time. If we take two nodes out of the cluster at the same time, the cluster will become unavailable. Once all three nodes have been upgraded, every node reports the new version:
$ cockroach node status --insecure
  id |     address     |   sql_address   |  build  |
-----+-----------------+-----------------+---------+
   1 | localhost:26257 | localhost:26257 | v21.2.5 |
   2 | localhost:26258 | localhost:26258 | v21.2.5 |
   3 | localhost:26259 | localhost:26259 | v21.2.5 |

Moving a Cluster

Let’s assume that in the last part of this section, we need to move a cluster from one location to another (e.g., for a cloud provider migration or simply onto newer hardware in an on-premises scenario). I won’t discuss topics like load balancing or DNS here but will instead provide a pattern for migrating the nodes themselves.

We’ll start a brand-new cluster for this example: three nodes in AWS’s eu-central-1 (Frankfurt) region that we’ll move into GCP’s europe-west1 (St. Ghislain) region. Everything will be running locally, so no actual cloud migration is taking place.

Before we begin, it’s important to discuss the order in which we’ll add and remove nodes. We have a three-node cluster, with a replication factor of three (as can be seen by running SHOW ZONE CONFIGURATION FROM DATABASE defaultdb), so we should keep a three-node cluster available at all times. This means bringing up a node before taking another node down and waiting for all replicas to be rebalanced onto the new node before starting work on the next node.
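If you’d like to confirm the replication factor yourself, the zone configuration statement mentioned above returns something like the following (output abridged; values such as range sizes will vary):
SHOW ZONE CONFIGURATION FROM DATABASE defaultdb;
     target     |             raw_config_sql
----------------+-------------------------------------------
  RANGE default | ALTER RANGE default CONFIGURE ZONE USING
                |     num_replicas = 3,
                |     ...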

First, we’ll start node1, node2, and node3 in the Frankfurt region and initialize the cluster:
$ cockroach start
  --insecure
  --store=node1
  --listen-addr=localhost:26257
  --http-addr=localhost:8080
  --locality=region=eu-central-1,zone=eu-central-1a
  --join='localhost:26257, localhost:26258, localhost:26259'
$ cockroach start
  --insecure
  --store=node2
  --listen-addr=localhost:26258
  --http-addr=localhost:8081
  --locality=region=eu-central-1,zone=eu-central-1a
  --join='localhost:26257, localhost:26258, localhost:26259'
$ cockroach start
  --insecure
  --store=node3
  --listen-addr=localhost:26259
  --http-addr=localhost:8082
  --locality=region=eu-central-1,zone=eu-central-1a
  --join='localhost:26257, localhost:26258, localhost:26259'
$ cockroach init --insecure --host=localhost:26257
Next, we’ll start node4, node5, and node6 in the St. Ghislain region and have them join the cluster. Once any underreplicated ranges have been rebalanced and resolved, we’ll start tearing down the original cluster nodes:
$ cockroach start
  --insecure
  --store=node4
  --listen-addr=localhost:26260
  --http-addr=localhost:8083
  --locality=region=europe-west1,zone=europe-west1b
  --join='localhost:26257, localhost:26258, localhost:26259, localhost:26260, localhost:26261, localhost:26262'
$ cockroach start
  --insecure
  --store=node5
  --listen-addr=localhost:26261
  --http-addr=localhost:8084
  --locality=region=europe-west1,zone=europe-west1c
  --join='localhost:26257, localhost:26258, localhost:26259, localhost:26260, localhost:26261, localhost:26262'
$ cockroach start
  --insecure
  --store=node6
  --listen-addr=localhost:26262
  --http-addr=localhost:8085
  --locality=region=europe-west1,zone=europe-west1d
  --join='localhost:26257, localhost:26258, localhost:26259, localhost:26260, localhost:26261, localhost:26262'

Note that I’ve included all of the new and existing hosts in the --join argument; this will prevent the cluster from becoming unavailable once I start removing the old nodes.

With the new nodes in place and all underreplicated ranges resolved, let’s run the node command to see how our cluster is looking:
$ cockroach node status --insecure --url postgres://localhost:26260 --format records
-[ RECORD 1 ]
id           | 1
address      | localhost:26257
locality     | region=eu-central-1,zone=eu-central-1a
is_available | true
is_live      | true
-[ RECORD 2 ]
id           | 2
address      | localhost:26259
locality     | region=eu-central-1,zone=eu-central-1a
is_available | true
is_live      | true
-[ RECORD 3 ]
id           | 3
address      | localhost:26258
locality     | region=eu-central-1,zone=eu-central-1a
is_available | true
is_live      | true
-[ RECORD 4 ]
id           | 4
address      | localhost:26260
locality     | region=europe-west1,zone=europe-west1b
is_available | true
is_live      | true
-[ RECORD 5 ]
id           | 5
address      | localhost:26261
locality     | region=europe-west1,zone=europe-west1c
is_available | true
is_live      | true
-[ RECORD 6 ]
id           | 6
address      | localhost:26262
locality     | region=europe-west1,zone=europe-west1d
is_available | true
is_live      | true

Everything’s looking good. All six nodes are operational. Once all remaining underreplicated ranges are resolved (see Figure 9-1 for an example of healthy replication) and you’ve taken any backups required, it’s safe to remove the original three nodes from the cluster and finish the move.

Figure 9-1.

Replication status of six healthy nodes

Let’s start by decommissioning node1, node2, and node3. This process will ensure that all ranges are replicated away from these nodes and then remove them from the cluster. Note that I’m connected to node6, but I could have connected to any node in the cluster to perform this operation:
$ cockroach node decommission 1 --insecure --url postgres://localhost:26262
$ cockroach node decommission 2 --insecure --url postgres://localhost:26262
$ cockroach node decommission 3 --insecure --url postgres://localhost:26262
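To watch decommissioning progress, the node status command can include the decommissioning columns; replica counts on the old nodes should drain to zero before you shut them down:
$ cockroach node status --decommission --insecure --url postgres://localhost:26262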

Once the nodes are decommissioned, check that the replication status in the dashboard still looks like Figure 9-1 and then shut the nodes down.

You’ve just moved your cluster from AWS’s Frankfurt region to GCP’s St. Ghislain region!

Backing Up and Restoring Data

Being a multiactive database, CockroachDB does not require the conventional DR (Disaster Recovery) architecture of many traditional relational databases. Unfortunately, a multiactive database will not protect you against bad actors or simply bad mistakes; a TRUNCATE statement may wreak as much havoc on a CockroachDB cluster as on any other relational database.

That’s where backups come to the rescue. With CockroachDB, you can back up your clusters in any of the following ways:10
  • Full backups Full backups contain all of the data in your cluster (without replication). Say, for example, you replicate 1GB of data across five nodes; your backup will contain 1GB of data, not 5GB. Full backups are available for all clusters.

  • Incremental backups Incremental backups capture the changes since the last backup was made. You’ll always need at least one full backup but can take as many incremental backups as you like. Incremental backups are available for Enterprise clusters.

  • Encrypted backups Encrypted backups add an additional level of security to your database backups. Note that you can achieve secure backups without manual encryption by simply backing up into an encrypted S3 bucket (or similar). Encrypted backups are available for Enterprise clusters.

  • Backups with revision history Backups with revision history not only back up the latest data in the database but also any revision history yet to be garbage collected (you’ll have 25 hours' worth of revision history by default). From these backups, you can restore either the latest data or data from a previous point in time. These are available for Enterprise clusters.

  • Locality-aware backups Locality-aware backups allow specific database localities (e.g., just eu-central-1) to be backed up and are available for Enterprise clusters.

Let’s run through each of the backup methods to see how they work for different use cases. I’ll run each of the following examples in a demo cluster (which enables Enterprise features). To enable Enterprise features in a regular cluster, configure the following settings:
$ cockroach sql --insecure --host=<YOUR_HOST>
SET CLUSTER SETTING cluster.organization = 'your_organisation';
SET CLUSTER SETTING enterprise.license = 'your_license_key';
Before we create any backups, I’ll create a small database and table to prove that our backup and restore operations are successful. For this, I’ll create a new sensors database with a sensor_reading table that will work nicely for all of the backup methods:
CREATE DATABASE sensors;
USE sensors;
CREATE TABLE sensor_reading (
    sensor_id UUID PRIMARY KEY,
    country STRING NOT NULL,
    reading DECIMAL NOT NULL,
    timestamp TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
INSERT INTO sensor_reading (sensor_id, country, reading, timestamp)
SELECT
    gen_random_uuid(),
    'DE',
    CAST(random() AS DECIMAL),
    '2022-02-18' - CAST(s * 10000 AS INTERVAL)
FROM generate_series(1, 1000) AS s;
Just one more step before we’re ready to back up. Unlike the IMPORT statement, which can use an HTTP endpoint, backups and restorations need to use cloud provider blob storage (e.g., S3, Google Cloud Storage, or Azure Storage). Let’s create a few S3 buckets for our backup examples:
$ aws s3api create-bucket
    --bucket practical-cockroachdb-backups
    --region eu-west-2
    --create-bucket-configuration LocationConstraint=eu-west-2
{
    "Location": "http://practical-cockroachdb-backups.s3.amazonaws.com/"
}
$ aws s3api create-bucket
    --bucket practical-cockroachdb-backups-us-west
    --region us-west-1
    --create-bucket-configuration LocationConstraint=us-west-1
{
    "Location": "http://practical-cockroachdb-backups-us-west.s3.amazonaws.com/"
}
$ aws s3api create-bucket
    --bucket practical-cockroachdb-backups-us-east
    --region us-east-1
{
    "Location": "/practical-cockroachdb-backups-us-east"
}

Full Backups

With the database, table, and buckets in place, we’re ready to begin! Let’s start by running a full backup and restore of the data:
BACKUP INTO 's3://practical-cockroachdb-backups?AWS_ACCESS_KEY_ID=****&AWS_SECRET_ACCESS_KEY=****&AWS_REGION=eu-west-2' AS OF SYSTEM TIME '-10s';
        job_id       |  status   | fraction_completed | rows | index_entries | bytes
---------------------+-----------+--------------------+------+---------------+------
  742270852339073025 | succeeded |                  1 | 1032 |            20 | 57311

Note that in our backup request, we tell CockroachDB to back up the database as it was ten seconds ago. This means that it’s not trying to back up live data while it’s being served to clients. This is a performance recommendation from Cockroach Labs, and ten seconds is the minimum recommended period; depending on your garbage collection window (the default is 25 hours), you may want to set this to be further in the past.

Let’s check that S3 has our backup:
$ aws s3api list-objects-v2
    --bucket practical-cockroachdb-backups
{
    "Contents": [
        {
            "Key": "2022/03/06-190437.13/BACKUP-CHECKPOINT-742270852339073025-CHECKSUM",
            "LastModified": "2022-03-06T19:04:48+00:00",
            "ETag": ""79f98a6fd4b39f02b7727c91707b71cd"",
            "Size": 4,
            "StorageClass": "STANDARD"
        }
        ...

Looks good! CockroachDB backups are stored in a directory structure whose directories indicate the date and time of the backup.

Let’s ensure our sensor_reading table was backed up successfully by running a restore. As all of the data is safely replicated across our three nodes, a restore won’t currently prove anything; let’s remedy that:
TRUNCATE sensor_reading;
SELECT COUNT(*) FROM sensor_reading;
  count
---------
      0
With the data gone (and the department panicking), it’s time to run our restore. Pay attention to the backup objects’ directory structure; we’ll need it now:
RESTORE FROM '2022/03/06-190437.13' IN 's3://practical-cockroachdb-backups?AWS_ACCESS_KEY_ID=****&AWS_SECRET_ACCESS_KEY=****&AWS_REGION=eu-west-2';
ERROR: full cluster restore can only be run on a cluster with no tables or databases but found 4 descriptors: [sensors crdb_internal_region _crdb_internal_region sensor_reading]
Oh no! It seems we can’t restore unless we have an empty cluster. What if we had other databases or tables that are not in need of restoration? Happily, CockroachDB has alternative RESTORE commands you can use depending on your situation, all of which can be performed from the full backup we just took:
  • Ruined your cluster? Use RESTORE FROM to restore your entire cluster.

  • Ruined your database? Use RESTORE DATABASE {DATABASE} FROM to restore just your database (see the example after this list).

  • Ruined your table? Use RESTORE TABLE {DATABASE}.{TABLE} FROM to restore just your table. Note that by passing in a comma-separated string of tables, you can restore multiple tables in one go.
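As a concrete illustration of the database-level option, restoring the whole sensors database from the same backup collection would look something like the following sketch (using the same placeholder subdirectory and credentials as above). As with table restores, the target database can’t already exist, so you’d typically DROP it first:
RESTORE DATABASE sensors FROM '2022/03/06-190437.13' IN 's3://practical-cockroachdb-backups?AWS_ACCESS_KEY_ID=****&AWS_SECRET_ACCESS_KEY=****&AWS_REGION=eu-west-2';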

There’s one more step we need to perform before our table can be restored. We need to DROP or RENAME it11 before running the restore. As my sensor_reading table is empty, there’s nothing I need to archive, so DROP works best for me:
DROP TABLE sensor_reading;
RESTORE TABLE sensors.sensor_reading FROM '2022/03/06-190437.13' IN 's3://practical-cockroachdb-backups?AWS_ACCESS_KEY_ID=****&AWS_SECRET_ACCESS_KEY=****&AWS_REGION=eu-west-2';
        job_id       |  status   | fraction_completed | rows | index_entries | bytes
---------------------+-----------+--------------------+------+---------------+-------
  742274102397501441 | succeeded |                  1 | 1000 |             0 | 52134

Incremental Backups

A full backup and restore works in this case, as the database and objects are small. If your cluster is many gigabytes in size, incremental backups may be a more efficient option for you. Let’s explore incremental backups to see how they work.

First, we’ll insert some new data into our restored sensor_reading table to simulate data being added to the table over time. This will create a delta between the original data we inserted at the start of the backups section and now. If CockroachDB does not detect a change between an incremental backup and the previous backup (whether full or incremental), you’ll see an empty backup directory:
INSERT INTO sensor_reading (sensor_id, country, reading, timestamp)
SELECT
    gen_random_uuid(),
    'DE',
    CAST(random() AS DECIMAL),
    '2022-02-18' - CAST(s * 10000 AS INTERVAL)
FROM generate_series(1, 500) AS s;
SELECT COUNT(*) FROM sensor_reading;
  count
---------
   1500
Now, let’s run an incremental backup:
BACKUP INTO LATEST IN 's3://practical-cockroachdb-backups?AWS_ACCESS_KEY_ID=****&AWS_SECRET_ACCESS_KEY=****&AWS_REGION=eu-west-2' AS OF SYSTEM TIME '-10s';
        job_id       |  status   | fraction_completed | rows | index_entries | bytes
---------------------+-----------+--------------------+------+---------------+-------
  742278826975625217 | succeeded |                  1 | 1511 |            35 | 85097
The number of rows indicates the number of rows backed up by the last backup operation. Note that because we have not yet backed up our restored dataset, the delta includes all rows in the table. If you run the backup again now, you’ll see zero rows, as there’s nothing fresh to back up:
BACKUP INTO LATEST IN 's3://practical-cockroachdb-backups?AWS_ACCESS_KEY_ID=****&AWS_SECRET_ACCESS_KEY=****&AWS_REGION=eu-west-2' AS OF SYSTEM TIME '-10s';
        job_id       |  status   | fraction_completed | rows | index_entries | bytes
---------------------+-----------+--------------------+------+---------------+-------
  742280265162588161 | succeeded |                  1 |    0 |             0 |     0
Note that because we’re running an incremental backup of the whole cluster, changes to any other database object will be picked up (and backed up). To restore our table, we simply need to rerun the restore operation as before. All incremental backups exist in S3 and are automatically picked up:
RESTORE TABLE sensors.sensor_reading FROM '2022/03/06-190437.13' IN 's3://practical-cockroachdb-backups?AWS_ACCESS_KEY_ID=****&AWS_SECRET_ACCESS_KEY=****&AWS_REGION=eu-west-2';

Encrypted Backups

To encrypt a backup, you’ll need to create your own encryption key and use that to encrypt the backup before it’s persisted in cloud storage. Let’s create an encryption key and use it to create an encrypted backup now:
$ aws kms create-key
    --key-spec=SYMMETRIC_DEFAULT
    --tags TagKey=Purpose,TagValue="Encrypt CockroachDB backups"
    --description "Practical CockroachDB Backups"
To use the encryption key for backups and restores, simply pass an argument to the BACKUP/RESTORE command as follows:
BACKUP INTO 's3://practical-cockroachdb-backups?AWS_ACCESS_KEY_ID=****&AWS_SECRET_ACCESS_KEY=****&AWS_REGION=eu-west-2'
    WITH kms = 'aws:///****?AUTH=implicit&REGION=eu-west-2';
There are a few things to note here:
  • The KMS key URL scheme of “aws:///” is not a typo.

  • The key ID to use in the URL can be obtained by running the following AWS CLI command and extracting the KeyId for the key you’d like to use:

$ aws kms list-keys
{
  "Keys": [
    {
      "KeyId": "********-****-****-****-************",
      "KeyArn": "arn:aws:kms:eu-west-2:***:key/********-****-****-****-************"
    },
    {
      "KeyId": "********-****-****-****-************",
      "KeyArn": "arn:aws:kms:eu-west-2:***:key/********-****-****-****-************"
    }
  ]
}
  • By using the configuration AUTH=implicit, I’m telling CockroachDB to use the same credentials to access the KMS key as it used to access the bucket. If your KMS key uses different credentials, provide the associated AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and REGION query arguments for that key.

If you attempt to restore the backup without a KMS key, you’ll receive an error, as CockroachDB is aware that this backup has been manually encrypted with a KMS key:
RESTORE TABLE sensors.sensor_reading FROM '2022/03/08-183653.00' IN 's3://practical-cockroachdb-backups?AWS_ACCESS_KEY_ID=****&AWS_SECRET_ACCESS_KEY=****&AWS_REGION=eu-west-2';
ERROR: file appears encrypted -- try specifying one of "encryption_passphrase" or "kms": proto: wrong wireType = 5 for field EntryCounts
Restoring from a backup with the use of a KMS key performs the restore as expected:
RESTORE TABLE sensors.sensor_reading FROM '2022/03/08-183653.00' IN 's3://practical-cockroachdb-backups?AWS_ACCESS_KEY_ID=****&AWS_SECRET_ACCESS_KEY=****&AWS_REGION=eu-west-2'
  WITH kms = 'aws:///****?AUTH=implicit&REGION=eu-west-2';
        job_id       |  status   | fraction_completed | rows | index_entries | bytes
---------------------+-----------+--------------------+------+---------------+-------
  742834697875423233 | succeeded |                  1 | 1000 |             0 | 52140
To encrypt a multiregion cluster using KMS keys stored in different regions, simply provide a KMS key for each of the regions you’d like to encrypt. The cluster I’m encrypting is currently in a single region, but by adding more KMS keys to the array of keys, you can encrypt more regions:
BACKUP INTO 's3://practical-cockroachdb-backups?AWS_ACCESS_KEY_ID=****&AWS_SECRET_ACCESS_KEY=****&AWS_REGION=eu-west-2'
  WITH kms = (
    'aws:///****?AUTH=implicit&REGION=eu-west-2'
  );
        job_id       |  status   | fraction_completed | rows | index_entries | bytes
---------------------+-----------+--------------------+------+---------------+-------
  742839873407320065 | succeeded |                  1 | 1055 |            65 | 76388
RESTORE TABLE sensors.sensor_reading FROM '2022/03/08-191858.46' IN 's3://practical-cockroachdb-backups?AWS_ACCESS_KEY_ID=****&AWS_SECRET_ACCESS_KEY=****&AWS_REGION=eu-west-2'
  WITH kms = (
    'aws:///****?AUTH=implicit&REGION=eu-west-2'
  );
        job_id       |  status   | fraction_completed | rows | index_entries | bytes
---------------------+-----------+--------------------+------+---------------+-------
  742840069222793217 | succeeded |                  1 | 1000 |             0 | 52150

Note that the number of rows restored is less than the number of rows backed up. This is because a backup was made across the whole cluster, while the restore targets just the sensor_reading table.

If you’d rather not use keys stored in your cloud provider for encrypting backups, you can provide your own password to use for encryption. The process of encrypting backups in this manner is very similar to encrypting backups with cloud keys:
BACKUP INTO 's3://practical-cockroachdb-backups?AWS_ACCESS_KEY_ID=****&AWS_SECRET_ACCESS_KEY=****&AWS_REGION=eu-west-2'
  WITH encryption_passphrase = 'correct horse battery staple';
        job_id       |  status   | fraction_completed | rows | index_entries | bytes
---------------------+-----------+--------------------+------+---------------+-------
  742837673818750977 | succeeded |                  1 | 1046 |            48 | 69464
The restore is also very similar to the KMS equivalent:
RESTORE TABLE sensors.sensor_reading FROM '2022/03/08-190747.27' IN 's3://practical-cockroachdb-backups?AWS_ACCESS_KEY_ID=****&AWS_SECRET_ACCESS_KEY=****&AWS_REGION=eu-west-2'
  WITH encryption_passphrase = 'correct horse battery staple';
        job_id       |  status   | fraction_completed | rows | index_entries | bytes
---------------------+-----------+--------------------+------+---------------+-------
  742838121558212609 | succeeded |                  1 | 1000 |             0 | 52150

Locality-Aware Backups

To make a backup locality-aware, we simply need to pass some additional configuration to the BACKUP command to make sure that backups remain in their respective regions. Let’s create a new cluster using the demo command to simulate a multiregion cluster.

In the following example, we specify that by default, CockroachDB should back up to the us-west bucket, except for nodes with the us-east-1 locality, which will back up to the us-east bucket:
BACKUP INTO (
's3://practical-cockroachdb-backups-us-west?AWS_ACCESS_KEY_ID=****&AWS_SECRET_ACCESS_KEY=****&COCKROACH_LOCALITY=default',
's3://practical-cockroachdb-backups-us-east?AWS_ACCESS_KEY_ID=****&AWS_SECRET_ACCESS_KEY=****&COCKROACH_LOCALITY=region%3Dus-east-1'
);
RESTORE TABLE sensor_reading FROM '2022/06/29-183902.85' IN (
's3://practical-cockroachdb-backups-us-west?AWS_ACCESS_KEY_ID=****&AWS_SECRET_ACCESS_KEY=****&AWS_REGION=us-west-1&COCKROACH_LOCALITY=default',
's3://practical-cockroachdb-backups-us-east?AWS_ACCESS_KEY_ID=****&AWS_SECRET_ACCESS_KEY=****&AWS_REGION=us-east-1&COCKROACH_LOCALITY=region%3Dus-east-1'
);

KMS keys and encryption passphrases can be used in conjunction with regional backups, making for flexible backup and restores with CockroachDB.

Scheduled Backups

It’s safe to assume that you won’t want to manually back up your cluster at midnight every night. That’s where CockroachDB’s backup schedules come in. The following statement creates two backup schedules:
  • Weekly full backup A weekly full backup is created to ensure a clean baseline backup for each week is captured.

  • Daily incremental backup A daily backup is created as per our instructions using crontab syntax.

CREATE SCHEDULE cluster_backup
  FOR BACKUP INTO 's3://practical-cockroachdb-backups?AWS_ACCESS_KEY_ID=****&AWS_SECRET_ACCESS_KEY=****&AWS_REGION=eu-west-2'
    WITH revision_history
    RECURRING '@daily'
    WITH SCHEDULE OPTIONS first_run = 'now';
The output of this statement will contain information for two newly created schedules:
  • Weekly full backup:
    • ID: 742844058827390977

    • Status: ACTIVE

  • Daily incremental backup:
    • ID: 742844058837024769

    • Status: PAUSED: Waiting for initial backup to complete

Once the initial daily backup has finished, its status will go from PAUSED to ACTIVE. This can be seen with a call to SHOW SCHEDULES:
$ cockroach sql --url "postgresql://localhost:26260" --insecure
    --execute "SHOW SCHEDULES"
    --format records
-[ RECORD 1 ]
id              | 742844058827390977
label           | cluster_backup
schedule_status | ACTIVE
next_run        | 2022-03-13 00:00:00+00
recurrence      | @weekly
-[ RECORD 3 ]
id              | 742844058837024769
label           | cluster_backup
schedule_status | ACTIVE
next_run        | 2022-03-09 00:00:00+00
recurrence      | @daily

Our daily backup job will next execute tomorrow at midnight, whereas our weekly backup job will next execute on Sunday.

We asked CockroachDB to start our first daily backup immediately. This is unlikely to be your desired behavior if you’re creating the schedule in the middle of the day. To set a different time for the first run to start, simply replace the first_run value with a timestamp. For example:
CREATE SCHEDULE cluster_backup
  FOR BACKUP INTO 's3://practical-cockroachdb-backups?AWS_ACCESS_KEY_ID=****&AWS_SECRET_ACCESS_KEY=****&AWS_REGION=eu-west-2'
    WITH revision_history
    RECURRING '@daily'
    WITH SCHEDULE OPTIONS first_run = '2022-03-08 00:00:00+00';

Cluster Design

When designing your cluster, there are numerous configurations you’ll need to decide between to get optimal performance and resilience. In this section, we’ll focus on these decisions and their trade-offs.

Cluster Sizing

If you think about your CockroachDB cluster as a collection of vCPUs (Virtual CPUs) that are distributed across nodes, the process of architecting your cluster becomes simpler.

There are trade-offs between having a cluster with a small number of large nodes and a cluster with a large number of smaller nodes.

In a small cluster of large nodes, the additional computing power of each machine and the fewer network hops required to satisfy large queries make for a more stable cluster.

In a large cluster of small nodes, the ability to distribute data across more nodes and to parallelize operations like backups and restores makes for a more resilient cluster.

The decision between stability and resilience is yours to make, but as a rule of thumb, Cockroach Labs recommends that you meet in the middle and distribute your vCPUs across as few nodes as possible while still achieving resilience. Their Production Checklist12 offers a detailed breakdown of each scenario.

Node Sizing

Cockroach Labs provides a simple set of recommendations for sizing your cluster nodes based on their vCPU count. Let’s go through each of the recommendations against the minimum and recommended vCPU counts. Note that these are just general recommendations. Depending on your unique performance requirements, you may require different ratios:
  • Memory 4GB per vCPU:
    • 4 vCPUs -> 16GB of RAM per node

    • 8 vCPUs -> 32GB of RAM per node

  • Storage 150GB per vCPU:
    • 4 vCPUs -> 600GB of disk per node

    • 8 vCPUs -> 1.2TB of disk per node

  • IOPS 500 IOPS (input/output operations per second) per vCPU:
    • 4 vCPUs -> 2,000 IOPS

    • 8 vCPUs -> 4,000 IOPS

  • MB/s 30MB/s (amount of data transferred to or from disk per second) per vCPU:
    • 4 vCPUs -> 120MB/s

    • 8 vCPUs -> 240MB/s

It’s often said that “premature optimization is the root of all evil.” The same applies when creating the architecture your CockroachDB cluster will live in. Back in 2018, I was creating a CockroachDB cluster and asked the Cockroach Labs team whether they recommended immediately placing an external cache in front of my CockroachDB cluster to cater for read-heavy workloads. Their response was to start by harnessing CockroachDB’s own cache. This would result in fewer network hops and would provide query atomicity, whereas a combination of database and cache would not.

With CockroachDB, you have the freedom to not only scale the number of nodes in your cluster but also the nodes themselves. So design to support a sensible initial capacity and grow from there.

Monitoring

An important aspect of running any software safely in production is monitoring . In this section, we’ll configure monitoring for a CockroachDB cluster and raise alerts against a simple metric.

For this example, we’ll use Prometheus, a popular open source monitoring system that CockroachDB has great support for.

First, we’ll create a cluster to monitor:
$ cockroach start
  --insecure
  --store=node1
  --listen-addr=localhost:26257
  --http-addr=localhost:8080
  --join=localhost:26257,localhost:26258,localhost:26259
$ cockroach start
  --insecure
  --store=node2
  --listen-addr=localhost:26258
  --http-addr=localhost:8081
  --join=localhost:26257,localhost:26258,localhost:26259
$ cockroach start
  --insecure
  --store=node3
  --listen-addr=localhost:26259
  --http-addr=localhost:8082
  --join=localhost:26257,localhost:26258,localhost:26259
$ cockroach init --insecure --host=localhost:26257

Next, we’ll create an instance of Prometheus to monitor our three nodes. We’ll create prometheus.yml and alert_rules.yml configuration files and point the Prometheus instance at those.

The prometheus.yml file configures the basic properties of Prometheus. It dictates how frequently Prometheus will scrape for metrics, which hosts to scrape metrics from, and which URLs those hosts are serving metrics on. Note that because I’ll be running Prometheus using Docker, I use the host.docker.internal DNS name, which resolves out of the container to my host machine:
global:
  scrape_interval: 10s
rule_files:
  - alert_rules.yml
scrape_configs:
  - job_name: cockroachdb
    metrics_path: /_status/vars
    static_configs:
      - targets:
        - host.docker.internal:8080
        - host.docker.internal:8081
        - host.docker.internal:8082
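Before wiring up Prometheus, you can confirm that each node is exposing metrics by hitting the metrics endpoint directly; the output is plain-text Prometheus exposition format:
$ curl http://localhost:8080/_status/vars | head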
The alert_rules.yml file configures alert groups containing metric rules. If any metric rule breaches its configured threshold, an alert will be raised for that group. For this example, I create an alert that will fire if Prometheus detects that a node’s metrics endpoint has been unreachable for one minute:
groups:
- name: node_down
  rules:
  - alert: NodeDown
    expr: up == 0
    for: 1m
    labels:
      severity: critical
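As a further example (assuming the metric name ranges_underreplicated, which CockroachDB exposes via /_status/vars), you could append another group to the groups list to alert on underreplicated ranges:
- name: replication
  rules:
  - alert: UnderreplicatedRanges
    expr: ranges_underreplicated > 0
    for: 10m
    labels:
      severity: warning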
Next, we’ll create an instance of Alertmanager. This will receive alerts from Prometheus and direct them to a receiver. With the following configuration, I use a simple webhook receiver to send notifications to https://httpbin.org:
global:
  resolve_timeout: 5m
route:
  group_by: ['alertname']
  group_wait: 5s
  group_interval: 5s
  repeat_interval: 1h
  receiver: api_notify
receivers:
  - name: api_notify
    webhook_configs:
      - url: https://httpbin.org/post
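Note that for Prometheus to forward alerts to Alertmanager at all, prometheus.yml also needs an alerting block pointing at the Alertmanager instance. A minimal sketch, assuming Alertmanager is reachable from the Prometheus container on port 9093:
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          - host.docker.internal:9093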
Let’s start Prometheus and Alertmanager now:
$ docker run
      --name prometheus
      --rm -it
      -p 9090:9090
      -v ${PWD}/prometheus.yml:/etc/prometheus/prometheus.yml
      -v ${PWD}/alert_rules.yml:/etc/prometheus/alert_rules.yml
      prom/prometheus
$ docker run
      --name alertmanager
      --rm -it
      -p 9093:9093
      -v ${PWD}/alertmanager.yml:/etc/alertmanager/alertmanager.yml
      prom/alertmanager

If you visit http://localhost:9090/alerts, you’ll see the NodeDown alert listed, currently reporting that no nodes are down:

[Screenshot: node status (6 live nodes) and replication status (43 total ranges)]

If we kill a node now, the alert will enter a “pending” state before “firing.” I’ll terminate the cockroach process for node3 to demonstrate these alert statuses.

After a short while, the alert will enter the pending state:

[Screenshot: the NodeDown alert rule (expr: up == 0, for: 1m, severity: critical) with 0 active alerts]

After a little longer, it will enter the firing state:

[Screenshot: the NodeDown alert with 1 active alert, listed with its state, active-since time, and value]

If we restart node3 and wait a short while, the alert will clear and report that our cluster is once again healthy.
