Remote write

Remote write was a highly sought-after feature for Prometheus. It was first implemented as native support for sending samples in the OpenTSDB, InfluxDB, and Graphite data formats. However, a decision was soon made not to support each possible remote system, but instead to provide a generic write mechanism suitable for building custom adapters. This enabled custom integrations decoupled from the Prometheus roadmap, while opening up the possibility of supporting the read path in those bridges as well. The system-specific implementations of remote write were removed from the Prometheus binary and converted into a standalone adapter that serves as an example. This approach of relying on adapters and empowering the community to build whatever integration is required follows the philosophy we discussed in Chapter 12, Choosing the Right Service Discovery, for building custom service discovery integrations.

Official examples of custom remote storage adapters can be found at https://github.com/prometheus/prometheus/tree/master/documentation/examples/remote_storage/remote_storage_adapter.

Prometheus sends individual samples to remote write endpoints, using a very simple format that isn't tied to Prometheus internals. The system on the other end might not even be a storage system but a stream processor, such as Kafka or Riemann. Sending samples rather than chunks was a tough decision when defining the remote write design, as Prometheus already knew how to create efficient chunks and could just send those over the wire. However, chunks would have made supporting streaming systems impractical, and samples are both easier to understand and easier to implement in adapters.

Remote write was significantly improved with the release of Prometheus 2.8. Previously, when samples failed to be delivered to a remote write endpoint (due to network or service issues), there was just a small buffer to store the data. If that buffer filled up, samples would be dropped and were permanently lost to those remote systems. Even worse, the buffer could create back-pressure and cause the Prometheus server to crash due to an Out of Memory (OOM) error. Since the remote write API started relying on the Write-Ahead Log (WAL) for bookkeeping, this doesn't happen anymore. Instead of using a buffer, remote write now reads directly from the WAL, which holds all in-flight transactions and scraped samples. Using the WAL in the remote write subsystem makes Prometheus memory usage more predictable and allows it to resume from where it left off after a connectivity outage to the remote system.

Configuration-wise, the following snippet illustrates the minimal configuration required to set up a remote write endpoint in Prometheus:

remote_write:
  - url: http://example.com:8000/write

Since remote write is another instance of interfacing with external systems, external_labels are also applied to samples before they are sent. This also helps prevent metric collisions on the remote side when more than one Prometheus server pushes data to the same location. Remote write also supports write_relabel_configs, which lets you control which metrics are sent and which are dropped. This relabeling runs after external labels are applied.
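As a minimal sketch of how these options might fit together, the following configuration extends the earlier snippet. The replica label name and value, and the choice to drop metrics whose names start with go_, are assumptions made purely for illustration:

```yaml
global:
  external_labels:
    # Hypothetical label that identifies this Prometheus server
    # on the remote side; pick a name and value that suit your setup.
    replica: prometheus-a

remote_write:
  - url: http://example.com:8000/write
    write_relabel_configs:
      # Illustrative rule: do not send Go runtime metrics to the remote endpoint.
      - source_labels: [__name__]
        regex: 'go_.*'
        action: drop
```

Because relabeling runs after external labels are applied, rules in write_relabel_configs can also match on labels such as the replica label assumed here.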

Later in this chapter, we'll talk about a fairly new (and experimental) Thanos component called receiver as a practical example of remote write usage.
