Elasticsearch provides a built-in system to rapidly back up and restore your data. Backing up live data is complex because of the many concurrency problems involved.
Elasticsearch can take snapshots of individual indices (or aliases), or of an entire cluster, and store them in a remote repository.
Before taking a snapshot, you must create a repository: this is where your backups/snapshots will be stored.
You need an up-and-running Elasticsearch installation, as described in the Downloading and installing Elasticsearch recipe in Chapter 2, Downloading and Setup.
To execute curl via the command line, you need to install curl for your operating system.
We need to edit config/elasticsearch.yml and add the directory of your backup repository:
path.repo: /tmp/
For our examples, we'll be using the /tmp directory, which is available on every Unix system. In a production cluster, this directory should generally be a shared filesystem location that every node can access.
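Because a filesystem repository location has to live under a directory registered in path.repo, it can help to check a candidate path before creating the repository. The following is a minimal sketch of such a check; is_under_repo_path is a hypothetical helper, not part of Elasticsearch, and it compares path strings only (it does not resolve symlinks or ".." segments).

```shell
# Hypothetical helper: succeed when a repository location sits inside
# a directory listed in path.repo. Pure string comparison only.
# $1 = path.repo entry (for example /tmp/), $2 = candidate location
is_under_repo_path() {
  repo_root="${1%/}"   # drop one trailing slash so /tmp/ and /tmp compare equal
  location="$2"
  case "$location" in
    "$repo_root"|"$repo_root"/*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, is_under_repo_path /tmp/ /tmp/my_repository succeeds, while is_under_repo_path /tmp/ /var/backups fails.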
To manage a repository, we will perform the following steps:
To create a repository called my_repository, the HTTP method is PUT and the curl command is:
curl -XPUT 'http://localhost:9200/_snapshot/my_repository' -d '{ "type": "fs", "settings": { "location": "/tmp/my_repository", "compress": true } }'
The result will be:
{"acknowledged":true}
If you check your filesystem, you will see that the /tmp/my_repository directory has been created.
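The creation call can be wrapped in a small reusable function. This is a sketch under stated assumptions: ES_URL, snapshot_repo_url, and create_fs_repo are hypothetical names introduced here, and the Content-Type header is an addition to the original command (Elasticsearch 6.0 and later require it for requests with a body; older versions accept it).

```shell
# Base URL of the cluster; the default matches the examples in this recipe.
ES_URL="${ES_URL:-http://localhost:9200}"

# Build the REST endpoint for a named repository.
# $1 = repository name
snapshot_repo_url() {
  printf '%s/_snapshot/%s' "$ES_URL" "$1"
}

# Register a filesystem repository with compression enabled.
# $1 = repository name, $2 = location under a path.repo directory
create_fs_repo() {
  curl -sS -XPUT "$(snapshot_repo_url "$1")" \
    -H 'Content-Type: application/json' \
    -d "{\"type\": \"fs\", \"settings\": {\"location\": \"$2\", \"compress\": true}}"
}
```

Calling create_fs_repo my_repository /tmp/my_repository sends the same request as the curl command shown in this recipe.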
To retrieve the repository information, the HTTP method is GET and the curl command is:
curl -XGET 'http://localhost:9200/_snapshot/my_repository'
The result will be:
{ "my_repository" : { "type" : "fs", "settings" : { "compress" : "true", "location" : "/tmp/my_repository" } } }
To delete a repository, the HTTP method is DELETE and the curl command is:
curl -XDELETE 'http://localhost:9200/_snapshot/my_repository'
The result will be as follows:
{"acknowledged":true}
Before starting to take a snapshot of our data, we must create a repository: a place where we store our backup data. The parameters that can be used to create a repository are:
type: This defines the repository type (generally fs, the shared filesystem repository).
settings: These are the options used to set up the repository.
In the case of fs type usage, the settings are as follows:
location: This is the location on the filesystem to store snapshots.
compress: This turns on compression for the snapshot files. Compression is applied only to metadata files (index mapping and settings); data files are not compressed (default true).
chunk_size: This defines the size of the chunks that files are split into during snapshotting. The chunk size can be specified in bytes or by using size value notation (for example, 1g, 10m, 5k); by default, chunking is disabled.
max_restore_bytes_per_sec: This throttles the per-node restore rate (default 20mb).
max_snapshot_bytes_per_sec: This throttles the per-node snapshot rate (default 20mb).
readonly: This flag defines the repository as read-only (default false).
It is possible to return all the defined repositories by executing GET without giving the repository name:
curl -XGET 'http://localhost:9200/_snapshot'
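To make the fs settings described above more concrete, here is a minimal sketch that prints a repository definition with explicit throttling values. fs_repo_body is a hypothetical helper, and the rates passed to it are illustrative only; the function fills in the defaults quoted in this recipe when a rate is omitted.

```shell
# Hypothetical helper: print a JSON body for an fs repository.
# $1 = location, $2 = snapshot rate (optional), $3 = restore rate (optional)
fs_repo_body() {
  location="$1"
  snap_rate="${2:-20mb}"      # default quoted in this recipe
  restore_rate="${3:-20mb}"   # default quoted in this recipe
  printf '{"type":"fs","settings":{"location":"%s","compress":true,' "$location"
  printf '"max_snapshot_bytes_per_sec":"%s","max_restore_bytes_per_sec":"%s"}}\n' \
    "$snap_rate" "$restore_rate"
}
```

The output can be passed to curl as the request body, for example with -d "$(fs_repo_body /tmp/my_repository 10mb 40mb)" on the PUT call that creates the repository.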
The most common type for a repository backend is filesystem (fs), but other official repository backends are also available.
When a repository is created, it's immediately verified on all data nodes to be sure that it's functional.
Elasticsearch also provides a manual way to verify the repository status on the nodes, which is very useful for checking the status of cloud repository storage. The command to manually verify a repository is the following:
curl -XPOST 'http://localhost:9200/_snapshot/my_repository/_verify'
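In a script, the verification reply can be turned into an exit status. The sketch below assumes that a successful reply contains a "nodes" object and that a failed one contains an "error" object; that response shape is an assumption to check against your Elasticsearch version, and verify_succeeded is a hypothetical helper that only matches text rather than parsing the JSON.

```shell
# Hypothetical helper: read a _verify reply from stdin and succeed
# only when it reports nodes and no error (plain text matching).
verify_succeeded() {
  response="$(cat)"
  case "$response" in
    *'"error"'*) return 1 ;;
    *'"nodes"'*) return 0 ;;
    *) return 1 ;;
  esac
}
```

It can be combined with the command above by piping the output of curl -s into verify_succeeded and testing the exit status.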