Managing repositories

Elasticsearch provides a built-in system to rapidly back up and restore your data. When working with live data, keeping a backup is complex because of the many concurrency problems involved.

An Elasticsearch snapshot lets you back up individual indices (or aliases), or an entire cluster, to a remote repository.

Before taking a snapshot, you must create a repository; this is where your backups/snapshots will be stored.

Getting ready

You need an up-and-running Elasticsearch installation, as described in the Downloading and installing Elasticsearch recipe in Chapter 2, Downloading and Setup.

To execute curl via the command line, you need to install curl for your operating system.

We need to edit config/elasticsearch.yml and add the directory of your backup repository:

path.repo: /tmp/

For our examples, we'll be using the /tmp directory, which is available on every Unix system. In a production cluster, this directory should generally be on a shared filesystem that every node can access.
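
The location of an fs repository has to sit inside one of the directories listed in path.repo. As a rough illustration, the shell sketch below checks that relationship before a repository is registered. The function name is_under_repo_path is hypothetical and not part of Elasticsearch, and the check is purely textual: it does not resolve symlinks or `..` segments.

```shell
# Hypothetical helper (not part of Elasticsearch): succeeds when the
# repository location $1 is the path.repo directory $2 or lies below it.
is_under_repo_path() {
  location=$1
  root=${2%/}            # drop one trailing slash, so /tmp/ becomes /tmp
  case "$location" in
    "$root"|"$root"/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Values taken from this recipe.
if is_under_repo_path /tmp/my_repository /tmp/; then
  echo "location is inside path.repo"
else
  echo "location is outside path.repo"
fi
```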

How to do it...

To manage a repository, we will perform the following steps:

To create a repository called my_repository, the HTTP method is PUT and the curl command is:

        curl -XPUT 'http://localhost:9200/_snapshot/my_repository' -d '{
            "type": "fs",
            "settings": {
                "location": "/tmp/my_repository",
                "compress": true
            }
        }'

The result will be:

               {"acknowledged":true}

If you check your filesystem, you will see that the /tmp/my_repository directory has been created.

To retrieve repository information, the HTTP method is GET and the curl command is:

        curl -XGET 'http://localhost:9200/_snapshot/my_repository'

The result will be:

            { 
              "my_repository" : { 
                "type" : "fs", 
                "settings" : { 
                  "compress" : "true", 
                  "location" : "/tmp/my_repository" 
                } 
              } 
            }

To delete a repository, the HTTP method is DELETE and the curl command is:

        curl -XDELETE 'http://localhost:9200/_snapshot/my_repository'

The result will be as follows:

             {"acknowledged":true}

How it works...

Before starting to take a snapshot of our data, we must create a repository: a place where we store our backup data. The parameters that can be used to create a repository are:

type: This defines the type of repository (generally fs, for a shared filesystem repository)
settings: These are the options used to set up a repository of the chosen type

In the case of fs type usage, the settings are as follows:

location: This is the location on the filesystem to store snapshots.
compress: This turns on compression for the snapshot files (default true). Compression is applied only to metadata files (index mapping and settings); data files are not compressed.
chunk_size: This defines the size of the chunks into which files are split during snapshotting. The chunk size can be specified in bytes or by using size value notation (for example, 1g, 10m, 5k). By default, chunking is disabled.
max_restore_bytes_per_sec: This throttles the per-node restore rate (default 40mb per second in Elasticsearch 5.x).
max_snapshot_bytes_per_sec: This throttles the per-node snapshot rate (default 40mb per second in Elasticsearch 5.x).
readonly: This flag defines the repository as read-only (default false).
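
Putting these settings together, a repository definition might look like the following sketch. The values are illustrative rather than recommended, and ES_HOST, REPO_BODY, and create_repository are names chosen for this example only. The Content-Type header is included on the assumption that your version may need it (Elasticsearch 6.0 and later reject request bodies sent without it). Nothing is sent until create_repository is called.

```shell
# Assumed address of a local node; adjust for your cluster.
ES_HOST='http://localhost:9200'

# fs repository definition using the optional settings described above.
# The sizes and rates are illustrative values, not recommendations.
REPO_BODY='{
  "type": "fs",
  "settings": {
    "location": "/tmp/my_repository",
    "compress": true,
    "chunk_size": "10m",
    "max_snapshot_bytes_per_sec": "50mb",
    "max_restore_bytes_per_sec": "50mb",
    "readonly": false
  }
}'

# Registers the repository named by the first argument.
create_repository() {
  curl -XPUT "$ES_HOST/_snapshot/$1" \
       -H 'Content-Type: application/json' \
       -d "$REPO_BODY"
}

# Usage (requires a running node):
#   create_repository my_repository
```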

It is possible to return all the defined repositories by executing GET without giving the repository name:
```
        curl -XGET 'http://localhost:9200/_snapshot'
```
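
The same endpoint also accepts _all, a comma-separated list of repository names, or wildcards. The sketch below only builds such URLs; repos_url is a hypothetical helper, and the repository names are examples.

```shell
# Assumed address of a local node; adjust for your cluster.
ES_HOST='http://localhost:9200'

# Hypothetical helper: joins its arguments with commas and prints the
# matching _snapshot URL. The body runs in a subshell so that changing
# IFS does not leak into the calling shell.
repos_url() (
  IFS=','
  printf '%s\n' "$ES_HOST/_snapshot/$*"
)

# Usage (requires a running node); quote the URL so the shell does not
# expand the wildcard:
#   curl -XGET "$(repos_url _all)"
#   curl -XGET "$(repos_url my_repository 'backup*')"
```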

There's more...

The most common type for a repository backend is filesystem (fs), but there are other official repository backends, such as:

S3 repository: https://www.elastic.co/guide/en/elasticsearch/plugins/5.0/repository-s3.html
HDFS: https://www.elastic.co/guide/en/elasticsearch/plugins/5.0/repository-hdfs.html for Hadoop environments
Azure Cloud: https://www.elastic.co/guide/en/elasticsearch/plugins/5.0/repository-azure.html for Azure storage repositories
Google Cloud: https://www.elastic.co/guide/en/elasticsearch/plugins/5.0/repository-gcs.html for Google Cloud storage repositories

When a repository is created, it's immediately verified on all master and data nodes to be sure that it's functional.
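
If this check is not wanted at registration time, for example because some nodes have not mounted the shared location yet, the verify=false query parameter skips it. A minimal sketch, in which register_url is a hypothetical helper that only builds the URL:

```shell
# Assumed address of a local node; adjust for your cluster.
ES_HOST='http://localhost:9200'

# Hypothetical helper: prints the registration URL for repository $1.
# A second argument of "skip" appends verify=false.
register_url() {
  if [ "${2:-}" = skip ]; then
    printf '%s\n' "$ES_HOST/_snapshot/$1?verify=false"
  else
    printf '%s\n' "$ES_HOST/_snapshot/$1"
  fi
}

# Usage (requires a running node):
#   curl -XPUT "$(register_url my_repository skip)" \
#        -H 'Content-Type: application/json' \
#        -d '{"type": "fs", "settings": {"location": "/tmp/my_repository"}}'
```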

Elasticsearch also provides a way to verify a repository manually, which is very useful for checking the status of cloud repository storage. The command to manually verify a repository is the following:

    curl -XPOST 'http://localhost:9200/_snapshot/my_repository/_verify'
