Managing repositories

Elasticsearch provides a built-in system to rapidly back up and restore your data. When working with live data, keeping a backup is complex because of the large number of concurrency problems involved.

The Elasticsearch snapshot facility allows you to store snapshots of individual indices (or aliases), or of an entire cluster, in a remote repository.

Before you can take a snapshot, a repository must be created; this is where your backups/snapshots will be stored.

Getting ready

You need an up-and-running Elasticsearch installation as we described in the Downloading and installing Elasticsearch recipe in Chapter 2, Downloading and Setup.

To execute curl via the command line you need to install curl for your operating system.

We need to edit config/elasticsearch.yml and add the directory of your backup repository:

path.repo: /tmp/

For our examples, we'll be using the /tmp directory, which is available on every Unix system. In a production cluster, this directory should generally be on a shared filesystem that every node can access.
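
Before creating the repository, it can be useful to confirm that the configuration change is in place. The following is a minimal sketch under our own assumptions: check_repo_path is a hypothetical helper (not an Elasticsearch tool), and it only recognizes the flat path.repo: form used in this recipe. Remember that the node must be restarted after editing elasticsearch.yml for the new value to be picked up.

```shell
# check_repo_path CONFIG_FILE DIR
# Hypothetical helper: checks that the config file declares path.repo
# (flat "path.repo:" form only) and that DIR is a writable directory.
check_repo_path() {
  grep -q '^path\.repo:' "$1" || { echo "path.repo is not set in $1"; return 1; }
  [ -d "$2" ] && [ -w "$2" ] || { echo "$2 is not a writable directory"; return 1; }
  echo "ok"
}
```

For the setup in this recipe, it would be called as check_repo_path config/elasticsearch.yml /tmp/.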

How to do it...

To manage a repository, we will perform the following steps:

  1. To create a repository called my_repository, the HTTP method is PUT and the curl command is:
            curl -XPUT 'http://localhost:9200/_snapshot/my_repository' \
                 -H 'Content-Type: application/json' -d '{
                "type": "fs",
                "settings": {
                    "location": "/tmp/my_repository",
                    "compress": true
                }
            }'
    

    The result will be:

                   {"acknowledged":true}
    

    If you check your filesystem, you will see that the /tmp/my_repository directory has been created.

  2. To retrieve repository information, the HTTP method is GET and the curl command is:
            curl -XGET 'http://localhost:9200/_snapshot/my_repository'
    

    The result will be:

                { 
                  "my_repository" : { 
                    "type" : "fs", 
                    "settings" : { 
                      "compress" : "true", 
                      "location" : "/tmp/my_repository" 
                    } 
                  } 
                } 
    
  3. To delete a repository, the HTTP method is DELETE and the curl command is:
            curl -XDELETE 'http://localhost:9200/_snapshot/my_repository'
    

    The result will be as follows:

                 {"acknowledged":true} 
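
The three steps above can be wrapped in small shell functions so that they are easy to repeat. The following is a minimal sketch rather than an official client: the ES_URL variable and the function names (fs_repo_body, create_repo, get_repo, and delete_repo) are our own assumptions. The Content-Type header is included because recent Elasticsearch versions reject requests with a body that lack it.

```shell
# Base URL of the cluster; ES_URL is an illustrative variable name.
ES_URL="${ES_URL:-http://localhost:9200}"

# Build the JSON body for a shared filesystem (fs) repository.
# $1 = location on disk (must live under a path.repo directory)
fs_repo_body() {
  printf '{"type":"fs","settings":{"location":"%s","compress":true}}' "$1"
}

# PUT /_snapshot/<name>: create (or update) the repository.
# $1 = repository name, $2 = location
create_repo() {
  curl -s -XPUT "$ES_URL/_snapshot/$1" \
    -H 'Content-Type: application/json' \
    -d "$(fs_repo_body "$2")"
}

# GET /_snapshot/<name>: show the repository definition.
get_repo() {
  curl -s -XGET "$ES_URL/_snapshot/$1"
}

# DELETE /_snapshot/<name>: unregister the repository.
delete_repo() {
  curl -s -XDELETE "$ES_URL/_snapshot/$1"
}
```

Calling create_repo my_repository /tmp/my_repository followed by get_repo my_repository reproduces steps 1 and 2. Note that deleting a repository only unregisters it; the snapshot files already written to the location are left in place.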
    

How it works...

Before starting to take a snapshot of our data, we must create a repository: a place where we store our backup data. The parameters that can be used to create a repository are:

  • type: This defines the type of repository (generally fs, the shared filesystem repository)
  • settings: The options used to set up the repository; they depend on the repository type

In the case of fs type usage, the settings are as follows:

  • location: This is the location on the filesystem to store snapshots.
  • compress: This turns on compression for the snapshot files (default true). Compression is applied only to metadata files (index mappings and settings); data files are not compressed.
  • chunk_size: This defines the size of the chunks into which files are split during snapshotting. The chunk size can be specified in bytes or by using size value notation (for example, 1g, 10m, 5k). By default, chunking is disabled.
  • max_restore_bytes_per_sec: This controls the per-node restore throttle rate (default 20mb per second).
  • max_snapshot_bytes_per_sec: This controls the per-node snapshot throttle rate (default 20mb per second).
  • readonly: This flag defines the repository as read-only (default false).
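
To see how the settings listed above fit together, here is a hedged sketch of a repository definition that sets all of them explicitly. The repository name my_tuned_repository, its location, the chosen values, and the two function names are illustrative assumptions, not recommendations.

```shell
# Base URL of the cluster; ES_URL is an illustrative variable name.
ES_URL="${ES_URL:-http://localhost:9200}"

# Print an fs repository definition with every optional setting spelled out.
# All values here are examples only.
tuned_repo_body() {
  cat <<'EOF'
{
  "type": "fs",
  "settings": {
    "location": "/tmp/my_tuned_repository",
    "compress": true,
    "chunk_size": "1gb",
    "max_snapshot_bytes_per_sec": "10mb",
    "max_restore_bytes_per_sec": "10mb",
    "readonly": false
  }
}
EOF
}

# Send the definition to the cluster; the body is read from stdin (@-).
create_tuned_repo() {
  tuned_repo_body | curl -s -XPUT "$ES_URL/_snapshot/my_tuned_repository" \
    -H 'Content-Type: application/json' --data-binary @-
}
```

Lowering the two throttle values trades longer snapshot and restore times for less I/O pressure on each node.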

To list all the defined repositories, execute GET without giving a repository name:

            curl -XGET 'http://localhost:9200/_snapshot'

There's more...

The most common type of repository backend is the filesystem (fs), but other official repository backends are also available.

When a repository is created, it's immediately verified on all data nodes to be sure that it's functional.

Elasticsearch also provides a way to manually verify that a repository is working on the nodes, which is very useful for checking the status of cloud repository storage. The command to manually verify a repository is as follows:

    curl -XPOST 'http://localhost:9200/_snapshot/my_repository/_verify'
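
In a script, it is often enough to look at the HTTP status code of the verification call. The following sketch assumes that a 200 status means the repository was verified and a 404 status means no repository with that name is registered; verdict_for_status and verify_repo are hypothetical helper names of our own.

```shell
# Base URL of the cluster; ES_URL is an illustrative variable name.
ES_URL="${ES_URL:-http://localhost:9200}"

# Map an HTTP status code to a short, human-readable verdict.
verdict_for_status() {
  case "$1" in
    200) echo "verified" ;;
    404) echo "repository not found" ;;
    *)   echo "verification failed (HTTP $1)" ;;
  esac
}

# POST /_snapshot/<name>/_verify and report the outcome.
# -o /dev/null discards the body; -w prints only the status code.
verify_repo() {
  status=$(curl -s -o /dev/null -w '%{http_code}' \
    -XPOST "$ES_URL/_snapshot/$1/_verify")
  verdict_for_status "$status"
}
```

Calling verify_repo my_repository prints a single line describing the outcome, which makes the check easy to use from a cron job or a monitoring script.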

See also
