Learning command-line interfaces

So far, we have discussed using web-based applications to view the current status and manage things such as downtimes or comments.

There are also multiple tools that let us perform the same operations from the command line in a convenient way.

Using nagios_commander

One tool that provides an easy way to manage Nagios and view its data from the command line is nagios_commander.

This is a shell script that communicates with Nagios through its web interface, using HTTP-based authentication. Since it communicates over the network, the script can be run on any machine, not only on the machine where Nagios is running. It can also be used to manage multiple Nagios instances from a single machine.

All that is needed is the curl command on your machine. On Ubuntu-based distributions, the following command installs it together with the bsdmainutils package:

root@ubuntu:~# apt-get -y install bsdmainutils curl

For CentOS, RHEL, and Oracle Linux, the command is:

[root@centos ~]# yum install -y curl

Next, all that we have to do is download the nagios_commander script using the following commands:

root@ubuntu:~# curl -sSL https://raw.github.com/brandoconnor/nagios_commander/master/nagios_commander.sh >/usr/local/bin/nagios_commander
root@ubuntu:~# chmod 0755 /usr/local/bin/nagios_commander

After that, nagios_commander is ready to use.

The command takes the URL to the Nagios web interface and the username and password from the command line using the -n, -u and -p arguments, respectively:

# nagios_commander -n 127.0.0.1/nagios -u nagiosadmin -p nagiosadmin -q list -h
Hostname        Status
linuxbox01      UP
localhost       UP

The preceding command lists all hosts on our Nagios instance and prints their status. The -q list -h options request a list of hosts; they are described in more detail later in this section.

It is also a good idea to create an alias or a helper script so that the location, username, and password do not have to be passed on each invocation:

# alias ncmd='nagios_commander -n 127.0.0.1/nagios -u nagiosadmin -p nagiosadmin'

To be able to use it in all shells and not just the current one, the alias can be put in shell initialization scripts, such as .bash_aliases, in your home directory if you are using the bash shell.

This way we can simply call:

# ncmd -q list -h

This should return the same result as the original command we invoked earlier.
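An alias hard-codes a single instance. Since nagios_commander can manage several Nagios instances from one machine, one alternative is a small shell function that reads the connection details from environment variables. The following is a minimal sketch and not part of nagios_commander: the NCMD_URL, NCMD_USER, and NCMD_PASS variable names are assumptions made up for this example.

```shell
# Hypothetical replacement for the ncmd alias. Remove the alias first
# (unalias ncmd) if it is already defined in the current shell.
# NCMD_URL, NCMD_USER and NCMD_PASS are illustrative names only;
# nagios_commander itself does not know about them.
ncmd() {
    if [ -z "${NCMD_URL:-}" ] || [ -z "${NCMD_USER:-}" ] || [ -z "${NCMD_PASS:-}" ]; then
        echo "ncmd: set NCMD_URL, NCMD_USER and NCMD_PASS first" >&2
        return 1
    fi
    # Pass the connection details, followed by whatever arguments ncmd received.
    nagios_commander -n "$NCMD_URL" -u "$NCMD_USER" -p "$NCMD_PASS" "$@"
}
```

After exporting the three variables, ncmd -q list -h behaves like the alias, and pointing NCMD_URL at another instance switches the target without redefining anything.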

nagios_commander allows specifying the context in which a command is run. If no context is specified (or the -h option is given an empty value), the context is global.

It is possible to run commands for specific hosts and hostgroups using the -h and -H options, respectively. The first one specifies that a specific host should be used. The -H option allows querying a specific hostgroup. For example, the -q list -h localhost command indicates that services for the host localhost should be shown.

# ncmd -q list -h localhost
Fetching services and health on localhost
---
Service          State
---
Current+Load     OK
Current+Users    OK
HTTP             OK
PING             OK
Root+Partition   OK
SSH              OK
Swap+Usage       OK
Total+Processes  OK

Similarly, the -H option can be used to list the status of all hosts inside a hostgroup:

# ncmd -q list -H linux-servers
Hostname        Status
localhost  UP
linuxbox01  UP
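Because both listings are plain two-column text, they can be filtered with standard tools. The sketch below prints only the rows that are not healthy. It assumes the column layout shown in the preceding examples and that unhealthy entries appear with another state word (such as DOWN or CRITICAL) in the second column; the function name unhealthy is made up for this example.

```shell
# Print "name: state" for every row whose state is not UP (hosts) or OK (services).
# Header, separator and "Fetching ..." lines are skipped because they either
# do not have exactly two fields or start with a known header word.
unhealthy() {
    awk 'NF == 2 && $1 != "Hostname" && $1 != "Service" &&
         $2 != "UP" && $2 != "OK" { print $1 ": " $2 }'
}
```

For example, ncmd -q list -H linux-servers | unhealthy prints nothing when every host in the hostgroup is up.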

The -s option allows specifying the services to run the query or command against. Similarly, the -S option can be used to run a command against a service group. These are only used when running commands to manage and/or acknowledge downtimes.

The -q option allows us to query information from Nagios. The following table shows the available query types:

Command              Contexts      Description
list                 global, host  Lists all hosts or services, depending on the context
host_downtime        host          Lists host downtimes for all hosts or a specific host/hostgroup
service_downtime     service       Lists all service downtimes for all hosts, a specific host/hostgroup, or a specific service/service group only
notifications        global        Shows whether notification sending is enabled
event_handlers       global        Shows whether running event handlers is enabled
active_svc_checks    global        Shows whether performing active service checks is enabled
active_host_checks   global        Shows whether performing active host checks is enabled
passive_svc_checks   global        Shows whether accepting passive service check results is enabled
passive_host_checks  global        Shows whether accepting passive host check results is enabled
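The six global query types can be run in one pass to get an overview of the instance. This is a minimal sketch: the function name show_global_settings is made up for this example, and it assumes ncmd is available as a shell function or a script on the PATH, since aliases are not expanded inside functions in non-interactive shells.

```shell
# Query each global on/off setting in turn and label the output.
# Assumes "ncmd" wraps nagios_commander with the connection options.
show_global_settings() {
    for setting in notifications event_handlers \
                   active_svc_checks active_host_checks \
                   passive_svc_checks passive_host_checks; do
        printf '== %s ==\n' "$setting"
        ncmd -q "$setting"
    done
}
```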

Event handlers and notifications are described in more detail in Chapter 8, Notifications and Events. The concept of passive checks is explained in more detail in Chapter 9, Passive Checks and NRDP.

The -c option allows us to change Nagios settings and/or manage host and service downtimes from the command line. The first argument is the action to perform and the second argument is the scope. The flag also takes a third argument when the Nagios settings are to be changed.

To change a Nagios setting, the action must be the word set, and the scope should be one of notifications, event_handlers, active_svc_checks, active_host_checks, passive_svc_checks, or passive_host_checks. The third argument should be either enable or disable. For example, to disable or enable sending notifications, we can run:

# ncmd -c set notifications disable
# ncmd -c set notifications enable
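A common use of this pair of commands is to silence notifications only for the duration of some maintenance task. The following sketch wraps an arbitrary command between the two calls. The function name with_notifications_disabled is made up for this example, and it assumes ncmd is a shell function or a script on the PATH rather than an alias.

```shell
# Run a command with Nagios notifications switched off, then switch them
# back on regardless of whether the command succeeded.
with_notifications_disabled() {
    # Give up early if notifications could not be disabled.
    ncmd -c set notifications disable || return 1
    if "$@"; then
        wnd_status=0
    else
        wnd_status=$?
    fi
    ncmd -c set notifications enable
    # Report the wrapped command's exit status to the caller.
    return "$wnd_status"
}
```

It would be called as with_notifications_disabled followed by the maintenance command and its arguments.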

Another possibility is to manage downtimes. In this case, the action should either be set, del, or ack to add a downtime, delete it, or acknowledge a problem, respectively. The -h, -H, -s, and -S options can be used to specify the host, hostgroup, and service or service group the downtime is related to.

When adding a downtime or acknowledging a problem, a comment and the planned duration of the downtime must also be specified. The -C option is used to specify the comment, and the -t option specifies the duration in minutes.

For example, to add a downtime for two hours for the localhost host, we can use:

# ncmd -c set downtime -C "Planned downtime" -t 120 -h localhost

We can then check the downtime by running the following command:

# ncmd -q host_downtime

The output of the preceding command will be as follows:

Hostname  Downtime-id   End_date_and_time     Author       Comment
localhost      1       2-14-2016 20:49:46  Nagios Admin    Planned
                                                           downtime

The downtime id is the unique identifier of a downtime. In order to delete a downtime, we need to know its id and delete it using the del action:

# ncmd -c del downtime -d 1

This will delete a downtime with id 1.
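Looking up the id by eye and then deleting it can be combined into one step. The sketch below extracts the ids for a given host from the host_downtime listing and deletes each of them. It assumes the column layout shown above (hostname first, numeric id second); the function names are made up for this example, and ncmd is assumed to be a shell function or a script on the PATH.

```shell
# Print the downtime ids listed for one host.
# Reads "ncmd -q host_downtime" output on standard input.
downtime_ids_for_host() {
    awk -v host="$1" '$1 == host && $2 ~ /^[0-9]+$/ { print $2 }'
}

# Delete every downtime currently listed for the given host.
delete_host_downtimes() {
    ncmd -q host_downtime | downtime_ids_for_host "$1" |
    while read -r id; do
        # Redirect stdin so the command cannot consume the remaining ids.
        ncmd -c del downtime -d "$id" < /dev/null
    done
}
```

For example, delete_host_downtimes localhost removes all downtimes listed for localhost.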

The ack action can be invoked in order to acknowledge a problem. The command itself does not require any additional argument; the only required flag is -C to indicate the comment for acknowledgement, as shown here:

# ncmd -c ack -h localhost -s SSH -C "SSH upgrade in progress, will be up soon"

This will add a new acknowledgement for service SSH on localhost.

Interacting with nagios-cli

Another command-line-based tool is nagios-cli, which provides a shell-like interface for Nagios. This is an open source project; its homepage is http://nagios-cli.maze.io/ and its source code is on GitHub at https://github.com/tehmaze/nagios-cli. This tool reads the Nagios status file and sends commands using the Nagios pipe. It has to be run on the same machine or container where the Nagios service is running.

To install nagios-cli, we first need to install the prerequisites: Python, the pip tool for installing Python packages, the readline library, and the development packages for those, as well as Git to retrieve nagios-cli itself.

On Debian and Ubuntu, the command to install the prerequisites is:

root@ubuntu:~# apt-get -y install patch python python-pip libpython-dev libncurses-dev libreadline-dev git

For CentOS, RHEL, and Oracle Linux, the command is:

[root@centos ~]# yum install -y patch python python-devel python-pip git readline-devel

Installing nagios-cli also requires some of the prerequisites for building Nagios. If the machine where nagios-cli will be run does not have them, it is recommended that you install them as well. The dependencies for different Linux distributions are described in more detail in Chapter 2, Installing Nagios 4.

The next step is to install the readline Python package by running pip:

# pip install readline

After that, we can retrieve the nagios-cli source package by running the following command:

# git clone https://github.com/tehmaze/nagios-cli.git

This will retrieve the latest version of source code in a new directory called nagios-cli. We now need to install it by running:

# cd nagios-cli ; python setup.py install

This will install the nagios-cli binary into the /usr/local/bin directory. Next, we need to create a configuration file in /etc/nagios/nagios-cli.cfg with the following contents:

[nagios] 
log                     = /var/nagios 
command_file            = %(log)s/rw/nagios.cmd 
log_file                = %(log)s/nagios.log 
object_cache_file       = %(log)s/objects.cache 
status_file             = %(log)s/status.dat 

This tells nagios-cli where the Nagios data is kept. Next, we can run the tool using the following command:

# nagios-cli -c /etc/nagios/nagios-cli.cfg

This will start the interactive shell. Like any other Unix shell, it accepts commands and arguments separated by spaces. It also supports tab-based completion of arguments, such as for the host and service commands, where it will complete host and service names, respectively.
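If the shell does not behave as expected, one thing worth checking is whether the paths in the configuration file match the local installation. The sketch below tests the four files from the sample configuration. The function name check_nagios_paths is made up for this example, and the default base directory /var/nagios is taken from the sample configuration, so it may differ on your system.

```shell
# Check that the files nagios-cli is pointed at exist under a base directory.
# The relative paths mirror the sample nagios-cli.cfg shown earlier.
check_nagios_paths() {
    base=${1:-/var/nagios}
    rc=0
    # The command file is a named pipe created by the running Nagios daemon.
    if [ ! -p "$base/rw/nagios.cmd" ]; then
        echo "missing pipe: $base/rw/nagios.cmd" >&2
        rc=1
    fi
    for f in nagios.log objects.cache status.dat; do
        if [ ! -r "$base/$f" ]; then
            echo "not readable: $base/$f" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

Running check_nagios_paths without arguments checks the default location and returns a non-zero status if anything is missing.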

The tool also provides the help command, which lists all currently available commands. For example:

nagios > help
Global commands:
..         EOF        about      configure  exit       help       host    
license    quit       tail
Local commands:
list       ls

We can now issue the ls or list command to list hosts. For example:

nagios > ls
linuxbox01  localhost 

This will list all the hosts currently configured in Nagios. In this example, this includes linuxbox01 and localhost.

To change the context to a specific host, call the host command with the name of a host. In the host context, the ls or list command lists all services of that host, and the service command changes the context to a specific service on it. The .. and EOF commands go back to the previous context.

For example:

nagios > host localhost
nagios (host) localhost> ls
Current-Load     Current-Users    HTTP             PING
Root-Partition   SSH              Swap-Usage       Total-Processes
nagios (host) localhost> service SSH
nagios (host) localhost  SSH> ..
nagios (host) localhost > ..
nagios >

When in the context of a host or service, the status command will report information about the current host and/or service, as shown here:

nagios (host) localhost> status
host name           : localhost
current state       :  OK
plugin output       : PING OK - Packet loss = 0%, RTA = 0.10 ms
(...)
service             : SSH                   OK
service             : Swap Usage            OK
service             : Total Processes       OK
nagios (host) localhost> service SSH
nagios (host) localhost  SSH> status
host name           : localhost
service description : SSH
current state       :  OK
(...)

The preceding example shows only partial output from the status commands.

The check and acknowledge commands can be used to schedule a check and to acknowledge a problem for the current host or service. For example:

nagios (host) localhost  SSH> check
Service check scheduled
nagios (host) localhost  SSH> acknowledge
comment        : Reinstalling service
sticky     [Yn]: n
notify     [Yn]: n
persistent [Yn]: n
Service problem acknowledged
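As mentioned earlier, nagios-cli sends its commands through the Nagios command pipe. The same pipe accepts external commands written by hand, each in the form of a timestamp in square brackets followed by the command name and semicolon-separated arguments. The sketch below builds such a line for a forced service check using the standard SCHEDULE_FORCED_SVC_CHECK external command. Whether the check command in nagios-cli issues exactly this external command is an assumption, and the function name forced_svc_check_line is made up for this example.

```shell
# Build a Nagios external command line that forces a service check.
# $1 = host name, $2 = service description,
# $3 = Unix timestamp for the check (defaults to the current time).
forced_svc_check_line() {
    ts=${3:-$(date +%s)}
    printf '[%s] SCHEDULE_FORCED_SVC_CHECK;%s;%s;%s\n' "$ts" "$1" "$2" "$ts"
}

# Usage (needs write access to the command pipe named in nagios-cli.cfg):
#   forced_svc_check_line localhost SSH > /var/nagios/rw/nagios.cmd
```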

The following table shows key commands available in nagios-cli:

Command      Contexts         Description
help         Always           Provides a list of commands valid in the current scope
host         Always           Changes the context to a specific host
service      Host             Changes the context to a specific service in the current host
..           Host or service  Returns to the preceding context, that is, from the service context to the host context or from host to global
EOF          Host or service  Returns to the preceding context, that is, from the service context to the host context or from host to global
status       Host or service  Prints the detailed status for a host or service
check        Host or service  Forces a check to be made for a host or service
acknowledge  Host or service  Acknowledges a problem for a host or service
