Chapter 16

System-Monitoring Tools

To keep your system in optimum shape, you need to be able to monitor it closely. This is imperative in a corporate environment, where uptime is vital and any system failures and downtime can be quite expensive. Whether you are checking for errant daemon processes or keeping a close eye on CPU and memory usage, Ubuntu provides a wealth of utilities designed to give you as little or as much feedback as you want. This chapter looks at some of the basic monitoring tools, along with some tactics designed to keep your system up longer. Some of the monitoring tools cover network connectivity, memory, and hard drive usage, and in this chapter you learn how to manipulate active system processes using a mixture of graphical and command-line tools.

Console-Based Monitoring

Those familiar with UNIX system administration already know about the ps, or process status, command commonly found on most flavors of UNIX. Because of the close relationship between Linux and UNIX, Ubuntu also includes this command, which enables you to see the current processes running on the system, who owns them, and how resource-hungry they are.

Although the Linux kernel has its own distinct architecture and memory management, it also benefits from enhanced use of the /proc file system, the virtual file system found on many UNIX flavors. Through the /proc file system, you can communicate directly with the kernel to get a deep view of what is currently happening. Developers tend to use the /proc file system as a way of extracting information from the kernel and for their programs to manipulate that information into human-readable formats. A full discussion of the /proc file system is beyond the scope of this book. To get a better idea of what it contains, you can take a look at https://en.tldp.org/LDP/Linux-Filesystem-Hierarchy/html/proc.html, which provides an excellent and in-depth guide.
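For a quick taste of what /proc offers, you can read kernel data with ordinary file tools. For example, /proc/loadavg holds the current load averages (the numbers shown here are only illustrative):

matthew@seymour:~$ cat /proc/loadavg
0.16 0.25 0.31 1/293 9657

The first three fields are the 1-, 5-, and 15-minute load averages, the fourth shows how many processes are currently runnable out of the total, and the last is the most recently assigned PID.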

Processes can also be controlled at the command line, which is important because you might sometimes have only a command-line interface. Whenever an application or a command is launched, either from the command line or a clicked icon, the kernel assigns the new process an identification number called a process ID (PID). This number is shown in the shell if the program is launched via the command line:

matthew@seymour:~$ gedit &
[1] 9656

In this example, gedit has been launched in the background, and the (bash) shell reported a shell job number ([1] in this case). A job number, or job control, is a shell-specific feature that allows a different form of process control, such as sending programs to the background, suspending them, and retrieving background jobs to the foreground. (See your shell’s man pages for more information if you are not using bash.)

The second number displayed (9656 in this example) represents the PID. You can get a quick list of your processes by using the ps command, like this:

matthew@seymour:~$ ps
  PID TTY          TIME CMD
 9595 pts/0    00:00:00 bash
 9656 pts/0    00:00:00 gedit
 9657 pts/0    00:00:00 ps

As you can see, the output includes the PID along with other information, such as the name of the running program. As with any other UNIX command, many options are available; the ps man page has a full list. One useful option is -e, which lists all processes running on the system. Another is aux (a BSD-style option group used without a hyphen), which provides a more detailed list of all the processes. You should also know that ps works not by polling memory but through the interrogation of the Linux /proc, or process, file system.

The /proc directory contains many files, some of which include constantly updated hardware information (such as battery power levels). Linux administrators often pipe the output of ps through grep to display information about a specific program, like this:

matthew@seymour:~$ ps aux | grep bash
matthew      9595  0.0  0.1  21660  4460 pts/0    Ss   11:39   0:00 bash

This example returns the owner (the user who launched the program) and the PID, along with other information such as the percentage of CPU and memory usage, the size of the command (code, data, and stack), the time (or date) the command was launched, and the name of the command for any process that includes the match bash. Processes can also be queried by PID as follows:

matthew@seymour:~$ ps 9656
  PID TTY      STAT   TIME COMMAND
 9656 pts/0    S      0:00 gedit

You can use the PID to stop a running process by using the shell’s built-in kill command. This command asks the kernel to stop a running process and reclaim system memory. For example, to stop gedit in the preceding example, use the kill command like this:

matthew@seymour:~$ kill 9656

After you press Enter to run the command and then press Enter again at the prompt, the shell reports the following:

[1]+  Terminated              gedit

Note that users can kill only their own processes, but root can kill all users’ processes. Controlling any other running process requires root permission, which you should use judiciously (especially when forcing a kill by using the -9 option); by inadvertently killing the wrong process through a typo in the command, you could bring down an active system.

Using the kill Command to Control Processes

The kill command is a basic UNIX system command. You can communicate with a running process by entering a command into its interface, such as when you type into a text editor. But some processes (usually system processes rather than application processes) run without such an interface, and you need a way to communicate with them as well, so you use a system of signals. The kill system accomplishes that by sending a signal to a process, and you can use it to communicate with any process. The general format of the kill command is as follows:

matthew@seymour:~$ kill option PID

Note that if you are using kill on a process you do not own, you need to have super user privileges and preface the kill command with sudo.

A number of signal options can be sent as words or numbers, but most are of interest only to programmers. One of the most common is the one used previously to kill gedit:

matthew@seymour:~$ kill PID

This tells the process with the given PID to stop (you supply the actual PID). Issuing the command without a signal option sends the default, which is kill -15 (more on that later), and gives no guarantee that the process will be killed, because programs can catch, block, or ignore some terminate signals (and this is a good thing, done by design).

The following example includes a signal for kill that cannot be caught (9 is the number of the SIGKILL signal):

matthew@seymour:~$ kill -9 PID

You can use this combination when the plain kill shown previously does not work. Be careful, though. Using this does not allow a process to shut down gracefully, and shutting down gracefully is usually preferred because it closes things that the process might have been using and ensures that things such as logs are written before the process disappears. Instead, try this first:

matthew@seymour:~$ kill -1 PID

This is the signal to “hang up”—stop—and then clean up all associated processes as well (1 is the number of the SIGHUP signal).

In fact, some system administrators and programmers prefer something like this progression of signals:

kill -15—This command sends a SIGTERM, which is a clean shutdown that flushes data that needs to be written to disk, cleans up memory registers, and then terminates the process.

kill -1—As mentioned earlier, this command sends a SIGHUP, which cleans up and usually also causes the program to restart.

kill -2—This command sends a SIGINT, which is an interrupt from the keyboard, the equivalent of pressing Ctrl+C. For example, if you want to stop a program that is running in the background as a daemon instead of in the terminal foreground, this is a good way to do it.

kill -11—This command sends a SIGSEGV, which causes the program to experience a segmentation fault and close. It does not flush data to disk, but it may create a core dump file that could be useful for debugging and learning why the program is misbehaving (or behaving exactly as you told it to behave and not as you intended it to behave).

kill -9—This command sends a SIGKILL, which should be used as a last resort because it does not sync any data. Nothing is written to disk—no logging, no debugging, nothing. You stop the PID (usually, but not always), but you get nothing that helps you either save data that needed to be written to disk or assists you in figuring out what happened.
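Putting this progression into practice, here is a minimal sketch using the gedit PID from earlier in the chapter; after each signal, run ps with the PID to check whether the process is still alive before escalating:

matthew@seymour:~$ kill -15 9656
matthew@seymour:~$ ps 9656
matthew@seymour:~$ kill -1 9656
matthew@seymour:~$ ps 9656
matthew@seymour:~$ kill -9 9656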

As you become proficient at process control and job control, you will learn the utility of a number of kill options. You can find a full list of signal options in the kill man page.

Using killall

The killall command allows you to kill a process by name, as in killall gedit, which would kill any and all gedit processes that are currently running. You can also kill all processes being run by a specific user (assuming that you have authority to do so) with killall -u username. See the killall man page for more options.

Using Priority Scheduling and Control

Two useful applications included with Ubuntu are the nice and renice commands. They are covered in Chapter 12, “Command-Line Master Class, Part 2.” Along with nice, system administrators can also use the time command to get an idea of how much time and what proportion of a system’s resources are required for a task, such as a shell script. (Here, time is used to measure the duration of elapsed time; the command that deals with civil and sidereal time is the date command.) This command is used with the name of another command (or script) as an argument, like this:

matthew@seymour:~$ sudo time -p find / -name conky
/home/matthew/conky
/etc/conky
/usr/lib/conky
/usr/bin/conky
real 30.19
user 1.09
sys 2.77

The output displays the elapsed (real) time from start to finish, along with the user and system CPU time required. Other factors you can query include memory, CPU usage, and file system input/output (I/O) statistics. See the time command’s man page for more details.
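For example, the GNU version of the utility installed at /usr/bin/time (provided by Ubuntu’s time package) accepts a -v (verbose) switch that reports those extra statistics; most lines are trimmed here, and the values shown are only illustrative:

matthew@seymour:~$ /usr/bin/time -v ls > /dev/null
	Command being timed: "ls"
	Maximum resident set size (kbytes): 2816
	File system inputs: 0
	File system outputs: 0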

The top command is covered in Chapter 12. It has some even more powerful cousins that are worth mentioning here.

One option for monitoring resource usage is called htop. It is not installed by default but is available from the Ubuntu software repositories and is worth a minute or two of your consideration when you’re familiar with top. Here are some key differences:

With htop, you can scroll the list vertically and horizontally to see all processes and complete command lines.

With top, you are subject to a delay for each unassigned key you press (which is especially annoying when multikey escape sequences are triggered by accident).

htop starts faster. (top seems to collect data for a while before displaying anything.)

With htop, you don’t need to type the process number to kill a process; with top, you do.

With htop, you don’t need to type the process number or the priority value to renice a process; with top, you do.

htop supports mouse operation; top doesn’t.

top is older and therefore more used and tested.
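If you want to try htop, it installs from the standard Ubuntu repositories and runs directly in a terminal:

matthew@seymour:~$ sudo apt install htop
matthew@seymour:~$ htop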

See https://hisham.hm/htop/ for more details.

Displaying Free and Used Memory with free

Although top includes some memory information, the free utility displays the amount of free and used memory in the system, in kilobytes. (The -m switch causes it to display in megabytes, and -h makes the output human readable.) On one system, the output looks like this:

matthew@seymour:~$ free
             total       used       free     shared    buffers     cached
Mem:       4055680    3327764     727916          0     280944    2097568
-/+ buffers/cache:     949252    3106428
Swap:      8787512          0    8787512

This output describes a machine with 4GB of RAM and a swap partition of 8GB. Note that none of the swap is being used and that the machine is not heavily loaded. Linux is very good at memory management and “grabs” all the memory it can in anticipation of future work.

Tip

A useful trick is to employ the watch command, which repeatedly reruns a command every two seconds by default. If you use the following, you can see the output of the free command updated every two seconds:

matthew@seymour:~$ watch free
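The -n switch changes the interval; for example, the following reruns free every five seconds instead:

matthew@seymour:~$ watch -n 5 free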

Use Ctrl+C to quit.

Another useful system-monitoring tool is vmstat (virtual memory statistics). This command reports on processes, memory, I/O, and CPU, typically providing an average since the last reboot; or you can make it report usage for a current period by telling it the time interval, in seconds, and the number of iterations you desire, like this:

matthew@seymour:~$ vmstat 5 10

This causes vmstat to run every five seconds for 10 iterations.

Use the uptime command to see how long it has been since the last reboot and to get an idea of what the load average has been; higher numbers mean higher loads.
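Its single line of output packs in the current time, the time since the last reboot, the number of logged-in users, and the load averages for the past 1, 5, and 15 minutes; the numbers here are only illustrative:

matthew@seymour:~$ uptime
 11:46:39 up 3 days,  2:44,  2 users,  load average: 0.16, 0.25, 0.31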

Disk Space

Along with system load, it is important to keep an eye on the amount of free hard drive space that your computer has remaining. Doing so is easy, mainly by using the df command, as follows:

matthew@seymour:~$ df

Just using the command alone returns this output:

Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda1             14421344   6584528   7104256  49% /
none                   2020124       348   2019776   1% /dev
none                   2027840      2456   2025384   1% /dev/shm
none                   2027840       220   2027620   1% /var/run
none                   2027840         0   2027840   0% /var/lock
none                   2027840         0   2027840   0% /lib/init/rw
/dev/sda6            284593052 147323812 122812752  55% /home

Here you can see each mounted file system, its size in 1K blocks, the used and available space, the percentage of the disk in use, and its mount point.

Unless you are good at doing math in your head, you might find it difficult to work out exactly what the figures mean in megabytes and gigabytes, so it is recommended that you use the -h switch to make the output human readable, like this:

matthew@seymour:~$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1              14G  6.3G  6.8G  49% /
none                  2.0G  348K  2.0G   1% /dev
none                  2.0G  2.4M  2.0G   1% /dev/shm
none                  2.0G  220K  2.0G   1% /var/run
none                  2.0G     0  2.0G   0% /var/lock
none                  2.0G     0  2.0G   0% /lib/init/rw
/dev/sda6             272G  141G  118G  55% /home

Disk Quotas

Disk quotas enable you to restrict the usage of disk space either by users or by groups. Although rarely—if ever—used on a local or standalone workstation, quotas are definitely a way of life at the enterprise level of computing. Usage limits on disk space not only conserve resources but also provide a measure of operational safety by limiting the amount of disk space any user can consume.

Disk quotas are more fully covered in Chapter 13, “Managing Users.”

Checking Log Files

Many of the services and programs that run on your computer save data in log files. Typical data include success and error messages for processes that are attempted and lists of actions. Some of these log files are extremely technical, whereas others are easily read and parsed by regular users who know what they are looking for. Most log files can be found in /var/log/ or its subdirectories.

Typically, log files are used to learn about something that happened recently, so most admins are interested in the most recent entries. For this, tail is commonly used, because by default it reads just the last 10 lines of a file:

matthew@seymour:~$ tail /var/log/boot.log
 * Starting save kernel messages                                   [ OK ]
 * Starting deferred execution scheduler                           [ OK ]
 * Starting regular background program processing daemon           [ OK ]
 * Stopping save kernel messages                                   [ OK ]
 * Stopping anac(h)ronistic cron                                   [ OK ]
 * Starting CUPS printing spooler/server                           [ OK ]
 * Starting CPU interrupts balancing daemon                        [ OK ]

There isn’t anything terribly interesting in this excerpt from today’s boot.log on this machine, but it is sufficient to show how reading the last few lines of a log file works.
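If you want to watch new entries as they are written, the -f (follow) switch keeps tail running and prints each line as it is appended, which is handy while you reproduce a problem; press Ctrl+C to stop:

matthew@seymour:~$ tail -f /var/log/syslog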

You are more likely to want to be able to find out whether something specific is mentioned in a log. The following example shows how to use cat and grep to look for mentions of pnp in dmesg, the display message buffer log for the Linux kernel, to see if there is any mention of a plug-and-play device:

matthew@seymour:~$ cat /var/log/dmesg | grep pnp
[    0.426212] pnp: PnP ACPI init
[    0.426223] ACPI: bus type pnp registered
[    0.426303] pnp 00:01: [dma 4]
[    0.426315] pnp 00:01: Plug and Play ACPI device, IDs PNP0200 (active)
[    0.426338] pnp 00:02: Plug and Play ACPI device, IDs PNP0b00 (active)
[    0.426351] pnp 00:03: Plug and Play ACPI device, IDs PNP0800 (active)
[    0.426369] pnp 00:04: Plug and Play ACPI device, IDs PNP0c04 (active)
[    0.426531] pnp 00:05: [dma 0 disabled]
[    0.426568] pnp 00:05: Plug and Play ACPI device, IDs PNP0501 (active)
[    0.426872] pnp 00:08: Plug and Play ACPI device, IDs PNP0103 (active)
[    0.427298] pnp: PnP ACPI: found 12 devices
[    0.427299] ACPI: ACPI bus type pnp unregistered

Here are a few of the most commonly used log files. Your system will have many others, in addition to these:

/var/log/apport.log—Saves information about system crashes and reports

/var/log/auth.log—Saves information about system access and authentication, including when a user does something using sudo

/var/log/kern.log—Saves information from kernel messages, such as warnings and errors

/var/log/syslog—Saves information from system events

/var/log/ufw.log—Saves information from the Uncomplicated Firewall (ufw), Ubuntu’s default firewall

/var/log/apt/history.log—Saves information about package installation and removal

Notice that the last one is in its own subdirectory. Many applications create their own directories and may even create multiple log files within their directories.

A couple of special cases deserve separate mention. These two are not read using standard methods, but each has its own program for reading from the command line. The commands are the same as the log names:

faillog—Reads from /var/log/faillog and lists recent login failures

lastlog—Reads from /var/log/lastlog and lists the most recent login for each account
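For example, running lastlog with no arguments prints one line per account, flagging accounts that have never logged in; the details shown here are only illustrative:

matthew@seymour:~$ lastlog
Username         Port     From             Latest
root                                       **Never logged in**
matthew          pts/0    192.168.1.5      Mon Oct  1 11:39:01 -0500 2018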

For those who love GUI applications, there is a log reader installed by default in Ubuntu. Find System Log in the Dash to run it. It does not include every log in /var/log, but it does include the most important ones that serve the widest audience.

Rotating Log Files

Log files are great, but sometimes they can get unwieldy as time passes and more information is logged. Rotating log files prevents that problem. Rotating a log file means archiving the current log file, starting a fresh log, and deleting older log files. This means you always have the current log and the most recent archived log to peruse, and the log files never grow too large.

Typically, log rotation is set up by an administrator to happen nightly, at a time when the system is not being heavily used. This is done with a utility called logrotate running as a cron job. (cron is described in Chapter 14, “Automating Tasks and Shell Scripting.”)

Ubuntu comes with logrotate installed. There is a cron job already set as well. You can find the script at /etc/cron.daily/logrotate. This file is a bash script and looks like this:

#!/bin/sh

# Clean nonexistent log file entries from status file
cd /var/lib/logrotate
test -e status || touch status
head -1 status > status.clean
sed 's/"//g' status | while read logfile date
do
    [ -e "$logfile" ] && echo "\"$logfile\" $date"
done >> status.clean
mv status.clean status

test -x /usr/sbin/logrotate || exit 0
/usr/sbin/logrotate /etc/logrotate.conf

Don’t worry if you don’t yet understand everything in this script. You don’t need to understand it to configure logrotate to do what you want. You can learn more about bash and shell scripting in Chapter 14.

The important line right now is that last one, which lists the location of the configuration file for logrotate. Here are the default contents of /etc/logrotate.conf.

# see "man logrotate" for details

# rotate log files weekly
weekly
# use the syslog group by default, since this is the owning group# of /var/log/syslog.su root syslog
# keep 4 weeks’ worth of backlogsrotate 4
# create new (empty) log files after rotating old onescreate

# uncomment this if you want your log files compressed#compress
# packages drop log rotation information into this directoryinclude /etc/logrotate.d
# no packages own wtmp, or btmp -- we'll rotate them here/var/log/wtmp {   missingok monthly    create 0664 root utmp    rotate 1}
/var/log/btmp {    missingok    monthly    create 0660 root utmp    rotate 1}
# system-specific logs may be configured here

This file includes useful comments, and what it can configure is straightforward. If you can read the file, you probably already have a pretty accurate guess as to what the various settings do. As the first comment in the file says, the man page for logrotate has an explanation of everything that you can consult if it is not already clear.

One interesting entry says that packages drop log rotation information into /etc/logrotate.d. This is worth a closer look. The directory contains a configuration file for each application that is installed using the package manager and that logs information. These files are named after the applications whose log files they control. Let’s look at two examples. This first one is for apt, the package manager:

/var/log/apt/term.log {
  rotate 12
  monthly
  compress
  missingok
  notifempty
}

/var/log/apt/history.log {
  rotate 12
  monthly
  compress
  missingok
  notifempty
}

There are two entries here, each for a different log file that is created and used by apt. The entries define how many old versions of the log files to keep, how frequently logrotate will rotate the logs, whether to compress the log files, whether it is okay for the log file to be missing, and whether to bother rotating if the file is empty. Again, this is pretty straightforward. Here is a more complex example, for rsyslog, the system logging program:

/var/log/syslog
{
    rotate 7
    daily
    missingok
    notifempty
    delaycompress
    compress
    postrotate
        reload rsyslog >/dev/null 2>&1 || true
    endscript
}

/var/log/mail.info
/var/log/mail.warn
/var/log/mail.err
/var/log/mail.log
/var/log/daemon.log
/var/log/kern.log
/var/log/auth.log
/var/log/user.log
/var/log/lpr.log
/var/log/cron.log
/var/log/debug
/var/log/messages
{
    rotate 4
    weekly
    missingok
    notifempty
    compress
    delaycompress
    sharedscripts
    postrotate
        reload rsyslog >/dev/null 2>&1 || true
    endscript
}

The man page for logrotate defines all the commands used in these configuration files, but many are probably clear to you already. Here are some of the most important ones:

rotate—Defines how many archived logs are kept at any one time

interval—Defines how often to rotate the log; the actual setting will be daily, weekly, monthly, or yearly

size—Defines how large a log file can become before it is rotated; this setting supersedes the preceding time interval setting, and the format will be a number and a unit, such as size 512k or size 128M or size 100G

compress—Configures the log file to be compressed

nocompress—Configures the log file not to be compressed

What is more important to cover here than all the individual options, which you can look up in the man page, is that these individual configuration files for specific applications will override the default settings in /etc/logrotate.conf. If a setting is assigned a value in that file, it will be used by all applications that logrotate affects unless an application-specific file in /etc/logrotate.d includes the same setting.
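Putting a few of these directives together, here is a minimal sketch of a file you might create as /etc/logrotate.d/myapp for a hypothetical application that logs to /var/log/myapp/app.log; the path and values are assumptions for illustration, not taken from a real package:

/var/log/myapp/app.log {
    rotate 6
    size 100M
    compress
    missingok
    notifempty
}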

Graphical Process- and System-Management Tools

The GNOME and KDE desktop environments offer a rich set of network and system-monitoring tools. Graphical interface elements, such as menus and buttons, and graphical output, including metering and real-time load charts, make these tools easy to use. These clients, which require an active X session and in some cases root permission, are included with Ubuntu.

If you view the graphical tools locally while they are being run on a server, you must have X properly installed and configured on your local machine. Although some tools can be used to remotely monitor systems or locally mounted remote file systems, you have to properly configure pertinent X11 environment variables, such as $DISPLAY, to use the software or use the ssh client’s -X option when connecting to the remote host.
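For example, assuming you can reach a remote host over SSH and that X11 forwarding is enabled in its sshd configuration, the following runs System Monitor on the remote machine while displaying it on your local desktop (the hostname is a placeholder):

matthew@seymour:~$ ssh -X matthew@remotehost.example.com gnome-system-monitor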

System Monitor

System Monitor is a graphical monitoring tool that is informative, easy to use and understand, and very useful. It has tabs for information about running processes, available resources, and local file systems.

Conky

Conky is a highly configurable, rather complex system monitor that is light on system resources and can give you information about nearly anything. The downside is that you need to learn how to configure it. Simply installing Conky from the software repositories only gets you started. However, for those who want specific information displayed on their desktop at all times, it is invaluable and well worth the time it takes to figure it out. We give an example here, but to truly appreciate the power, flexibility, and possibilities of Conky, visit https://github.com/brndnmtthws/conky.

Conky uses text files for configuration and is often started using a short script. The example shown in Figure 16.1 is from Matthew’s personal configuration on his workstation and is intended as a simple example to get you started.


FIGURE 16-1 You can configure Conky to give up-to-the-moment information about anything.

In this example, Conky gives information about the kernel, the operating system, the hostname of the system, and the current time and date. It continually updates with information on load averages, CPU usage and temperature, battery status, and RAM and disk usage. In addition, it shows networking details, including the internal network IP address, the IP address assigned to the outside-facing router the network uses to connect to the wider world, and current inbound and outbound network connections. (The router’s IP address is assigned by the ISP and changes, so if you try to attack Matthew’s home network using it, you will find that the address now belongs to someone else.) That is a lot of information in a small space. This setup is not as pretty as some you will see at the link mentioned previously, where nearly all of the examples have their setup details made freely available by the people who wrote the configurations.

In this example, Matthew is using two files to run Conky. The first one, called conkyrc, is a text file that includes the configuration details for what you see in Figure 16.1:

conky.config = {
-- Use Xft?
    use_xft = true,
-- Xft font when Xft is enabled
--font = 'HackBold:size=9',
    font = 'Ubuntu:size=8',

-- Text alignment, other possible values are commented
--minimum_size 10 10
    gap_x = 13,
    gap_y = 45,
--alignment top_left
    alignment = 'top_right',
--alignment bottom_left
--alignment bottom_right

-- Add spaces to keep things from moving about?  This only affects certain objects.
    use_spacer = 'right',

-- Subtract file system buffers from used memory?
    no_buffers = true,

-- Use double buffering (reduces flicker, may not work for everyone)
    double_buffer = true,

-- Allows icons to appear, window to be moved, and transparency
    own_window = true,
    own_window_class = 'Conky',--new
--own_window_type override
    own_window_transparent = true,
--own_window_hints undecorated,below,skip_taskbar
    own_window_type = 'normal',--new
    own_window_hints = 'undecorated,below,sticky,skip_taskbar',--new
    own_window_argb_visual = true,--new
    own_window_argb_value = 0,--new

-- set temperature units, either "fahrenheit" or "celsius"
    temperature_unit = "fahrenheit",

-- set to yes if you want Conky to be forked in the background
    background = true,

-- Update interval in seconds
    update_interval = 1,

    cpu_avg_samples = 1,
    net_avg_samples = 1,

-- ----- start display config -----

};

conky.text = [[
${alignc}${color c68be8}$sysname kernel $kernel
${alignc}${color c68be8}${exec cat /etc/issue.net} on $machine host $nodename

${color c68be8}Time:${color E7E7E7}   $time
${color c68be8}Updates: ${color E7E7E7}${execi 3600 /usr/lib/update-notifier/apt_check.py --human-readable | grep updated}
${color c68be8}Security: ${color E7E7E7}${execi 3600 /usr/lib/update-notifier/apt_check.py --human-readable | grep security}

${color c68be8}Disk Usage:
  Samsung SSD 850 PRO 256GB: ${color E7E7E7}/
    ${fs_used /}/${fs_size /} ${fs_bar /}
  ${color c68be8}Crucial_CT1050MX300SSD1 1.1TB: ${color E7E7E7}/home
    ${fs_used /home}/${fs_size /home} ${fs_bar /home}
  ${color c68be8}Hitachi Ultrastar A7K2000 HUA722010CLA330 1TB: ${color E7E7E7}/home/matt/
Media
    ${fs_used /home/matt/Media}/${fs_size /home/matt/Media} ${fs_bar /home/matt/Media}
  ${color c68be8}Seagate Expansion Disk (External) 5TB: ${color E7E7E7}/media/matt/Backups
    ${fs_used /media/matt/Backups}/${fs_size /media/matt/Backups} ${fs_bar /media/matt/Backups}

${color c68be8}RAM Usage:${color E7E7E7} $mem/$memmax - $memperc% $membar
${color c68be8}Swap Usage:${color E7E7E7} $swap/$swapmax - $swapperc% ${swapbar}

${color c68be8}Load average:${color E7E7E7}   $loadavg
${color c68be8}Processes:${color E7E7E7} $processes  ${color c68be8}Running:${color E7E7E7}
$running_processes ${color c68be8}

${color c68be8}CPU usage         ${alignr}PID     CPU%   MEM%
${color E7E7E7} ${top name 1}${alignr}${top pid 1}   ${top cpu 1}   ${top mem 1}
${color E7E7E7} ${top name 2}${alignr}${top pid 2}   ${top cpu 2}   ${top mem 2}
${color E7E7E7} ${top name 3}${alignr}${top pid 3}   ${top cpu 3}   ${top mem 3}

${color c68be8}Memory usage
${color E7E7E7} ${top_mem name 1}${alignr}${top_mem pid 1}  ${top_mem cpu 1}
${top_mem mem 1}
${color E7E7E7} ${top_mem name 2}${alignr}${top_mem pid 2}  ${top_mem cpu 2}
${top_mem mem 2}
${color E7E7E7} ${top_mem name 3}${alignr}${top_mem pid 3}  ${top_mem cpu 3}
${top_mem mem 3}

${color c68be8}Current CPU usage:
  ${color c68be8}CPU0:${color E7E7E7}  ${cpu cpu0}%  ${color c68be8}CPU1:${color E7E7E7}
${cpu cpu1}%  ${color c68be8}CPU2:${color E7E7E7}    ${cpu cpu2}%  ${color c68be8}
CPU3:${color E7E7E7}    ${cpu cpu3}%
  ${color c68be8}CPU4:${color E7E7E7}  ${cpu cpu4}%  ${color c68be8}CPU5:${color E7E7E7}
${cpu cpu5}%  ${color c68be8}CPU6:${color E7E7E7}    ${cpu cpu6}%  ${color c68be8}
CPU7:${color E7E7E7}    ${cpu cpu7}%
  ${color c68be8}CPU8:${color E7E7E7}  ${cpu cpu8}%  ${color c68be8}CPU9:${color E7E7E7}
${cpu cpu9}%  ${color c68be8}CPU10:${color E7E7E7}   ${cpu cpu10}%  ${color c68be8}
CPU11:${color E7E7E7}  ${cpu cpu11}%
  ${color c68be8}CPU12:${color E7E7E7}  ${cpu cpu12}%  ${color c68be8}CPU13:${color
E7E7E7}  ${cpu cpu13}%  ${color c68be8}CPU14:${color E7E7E7}  ${cpu cpu14}%  ${color c68be8}
CPU15:${color E7E7E7}  ${cpu cpu15}%
  ${color c68be8}CPU16:${color E7E7E7}  ${cpu cpu16}%  ${color c68be8}CPU17:${color E7E7E7}
${cpu cpu17}%  ${color c68be8}CPU18:${color E7E7E7}  ${cpu cpu18}%  ${color c68be8}
CPU19:${color E7E7E7}  ${cpu cpu19}%
  ${color c68be8}CPU20:${color E7E7E7}  ${cpu cpu20}%  ${color c68be8}CPU21:${color E7E7E7}
${cpu cpu21}%  ${color c68be8}CPU22:${color E7E7E7}  ${cpu cpu22}%  ${color c68be8}
CPU23:${color E7E7E7}  ${cpu cpu23}%

${color c68be8}Current CPU core temps (cores with sensors):
${color E7E7E7}${execi 2 sensors | grep "Core" | cut -c 1-22}

${color c68be8}Wired Networking:
  ${color c68be8}Local IP: ${color E7E7E7}${addr enp1s0} ${color c68be8}
  ${color c68be8}total download: ${color E7E7E7}${totaldown enp1s0}
  ${color c68be8}total upload: ${color E7E7E7}${totalup enp1s0}
  ${color c68be8}download speed: ${color E7E7E7}${downspeed enp1s0}${color E7E7E7} ${color
c68be8} upload speed: ${color E7E7E7}${upspeed enp1s0}
  ${color E7E7E7}${downspeedgraph enp1s0 15,150 E7E7E7 DA9347} $alignr${color
E7E7E7}${upspeedgraph enp1s0 15,150 DA9347 E7E7E7}

${color c68be8}Public IP: ${color E7E7E7}${execi 600 bash /home/matt/conky/myip.sh}

${color c68be8}Port(s) / Connections:
${color c68be8}Inbound: ${color E7E7E7}${tcp_portmon 1 32767 count}  ${color c68be8}Out
bound: ${color E7E7E7}${tcp_portmon 32768 61000 count}  ${color c68be8}Total: ${color
E7E7E7}${tcp_portmon 1 65535 count}
${color c68be8}Outbound Connection ${alignr} Remote Service/Port${color E7E7E7}
 ${tcp_portmon 1 65535 rhost 0} ${alignr} ${tcp_portmon 1 65535 rservice 0}
 ${tcp_portmon 1 65535 rhost 1} ${alignr} ${tcp_portmon 1 65535 rservice 1}
 ${tcp_portmon 1 65535 rhost 2} ${alignr} ${tcp_portmon 1 65535 rservice 2}
 ${tcp_portmon 1 65535 rhost 3} ${alignr} ${tcp_portmon 1 65535 rservice 3}
 ${tcp_portmon 1 65535 rhost 4} ${alignr} ${tcp_portmon 1 65535 rservice 4}
 ${tcp_portmon 1 65535 rhost 5} ${alignr} ${tcp_portmon 1 65535 rservice 5}
]];

Most of these details are clear, but one is particularly interesting. There are commercial and other sites that, if you visit them, will return your IP address. This is easily accomplished in several ways. Matthew chose to put the following PHP in a file named myip.php on a server he owns, and he calls it directly:

 <?php
   $remote = $_SERVER["REMOTE_ADDR"];
   echo $remote;
 ?>

Doing this can help you not feel guilty about constantly hitting someone else’s server for this information.
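The myip.sh helper script that the conkyrc above calls is not shown in the original configuration; a minimal sketch, assuming the myip.php page is hosted at a URL you control (the URL below is a placeholder), could be as simple as this:

#!/bin/bash
# Ask the server-side myip.php helper to echo back this network's public IP.
curl -s https://www.example.com/myip.php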

Finally, although you can run Conky from the command line at any time after it is set up, to make it more convenient, many people choose to keep their config files somewhere in their /home directory and then write a short script that points to the custom location. If you add a pause to the beginning of the script, you can add the script to Startup Applications and have it come up after all your other desktop processes are up and running. Here is a simple example:

#!/bin/bash
# Wait for the rest of the desktop session to finish loading, then
# start Conky as a daemon (-d) using the custom config file (-c).
sleep 45 &&
exec conky -d -c ~/conky/conkyrc &
exit

Save it in /home/username/conky along with all your Conky config files, make it executable, and then have the Startup Applications process call it at bootup. Note that this way, you can also run more than one instance of Conky at a time, perhaps having your regular instance in the upper right of the screen and a weather instance or something else in a different location. The possibilities are vast.

A lovely GUI program for creating and managing Conky configurations exists, but it is not in the Ubuntu software repositories. If you are interested in exploring it further, there is a PPA from which you can install Conky Manager at https://launchpad.net/conky-manager. Installation directions and other documentation are available from the maintainer’s website, at www.teejeetech.in/p/conky-manager.html.

Other Graphical Process- and System-Monitoring Tools

Graphical system- and process-monitoring tools that are available for Ubuntu include the following:

vncviewer—AT&T’s open-source remote session manager (part of the Xvnc package), which you can use to view and run a remote desktop session locally. This software (discussed in more detail in Chapter 19, “Remote Access with SSH and VNC”) requires an active background X session on the remote computer.

gnome-nettool—A GNOME-developed tool that enables system administrators to carry out a wide range of diagnostics on network interfaces, including port scanning and route tracing.

wireshark—A graphical network protocol analyzer that can be used to save or display packet data in real time and that has intelligent filtering to recognize data signatures or patterns from a variety of hardware and data captures from third-party data-capture programs, including compressed files. Some of the supported protocols include Internet Protocol (IP), Transmission Control Protocol (TCP), Domain Name System (DNS), Hypertext Transfer Protocol (HTTP), Secure Shell (SSH), Transport Layer Security (TLS), and Hypertext Transfer Protocol over SSL/TLS (HTTPS).

KDE Process- and System-Monitoring Tools

KDE provides several process- and system-monitoring clients. Integrate the KDE graphical clients into the desktop taskbar by right-clicking the taskbar and following the menus.

These KDE monitoring clients include the following:

kdf—A graphical interface to your system’s file system table that displays free disk space and enables you to mount and unmount file systems with a pointing device

ksysguard—A process monitor and task manager that provides CPU load and memory use information in animated graphs

Enterprise Server Monitoring

Servers used in enterprise situations require extreme uptime. Monitoring is vital for knowing how your system is performing and for maintaining observability into the general health of your system’s components and services.

It is beyond the scope of this book to discuss topics such as observability, redundancy, failsafe and failover safeguards, and so on in any depth. However, the tools listed in this section can help enterprise sysadmins get started in their search for the proper tool(s) for their setting.

Datadog

Datadog is one of several young monitoring companies that have designed their products as software as a service (SaaS) cloud-native applications. Datadog requires you to install an agent on your system; you can do the rest from the web UI (an API is also available if you prefer to access the platform programmatically). You then learn about the various integrations, decide which metrics you want to track, and finally create dashboards, graphs, and monitors. It is proprietary and has multiple pricing options depending on your needs.

Nagios

Nagios has an open source foundation, and its core monitoring options are free to download, use, and modify. The paid option offers far more features and flexibility. In both cases, you must download the Nagios software and install it on dedicated resources within your IT infrastructure. Nagios has been at the forefront of monitoring for years but is gradually giving way to SaaS newcomers in the new cloud computing era.

New Relic

New Relic is another of the several young monitoring companies that have designed their products as software as a service (SaaS) cloud-native applications. It is flexible, configurable, and designed to work across your cloud deployments. New Relic’s products are proprietary, and the company offers multiple pricing options depending on your needs.

SignalFx

SignalFx is another of the several young monitoring companies that have designed their products as software as a service (SaaS) cloud-native applications. It is flexible, configurable, and designed to work across your cloud deployments. SignalFx’s products are proprietary, and the company offers multiple pricing options depending on your needs. SignalFx was recently bought by Splunk, a long-time player in the on-site monitoring business that wants to stay relevant through the transition to cloud computing.

Splunk

Once a big player in enterprise monitoring, Splunk has had to shift course of late as enterprises have moved to cloud computing and it has faced strong competition from the smaller, younger monitoring companies already mentioned. Along with Nagios, it remains one of the mainstays of on-premises data analytics solutions. Splunk is still a standard for some unique features, such as security information and event management (SIEM), artificial intelligence for IT operations (AIOps), and compliance monitoring.

Zabbix

Zabbix is more popular in Europe than in North America. Similar to Splunk, it is a mature enterprise platform designed to monitor large-scale IT environments, traditionally on-premises, but the company has broadened its abilities, and the platform works across all the popular cloud platforms. It is free and open source, with paid support, services, and training available.

References

https://and.sourceforge.net—Home page of the auto nice daemon (AND), which can be used to prioritize and reschedule processes automatically

https://sourceforge.net/projects/schedutils/—Home page for various projects offering scheduling utilities for real-time scheduling

www.datadoghq.com/—Home page for Datadog

www.nagios.com/—Home page for Nagios

https://newrelic.com/—Home page for New Relic

www.signalfx.com/—Home page for SignalFx

www.splunk.com/—Home page for Splunk

www.zabbix.com/—Home page for Zabbix
