Computers are dynamic and multi-purpose machines; they do a variety of jobs using many tools. This chapter describes the ways you can manage these tools. One aspect of this task is installing, uninstalling, and upgrading software packages. Another aspect of software management is in managing programs once they're running. (Running programs are often called processes.) Finally, this chapter covers log files, which record details of what running programs do—particularly programs that run automatically and in the background.
Package management is an area of Linux that varies a lot from one distribution to another. Nonetheless, certain principles are common across most Linux distributions, so I describe these principles, followed by some of the basics of the two major Linux package management systems. I then describe how to manage packages using both the RPM Package Manager (RPM; a recursive acronym) and Debian package systems.
If you've installed software in Windows, you're likely familiar with the procedure of double-clicking on an installer program, which places all the files associated with a program where they should go. A Windows software installer is similar to a Linux package file, but there are differences. Linux packages have the following characteristics:
Many program packages depend on library packages; libraries provide code that can be used by many programs.
You can compile and install software from source code manually, without using a packaging tool. This advanced topic is beyond the scope of this book.
Packages can, and frequently do, contain files that will be installed to many directories on the computer. This fact makes tracking package contents critical.
The package software maintains a database of information about installed packages (the package database). This information includes the names and version numbers of all the installed packages, as well as the locations of all the files installed from each package. This information enables the package software to quickly uninstall software, establish whether a new package's dependencies have been met, and determine whether a package you're trying to install has already been installed and, if so, whether the installed version is older than the one you're trying to install.
As noted earlier, two package systems, RPM and Debian, are common, although others exist as well. These systems differ in various technical details, as well as in the commands used to manage packages and in the format of the package files they use. You cannot install a Debian package on an RPM-based system, or vice versa. Indeed, installing a package intended for one distribution on another is a bit risky even when they use the same package type. This is because a non-native package may have dependencies that conflict with the needs of native packages.
Originally, package systems worked locally—that is, to install a package on your computer you would first have to download a package file from the Internet or in some other way. Only then could you use a local command to install the package. This approach, however, can be tedious when a package has many dependencies—you might attempt an installation, find unmet dependencies, download several more packages, find that one or more of them has unmet dependencies, and so on. By the time you've tracked down all of these depended-upon packages, you might need to install a dozen or more packages. Thus, modern distributions provide network-enabled tools to help automate the process. These tools rely on network software repositories, from which the tools can download packages automatically. The network-enabled tools vary from one distribution to another, particularly among RPM-based distributions.
In practice, then, the process of managing software in Linux involves using text-mode or GUI tools to interface with a software repository. A typical software installation works something like this: you select a package to install; the tool checks the package's dependencies against the package database and determines which depended-upon packages must also be installed; it downloads the package and any missing dependencies from the repository; and it installs them all.
You can configure most distributions to use local media instead of or in addition to Internet repositories.
Immediately after installing a distribution, you may find that a large number of updates are available.
Upgrading software works in a similar way, although upgrades are less likely to require downloading depended-upon packages. Removing software can be done entirely locally, of course. Many distributions automatically check with their repositories from time to time and notify you when updates are available. Thus, you can keep your system up to date by clicking a few buttons when you're prompted to do so. As an example, Figure 9.1 shows Fedora 16's Software Update utility, which displays a list of available updates.
Package management necessarily involves root access, which is described in more detail in Chapter 13, “Understanding Users and Groups.” If you follow the automatic prompts to update your software, you can keep the system up to date by entering the root password, or on some distributions your regular password, when the update software prompts for it.
RPM-based distributions include Red Hat, Fedora, CentOS, SUSE Linux Enterprise, openSUSE, and Mandriva. The basic tool for installing software on these distributions is the text-mode rpm command. This program works on local files, though; to use a network repository, you must use another tool, which varies by distribution:
Because of the variability between these distributions, particularly for network-enabled updates, providing a complete description of all of these tools is impractical here. Fortunately, the GUI tools are easy to use and accessible. Even the text-mode tools are fairly straightforward, although you may need to consult their man pages to learn the details. Typically, they use logical subcommands, such as install to install a package, as in:
# yum install yumex
If you want to both upgrade software and remove packages, it's generally best to remove software first. This can obviate some downloads, reducing the upgrade time.
You might use this command to install the GUI Yumex tool on a Red Hat, Fedora, or CentOS system. Similarly, you can remove a specific package by using the remove subcommand or upgrade all of a computer's packages by using upgrade:
# yum remove zsh
# yum upgrade
These examples remove the zsh package and then upgrade all the computer's packages. Both of these commands will produce a number of lines of output, and you may be asked to verify their actions. Consult the man page for yum (or whatever package management software your distribution uses) to learn more about this tool.
If you need to deal with RPM package files directly, you should be aware that they have filename extensions of .rpm. These files also usually include codes for architecture type (such as i386 or x86_64), and often codes for the distribution for which they're intended (such as fc16 for Fedora 16). For instance, samba-3.6.1-77.fc16.x86_64.rpm is a package file for the samba package, version 3.6.1, release 77, for Fedora 16, on the x86-64 platform.
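To see how these name components fit together, the following shell sketch picks apart the example filename using standard parameter expansion. This is purely illustrative; it isn't a standard RPM tool, and it assumes the common name-version-release.dist.arch.rpm layout:

```shell
# Illustrative only: decompose an RPM filename with POSIX
# parameter expansion, peeling fields off the end one at a time.
pkg="samba-3.6.1-77.fc16.x86_64.rpm"

base="${pkg%.rpm}"      # samba-3.6.1-77.fc16.x86_64
arch="${base##*.}"      # x86_64
base="${base%.*}"       # samba-3.6.1-77.fc16
dist="${base##*.}"      # fc16
base="${base%.*}"       # samba-3.6.1-77
release="${base##*-}"   # 77
base="${base%-*}"       # samba-3.6.1
version="${base##*-}"   # 3.6.1
name="${base%-*}"       # samba

echo "$name $version $release $dist $arch"
```

Working from the end of the name avoids ambiguity when the package name itself contains hyphens.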
Third-party implementations of APT for many RPM-based distributions also exist. See http://apt4rpm.sourceforge.net for details. At least one RPM-based distribution, PCLinuxOS, uses APT natively.
The Debian GNU/Linux distribution created its own package system, and distributions based on Debian, such as Ubuntu and Mint, use the same system. Atop the basic Debian package system lies the Advanced Package Tool (APT), which provides access to network repositories.
The dpkg command is the lowest-level interface to the Debian package system; it's roughly equivalent to the rpm utility on RPM-based systems. Several tools provide text-mode and graphical interfaces atop dpkg, the most important of these being the text-mode apt-get and the GUI Synaptic. As their names imply, apt-get and Synaptic provide access to network repositories via APT. Figure 9.2 shows Synaptic in use.
Debian package files have names that end in .deb. Like RPM packages, these names typically include codes for the software version and architecture (such as i386 or amd64). For instance, samba_3.6.1-3_amd64.deb is a Debian package file for the samba package, version 3.6.1, revision 3, for AMD64 (x86-64) CPUs. You can install such files using dpkg or apt-get, or you can use apt-get to download a package and its dependencies from the Internet, using its install command, as in:
# apt-get install samba
As with RPM packages, you can remove packages or upgrade your computer's software, too:
# apt-get remove zsh
# apt-get upgrade
APT is a powerful tool, as is the underlying dpkg. You should consult these programs' man pages to learn more about how to use them.
The Linux kernel is the core of a Linux installation. The kernel manages memory, provides software with a way to access the hard disk, doles out CPU time, and performs other critical low-level tasks. The kernel is loaded early in the boot process, and it's the kernel that's responsible for managing every other piece of software on a running Linux computer.
You can change which program runs as the first process by adding the init= option to your boot loader's kernel option line, as in init=/bin/bash to run bash.
One of the many ways that the kernel imposes order on the potentially chaotic set of running software is to create a sort of hierarchy. When it boots, the kernel runs just one program—normally /sbin/init. The init process is then responsible for starting all the other basic programs that Linux must run, such as the programs that manage logins and always-up servers. Such programs, if launched directly by init, are called its children. The children of init can in turn launch their own children. This happens when you log into Linux. The process that launched a given process is called its parent.
Occasionally, a process will terminate but leave behind children. When this happens, init “adopts” those child processes.
The result of this system is a treelike hierarchy of processes, as illustrated in Figure 9.3. (“Trees” in computer science are often depicted upside down.) Figure 9.3 shows a small subset of the many processes that run on a typical Linux installation: just a few processes associated with a text-mode login, including the login tool that manages logins, a couple of bash shells, and a few user programs. A working Linux system will likely have dozens or hundreds of running processes. The one on which I'm typing these words has 213 processes going at once!
Internally, the kernel maintains process information in the process table. Tools such as ps and top (described shortly) enable you to view and manipulate this table.
Every process has associated with it a process ID (PID) number. These numbers begin with 1, so init's PID is normally 1. Each process also has a parent process ID (PPID) number, which points to its parent. Many of the tools for managing processes rely on these numbers, and particularly on the PID number.
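You can see this pair of numbers from within a shell: the shell exposes its own PID as $$ and, in bash and other POSIX shells, its parent's PID as $PPID. A quick sketch:

```shell
# Every process carries a PID and a PPID. A shell can report its
# own pair directly via the $$ and $PPID variables.
echo "My PID:        $$"
echo "My parent PID: $PPID"
```

If you run this in a terminal window, the parent PID typically points at the terminal program or the login shell that launched your shell.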
Before you can manage processes, you must be able to identify them. The ps and top utilities can help you identify processes. In either case, you can search for processes in various ways, such as by name or by resource use. You may also want to identify how much memory your processes are consuming, which you can do with the free command.
The simplest tool for identifying processes is ps, which produces a process listing. Listing 9.1 shows an example of ps in action. In this example, the -u option restricts output to processes owned by the specified user (rodsmith), while --forest creates a display that shows parent/child relationships.
Listing 9.1: Output of ps -u rodsmith --forest
$ ps -u rodsmith --forest
  PID TTY          TIME CMD
 2451 pts/3    00:00:00 bash
 2551 pts/3    00:00:00 ps
 2496 ?        00:00:00 kvt
 2498 pts/1    00:00:00 bash
 2505 pts/1    00:00:00  \_ nedit
 2506 ?        00:00:00  \_ csh
 2544 ?        00:00:00      \_ xeyes
19221 ?        00:00:01 dfm
The version of ps used in most Linux distributions combines features from several earlier ps implementations. The result is a huge selection of sometimes redundant options.
Listing 9.2 shows a second example of ps. In this example, the u option adds informational columns, while U rodsmith restricts output to processes owned by rodsmith. The ps command supports a huge number of options (consult its man page for details).
Listing 9.2: Output of ps u U rodsmith
$ ps u U rodsmith
USER       PID %CPU %MEM  VSZ  RSS TTY   STAT START  TIME COMMAND
rodsmith 19221  0.0  1.5 4484 1984 ?     S    May07  0:01 dfm
rodsmith  2451  0.0  0.8 1856 1048 pts/3 S    16:13  0:00 -bash
rodsmith  2496  0.2  3.2 6232 4124 ?     S    16:17  0:00 /opt/kd
rodsmith  2498  0.0  0.8 1860 1044 pts/1 S    16:17  0:00 bash
rodsmith  2505  0.1  2.6 4784 3332 pts/1 S    16:17  0:00 nedit
rodsmith  2506  0.0  0.7 2124 1012 ?     S    16:17  0:00 /bin/cs
rodsmith  2544  0.0  1.0 2576 1360 ?     S    16:17  0:00 xeyes
rodsmith  2556  0.0  0.7 2588  916 pts/3 R    16:18  0:00 ps u U
Because ps ax produces commands with their options, using grep to search for a string in the output returns the searched-for command, as well as the grep command itself.
Given the large number of ps options, different users can have different favorite ways to use the program. I find that typing ps ax usually produces the information I want, including PID values and command names (including command-line options) for all the processes on the computer. Adding u (as in ps aux) adds usernames, CPU loads, and a few other tidbits. The sheer scope of the information produced, however, can be overwhelming. One way to narrow this scope is to pipe the results through grep, which eliminates lines that don't include the search criterion you specify. For instance, if you want to know the PID number for the gedit process, you can do so like this:
$ ps ax | grep gedit
27946 pts/8    Sl     0:00 gedit
27950 pts/8    S+     0:00 grep --colour=auto gedit
This command reveals that gedit has a PID value of 27946. This is usually the most important information when you use ps, since you'll use the PID value to change a process's priority or terminate it.
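A common refinement, not shown above, is to write the search pattern as a one-character bracket expression, such as [g]edit. The pattern still matches the string gedit, but grep's own command line (which contains the literal text [g]edit) no longer matches, so the extraneous grep line disappears. The sketch below demonstrates the idea against simulated ps output so that it's self-contained:

```shell
# The pattern [g]edit matches "gedit" but not the literal text
# "[g]edit" that appears in grep's own command line, so grep
# filters itself out of the results. Simulated ps output stands
# in for a live `ps ax` here.
ps_output='27946 pts/8  Sl  0:00 gedit
27950 pts/8  S+  0:00 grep --colour=auto [g]edit'

matches=$(printf '%s\n' "$ps_output" | grep '[g]edit')
echo "$matches"
```

On a live system you'd simply type ps ax | grep '[g]edit'; many systems also provide pgrep, which reports matching PIDs directly.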
Although ps can return process priority and CPU use information, the program's output is usually sorted by PID number, and provides information at only a single moment in time. If you want to quickly locate CPU- or memory-hogging processes, or if you want to study how resource use varies over time, another tool is more appropriate: top. This program is essentially an interactive version of ps. Figure 9.4 shows top running in a GNOME Terminal window.
By default, top sorts its entries by CPU use, and it updates its display every few seconds. You'll need to be familiar with the purposes and normal habits of programs running on your system in order to determine if a CPU-hungry application is misbehaving; the legitimate needs of different programs vary so much that it's impossible to give a simple rule for judging when a process is consuming too much CPU time.
You can do more with top than watch it update its display. When it's running, you can enter any of several single-letter commands, some of which prompt you for additional information, as summarized in Table 9.1. Additional commands are described in top's man page.
One of the pieces of information provided by top is the load average, which is a measure of the demand for CPU time by applications. In Figure 9.4, you can see three load-average estimates on the top line; these are averages over the past 1, 5, and 15 minutes. Roughly speaking, a load average of 0.0 means no demand for CPU time, a load average of 1.0 means that programs are demanding exactly as much CPU time as a single core can deliver, and a load average above the number of CPU cores means that processes are competing for CPU time.
Most computers sold today are multi-core models, but single-core models dominated the marketplace prior to about 2006.
The w command, described in Chapter 13, can tell you how much CPU time entire terminal sessions are consuming.
The load average can be useful in detecting runaway processes. For instance, if a system normally has a load average of 0.5 but it suddenly gets stuck at a load average of 2.5, a couple of CPU-hogging processes may have hung—that is, become unresponsive. Hung processes sometimes needlessly consume a lot of CPU time. You can use top to locate these processes and, if necessary, kill them.
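If all you want is the load average, you don't need to run top at all. On Linux, tools such as top and uptime read these figures from /proc/loadavg, which you can inspect directly. A Linux-specific sketch:

```shell
# On Linux, the first three fields of /proc/loadavg are the 1-,
# 5-, and 15-minute load averages that top and uptime display.
cut -d' ' -f1-3 /proc/loadavg
```

Comparing the three values gives a quick sense of trend: if the 1-minute figure is well above the 15-minute figure, demand for CPU time is rising.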
Processes consume a number of system resources, the most important of these being CPU time and memory. As already noted, top sorts your processes by CPU time by default, so you can identify processes that are consuming the most CPU time. You can press the M key within top to have it sort by memory use, thus identifying the processes that are consuming the most memory. As with CPU time, you can't say that a process is consuming too much memory simply because it's at the top of the list, though; some programs legitimately consume a great deal of memory. Nonetheless, sometimes a program consumes too much memory, either because of inefficient coding or because of a memory leak—a type of program bug in which the program requests memory from the kernel and then fails to return it when it's done with the memory. A program with a memory leak consumes increasing amounts of memory, sometimes to the point where it interferes with other programs. As a short-term solution, you can usually terminate the program and launch it again, which resets the program's memory consumption, something like draining a sink that's filled with water from a leaky faucet. The problem will recur, but if the memory leak is small enough, you'll at least be able to get useful work done in the meantime.
The kernel grants programs access to sets of memory addresses, which the programs can then use. When a program is done, it should release its memory back to the kernel.
If you want to study the computer's overall memory use, the free command is useful. This program generates a report on the computer's total memory status:
$ free
             total       used       free     shared    buffers     cached
Mem:       7914888    7734456     180432          0     190656    3244720
-/+ buffers/cache:    4299080    3615808
Swap:      6291452    1030736    5260716
The Mem: line reveals total random access memory (RAM) statistics, including the total memory in the computer (minus whatever is used by the motherboard and kernel), the amount of memory used, and the amount of free memory. This example shows that most of the computer's memory is in use. Such a state is normal, since Linux puts otherwise unused memory to use as buffers and caches, which help speed up disk access. Thus, the Mem: line isn't the most useful; instead, you should examine the -/+ buffers/cache: line, which shows the total memory used by the computer's programs. In this example, 4,299,080 KiB of 7,914,888 KiB are in use, leaving 3,615,808 KiB free. In other words, a bit over half the computer's memory is in use by programs, so there should be no memory-related performance problems.
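You can automate this reading of free's output with a little awk. The sketch below computes the fraction of memory that programs are really using from the -/+ buffers/cache: line; it works on the sample output from the text so that it runs anywhere, but on a live system you would pipe free itself into the same awk commands:

```shell
# Compute the percentage of RAM genuinely used by programs from
# free's output, ignoring buffers and caches. Sample output from
# the text keeps this sketch self-contained.
free_output='             total       used       free     shared    buffers     cached
Mem:       7914888    7734456     180432          0     190656    3244720
-/+ buffers/cache:    4299080    3615808
Swap:      6291452    1030736    5260716'

used=$(printf '%s\n' "$free_output" | awk '/buffers\/cache/ {print $3}')
total=$(printf '%s\n' "$free_output" | awk '/^Mem:/ {print $2}')
pct=$(( used * 100 / total ))
echo "Programs are using $used of $total KiB ($pct%)"
```

For the sample figures, this reports roughly 54% of RAM in use by programs, matching the "a bit over half" reading given above.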
The Swap: line reveals how much swap space Linux is using. Swap space is disk space that's set aside as an adjunct to memory. Linux uses swap space when it runs out of RAM, or when it determines that RAM is better used for buffers or caches than to hold currently inactive programs. In this example, 1,030,736 KiB of swap space is in use, with 6,291,452 KiB total, for 5,260,716 free. Swap space use is generally quite low, and if it rises very much, you can suffer performance problems. In the long run, increasing the computer's RAM is generally the best solution to such problems. If you're suffering from performance problems because of excessive swap use and you need immediate relief, terminating some memory-hogging programs can help. Memory leaks, described earlier, can lead to such problems, and terminating the leaking program can restore system performance to normal.
The free command supports a number of options, most of which modify its display format. The most useful of these is -m, which causes the display to use units of mebibytes (MiB) rather than the default of kibibytes (KiB).
Many programs that run in the background (that is, daemons) write information about their normal operations to log files, which are files that record such notes. Consulting log files can therefore be an important part of diagnosing problems with daemons. The first step in doing this is to locate your log files. In some cases, you may need to tell the program to produce more verbose output to help track down the problem, so I provide some pointers on how to do that. Finally, I describe the kernel ring buffer, which isn't technically a log file but can fill a similar role for kernel information.
Linux stores most log files in the /var/log directory tree. Some log files reside in that directory, but some servers create entire subdirectories in which to store their own log files. Table 9.2 summarizes some common log files on many Linux systems. In addition, many server programs not described in this book add their own log files or subdirectories of /var/log. If you experience problems with such a server, checking its log files can be a good place to start troubleshooting.
Log file details vary between distributions, so some of the files in Table 9.2 may not be present on your system, or the files you find may have different names.
Log file rotation typically occurs late at night, so it won't happen if you shut your computer off overnight. Periodically leave the computer running overnight to ensure that log files are rotated.
Log files are frequently rotated, meaning that the oldest log file is deleted, the latest log file is renamed with a date or number, and a new log file is created. For instance, if it's rotated on December 1, 2012, /var/log/messages will become /var/log/messages-20121201, /var/log/messages-1.gz, or something similar, and a new /var/log/messages will be created. This practice keeps log files from growing out of control.
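In essence, one rotation step is just a rename followed by the creation of a fresh, empty file. The following sketch performs that step by hand in a scratch directory; real systems use a dedicated tool (commonly logrotate), which also compresses old logs and deletes the oldest ones:

```shell
# A hand-rolled illustration of one log rotation step in a scratch
# directory. Real rotation is handled by a tool such as logrotate.
logdir=$(mktemp -d)
echo "old entries" > "$logdir/messages"

stamp=20121201                       # in practice: $(date +%Y%m%d)
mv "$logdir/messages" "$logdir/messages-$stamp"
: > "$logdir/messages"               # create a new, empty log file

ls "$logdir"
```

After this runs, messages-20121201 holds the old entries and messages is empty, ready for new ones.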
Most log files are plain text files, so you can check them using any tool that can examine text files, such as less or a text editor. One particularly handy command is tail, which displays the last ten lines of a file (or as many lines as you specify with the -n option). For instance, typing tail /var/log/messages shows you the last ten lines of that file.
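The -n option is worth a quick demonstration. The sketch below builds a throwaway twelve-line file (standing in for /var/log/messages, which requires root access to read on many systems) and shows just its last three lines:

```shell
# Demonstrate tail -n on a throwaway file standing in for a log.
log=$(mktemp)
for i in 1 2 3 4 5 6 7 8 9 10 11 12; do
    echo "entry $i" >> "$log"
done

last3=$(tail -n 3 "$log")   # last three lines: entries 10, 11, 12
echo "$last3"
```

Adding -f (as in tail -f /var/log/messages) makes tail keep the file open and print new lines as they're appended, which is handy for watching a daemon's activity live.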
Note that not all programs log messages. Typically, only daemons do so; ordinary user programs display error messages in other ways—in GUI dialog boxes or in a text-mode terminal. If you think a program should be logging data but you can't find it, consult its documentation. Alternatively, you can use grep to try to find the log file to which the program is sending its messages. For instance, typing grep sshd /var/log/* finds the files in which the string sshd (the SSH daemon's name) appears.
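When you search this way, grep's -l option is especially convenient: it prints only the names of the files that contain the pattern, rather than every matching line. The sketch below uses a scratch directory standing in for /var/log:

```shell
# grep -l reports which files contain a pattern, which is handy
# when hunting for the log file a daemon writes to. A scratch
# directory stands in for /var/log here.
dir=$(mktemp -d)
echo 'sshd[1234]: Accepted password for rodsmith' > "$dir/auth.log"
echo 'kernel: usb 1-1: new device'                > "$dir/messages"

hits=$(grep -l sshd "$dir"/*)
echo "$hits"
```

On a real system, grep -l sshd /var/log/* narrows your attention to the one or two files worth opening in less.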
CREATING LOG FILES
Some programs create their own log files; however, most rely on a utility known generically as the system log daemon to do this job. This program's process name is generally syslog or syslogd. Like other daemons, it's started during the boot process by the system startup scripts. Several system log daemon packages are available. Some of them provide a separate tool, klog or klogd, to handle logging messages from the kernel separately from ordinary programs.
You can modify the behavior of the log daemon, including adjusting the files to which it logs particular types of messages, by adjusting its configuration file. The name of this file depends on the specific daemon in use, but it's typically /etc/rsyslog.conf or something similar. The details of log file configuration are beyond the scope of this book, but you should be aware that such details can be altered. This fact accounts for much of the distribution-to-distribution variability in log file features.
Once it's running, a log daemon accepts messages from other processes using a technique known as system messaging. It then sorts through the messages and directs them to a suitable log file depending on the message's source and a priority code.
Sometimes log files don't provide enough information to pin down the source of a problem. Fortunately, many programs that produce log file output can be configured to produce more such output. Unfortunately, doing so can sometimes make it harder to sift through all the entries for the relevant information.
The procedure for increasing the verbosity of log file output varies from one program to another. Typically, you must set an option in the program's configuration file. You should consult the program's documentation to learn how to do this.
Because the kernel ring buffer has a limited size, its earliest entries can be lost if the computer runs for a long time or if something produces many entries.
The kernel ring buffer is something like a log file for the kernel; however, unlike other log files, it's stored in memory rather than in a disk file. Like regular log files, its contents continue to change as the computer runs. To examine the kernel ring buffer, you can type dmesg. Doing so creates copious output, though, so you'll typically pipe the output through less:
$ dmesg | less
Alternatively, if you know that the information you want will be associated with a particular string, you can use grep to search for it. For instance, to find kernel ring buffer messages about the first hard disk, /dev/sda, you might type the following:
$ dmesg | grep sda
Kernel ring buffer messages can be particularly arcane; however, they can also be invaluable in diagnosing hardware and driver problems, since it's the kernel's job to interface with hardware. You might try searching the kernel ring buffer if a hardware device is behaving strangely. Even if you don't understand a message you find, you could try feeding that message into a Web search engine or passing it on to a more knowledgeable colleague for advice.
Some distributions place a copy of the kernel ring buffer, as it existed when the system first booted, in /var/log/dmesg or a similar file. You can consult this file if the computer has been running for long enough that the buffer's earliest entries have been lost. If you want to create such a file on a distribution that doesn't do so by default, you can edit the /etc/rc.d/rc.local file and add the following line to its end:
dmesg > /var/log/dmesg
An operating Linux computer can be thought of as consisting of running programs—that is, processes. Managing processes begins with managing the programs that are installed on the computer, which is a task you can perform with package management tools such as rpm or dpkg. You can learn what processes are running by using tools such as ps and top. Log files can help you learn about the actions of daemons, which may not be able to communicate error messages through the type of text-mode or GUI output that other programs can generate.
SUGGESTED EXERCISES