Process and System Management
When working with Linux from an administrative perspective, managing processes is important. Every application or task you start on a Linux computer is started as a process. You will find that in some instances a task may hang, or something else may happen that requires you to do some process management. In this chapter, you will learn how to monitor and manage processes. You will also learn how to schedule processes for automatic startup.
Understanding Linux Processes
When your computer boots, it will start a kernel. The kernel in turn is responsible for starting the first process, which on modern distributions is the systemd process. This process is responsible for all other processes. When starting a process, systemd starts the process as a child of its own. If for instance you’re working from a bash session in a GNOME graphical environment, systemd starts gnome-terminal, and from there bash is started. So gnome-terminal is the parent process for bash, and systemd is its grandparent.
To get an overview of the relations between parent and child processes, you can use the pstree command, of which partial output is shown in listing 9-1.
To run a process, the Linux kernel works with a queue of runnable processes. In this queue, every process waits for its turn to be serviced by the scheduler. By default, Linux works with time slices for process handling. This means that every process gets a fair amount of system time before it has to make place for other processes. If a process needs more attention, you can use the nice command to increase (or decrease if necessary) the system time that is granted to the process. More on using nice on processes later in this chapter.
In some situations, you will have to stop a process yourself. This may happen if the process doesn’t reply anymore, or if the process behaves in a way that harms other processes. To stop a process, the Linux kernel will tell the responsible parent process that this process needs to be stopped. Under normal circumstances, the parent process that was responsible for starting a given process will always be present until all its children are stopped.
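A zombie arises in a different, abnormal situation: the child has already terminated, but its parent has not yet collected its exit status. The child then remains in the process table as a zombie (shown as defunct). Because a zombie is already dead, you cannot kill it from the command line; it disappears as soon as its parent reads the exit status. If the parent never does so, the usual remedy is to stop the parent. The zombie is then adopted by systemd, which cleans it up. Restarting your computer is a last resort only.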
You will find that if zombie processes occur, often the same processes are involved. That is because the occurrence of zombie processes is often due to bad programming. So you may have to update the software that creates the zombie process to get finally rid of your zombie processes. In the following sections, you will learn how to monitor and manage processes.
Apart from zombie status, processes can be in other states as well. You can see these states when using the ps aux command, which shows current process status; these are displayed in the STAT column (see Listing 9-2). Processes can be in the following states:
You should know that there are different kinds of processes. Among these are the service processes, the so-called daemons. An example is the httpd process, which provides web services on your system. Daemon processes are automatically started when your server boots and systemd is entering a specific target. A systemd target defines the state a system should be in, and the processes and services that should be started to get into that state. On the flip side are the interactive processes, which typically are started by typing some command at the command line of your computer.
Finally, there are two ways in which a process can do its work to handle multiple tasks. First, it can just launch a new instance of the same process to handle the incoming request. If this is the case, you will see the same process listed multiple times in ps aux. The alternative is that the process works with one master process only, but launches a thread, which is a kind of a subprocess, for each new request that comes in. Currently, processes tend to be multithreaded, as this uses system resources more efficiently. For a Linux administrator, managing a multi-threaded process is a bit more challenging. Threads are managed from the master process itself, and not by an administrator who’s trying to manipulate them from the command line.
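One way to check whether a process is multithreaded is to ask ps for the NLWP column, which reports the number of threads (lightweight processes) in a process. The following is a minimal sketch that uses a throwaway sleep process as its target; the variable names are chosen for illustration only.

```shell
# Start a throwaway, single-threaded process to inspect.
sleep 60 &
pid=$!

# NLWP is the number of threads (lightweight processes) in the process.
ps -o pid,nlwp,comm -p "$pid"

# Capture just the thread count; the trailing "=" suppresses the header.
threads=$(ps -o nlwp= -p "$pid" | tr -d ' ')
echo "PID $pid runs $threads thread(s)"

# Clean up the demo process.
kill "$pid"
```

To see every thread on a line of its own, ps -eLf can be used instead.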
All work on processes that you’ll need to do will start by monitoring what the process is doing. Two commands are particularly important: top and ps. The ps command allows you to display a list of all processes that are running on your computer. Because ps lists all processes (when used as root), that makes it an excellent choice if you need to find a given process to perform management tasks on it. The top command gives an overview of the most active processes.
This overview is refreshed every 5 seconds by default. As it also offers you a possibility to perform management tasks on these active processes, top is a very useful command for process management, especially for users who are taking their first steps on the Linux command line.
The single most useful utility for process management is top. You can start it by typing the top command at the command line. Make sure that you have root permissions when doing this; otherwise, you can’t do process management. In Listing 9-3, you can see what the top screen looks like.
top basically shows you all you need to know about the current status of your system, and it refreshes its output every 5 seconds by default. Its results are divided into two parts. On the top part of the output window, you can see how busy your system is; in the lower part, you’ll see a list of the busiest processes your computer currently has.
The upper five lines of the top output (see Listing 9-3) show you what your system currently is doing. This information can be divided into a few categories:
The load average numbers give an overview of the average number of processes that have been waiting to be serviced in the indicated period. In general, the number that you see here shouldn’t exceed the number of CPU cores in your computer (but exceptions to this generic guideline do exist).
The lower part of the top output shows you process information, divided in a couple of columns that are displayed by default. You should know that more columns are available than the ones displayed by default. If you want to activate the display of other columns, you should press the F key while in the top screen. This shows you a list of all columns that are available, indicating with an * which are currently active, as you can see in Listing 9-4. To toggle the status of a column, press the letter associated with that column. For instance, pressing J will show you which CPU was last used by a process.
The following list describes the columns that are listed by default:
By default, top output is sorted on CPU usage. You can sort the output on any other information as well; there are over 20 different ways to do so. Some of my favorites are listed here:
When done monitoring process activity with top, you can exit the utility. To do this, issue the q command. Apart from the interactive mode that you’ve just read about, you can also use top in batch mode. This is useful if you want to redirect the output of top to a file or pipe it to some other command. When using top in batch mode, you can’t use any of the commands discussed previously. You tell top to start in batch mode by passing some options to it when starting it:
For instance, the following would tell top to run in batch mode with a 5-second interval, doing its work two times:
top -b -d 5 -n 2
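As a minimal sketch of the redirection mentioned above, the following writes a single batch-mode iteration of top to a temporary file and then prints the summary area. The variable names and the use of mktemp are illustrative choices, not something prescribed by top.

```shell
# Write one batch-mode iteration of top to a temporary file.
snapshot=$(mktemp)
top -b -n 1 > "$snapshot"

# Show the summary area at the top of the output.
head -n 5 "$snapshot"

# Count the lines that were captured, then remove the temporary file.
lines=$(wc -l < "$snapshot")
echo "captured $lines lines"
rm -f "$snapshot"
```

Because the output is plain text, it can be filtered with tools such as grep or awk instead of head.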
EXERCISE 9-1: MONITORING PROCESSES WITH TOP
In this exercise you’ll start some processes and monitor behavior using top.
Finding Processes with ps
If you want to manage processes from scripts in particular, the ps command is invaluable. This command shows you a list of all processes that are currently active on your computer. ps has many options, but most people use it in two ways only: ps aux and ps -ef. The value of ps is that it shows all processes in its output in a way that you can grep for the information you need. Imagine that you see in top that there is a zombie process; ps aux | grep defunc will show you which is the zombie process. Or imagine that you need the PIDs of all instances of your Apache web server; ps aux | grep httpd will give you the result.
One way of displaying all processes and their properties is by using ps aux. Listing 9-5 shows a part of the output of this command. To make it more readable I’ve piped the results of this command through less.
In the command ps aux, three options are used to ask the system to show process information. First, the option a shows the processes of all users rather than just your own. Next, the option u selects the user-oriented output format, which adds columns such as the owner and the CPU and memory usage of each process. Finally, the option x also includes processes that are not attached to a terminal (TTY), such as daemons. You can see the results in Listing 9-5, in which the following columns are listed. Because many of these columns are similar to the columns in top, I will give a short description of them only.
Note The ps command can be used in two ways, both of which go back to the time when there were two major styles in UNIX versions: the BSD style and the System V style. The command ps aux was used in the BSD style to give a list of all processes and their properties, and ps -ef was used in System V style to do basically the same. There are some minor differences, but basically both commands have the same result. So feel free to make your choice here!
The second way in which the ps command is often used is by issuing the ps -ef command. You can see a partial output of this command in Listing 9-6.
Just two columns in ps -ef are new compared to the output for ps aux. First is the PPID column. This column tells you which process was responsible for starting this process, the so-called parent process. Then there is the column with the name C, which refers to the CPU utilization of this process and hence gives the same information as the %CPU column in ps aux.
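The PPID column can also be used to trace the ancestry of a process by hand. The following sketch starts at the current shell and follows the parent PIDs upward until no further parent can be found; the variable names are illustrative, and the result naturally differs per system.

```shell
# Walk from the current shell up through its ancestors using the PPID.
pid=$$
chain=""
while [ -n "$pid" ] && [ "$pid" -gt 0 ]; do
    # comm= prints only the command name, without a header line.
    name=$(ps -o comm= -p "$pid")
    chain="$chain$pid($name) "
    # Look up the parent of the process that was just recorded.
    pid=$(ps -o ppid= -p "$pid" | tr -d ' ')
done
echo "$chain"
```

The first entry in the output is the shell itself, and each following entry is the parent of the one before it.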
Personally, I like ps aux a lot if I need to terminate all processes that were started with the same command. On my SUSE box, it happens that the management program YaST crashes. This program basically uses two kinds of processes: processes that have yast in their command name and processes that have y2 in their command line. To get a list of PIDs for these processes, I use the following commands:
ps aux | grep yast | grep -v grep | awk '{ print $2 }'
ps aux | grep y2 | grep -v grep | awk '{ print $2 }'
Next, it is fairly easy to kill all instances of this process based on the list of PIDs that these two commands will show. You’ll read more about this in the section “Killing Processes with kill, pkill, and killall” later in this chapter.
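The grep -v grep step in these commands is needed because grep would otherwise find its own command line in the process list. A common alternative is to put one character of the pattern between square brackets. The sketch below compares both approaches on a throwaway sleep process; the argument 3141 and the variable names are arbitrary choices for this demonstration.

```shell
# Start a demo process with an easily recognizable argument.
sleep 3141 &
demo_pid=$!
sleep 1   # give the child time to start

# Classic approach: filter out the grep process itself with grep -v.
pids_classic=$(ps aux | grep 'sleep 3141' | grep -v grep | awk '{ print $2 }')

# Bracket trick: the pattern [s]leep matches "sleep", but grep's own
# command line contains the literal text "[s]leep", which does not match.
pids_bracket=$(ps aux | grep '[s]leep 3141' | awk '{ print $2 }')

echo "classic: $pids_classic"
echo "bracket: $pids_bracket"
echo "started: $demo_pid"

# Clean up the demo process.
kill "$demo_pid"
```

Both lists should contain the PID of the demo process.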
Another useful way of showing process activity with ps is by using ps fax. The option f shows the process list in a forest view, which allows you to easily see relations between parent and child processes. This offers an alternative way of showing parent-child relations to the pstree command.
In the preceding section, you read how you can find processes with ps and grep. There is a different option also: the pgrep command. This command is fairly easy to use: enter pgrep followed by the name of the process whose PID you are looking for, and as a result you will see all PIDs that instances of this process currently are using. For instance, if you want to know all PIDs that the GNOME processes are using, use pgrep gnome. This will display a result similar to what you see in Listing 9-7.
A useful feature of pgrep is that you can search for processes based on specific attributes as well. For instance, you can use -u to locate processes that are owned by a specific user, as in the following command:
pgrep -u linda
Also useful is that you can have it display processes if you are not sure about a property. For example, if you want to see processes that are owned by either linda or lori, use the following:
pgrep -u linda,lori
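To try these options without depending on a particular user account, the following sketch searches for a throwaway sleep process owned by the current user. The options -l (also print the process name) and -x (match the name exactly) are not discussed above; the variable names are illustrative.

```shell
# Start a demo process owned by the current user.
sleep 3142 &
demo_pid=$!
sleep 1   # give the child time to start

# pgrep -u accepts user names as well as numeric user IDs.
uid=$(id -u)

# -l prints the process name next to each PID; -x requires an exact name match.
pgrep -u "$uid" -l -x sleep

# Without -l, only the PIDs are printed, which is convenient in scripts.
found=$(pgrep -u "$uid" -x sleep)

# Clean up the demo process.
kill "$demo_pid"
```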
Showing Parent-Child Relations with pstree
For process management purposes, it is useful to know about parent-child relations between processes as well. You can use the pstree command without arguments to show a complete hierarchical list of all processes on your computer, or with a PID as an argument to show a process tree that starts on the selected PID only. If the output of pstree looks weird, you should use the -G option to give the result of pstree in a specific format for your terminal.
I need this to ensure proper display in a PuTTY window, for example. In Listing 9-8, you can see a partial output of this command.
In the output of pstree, you can see which process is responsible for calling which other process. For instance, in Listing 9-8, init is the first process that is started. The output of this command was generated on an older Linux distribution, where init was used as the service manager instead of the more recent systemd.
This process calls basically all the other processes such as acpid, application-bro, and so on. If a process has started other processes, you will see that with pstree as well. For instance, you can see that the pstree command used for this example listing actually is in the output listing as well, as a child of the bash process, which in turn is started from an SSH environment.
Note Some people like to run a graphical user interface on their server; some people don’t. From the process perspective, it certainly makes sense not to run a GUI on your server. If you are not sure this really is useful, you should compare the result of pstree on a server that does have a GUI up and running with the result of the same command on a server that does not have a GUI up and running. You’ll see amazing differences as the result.
At this point you know how to monitor the generic state of your computer. You have read how to see what processes are doing and know about monitoring process activity. In this section, you’ll learn about some common process management tasks. These include killing processes that don’t listen anymore and adjusting process priority with nice. In a dedicated subsection, you can read how to manage processes from the top utility.
Killing Processes with kill, pkill, and killall
Among the most common process management tasks is the killing of processes. Killing a process, however, goes beyond the mere termination of a process. If you use the kill command or any of its alternatives, you can send a signal to the process. By sending it a signal, you give the process an instruction. Most signals can be handled or even ignored by the process that receives them, but SIGKILL cannot. Linux defines 31 standard signals, but of these only four are common. Table 9-1 gives an overview of these common signals.
Table 9-1. Common Process Management Signals
| Signal | Value | Comment |
|---|---|---|
| SIGHUP | 1 | Forces a process to reread its configuration without really stopping the process. Use it to apply changes to configuration files. |
| SIGKILL | 9 | Terminates the process using brute force. You risk losing data from open files when using this signal. Use it only if the process doesn’t stop after sending it a signal 15. |
| SIGTERM | 15 | Requests the process to terminate. The process may ignore this. |
| SIGUSR1 | 10 | Sends a specific user-defined signal to the process. Only works if defined within the command. The value is 10 on most Linux architectures, including x86; use the signal name to be safe. |
When sending a signal to the process, you normally can choose between the signal name or the signal number. In the next three sections, you will see how to do this with the kill, pkill, and killall commands.
Killing processes with kill
The kill command provides the most common way to send signals to processes, and you will find it quite easy to use. This command works with only two arguments: the signal number or name and the PID upon which you want to act. If you don’t specify a signal number, kill by default sends signal 15, asking the process to terminate.
kill does not work with process names, just PID numbers. This means you first have to find the PIDs of the processes you want to send a signal to, which you can do with a command such as pgrep. You can specify multiple PIDs as arguments to kill. The following example shows you how to kill three PIDs with a single command:
kill 3019 3021 3022
Only some commands listen to user-defined signals. An example of these is the dd command, which you can use to clone a device. You can send this command signal USR1, which will ask dd to show its current progress. To find out whether a command listens to one of the USR signals, go to the man page for that command.
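The following sketch shows a signal being sent by name and by number, and how the shell reports the result. It uses two throwaway sleep processes; the variable names are illustrative. When a process is ended by a signal, the shell's wait builtin returns 128 plus the signal number.

```shell
# Start two throwaway processes to signal.
sleep 300 &
first=$!
sleep 300 &
second=$!

# Signal by name; equivalent to "kill -15 $first" or plain "kill $first".
kill -TERM "$first"
term_status=0
wait "$first" || term_status=$?

# Signal by number; SIGKILL cannot be caught or ignored.
kill -9 "$second"
kill_status=0
wait "$second" || kill_status=$?

# Death by signal is reported as 128 + signal number.
echo "SIGTERM exit status: $term_status"   # 143 = 128 + 15
echo "SIGKILL exit status: $kill_status"   # 137 = 128 + 9
```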
Killing processes with killall
Compared to kill, killall is a more versatile command, specifically due to its options, which give you several ways to select the processes you want to kill. For instance, you can use killall to terminate all processes that are executing a specific program file by specifying the path to that file. Some of the most useful options for killall are listed here:
For example, if you want to kill all processes that linda currently is running, use the following command:
killall -u linda
Or if you need to terminate all http processes, use regular expressions as in the following command:
killall -r http
Killing processes with pkill
The third command that you can use to send signals to processes is pkill. Like killall, pkill can also look up processes based on their name or other attributes, which you can address using specific options. For instance, to kill all processes that are owned by user linda, use the following:
pkill -u linda
Another useful feature of pkill is that you can kill processes by their parent ID. For example, if you need to kill all processes that have process 1499 as their parent ID, use the following:
pkill -P 1499
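To see the effect of -P without touching real services, the following sketch creates its own small process family: a parent shell with two sleep children. It first counts the children with pgrep, which accepts the same selection options as pkill, and then terminates them. The -c option of pgrep (print a count) and all variable names are additions for this illustration.

```shell
# A parent shell that starts two children and waits for them.
sh -c 'sleep 300 & sleep 300 & wait' &
parent=$!
sleep 1   # give the children time to start

# Count the children first; pgrep and pkill select processes the same way.
before=$(pgrep -c -P "$parent")
echo "children before pkill: $before"

# Send the default signal (SIGTERM) to every child of the parent.
pkill -P "$parent"

# With its children gone, the parent's own wait returns and it exits.
wait "$parent"
echo "parent $parent has exited"
```

Running pgrep with the same options before pkill is a cheap way to verify which processes are about to receive the signal.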
Adjusting Process Priority with nice
As discussed earlier in this chapter, every process is started with a default priority. You can see the priority in the default output of the top command. By default, all processes that have the same priority are treated as equal by the operating system. If within these priorities you want to give more CPU time to a process, you can use the nice and renice commands to change their nice status. Process niceness ranges from -20 to 19. -20 means that a process is not very nice and will get the most favorable scheduling. 19 means that a process is very nice to others and gets the least favorable scheduling.
There are two ways to change the niceness of a program: use nice to start a program with the niceness that you specify, and use renice to change the niceness of a program that has already been started. The following shows how to start top with a niceness of 5:
nice -n 5 top
In case you need to change the nice value for a program that is already running, you should use renice. A useful feature is the option to change the nice status of all processes that a given user has started. For instance, the following command would change the niceness of all processes linda has started to the value -5:
renice -5 -u linda
You can also just use a PID to change the nice value of a process:
renice -5 1499
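The sketch below combines both commands on a throwaway sleep process and verifies the result with the NI column of ps. It only raises the nice value, because lowering it normally requires root privileges. The -n and -p options of renice and the variable names are choices made for this example.

```shell
# Start a process with an increased nice value.
nice -n 5 sleep 300 &
pid=$!
sleep 1   # give nice time to apply the adjustment and start sleep

# The NI column shows the current nice value.
ps -o pid,ni,comm -p "$pid"

# Make the running process even nicer; non-root users may only raise the value.
renice -n 10 -p "$pid"

# Read back the new nice value; the trailing "=" suppresses the header.
ni=$(ps -o ni= -p "$pid" | tr -d ' ')
echo "nice value of PID $pid is now $ni"

# Clean up the demo process.
kill "$pid"
```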
You have already learned how to monitor processes using top. You’ve also learned how to manage processes using different command-line tools. From within the top interface, you can also perform some process management tasks. Two tasks are available: you can send processes a signal using kill functionality, and you can renice a process using nice functionality. To do this, use the following options from within the top interface:
EXERCISE 9-2: MANAGING PROCESSES
In this exercise you’ll learn how to manage processes using kill, killall, and nice.
On your computer, some processes will start automatically. In particular, these are the service processes your computer needs to do its work. Other processes are started manually. This means that you have to type a command at the command line to start them. There is also a solution between these two options. If you need a certain task to start automatically at predefined intervals, you can use cron to do so.
There are two parts in cron. First is the cron daemon crond. This process starts automatically on all computers and will check its configuration every minute to see whether it has to issue a certain task. By default, cron reads its master configuration file, which is /etc/crontab. Listing 9-9 shows what this file looks like on an Ubuntu server system.
Caution! The file /etc/crontab directs all tasks that should be scheduled through cron. Notice that you should NOT modify this file directly. After the discussion of the contents of this file, you’ll read how you should make changes to the configuration of scheduled tasks through cron. Modifications that have been made to /etc/crontab will work, but you might lose them as this file can be overwritten during package updates.
In all crontab configuration, you will find three different elements. First, you can see an indication of the time when a command should run. Next is the name of the user with whose permissions the job has to execute, and the last part is the name of the command that has to run.
You can use five time positions to indicate when a cron job has to run:
For instance, a task definition in /etc/crontab can look as follows:
10 5 3 12 * nobody /usr/bin/false
This task would start 10 minutes after 5 a.m. on December 3 only. A very common error that people make is shown in the following example:
* 5 * * * nobody /usr/bin/false
The purpose of this line is probably to run a task at 5 a.m. every morning; however, it would run every minute between 5:00 a.m. and 5:59 a.m., because the minute specification is an asterisk, which means “every.” Instead, to run the task at 5 a.m. only, the following should be specified:
0 5 * * * nobody /usr/bin/false
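To make the layout of such a line explicit, the following sketch splits the first example into its fields with the shell's read builtin. The field order (minute, hour, day of month, month, day of week, then user and command) is the standard /etc/crontab order; the variable names are my own labels.

```shell
# Split an /etc/crontab line into five time fields, the user, and the command.
read -r minute hour dom month dow user command <<EOF
10 5 3 12 * nobody /usr/bin/false
EOF

# An asterisk in a time field means "every".
echo "minute=$minute hour=$hour day-of-month=$dom month=$month day-of-week=$dow"
echo "user=$user command=$command"
```

Reading the fields in this order shows why an asterisk in the first position means every minute of the selected hour.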
Apart from the system crontab, individual users can have crontabs as well. This is normally the common way to make adjustments to the scheduled tasks. Imagine that you want to make a backup every morning. To do so, you probably have a backup program, and this backup program may run automatically with the permissions of a specific user. You can, of course, make the definition in /etc/crontab, with the disadvantage that only root can schedule jobs this way. Therefore, the alternative in which users themselves specify the cron job may be more appealing. To do this, you have to use the crontab command. For instance, if user linda wants to install a cron job to send a mail message to her cell phone every morning at 6 a.m., she would use the following command:
crontab -e
This opens an editor window in which she can define the tasks that she wants cron to run automatically. Because the crontab file will be installed as her crontab file, there is no need to include a user specification there. This means just including the following line would be enough:
0 6 * * 1-5 mail -s "wakeup" [email protected] <.
Notice the use of 1-5 in the specification of the day of the week. This tells the cron process to run this job only on days 1 through 5, which is from Monday to Friday.
If you are logged in as the root user, you can also create cron jobs for other users. To do this, use crontab -u followed by the name of the user you want to create the cron job for, together with the -e option. The command crontab -u linda -e, if issued as root for example, would open the crontab editor for user linda, which allows you to enter all the required commands. Also useful if you are root: the command crontab -l -u followed by a user name gives an overview of all the crontab jobs that are currently scheduled for that user account.
Understanding cron.{hourly|daily|weekly|monthly}
Cron also uses four different directories to execute cron jobs at a regular interval. These are the directories /etc/cron.hourly, /etc/cron.daily, /etc/cron.weekly and /etc/cron.monthly. In these directories you can put scripts that will be executed at the indicated intervals. You can create these scripts as administrator, but many scripts will be placed here automatically when new packages are installed. The logrotate processes for instance are executed this way.
These scripts contain bash shell scripting code, and they don’t contain any of the time indicators that are specific to cron. These are not needed, because the cron helper process anacron takes care of executing these scripts. Anacron was developed to ensure that specific tasks will be executed at a guaranteed interval. That ensures that the task will also run if the system has been down for maintenance temporarily.
Yet another way of running tasks through cron is by creating files in /etc/cron.d. All files in the directory /etc/cron.d will be included when the cron process is started. Using this approach offers an alternative to making modifications to the /etc/crontab file. The advantage of this approach is that your changes won’t get lost during software updates. The format of the files in /etc/cron.d is exactly the same as that of the lines that are added in /etc/crontab.
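As an illustration, a file in /etc/cron.d could look like the following. The file name and the script path are hypothetical; only the line format, which matches /etc/crontab, comes from the description above.

```
# /etc/cron.d/nightly-backup (hypothetical file name)
# Format: five time fields, the user to run as, then the command.
# Run the (assumed) backup script as root at 2:30 a.m. every day.
30 2 * * * root /usr/local/bin/backup.sh
```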
Summary
In this chapter, you have learned how to tune and manage processes and memory on your computer. You have learned about the way that Linux works with processes and also about memory usage on Linux. You acquired knowledge about some of the most important commands related to process management, including top and ps. In this chapter, the following commands and utilities have been discussed:
In the next chapter, you’ll learn how to configure system logging.