7

Controlling and Managing Processes

On a typical Linux server, there can be over a hundred processes running at any given time. The purposes of these processes range from system services, such as the Network Time Protocol (NTP) service, to processes that serve information to others, such as the Apache web server. As an administrator of Ubuntu servers, you will need to be able to manage these processes, as well as the resources available to them. In this chapter, we’ll take a look at process management, including job control, the ps command, and more.

As we work through these concepts, we will cover the following topics:

  • Managing jobs
  • Understanding the ps command
  • Changing the priority of processes
  • Dealing with misbehaving processes
  • Managing system processes
  • Scheduling tasks with cron

To begin our exploration of managing processes, let’s take a look first at managing jobs. Not only will this help us understand the concepts better, but it will also provide us with a better understanding of backgrounding and foregrounding.

Managing jobs

Up until now, everything we have been doing on the shell has been right in front of us, from execution to completion. We’ve installed applications, run programs, and walked through various commands. Each time, we’ve had control of our shell taken from us, and we’ve only been able to start a new task when the previous one had finished. For example, if we were to install the vim-nox package with the apt install command, we would watch helplessly while apt takes care of fetching the package and installing it for us.

While this is going on, our cursor goes away and our shell completes the task for us without allowing us to queue up another command. We can always open a new shell to the server and multitask by having two windows open at once, each doing different tasks. But that’s likely not going to be the most efficient method of multitasking when working with the command line.

Instead, we can actually background a process without waiting for it to complete while working on something else. Then, we can bring that process back to the front to return to working on it or to check whether or not it finished successfully. Think of this as a similar concept to a windowing desktop environment, or user interfaces on the Windows or macOS operating systems. We can work on an application, minimize it to get it out of the way, and then maximize it to continue working with it. Essentially, that’s the same concept as backgrounding a process in a Linux shell.

So how exactly do you background and foreground a process? This concept can be somewhat difficult to explain. In my opinion, the easiest way to learn a new concept is to try it out, and the easiest example I can think of is by (yet again) using a text editor. I promise that this time, using a text editor as an example won’t be boring. In fact, this example is extremely useful and may just become a part of your daily workflow. To do this exercise, you can use any command-line text editor you prefer, such as Vim or Nano. On Ubuntu Server, nano is usually installed by default, so you already have it if you want to go with that. If you prefer to use Vim, feel free to install the vim-nox package if you haven’t already installed it:

sudo apt install vim-nox 

You can actually install vim rather than vim-nox, but I always default to vim-nox since it features built-in support for scripting languages.

Again, feel free to use whichever text editor you feel comfortable with. In the following examples, I’ll be using nano, but if you use vim, just replace nano with vim every time you see it.

Anyway, to see backgrounding in action, open up your text editor. Feel free to open a file or just start a blank session. (If in doubt, type nano and press Enter.) With the text editor open, we can background it at any time by pressing Ctrl + z on our keyboard.

If you are using vim instead of nano, you can only background vim when you are not in insert mode, since it captures Ctrl + z rather than passing it to the shell.

Did you see what happened? You were immediately taken away from your editor and returned to the shell so you can now get back to executing commands. You should have seen some output similar to the following:

[1]+ Stopped nano 

Here, we see the job number of our process, its status, and then the name of the process. Even though the process of your text editor shows a status of Stopped, it hasn’t been closed: it’s suspended in the background and still exists on the system, ready to be resumed. You can confirm this with the following command:

ps au | grep nano 

In my case, I see the nano process listed with a PID of 43231 (the T in the output indicates that it’s stopped):

jay        43231  0.0  0.1   5468  3632 pts/0    T    11:27   0:00 nano 
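The same check can be scripted. The following is a minimal sketch rather than part of the original exercise: it uses sleep as a stand-in for the editor and kill -STOP as a stand-in for Ctrl + z (which sends the closely related SIGTSTP signal), since a script has no keyboard to press. The variable names are made up for illustration.

```shell
# Stand-in for a long-running program such as a text editor.
sleep 30 &
pid=$!

# Suspend it, roughly what Ctrl + z does in an interactive shell.
kill -STOP "$pid"
sleep 1   # give the signal a moment to be delivered

# Ask ps for just the state column; a stopped process reports T.
stopped_state=$(ps -o stat= -p "$pid" | tr -d ' ')
echo "After STOP: $stopped_state"

# Resume it (fg and bg do this by sending SIGCONT), then check again.
kill -CONT "$pid"
sleep 1
resumed_state=$(ps -o stat= -p "$pid" | tr -d ' ')
echo "After CONT: $resumed_state"

# Clean up the stand-in process.
kill "$pid"
wait "$pid" 2>/dev/null || true
```

The process keeps the same PID throughout, which is the point: stopping a process pauses it, and nothing is lost until it is actually terminated.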

At this point, I can execute additional commands, navigate around my filesystem, and get additional work done. When I want to bring my text editor back, I can use the fg command to foreground the process, which will resume it. If I have multiple background processes, the fg command will bring back the one I was working on most recently.

I gave you an example of the ps command to show that the process was still running in the background, but there’s actually a dedicated command for that purpose, and that is the jobs command.

If you execute the jobs command, you’ll see in the output a list of all the jobs that your current shell has in the background:

Figure 7.1: Running the jobs command after backgrounding two nano processes

The output shows that I have two nano sessions in use, one modifying file1.txt, and the other modifying file2.txt. If I were to execute the fg command, that would bring up the nano session that’s editing file2.txt, since that was the last one I was working in. That may or may not be the one I want to return to editing, though. Since I have the job ID on the left, I can bring up a specific background process by using its ID with the fg command:

fg 1 

Knowing how to background a process can add quite a bit to your workflow. For example, let’s say, hypothetically, that I’m editing a config file for a server application, such as Apache. While I’m editing this config file, I need to consult the documentation (man page) for Apache because I forgot the syntax for something. I could open a new shell and an SSH session to my server and view the documentation in another window. This could get very messy if I open up too many shells. It would be much simpler to background the current nano session, read the documentation, and then foreground the process with the fg command to return to working on it, all from one SSH session!

To background a process, you don’t have to use Ctrl + z; you can actually background a process right when you execute it by entering a command with the ampersand symbol (&) typed at the end. To show you how this works, I’ll use htop as an example. Admittedly, this may not necessarily be the most practical example, but it does work to show you how to start a process and have it backgrounded right away.

We might not have htop installed yet, but for now, feel free to install this package (if it isn’t already) and then run it with the ampersand symbol:

sudo apt install htop 
htop & 

The first command, as you already know, installs the htop package on our server. With the second command, I’m opening htop but backgrounding it immediately. What I’ll see when it’s backgrounded is its job ID and process ID (more on this in the next section). Now, at any time, I can bring htop to the foreground with fg. Since I just backgrounded it, fg will bring htop back since it considers it the most recent. As you know, if it wasn’t the most recent, I could reference its job ID with the fg command to bring it back even if it wasn’t my most recently used job. Go ahead and practice using the ampersand symbol with a command and then bringing it back to the foreground. In the case of htop, it can be useful to start it, background it, and then bring it back anytime you need to check the performance of your server.
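The ampersand works the same way inside scripts. As a rough sketch (using sleep as a harmless stand-in for a real workload, with hypothetical variable names), the special variable $! holds the PID of the most recently backgrounded command, and the wait builtin pauses until that command finishes:

```shell
# Start a stand-in task in the background; the shell returns immediately.
sleep 2 &
bg_pid=$!                 # PID of the command we just backgrounded
echo "Backgrounded PID: $bg_pid"

# The shell is free to do other work while the job runs.
echo "Doing other work in the meantime..."
jobs                      # lists this shell's background jobs

# Block until the background job finishes and capture its exit status.
bg_status=0
wait "$bg_pid" || bg_status=$?
echo "Background job finished with status $bg_status"
```

In an interactive session you would normally reach for fg instead of wait, but the idea is the same: the job keeps its own PID and can be picked up again later.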

Keep in mind, though, that when you exit your shell, all your backgrounded processes will close. If you have unsaved work in your text editors, you’ll lose what you were working on. For this reason, if you utilize background processes, you may want to check to see if you have any pending jobs still running by executing the jobs command before logging out.

In addition, you’ll probably notice that some applications background cleanly, while others don’t. In the case of using a text editor and htop, those applications stay paused in the background, allowing us to perform other tasks and then return to those commands later. However, some applications may still spit out diagnostic text regularly in your main window, whether they’re backgrounded or not. To get even more control over your Bash sessions, you can learn how to use a multiplexer, such as tmux or screen, to allow these processes to run in their own session such that they don’t interrupt your work. Going over the use of a program such as tmux is beyond the scope of this book, but it is a useful utility to learn if you’re interested.

Being able to background and foreground a process allows us to manage tasks on the command line more effectively, and is definitely useful. Now, we can expand this and look at viewing other processes on the server, including those that we didn’t manually start such as a text editor. In the next section, we’ll take a look at the ps command, which can help us understand what is actually running on our server.

Understanding the ps command

While managing our server, we’ll need to understand what processes are running and how to manage them. Later in this chapter, we’ll work through starting, stopping, and monitoring processes. But before we get to those concepts, we first need to be able to determine what is actually running on our server. The ps command allows us to do this.

Viewing running processes with ps

When executed by itself, the ps command will show a list of processes run by the user who called the command:

Figure 7.2: The output of the ps command, when run as a normal user and with no options

In Figure 7.2, you can see that when I ran the ps command as my own user with no options, it showed me a list of processes that I am running as myself. In this case, I have a vim session open (running in the background), and in the last line, we see ps itself, which is also included in the output.

On the left side of the output, you’ll see a number for each of the running processes. This is known as the Process ID (PID), which we mentioned in the Managing jobs section. Before we continue on, the PID is something that you really should be familiar with, so we may as well cover it right now.

Each process running on your server is assigned a PID, which differentiates it from other processes on your system. You may understand a process as vim, or top, or some other name. However, our server knows processes by their ID. When you open a program or start a process, it’s given a PID by the kernel. As you work on managing your server, you’ll find that the PID is useful to know, especially for the commands we’ll be covering in this very chapter. If you want to kill a misbehaving process, for example, a typical workflow would be for you to find the PID of that process and then reference that PID when you go to kill the process (which I’ll show you how to do in a later section). PIDs are actually more complex than just a number assigned to running processes, but for the purposes of this chapter, that’s the main purpose we’ll need to remember.

You can also use the pidof command to find the PID of a process if you know the name of it. For example, the earlier screenshot showed a vim process running with a PID of 1385. You can find that same PID by running the following command:

pidof vim 

The output will give you the PID(s) of the process without you having to use the ps command.
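When several processes share a name, pidof prints all of their PIDs on one line. The sketch below illustrates this with two sleep processes standing in for two vim sessions; the variable names are only for illustration:

```shell
# Start two stand-in processes with the same name.
sleep 30 &
first=$!
sleep 30 &
second=$!
sleep 1   # let both processes finish starting

# pidof prints every matching PID on one line, separated by spaces.
pids=$(pidof sleep)
echo "PIDs named sleep: $pids"

# Check that both of our PIDs are in the list.
found=0
for p in $pids; do
  if [ "$p" = "$first" ] || [ "$p" = "$second" ]; then
    found=$((found + 1))
  fi
done
echo "Matched $found of our 2 processes"

# Clean up the stand-in processes.
kill "$first" "$second"
wait 2>/dev/null || true
```

Because more than one PID can come back, it is worth double-checking which one you mean before acting on it.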

Configuring arguments to ps

Continuing with the ps command, there are several useful arguments you can give in order to change the way in which it produces an output. If you use the a option, you’ll see more information than you normally would:

ps a 

This will produce an output something like the following:

Figure 7.3: The output of the ps a command

With ps a, we’re seeing the same output as before, but with additional information, as well as column headings at the top. We now see a heading for PID, TTY, STAT, TIME, and COMMAND. From this new output, you can see that the vim processes I have running are editing a file named testfile.txt. This is great to know, because if I had more than one vim session open and one of them was misbehaving, I would probably want to know which one I specifically needed to stop.

We already saw the PID and COMMAND fields, although we didn’t see a formal heading at the top. The PID column we’ve already covered, so I won’t go into any additional detail about that. The COMMAND field tells us the actual command being run, which is very useful if we either want to ensure we’re managing the correct process or to see what a particular user is running (I’ll demonstrate how to display processes for other users soon).

The STAT field is new; we didn’t see it when we ran ps by itself. The STAT field gives us the status code of the process, which refers to which state the process is currently in. The state can be uninterruptible sleep (D), defunct (Z), stopped (T), interruptible sleep (S), or in the run queue (R). There is also paging (W), but that is not used anymore, so there’s no need to cover it. Uninterruptible sleep is a state in which a process is generally waiting on I/O (such as a disk operation) and cannot handle additional signals until that wait completes (we’ll briefly talk about signals later on in this chapter). A defunct process (also referred to as a zombie process) has, for all intents and purposes, finished its job but is waiting on the parent to perform cleanup. Defunct processes aren’t actually running, but remain in the process list and should normally disappear on their own. If such a process remains in the list indefinitely, keep in mind that it has already exited, so it’s usually the parent process that becomes a candidate for the kill command, which we will discuss later. A stopped process is generally a process that has been suspended and sent to the background, as we saw in the Managing jobs section. Interruptible sleep means that the program is idle: it’s waiting for an event, such as input, in order to awaken.
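To make the defunct state less abstract, here is a small sketch that deliberately creates a short-lived zombie. It is an illustration under assumptions rather than something you would do on a real server: a subshell starts a child (sleep 1) and then replaces itself with sleep 5, a program that never collects its child’s exit status. The --ppid option used here belongs to the procps version of ps that Ubuntu ships, and the variable names are hypothetical.

```shell
# The subshell backgrounds "sleep 1", then exec replaces the subshell
# with "sleep 5", which becomes the parent but never reaps its child.
(sleep 1 & exec sleep 5) &
parent=$!

# Wait until the child has exited while the parent is still alive.
sleep 2

# List the children of that parent; the finished child shows state Z.
ps -o pid,ppid,stat,comm --ppid "$parent" || true
child_state=$(ps -o stat= --ppid "$parent" | tr -d ' ')
echo "Child state: $child_state"

# Once the parent exits, the zombie is handed to init and cleaned up.
wait "$parent" || true
```

The zombie vanishes as soon as its parent goes away, which is why a lingering defunct process usually points to a problem in the parent.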

The TTY column tells us which TTY the process is attached to. TTY is short for teletypewriter, a term that comes from a much different time period. In the past, during the time of big mainframes, users would work with such computers through “terminals”: devices consisting of a monitor and keyboard, connected by a wire to the mainframe. Such devices could only display the output received from the mainframe and send it the data typed on the keyboard. Teletypewriter was the term used to refer to such devices. Obviously, we don’t use machines like these nowadays, but the concept is similar from a virtual standpoint.

On our server, we’re using our keyboard to send input to a device that then displays output to another device. In our case, the input device is our keyboard and the output device is our screen, which is either connected directly to our server or is located on our computer, which is connected to our server over a service such as SSH. On a Linux system, most processes run on a TTY, which is (for all intents and purposes) a terminal that grabs input and manages the output, similar to a teletypewriter in a virtual sense. A terminal is our method of interacting with our server.

In Figure 7.3, we have a process running on a TTY of tty1, and the other processes are running on pts/0. The TTY we see is the actual terminal device, and pts references a virtual (pseudo) terminal device. Our server is actually able to run several tty sessions, typically one to seven. Each of these can be running its own programs and processes. To understand this better, try pressing Ctrl + Alt + any function key, from F1 through F7 (if you have a physical keyboard plugged into a physical server). Each time, you should see your screen cleared and then moved to another terminal. Each of these terminals is independent of one another. Each of your function keys represents a specific TTY, so by pressing Ctrl + Alt + F6, you’re switching your display to TTY 6.

Essentially, you’re switching from TTY 1 through to TTY 7, with each being able to contain its own running processes. If you run ps a again, you’ll see any processes you start on those TTYs show up in the output as a tty session, such as tty2 or tty4. Processes that you start in a terminal emulator will be given a designation of pts, because they’re not running in an actual TTY, but rather a pseudo-TTY.

This was a long discussion for something that ends up being simple (TTY or pseudo-TTY), but with this knowledge, you should be able to differentiate between a process running on the actual server or through a shell.
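If you want to check which kind of terminal your own session is using, the following sketch may help. It relies on the standard tty command and on the tty output column of ps; the variable name is just for illustration:

```shell
# Report the terminal device attached to standard input, if there is one.
# Over SSH this is typically a pseudo-terminal such as /dev/pts/0.
if tty -s; then
  echo "Terminal: $(tty)"
else
  echo "Standard input is not a terminal"
fi

# Show the TTY column for the current shell ($$ is the shell's own PID).
# A "?" in this column means the process has no controlling terminal.
ps -o pid,tty,comm -p "$$"
shell_tty=$(ps -o tty= -p "$$" | tr -d ' ')
echo "This shell's TTY as ps sees it: $shell_tty"
```

Run from an SSH session, you would expect a pts entry; run from the physical console, a tty entry.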

Continuing, let’s take a look at the TIME field of our ps command output. This field represents the total amount of time the CPU has been utilized for that particular process. However, the time is 0:00 for each of the processes in the screenshot I’ve provided. This may be confusing at first. In my case, the vim processes in particular have been running for about 15 minutes or so since I took the screenshot, and they still show 0:00 utilization time even now. Actually, this isn’t the amount of time the process has been running, but rather the amount of time the process has been actively engaging with the CPU. In the case of vim, each of these processes is just a buffer with a file open. For the sake of comparison, the Linux machine I’m writing this chapter on has a process with a PID of 759 and a time of 92:51. PID 759 belongs to my X server, which provides the foundation for my graphical user interface (GUI) and windowing capabilities. However, this laptop currently has an uptime of 6 days and 22 hours as I type this, which is roughly equivalent to 166 hours, which is nowhere near the amount of time that PID 759 is reporting in its TIME entry. In this output format, TIME is displayed as minutes and seconds, so we can deduce that even though my laptop has been running for 6 days straight, the X server has only utilized 92 minutes and 51 seconds of actual CPU time. In summary, the TIME column refers to the amount of time a process has spent using the CPU in order to calculate something and is not necessarily equal to how long something has been running, or for how long a graphical process is showing on your screen.
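The difference between CPU time and elapsed time can be seen side by side. In this sketch, etime and time are standard ps output columns for elapsed wall-clock time and accumulated CPU time, and sleep is a stand-in for any mostly idle process (the variable names are hypothetical):

```shell
# An idle stand-in process: it exists for a few seconds but does no work.
sleep 10 &
pid=$!
sleep 3

# etime = how long the process has existed; time = CPU time it has consumed.
ps -o pid,etime,time,comm -p "$pid"
cpu_time=$(ps -o time= -p "$pid" | tr -d ' ')
elapsed=$(ps -o etime= -p "$pid" | tr -d ' ')
echo "Elapsed: $elapsed, CPU time: $cpu_time"

# Clean up the stand-in process.
kill "$pid"
wait "$pid" 2>/dev/null || true
```

The elapsed time keeps growing while the CPU time stays at zero, because a sleeping process is not asking the CPU to do anything.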

Let’s continue on with the ps command and look at some additional options. First, let’s see what we get when we add the u option to our previous example, which gives us the following example command:

ps au

This will produce an output that will look similar to the following:

Figure 7.4: The output of the ps au command

When you run it, you should notice the difference from the ps a command right away. With this variation, the output switches to a user-oriented format: each process is shown along with the user who owns it, plus extra columns such as %CPU and %MEM. (It’s the a option that includes terminal-attached processes belonging to other users; the u option adds the detail of who owns them.) When I run it, I see processes listed in the output for my user (jay), as well as one for root. The u option will be a common option you’re likely to use, since most of the time while managing servers, you’re probably more interested in keeping an eye on what kinds of shenanigans your users are getting themselves into. But perhaps the most common use of the ps command is the following variation:

ps aux 

With the x option added, we’re no longer limiting our output to processes within a TTY (either native or pseudo). The result is that we’ll see a lot more processes, including system-level processes that are not tied to a process we started ourselves. Go ahead and try it. In practice, though, the ps aux command is most commonly used with grep to look for a particular process or string. For example, let’s say you want to see a list of all nginx worker processes. To do that, you may execute a command such as the following:

ps aux | grep nginx 

Here, we’re executing the ps aux command as before, but we’re piping the output into grep, where we’re looking only for lines of output that include the string nginx. In practice, this is the way I often use ps, as well as the way I’ve noticed many other administrators using it. With ps aux, we are able to see a lot more output, and then we can narrow that down with search criteria by piping into grep. However, if all we wanted to do was to show processes with a particular command name, we could also do the following:

ps u -C nginx 

This would produce output containing a list of processes whose command name is nginx, along with related details. Another useful variation of the ps command is to sort the output so that the processes using the most CPU are listed first:

ps aux --sort=-pcpu 

Unfortunately, that command shows a lot of output, and we would have to scroll back to the top in order to see the top processes. Depending on your terminal, you may not have the ability to scroll back very far (or at all), so the following command will narrow it down further:

ps aux --sort=-pcpu | head -n 5 

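Now that is useful! With that example, I’m using the ps aux command with the --sort option, sorting by the percentage of CPU utilization (pcpu), with the leading minus sign requesting descending order. Then I’m piping the output into the head command, where I’m instructing it to show me only five lines (-n 5). Since the first of those lines is the column header, this gives me the four processes with the highest CPU usage (use -n 6 if you want five). Keep in mind that ps calculates this percentage over the lifetime of each process, so it’s an average rather than a snapshot of the current moment. In fact, I can do the same, but with the most-used memory instead: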

ps aux --sort=-pmem | head -n 5 

If you want to determine which processes are misbehaving and using an unusual amount of memory or CPU, those commands will help you narrow it down. The ps command is a very useful command for your admin toolbox. Feel free to experiment with it beyond the examples I’ve provided; you can consult the man pages for the ps command to learn even more tricks. In fact, the second section of the man page for ps (under examples) gives you even more neat examples to try out.
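Two small refinements to the filtering and sorting commands are sketched below, offered as suggestions rather than as part of the original examples. First, grep can match its own command line in the ps output; wrapping one letter of the pattern in brackets is a common way to avoid that. Second, selecting columns with -o keeps the sorted output narrow, and asking head for six lines leaves room for the header plus five processes. The sleep process is only a stand-in for a real service such as nginx, and the variable names are hypothetical:

```shell
# Stand-in for a service we want to find.
sleep 30 &
pid=$!
sleep 1   # let the process finish starting

# "[s]leep 30" still matches the text "sleep 30", but grep's own command
# line contains the literal "[s]leep 30", which the pattern does not match.
matches=$(ps aux | grep '[s]leep 30')
echo "$matches"
match_count=$(echo "$matches" | grep -c .)

# Header line plus the five processes with the highest CPU percentage.
top_cpu=$(ps -eo pid,user,pcpu,pmem,comm --sort=-pcpu | head -n 6)
echo "$top_cpu"

# Clean up the stand-in process.
kill "$pid"
wait "$pid" 2>/dev/null || true
```

Swapping pcpu for pmem in the --sort option gives the equivalent view sorted by memory usage.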

Now that we know how to inspect running processes, in the next section, we’ll take a look at how to change the priority of the processes to ensure those that are more important are given extra attention by the CPU.

Changing the priority of processes

Processes on a Linux system can be run with an altered priority, giving some processes more priority and others less. This gives you, the administrator, free rein when it comes to ensuring that the most important processes on the system are running with an adequate level of prioritization. There are dedicated commands for this purpose: nice and renice. These commands allow you to launch a process with a specific priority, or change the priority of a process that’s already running.

Nowadays, manually editing the priority of a process is something administrators will find themselves doing less often than they used to. A processor with 32 cores (or many more) is not all that uncommon, and neither is hundreds of gigabytes of RAM. Servers nowadays are certainly more powerful than they used to be, and are nowhere near as resource-starved as machines of old. Many servers (such as virtual machines) and containers are dedicated to a single task, so process tuning may not be of extreme value anymore. However, data processing firms and companies utilizing deep learning functions may find themselves needing to fine-tune some things.

Regardless of whether or not prioritizing processes is something that will be immediately useful to you, it’s a good idea to at least understand the concept just in case you do find yourself needing to increase or decrease the priority of a process some day. Let’s revisit the ps command, this time with the -l argument:

ps -l

The output of this command will appear as follows:

Figure 7.5: The output of the ps -l command

With the output of the ps -l command, notice the PRI and NI columns. PRI refers to the priority, and NI pertains to the “niceness” value, which we’ll discuss in more detail later in this section. In this example, each process that I’m running has a PRI of 80, and an NI of 0. I didn’t change or alter any of these; these are the values that I get when I start processes with no special tweaks. A PRI of 80 is the default starting value for processes, and it will change as we increase or decrease the niceness value.

As I mentioned, we have dedicated commands that allow us to alter priorities, nice and renice. To determine which to use, it all comes down to whether or not the process is already running. With regard to the processes listed in Figure 7.5, we would want to use renice to change the priority for those, since they’re all already running. If we wanted to launch a process with a specific priority right from the beginning, we would use nice instead.

For example, let’s change the priority of the vim session I have running. Sure, this is a somewhat lame example, as vim isn’t a very important process. In the real world, you’d be prioritizing processes that are actually important. In my case, since the vim process has a PID of 1789, the command I would need to run in order to change the niceness would become this:

renice -n 10 -p 1789

The output of this command will appear as follows:

Figure 7.6: Changing the priority of a process with renice

If we run ps -l again, we can see the new nice value for vim:

Figure 7.7: The output of the ps -l command after changing the priority of a process

The new nice value of 10 now shows up for vim under NI, and the PRI value has increased to 90. Now, this instance of vim will run at a lower priority than my other tasks, the reason being that the higher the nice value, the lower the priority. Notice that I didn’t use sudo with the command when I changed the priority. In this example, that’s okay because I’m increasing the nice value of the process, and that’s allowed. However, let’s try to decrease the nice value without sudo, using the following command:

renice -n 5 -p 1789

As you can see in the following output, I won’t be as successful:

Figure 7.8: Attempting to decrease the nice value of a process

My attempt to decrease the nice value from 10 down to 5 was blocked. If I were able to lower the niceness, then my process would be running at a higher priority. Instead, I received a Permission denied error. Users are allowed to increase the niceness of their processes, but are not allowed to decrease it, not even for processes they’ve initiated themselves. If you wish to decrease the nice value, you’ll need to do so with sudo. In other words, if you want to be “nicer,” you can go ahead and do so. If you wish to be “meaner,” you’ll need root privileges. In addition, a user won’t be able to change the priority of a process they don’t own. So, if you attempt to use renice to change the niceness of a task running as a different user, you’ll receive an Operation not permitted error.
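The same behavior can be reproduced with a throwaway process. This sketch assumes it is run as an ordinary user on a default Ubuntu install; as root (or on a system where the nice limit has been raised), the second renice would succeed instead of being refused, so the script simply reports which of the two happened. The variable names are hypothetical:

```shell
# Throwaway process to experiment on.
sleep 30 &
pid=$!

# Show the starting PRI and NI values for this process.
ps -l -p "$pid"
ni_before=$(ps -o ni= -p "$pid" | tr -d ' ')

# Raising the nice value (lower priority) is allowed for our own process.
renice -n 10 -p "$pid" || true
ni_after=$(ps -o ni= -p "$pid" | tr -d ' ')

# Lowering it again is normally refused without root privileges.
if renice -n 5 -p "$pid" 2>/dev/null; then
  lower_result="allowed"
else
  lower_result="denied"
fi
echo "Nice value went from $ni_before to $ni_after; lowering was $lower_result"

# Clean up the throwaway process.
kill "$pid"
wait "$pid" 2>/dev/null || true
```

Seeing "denied" here is the expected result for a normal user, and matches the Permission denied error described above.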

At this point, we know how to re-prioritize our running processes with renice. Now, let’s take a look at starting a new process with a specific priority with nice. Consider the following command:

nice -n 10 vim 

Here, we’re launching a new instance of vim, but with the priority set to a specific value right from the start. If we want to change the priority of vim again later, we’ll need to use renice. As I mentioned earlier, nice is used to launch a new process with a specific priority, and renice is for changing the priority of a pre-existing process. In this example, we launched vim and set its nice value to 10 in one command.
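A quick way to confirm the effect, without opening an editor, is to let nice launch a second copy of nice. Run with no arguments, nice simply prints the niceness it is running at. Note that the value given to -n is an adjustment added to the current niceness, so this sketch prints both numbers (the variable names are hypothetical):

```shell
# Niceness of the current shell (0 unless it was itself started with nice).
current=$(nice)

# Launch a command 10 steps nicer and have it report its own niceness.
child=$(nice -n 10 nice)

echo "Shell niceness: $current"
echo "Child niceness: $child"
```

On a shell started normally, the first number is 0 and the second is 10, which is why the adjustment and the resulting nice value usually look like the same thing.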

Changing the priority of a text editor such as vim may seem like an odd choice for a test case, and it is. But the vim editor is harmless, as the likelihood of us changing the priority of it leading to a system halt is extremely minimal. There’s no practical reason I can think of where it would be useful to re-prioritize something like a text editor. The takeaway, though, is that you can change the priority of the processes running on your server. On a real server, you may have an important process that runs and generates a report, and that report must be delivered on time. Or perhaps you have a process that generates an export of data that a client needs to have in order to make an on-time deliverable. So, if you think of the bigger picture, you can replace vim with the name of a process that is actually important for you or your organization.

You might be wondering what “nice” means in the context of the nice and renice commands. The “nice” number essentially refers to how nice a process is to other users. The higher the nice value, the lower the priority. So, a value of 19 is nicer than a value of 10. In that case, processes with a niceness of 19 are running at a lower priority, and so are kinder to the other processes on the system. The niceness can range from -20 to 19. A process with a nice value of -20 has the highest priority possible, while 19 is the lowest priority it can have. The entire system is quite a bit more complicated than this simple description. Although I refer to the nice value as the priority, it actually isn’t. The nice value is used to calculate the actual priority. But for now, it’s enough to treat the nice value as representative of the priority, and to remember that the higher the number gets, the lower the priority.

So far, we’ve been using the nice and renice commands along with the -n option to set the nice values directly. It may be interesting to note though that you can simplify the renice command and leave out the -n option:

renice 10 42467

That command sets the nice value of the process to a positive 10, similar to our other examples. We can also use a negative number for the niceness if we want to increase the priority:

sudo renice -10 42467

Although it doesn’t save us much typing to leave out the -n option, now you know that it is a possibility. The other difference with that example was that I needed to use sudo since I’m decreasing the nice value (as discussed earlier).

When it comes to the nice command, we can also leave out the -n option, but the command works a bit differently in this regard. The following won’t work:

nice 15 vim

The syntax of nice is a bit different, so giving it a positive number directly won’t work as it does with renice. For that, we’ll need to add a hyphen in front:

nice -15 vim

When you look at that command, you may assume we’re applying a negative number. Actually, that’s not true. Since the syntax is different with nice, the -15 value we used results in a positive 15. We needed the hyphen in front of the value to signify to nice that we’re applying a value as an option. If we actually do want to use a negative value with nice while also avoiding the -n option, we would need to use two hyphens:

nice --10 vim

The difference in syntax between the two commands without the -n option is a bit confusing in my opinion, so I recommend simply using the -n option with nice and renice, as that’s going to be more uniform between them:

nice -n 10 vim
sudo nice -n -10 vim
renice -n 10 42467
sudo renice -n -10 42467

Those examples show both nice and renice using the -n option, and setting both positive and negative values. Since the -n option is used the same way between the two commands, it may be easier to focus on committing that to memory rather than focusing on the specifics. As previously discussed, I used sudo with commands that set a negative value for niceness, since only root can change a process to, or start a process with, a niceness below 0. You’ll receive the following error if you try to do it anyway:

nice: cannot set niceness: Permission denied

This type of protection is somewhat important, because you may have some users who feel as though their processes are the most important, and try to prioritize them all the way to -20. At the end of the day, it’s better for a system administrator to make decisions on which processes are allowed to reach a niceness value in the negative.

As an administrator of Ubuntu servers, it’s up to you to decide which processes should be running, and at what priority. You’ll then determine the best way to achieve the exact system state that’s appropriate, and tuning process priority may be a part of that. If nothing else, learning the nice and renice commands gives you another utility for your toolset.
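To watch a priority change take effect, you can experiment on a throwaway process rather than a real service. The sketch below uses sleep as a stand-in and reads the NI column from ps; raising a nice value doesn’t require sudo, so it should be safe to try as a normal user. The variable names are illustrative only:

```shell
# Start a harmless background process with its niceness raised by 10.
nice -n 10 sleep 60 &
pid=$!
sleep 1   # give nice a moment to launch the process

# "ni=" prints only the nice value, without a header line.
before=$(ps -o ni= -p "$pid" | tr -d ' ')

# Make it even nicer (lower priority). Going the other way would need sudo.
renice -n 15 -p "$pid" >/dev/null

after=$(ps -o ni= -p "$pid" | tr -d ' ')
echo "PID $pid: niceness $before -> $after"

# Remove the demonstration process.
kill "$pid"
```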

Dealing with misbehaving processes

Regarding the ps command, by this point you know how to display processes running on your server, as well as how to narrow down the output by string or resource usage. But what can you actually do with that knowledge? As much as we hate to admit it, sometimes the processes our server runs fail or misbehave and you need to restart them. If a process refuses to close normally, you may need to kill that process. In this section, we introduce the kill and killall commands to serve that purpose.

The kill command accepts a PID as an argument and attempts to close a process gracefully. In a typical workflow where you need to terminate a process that won’t do so on its own, you will first use the ps command to find the PID of the culprit. Then, knowing the PID, you can attempt to kill the process. For example, if PID 31258 needed to be killed, you could execute the following:

sudo kill 31258 

If all goes well, the process will end. You can restart it or investigate why it failed by perusing its logs.
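The whole workflow can be rehearsed safely on a process you start yourself. In this sketch, sleep stands in for the misbehaving program, and kill -0 (which sends no signal and only checks whether the PID can still be signalled) is used to confirm the result. No sudo is needed here because the process belongs to the current user:

```shell
# Stand-in for a misbehaving process.
sleep 300 &
target=$!

# Confirm what the PID refers to before signalling it.
ps -o pid,stat,etime,args -p "$target"

# No signal is specified, so kill sends its default.
kill "$target"

# Collect the exit status of the background process.
status=0
wait "$target" || status=$?
echo "exit status: $status"

# kill -0 only tests for existence; failure here means the process is gone.
if kill -0 "$target" 2>/dev/null; then
    echo "PID $target is still running"
else
    echo "PID $target has ended"
fi
```

An exit status above 128 indicates that the process was ended by a signal; subtracting 128 gives the signal number.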

To better understand what the kill command does, you first need to understand the basics of Linux signals. Signals are used by both administrators and developers and can be sent to a process by the kernel, by another process, or manually with a command. A signal notifies the process of a request or a change and, in some cases, instructs it to terminate completely. An example of such a signal is SIGHUP, which tells processes that their controlling terminal has exited. One situation in which this may occur is when you have a terminal emulator open with several processes running inside it. If you close the terminal window (without stopping the processes you were running), they’ll be sent the SIGHUP signal, which basically tells them to quit (essentially, it means the shell quit or hung up).

Other examples include SIGINT (where an application is running in the foreground and is stopped by pressing Ctrl + c on the keyboard) and SIGTERM, which, when sent to a process, asks it to cleanly terminate. Yet another example is SIGKILL, which forces a process to terminate uncleanly. In addition to a name, each signal is also represented by a value, such as 15 for SIGTERM and 9 for SIGKILL. Going over each of the signals is beyond the scope of this chapter (the advanced topics of signals are mainly only useful for developers), but you can view more information about them by consulting the man page if you’re curious:

man 7 signal 
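The shell can also translate between signal numbers and names, which is a quick way to double-check a value before using it. A small sketch follows; the exact layout of the list printed by kill -l varies between shells:

```shell
# Print every signal known to the shell's built-in kill.
kill -l

# Given a number, kill -l prints the matching name without the SIG prefix.
term_name=$(kill -l 15)
kill_name=$(kill -l 9)
echo "15 is SIG$term_name, 9 is SIG$kill_name"
```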

For the purposes of this section, the two types of signals we are most concerned about are SIGTERM(15) and SIGKILL(9). When we want to stop a process, we send one of these signals to it, and the kill command allows us to do just that. By default, the kill command sends signal 15 (SIGTERM), which tells the process to cleanly terminate. If successful, the process will free its memory and gracefully close. With our previous example kill command, we sent signal 15 to the process, since we didn’t clarify which signal to send.

Terminating a process with SIGKILL(9) is considered an extreme last resort. When you send signal 9 to a process, it’s the equivalent of pulling the rug out from underneath it or blowing it up with a stick of dynamite. The process will be force-closed without being given any time to react, so it’s one of those things you should avoid unless you’ve tried everything else you can think of. In theory, sending signal 9 can cause corrupted files, memory issues, or other shenanigans to occur. As for me, I’ve never actually run into long-term damage to software from using it, but theoretically it can happen, so you want to only use it in extreme cases. One case where you may be tempted to reach for it is a defunct, or zombie, process that doesn’t go away on its own. These processes are basically dead already and are typically waiting on their parent processes to reap them.

If the parent process never attempts to do so, they will remain on the process list. This in and of itself may not really be a big issue, since these processes aren’t technically doing anything. Keep in mind, though, that a zombie has already exited, so sending SIGKILL to the zombie itself has no effect. If its presence is causing problems, the signal has to go to the parent process instead; once the parent ends, the init system adopts the zombie and reaps it. There should be no harm in eliminating a zombie process this way, but you would want to give it time to be reaped first.

To send signal 9 to a process, you would use the -9 option of the kill command. It should go without saying, though, to make sure you’re executing it against the proper process ID:

sudo kill -9 31258 

Just like that, the process with a PID of 31258 will vanish without a trace. Anything it was writing to will be in limbo, and it will be removed from memory instantly. If, for some reason, the process still manages to stay running (which is extremely rare), you probably would need to reboot the server to get rid of it, which is something I’ve only seen in a few, very rare cases. An example of this is a zombie process, which is a process that shows up in the process list but isn’t impacted by having signals sent to it, since such a process won’t be scheduled for CPU time anyway. When it all comes down to it, if kill -9 doesn’t get rid of the process, nothing will.
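The difference between the two signals is easiest to see with a process that handles SIGTERM. The following sketch defines a hypothetical worker function whose trap stands in for an application’s cleanup routine; SIGTERM lets that handler run, while SIGKILL never gives it the chance:

```shell
# Hypothetical worker: loops forever, but cleans up when asked to terminate.
worker() {
    trap 'echo "worker: received SIGTERM, cleaning up"; exit 0' TERM
    while :; do sleep 1; done
}

# Case 1: SIGTERM. The trap runs and the worker exits on its own terms.
worker &
pid=$!
sleep 1                      # give the worker time to install its trap
kill "$pid"
term_status=0
wait "$pid" || term_status=$?

# Case 2: SIGKILL. It cannot be trapped, so no cleanup message appears.
worker &
pid=$!
sleep 1
kill -9 "$pid"
kill_status=0
wait "$pid" || kill_status=$?

echo "after SIGTERM: exit status $term_status"
echo "after SIGKILL: exit status $kill_status"
```

In the first case the worker chooses its own exit status, whereas in the second the status is 128 plus the signal number, showing that the process was ended from the outside.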

Another method of killing a process is with the killall command, which is probably safer than the kill command (if for no other reason than there’s a smaller chance you’ll mistype a PID and kill the wrong process). Like kill, killall allows you to send SIGTERM to a process, but unlike kill, you can do so by name. In addition, killall doesn’t just kill one process; it kills every process it finds with the name you’ve given it as an argument. To use killall, you would simply execute killall along with the name of a process:

sudo killall myprocess 

Just like the kill command, you can also send signal 9 to the process as well:

sudo killall -9 myprocess 

Again, use that only when necessary. In practice, though, you probably won’t use killall -9 very often (if ever), because it’s rare for multiple processes under the same process name to become locked. If you need to send signal 9, stick to the kill command if you can.

The kill and killall commands can be incredibly useful when dealing with a stuck process, but these are commands you would hope you don’t have to use very often. Processes get stuck when an application encounters a condition from which it can’t recover, so if you constantly find yourself needing to kill processes, you may want to check for an update to the package responsible for the service or check your server for hardware issues.

In the next section, let’s take a look at system processes that run in the background and provide a service to us or our users, such as a web server process or DHCP server.

Managing system processes

System processes, also known as daemons, are programs that run in the background on your server and are typically started automatically when it boots. We don’t usually manage these services directly, as they run in the background and perform their duties with or without our input. For example, if our server is a DHCP server and runs the isc-dhcp-server process, this process will run in the background, listening for DHCP requests and handing out IP assignments as they come in. Most of the time, when we install an application that runs as a service, Ubuntu will configure it to start when we boot our server, so we don’t have to start it ourselves. Assuming the service doesn’t run into an issue, it will happily continue performing its job until we tell it to stop.

In Linux, services are managed by the init system, also referred to as PID 1, since the init system of a Linux system always receives that PID. The way in which processes are managed in Ubuntu Server has changed considerably over the years. Ubuntu switched its init system from Upstart to systemd; Ubuntu 16.04 was the first LTS release of Ubuntu with systemd, and it continues to be used today in Ubuntu 22.04. Since systemd has been the standard for quite some time now and older init systems are aging out, we’ll focus our attention on the commands used with it to manage our services.

With systemd, services are known as units, but for all intents and purposes, the terms “service,” “daemon,” and “unit” all essentially mean the same thing. Having started with Linux over 20 years ago, I still refer to systemd units as services, out of habit. To help us manage these “units,” systemd includes the systemctl command, which allows us to start, stop, and view the status of units on our server. To help illustrate this, I’ll use OpenSSH as an example. The name of the unit doesn’t really matter, as the syntax of the systemctl command is the same regardless of the name of the unit we’re interacting with. You can use systemctl to start, stop, or restart your Apache instance, your database server, or even use it to restart the entire networking stack.

The systemctl command, with no options or parameters, assumes the list-units command, which dumps a list of units to your shell. This can be a bit messy, though, so if you’re looking for a particular unit, you can pipe the output into grep and search for a string. This is handy in a situation where you may not know the exact name of the unit, but you know part of it:

systemctl | grep ssh 

If you want to check the health of a unit, the best way is to actually use the status keyword, which will show you some very useful information regarding the unit. This information includes whether or not the unit is running, if it’s enabled (meaning it’s configured to start at boot time), as well as the most recent log entries for the unit:

systemctl status ssh 

This command will produce an output something like the following:

Figure 7.9: Checking the status of a unit with systemctl

Most of the time, you can actually check the status of units without needing root access, but you may not see all the information available. In the screenshot, you can see several log entries for the ssh service, but some units do not show those entries without sudo. With the ssh unit in particular, we see the log entries when checking the status with or without sudo.

Another thing you may notice in the screenshot is that the name of the ssh unit is actually ssh.service, but you don’t need to include the .service part of the name, since that is implied by default. Sometimes, while viewing the status of a unit with systemctl, the output may be condensed to save space on the screen. To avoid this and see the full log entries, add the -l option:

systemctl status -l ssh 

Another thing to pay attention to is the vendor preset of the unit. Most packages in Ubuntu that include a service file for systemd will enable it automatically, but other distributions (such as CentOS) typically don’t start and enable units by default. In the case of the ssh example, you can see that the vendor preset is set to enabled. This means that once you install the openssh-server package, the ssh.service unit will automatically be enabled. The Active line (where the example output says active (running)) tells us that the unit is currently running, while the Loaded line shows that the unit is enabled, so we know that the next time we start the server, ssh will be loaded automatically. Although systemd units are typically enabled and started automatically in Ubuntu when installing their package, this can still vary. When you install a new package, make sure you check the status of the unit, so you’ll be aware of its settings.
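When you want the same information in a script rather than on screen, systemctl also offers the is-active and is-enabled commands, each of which prints a single word. The helper below is only a sketch: the function name unit_summary is made up for illustration, and it assumes a host where systemctl is available:

```shell
# Hypothetical helper: print a one-line summary of a unit's state.
unit_summary() {
    unit=$1
    if ! command -v systemctl >/dev/null 2>&1; then
        echo "systemctl not found on this system" >&2
        return 1
    fi
    # Both commands exit non-zero for inactive or disabled units,
    # so "|| true" keeps that from being treated as a script error.
    active=$(systemctl is-active "$unit" 2>/dev/null || true)
    enabled=$(systemctl is-enabled "$unit" 2>/dev/null || true)
    printf '%s: active=%s enabled=%s\n' "$unit" "${active:-unknown}" "${enabled:-unknown}"
}

unit_summary ssh || true
```

Checking these right after installing a package is a quick way to see whether its unit was started and enabled for you.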

Starting and stopping a unit is just as easy; all you have to do is change the keyword you use with systemctl to start or stop in order to have the desired effect:

sudo systemctl stop ssh 
sudo systemctl start ssh 

There are additional keywords, such as restart (which takes care of the previous two command examples at the same time), and some units even feature reload, which allows you to activate new configuration settings without needing to bring down the entire application. An example of why this is useful is with Apache, which serves web pages to local or external users. If you stop Apache, all users will be disconnected from the website you’re serving. If you add a new site, you can use reload rather than restart, which will activate any new configuration you may have added without disturbing the existing connections. We’ll take a look at Apache in Chapter 14, Serving Web Content, so don’t worry too much about Apache right now. It’s just a good example of a unit with additional functionality. Not all units feature a reload option, so you should check the documentation of the application that provides the unit to be sure.

Since I mentioned starting and stopping the unit for OpenSSH in the previous examples, an interesting aside is that doing so will not disconnect a current SSH session to the server, should you have one open. Open connections are maintained, and stopping the ssh service only prevents new connections from being made. In that respect, SSH differs from other units (such as Apache), where existing connections are dropped when the unit is stopped or restarted.

As I mentioned before, if you want a unit to automatically start when the server boots, the unit will need to be enabled. Units are automatically enabled most of the time, but in case you find one that isn’t enabled, you can enable it with the enable keyword:

sudo systemctl enable ssh 

It’s just as easy to disable a unit as well:

sudo systemctl disable ssh 

You can also enable a unit and start it at the same time with a single command:

sudo systemctl enable --now ssh

The --now option tells systemctl to start the unit immediately after enabling it, rather than waiting for the next boot or requiring you to run the start command separately.

Even though systemd is primarily used for managing units, it’s actually an entire platform that manages multiple things on a Linux system, including DNS resolution, networking, and more. systemd handles logging as well, and it provides us with the journalctl command, which we can use to view log entries (this is also why the output of systemctl status ssh was able to show us log entries).

We’ve discussed logging a bit in Chapter 4, Navigating and Essential Commands, and we’ll do so in more detail during Chapter 22, Troubleshooting Ubuntu Servers (which will also include further discussion of the journalctl command).

For now, just understand that systemd is quite extensive when it comes to the number of things it helps us manage. For the purposes of this chapter, however, if you understand how to start, stop, enable, disable, and check the status of a unit, you’re good to go for now.

Scheduling tasks with cron

Earlier in this chapter, we worked through starting processes and enabling them to run in the background, and ensuring they start as soon as the server boots. In some cases, you may need an application to perform a job at a specific time, rather than to have it always running in the background. This is where cron comes in. With cron, you can set a process, program, or script to run at a specific time, down to the minute. Each user is able to have their own set of cron configurations (known as a crontab), which can perform any function that a user would be able to do normally. The root user has a crontab as well, which allows system-wide administrative tasks to be performed. Each crontab includes a list of cron jobs (one per line), which we’ll get into shortly. To view a crontab for a user, we can use the crontab command:

crontab -l 

With the -l option, the crontab command will show you a list of jobs for the user who executed the command. If you execute it as root, you’ll see the root account’s crontab. If you execute it as user jdoe, you’ll see the crontab for jdoe, and so on. If you want to view a crontab for a user other than yourself, you can use the -u option and specify a user, but you’ll need to execute it as root or with sudo to view the crontab for someone other than the user you’re logged in as:

sudo crontab -u jdoe -l 

By default, no user has a crontab until one or more jobs are created for that user. Therefore, you’ll probably see output such as the following when you check the crontab for your current user:

no crontab for jdoe 

To create a cron job, first log in as the user account you want the task to run under. Then, issue the following command:

crontab -e 

If you have more than one text editor on your system, you may see output similar to the following:

Figure 7.10: Selecting an editor for use with the crontab command

In this case, you’ll simply press the number corresponding to the text editor you’d like to use when creating your cron job. Alternatively, you can set the EDITOR environment variable and edit your crontab in a single command:

EDITOR=vim crontab -e 

In this example, you can replace vim with whatever text editor you prefer. At this point, you should be placed in a text editor with your crontab file open. The default crontab file for each user features some helpful comments that give you some useful information regarding how cron works. To add a new job, you would scroll to the bottom of the file (after all the comments) and insert a new line. Formatting is very particular here, and the example comments in the file give you some clue as to how each line is laid out. Specifically, this part:

m h dom mon dow command 

Each cron job has six fields, separated by at least one space or tab. If you use more than one space or tab, cron is smart enough to parse the line properly. In the first field, we have the minute in which we would like the job to occur. In the second field, we place the hour in 24-hour format, from 0 to 23. The third field represents the day of the month. In this field, you can place a 5 (5th of the month), 23 (23rd of the month), and so on. The fourth field corresponds to the month, such as 3 for March or 12 for December. The fifth field is the day of the week, numbered from 0 to 6 to represent Sunday through Saturday. Finally, in the last field, we have the command to be executed. A few example crontab lines are as follows:

3 0 * * 4 /usr/local/bin/cleanup.sh 
* 0 * * * /usr/bin/apt update 
0 1 1 * * /usr/local/bin/run_report.sh 

With the first example, the cleanup.sh script, located in /usr/local/bin, will be run at 12:03 a.m. every Thursday. We know this because the minute column is set to 3, the hour column is set to 0 (midnight), the day-of-week column is 4 (Thursday), and the command column shows a fully qualified command of /usr/local/bin/cleanup.sh.

What does it mean for a command to be fully qualified? Basically, a command being fully qualified means that the entire path to the binary responsible for the command is completely typed out. In the second example, we could have simply typed apt update for the command and that would’ve probably worked just fine. However, not including the full path to the program is considered bad cron etiquette. While the command may have worked without being fully qualified, its success would depend on the application being found in the PATH that cron provides to the job, which is typically much shorter than the one in your login shell. Not all servers are set up the same, so this might not work depending on how the environment is configured. If you include the full path, the job should run regardless of how the underlying shell is configured.

If you don’t know what the fully qualified command is, all you have to do is use the which command. This command, when used with the name of a command you’d like to run, will give you the fully qualified command if the command is located on your system.
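For example, the following looks up a few commands (apt is only present on Debian-based systems such as Ubuntu). The sketch also uses command -v, a shell built-in that performs the same lookup and can be handy on minimal systems where which isn’t installed:

```shell
# Ask which for the full path of apt.
which apt || echo "apt not found in PATH"

# command -v is built into the shell and reports the same path.
for name in apt date sh; do
    full_path=$(command -v "$name" || true)
    printf '%-5s -> %s\n' "$name" "${full_path:-not found}"
done
```

Whatever path is reported is the one to paste into the command field of your crontab line.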

Continuing with the second example, we’re running /usr/bin/apt update to update our server’s repository index shortly after midnight. An asterisk means any, so with the minute column being simply *, this task matches every minute. Because the only field we pinned down was the hour field, which we set to 0, the job actually runs once a minute from 12:00 a.m. through 12:59 a.m., 60 times in all. To run it just once at midnight, set the minute field to 0 as well (0 0 * * *).

With the third example, we’re running the /usr/local/bin/run_report.sh script on the first day of every month at 1:00 a.m. We set the third column (day of month) to 1, which matches January 1st, February 1st, March 1st, and so on. This job runs only when it’s the first day of the month and the current time is 1:00 a.m., since we also filled in the first and second columns, which represent the minute and hour, respectively.
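If you’d like to check your reading of a crontab line before saving it, the fields can be split and labeled with a few lines of shell. This is just an illustrative sketch; describe_job is a made-up helper, not part of cron, and it only labels the fields rather than validating them:

```shell
# Hypothetical helper: label the six fields of a crontab line.
describe_job() {
    # read splits on spaces and tabs; the last variable receives the remainder,
    # so commands containing spaces stay in one piece.
    read -r minute hour dom month dow command <<EOF
$1
EOF
    printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s command=%s\n' \
        "$minute" "$hour" "$dom" "$month" "$dow" "$command"
}

describe_job '3 0 * * 4 /usr/local/bin/cleanup.sh'
describe_job '* 0 * * * /usr/bin/apt update'
describe_job '0 1 1 * * /usr/local/bin/run_report.sh'
```

Seeing each value next to its field name makes it easier to spot a number that has landed in the wrong column.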

Once you finish editing a user’s crontab and save it, cron is updated and, from that point forward, will execute the task at the time you select. Jobs are executed according to the current time and date on your server, so make sure those are correct; otherwise, your jobs will execute at unexpected times. You can view the current date and time on your server by simply issuing the date command.

To get the hang of creating jobs with cron, the best way (as always) is to practice. The second example cron job is probably a good one to experiment with, as updating your repository index isn’t going to hurt anything. Since apt update needs root privileges, add it to root’s crontab with sudo crontab -e.

Summary

In this chapter, we learned how to manage processes. We began with a look at managing jobs, and then explored the ps command, which we can use to view a list of processes that are currently running. We discussed methods of changing the priority of a process, to ensure we have full control over which processes are given more processing time, as well as killing processes that, for one reason or another, are misbehaving. We also covered managing system processes with systemctl, and we learned how we can schedule things to run at a later time and date with cron.

In Chapter 8, Monitoring System Resources, we’ll take a look at some ways we can keep an eye on the resources that are available on our server. We will learn how to check disk usage and understand memory usage and swap space, as well as look at some utilities that can make resource management a breeze.

