© David Both 2020
D. Both, Using and Administering Linux: Volume 1, https://doi.org/10.1007/978-1-4842-5049-5_2

2. Introduction to Operating Systems

David Both
Raleigh, NC, USA

Objectives

In this chapter you will learn to
  • Describe the functions of the main hardware components of a computer

  • List and describe the primary functions of an operating system

  • Briefly outline the reasons that prompted Linus Torvalds to create Linux

  • Describe how the Linux core utilities support the kernel and together create an operating system

Choice – Really!

Every computer requires an operating system. The operating system you use on your computer is at least as important as – or more so than – the hardware you run it on. The operating system (OS) is the software that determines the capabilities and limits of your computer or device. It also defines the personality of your computer.

The single most important choice you will make concerning your computer is that of the operating system, which is what turns it into a useful tool. Computers have no ability to do anything without software. If you turn on a computer which has no software program, it simply generates revenue for the electric company in return for adding a little heat to the room. There are far less expensive ways to heat a room. The operating system is the first level of software which allows your computer to perform useful work. Understanding the role of the operating system is key to making informed decisions about your computer.

Of course, most people do not realize that there even is a choice when it comes to operating systems. Fortunately, Linux does give us a choice. Some vendors such as EmperorLinux, System76, and others are now selling systems that already have Linux installed. Others, like Dell, sometimes try out the idea of Linux by selling a single model with few options.

We can always just purchase a new computer, install Linux on it, and wipe out whatever other operating system might have previously been there. My preference is to purchase the parts from a local computer store or the Internet and build my own computers to my personal specifications. Most people don’t know that they have either of these options and, if they did, would not want to try anyway.

What is an operating system?

Books about Linux are books about an operating system. So – what is an operating system? This is an excellent question – one which most training courses and books I have read either skip over completely or answer very superficially. The answer to this question can aid the SysAdmin’s understanding of Linux and its great power.

The answer is not simple.

Many people look at their computer’s display and see the graphical user interface (GUI) desktop and think that is the operating system. The GUI is only a small part of the operating system. It provides an interface in the form of a desktop metaphor that is understandable to many users. It is what is underneath the GUI desktop that is the real operating system. The fact is that for advanced operating systems like Linux, the desktop is just another application, and there are multiple desktops from which to choose. We will cover the Xfce desktop in Chapter 6 of this volume because that is the desktop I recommend for use with this course. We will also explore window managers, a simpler form of desktop, in Chapter 16 of this volume.

In this chapter and throughout the rest of this course, I will elaborate on the answer to this question, but it is helpful to understand a little about the structure of the hardware which comprises a computer system. Let’s take a brief look at the hardware components of a modern Intel computer.

Hardware

There are many different kinds of computers, from single-board computers (SBCs) like the Arduino and the Raspberry Pi to desktop computers, servers, mainframes, and supercomputers. Many of these use Intel or AMD processors, but others do not. For the purposes of this series of books, I will work with Intel x86_64 hardware. Generally, if I say Intel, you can assume I mean the x86_64 processor series and supporting hardware; AMD x86_64 hardware should produce the same results, and the same hardware information will apply.

Motherboard

Most Intel-based computers have a motherboard that contains many components of the computer such as bus and I/O controllers. It also has connectors to install RAM memory and a CPU, which are the primary components that need to be added to a motherboard to make it functional. Single-board computers are self-contained on a single board and do not require any additional hardware because components such as RAM, video, network, USB, and other interfaces are all an integral part of the board.

Some motherboards contain a graphics processing unit (GPU) to connect the video output to a monitor. If they do not, a video card can be added to the main computer I/O bus, usually PCI, or PCI Express (PCIe). Other I/O devices like a keyboard, mouse, and external hard drives and USB memory sticks can be connected via the USB bus. Most modern motherboards have one or two Gigabit Ethernet network interface cards (NIC) and four or six SATA connectors for hard drives.

Random-access memory (RAM) is used to store data and programs while they are being actively used by the computer. Programs and data cannot be used by the computer unless they are stored in RAM from where they can be quickly moved into the CPU cache. RAM and cache memory are both volatile memory; that is, the data stored in them is lost if the computer is turned off. The computer can also erase or alter the contents of RAM, and this is one of the things that gives computers their great flexibility and power.

Hard drives are magnetic media used for long-term storage of data and programs. Magnetic media is nonvolatile; the data stored on a disk remains even when power is removed from the computer. DVDs and CD-ROMs store data permanently and can be read by the computer but not overwritten. The exception to this is that some DVD and CD-ROM disks are re-writable. ROM means read-only memory because it can be read by the computer but not erased or altered. Hard drives and DVD drives are connected to the motherboard through SATA adapters.

Solid-state drives (SSDs) are the solid state equivalent of hard drives. They have the same characteristics in terms of the long-term storage of data because it is persistent through reboots and when the computer is powered off. Also like hard drives with rotating magnetic disks, SSDs allow data to be erased, moved, and managed when needed.

Printers are used to transfer data from the computer to paper. Sound cards convert data to sound as well as the reverse. USB storage devices can be used to store data for backup or transfer to other computers. The network interface cards (NICs) are used to connect the computer to a network, hardwired or wireless, so that it can communicate easily with other computers attached to the network.

The processor

Let’s take a moment to explore the CPU and define some terminology in an effort to help reduce confusion. Five terms are important when we talk about processors: processor, CPU, socket, core, and thread. The Linux command lscpu, as shown in Figure 2-1, gives us some important information about the installed processor(s) as well as clues about terminology. I use my primary workstation for this example.
Figure 2-1

The output of the lscpu command gives us some information about the processor installed in a Linux host. It also helps us understand the current terminology to use when discussing processors
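If you want to follow along, just run lscpu in a terminal session on your own computer. The listing below is a trimmed, illustrative sketch rather than the actual output shown in the figure; the counts, model name, and speeds are placeholders, and your own hardware will report its own values.

$ lscpu
Architecture:         x86_64
CPU op-mode(s):       32-bit, 64-bit
CPU(s):               32
On-line CPU(s) list:  0-31
Thread(s) per core:   2
Core(s) per socket:   16
Socket(s):            1
NUMA node(s):         1
Model name:           <your processor model appears here>
CPU MHz:              3400.000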

The first thing to notice in Figure 2-1 is that the term “processor” never appears. The common usage for the term “processor” refers generically to any hardware unit that performs some form of computations. It can refer to the CPU – central processing unit – of the computer, to a graphics processing unit (GPU) that performs calculations relating to graphical video displays, or any number of other types of processors. The terms processor and CPU tend to be used interchangeably when referring to the physical package that is installed in your computer.

Using Intel terminology, which can be a bit fluid, the processor is the physical package that can contain one or more computing cores. Figure 2-2 shows an Intel i5-2500 processor which contains four cores. Because the processor package is plugged into a socket and a motherboard may have multiple sockets, the lscpu utility numbers the sockets. Figure 2-1 shows the information for the processor in socket number 1 on the motherboard. If this motherboard had additional sockets, lscpu would list them separately.
Figure 2-2

An Intel Core i5 processor may contain one, two, or four cores. Photo courtesy of Wikimedia Commons, CC BY-SA 4.0 International

A core, which is sometimes referred to as a compute core, is the smallest physical hardware component of a processor that can actually perform arithmetic and logical computations; that is, it is composed of a single arithmetic and logic unit (ALU) and its required supporting components. Every computer has at least one processor with one or more cores. Most modern Intel processors have more – two, four, or six cores, and many processors have eight or more cores. They make up the brains of the computer. They are the part of the computer which is responsible for executing each of the instructions specified by the software utilities and application programs.

The line in the lscpu results that specifies the number of cores contained in the processor package is “Core(s) per socket.” For this socket on my primary workstation, there are sixteen (16) cores. That means that there are 16 separate computing devices in the processor plugged into this socket.

But wait – there’s more! The line “CPU(s)” shows that there are 32 CPUs on this socket. How can that be? Look at the line with the name “Thread(s) per core,” and the number there is 2, so 16 x 2 = 32. Well, that is the math but not the explanation. The short explanation is that compute cores are really fast. They are so fast that a single stream of instructions and data is not enough to keep them busy all the time even in a very compute-intensive environment. The details of why this is so are beyond the scope of this book, but suffice it to say that before hyper-threading, most compute cores would sit waiting with nothing to do while the slower external memory circuitry tried to feed sufficient streams of program instructions and data to them to keep them active.

Rather than let precious compute cycles go to waste in high-performance computing environments, Intel developed hyper-threading technology that allows a single core to process two streams of instructions and data by switching between them. This enables a single core to perform almost as well as two. So lscpu counts each of those instruction streams as a CPU, because a single hyper-threading core is reasonably close to the functional equivalent of two CPUs.

But there are some caveats. Hyper-threading is not particularly helpful if all you are doing is word processing and spreadsheets. Hyper-threading is intended to improve performance in high-performance computing environments where every CPU compute cycle is important in speeding the results.
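One way to see hyper-threading for yourself is the extended listing produced by lscpu -e, which prints one line per CPU. The sketch below is illustrative and trimmed to a few columns; the exact numbering scheme varies by processor, but notice that two CPU numbers share each CORE number, which is the two-threads-per-core arrangement just described.

$ lscpu -e=CPU,SOCKET,CORE,ONLINE
CPU SOCKET CORE ONLINE
  0      0    0 yes
  1      0    0 yes
  2      0    1 yes
  3      0    1 yes
(remaining CPUs trimmed)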

Peripherals

Peripherals are hardware devices that can be plugged into the computer via the various types of interface ports. USB devices such as external hard drives and thumb drives are typical of this type of hardware. Other types include keyboards, mice, and printers.

Printers can also be connected using the very old parallel printer ports which I still see on some new motherboards, but most modern printers are capable of being attached using USB or a network connection. Displays are commonly connected using HDMI, DVI, DisplayPort, or VGA connectors.

Peripheral devices can also include such items as USB hubs, disk drive docking stations, plotters, and more.

The operating system

All of these hardware pieces of the computer must work together. Data must be gotten into the computer and moved about between the various components. Programs must be loaded from long-term storage on the hard drive into RAM where they can be executed. Processor time needs to be allocated between running applications. Access to the hardware components of the computer such as RAM, disk drives, and printers by application programs must be managed.

It is the task of the operating system to provide these functions. The operating system manages the operation of the computer and of the application software which runs on the computer.

The definition

A simple definition of an operating system is that it is a program, much like any other program. It is different only in that its primary function is to manage the movement of data in the computer. This definition refers specifically to the kernel of the operating system.

The operating system kernel manages access to the hardware devices of the computer by utility and application programs. The operating system also manages system services such as memory allocation – the assignment of specific virtual memory locations to various programs when they request memory – the movement of data from various storage devices into memory where it can be accessed by the CPU, communications with other computers and devices via the network, display of data in text or graphic format on the display, printing, and much more.

The Linux kernel provides an API – application programming interface – for other programs to use in order to access the kernel functions. For example, a program that needs to have more memory allocated to its data structures uses a kernel function call to request that memory. The kernel then allocates the memory and notifies the program that the additional memory is available.
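If you are curious, you can actually watch a program make these requests by tracing its system calls. The strace utility, if it is installed on your host, is not covered until much later, and the exact calls and addresses vary from one program and library version to the next, so treat the following as a rough, illustrative sketch of the brk and mmap calls a program uses to ask the kernel for memory.

$ strace -e trace=brk,mmap ls /tmp 2>&1 | head -n 3
brk(NULL)                               = 0x5597a8c66000
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f41f9a4c000
mmap(NULL, 103496, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f41f9a32000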

The Linux kernel also manages access to the CPUs as computing resources. It uses a complex algorithm to determine which processes are allocated some CPU time, when, and for how long. If necessary, the kernel can interrupt a running program in order to allow another program to have some CPU time.

An operating system kernel like Linux can do little on its own. It requires other programs – utilities – to perform basic functions such as creating a directory on the hard drive, and still other utilities to access that directory, create files in it, and manage those files. These utility programs perform functions like creating files, deleting files, copying files from one place to another, setting display resolution, and complex processing of textual data. We will cover the use of many of these utilities as we proceed through this book.
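Here is a trivial, illustrative sequence using a few of those utilities; the directory and file names are arbitrary ones chosen just for this example.

$ mkdir ~/testdir                              # create a directory
$ touch ~/testdir/file1.txt                    # create an empty file in it
$ cp ~/testdir/file1.txt ~/testdir/file2.txt   # copy the file
$ ls ~/testdir                                 # list the directory contents
file1.txt  file2.txt
$ rm -r ~/testdir                              # clean up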

Typical operating system functions

Any operating system has a set of core functions which are the primary reason for its existence. These are the functions that enable the operating system to manage itself, the hardware on which it runs, and the application programs and utilities that depend upon it to allocate system resources to them:
  • Memory management

  • Managing multitasking

  • Managing multiple users

  • Process management

  • Interprocess communication

  • Device management

  • Error handling and logging

Let’s look briefly at these functions.

Memory management

Linux and other modern operating systems use advanced memory management strategies to virtualize real memory – random-access memory (RAM) and swap memory (disk) – into a single virtual memory space which can be used as if it were all physical RAM. Portions of this virtual memory can be allocated by the memory management functions of the kernel to programs that request memory.

The memory management components of the operating system are responsible for assigning virtual memory space to applications and utilities and for translation between virtual memory spaces and physical memory. The kernel allocates and deallocates memory and assigns physical memory locations based upon requests, either implicit or explicit, from application programs. In cooperation with the CPU, the kernel also manages access to memory to ensure that programs only access those regions of memory which have been assigned to them. Part of memory management includes managing the swap partition or file and the movement of memory pages between RAM and the swap space on the hard drive.

Virtual memory eliminates the need for the application programmer to deal directly with memory management because it provides a single virtual memory address space for each program. It also isolates each application’s memory space from that of every other, thus making the program’s memory space safe from being overwritten or viewed by other programs.
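The free utility gives a quick view of how the kernel is currently using RAM and swap. The numbers below are only an illustrative sketch; your totals will reflect the RAM and swap space installed and configured on your own host.

$ free -h
              total        used        free      shared  buff/cache   available
Mem:           62Gi       3.1Gi        52Gi       410Mi       6.9Gi        58Gi
Swap:          15Gi          0B        15Gi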

Multitasking

Linux, like most modern operating systems, can multitask. That means that it can manage two, three, or hundreds of processes at the same time. Part of process management is managing multiple processes that are all running on a Linux computer.

I usually have several programs running at one time such as LibreOffice Writer, which is a word processor, an e-mail program, a spreadsheet, a file manager, a web browser, and usually multiple terminal sessions in which I interact with the Linux command-line interface (CLI). Right now, as I write this sentence, I have multiple documents open in several LibreOffice Writer windows. This enables me to see what I have written in other documents and to work on multiple chapters at the same time.

But those programs usually do little or nothing until we give them things to do by typing words into the word processor or clicking an e-mail to display it. I also have several terminal emulators running and use them to log in to various local and remote computers that I manage and for which I have responsibility.

Linux itself always has many programs running in the background – called daemons – programs that help Linux manage the hardware and other software running on the host. These programs are usually not noticed by users unless we specifically look for them. Some of the tools you will learn about in this book can reveal these otherwise hidden programs.
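You can get a glimpse of these daemons with the ps utility, which we will explore in detail later. This is a trimmed, illustrative sketch; the specific daemons present depend on your distribution and configuration.

$ ps -ef | head -n 6
UID          PID    PPID  C STIME TTY          TIME CMD
root           1       0  0 09:14 ?        00:00:02 /usr/lib/systemd/systemd
root           2       0  0 09:14 ?        00:00:00 [kthreadd]
root         812       1  0 09:14 ?        00:00:00 /usr/sbin/crond -n
root         815       1  0 09:14 ?        00:00:01 /usr/sbin/rsyslogd -n
chrony       820       1  0 09:14 ?        00:00:00 /usr/sbin/chronyd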

Even with all of its own programs running in the background and users’ programs running, a modern Linux computer uses only a small fraction of its compute cycles and spends most of its CPU time waiting for things to happen. Linux can download and install its own updates while performing any or all of the preceding tasks simultaneously – without the need for a reboot. Wait... what?! That’s right. Linux does not usually need to reboot before, during, or after installing updates or when installing new software. After a new kernel or glibc (the GNU C Library) is installed, however, you may wish to reboot the computer to activate it, but you can do that whenever you want and not be forced to reboot multiple times during an update or even stop doing your work while the updates are installed.

Multiuser

The multitasking functionality of Linux extends to its ability to host multiple users – tens or hundreds of them – all running the same or different programs at the same time on one single computer.

Multiuser capability means a number of different things. First, it can mean a single user who has logged in multiple times via a combination of the GUI desktop interface and via the command line using one or more terminal sessions. We will explore the extreme flexibility available when using terminal sessions a bit later in this course. Second, multiuser means just that – many different users logged in at the same time, each doing their own thing, and each isolated and protected from the activities of the others. Some users can be logged in locally and others from anywhere in the world with an Internet connection if the host computer is properly configured.

The role of the operating system is to allocate resources to each user and to ensure that any tasks, that is, processes, they have running have sufficient resources without impinging upon the resources allocated to other users.
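The who command lists the sessions currently logged in to a host, including the same user logged in more than once. This sketch uses made-up user names and addresses; a host with several remote users would simply show one line per login session.

$ who
student  tty2         2019-11-04 08:51 (tty2)
student  pts/0        2019-11-04 09:02 (192.168.0.12)
jsmith   pts/1        2019-11-04 09:15 (192.168.0.57)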

Process management

The Linux kernel manages the execution of all tasks running on the system. The Linux operating system is multitasking from the moment it boots up. Many of those tasks are the background tasks required to manage a multitasking and – for Linux – a multiuser environment. These tasks take only a small fraction of the CPU resources available on even modest computers.

Each running program is a process. It is the responsibility of the Linux kernel to perform process management.

The scheduler portion of the kernel allocates CPU time to each running process based on its priority and whether it is capable of running. A task which is blocked – perhaps it is waiting for data to be delivered from the disk, or for input from the keyboard – does not receive CPU time. The Linux kernel will also preempt a lower priority task when a task with a higher priority becomes unblocked and capable of running.

In order to manage processes, the kernel creates data abstractions that represent each process. Part of the data required is that of memory maps that define the memory that is allocated to the process and whether it is data or executable code. The kernel maintains information about the execution status such as how recently the program had some CPU time, how much time, and a number called the “nice” number. It uses that information and the nice number to calculate the priority of the process. The kernel uses the priorities of all of the processes to determine which process(es) will be allocated some CPU time.
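The ps utility can display the nice number and the resulting priority for each process. The listing below is an illustrative sketch; the process names and numbers are placeholders (backup-job is a made-up name), but the columns are real: NI is the nice number, and PRI is the priority the kernel calculated from it.

$ ps -eo pid,ni,pri,comm | head -n 5
    PID  NI PRI COMMAND
      1   0  19 systemd
      2   0  19 kthreadd
    812   0  19 crond
   1420   5  14 backup-job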

Note that not all processes need CPU time simultaneously. In fact, for most desktop workstations in normal circumstances, usually only two or three processes at the most need to be on the CPU at any given time. This means that a simple quad-core processor can easily handle this type of CPU load.

If there are more programs – processes – running than there are CPUs in the system, the kernel is responsible for determining which process to interrupt in order to replace it with a different one that needs some CPU time.

Interprocess communication

Interprocess communication (IPC) is vital to any multitasking operating system. Many programs must be synchronized with each other to ensure that their work is properly coordinated. Interprocess communication is the tool that enables this type of inter-program cooperation.

The kernel manages a number of IPC methods. Shared memory is used when two tasks need to pass data between them. The Linux clipboard is a good example of shared memory. Data which is cut or copied to the clipboard is stored in shared memory. When the stored data is pasted into another application, that application looks for the data in the clipboard’s shared memory area. Named pipes can be used to communicate data between two programs. Data can be pushed into the pipe by one program, and the other program can pull the data out of the other end of the pipe. A program may collect data very quickly and push it into the pipe. Another program may take the data out of the other end of the pipe and either display it on the screen or store it to the disk, but it can handle the data at its own rate.
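Named pipes are easy to experiment with from the command line. The sketch below assumes two terminal sessions and uses /tmp/mypipe, a name chosen just for this example; each command blocks until its partner on the other end of the pipe is ready.

# In the first terminal: create the pipe and read from it
$ mkfifo /tmp/mypipe
$ cat /tmp/mypipe

# In the second terminal: push a line of text into the pipe
$ echo "Hello through the pipe" > /tmp/mypipe

# The first terminal now displays the text; remove the pipe when finished
$ rm /tmp/mypipe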

Device management

The kernel manages access to the physical hardware through the use of device drivers. Although we tend to think of this in terms of various types of hard drives and other storage devices, it also manages other input/output (I/O) devices such as the keyboard, mouse, display, printers, and so on. This includes management of pluggable devices such as USB storage devices and external USB and eSATA hard drives.

Access to physical devices must be managed carefully, or more than one application might attempt to control the same device at the same time. The Linux kernel manages devices so that only one program actually has control of or access to a device at any given moment. One example of this is a COM port. Only one program can communicate through a COM port at any given time. If you are using the COM port to get your e-mail from the Internet, for example, and try to start another program which attempts to use the same COM port, the Linux kernel detects that the COM port is already in use. The kernel then uses the hardware error handler to display a message on the screen that the COM port is in use.

For managing disk I/O devices, including USB, parallel and serial port I/O, and filesystem I/O, the kernel does not actually handle physical access to the disk but rather manages the requests for disk I/O submitted by the various running programs. It passes these requests on to the filesystem, whether it be EXT[2,3,4], VFAT, HPFS, CDFS (CD-ROM file system), NFS (Network Filesystem), or some other filesystem type, and manages the transfer of data between the filesystem and the requesting programs.

We will see later how all types of hardware – whether they are storage devices or something else attached to a Linux host – are handled as if they were files. This results in some amazing capabilities and interesting possibilities.
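A hint of that model is already visible in the /dev directory, where the kernel’s device drivers expose hardware as special files. This is an illustrative sketch, since the devices present depend on your own hardware; the leading b and c mark block and character devices, respectively.

$ ls -l /dev/sda /dev/tty1 /dev/null
brw-rw----. 1 root disk 8, 0 Nov  4 09:14 /dev/sda
crw--w----. 1 root tty  4, 1 Nov  4 09:14 /dev/tty1
crw-rw-rw-. 1 root root 1, 3 Nov  4 09:14 /dev/null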

Error handling

Errors happen. As a result, the kernel needs to identify these errors when they occur. The kernel may take some action such as retrying the failing operation, displaying an error message to the user, and logging the error message to a log file.

In many cases, the kernel can recover from errors without human intervention. In others, human intervention may be required. For example, if the user attempts to unmount a USB storage device that is in use, the kernel will detect this and post a message to the umount program, which usually sends the error message to the user interface. The user must then take whatever action is necessary to ensure that the storage device is no longer in use and then attempt to unmount the device.
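A concrete, illustrative example of this interaction is trying to unmount a filesystem while the shell’s current directory is still inside it. The mount point here is just a placeholder, the # prompt indicates the commands are run as root, and depending on how the device was mounted, you may need root privileges to unmount it.

# umount /mnt/usb
umount: /mnt/usb: target is busy.
# cd /root
# umount /mnt/usb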

Utilities

In addition to its kernel functions, most operating systems provide a number of basic utility programs which enable users to manage the computer on which the operating system resides. These are the commands such as cp, ls, mv, and so on, as well as the various shells, such as bash, ksh, csh and so on, which make managing the computer so much easier.

These utilities are not truly part of the operating system; they are merely provided as useful tools that can be used by the SysAdmin to perform administrative tasks. In Linux, often these are the GNU core utilities. However, common usage groups the kernel together with the utilities into a single conceptual entity that we call the operating system.

A bit of history

Entire books have been written just about the history of Linux and Unix, so I will attempt to make this as short as possible. It is not necessary to know this history to be able to use Unix or Linux, but you may find it interesting. I have found it very useful to know some of this history because it has helped me to understand the Unix and Linux Philosophy and to formulate my own philosophy, which I discuss in my book, The Linux Philosophy for SysAdmins, and a good bit in the three volumes of this course.

Starting with UNICS

The history of Linux begins with UNICS, which was originally written as a gaming platform to run a single game. Ken Thompson was an employee at Bell Labs in the late 1960s – before the breakup – working on a complex project called Multics. Multics was an acronym that stood for Multiplexed Information and Computing Service. It was supposed to be a multitasking operating system for the GE (yes, General Electric) 645 mainframe computer. It was a huge, costly, complex project with three very large organizations, GE, Bell Labs, and MIT, working on it.

Although Multics never amounted to much more than a small bump along the road of computer history, it did introduce a good number of then innovative features that had never before been available in an operating system. These features included multitasking and multiuser capabilities.

Ken Thompson, one of the developers of Multics, had written a game called Space Travel that ran under Multics. Unfortunately, due at least in part to the committee-driven design of Multics, the game ran very slowly. It was also very expensive to run at about $50 per iteration. As with many projects developed by committees, Multics died a slow, agonizing death. The platform on which the Space Travel game was run was no longer available.

Thompson then rewrote the game to run on a DEC PDP-7 computer similar to the one in Figure 2-3 that was just sitting around gathering dust. In order to make the game run on the DEC, he and some of his buddies, Dennis Ritchie and Rudd Canaday, first had to write an operating system for the PDP-7. Because it could only handle two simultaneous users – far fewer than Multics had been designed for – they called their new operating system UNICS for UNiplexed Information and Computing System as a bit of geeky humor.

UNIX

Some time later, the UNICS name was modified slightly to UNIX, and that name has stuck ever since.

In 1970, recognizing its potential, Bell Labs provided some financial support for the Unix operating system and development began in earnest. In 1972 the entire operating system was rewritten in C to make it more portable and easier to maintain than the assembly language in which it had originally been written. By 1978, Unix was in fairly wide use inside AT&T Bell Labs and many universities.

Due to the high demand, AT&T decided to release a commercial version of Unix in 1982. Unix System III was based on the seventh version of the operating system. In 1983, AT&T released Unix System V Release 1. For the first time, AT&T promised to maintain upward compatibility for future versions. Thus programs written to run on SVR1 would also run on SVR2 and future releases. Because this was a commercial version, AT&T began charging license fees for the operating system.

Also, in order to promote the spread of Unix and to assist many large universities in their computing programs, AT&T gave away the source code of Unix to many of these institutions of higher learning. This caused one of the best and one of the worst situations for Unix. The best thing about the fact that AT&T gave the source code to universities was that it promoted rapid development of new features. It also promoted the rapid divergence of Unix into many distributions.

System V was an important milestone in the history of Unix. Many Unix variants are today based on System V. The most current release is SVR4 which is a serious attempt to reconverge the many variants that split off during these early years. SVR4 contains most of the features of both System V and BSD. Hopefully they are the best features.
Figure 2-3

A DEC PDP-7 similar to the one used by Ken Thompson and Dennis Ritchie to write the UNICS [sic] operating system. This one is located in Oslo, and the picture was taken in 2005 before restoration began. Photo courtesy of Wikimedia, CC BY-SA 1.0

The Berkeley Software Distribution (BSD)

The University of California at Berkeley got into the Unix fray very early. Many of the students who attended the school added their own favorite features to BSD Unix. Eventually only a very tiny portion of BSD was still AT&T code. Because of this it was very different from, though still similar to, System V. Ultimately the remaining portion of BSD was totally rewritten as well, and folks using it no longer needed to purchase a license from AT&T.

The Unix Philosophy

The Unix Philosophy is an important part of what makes Unix unique and powerful. Because of the way that Unix was developed, and the particular people involved in that development, the Unix Philosophy was an integral part of the process of creating Unix and played a large part in many of the decisions about its structure and functionality. Much has been written about the Unix Philosophy. And the Linux Philosophy is essentially the same as the Unix Philosophy because of its direct line of descent from Unix.

The original Unix Philosophy was intended primarily for the system developers. In fact, the developers of Unix, led by Thompson and Ritchie, designed Unix in a way that made sense to them, creating rules, guidelines, and procedural methods and then designing them into the structure of the operating system. That worked well for system developers and that also – partly, at least – worked for SysAdmins (system administrators). That collection of guidance from the originators of the Unix operating system was codified in the excellent book, The Unix Philosophy, by Mike Gancarz, and then later updated by Mr. Gancarz as Linux and the Unix Philosophy.

Another fine and very important book, The Art of Unix Programming, by Eric S. Raymond, provides the author’s philosophical and practical views of programming in a Unix environment. It is also somewhat of a history of the development of Unix as it was experienced and recalled by the author. This book is also available in its entirety at no charge on the Internet.

I learned a lot from all three of those books. They all have great value to Unix and Linux programmers. In my opinion, Linux and the Unix Philosophy and The Art of Unix Programming should be required reading for Linux programmers, system administrators, and DevOps personnel. I strongly recommend that you read these two books in particular.

I have been working with computers for over 45 years. It was not until I started working with Unix and Linux and started reading some of the articles and books about Unix, Linux, and the common philosophy they share that I understood the reasons why many things in the Linux and Unix worlds are done as they are. Such understanding can be quite useful in learning new things about Linux and in being able to reason through problem solving.

A (very) brief history of Linux

Linus Torvalds, the creator of Linux, was a student at the University of Helsinki in 1991. The university was using a very small version of Unix called Minix for school projects. Linus was not very happy with Minix and decided to write his own Unix-like operating system.

Linus wrote the kernel of Linux and used the then ubiquitous PC with an 80386 processor as the platform for his operating system because that is what he had on hand as his home computer. He released an early version in 1991 and the first public version in March of 1992.

Linux spread quickly, in part because many of the people who downloaded the original versions were hackers like Linus and had good ideas that they wanted to contribute. These contributors, with guidance from Torvalds, grew into a loose international affiliation of hackers dedicated to improving Linux.

Linux is now found in almost all parts of our lives. It is ubiquitous, and we depend upon it in many places that we don’t normally even think about. Our mobile phones, televisions, automobiles, the International Space Station, most supercomputers, the backbone of the Internet, and most of the web sites on the Internet all utilize Linux.

For more detailed histories of Linux, see Wikipedia and its long list of references and sources.

Core utilities

Linus Torvalds wrote the Linux kernel, but the rest of the operating system was written by others. These utilities were the GNU core utilities developed by Richard M. Stallman (aka RMS) and others as part of their intended free GNU operating system. All SysAdmins use these core utilities regularly, pretty much without thinking about them. There is also another set of basic utilities, util-linux, that we should look at because they are also important Linux utilities.

Together, these two sets of utilities comprise many of the most basic tools – the core – of the Linux system administrator’s toolbox. These utilities address tasks that include management and manipulation of text files, directories, data streams, various types of storage media, process controls, filesystems, and much more. The basic functions of these tools are the ones that allow SysAdmins to perform many of the tasks required to administer a Linux computer. These tools are indispensable because without them, it is not possible to accomplish any useful work on a Unix or Linux computer.

GNU is a recursive acronym that stands for “GNU’s Not Unix.” It was developed by the Free Software Foundation (FSF) to provide free software to programmers and developers. Most distributions of Linux contain the GNU utilities.

GNU coreutils

To understand the origins of the GNU core utilities, we need to take a short trip in the Wayback Machine to the early days of Unix at Bell Labs. Unix was originally written so that Ken Thompson, Dennis Ritchie, Doug McIlroy, and Joe Ossanna could continue with something they had started while working on a large multitasking and multiuser computer project called Multics. That little something was a game called “Space Travel.” As is true today, it always seems to be the gamers that drive forward the technology of computing. This new operating system was much more limited than Multics as only two users could log in at a time, so it was called Unics. This name was later changed to UNIX.

Over time, UNIX turned out to be such a success that Bell Labs began essentially giving it away to universities and later to companies for the cost of the media and shipping. Back in those days, system-level software was shared between organizations and programmers as they worked to achieve common goals within the context of system administration.

Eventually the PHBs at AT&T decided that they should start making money on Unix and started using more restrictive – and expensive – licensing. This was taking place at a time when software in general was becoming more proprietary, restricted, and closed. It was becoming impossible to share software with other users and organizations.

Some people did not like this and fought it with – free software. Richard M. Stallman led a group of rebels who were trying to write an open and freely available operating system that they called the “GNU Operating System.” This group created the GNU utilities but did not produce a viable kernel.

When Linus Torvalds first wrote and compiled the Linux kernel, he needed a set of very basic system utilities to even begin to perform marginally useful work. The kernel does not provide these commands or even any type of command shell such as bash. The kernel is useless by itself. So Linus used the freely available GNU core utilities and recompiled them for Linux. This gave him a complete operating system even though it was quite basic.

You can learn about all of the individual programs that comprise the GNU utilities by entering the command info coreutils at a terminal command line. The utilities are grouped by function to make specific ones easier to find. Highlight the group you want more information on, and press the Enter key.

There are 102 utilities in that list. It does cover many of the basic functions necessary to perform some basic tasks on a Unix or Linux host. However, many basic utilities are missing. For example, the mount and umount commands are not in this list. Those and many of the other commands that are not in the GNU coreutils can be found in the util-linux collection.
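On Fedora and other RPM-based distributions, you can ask which package owns a given command, which makes the split between the two collections easy to see for yourself. The version numbers shown here are illustrative placeholders.

$ rpm -qf /usr/bin/ls
coreutils-8.31-6.fc31.x86_64
$ rpm -qf /usr/bin/mount
util-linux-2.34-4.fc31.x86_64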

util-linux

The util-linux package of utilities contains many of the other common commands that SysAdmins use. These roughly 107 utilities are distributed by the Linux Kernel Organization, and virtually every distribution uses them. Note that it was the older GNU collections – fileutils, shellutils, and textutils – that were combined into the single coreutils package in 2003; util-linux has always been maintained as its own separate collection.

These two collections of basic Linux utilities, the GNU core utilities and util-linux, together provide the basic utilities required to administer a basic Linux system. As I researched this book, I found several interesting utilities in this list that I never knew about. Many of these commands are seldom needed. But when you do need them, they are indispensable. Between these two collections, there are over 200 Linux utilities. Linux has many more commands, but these are the ones that are needed to manage the most basic functions of the typical Linux host. The lscpu utility that I used earlier in this chapter is distributed as part of the util-linux package.

I find it easiest to refer to these two collections together as the Linux core utilities.

Copyleft

Just because Linux and its source code are freely available does not mean that there are no legal or copyright issues involved. Linux is copyrighted under the GNU General Public License Version 2 (GPL2). The GNU GPL2 is actually called a copyleft instead of a copyright by most people in the industry because its terms are so significantly different from most commercial licenses. The terms of the GPL allow you to distribute or even to sell Linux (or any other copylefted software), but you must make the complete source code available without restrictions of any kind, as well as the compiled binaries.

The original owner – Linus Torvalds in the case of parts of the Linux kernel – retains copyright to the portions of the Linux kernel he wrote, and other contributors to the kernel retain the copyright to their portions of the software no matter by whom or how much it is modified or added to.

Games

One thing that my research has uncovered and which I find interesting is that right from the beginning, it has been the gamers that have driven technology. At first it was things like Tic-Tac-Toe on an old IBM 1401, then Space Travel on Unics and the PDP-7, Adventure and many other text-based games on Unix, single player 2D video games on the IBM PC and DOS, and now first person shooter (FPS) and massively multiplayer online games (MMOGs) on powerful Intel and AMD computers with lots of RAM, expensive and very sensitive keyboards, and extremely high-speed Internet connections. Oh, yes, and lights. Lots of lights inside the case, on the keyboard and mouse, and even built into the motherboards. In many instances these lights are programmable.

AMD and Intel are intensely competitive in the processor arena, and both companies provide very high-powered versions of their products to feed the gaming community. These powerful hardware products also provide significant benefits to other communities like writers.

For me, having many CPUs and huge amounts of RAM and disk space makes it possible to run several virtual machines simultaneously. This enables me to have two or three VMs to represent the ones you will use for the experiments that will help you to explore Linux in this book, and other, crashable and disposable VMs that I use to test various scenarios.

Chapter summary

Linux is an operating system that is designed to manage the flow and storage of programs and data in a modern Intel computer. It consists of a kernel, which was written by Linus Torvalds, and two sets of system-level utilities that provide the SysAdmin with the ability to manage and control the functions of the system and the operating system itself. These two sets of utilities, the GNU utilities and util-linux, together consist of a collection of over 200 Linux core utilities that are indispensable to the Linux SysAdmin.

Linux must work very closely with the hardware in order to perform many of its functions, so we looked at the major components of a modern Intel-based computer.

Exercises

  1. What is the primary function of an operating system?

  2. List at least four additional functions of an operating system.

  3. Describe the purpose of the Linux core utilities as a group.

  4. Why did Linus Torvalds choose to use the GNU core utilities for Linux instead of writing his own?