The reason we use computers is to manipulate data. It used to be called “Data Processing” for a reason and that was an accurate description. We still process data although it may be in the form of video and audio streams, network and wireless streams, word processing data, spreadsheets, images, and more. It is all still just data.
We work with and manipulate text data streams with the tools we have available to us in Linux. That data usually needs to be stored, and when there is a need to store data, it is always better to store it in open file formats than closed ones.
Although many user application programs store data in ASCII formats, including simple flat ASCII and XML, this chapter is about configuration data and scripts that relate directly to Linux. The files we will consider in this chapter are about system configuration.
Closed Is Impenetrable
Way back before the Registry was introduced with Windows 3.1, most utilities and applications stored their configuration data in .ini files. These .ini files were stored as ASCII text and were easy to access, read, and even modify. All it took to change them was a simple text editor.
The Registry changed all that by storing configuration data in a single, large, and impenetrable binary data file. Although individual programs could still store configuration data in .ini files, the Registry was touted as a way to centralize control over program configuration, and its binary format was allegedly faster to parse than ASCII text files.
As System Administrators we need to work with many different types of data. Binary formats are by their very nature obscure and require special tools and knowledge to manipulate. There is a plethora of tools available that provide registry viewing and editing capability, ranging from so-called freeware to expensive commercial programs. The need to use special tools that are themselves closed in order to manage a computer is a further step into impenetrability.
Part of the problem with all this is that the writers of these tools need to have information about the contents of registry entries that are being viewed or edited. Without that inside knowledge from the vendors of the proprietary software these tools are also useless. And one reason that proprietary software stores configuration data in a binary and proprietary format is to hide things from users.
This all stems from the closed and proprietary philosophy adhered to by these vendors. It appears on the surface to be about protecting the users from doing “stupid things,” but it is also a good way to obscure information.
I did try to locate a binary format Linux system configuration file in /etc but was unable to. Not one of the hundreds of configuration files in that directory was in a binary format. That is a really good thing, but it leaves me without a sample of a binary configuration file that I can use to show you what one is like.
One consequence of binary formats is that, had they prevailed, there would have been no reason to create the many powerful tools we have in Linux. None of the data streams generated from binary format files would be usable by tools like grep, awk, sed, cat, vim, emacs, or any of the hundreds of other text-based tools we take for granted every day while we administer the systems for which we are responsible.
Open Is Knowable
“Open source” is about the code and making the source code available to any and all who want to view or modify it. “Open data” is about the openness of the data itself.
The term open data does not mean just having access to the data itself; it also means that the data can be viewed, used in some manner, and shared with others. The exact manner in which those goals are achieved may be subject to some sort of attribution and open licensing. As with open source software, such licensing is intended to ensure the continued open availability of the data and not to restrict it in any manner.
Open data is knowable. That means that access to it is unfettered. Truly open data can be read freely and understood without the need for further interpretation or decryption. In the SysAdmin world, open means that the data we use to configure, monitor, and manage our Linux hosts is easy to find, read, and modify when necessary. It is stored in formats that permit that ease of access, such as ASCII text. When a system is open the data and software can all be managed by open tools – tools that work with ASCII text.
Flat ASCII Text
Flat text files are open and knowable. They are easy to read by both programs and SysAdmins so it is easy to see when things are working – or not. Most Linux configuration files are simple flat ASCII text files, which makes them easy to view and modify with the simple Linux text manipulation tools that are already at our disposal.
So we can use cat and less to view the Linux configuration files, and grep to extract and view lines containing specified strings. We can use vi, vim, emacs, or any other text editor to modify configuration files that are in ASCII text format.
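As a small illustration of that, the sketch below uses grep to strip the comments and blank lines from a configuration file so that only the active settings remain. The sample file and its two settings are made up for the example; on a real host you would point grep at a file under /etc instead.

```shell
# Create a small sample configuration file (hypothetical contents).
conf=$(mktemp)
cat > "$conf" <<'EOF'
# Sample service configuration
Port 22

# Root logins are disabled
PermitRootLogin no
EOF

# Show only the active settings: drop comment lines and blank lines.
active=$(grep -Ev '^[[:space:]]*(#|$)' "$conf")
echo "$active"

# Clean up the sample file.
rm -f "$conf"
```

The same one-line grep works on any commented ASCII configuration file, which is exactly the point: no special viewer is required.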
In one of my jobs – the one where we used Perl CGI scripts to manage the email system – we used flat text files to store all of our data. This data included departmental information such as who was authorized to access the data for that department. It also contained the ID and login information for the email users for each department.
We wrote some Perl programs to manage access to this data, both for us as the overall email SysAdmins, as well as for the departmental administrators. The data was still flat ASCII text files, so we could use basic Linux command-line tools to access and modify the data, especially when making mass changes to the files. At the same time we were also able to use our web-based Perl CGI scripts to work with individual personnel and departmental records.
We did think about using MySQL for record management, but we decided that ASCII files made more sense for their ease of access. One of our SysAdmins wrote a series of Perl scripts in about a week that allowed us to use SQL-like function calls from within the Perl scripts, so we had the best of both worlds.
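To give a feel for the kind of mass change that flat files make easy, here is a minimal sketch. The colon-separated record layout (department, name, login) and the sample records are my own invention for the example, not the format we actually used.

```shell
# A hypothetical flat-file record format: department:name:login
data=$(mktemp)
cat > "$data" <<'EOF'
sales:Alice Smith:asmith
sales:Bob Jones:bjones
hr:Carol White:cwhite
EOF

# Mass change: rename the "sales" department to "marketing" in every record.
# The result goes to a new file so the original can be checked first.
sed 's/^sales:/marketing:/' "$data" > "$data.new"

# Query: list the logins of everyone now in marketing.
logins=$(awk -F: '$1 == "marketing" { print $3 }' "$data.new")
echo "$logins"

# Clean up the sample files.
rm -f "$data" "$data.new"
```

Two standard tools and two lines of real work cover both the update and the query, with no database server involved.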
System Configuration Files
Most of the system-wide configuration files are located in the /etc directory and its subdirectories. The files in /etc provide configuration data for many of the system services and servers such as email (SMTP, POP, IMAP), web (HTTP), time (NTP or chrony), SSH, network adapters and routing, the GRUB boot loader, display screen, and printer configuration, and much more.
Relax – we are not going to examine every line of the /etc/bashrc file in Figure 13-1. However, there are a few things that we should observe in this file.
First, just look at all of the comments. This file is meant to be read by users. We SysAdmins are, after all, advanced users. One thing I like about Red Hat-based distributions is that most of the configuration files and scripts are well commented.
One of the functions of this script is to set the shell command prompt. The script determines whether the shell is a standard xterm or vte terminal session, or if it is in a screen session. It sets the prompt string differently depending upon that condition. It also uses external files such as /etc/sysconfig/bash-prompt-xterm, which contains the prompt configuration in a file and location easily managed by the SysAdmin.
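The decision logic can be pictured with a much-simplified sketch like the one below. It is not the code from /etc/bashrc; the function name and the prompt strings are placeholders chosen only to show a case statement selecting a prompt from the terminal type.

```shell
# Choose a prompt string based on the terminal type passed in.
# The prompt strings are placeholders, not the ones /etc/bashrc uses.
choose_prompt() {
    case "$1" in
        xterm*|vte*) echo "xterm> " ;;
        screen*)     echo "screen> " ;;
        *)           echo "> " ;;
    esac
}

# Show what would be chosen for the current terminal.
choose_prompt "${TERM:-unknown}"
```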
Up near the top of the file is a series of comments that briefly describe the function of the script along with an admonishment not to change this particular file. The comments also tell you where your own modifications should go. We will look at that a little further on.
Notice how the indents make the structure of this script fragment easier to read than if everything were jammed up against the left margin.
Did you catch that as we went by? This configuration file is an executable program. It is a bash script that contains program logic that can determine which execution path to take depending upon outside conditions. This script is not complete in itself; it is actually a fragment that can be sourced – imported – into other scripts as necessary.
Sourcing is a bash shell method for including the content of other bash scripts or fragments into a script. This allows the contents of the fragment being sourced to be used by multiple scripts. You can think of it like function libraries used by compiled programs. The sourced file is loaded into the calling script at the location of the source command. It is then immediately executed.
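A minimal sketch of sourcing looks like this. The fragment, its variable, and its function are made up for the example, and the fragment lives in a temporary file so that nothing on the system is changed.

```shell
# Write a small fragment that defines a variable and a function.
fragment=$(mktemp)
cat > "$fragment" <<'EOF'
greeting="Hello from the fragment"
say_greeting() {
    echo "$greeting"
}
EOF

# Source the fragment. The "." command is the portable spelling; bash also
# accepts "source". The fragment runs in the current shell, so its variable
# and function remain defined afterward.
. "$fragment"

result=$(say_greeting)
echo "$result"

# Clean up the fragment file.
rm -f "$fragment"
```

Had the fragment been run as a separate program instead of being sourced, the variable and function would have vanished when that program exited.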
The lines highlighted in Figure 13-2 source the code from all of the *.sh files in /etc/profile.d into this code fragment.
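The general pattern is a loop over the matching files. The sketch below shows the idea using a scratch directory as a stand-in for /etc/profile.d; it is an approximation of the technique, not the literal text of the system file, and the file and variable names are invented.

```shell
# Stand-in for /etc/profile.d so the sketch can run anywhere.
profile_dir=$(mktemp -d)
echo 'FromFirst="one"'  > "$profile_dir/first.sh"
echo 'FromSecond="two"' > "$profile_dir/second.sh"
echo 'Ignored="yes"'    > "$profile_dir/notes.txt"   # not *.sh, so skipped

# Source every readable *.sh file in the directory.
for f in "$profile_dir"/*.sh; do
    if [ -r "$f" ]; then
        . "$f"
    fi
done

echo "FromFirst=$FromFirst FromSecond=$FromSecond Ignored=${Ignored:-unset}"

# Clean up the scratch directory.
rm -rf "$profile_dir"
```

Because the glob only matches names ending in .sh, dropping a file with any other extension into the directory has no effect on the shell.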
The /etc/profile file is also a script fragment. We could spend some time here tracing how /etc/profile is launched, but that would take us in the wrong direction for what we are trying to accomplish. Suffice it to say that when bash is invoked as a login shell, it reads /etc/profile first (if it exists) and then looks for ~/.bash_profile, ~/.bash_login, and ~/.profile, in that order, reading the first one that exists and is readable.
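The bash documentation describes the search for the personal startup file as "first one found wins." The sketch below imitates that search for an arbitrary directory; the function name is my own, and a scratch directory stands in for a real home directory.

```shell
# Print the personal startup file a bash login shell would read for a given
# home directory: the first of the three candidates that exists and is readable.
first_login_file() {
    for name in .bash_profile .bash_login .profile; do
        if [ -r "$1/$name" ]; then
            echo "$1/$name"
            return 0
        fi
    done
    return 1
}

# Try it against a scratch directory standing in for a home directory.
home=$(mktemp -d)
touch "$home/.profile" "$home/.bash_login"
chosen=$(first_login_file "$home")
echo "$chosen"

# Clean up the scratch directory.
rm -rf "$home"
```

Here both .bash_login and .profile exist, so .bash_login is chosen and .profile is never consulted.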
Global Bash Configuration
Now, let’s make some global configuration changes to bash.
The /etc/bashrc file mentions the /etc/profile.d directory. Let’s look at that directory and its files in Experiment 13-1. While we are at it, we will add some global bash configuration of our own.
Experiment 13-1
This experiment should be performed as root. Our objective is to make some additions to the global configuration for the bash shell.
Make /etc/profile.d the PWD and list the contents.
All of the files with a .sh extension are executed by the code in /etc/bashrc or /etc/profile. The ones with other extensions are not executed. We will make our additions to the bash configuration by creating a new file in this directory.
Use your favorite editor to create a new file named “mybash.sh” in this directory. Add the following content to the file.
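Something along the following lines would serve for this experiment. Treat it as a sketch: the alias name lsn and the value assigned to the variable are placeholders of my choosing, while TestVariable is the name we check for in the next steps.

```shell
# /etc/profile.d/mybash.sh
# Illustrative contents only; the alias and the value are placeholders.
alias lsn='ls --color=never'     # a plain, uncolored ls
TestVariable="Hi there"          # a variable we can look for later
```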
Before we test this, let’s ensure that the aliases are not already there and that the TestVariable is null.
Now test the results. This change will not affect bash sessions that are already open. New sessions will reflect the changes. So open a new terminal session. As the student user, run the following commands to verify the results.
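One way to make both the "before" and the "after" checks is sketched here. The alias name lsn is a placeholder; substitute whichever alias you put in mybash.sh. Run in a session that was already open, it should report nothing defined; run in a new session, it should report the alias and the variable.

```shell
# Report whether an alias and a variable are present in the current shell.
# "lsn" is a placeholder alias name; substitute one you put in mybash.sh.
if alias lsn >/dev/null 2>&1; then
    alias_state="defined"
else
    alias_state="not defined"
fi
echo "alias lsn: $alias_state"

# An unset variable shows up as an empty string between the quotes.
echo "TestVariable: '${TestVariable:-}'"
```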
It is easy to make changes to ASCII files as this experiment shows. Notice that a reboot was not required to make these changes take effect – they were in effect immediately for new bash terminal sessions.
User Configuration Files
Let’s look at the so-called hidden files in your own home directory – those whose names begin with a period (.). These are user-specific configuration files that you can change to meet your own needs and preferences. Let’s look at the .bashrc file, which is the configuration file in which individual users can set their own bash configuration such as aliases, functions, and environment variables that are unique to them.
Experiment 13-2
Perform this experiment as the student user.
The .bashrc file is short so we can view it with cat. Let’s be sure we are in the home directory for the student user and then display the file.
This file is well commented also and even tells us where to add our own configuration. So let’s add something innocuous that will allow us to test this local configuration. Use your favorite editor to add the following line to the end of the file.
View the variable.
The variable has not yet been added to the environment of this session, because the shell read the .bashrc file before we edited it. It will be part of the environment for bash terminal sessions opened from now on. It can be added to existing bash terminal sessions by sourcing the .bashrc file like this.
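The whole sequence can be rehearsed safely with a scratch file standing in for ~/.bashrc, as in this sketch. MyVariable and its value are placeholders; against the real file the final step would be to source ~/.bashrc itself.

```shell
# Use a scratch file as a stand-in for ~/.bashrc so nothing real is changed.
rcfile=$(mktemp)

# Add a variable assignment to the end of the file.
# MyVariable and its value are placeholders for this sketch.
echo 'MyVariable="set in bashrc"' >> "$rcfile"

# The running shell has not read the file yet, so the variable is empty.
before="${MyVariable:-}"
echo "before sourcing: '$before'"

# Sourcing the file runs it in the current shell and defines the variable.
. "$rcfile"
echo "after sourcing: '$MyVariable'"

# Clean up the scratch file.
rm -f "$rcfile"
```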
These are trivial examples, but they should give you some idea of how flexible having open format configuration files can be. It is easy to follow the logic of the files and easy to modify them when needed. Although each distribution varies in how it adds comments to these files, all of the ones I have used have enough information in the comments to enable me to figure out the appropriate location for me to alter the configuration. They also contain enough information to allow me to follow the logic. That doesn’t mean I don’t have to work a bit to understand it all, but I can do it if I need to or am just curious.
Be aware that the local user bash configuration overrides the global configuration. So if a user has the knowledge and wants to alter a global configuration parameter for themselves, they can do that by setting it in the ~/.bashrc file.
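The override is nothing more mysterious than the order in which the files are read: whichever assignment runs last is the one that sticks. The sketch below demonstrates this with two scratch files standing in for the global and user files; the variable name and values are invented for the example.

```shell
# Scratch files standing in for /etc/bashrc and ~/.bashrc.
global_rc=$(mktemp)
user_rc=$(mktemp)
echo 'PreferredPager="more"' > "$global_rc"
echo 'PreferredPager="less"' > "$user_rc"

# Global settings are read first, then the user's own file.
. "$global_rc"
. "$user_rc"

# The later assignment wins, so the user's choice is the one in effect.
echo "PreferredPager=$PreferredPager"

# Clean up the scratch files.
rm -f "$global_rc" "$user_rc"
```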
ASCII Rocks
Now we can see how the openness created by using ASCII text files for configuration allows us to explore and understand many of the processes of our Linux operating system. ASCII is the go-to format for configuration files and for shell scripts.
Many system-level executables are also bash scripts that set configurations and launch the binaries. Let’s check out the /bin directory to verify this.
Experiment 13-3
Perform this experiment as the root user.
Make /bin the PWD and count the number of files just to see how many executables there are altogether.
Let’s figure out how many are ASCII text files.
Over 13% of the executable files in /bin are ASCII shell scripts. Now view the list of files that are ASCII scripts. The specific results from your host will almost certainly differ from mine.
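If you want to repeat the count on another directory, a rough sketch like the following can help. Note that it swaps in a different test: it looks for the "#!" interpreter line at the start of each file rather than asking the file utility whether the file is ASCII text, so it counts scripts only and its numbers will not match the figures above exactly. The function name is my own.

```shell
# Count the regular files in a directory and how many of them start with
# the "#!" interpreter line that marks a script.
count_scripts() {
    total=0
    scripts=0
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        total=$((total + 1))
        # Read only the first two bytes; unreadable files are not counted.
        if head -c 2 "$f" 2>/dev/null | grep -q '^#!'; then
            scripts=$((scripts + 1))
        fi
    done
    if [ "$total" -gt 0 ]; then
        echo "$scripts of $total files are scripts ($((scripts * 100 / total))%)"
    else
        echo "no regular files found"
    fi
}

count_scripts /bin
```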
I won’t list those files here, but you should look through them a bit just to see what is there.
Now let’s look at one of these scripts. I chose the ps2ascii script, which is used as a wrapper around the ghostscript program.
Note: If the host you are using does not have the ps2ascii program installed, you can either install it or choose a different ASCII file to explore for the rest of this experiment.
Used through this wrapper, the Ghostscript program converts PostScript and PDF files to ASCII text files by extracting the text out of the originals. The wrapper has comments that tell us what the program does. It sets some variables and then runs the program with options for different conditions.
Scripts like ps2ascii allow a great deal of flexibility when launching programs. They make life easier for users because the scripts can manage the task of setting up options and arguments that are passed to the main program.
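The wrapper idea is easy to try for yourself. The sketch below is a hypothetical wrapper, not a copy of ps2ascii: it is written as a shell function so it can be pasted into a terminal, the sort command stands in for the wrapped program, and the SORTED_OPTS variable is an invented name for the override.

```shell
# A hypothetical wrapper in the spirit of ps2ascii: it fills in default
# options and then hands the user's arguments to the real program.
# "sort" stands in for the wrapped program so the sketch runs anywhere.
sorted() {
    # Default options, overridable through an environment variable.
    opts="${SORTED_OPTS:--u}"
    # $opts is left unquoted on purpose so several options split into words.
    sort $opts "$@"
}

# Sort three lines; the default -u option removes the duplicate.
printf 'b\na\nb\n' | sorted
```

A wrapper kept as a standalone script would normally finish with exec so that the real program replaces the shell rather than running as its child.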
Final Thoughts
Open data in Linux enables us as SysAdmins to explore everything in order to satisfy our curiosity about how Linux works. The use of ASCII text files for scripting and configuration files allows us access to the inner workings of the environment in which we work every day.
We were able to use that openness to trace our way through some related bash configuration programs and files. We discovered how to make global and local changes. We added some configuration of our own so that bash is now configured more to our own liking.
And, if we want or need to, we can download the source code used to compile the executable code for the kernel and all of the open source programs and utilities available with our Linux distribution. I have done that on a couple of occasions because I wanted to know more. You can, too, if your curiosity takes you there.
All of this is only possible in an open operating system.