© David Both 2018
David Both, The Linux Philosophy for SysAdmins, https://doi.org/10.1007/978-1-4842-3730-4_13

13. Store Data in Open Formats

David Both, Raleigh, North Carolina, USA

The reason we use computers is to manipulate data. It used to be called “Data Processing” for a reason and that was an accurate description. We still process data although it may be in the form of video and audio streams, network and wireless streams, word processing data, spreadsheets, images, and more. It is all still just data.

We work with and manipulate text data streams with the tools we have available to us in Linux. That data usually needs to be stored, and when there is a need to store data, it is always better to store it in open file formats than closed ones.

Although many user application programs store data in ASCII formats, including simple flat ASCII and XML, this chapter is about configuration data and scripts that relate directly to Linux. The files we will consider in this chapter are about system configuration.

Closed Is Impenetrable

Way back before the Registry1 was introduced with Windows 3.1, most utilities and applications stored their configuration data in .ini files. These .ini files were stored as ASCII text and were easy to access, read, and even to modify. All it took was a simple text editor to make changes to these .ini configuration files.

The Registry changed all that by storing configuration data in a single, large, and impenetrable binary data file. Although individual programs could still store configuration data in .ini files, the Registry was touted as a way to centralize control over program configuration, and its binary format was allegedly faster to parse than ASCII text files.

As system administrators, we need to work with many different types of data. Binary formats are by their very nature obscure and require special tools and knowledge to manipulate. There is a plethora of tools available that provide Registry viewing and editing capability. These tools range from so-called freeware to expensive commercial programs. The necessity to use special tools that are themselves closed in order to manage a computer is a further step into impenetrability.

Part of the problem with all this is that the writers of these tools need to have information about the contents of registry entries that are being viewed or edited. Without that inside knowledge from the vendors of the proprietary software these tools are also useless. And one reason that proprietary software stores configuration data in a binary and proprietary format is to hide things from users.

This all stems from the closed and proprietary philosophy adhered to by these vendors. It appears on the surface to be about protecting the users from doing “stupid things,” but it is also a good way to obscure information.

I did try to locate a binary format Linux system configuration file in /etc but was unable to. Not one of the hundreds of configuration files in that directory was in a binary format. That is a really good thing, but it leaves me without a sample of a binary configuration file that I can use to show you what one is like.

One of the issues with binary formats is that, had Linux relied on them, there would have been no reason to create the many powerful text tools we have today. None of the data streams generated from binary format files would be usable by tools like grep, awk, sed, cat, vim, emacs, or any of the hundreds of other text-based tools we take for granted every day while we administer the systems for which we are responsible.

Open Is Knowable

“Open source” is about the code and making the source code available to any and all who want to view or modify it. “Open data2” is about the openness of the data itself.

The term open data does not mean just having access to the data itself. It also means that the data can be viewed, used in some manner, and shared with others. The exact manner in which those goals are achieved may be subject to some sort of attribution and open licensing. As with open source software, such licensing is intended to ensure the continued open availability of the data and not to restrict it in any manner.

Open data is knowable. That means that access to it is unfettered. Truly open data can be read freely and understood without the need for further interpretation or decryption. In the SysAdmin world, open means that the data we use to configure, monitor, and manage our Linux hosts is easy to find, read, and modify when necessary. It is stored in formats that permit that ease of access, such as ASCII text. When a system is open the data and software can all be managed by open tools – tools that work with ASCII text.

Flat ASCII Text

Flat text files are open and knowable. They are easy to read by both programs and SysAdmins so it is easy to see when things are working – or not. Most Linux configuration files are simple flat ASCII text files, which make them easy to view and modify with the simple Linux text manipulation tools that are already at our disposal.

So we can use cat and less to view the Linux configuration files, and grep to extract and view lines containing specified strings. We can use vi, vim, emacs, or any other text editor to modify configuration files that are in ASCII text format.
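As a minimal sketch of that workflow, the following commands view, query, and modify a small text configuration file. The file name sample.conf and its keys are hypothetical and are created in a temporary directory, so nothing in /etc is touched.

```shell
# Hypothetical sample file in a scratch directory; not a real system file.
work_dir=$(mktemp -d)
conf_file="$work_dir/sample.conf"

cat > "$conf_file" <<'EOF'
# Sample service configuration
port=8080
log_level=info
EOF

# View only the active (non-comment) lines.
grep -v '^#' "$conf_file"

# Extract a single value by its key.
port=$(grep '^port=' "$conf_file" | cut -d= -f2)
echo "port is $port"

# Change a setting in place with sed, keeping a backup copy (.bak).
sed -i.bak 's/^log_level=.*/log_level=debug/' "$conf_file"
log_level=$(grep '^log_level=' "$conf_file" | cut -d= -f2)
echo "log_level is now $log_level"

# Clean up the scratch directory.
rm -rf "$work_dir"
```

None of these steps needs anything beyond the standard text tools, which is the whole point of keeping the file in an open format.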

In one of my jobs – the one where we used Perl CGI scripts to manage the email system – we used flat text files to store all of our data. This data included departmental information such as who was authorized to access the data for that department. It also contained the ID and login information for the email users for each department.

We wrote some Perl programs to manage access to this data, both for us as the overall email SysAdmins, as well as for the departmental administrators. The data was still flat ASCII text files, so we could use basic Linux command-line tools to access and modify the data, especially when making mass changes to the files. At the same time we were also able to use our web-based Perl CGI scripts to work with individual personnel and departmental records.

We did think about using MySQL for record management, but we decided that ASCII files made more sense for their ease of access. One of our SysAdmins wrote a series of Perl scripts in about a week that allowed us to use SQL-like function calls from within the Perl scripts, so we had the best of both worlds.
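The Perl scripts from that job are not reproduced here. As a rough illustration of the same idea, this sketch uses awk and sed against a flat text file. The file layout (login:department:role) and every record in it are invented for the example.

```shell
# Hypothetical flat-file "table" in a scratch directory.
work_dir=$(mktemp -d)
users_file="$work_dir/users.txt"

# Fields: login:department:role
cat > "$users_file" <<'EOF'
asmith:finance:admin
bjones:finance:user
cdoe:legal:user
EOF

# Roughly: SELECT login FROM users WHERE department = 'finance'
finance_logins=$(awk -F: '$2 == "finance" { print $1 }' "$users_file")
echo "$finance_logins"

# A mass change: rename the finance department in every record at once.
sed -i.bak 's/:finance:/:accounting:/' "$users_file"
accounting_count=$(grep -c ':accounting:' "$users_file")
echo "$accounting_count records now in accounting"

# Clean up the scratch directory.
rm -rf "$work_dir"
```

A real system would need locking and validation that this sketch omits, but it shows how far basic command-line tools go when the data is plain text.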

System Configuration Files

Most of the system-wide configuration files are located in the /etc directory and its subdirectories. The files in /etc provide configuration data for many of the system services and servers such as email (SMTP, POP, IMAP), web (HTTP), time (NTP or chrony), SSH, network adapters and routing, the GRUB boot loader, display screen, and printer configuration, and much more.

You can also find configuration files that provide system-wide configuration that affects all users, such as /etc/bashrc. The /etc/bashrc file provides initial setup and configuration for all users when they open a bash shell. Figure 13-1 shows the content of the /etc/bashrc file on my Fedora VM.
Figure 13-1. The /etc/bashrc file provides configuration for all bash shell sessions when they are opened

Relax – we are not going to examine every line of the /etc/bashrc file in Figure 13-1. However, there are a few things that we should observe in this file.

First, just look at all of the comments. This file is meant to be read by users. We SysAdmins are, after all, advanced users. One thing I like about Red Hat-based distributions is that most of the configuration files and scripts are well commented.

One of the functions of this script is to set the shell command prompt. The script determines whether the shell is a standard xterm or vte terminal session, or if it is in a screen session. It sets the prompt string differently depending upon that condition. It also uses external files such as /etc/sysconfig/bash-prompt-xterm, which contains the prompt configuration in a file and location easily managed by the SysAdmin.

Up near the top of the file is a series of comments that briefly describe the function of the script along with an admonishment not to change this particular file. The comments also tell you where your own modifications should go. We will look at that a little further on.

Notice how the indents make the structure of this script fragment easier to read than if everything were jammed up against the left margin.

Did you catch that as we went by? This configuration file is an executable program. It is a bash script that contains program logic that can determine which execution path to take depending upon outside conditions. This script is not complete in itself; it is actually a fragment that can be sourced – imported – into other scripts as necessary.

Sourcing is a bash shell method for including the content of other bash scripts or fragments into a script. This allows the contents of the fragment being sourced to be used by multiple scripts. You can think of it like function libraries used by compiled programs. The sourced file is loaded into the calling script at the location of the source command. It is then immediately executed.

Sourcing can be accomplished by using the source command. The period (.) is an alias for the source command. This is illustrated in Figure 13-2, which is a fragment of the code in Figure 13-1.
Figure 13-2. This code fragment sources the *.sh files located in /etc/profile.d. Other files in that directory are ignored

The lines highlighted in Figure 13-2 source the code from all of the *.sh files in /etc/profile.d into this code fragment.
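The following is a sketch of that sourcing pattern, not the actual Fedora code. It builds a scratch directory that stands in for /etc/profile.d, and the fragment names and variables in it are hypothetical.

```shell
# Scratch directory standing in for /etc/profile.d.
frag_dir=$(mktemp -d)

# Hypothetical fragments; only the *.sh files should be picked up.
echo 'GreetingVar="set by greeting.sh"' > "$frag_dir/greeting.sh"
echo 'EditorVar="set by editor.sh"'     > "$frag_dir/editor.sh"
echo 'IgnoredVar="set by ignored.csh"'  > "$frag_dir/ignored.csh"

# Source each readable *.sh file; other extensions never match the glob.
for fragment in "$frag_dir"/*.sh; do
    if [ -r "$fragment" ]; then
        . "$fragment"
    fi
done

# Variables set by the sourced fragments now exist in this shell.
echo "GreetingVar: $GreetingVar"
echo "EditorVar:   $EditorVar"
echo "IgnoredVar:  ${IgnoredVar:-<unset>}"

# Clean up the scratch directory.
rm -rf "$frag_dir"
```

Because the fragments are sourced rather than run as separate programs, the variables they set remain available in the calling shell after the loop ends.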

So how does the program fragment in Figure 13-1 get executed? Where is the code or trigger that imports – sources – this code so that it can be executed? Good questions. The /etc/profile script in Figure 13-3 sources the /etc/bashrc file.
Figure 13-3. The /etc/profile script sets the global environment for all shells on the system when they are launched. It also sources the bash script fragments in /etc/profile.d and /etc/bashrc

The /etc/profile file is also a script fragment. We could spend some time here tracing how /etc/profile is launched, but that would take us in the wrong direction for what we are trying to accomplish here. Suffice it to say that when bash is invoked as a login shell, it reads /etc/profile first (if it exists) and then looks for ~/.bash_profile, ~/.bash_login, and ~/.profile, in that order, reading only the first one that exists3.
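According to the bash manual, a login shell reads only the first of those three per-user files that exists and is readable. The sketch below mimics that search order. The function name first_login_file is my own invention and is not part of bash, and the test uses a throwaway directory instead of a real home directory.

```shell
# Print the per-user startup file a bash login shell would pick from the
# given directory, or print nothing if none of the three files is readable.
first_login_file() {
    home_dir=$1
    for candidate in .bash_profile .bash_login .profile; do
        if [ -r "$home_dir/$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# Demonstrate with a scratch directory that has two of the three files.
fake_home=$(mktemp -d)
touch "$fake_home/.profile" "$fake_home/.bash_login"
chosen=$(first_login_file "$fake_home")
echo "bash would read: $chosen"

# Clean up the scratch directory.
rm -rf "$fake_home"
```

Running first_login_file "$HOME" on your own account should show which of the three files your login shells actually use.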

Global Bash Configuration

Now, let’s make some global configuration changes to bash.

The /etc/bashrc file mentions the /etc/profile.d directory. Let’s look at that directory and its files in Experiment 13-1. While we are at it, we will add some global bash configuration of our own.

Experiment 13-1

This experiment should be performed as root. Our objective is to make some additions to the global configuration for the bash shell.

Make /etc/profile.d the PWD and list the contents.

[root@testvm1 ~]# cd /etc/profile.d/ ; ls -l
total 100
-rw-r--r--. 1 root root  664 Jul 25  2017 bash_completion.sh
-rw-r--r--. 1 root root  196 Aug  3 04:18 colorgrep.csh
-rw-r--r--. 1 root root  201 Aug  3 04:18 colorgrep.sh
-rw-r--r--. 1 root root 1741 Nov 10 12:53 colorls.csh
-rw-r--r--. 1 root root 1606 Nov 10 12:53 colorls.sh
-rw-r--r--. 1 root root   69 Aug  4 19:53 colorsysstat.csh
-rw-r--r--. 1 root root   56 Aug  4 19:53 colorsysstat.sh
-rw-r--r--. 1 root root  162 Aug  5 02:00 colorxzgrep.csh
-rw-r--r--. 1 root root  183 Aug  5 02:00 colorxzgrep.sh
-rw-r--r--. 1 root root  216 Aug  3 04:57 colorzgrep.csh
-rw-r--r--. 1 root root  220 Aug  3 04:57 colorzgrep.sh
-rwxr-xr-x. 1 root root  249 Sep 21 03:40 kde.csh
-rwxr-xr-x. 1 root root  288 Sep 21 03:40 kde.sh
-rw-r--r--. 1 root root 1706 Jan  2 10:36 lang.csh
-rw-r--r--. 1 root root 2703 Jan  2 10:36 lang.sh
-rw-r--r--. 1 root root  500 Aug  3 11:02 less.csh
-rw-r--r--. 1 root root  253 Aug  3 11:02 less.sh
-rwxr-xr-x. 1 root root   49 Aug  3 21:06 mc.csh
-rwxr-xr-x. 1 root root  153 Aug  3 21:06 mc.sh
-rw-r--r--. 1 root root  106 Jan  2 07:21 vim.csh
-rw-r--r--. 1 root root  248 Jan  2 07:21 vim.sh
-rw-r--r--. 1 root root 2092 Nov  2 10:21 vte.sh
-rw-r--r--. 1 root root  120 Aug  4 23:29 which2.csh
-rw-r--r--. 1 root root  157 Aug  4 23:29 which2.sh

All of the files with *.sh extensions are executed by the code in /etc/bashrc or /etc/profile. The ones with other extensions are not executed. We will make our additions to the bash configuration by creating a new file in this directory.

Use your favorite editor to create a new file named “mybash.sh” in this directory. Add the following content to the file.

################################################################
# The following are global changes to BASH configuration       #
################################################################
alias lsn='ls --color=no'
alias vim='vim -c "colorscheme desert" '
TestVariable="Hello World"
set -o vi

Before we test this, let’s ensure that the aliases are not already there and that the TestVariable is null.

[root@testvm1 profile.d]# alias
alias cp='cp -i'
alias egrep='egrep --color=auto'
alias fgrep='fgrep --color=auto'
alias grep='grep --color=auto'
alias l.='ls -d .* --color=auto'
alias ll='ls -l --color=auto'
alias ls='ls --color=auto'
alias mc='. /usr/libexec/mc/mc-wrapper.sh'
alias mv='mv -i'
alias rm='rm -i'
alias which='(alias; declare -f) | /usr/bin/which --tty-only --read-alias --read-functions --show-tilde --show-dot'
alias xzegrep='xzegrep --color=auto'
alias xzfgrep='xzfgrep --color=auto'
alias xzgrep='xzgrep --color=auto'
alias zegrep='zegrep --color=auto'
alias zfgrep='zfgrep --color=auto'
alias zgrep='zgrep --color=auto'
[root@testvm1 profile.d]# echo $TestVariable
[root@testvm1 profile.d]#

Now test the results. This change will not affect bash sessions that are already open. New sessions will reflect the changes. So open a new terminal session. As the student user, run the following commands to verify the results.

[root@testvm1 profile.d]# echo $TestVariable
Hello World
[root@testvm1 profile.d]# alias
alias cp='cp -i'
alias egrep='egrep --color=auto'
alias fgrep='fgrep --color=auto'
alias grep='grep --color=auto'
alias l.='ls -d .* --color=auto'
alias ll='ls -l --color=auto'
alias ls='ls --color=auto'
alias lsn='ls --color=no'
alias mc='. /usr/libexec/mc/mc-wrapper.sh'
alias mv='mv -i'
alias rm='rm -i'
alias vim='vim -c "colorscheme desert" '
alias which='(alias; declare -f) | /usr/bin/which --tty-only --read-alias --read-functions --show-tilde --show-dot'
alias xzegrep='xzegrep --color=auto'
alias xzfgrep='xzfgrep --color=auto'
alias xzgrep='xzgrep --color=auto'
alias zegrep='zegrep --color=auto'
alias zfgrep='zfgrep --color=auto'
alias zgrep='zgrep --color=auto'

It is easy to make changes to ASCII files as this experiment shows. Notice that a reboot was not required to make these changes take effect – they were in effect immediately for new bash terminal sessions.

User Configuration Files

Let’s look at the so-called hidden files in your own home directory – those whose names begin with a period (.). These are user-specific configuration files that you can change to meet your own needs and preferences. Let’s look at the .bashrc file, which is the configuration file in which individual users can set their own bash configuration such as aliases, functions, and environment variables that are unique to them.

Experiment 13-2

Perform this Experiment as the student user.

The .bashrc file is short so we can view it with cat. Let’s be sure we are in the home directory for the student user and then display the file.

[student@testvm1 ~]$ cd ; cat .bashrc
# .bashrc
# Source global definitions
if [ -f /etc/bashrc ]; then
        . /etc/bashrc
fi
# Uncomment the following line if you don't like systemctl's auto-paging feature:
# export SYSTEMD_PAGER=
# User specific aliases and functions

This file is also well commented and even tells us where to add our own configuration. So let’s add something innocuous that will allow us to test this local configuration. Use your favorite editor to add the following line to the end of the file.

StudentVariable="This is a local variable."

View the variable.

[student@testvm1 ~]$ echo $StudentVariable
[student@testvm1 ~]$

The variable is not yet set in the current shell, because bash reads ~/.bashrc only when a shell starts. It will be set in every bash terminal session opened from now on. It can be added to an existing bash terminal session by sourcing the .bashrc file like this.

[student@testvm1 ~]$ . .bashrc
[student@testvm1 ~]$ echo $StudentVariable
This is a local variable.

These are trivial examples, but they should give you some idea of how flexible having open format configuration files can be. It is easy to follow the logic of the files and easy to modify them when needed. Although each distribution varies in how it adds comments to these files, all of the ones I have used have enough information in the comments to enable me to figure out the appropriate location for me to alter the configuration. They also contain enough information to allow me to follow the logic. That doesn’t mean I don’t have to work a bit to understand it all, but I can do it if I need to or am just curious.

Be aware that the local user bash configuration overrides the global configuration. So if a user has the knowledge and wants to alter a global configuration parameter for themselves, they can do that by setting it in the ~/.bashrc file.
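The override works simply because the user file sources the global file first and then makes its own assignments afterward. Here is a small sketch of that ordering. The files global_rc and user_rc are hypothetical stand-ins for /etc/bashrc and ~/.bashrc, and the variable names are made up.

```shell
# Scratch directory holding stand-ins for /etc/bashrc and ~/.bashrc.
demo_dir=$(mktemp -d)

cat > "$demo_dir/global_rc" <<'EOF'
DemoEditor="vi"
DemoPager="less"
EOF

# The unquoted heredoc lets $demo_dir expand into the generated file.
cat > "$demo_dir/user_rc" <<EOF
# Source global definitions first, as the stock ~/.bashrc does.
if [ -f "$demo_dir/global_rc" ]; then
    . "$demo_dir/global_rc"
fi
# A user-specific setting placed after the global ones wins.
DemoEditor="emacs"
EOF

. "$demo_dir/user_rc"
echo "DemoEditor=$DemoEditor"   # value assigned by the user file
echo "DemoPager=$DemoPager"     # value inherited from the global file

# Clean up the scratch directory.
rm -rf "$demo_dir"
```

The last assignment to a variable is the one that sticks, so anything the user sets after the global file has been sourced takes precedence.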

ASCII Rocks

Now we can see how the openness created by using ASCII text files for configuration allows us to explore and understand many of the processes of our Linux operating system. ASCII is the go-to format for configuration files and for shell scripts.

Many system-level executables are also bash scripts that set configurations and launch the binaries. Let’s check out the /bin directory to verify this.

Experiment 13-3

Perform this experiment as the root user.

Make /bin the PWD and count the number of files just to see how many executables there are altogether.

[root@testvm1 ~]# cd /bin/ ; ls | wc -l
2605

Let’s figure out how many are ASCII text files.

[root@testvm1 bin]# for I in `ls`;do file $I;done | grep ASCII | wc -l
355

Over 13% of the files in /bin (355 of 2605) are ASCII text, which means they are scripts rather than compiled binaries. Now view the list of files that are ASCII scripts. The specific results from your host will almost certainly differ from mine.

[root@testvm1 bin]# for I in `ls`;do file $I;done | grep ASCII | less

I won’t list those files here, but you should look through them a bit just to see what is there.
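If you would rather not depend on the file utility, a different and cruder technique is to look for the "#!" interpreter line at the top of each file. The sketch below counts files that way. The function name count_shebang_scripts is hypothetical, and its results will not match the file-based count exactly because text files without an interpreter line are missed.

```shell
# Count regular, readable files in a directory whose first line begins
# with the "#!" interpreter marker.
count_shebang_scripts() {
    dir=$1
    count=0
    for f in "$dir"/*; do
        [ -f "$f" ] && [ -r "$f" ] || continue
        first_line=
        # read returns nonzero on a file with no newline; that is not an error here.
        IFS= read -r first_line < "$f" || true
        case $first_line in
            '#!'*) count=$((count + 1)) ;;
        esac
    done
    echo "$count"
}

# Demonstrate on a scratch directory with one script and one plain text file.
demo_dir=$(mktemp -d)
printf '#!/bin/sh\necho hello\n' > "$demo_dir/hello"
printf 'just some notes\n'        > "$demo_dir/notes.txt"
script_count=$(count_shebang_scripts "$demo_dir")
echo "scripts found: $script_count"

# Clean up the scratch directory.
rm -rf "$demo_dir"
```

To try it on a real directory, run count_shebang_scripts /bin and compare the number with the one the file command gave you.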

Now let’s look at one of these scripts. I chose the ps2ascii script, which is used as a wrapper around the ghostscript program.

Note: If the host you are using does not have the ps2ascii program installed, you can either install it or choose a different ASCII file to explore for the rest of this experiment.

[root@testvm1 bin]# cat ps2ascii
#!/bin/sh
# Extract ASCII text from a PostScript file.  Usage:
#       ps2ascii [infile.ps [outfile.txt]]
# If outfile is omitted, output goes to stdout.
# If both infile and outfile are omitted, ps2ascii acts as a filter,
# reading from stdin and writing on stdout.
# This definition is changed on install to match the
# executable name set in the makefile
GS_EXECUTABLE=gs
trap "rm -f _temp_.err _temp_.out" 0 1 2 15
OPTIONS="-q -dSAFER -sDEVICE=txtwrite"
if ( test $# -eq 0 ) then
    $GS_EXECUTABLE $OPTIONS -o - -
elif ( test $# -eq 1 ) then
    $GS_EXECUTABLE $OPTIONS -o - "$1"
else
    $GS_EXECUTABLE $OPTIONS -o "$2" "$1"
fi

With the options this wrapper supplies, the Ghostscript program converts PostScript and PDF files to ASCII text files by extracting the text out of the originals. This wrapper has comments that tell us what the program does. It sets some variables and then runs the program with options for different conditions.

Scripts like ps2ascii allow a great deal of flexibility when launching programs. They make life easier for users because the scripts can manage the task of setting up options and arguments that are passed to the main program.
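To show the pattern in isolation, here is a sketch of a wrapper that handles its arguments the same way ps2ascii does: no arguments means act as a filter, one argument names the input file, and two arguments name the input and output files. The upcase function is hypothetical and wraps the standard tr command rather than Ghostscript.

```shell
# Hypothetical wrapper around tr, modeled on the ps2ascii argument handling.
# Usage: upcase [infile [outfile]]
upcase() {
    if [ $# -eq 0 ]; then
        # No arguments: read stdin and write stdout, like a filter.
        tr '[:lower:]' '[:upper:]'
    elif [ $# -eq 1 ]; then
        # One argument: read the named file and write stdout.
        tr '[:lower:]' '[:upper:]' < "$1"
    else
        # Two arguments: read the first file and write the second.
        tr '[:lower:]' '[:upper:]' < "$1" > "$2"
    fi
}

filtered=$(echo "open formats" | upcase)
echo "$filtered"
```

The user only has to remember the wrapper's simple calling convention, while the wrapper takes care of the character classes and redirection that the underlying program needs.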

Final Thoughts

Open data in Linux enables us as SysAdmins to explore everything in order to satisfy our curiosity about how Linux works. The use of ASCII text files for scripting and configuration files allows us access to the inner workings of the environment in which we work every day.

We were able to use that openness to trace our way through some related bash configuration programs and files. We discovered how to make global and local changes. We added some configuration of our own so that bash is now configured more to our own liking.

And, if we want or need to, we can download the source code used to compile the executable code for the kernel and all of the open source programs and utilities available with our Linux distribution. I have done that on a couple of occasions because I wanted to know more. You can, too, if your curiosity takes you there.

All of this is only possible in an open operating system.
