A text file is a file containing human-readable text. Each line ends with a line feed character, a carriage return, or both, depending on the operating system. By Linux convention, each line ends with a line feed, the "\n" (newline) character in printf.
The examples in this chapter use a text file that lists several pieces of furniture by name, price, quantity, and supplier number, as shown in Example 11.1.
Example 11.1. orders.txt
Birchwood China Hutch,475.99,1,756
Bookcase Oak Veneer,205.99,1,756
Small Bookcase Oak Veneer,205.99,1,756
Reclining Chair,1599.99,1,757
Bunk Bed,705.99,1,757
Queen Bed,925.99,1,757
Two-drawer Nightstand,125.99,1,756
Cedar Toy Chest,65.99,1,757
Six-drawer Dresser,525.99,1,757
Pine Round Table,375.99,1,757
Bar Stool,45.99,1,756
Lawn Chair,55.99,1,756
Rocking Chair,287.99,1,757
Cedar Armoire,825.99,1,757
Mahogany Writing Desk,463.99,1,756
Garden Bench,149.99,1,757
Walnut TV Stand,388.99,1,756
Victorian-style Sofa,1225.99,1,757
Chair - Rocking,287.99,1,757
Grandfather Clock,2045.99,1,756
Linux contains many utilities for working with text files. Some can act as filters, processing the text so that it can be passed on to yet another command using a pipeline. When a text file is passed through a pipeline, it is called a text stream, that is, a stream of text characters.
Linux has three commands for working with pathnames. The basename command examines a path and displays the filename. It doesn't check to see whether the file exists.
$ basename /home/kburtch/test/orders.txt
orders.txt
If a suffix is included as a second parameter, basename deletes the suffix if it matches the file's suffix.
$ basename /home/kburtch/test/orders.txt .txt
orders
The corresponding command for extracting the path to the file is dirname.
$ dirname /home/kburtch/test/orders.txt
/home/kburtch/test
There is no trailing slash after the final directory in the path.
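The two commands are often used together to take a pathname apart. A minimal sketch (the sample pathname is illustrative):

```shell
#!/bin/bash
#
# split_path.sh: split a pathname into its components
# using dirname and basename

PATHNAME="/home/kburtch/test/orders.txt"

DIR=`dirname "$PATHNAME"`           # /home/kburtch/test
FILE=`basename "$PATHNAME"`         # orders.txt
BASE=`basename "$PATHNAME" .txt`    # orders, with the .txt suffix removed

printf "directory: %s\n" "$DIR"
printf "filename:  %s\n" "$FILE"
printf "base name: %s\n" "$BASE"
```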
To verify that a pathname is a correct Linux pathname, use the pathchk command. This command verifies that the directories in the path (if they already exist) are accessible and that the names of the directories and file are not too long. If there is a problem with the path, pathchk reports the problem and returns an error code of 1.
$ pathchk "~/x" && echo "Acceptable path"
Acceptable path
$ mkdir a
$ chmod 400 a
$ pathchk "a/test.txt"
pathchk: directory 'a' is not searchable
$ pathchk "~/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" && echo "Acceptable path"
pathchk: name 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxx' has length 388; exceeds limit of 255
With the --portability (-p) switch, pathchk enforces stricter portability checks for all POSIX-compliant Unix systems. This identifies characters not allowed in a portable pathname, such as spaces.
$ pathchk "new file.txt"
$ pathchk -p "new file.txt"
pathchk: path 'new file.txt' contains nonportable character ' '
pathchk is useful for checking pathnames supplied from an outside source, such as pathnames from another script or those typed in by a user.
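Because pathchk returns a status code, it drops neatly into an if statement. A sketch of validating a pathname supplied as a script argument (the script name, default value, and messages are illustrative, not from the text):

```shell
#!/bin/bash
#
# check_path.sh: validate a pathname before using it

shopt -s -o nounset

declare -rx SCRIPT=${0##*/}
declare -rx USERPATH="${1:-results.txt}"   # default is only for the demo

# -p adds POSIX portability checks to the local ones
if ! pathchk -p "$USERPATH" ; then
   printf "%s\n" "$SCRIPT: '$USERPATH' is not a usable pathname" >&2
   exit 192
fi
printf "%s\n" "'$USERPATH' is acceptable"
```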
A particular feature of Unix-based operating systems, including the Linux ext3 file system, is the way space on a disk is reserved for a file. Under Linux, space is never released for a file. For example, if you overwrite a 1MB file with a single byte, Linux still reserves one megabyte of disk space for the file.
If you are working with files that vary greatly in size, you should remove the file and re-create it in order to free up the disk space rather than simply overwriting it.
This behavior affects all files, including directories. If a program removes all 5,000 files from a large directory, and puts a single file in that directory, the directory will still have space reserved for 5,000 file entries. The only way to release this space is to remove and re-create the directory.
The built-in type command, as discussed in Chapter 3, “Files, Users, and Shell Customization,” identifies whether a command is built-in or not, and where the command is located if it is a Linux command.
To test files other than commands, the Linux file command performs a series of tests to determine the type of a file. First, file determines whether the file is a regular file or is empty. If the file is regular, file consults the /usr/share/magic file, checking the first few bytes of the file in an attempt to determine what the file contains. If the file is an ASCII text file, file performs a check of common words to try to determine the language of the text.
$ file empty_file.txt
empty_file.txt: empty
$ file orders.txt
orders.txt: ASCII text
file also works with programs. If check-orders.sh is a Bash script, file identifies it as a shell script.
$ file check-orders.sh
check-orders.sh: Bourne-Again shell script text
$ file /usr/bin/test
/usr/bin/test: ELF 32-bit LSB executable, Intel 80386, version 1, dynamically linked (uses shared libs), stripped
For script programming, file's -b (brief) switch hides the name of the file and returns only the assessment of the file.
$ file -b orders.txt
ASCII text
Other useful switches include -f (file) to read filenames from a specified file. The -i switch returns the description as a MIME type suitable for Web programming. With the -z (compressed) switch, file attempts to determine the type of files stored inside a compressed file. The -L switch follows symbolic links.
$ file -b -i orders.txt
text/plain, ASCII
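The brief output is easy to test in a script. A sketch that branches on file's assessment (the sample file and the patterns shown, which cover only a few common assessments, are illustrative):

```shell
#!/bin/bash
#
# filetype_demo.sh: branch on the output of file -b

shopt -s -o nounset

printf "plain words\n" > sample.txt

TYPE=`file -b sample.txt`
case "$TYPE" in
   *"ASCII text"* ) printf "%s\n" "sample.txt is a text file" ;;
   *empty* )        printf "%s\n" "sample.txt is empty" ;;
   * )              printf "%s\n" "sample.txt: $TYPE" ;;
esac

rm -f sample.txt
```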
As discussed in Chapter 3, “Files, Users, and Shell Customization,” files are deleted with the rm (remove) command. The -f (force) switch removes a file even when the file permissions indicate the script cannot write to the file, but rm never removes a file from a directory that the script does not own. (The sticky bit is an exception and is discussed in Chapter 15, “Shell Security.”)
As always when dealing with files, check that a file exists before you attempt to remove it. See Example 11.2.
Example 11.2. rm_demo.sh
#!/bin/bash
#
# rm_demo.sh: deleting a file with rm

shopt -s -o nounset

declare -rx SCRIPT=${0##*/}
declare -rx FILE2REMOVE="orders.bak"
declare -x STATUS

if [ ! -f "$FILE2REMOVE" ] ; then
   printf "%s\n" "$SCRIPT: $FILE2REMOVE does not exist" >&2
   exit 192
else
   rm "$FILE2REMOVE" >&2
   STATUS=$?
   if [ $STATUS -ne 0 ] ; then
      printf "%s\n" "$SCRIPT: Failed to remove file $FILE2REMOVE" >&2
      exit $STATUS
   fi
fi
exit 0
When removing multiple files, avoid using the -r (recursive) switch or filename globbing. Instead, get a list of the files to delete (using a command such as find, discussed next) and test each individual file before attempting to remove any of them. This is slower than the alternatives, but if a problem occurs, no files are removed and you can safely check for the cause of the problem.
New, empty files are created with the touch command. The command is called touch because, when it's used on an existing file, it changes the modification time even though it makes no changes to the file.
touch is often combined with rm to create new, empty files for a script. Appending output with >> does not result in an error whether or not the file already exists, eliminating the need to remember whether a file exists.
For example, if a script is to produce a summary file called run_results.txt, a fresh file can be created as shown in Example 11.3.
Example 11.3. touch_demo.sh
#!/bin/bash
#
# touch_demo.sh: using touch to create a new, empty file

shopt -s -o nounset

declare -rx RUN_RESULTS="./run_results.txt"

if [ -f "$RUN_RESULTS" ] ; then
   rm -f "$RUN_RESULTS"
   if [ $? -ne 0 ] ; then
      printf "%s\n" "Error: unable to replace $RUN_RESULTS" >&2
   fi
   touch "$RUN_RESULTS"
fi
printf "Run started %s\n" "`date`" >> "$RUN_RESULTS"
Files are renamed or moved to new directories using the mv (move) command. If -f (force) is used, mv overwrites an existing file instead of reporting an error. Use -f only when it is safe to overwrite the file.
You can combine touch with mv to back up an old file under a different name before starting a new file. The Linux convention for backup files is to rename them with a trailing tilde (~). See Example 11.4.
Example 11.4. backup_demo.sh
#!/bin/bash
#
# backup_demo.sh

shopt -s -o nounset

declare -rx RUN_RESULTS="./run_results.txt"

if [ -f "$RUN_RESULTS" ] ; then
   mv -f "$RUN_RESULTS" "$RUN_RESULTS""~"
   if [ $? -ne 0 ] ; then
      printf "%s\n" "Error: unable to backup $RUN_RESULTS" >&2
   fi
   touch "$RUN_RESULTS"
fi
printf "Run started %s\n" "`date`" >> "$RUN_RESULTS"
Because it is always safe to overwrite the backup, the move is forced with the -f
switch. Archiving files is usually better than outright deleting because there is no way to “undelete” a file in Linux.
Similar to mv is the cp (copy) command. cp makes copies of a file and does not delete the original file. cp can also be used to make links instead of copies using the --link switch.
There are two Linux commands that display information about a file that cannot be easily discovered with the test command.

The Linux stat command shows general information about the file, including the owner, the size, and the time of the last access.
$ stat ken.txt
File: "ken.txt"
Size: 84 Blocks: 8 Regular File
Access: (0664/-rw-rw-r--)  Uid: (  503/ kburtch)  Gid: (  503/ kburtch)
Device: 303 Inode: 131093 Links: 1
Access: Tue Feb 20 16:34:11 2001
Modify: Tue Feb 20 16:34:08 2001
Change: Tue Feb 20 16:34:08 2001
To make the information more readable from a script, use the -t (terse) switch. Each stat item is separated by a space.
$ stat -t orders.txt
orders.txt 21704 48 81fd 503 503 303 114674 1 6f 89 989439402 981490652 989436657
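Depending on the version installed, GNU stat also accepts a -c (format) switch that picks out individual items directly, which is easier in a script than splitting the terse output. A sketch, assuming a stat that supports -c (the sample file is illustrative):

```shell
#!/bin/bash
#
# stat_demo.sh: pick individual items out of stat's output
# (assumes a GNU stat that supports the -c format switch)

printf "0123456789\n" > sample.txt      # an 11-byte file

SIZE=`stat -c '%s' sample.txt`          # size in bytes
PERMS=`stat -c '%a' sample.txt`         # permissions in octal

printf "sample.txt is %d bytes, mode %s\n" "$SIZE" "$PERMS"
rm -f sample.txt
```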
The Linux statftime command has similar capabilities to stat, but has a wider range of formatting options. statftime is similar to the date command: It has a string argument describing how the status information should be displayed. The argument is specified with the -f (format) switch.
The most common statftime format codes are as follows:
%d - Day (zero filled)
%D - mm/dd/yy
%H - Hour (24-hour clock)
%I - Hour (12-hour clock)
%j - Day (1..366)
%m - Month
%M - Minute
%S - Second
%U - Week number (Sunday)
%w - Weekday (Sunday)
%Y - Year
%% - Percent character
%_A - Uses file last access time
%_a - Filename (no suffix)
%_C - Uses file inode change time
%_d - Device ID
%_e - Seconds elapsed since epoch
%_f - File system type
%_i - Inode number
%_L - Uses current (local) time
%_l - Number of hard links
%_M - Uses file last modified time
%_m - Type/attribute/access bits
%_n - Filename
%_r - Rdev ID (char/block devices)
%_s - File size (bytes)
%_U - Uses current (UTC) time
%_u - User ID (uid)
A complete list appears in the reference section at the end of this chapter.
By default, any of the formatting codes referring to time are based on the file's modified time.
$ statftime -f "%c" orders.txt
Tue Feb 6 15:17:32 2001
Other types of time can be selected by using a time code. The format argument is read left to right, which means different time codes can be combined in one format string. Using %_C, for example, changes the format codes to the inode change time (usually the time the file was created). Using %_L (local time) or %_U (UTC time) makes statftime behave like the date command.
$ statftime -f "modified time = %c current time = %_L%c" orders.txt
modified time = Tue Feb 6 15:17:32 2001 current time = Wed May 9 15:49:01 2001
$ date
Wed May 9 15:49:01 2001
statftime can create meaningful archive filenames. Often files are sent with a name such as orders.txt and the script wants to save the orders with the date as part of the name.
$ statftime -f "%_a_%_L%m%d.txt" orders.txt
orders_0509.txt
Besides generating new filenames, statftime can be used to save information about a file to a variable.
$ BYTES=`statftime -f "%_s" orders.txt`
$ printf "The file size is %d bytes\n" "$BYTES"
The file size is 21704 bytes
When a list of files is supplied on standard input, the command processes each file in turn. The %_z code provides the position of the filename in the list, starting at 1.
Linux has a convenient tool for downloading files from other logins on the current computer or across a network. wget (web get) retrieves files using FTP or HTTP. wget is designed specifically to retrieve files, making it easy to use in shell scripts. If a connection is broken, wget tries to reconnect and continue to download the file.
The wget program uses the same form of address as a Web browser, supporting ftp:// and http:// URLs. Login information is added to a URL by placing user: and password@ prior to the hostname. FTP URLs can end with an optional ;type=a or ;type=i for ASCII or IMAGE FTP downloads. For example, to download the info.txt file from the kburtch login with the password jabber12 on the current computer, you use:
$ wget 'ftp://kburtch:jabber12@localhost/info.txt;type=i'
By default, wget uses --verbose message reporting. To report only errors, use the --quiet switch. To log what happened, append the results to a log file using --append-output and a log name, and log the server responses with the --server-response switch.
$ wget --server-response --append-output=wget.log 'ftp://kburtch:jabber12@localhost/info.txt;type=i'
Whole accounts can be copied using the --mirror switch.

$ wget --mirror 'ftp://kburtch:jabber12@localhost;type=i'
To make it easier to copy a set of files, the --glob switch can enable file pattern matching. --glob=on causes wget to pattern match any special characters in the filename. For example, to retrieve all text files:
$ wget —glob=on 'ftp://kburtch:jabber12@localhost/*.txt'
There are many special-purpose switches not covered here. A complete list of switches is in the reference section. Documentation is available on the wget home page at http://www.gnu.org/software/wget/wget.html.
Besides wget, the most common way of transferring files between accounts is using the ftp command. FTP is a client/server system: An FTP server must be set up on your computer if there isn't one already. Most Linux distributions install an FTP server by default.
With an FTP client, you'll have to redirect the necessary download commands using standard input, but this is not necessary with wget.
To use ftp from a script, you use three switches. The -i (not interactive) switch disables the normal FTP prompts to the user. -n (no auto-login) suppresses the login prompt, requiring you to explicitly log in with the open and user commands. -v (verbose) displays more details about the transfer. The ftp commands can be embedded in a script using a here file.
ftp -i -n -v <<!
open ftp.nightlight.com
user incoming_orders password
cd linux_lightbulbs
binary
put $1
!
if [ $? -ne 0 ] ; then
   printf "%s\n" "$SCRIPT: FTP transfer failed" >&2
   exit 192
fi
This script fragment opens an FTP connection to a computer called ftp.nightlight.com. It deposits a file in the linux_lightbulbs directory in the incoming_orders account. If an error occurs, an error message is printed and the script stops.
Processing files sent by FTP is difficult because there is no way of knowing whether the files are still being transferred. Instead of saving a file to a temp file and then moving it to its final location, an FTP server will create a blank file and will slowly save the data to the file. The mere presence of the file is not enough to signify the transfer is complete. The usual method of handling this situation is to wait until the file has been modified within a reasonable amount of time (perhaps an hour). If the file hasn't been modified recently, the transfer is probably complete and the file can be safely renamed and moved to a permanent directory.
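The "not modified recently" test can be written with find's -mmin switch, which matches files by modification age in minutes. A sketch, assuming an incoming directory and a 60-minute quiet period (both names are illustrative):

```shell
#!/bin/bash
#
# age_check.sh: list incoming files that have not been modified
# for at least 60 minutes and are therefore probably complete

mkdir -p incoming
touch incoming/new_order.txt                    # simulates a file still arriving
touch -d "2 hours ago" incoming/old_order.txt   # quiet for two hours

# -mmin +60 matches files last modified more than 60 minutes ago
find incoming -type f -mmin +60
```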
Some distributions have an ftpcopy (or an ftpcp) command, which will copy whole directories at one time. Care must be taken with ftpcopy because it is primarily intended as a mirroring tool and it will delete any local files not located at the remote account.
Part of the OpenSSH (Open Source Secure Shell) project, Secure FTP (sftp) is another file-transfer program that works in a similar way to FTP but encrypts the transfer so that it cannot be intercepted or read by intermediary computers. The encryption process increases the amount of data and slows the transfer but provides protection for confidential information.
You must specify the computer and user account on the sftp command line. SFTP prompts you for the password.
$ sftp root@our_web_site.com:/etc/httpd/httpd.conf
Connecting to our_web_site.com...
root@our_web_site.com's password:
Fetching /etc/httpd/httpd.conf to httpd.conf
For security purposes, SFTP normally asks the user for the Linux login password. It doesn't request the password from standard input but from the controlling terminal. This means you can't include the password in the batch file. The solution to this problem is to use SSH's public key authentication using the ssh-keygen command. If you have not already done so, generate a new key pair as follows.
$ ssh-keygen -t rsa
A pair of authentication keys is stored under .ssh in your home directory. You must copy the public key (a file ending in .pub) to the remote machine and add it to a text file called ~/.ssh/authorized_keys. Each local login accessing the remote login needs a public key in authorized_keys. If a key pair exists, SFTP automatically uses the keys instead of the Linux login password.
Like FTP, SFTP needs a list of commands to carry out. SFTP includes a -b (batch) switch to specify a separate batch file containing the commands to execute. To use a convenient here file in your script, use a batch file called /dev/stdin.
The commands that SFTP understands are similar to FTP. For purposes of shell scripting, the basic transfer commands are the same. Transfers are always “binary.” There is a -v (verbose) switch, but it produces a lot of information. When the -b switch is used, SFTP shows the commands that are executed, so the -v switch is not necessary for logging what happened during the transfer.
sftp -C -b /dev/stdin root@our_web_site.com <<!
cd /etc/httpd
get httpd.conf
!
STATUS=$?
if [ $STATUS -ne 0 ] ; then
   printf "%s\n" "Error: SFTP transfer failed" >&2
   exit $STATUS
fi
The -C (compress) option attempts to compress the data for faster transfers.
For more information about ssh, sftp, and related programs, visit http://www.openssh.org/.
Files sent by FTP or wget can be further checked by computing a checksum. The Linux cksum command counts the number of bytes in a file and prints a cyclic redundancy check (CRC) checksum, which can be used to verify that the file arrived complete and intact. The command uses a POSIX-compliant algorithm.
$ cksum orders.txt
491404265 21799 orders.txt
There is also a Linux sum command that provides compatibility with older Unix systems, but be aware that cksum is incompatible with sum.
For greater checksum security, some distributions include a md5sum command to compute an MD5 checksum. The --status switch quietly tests the file. The --binary (or -b) switch treats the file as binary data as opposed to text. The --warn switch prints warnings about bad MD5 formatting. --check (or -c) checks the sum on a file.
$ md5sum orders.txt
945eecc13707d4a23e27730a44774004  orders.txt
$ md5sum orders.txt > orderssum.txt
$ md5sum --check orderssum.txt
orders.txt: OK
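In a script, --check and --status combine to give a silent pass/fail test on the exit status. A sketch (the filenames are examples; normally the checksum file would come from the sender):

```shell
#!/bin/bash
#
# verify_demo.sh: verify a received file against a stored MD5 checksum

shopt -s -o nounset

printf "sample order data\n" > received.txt
md5sum received.txt > received.md5     # normally computed by the sender

if ! md5sum --status --check received.md5 ; then
   printf "%s\n" "Error: received.txt failed its checksum" >&2
   exit 192
fi
printf "%s\n" "received.txt verified"
rm -f received.txt received.md5
```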
Differences between two files can be pinpointed with the Linux cmp command.
$ cmp orders.txt orders2.txt
orders.txt orders2.txt differ: char 179, line 6
Extremely large files can be split into smaller files using the Linux split command. Files can be split by bytes or by lines. The --bytes=s (or -b s) switch creates files of no more than s bytes. The --lines=s (or -l s) switch creates files of no more than s lines. The --line-bytes=s (or -C s) switch constrains each line to no more than s bytes. The size s is a number with an optional b (512-byte blocks), k (kilobytes), or m (megabytes) suffix. The final parameter is the prefix to use for the new filenames.
$ split --bytes=10k huge_file.txt small_file
$ ls -l small_file*
-rw-rw-r--   1 kburtch  kburtch  10240 Aug 28 16:19 small_fileaa
-rw-rw-r--   1 kburtch  kburtch  10240 Aug 28 16:19 small_fileab
-rw-rw-r--   1 kburtch  kburtch   1319 Aug 28 16:19 small_fileac
You reassemble a split file with the Linux cat command. This command combines files and writes them to standard output. Be careful to combine the split files in the correct order.
$ cksum huge_file.txt
491404265 21799 huge_file.txt
$ cat small_fileaa small_fileab small_fileac > new_file
$ cksum new_file
491404265 21799 new_file
If the locale where the split occurred is the same as the locale where the file is being reassembled, it is safe to use wildcard globbing for the cat filenames.
The Linux csplit (context split) command splits a file at the points where a specific pattern appears.
The basic csplit pattern is a regular expression in slashes followed by an optional offset. The regular expression represents lines that will become the first line in the next new file. The offset is the number of lines to move forward or back from the matching line, which is by default zero. The pattern "/dogs/+1" will separate a file into two smaller files, the first ending with the first occurrence of the pattern dogs.

Quoting the pattern prevents it from being interpreted by Bash instead of the csplit command.
The --prefix=P (or -f P) switch sets the prefix for the new filenames. The --suffix=S (or -b S) switch writes the file numbers using the specified C printf function codes. The --digits=D (or -n D) switch specifies the maximum number of digits for file numbering. The default is two digits.
$ csplit --prefix "chairs" orders.txt "/Chair/"
107
485
$ ls -l chairs*
-rw-rw-r--   1 kburtch  kburtch  107 Oct  1 15:33 chairs00
-rw-rw-r--   1 kburtch  kburtch  485 Oct  1 15:33 chairs01
$ head -1 chairs01
Reclining Chair,1599.99,1,757
The first occurrence of the pattern Chair was in the line Reclining Chair.
Multiple patterns can be listed. A pattern delineated with percent signs (%) instead of with slashes indicates a portion of the file that should be ignored up to the indicated pattern. It can also have an offset. A number by itself indicates that particular line is to start the next new file. A number in curly braces repeats the last pattern a specific number of times; an asterisk in curly braces repeats it for all occurrences of the pattern.
To split the orders.txt file into separate files, each beginning with the word Chair, use the all-occurrences pattern.
$ csplit --prefix "chairs" orders.txt "/Chair/" "{*}"
107
222
23
179
61
$ ls -l chairs*
-rw-rw-r--   1 kburtch  kburtch  107 Oct  1 15:37 chairs00
-rw-rw-r--   1 kburtch  kburtch  222 Oct  1 15:37 chairs01
-rw-rw-r--   1 kburtch  kburtch   23 Oct  1 15:37 chairs02
-rw-rw-r--   1 kburtch  kburtch  179 Oct  1 15:37 chairs03
-rw-rw-r--   1 kburtch  kburtch   61 Oct  1 15:37 chairs04
The --elide-empty-files (or -z) switch doesn't save files that contain nothing. --keep-files (or -k) doesn't delete the generated files when an error occurs. The --quiet (or --silent, -q, or -s) switch hides progress information.
csplit is useful in splitting large files containing repeated information, such as extracting individual orders sent from a customer as a single text file.
The Linux expand command converts Tab characters into spaces. The default is eight spaces per Tab, although you can change this with --tabs=n (or -t n) to n spaces. The --tabs switch can also take a comma-separated list of Tab stops.
$ printf "\tA\tTEST\n" > test.txt
$ wc test.txt
      1       2       8 test.txt
$ expand test.txt | wc
      1       2      21
The --initial (or -i) switch converts only leading Tabs on a line.
$ expand --initial test.txt | wc
      1       2      15
The corresponding unexpand command converts multiple spaces back into Tab characters. The default is eight spaces to a Tab, but you can use the --tabs=n switch to change this. By default, only initial Tabs are converted. Use the --all (or -a) switch to consider all spaces on a line.
Use expand to remove Tabs from a file before processing it.
Temporary files, files that exist only for the duration of a script's execution, are traditionally named using the $$ special variable, which contains the process ID number of the current script. By including this number in the name of the temporary files, it makes the name of the file unique for each run of the script.
$ TMP="/tmp/reports.$$"
$ printf "%s\n" "$TMP"
/tmp/reports.20629
$ touch "$TMP"
The drawback to this traditional approach lies in the fact that the name of a temporary file is predictable. A hostile program can see the process ID of your script while it runs and use that information to identify which temporary files the script is using. The temporary file could be deleted or the data replaced in order to alter the behavior of your script.
For better security, or to create multiple files with unique names, Linux has the mktemp command. This command creates a temporary file and prints the name to standard output so it can be stored in a variable. Each time mktemp creates a new file, the file is given a unique name. The name is created from a filename template the program supplies, which ends in the letter X six times. mktemp replaces the six letters with a unique, random code to create a new filename.
$ TMP=`mktemp /tmp/reports.XXXXXX`
$ printf "%s\n" "$TMP"
/tmp/reports.3LnWVw
$ ls -l "$TMP"
-rw-------    1 kburtch  kburtch  0 Aug  1 14:34 /tmp/reports.3LnWVw
In this case, the letters XXXXXX are replaced with the code 3LnWVw.
mktemp creates temporary directories with the -d (directories) switch. You can suppress error messages with the -q (quiet) switch.
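A temporary file created with mktemp still has to be removed when the script finishes. One common approach, sketched here, is to register the cleanup with trap so the file is deleted even if the script exits early (the report contents are illustrative):

```shell
#!/bin/bash
#
# mktemp_demo.sh: create a temporary file and guarantee its removal

shopt -s -o nounset

declare -rx TMP=`mktemp /tmp/reports.XXXXXX`

# Remove the temporary file on any exit, normal or otherwise
trap 'rm -f "$TMP"' EXIT

printf "report line\n" >> "$TMP"
wc -l < "$TMP"
```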
When many scripts share the same files, there needs to be a way for one script to indicate to another that it has finished its work. This typically happens when scripts overseen by two different development teams need to share files, or when a shared file can be used by only one script at a time.
A simple method for synchronizing scripts is the use of lock files. A lock file is like a flag variable: The existence of the file indicates a certain condition, in this case, that the file is being used by another program and should not be altered.
Most Linux distributions include a directory called /var/lock, a standard location to place lock files.
Suppose the invoicing files can be accessed by only one script at a time. A lock file called invoices_lock can be created to ensure only one script has access.
declare -r INVOICES_LOCKFILE="/var/lock/invoices_lock"

while test -f "$INVOICES_LOCKFILE" ; do
   printf "Waiting for invoices to be printed...\n"
   sleep 10
done
touch "$INVOICES_LOCKFILE"
This script fragment checks every 10 seconds for the presence of invoices_lock. When the file disappears, the loop completes and the script creates a new lock file and proceeds to do its work. When the work is complete, the script should remove the lock file to allow other scripts to proceed.
If a lock file is not removed when one script finishes, it causes the next script to loop indefinitely. The while loop can be modified to use a timeout so that the script stops with an error if the invoice files are not accessible after a certain period of time.
declare -r INVOICES_LOCKFILE="/var/lock/invoices_lock"
declare -ir INVOICES_TIMEOUT=1800   # 30 minutes
declare -i TIME=0
declare -i TIME_STARTED

TIME_STARTED=`date +%s`
while test -f "$INVOICES_LOCKFILE" ; do
   printf "Waiting for the invoices to be printed...\n"
   sleep 10
   TIME=`date +%s`
   TIME=TIME-TIME_STARTED
   if [ $TIME -gt $INVOICES_TIMEOUT ] ; then
      printf "Timed out waiting for the invoices to print\n"
      exit 1
   fi
done
The date command's %s code returns the current clock time in seconds. When the results of two executions of date are subtracted from each other, the result is the number of seconds between the two date commands. In this case, the timeout period is 1800 seconds, or 30 minutes.
Lock files are convenient when a small number of scripts share the same file. When too many scripts are waiting on a lock file, the computer spends a lot of time simply checking for the presence of the lock file instead of doing useful work. Fortunately, there are other ways to share information.
Two scripts can share data using a special kind of file called a named pipe. These pipes (also called FIFOs or queues) are files that can be read by one script while being written to by another. The effect is similar to the pipe operator (|), which forwards the results of one command as the input to another. Unlike a shell pipeline, the scripts using a named pipe run independently of one another, sharing only the pipe file between them. No lock files are required.
The mkfifo command creates a new named pipe.
$ mkfifo website_orders.fifo
$ ls -l website_orders.fifo
prw-rw-r--   1 kburtch  kburtch  0 May 22 14:14 website_orders.fifo
The file type p at the left of the ls output indicates this is a named pipe. If the ls filename-typing switch (-F) is used, the filename is followed by a vertical bar (|) to indicate a pipe.
The named pipe can be read like a regular file. Suppose, for example, you want to create a script to log incoming orders from the company Web site, as shown in Listing 11.5.
Example 11.5. do_web_orders.sh
#!/bin/bash
#
# do_web_orders.sh: read a list of orders and show the date read

shopt -s -o nounset

declare -rx SCRIPT=${0##*/}
declare -rx QUEUE="website_orders.fifo"
declare DATE
declare ORDER

if test ! -r "$QUEUE" ; then
   printf "%s\n" "$SCRIPT:$LINENO: the named pipe is missing or not readable" >&2
   exit 192
fi

{
   while read ORDER ; do
      DATE=`date`
      printf "%s: %s\n" "$DATE" "$ORDER"
   done
} < "$QUEUE"

printf "Program complete\n"
exit 0
In this example, the contents of the pipe are read one line at a time just as if it was a regular file.
When a script reads from a pipe and there's no data, it sleeps (or blocks) until more data becomes available. If the program writing to the pipe completes, the script reading the pipe sees this as the end of the file. The while loop will complete and the script will continue after the loop.
To send orders through the pipe, they must be printed or otherwise redirected to the pipe. To simulate a series of orders, write the orders file to the named pipe using the cat command. Even though the cat command is running in the background, it continues writing orders to the named pipe until all the lines have been read by the script.
$ cat orders.txt > website_orders.fifo &
$ sh do_web_orders.sh
Tue May 22 14:23:00 EDT 2001: Birchwood China Hutch,475.99,1,756
Tue May 22 14:23:00 EDT 2001: Bookcase Oak Veneer,205.99,1,756
Tue May 22 14:23:00 EDT 2001: Small Bookcase Oak Veneer,205.99,1,756
Tue May 22 14:23:00 EDT 2001: Reclining Chair,1599.99,1,757
Tue May 22 14:23:00 EDT 2001: Bunk Bed,705.99,1,757
Tue May 22 14:23:00 EDT 2001: Queen Bed,925.99,1,757
Tue May 22 14:23:00 EDT 2001: Two-drawer Nightstand,125.99,1,756
Tue May 22 14:23:00 EDT 2001: Cedar Toy Chest,65.99,1,757
Tue May 22 14:23:00 EDT 2001: Six-drawer Dresser,525.99,1,757
Tue May 22 14:23:00 EDT 2001: Pine Round Table,375.99,1,757
Tue May 22 14:23:00 EDT 2001: Bar Stool,45.99,1,756
Tue May 22 14:23:00 EDT 2001: Lawn Chair,55.99,1,756
Tue May 22 14:23:00 EDT 2001: Rocking Chair,287.99,1,757
Tue May 22 14:23:00 EDT 2001: Cedar Armoire,825.99,1,757
Tue May 22 14:23:00 EDT 2001: Mahogany Writing Desk,463.99,1,756
Tue May 22 14:23:00 EDT 2001: Garden Bench,149.99,1,757
Tue May 22 14:23:00 EDT 2001: Walnut TV Stand,388.99,1,756
Tue May 22 14:23:00 EDT 2001: Victorian-style Sofa,1225.99,1,757
Tue May 22 14:23:00 EDT 2001: Chair - Rocking,287.99,1,757
Tue May 22 14:23:00 EDT 2001: Grandfather Clock,2045.99,1,756
Using tee
, a program can write to two or more named pipes simultaneously.
Because a named pipe is not a regular file, commands such as grep
, head
, or tail
can behave unexpectedly or block indefinitely waiting for information on the pipe to appear or complete. If in doubt, verify that the file is not a pipe before using these commands.
Sometimes the vertical bar pipe operators cannot be used to link a series of commands together. When a command in the pipeline does not use standard input, or when it uses two sources of input, a pipeline cannot be formed. To create pipes when normal pipelines do not work, Bash uses a special feature called process substitution.
When a command is enclosed in <(...)
, Bash runs the command separately in a subshell, redirecting the results to a temporary named pipe instead of standard input. In place of the command, Bash substitutes the name of a named pipe file containing the results of the command.
Process substitution can be used anywhere a filename is normally used. For example, the Linux grep
command, a file-searching command, can search a file for a list of strings. A temporary file can be used to search a log file for references to the files in the current directory.
$ ls -1 > temp.txt
$ grep -f temp.txt /var/log/nightrun_log.txt
Wed Aug 29 14:18:38 EDT 2001 invoice_error.txt deleted
$ rm temp.txt
A pipeline cannot be used to combine these commands because the list of files is being read from temp.txt
, not standard input. However, these two commands can be rewritten as a single command using process substitution in place of the temporary filename.
$ grep -f <(ls -1) /var/log/nightrun_log.txt
Wed Aug 29 14:18:38 EDT 2001 invoice_error.txt deleted
In this case, the results of ls -1
are written to a temporary pipe. grep
reads the list of files from the pipe and matches them against the contents of the nightrun_log.txt
file. The fact that Bash replaces the ls
command with the name of a temporary pipe can be checked with a printf
statement.
$ printf "%s\n" <(ls -1)
/dev/fd/63
Bash replaces -f <(ls -1)
with -f /dev/fd/63
. In this case, the pipe is opened as file descriptor 63
. The left angle bracket (<
) indicates that the temporary file is read by the command using it. Likewise, a right angle bracket (>
) indicates that the temporary pipe is written to instead of read.
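As a sketch of the output direction, tee will treat a >(...) substitution as a file to write to; the file name lines.txt is made up for this example:

```shell
# tee writes one copy of its input to the temporary pipe created by
# >(...), where a background wc -l counts the lines.
printf '%s\n' one two three | tee >(wc -l > lines.txt) > /dev/null
sleep 1            # crude: give the background wc a moment to finish
cat lines.txt      # the file contains the line count, 3
```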
Files can be read by piping their contents to a command, or by redirecting the file as standard input to a command or group of commands. This is the easiest way to see what a text file contains, but it has two drawbacks. First, only one file can be examined at a time. Second, it prevents the script from interacting with the user because the read
command reads from the redirected file instead of the keyboard.
Instead of piping or redirection, files can be opened for reading by redirecting the file to a descriptor number with the exec
command, as shown in Listing 11.6.
Example 11.6. open_file.sh
#!/bin/bash
#
# open_file.sh: print the contents of orders.txt

shopt -s -o nounset

declare LINE

exec 3< orders.txt

while read LINE <&3 ; do
   printf "%s\n" "$LINE"
done

exit 0
In this case, the file orders.txt
is redirected to file descriptor 3
. Descriptor 3
is the lowest number that programs can normally use. File descriptor 0
is standard input, file descriptor 1
is standard output, and file descriptor 2
is standard error.
The read
command receives its input from descriptor 3
(orders.txt
), which is being redirected by <
. read
can also read from a particular file descriptor using the Korn shell -u
switch.
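The loop in Listing 11.6 can equally be written with the -u switch (Bash also supports it):

```shell
exec 3< orders.txt          # open orders.txt on descriptor 3

while read -u 3 LINE ; do   # -u 3 reads from descriptor 3
    printf '%s\n' "$LINE"
done

exec 3<&-                   # close the descriptor when finished
```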
If the file opened with exec
does not exist, Bash reports a "bad file number"
error. The file descriptor must also be a literal number, not a variable.
If exec
is not used, the file descriptor can still be opened but it cannot be reassigned.
3< orders.txt
3< orders2.txt
In this example, file descriptor 3
is orders.txt
. The second line has no effect because descriptor 3
is already opened. If exec
is used, the second line re-opens descriptor 3
as orders2.txt
.
To save file descriptors, exec can copy a descriptor to a second descriptor. To make input file descriptor 4 refer to the same file as file descriptor 3, do this:
exec 4<&3
Now descriptor 3
and 4
refer to the same file and can be used interchangeably. Descriptor 3
can be used to open another file and can be restored to its original value by copying it back from descriptor 4
. If descriptor 4
is omitted, Bash assumes that you want to change standard input (descriptor 0
).
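A sketch of saving and restoring a descriptor; orders.txt is the sample file and other.txt is a made-up second file:

```shell
exec 3< orders.txt   # open the orders file on descriptor 3
exec 4<&3            # save a copy of descriptor 3 on descriptor 4
exec 3< other.txt    # reuse descriptor 3 for another file
read LINE <&3        # this read comes from other.txt
exec 3<&4            # restore descriptor 3 to the orders file
exec 4<&-            # discard the saved copy
read LINE <&3        # this read comes from orders.txt again
```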
You can move a file descriptor by appending a minus sign to it. This closes the original descriptor after it has been copied.
exec 4<&3-
You can likewise duplicate output file descriptors with >&
and move them by appending a minus sign. The default output is standard output (descriptor 1
).
To open a file for writing, use the output redirection symbol (>
).
exec 3< orders.txt
exec 4> log.out

while read LINE <&3 ; do
   printf "%s\n" "$LINE" >&4
done
The <>
symbol opens a file for both input and output.
exec 3<>orders.txt
The reading or writing proceeds sequentially from the beginning of the file. Writing to the file overwrites its contents: As long as the characters being overwritten are the same length as the original characters, the new characters replace the old. If the next line in a file is dog, for example, writing the line cat over dog replaces the word dog. However, if the next line in the file is horse, writing cat creates two lines: the line cat and the line e. The line feed character following cat overwrites the letter s. The script will now read the line e.
<>
has limited usefulness with regular files because there is no way to “back up” and rewrite something that was just read. You can only overwrite something that you are about to read next.
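The horse example above can be reproduced with a throwaway file (demo.txt is a made-up name):

```shell
printf 'horse\n' > demo.txt
exec 3<> demo.txt      # open for both reading and writing
printf 'cat\n' >&3     # four characters: c, a, t, line feed
exec 3>&-              # close the descriptor
cat demo.txt           # two lines remain: cat, then what is left of horse
```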
The script in Listing 11.7 reads through a file and appends a "Processed on"
message to the end of the file.
Example 11.7. open_files2.sh
#!/bin/bash
#
# open_files2.sh

shopt -o -s nounset

declare LINE

exec 3<> orders.txt

while read LINE <&3 ; do
   printf "%s\n" "$LINE"
done

printf "%s\n" "Processed on "`date` >&3

exit 0
<>
is especially useful for socket programming, which is discussed in Chapter 16, “Network Programming.”
As files can be opened, so they can also be closed. An input file descriptor can be closed with <&-
. Be careful to include a file descriptor because, without one, this closes standard input. An output file descriptor can be closed with >&-
. Without a descriptor, this closes standard output.
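For example, opening and then closing input and output descriptors explicitly (log.out is a made-up file name):

```shell
exec 3< orders.txt   # open descriptor 3 for input
exec 3<&-            # close input descriptor 3

exec 4> log.out      # open descriptor 4 for output
exec 4>&-            # close output descriptor 4
```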
As a special Bash convention, file descriptors can be referred to by a pathname. A path in the form of /dev/fd/
n
refers to file descriptor n
. For example, standard output is /dev/fd/1
. Using this syntax, it is possible to refer to open file descriptors when running Linux commands.
$ exec 4>results.out
$ printf "%s\n" "Send to fd 4 and standard out" | tee /dev/fd/4
Send to fd 4 and standard out
$ exec 4>&-
$ cat results.out
Send to fd 4 and standard out
The Linux head command returns the first lines contained in a file. By default, head prints the first 10 lines. You can specify a different number of lines with the --lines=n (or -n n) switch.
$ head --lines=5 orders.txt
Birchwood China Hutch,475.99,1,756
Bookcase Oak Veneer,205.99,1,756
Small Bookcase Oak Veneer,205.99,1,756
Reclining Chair,1599.99,1,757
Bunk Bed,705.99,1,757
You can abbreviate the --lines switch to a minus sign and the number of lines.
$ head -3 orders.txt
Birchwood China Hutch,475.99,1,756
Bookcase Oak Veneer,205.99,1,756
Small Bookcase Oak Veneer,205.99,1,756
The number of lines can be followed by a c for characters, an l for lines, a k for kilobytes, or an m for megabytes. The --bytes (or -c) switch prints the number of bytes you specify.
$ head -9c orders.txt
Birchwood
$ head --bytes=9 orders.txt
Birchwood
The Linux tail command displays the final lines contained in a file. As with head, the number of lines or bytes can be followed by a c for characters, an l for lines, a k for kilobytes, or an m for megabytes.
The switches are similar to those of the head command. The --bytes=n (or -c) switch prints the number of bytes you specify. The --lines=n (or -n) switch prints the number of lines you specify.
$ tail -3 orders.txt
Walnut TV Stand,388.99,1,756
Victorian-style Sofa,1225.99,1,757
Grandfather Clock,2045.99,1,756
Combining tail
and head
in a pipeline, you can display any line or range of lines.
$ head -5 orders.txt | tail -1
Bunk Bed,705.99,1,757
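This technique can be wrapped in a small function; show_line is a made-up helper name, not a standard command:

```shell
# show_line: print line $1 of file $2 by combining head and tail
show_line () {
    head -"$1" "$2" | tail -1
}

show_line 5 orders.txt   # prints the fifth line of the file
```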
If the number of lines is prefixed with a plus sign instead of a minus sign, tail counts that many lines from the start of the file and prints the remainder. This is a feature of tail, not the head command.
$ tail +17 orders.txt
Walnut TV Stand,388.99,1,756
Victorian-style Sofa,1225.99,1,757
Grandfather Clock,2045.99,1,756
When using head
or tail
on arbitrary files in a script, always check to make sure that the file is a regular file to avoid unpleasant surprises.
The Linux wc
(word count) command provides statistics about a file. By default, wc
shows the size of the file in lines, words, and characters. To make wc
useful in scripts, switches must be used to return a single statistic.
The --bytes (or --chars or -c) switch returns the file size, the same value as the file size returned by statftime.
$ wc --bytes invoices.txt
20411 invoices.txt
To use wc
in a script, direct the file through standard input so that the filename is suppressed.
$ wc --bytes < status_log.txt
57496
The --lines (or -l) switch returns the number of lines in the file. That is, it counts the number of line feed characters.
$ wc --lines < status_log.txt
1569
The --max-line-length (or -L) switch returns the length of the longest line. The --words (or -w) switch counts the number of words in the file.
wc
can be used with variables when their values are printed into a pipeline.
$ declare -r TITLE="Annual Grain Yield Report"
$ printf "%s\n" "$TITLE" | wc --words
4
The Linux cut
command removes substrings from all lines contained in a file.
The --fields (or -f) switch prints a section of a line marked by a specific character. The --delimiter (or -d) switch chooses that character. To use a space as a delimiter, it must be escaped with a backslash or enclosed in quotes.
$ declare -r TITLE="Annual Grain Yield Report"
$ printf "%s\n" "$TITLE" | cut -d' ' -f2
Grain
In this example, the delimiter is a space and the second field marked by a space is Grain. When piping printf output into cut, always make sure a line feed character is printed; otherwise, cut returns an empty string.
Multiple fields are indicated with commas and ranges as two numbers separated by a minus sign (-).
$ printf "%s\n" "$TITLE" | cut -d' ' -f 2,4
Grain Report
You separate multiple fields using the delimiter character. To use a different delimiter character when displaying the results, use the --output-delimiter switch.
The --characters (or -c) switch prints the characters at the specified positions. This is similar to the dollar sign expression substrings, but any character or range of characters can be specified. The --bytes (or -b) switch works identically but is provided for future support of multi-byte international characters.
$ printf "%s\n" "$TITLE" | cut --characters 1,3,6-8
Anl G
The --only-delimited (or -s) switch ignores lines in which the delimiter character doesn't appear. This is an easy way to skip a title or other notes at the beginning of a data file.
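For example, the -s switch can skip a made-up report file whose title line contains no comma:

```shell
{
    printf '%s\n' "Furniture Orders Report"      # title: no delimiter
    printf '%s\n' "Bunk Bed,705.99,1,757"
} > report.txt

cut -d, -f1 -s < report.txt   # the title line is skipped
```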
When used on multiple lines, cut cuts each line:
$ cut -d, -f1 < orders.txt | head -3
Birchwood China Hutch
Bookcase Oak Veneer
Small Bookcase Oak Veneer
The script in Listing 11.8 adds the quantity fields in orders.txt
.
The Linux paste
command combines lines from two or more files into a single line. With two files, paste
writes to standard output the first line of the first file, a Tab character, and the first line from the second file, and then continues with the second line until all the lines have been written out. If one file is shorter than the other, blank lines are used for the missing lines.
The --delimiters (-d) switch takes a list of one or more delimiters to use in place of a Tab. The paste command cycles through the list if it needs more delimiters than are provided, as shown in Listing 11.9.
Example 11.9. two_columns.sh
#!/bin/bash
#
# two_columns.sh

shopt -s -o nounset

declare -r ORDERS="orders.txt"
declare -r COLUMN1="column1.txt"
declare -r COLUMN2="column2.txt"
declare -i LINES

LINES=`wc -l < "$ORDERS"`
LINES=LINES/2
head -$LINES < "$ORDERS" > "$COLUMN1"
LINES=LINES+1
tail +$LINES < "$ORDERS" > "$COLUMN2"
paste --delimiters="|" "$COLUMN1" "$COLUMN2"
rm "$COLUMN1"
rm "$COLUMN2"

exit 0
Running this script, the contents of orders.txt
are separated into two columns, delineated by a vertical bar.
$ sh two_columns.sh
Birchwood China Hutch,475.99,1,756|Bar Stool,45.99,1,756
Bookcase Oak Veneer,205.99,1,756|Lawn Chair,55.99,1,756
Small Bookcase Oak Veneer,205.99,1,756|Rocking Chair,287.99,1,757
Reclining Chair,1599.99,1,757|Cedar Armoire,825.99,1,757
Bunk Bed,705.99,1,757|Mahogany Writing Desk,463.99,1,756
Queen Bed,925.99,1,757|Garden Bench,149.99,1,757
Two-drawer Nightstand,125.99,1,756|Walnut TV Stand,388.99,1,756
Cedar Toy Chest,65.99,1,757|Victorian-style Sofa,1225.99,1,757
Six-drawer Dresser,525.99,1,757|Chair - Rocking,287.99,1,757
Pine Round Table,375.99,1,757|Grandfather Clock,2045.99,1,756
Suppose you had a file called order1.txt
containing an item from orders.txt
separated into a list of the fields on single lines.
Birchwood China Hutch
475.99
1
756
The paste --serial (-s) switch pastes all the lines of each file into a single line, as opposed to combining one line from each file at a time. This switch recombines the separate fields into a single line.
$ paste --serial --delimiters="," order1.txt
Birchwood China Hutch,475.99,1,756
To merge the lines of two or more files so that the lines follow one another, use the sort
command with the -m
switch.
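A minimal sketch of sort -m with two made-up, already-sorted files:

```shell
printf '%s\n' "apple" "cherry" > list1.txt
printf '%s\n' "banana" "date"  > list2.txt

sort -m list1.txt list2.txt   # interleaves the sorted lines in order
```

Unlike a plain sort of the concatenated files, -m assumes each input is already sorted and merely merges them.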
Columns created with the paste
command aren't suitable for all applications. For pretty displays, the Linux column
command creates fixed-width columns. The columns are fitted to the size of the screen as determined by the COLUMNS
environment variable, or to a specific row width using the -c
switch.
$ column < orders.txt
Birchwood China Hutch,475.99,1,756 Bar Stool,45.99,1,756
Bookcase Oak Veneer,205.99,1,756 Lawn Chair,55.99,1,756
Small Bookcase Oak Veneer,205.99,1,756 Rocking Chair,287.99,1,757
Reclining Chair,1599.99,1,757 Cedar Armoire,825.99,1,757
Bunk Bed,705.99,1,757 Mahogany Writing Desk,463.99,1,756
Queen Bed,925.99,1,757 Garden Bench,149.99,1,757
Two-drawer Nightstand,125.99,1,756 Walnut TV Stand,388.99,1,756
Cedar Toy Chest,65.99,1,757 Victorian-style Sofa,1225.99,1,757
Six-drawer Dresser,525.99,1,757 Chair - Rocking,287.99,1,757
Pine Round Table,375.99,1,757 Grandfather Clock,2045.99,1,756
The -t
switch creates a table from items delimited by a character specified by the -s
switch.
$ column -s ',' -t < orders.txt | head -5
Birchwood China Hutch 475.99 1 756
Bookcase Oak Veneer 205.99 1 756
Small Bookcase Oak Veneer 205.99 1 756
Reclining Chair 1599.99 1 757
Bunk Bed 705.99 1 757
The Linux fold command ensures that a line is no longer than a certain number of characters. If a line is too long, a line feed is inserted. fold wraps at 80 characters by default, but the --width=n (or -w) switch folds at the specified width. The --spaces (or -s) switch folds at the nearest space to preserve words. The --bytes (or -b) switch counts a Tab character as one character instead of expanding it.
$ head -3 orders.txt | cut -d, -f 1
Birchwood China Hutch
Bookcase Oak Veneer
Small Bookcase Oak Veneer
$ head -3 orders.txt | cut -d, -f 1 | fold --width=10
Birchwood
China Hutc
h
Bookcase O
ak Veneer
Small Book
case Oak V
eneer
$ head -3 orders.txt | cut -d, -f 1 | fold --width=10 --spaces
Birchwood
China
Hutch
Bookcase
Oak Veneer
Small
Bookcase
Oak Veneer
The Linux join
command combines two files together. join
examines one line at a time from each file. If a certain segment of the lines match, they are combined into one line. Only one instance of the same segment is printed. The files are assumed to be sorted in the same order.
The line segment (or field) is chosen using three switches. The -1
switch selects the field number from the first file. The -2
switch selects the field number from the second. The -t
switch specifies the character that separates one field from another. If these switches aren't used, join
separates fields by spaces and examines the first field on each line.
Suppose the data in the orders.txt
file was separated into two files, one with the pricing information (orders1.txt
) and one with the quantity and account information (orders2.txt
).
$ cat orders1.txt
Birchwood China Hutch,475.99
Bookcase Oak Veneer,205.99
Small Bookcase Oak Veneer,205.99
Reclining Chair,1599.99
Bunk Bed,705.99
$ cat orders2.txt
Birchwood China Hutch,1,756
Bookcase Oak Veneer,1,756
Small Bookcase Oak Veneer,1,756
Reclining Chair,1,757
Bunk Bed,1,757
To join these two files together, use a comma as a field separator and compare field 1 of the first file with field 1 of the second.
$ join -1 1 -2 1 -t, orders1.txt orders2.txt
Birchwood China Hutch,475.99,1,756
Bookcase Oak Veneer,205.99,1,756
Small Bookcase Oak Veneer,205.99,1,756
Reclining Chair,1599.99,1,757
Bunk Bed,705.99,1,757
If either file contains a line whose join field has no counterpart in the other file, the line is discarded. Lines are joined only if matching fields are found in both files. To print unpaired lines, use -a 1 to print the unique lines in the first file or -a 2 to print the unique lines in the second file. The lines are printed as they appear in the files.
The sense of matching can be reversed with the -v
switch. -v 1
prints the unique lines in the first file and -v 2
prints the unique lines in the second file.
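A sketch of -a and -v with made-up files; items1.txt has one line with no match in items2.txt:

```shell
printf '%s\n' "Bunk Bed,705.99" "Queen Bed,925.99" > items1.txt
printf '%s\n' "Bunk Bed,1,757"                     > items2.txt

join -t, -a 1 items1.txt items2.txt   # joined lines plus the unpaired one
join -t, -v 1 items1.txt items2.txt   # only the unpaired Queen Bed line
```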
The tests are case-insensitive when the --ignore-case (or -i) switch is used.
The fields can be rearranged using the -o
(output) switch. Use a comma-separated field list to order the fields. A field is specified using the file number (1 or 2), a period and the field number from that file. A zero is a short form of the join field.
$ join -1 1 -2 1 -t, -o "1.2,2.3,2.2,0" orders1.txt orders2.txt
475.99,756,1,Birchwood China Hutch
205.99,756,1,Bookcase Oak Veneer
205.99,756,1,Small Bookcase Oak Veneer
1599.99,757,1,Reclining Chair
705.99,757,1,Bunk Bed
The merge
command performs a three-way file merge. This is typically used to merge changes to one file from two separate sources. The merge is performed on a line-by-line basis. If there is a conflicting modification, merge
displays a warning.
For easier reading, the -L
(label) switch can be used to specify a title for the file, instead of reporting conflicts using the filename. This switch can be repeated three times for each of the three files.
For example, suppose there are three sets of orders for ice cream. The original set of orders (file1.txt) is as follows:
1 quart vanilla
2 quart chocolate
These orders have been modified by two people. The Barrie store's copy (file2.txt) now contains the following:
1 quart vanilla
1 quart strawberry
2 quart chocolate
And the Orillia store's copy (file3.txt) is as follows:
1 quart vanilla
2 quart chocolate
4 quart butter almond
The merge
command reassembles the three files into one file.
$ merge -L "Barrie Store" -L "Original Orders" -L "Orillia Store" file2.txt file1.txt file3.txt

This command changes file2.txt so that it contains:
1 quart vanilla
1 quart strawberry
2 quart chocolate
4 quart butter almond
However, if the butter almond
and strawberry
orders were both added as the third line, merge
reports a conflict:
$ merge -L "Barrie Store" -L "Original Orders" -L "Orillia Store" file2.txt file1.txt file3.txt
merge: warning: conflicts during merge
file2.txt
will contain the details of the conflict:
<<<<<<< Barrie Store
1 quart strawberry
=======
4 quart butter almond
>>>>>>> Orillia Store
If there are no problems merging, merge
returns a zero exit status.
The -q (quiet) switch suppresses conflict warnings. The -p (print) switch writes the output to standard output instead of overwriting the original file. The -A switch reports conflicts in the diff3 -A format.
file command switches:

-c: Displays magic file output
-f file: Reads a list of files to process from file
-i: Shows the MIME type
-L: Follows symbolic links
-m list: Colon-separated list of magic files
-n: Flushes the output after each file
-s: Allows block or character special files
-v: Version
-z: Examines compressed files
statftime format codes:

%_A: Uses file last access time
%_a: Filename (no suffix)
%_C: Uses file inode change time
%_d: Device ID
%_e: Seconds elapsed since epoch
%_f: File system type
%_g: Group ID (gid) number
%_h: Three-digit hash code of path
%_i: Inode number
%_L: Uses current (local) time
%_l: Number of hard links
%_M: Uses file last modified time
%_m: Type/attribute/access bits
%_r: Rdev ID (char/block devices)
%_s: File size (bytes)
%_U: Uses current (UTC) time
%_u: User ID (uid)
%_z: Sequence number (1,2,...)
date format codes:

%A: Full weekday name
%a: Abbreviated weekday name
%B: Full month name
%b: Abbreviated month name
%C: Century number
%c: Standard format
%D: mm/dd/yy
%d: Day (zero filled)
%e: Day (space filled)
%H: Hour (24-hr clock)
%I: Hour (12-hr clock)
%M: Minute
%m: Month
%n: Line feed (newline) character
%P: am/pm
%p: AM/PM
%r: hh:mm:ss AM/PM
%S: Second
%T: hh:mm:ss (24-hr)
%t: Tab character
%U: Week number (Sunday)
%V: Week number (Monday)
%W: Week number (Monday)
%w: Weekday (Sunday)
%X: Current time
%x: Current date
%Y: Year
%y: Year (two digits)
wget command switches:

--accept list (or -A list): Comma-separated lists of suffixes and patterns to accept
--append-output log (or -a log): Like --output-file, but appends instead of overwriting
--background (or -b): Runs in the background as if it was started with &
--continue (or -c): Resumes a terminated download
--cache=O (or -C O): Doesn't return cached Web pages when "off"
--convert-links (or -k): Converts document links to reflect local directory
--cut-dirs=N: Ignores the first N directories in a URL pathname
--delete-after: Deletes downloaded files to "preload" caching servers
--directory-prefix=P (or -P P): Saves files under P instead of current directory
--domains list (or -D list): Accepts only given host domains
--dot-style=S: Progress information can be displayed as default, binary, computer, mega, or micro
--exclude-directories=list (or -X list): Directories to reject when downloading
--exclude-domains list: Rejects given host domains
--execute cmd (or -e cmd): Runs a resource file command
--follow-ftp: Downloads FTP links in HTML documents
--force-directories (or -x): Always creates directories for the hostname when saving files
--force-html (or -F): Treats --input-file as an HTML document even if it doesn't look like one
--glob=O (or -g O): Allows file globbing in FTP URL filenames when "on"
--header=H: Specifies an HTTP header to send to the Web server
--http-passwd=P: Specifies a password (instead of in the URL)
--http-user=U: Specifies a username (instead of in the URL)
--ignore-length: Ignores bad document lengths returned by Web servers
--include-directories=list (or -I list): Directories to accept when downloading
--input-file=F (or -i F): Reads the URLs to get from the given file; it can be an HTML document
--level=D (or -l D): Maximum recursion level (default is 5)
--mirror (or -m): Enables recursion, infinite levels, time stamping, and keeping a .listing file
--no-clobber (or -nc): Doesn't replace existing files
--no-directories (or -nd): Saves all files in the current directory
--no-host-directories (or -nH): Never creates directories for the hostname
--no-host-lookup (or -nh): Disables DNS lookup of most hosts
--no-parent (or -np): Only retrieves files below the parent directory
--non-verbose (or -nv): Shows some progress information, but not all
--output-document=F (or -O F): Creates one file F containing all files; if -, all files are written to standard output
--output-file log (or -o log): Records all error messages to the given file
--passive-ftp: Uses "passive" retrieval, useful when wget is behind a firewall
--proxy=O (or -Y O): Turns proxy support "on" or "off"
--proxy-passwd=P: Specifies a password for a proxy server
--proxy-user=U: Specifies a username for a proxy server
--quiet (or -q): Suppresses progress information
--quota=Q (or -Q Q): Stops downloads when the current files exceed Q bytes; can also specify k kilobytes or m megabytes; inf disables the quota
--recursive (or -r): Retrieves recursively
--reject list (or -R list): Comma-separated lists of suffixes and patterns to reject
--relative (or -L): Ignores all absolute links
--retr-symlinks: Treats remote symbolic links as new files
--save-headers (or -s): Saves the Web server headers in the document file
--server-response (or -S): Shows server responses
--span-hosts (or -H): Spans across hosts when recursively retrieving
--spider: Checks for the presence of a file, but doesn't download it
--timeout=S (or -T S): Network socket timeout in seconds; 0 for none
--timestamping (or -N): Only gets new files
--tries=N (or -t N): Try at most N tries; if inf, tries forever
--user-agent=U (or -U U): Specifies a different user agent than wget to access servers that don't allow wget
--verbose (or -v): By default, shows all progress information
--wait=S (or -w S): Pauses S seconds between retrievals; can also specify m minutes, h hours, and d days
ftp command switches:

-a: Uses an anonymous login
-d: Enables debugging
-e: Disables command-line editing
-f: Forces a cache reload for transfers that go through proxies
-g: Disables filename globbing
-I: Turns off interactive prompting during multiple file transfers
-n: No auto-login upon initial connection
-o file: When auto-fetching files, saves the contents in file
-p: Uses passive mode (the default)
-P port: Connects to the specified port instead of the default port
-r sec: Retries connecting every sec seconds
-R: Restarts all non-proxied auto-fetches
-t: Enables packet tracing
-T dir,max[,inc]: Sets maximum bytes/second transfer rate for direction dir, increments by optional inc
-v: Enables verbose messages (default for terminals)
csplit command switches:

--suffix-format=FMT (or -b FMT): Uses printf formatting FMT instead of %d
--prefix=PFX (or -f PFX): Uses prefix PFX instead of xx
--keep-files (or -k): Does not remove output files on errors
--digits=D (or -n D): Uses specified number of digits instead of two
--quiet (or --silent or -s): Does not print progress information
--elide-empty-files (or -z): Removes empty output files
tail command switches:

--bytes=N (or -c N): Outputs the last N bytes
--follow[=ND] (or -f ND): Outputs appended data as the file indicated by name N or descriptor D grows
--lines=N (or -n N): Outputs the last N lines, instead of the last 10
--max-unchanged-stats=N: Continues to check the file up to N times (default is 5), even if the file is deleted or renamed
--max-consecutive-size-changes=N: After N iterations (default 200) with the same size, makes sure that the filename refers to the same inode
--pid=PID: Terminates after process ID PID dies
--quiet (or --silent or -q): Never outputs headers with filenames
--sleep-interval=S (or -s S): Sleeps S seconds between iterations
cut command switches:

--characters=L (or -c L): Shows only the listed character positions
--delimiter=D (or -d D): Uses delimiter D instead of a Tab character for the field delimiter
--fields=L (or -f L): Shows only the listed fields
--only-delimited (or -s): Does not show lines without delimiters
--output-delimiter=D: Uses delimiter D as the output delimiter
join command switches:

-2 F: Joins on field F of file 2
-a file: Prints unpaired lines from file
-e s: Replaces missing input fields with string s
--ignore-case (or -i): Ignores differences in case when comparing fields
-o F: Obeys format F while constructing output line
-t C: Uses character C as input and output field separator
-v file: Suppresses joined output lines from file
merge command switches:

-A: Merges conflicts by merging all changes leading from file2 to file3 into file1
-e: Merge conflicts are marked as ==== and ====
-E: Merge conflicts are marked as <<<<< and >>>>>>
-L label: Used up to three times to specify labels to be used in place of the filenames
-p: Writes to standard output