CHAPTER 40
Removing Large Files and
Log Rolling

This chapter gives a few tips on moving or removing files that consume large amounts of disk space. Sometimes a file system fills up because a process writes copious entries to a log file. When a large log file turns out to be the primary cause of a full file system, your first inclination may be to remove the file to reclaim the space. However, this may not work as you intend.

Before I go into why, let me first talk a little about files and their structure. The file name is the visible representation of an inode that allows you to access the data in the file. The inode contains all the important information about a file, such as its ownership, permissions, modification times, and other items of interest. An inode also holds information about the location of the data on the hard disk. A file name is a user-friendly way of accessing an inode, which is represented by a number. You can determine the inode number of any file by running the command ls -i filename.

Here is an example of a full directory:

$ ls -i
474322 37l5152.pdf               214111 rsync.tar.gz
474633 770tref.pdf               215939 yum-2.0.5-1.fd.fr.noarch.rpm
215944 openbox-3.1-1.i386.rpm

Note that the inode number precedes each of the five files listed here. As just mentioned, part of the data contained in an inode is a pointer to the data on the physical disk. If the file has been opened by a process for writing, the process writes to this location on the hard disk.

The potential problem with removing a large file to reclaim disk space is that a process may still hold the file open for writing after a user or an administrator removes its name; in that case the disk space is not returned to the system. The operating system releases the space for reuse only when the last process holding the file open closes it. At that point the disk space is reclaimed.

One way of finding out whether a process is keeping a file open is the fuser command, which displays a list of the process IDs accessing a given file. You can also use the lsof command, which is designed to list open files and the processes accessing them. Once you know which processes are using the file, you can stop them before deleting it. However, you may not be able, or may not want, to do this: stopping the process may violate existing site policies, or the application holding the file open may be production-critical, and halting it would impact business needs.
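As a quick sketch of both commands (the file name and the background tail process are just illustrations, and the checks guard against either utility being absent on a given system):

```shell
# Demonstration: keep a file open with a background process, then
# ask fuser and lsof which process IDs have it open.
LOG=/tmp/fuser_demo_$$.log
touch "$LOG"
tail -f "$LOG" > /dev/null 2>&1 &   # background process keeps the file open
TAILPID=$!
if command -v fuser > /dev/null 2>&1; then
  fuser "$LOG"                      # prints the PIDs accessing the file
fi
if command -v lsof > /dev/null 2>&1; then
  lsof "$LOG"                       # one line per process holding it open
fi
kill "$TAILPID"
rm -f "$LOG"
```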

Another way to clean up the space used by a file is to zero it out by redirecting /dev/null into it. This trims the file to zero bytes while leaving the file itself in place. The file remains open and accessible to any process that might be using it, and the operating system releases the disk space in a timely fashion. Keep in mind that some processes may keep a file open for writing for a very long time.
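A minimal sketch of the idea (the file name is hypothetical). Note that : is the shell's no-op builtin, so : > file truncates the file without running any real command, and writers can continue appending afterward:

```shell
LOG=/tmp/trunc_demo_$$.log
echo "old data" > "$LOG"     # the file starts with some contents
: > "$LOG"                   # truncate it in place; name and inode remain
echo "new data" >> "$LOG"    # appending still works after truncation
CONTENTS=$(cat "$LOG")
rm -f "$LOG"
echo "$CONTENTS"             # prints "new data"
```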

Here is a sample of a directory listing, including an offending log file that is consuming large amounts of disk space. We have a choice of several possible commands that will zero out the file.

$ ls -l
total 113172
-rw-rw-r-- 1 rbpeters rbpeters    1057862 Jun  7 18:21 37l5152.log
-rw-rw-r-- 1 rbpeters rbpeters     449184 Jun  7 18:21 770tref.log
-rw-rw-r-- 1 rbpeters rbpeters 1104249096 Jun  7 21:22 really_big.log

The following bare redirect is one such command; cp /dev/null really_big.log is another, and echo -n > really_big.log is yet another (without -n, echo would leave a single newline character in the file instead of trimming it to zero bytes). All of these commands replace the contents of a large file with nothing.

$ >really_big.log

This is the resulting directory listing after the file has been zeroed out:

$ ls -l
total 1484
-rw-rw-r-- 1 rbpeters rbpeters  1057862 Jun 7 18:21 37l5152.log
-rw-rw-r-- 1 rbpeters rbpeters   449184 Jun 7 18:21 770tref.log
-rw-rw-r-- 1 rbpeters rbpeters        0 Jun 7 21:25 really_big.log

This procedure can also be used to rotate log files. For example, you may have an application whose log you would like to keep for one week, then rotate to an older version that is kept for up to one month, and delete after that. One packaged solution is the logrotate command, which ships with most Linux distributions.
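For comparison, a logrotate configuration implementing roughly that schedule might look like the following sketch (the path and retention count are hypothetical; the copytruncate directive uses the same copy-then-truncate idea described in this chapter):

```
/var/log/myapp.log {
    weekly            # rotate once per week
    rotate 4          # keep four rotated copies, about one month
    copytruncate      # copy the log, then truncate it in place
    compress          # gzip the rotated copies
    missingok         # no error if the log is absent
}
```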

The following is a simple script that copies the existing log file to a backup with a version number and then zeroes out the file. During the time the script is being run, any existing processes that are writing to the file can still access it.

The code first defines the file in question and then checks whether it exists. If it does exist, execution continues.

#!/bin/sh
LOGFILE=./my.log
if [ ! -f "$LOGFILE" ]
then
  echo "Nothing to do... exiting."
else

You've already seen the use of the redirect (>) to zero a file. The main addition here is the cp -p command.

  cp -p "$LOGFILE" "${LOGFILE}.0"
  > "$LOGFILE"
fi

This command preserves the original file's ownership, permissions, and time stamp in the new copy, so the copy appears as if the file had simply been moved. This is also a useful technique when modifying scripts, configuration files, or any other file whose original attributes are worth preserving.
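Extending the same idea, here is a sketch of keeping several numbered generations (the file name and retention depth are just examples):

```shell
LOGFILE=/tmp/rotate_demo_$$.log
echo "current entries" > "$LOGFILE"
# Shift older generations up: .2 -> .3, .1 -> .2, .0 -> .1
for n in 2 1 0; do
  if [ -f "$LOGFILE.$n" ]; then
    mv "$LOGFILE.$n" "$LOGFILE.$((n + 1))"
  fi
done
cp -p "$LOGFILE" "$LOGFILE.0"   # preserve ownership and time stamp
: > "$LOGFILE"                  # truncate the live log in place
```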
