In a production environment, it is a good idea to know whether your applications are dumping core often. It is also good housekeeping to keep track of your core files so they don't fill up your hard disk with unnecessary data. The small script in this chapter tracks down and cleans up core files. The script was intended to run as an hourly cron job, although you could change the schedule to fit your needs. The job also has its priority lowered using the nice command so that it won't interfere with the performance of regular processes on the machine. On more than one occasion, the notifications I've received from this script have revealed chronic issues with applications. Without the script, I would never have seen the patterns.
This script steps through each of the locally mounted file systems and finds all core files. It determines the applications that created the core files and moves the core files to a central location for later examination. The script also logs its actions and cleans up old saved files.
First we have to set up some straightforward variables.
#!/bin/sh
HOWOLD=30
FSLIST="/ /boot"
UNAME=`uname -n`
DATE=`date +%m%d%H%M`
DATADIR="/usr/local/data/cores"
LOGFILE="$DATADIR/cor_report"
The HOWOLD variable is used as the maximum age, in days, for saved core files. Any saved files older than this will be removed.
The FSLIST variable contains the list of file systems that will be checked for core files. The list will vary from system to system. You could set FSLIST dynamically by using the df command to determine the locally mounted file systems. The command might look something like this: FSLIST=`df -l | grep '^/dev' | awk '{print $6}'`, which gathers the lines starting with /dev and then prints the field containing the mount point.
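To see what that pipeline extracts, here is a sketch run against a canned df -l listing; the sample output, device names, and mount points are invented for the demonstration, and on a live system you would pipe df -l in directly:

```shell
#!/bin/sh
# Demonstration only: a canned df -l listing stands in for real output,
# so the extraction logic can be seen in isolation.
sample="Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/sda1       41152736 9563120  29474616  25% /
/dev/sda2        1038336  189464    777256  20% /boot
tmpfs            4046856       0   4046856   0% /dev/shm"

# Keep only /dev-backed file systems and print field 6, the mount point.
FSLIST=`printf '%s\n' "$sample" | grep '^/dev' | awk '{print $6}'`
echo $FSLIST                      # prints "/ /boot"
```

Note that pseudo file systems such as tmpfs are filtered out because their device fields do not begin with /dev.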
The other four variables contain the name of the system that the script is running on (UNAME), the current date (DATE), the directory to which the core files will be saved (DATADIR), and the name of the log file (LOGFILE).
Now we determine whether the data directory exists. This is the directory where the core files will be saved. If it doesn't exist, the script creates it. The -p option to mkdir creates any missing parent directories in the path of the directory being created.
if [ ! -d $DATADIR ]
then
mkdir -p $DATADIR
fi
Then you need to find all previously saved core files that don't need to be kept around anymore, and remove them.
find $DATADIR -name "*.core.*" -mtime +$HOWOLD -exec rm {} \;
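As a quick sanity check of the pruning expression, the following sketch builds a throwaway directory containing one back-dated file and one fresh file; -mtime +30 should match only the back-dated one. The file names here are invented for the demo:

```shell
#!/bin/sh
# Demo of the age-based cleanup in a scratch directory.
HOWOLD=30
tmp=`mktemp -d`
touch -t 202001010000 "$tmp/host.core.httpd.old"   # mtime far in the past
touch "$tmp/host.core.httpd.new"                   # mtime is now

# Same expression as the real cleanup: remove saved cores older than $HOWOLD days.
find "$tmp" -name "*.core.*" -mtime +$HOWOLD -exec rm {} \;

remaining=`ls "$tmp"`
echo "$remaining"                 # only host.core.httpd.new survives
rm -rf "$tmp"
```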
The following is the script's main loop. It finds all the core files created since the last time the script was run.
find $FSLIST -mount -type f -name core -print | while read file
do
        if [ -s "$file" ]
        then
                coretype=`file "$file" | cut -d\' -f2`
                mv "$file" $DATADIR/$UNAME.core.$coretype.$DATE
                echo "$coretype $UNAME $DATADIR/$UNAME.core.$coretype.$DATE" |
                        tee -a $LOGFILE |
                        mailx -s "core_finder: $coretype core file found on $UNAME" sysadmins
        else
                rm "$file"
        fi
done
If a core file's size is zero bytes, we remove it. If it is larger than zero bytes, we use the file command to determine the application that created the core file. The core file is then saved in the archive directory under a new name that includes the originating application and the date when the file was found. Finally, we add an entry to the log file noting the action that was taken, and send an e-mail notice to the administrators.