This chapter covers the following exam objectives:
Objective 4.1: Given a scenario, analyze system properties and remediate accordingly
Objective 4.4: Given a scenario, analyze and troubleshoot application and hardware issues
One of the more challenging troubleshooting operations is dealing with hardware issues. Because of the vast array of possible hardware devices, this sort of troubleshooting requires a lot of research and patience.
In this chapter, you will learn many commands and techniques designed to help you troubleshoot issues related to the CPU, RAM, I/O devices, and other hardware devices. It will help you build a toolbox full of tools you can use to troubleshoot future hardware problems.
The “Do I Know This Already?” quiz enables you to assess whether you should read this entire chapter or simply jump to the “Exam Preparation Tasks” section for review. If you are in doubt, read the entire chapter. Table 16-1 outlines the major headings in this chapter and the corresponding “Do I Know This Already?” quiz questions. You can find the answers in Appendix A, “Answers to the ‘Do I Know This Already?’ Quizzes and Review Questions.”
Table 16-1 “Do I Know This Already?” Foundation Topics Section-to-Question Mapping
Foundation Topics Section | Questions Covered in This Section |
---|---|
 | 1 |
 | 2–3 |
 | 4 |
 | 5 |
Caution
The goal of self-assessment is to gauge your mastery of the topics in this chapter. If you do not know the answer to a question or are only partially sure of the answer, you should mark that question as wrong for purposes of the self-assessment. Giving yourself credit for an answer you correctly guess skews your self-assessment results and might provide you with a false sense of security.
1. Which of the following commands can be used to perform simple disk latency tests?
a. df
b. ioping
c. du
d. noop
2. The output of the uptime command indicates a value of 1.50 for the past 1 minute on your system that has two CPUs. On average, what percentage of the time was each CPU busy during this period?
a. 50%
b. 75%
c. 100%
d. 150%
3. Which option to the iostat command displays just CPU load information?
a. -c
b. -d
c. -h
d. -m
4. Which of the following commands display memory usage? (Choose all that apply.)
a. free
b. iostat
c. vmstat
d. netstat
5. Which of the following commands displays information about the system’s Ethernet device?
a. lsmod
b. lsusb
c. lsdevices
d. None of these are correct.
Many components of the Linux operating system can have an impact on the performance of a system. Monitoring all these components to proactively protect your system can be daunting. To make this task easier, you should become familiar with the various tools and utilities that can be used to monitor your system.
The most important components that you should monitor include the following:
CPU usage: Regardless of the other components, if your CPU is overloaded, performance is going to suffer.
Memory usage: Full system RAM can prevent additional programs from executing or can slow down your system as swap space is heavily used.
Disk I/O: Particularly on systems that host databases or mail servers, disk input/output performance is an important component to monitor.
Network I/O: Anyone who has experienced a slow network (which is likely to be everyone) understands how important network speed is to the operation of a system.
Before you dive into learning about the various tools available to monitor your system, you should consider why you would want to monitor the system in the first place. There are a large number of reasons, and the following list is meant to give you an idea of why this topic is so important:
Monitor the system for overall performance issues: This reason is often the first one that comes to mind when administrators think about monitoring a system. No one wants a slow, sluggish operating system, and properly monitoring your system can help you find performance issues.
Monitor the system to gauge the performance of specific programs: You might be called on to assist developers in testing new software. One way of doing this is to monitor CPU, memory, disk, and network usage while the specific software is running.
Monitor the system for security issues: Clearly, security is a huge issue in today’s IT world. Not only do you need to be concerned about someone attempting to steal sensitive data, you also need to be concerned about monitoring your system for suspicious activity. For example, a heavy increase in network activity could indicate a brute force hacking attempt.
Note
Linux+ Objective 4.1: “Given a scenario, analyze system properties and remediate accordingly” includes several topics that are covered in Chapter 19, “Storage Configuration.” Therefore, the following topics are not included in this chapter:
The iostat command
The du command
The df command
LVM tools
The fsck command
The partprobe command
Please see Chapter 19 for details on these topics.
The purpose of the ioping command is to perform simple latency tests on a disk. Latency is the delay between issuing an I/O request and the completion of the data transfer. Typically you want to run multiple tests on a device. For example, the following command performs 100 tests on the filesystem that holds the current directory (with the tail command included to limit output):
[root@localhost root]# ioping -c 100 . | tail
4 KiB <<< . (ext4 /dev/sda1): request=96 time=210.6 us
4 KiB <<< . (ext4 /dev/sda1): request=97 time=280.7 us
4 KiB <<< . (ext4 /dev/sda1): request=98 time=249.2 us
4 KiB <<< . (ext4 /dev/sda1): request=99 time=230.1 us
4 KiB <<< . (ext4 /dev/sda1): request=100 time=210.9 us
--- . (ext4 /dev/sda1) ioping statistics ---
99 requests completed in 22.6 ms, 396 KiB read, 4.38 k iops, 17.1 MiB/s
generated 100 requests in 1.65 min, 400 KiB, 1 iops, 4.04 KiB/s
min/avg/max/mdev = 143.1 us / 228.1 us / 369.1 us / 40.4 us
There are two things you should consider when reviewing the output of the ioping command:
The results should be compared with the specifications that the hard disk vendor has provided.
The results should be compared to a baseline test. It is best to create a baseline during a time when the system is not being used by users and when not much activity is taking place.
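Comparing against a baseline is easier if the average latency is captured as a single value. The following is a minimal sketch, assuming the summary line has the min/avg/max/mdev layout shown above; ioping_avg is a hypothetical helper name, not part of ioping.

```shell
#!/usr/bin/env bash
# ioping_avg is a hypothetical helper: it reads ioping output on stdin and
# prints the avg field of the "min/avg/max/mdev = ..." summary line.
ioping_avg() {
    awk -F' = ' '/^min\/avg\/max\/mdev/ {
        split($2, parts, " / ")   # parts[2] holds the avg value, e.g. "228.1 us"
        print parts[2]
    }'
}

# Summary line copied from the output shown above.
printf '%s\n' 'min/avg/max/mdev = 143.1 us / 228.1 us / 369.1 us / 40.4 us' | ioping_avg
```

On a live system, something like `ioping -c 100 . | ioping_avg` could be saved to a file while the system is quiet and compared with later runs. The unit is printed along with the number, so check that both values use the same unit before comparing them.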
Kernel parameters can be used to optimize the I/O (input/output) scheduler. Several parameters can be set to change the behavior of the scheduler. For the Linux+ exam, the following parameters are important:
noop: This scheduler follows the FIFO (first-in, first-out) principle.
CFQ: This scheduler, which stands for Completely Fair Queuing, has a separate queue for each process, and each queue is served in a continuous loop.
Deadline: This is the standard scheduler, and it creates two queues: a read queue and a write queue. It also puts a timestamp on each I/O request to ensure that requests are handled in a timely manner.
Which scheduler is best depends on what the device is being used for; each device can have a different scheduler. A fair amount of research may be necessary to determine the best scheduler. In addition, other kernel parameters can further tune how each scheduler performs.
To see the current scheduler, view the contents of the /sys/block/device/queue/scheduler file (where device is the actual device name). For example:
[root@localhost root]# cat /sys/block/sda/queue/scheduler
[noop] deadline cfq
The value within the [ ] characters is the scheduler currently in use for the device. To change this value, use the echo command as shown here:
[root@localhost root]# echo "cfq" > /sys/block/sda/queue/scheduler
[root@localhost root]# cat /sys/block/sda/queue/scheduler
noop deadline [cfq]
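When checking several devices, it can help to print only the bracketed name. The following is a small sketch of that idea; active_scheduler is a hypothetical helper name, and the sample line is the one from the output above.

```shell
#!/usr/bin/env bash
# active_scheduler (hypothetical helper): print only the name enclosed in [ ].
# It reads the files named as arguments, or stdin when no file is given.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$@"
}

# Sample line copied from the output shown above.
printf '%s\n' 'noop deadline [cfq]' | active_scheduler
```

On a live system the function could be pointed at the file itself, for example `active_scheduler /sys/block/sda/queue/scheduler`.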
Note
Kernel parameters can also be set so changes are persistent across reboots. See Chapter 20, “Network Configuration,” to learn about the sysctl configuration file.
While monitoring the CPU(s), it is important to determine not only whether a CPU is overloaded but also what is causing the overload. For example, is it a user process or a system process? If working in a virtual environment, is the hypervisor to blame? Determining the answers to such questions will help you troubleshoot system performance issues.
The uptime command shows how long the system has been running. More importantly for system monitoring, it provides a quick snapshot of how many users are on the system and the system load average over the most recent 1, 5, and 15 minutes. Here is an example:
[root@localhost ~]# uptime
 16:16:00 up 1 day, 1:05, 6 users, load average: 0.60, 0.51, 0.25
The load average often causes confusion for administrators. This value is designed to describe the CPU load. For a system with a single CPU, a load average of 0.50 means the CPU was used for 50% of that time period. A load average of 1.50 means the CPU was overtasked; process requests were stuck on the queue as the CPU was busy handling other requests. This is not an ideal situation for a system if it happens often.
If you have two CPUs, then you must look at this data differently. A load average of 0.50 would mean the CPUs were used for 25% of that time period. A load average of 1.50 means that the CPUs were in use for 75% of that time period. In other words, the load average as a percentage is calculated by dividing the load value by the number of CPUs and multiplying by 100.
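The calculation described above can be sketched as a one-line function. The name load_pct is hypothetical, and the arguments are the load value and the CPU count.

```shell
#!/usr/bin/env bash
# load_pct (hypothetical helper): load average as a percentage of CPU capacity.
#   $1 = load average, $2 = number of CPUs
load_pct() {
    awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.0f\n", load / cpus * 100 }'
}

load_pct 0.50 2   # prints 25
load_pct 1.50 2   # prints 75
```

To apply it to live data, one option is `load_pct "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"`, which takes the 1-minute value from /proc/loadavg and the CPU count from the nproc command.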
Another means of discovering information about CPUs is the /proc/cpuinfo file, which contains detailed information about each CPU, as demonstrated in Example 16-1.
Example 16-1 The /proc/cpuinfo File
[root@localhost ~]# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 13
model name : QEMU Virtual CPU version (cpu64-rhel6)
stepping : 3
microcode : 0x1
cpu MHz : 2666.760
cache size : 4096 KB
physical id : 0
siblings : 1
core id : 0
cpu cores : 1
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 4
wp : yes
flags : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pse36 clflush mmx fxsr sse sse2 syscall nx lm nopl pni cx16 hypervisor lahf_lm
bugs :
bogomips : 5333.52
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:
processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 13
model name : QEMU Virtual CPU version (cpu64-rhel6)
stepping : 3
microcode : 0x1
cpu MHz : 2666.760
cache size : 4096 KB
physical id : 1
siblings : 1
core id : 0
cpu cores : 1
apicid : 1
initial apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 4
wp : yes
flags : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pse36 clflush mmx fxsr sse sse2 syscall nx lm nopl pni cx16 hypervisor lahf_lm
bugs :
bogomips : 5333.52
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:
One of the more useful tools in monitoring the CPU is called iostat. This tool is also used for monitoring disk I/O, but this section focuses on the CPU monitoring features.
When you execute the iostat command with the -c option, the command provides you with statistics regarding CPU utilization since the last time the system was booted, as shown in the following example:
[root@localhost ~]# iostat -c
Linux 3.10.0-229.el7.x86_64 (localhost.localdomain) 09/05/2015 _x86_64_ (1 CPU)
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.84    0.00    1.18    0.33    0.00   96.65
The beginning of the output provides a summary of your system, including the kernel version, the hostname, the date of the report, the kernel type, and how many CPUs the system has. The second part of the report contains CPU statistics; if you have more than one CPU, this will be an average of all of the CPUs. The values provided include
%user: This value represents the percentage of CPU utilization for when applications were running at the user level (processes running as a normal user account).
%nice: Regular users can execute commands using the nice command to alter process CPU priority. This value represents the CPU utilization for these processes.
%system: This value represents the percentage of CPU utilization of kernel-based processes.
%iowait: This value represents the percentage of CPU utilization when the CPU was waiting for disk I/O operations to complete before performing the next action.
%steal: This value only pertains to virtual CPUs. In some cases the virtual CPU must wait for the hypervisor to handle requests from other virtual CPUs. This value indicates the percentage of time waiting for the hypervisor to handle the virtual CPU’s request.
%idle: This value represents the percentage of time the CPU is not handling requests.
One of the purposes of the iostat command is to determine the causes of likely problems. Fortunately, this normally only requires understanding the output of commands and some common sense. For example, consider the following output:
[root@localhost ~]# iostat -c
Linux 3.10.0-229.el7.x86_64 (localhost.localdomain) 09/05/2015 _x86_64_ (1 CPU)
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.84    0.00    1.18   45.33    0.00   51.65
Here you can see that the output of the %iowait value is higher than what might be considered normal. If the system seems sluggish (users are complaining about slow processes, services are responding slowly, and so on), this value would encourage you to investigate your hard disk utilization.
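A check like this can be scripted. The sketch below reads iostat -c output and prints a warning when %iowait is above a limit. The function names and the default limit of 20 percent are assumptions for illustration only; what counts as "high" depends on the system and its baseline.

```shell
#!/usr/bin/env bash
# iowait_from_iostat and check_iowait are hypothetical helper names.

# Print the %iowait value from each "iostat -c" report read on stdin.
iowait_from_iostat() {
    awk '/^avg-cpu:/ {
        col = 0
        # Locate %iowait in the header. The values line has no "avg-cpu:"
        # label, so the matching value sits one field to the left.
        for (i = 2; i <= NF; i++) if ($i == "%iowait") col = i - 1
        if (col && (getline) > 0) print $col
    }'
}

# Print a warning when %iowait exceeds a limit (assumed default: 20 percent).
check_iowait() {
    local limit="${1:-20}"
    iowait_from_iostat |
        awk -v limit="$limit" '$1 + 0 > limit + 0 { print "high iowait: " $1 "%" }'
}

# Sample report copied from the output shown above.
check_iowait <<'EOF'
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.84    0.00    1.18   45.33    0.00   51.65
EOF
```

On a live system the same check could be run as `iostat -c | check_iowait 30`, where 30 is whatever limit suits the system being monitored.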
It is unlikely that you will be asked about specific iostat options on the Linux+ exam. However, there are two useful arguments to the command: the interval and the count. The interval is always the first argument and indicates how many seconds to wait between running iostat reports. The count is the number of reports to run.
For example, review the following output, which shows the iostat command displaying statistical information once every second three times during a period in which a process peaks in system usage:
[root@localhost ~]# iostat -c 1 3
Linux 3.10.0-229.el7.x86_64 (localhost.localdomain) 09/05/2015 _x86_64_ (1 CPU)
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           6.30    0.40    0.98    0.42    0.00   91.89
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           5.05    0.00   94.95    0.00    0.00    0.00
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           5.05    0.00   94.95    0.00    0.00    0.00
Another command that provides the same statistics as the iostat command is the sar command. However, the sar command displays this information as it occurs over time (typically at 10-minute intervals). For example, look at the output of Example 16-2.
Example 16-2 Output of the sar Command
[root@localhost ~]# sar | head
Linux 3.10.0-229.el7.x86_64 (localhost.localdomain) 09/06/2015 _x86_64_ (1 CPU)
12:00:01 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
12:10:01 AM     all      0.11      0.00      0.10      0.01      0.00     99.78
12:20:01 AM     all      0.08      0.00      0.07      0.02      0.00     99.84
12:30:01 AM     all      0.10      0.00      0.11      0.01      0.00     99.79
12:40:01 AM     all      0.06      0.00      0.04      0.00      0.00     99.89
12:50:01 AM     all      0.08      0.00      0.04      0.00      0.00     99.88
01:00:02 AM     all      0.10      0.00      0.11      0.01      0.00     99.79
01:10:01 AM     all      0.10      0.00      0.05      0.00      0.00     99.85
A system with a blazing-fast CPU can grind to a halt if there are issues with memory. Keep in mind that when you are monitoring memory utilization, you need to look at both RAM (random access memory) and swap space.
The first command related to memory usage that you should be aware of is the free command. This command provides a summary of virtual memory (RAM and swap space utilization):
[root@localhost ~]# free
              total        used        free      shared  buff/cache   available
Mem:        3883128      774752      777300       18200     2331076     2784820
Swap:        839676           0      839676
The Mem: line describes RAM, and the Swap: line describes swap space. The columns of output are described here:
total: The total amount of memory on the system.
used: The amount of memory currently being used.
free: The amount of memory that is not being used for anything.
shared: The amount of memory used by tmpfs, a filesystem that appears to be normal hard disk space but that is really storing data in memory.
buff/cache: The amount of memory used by kernel buffers and the page cache, which hold disk data temporarily and can be reclaimed when needed.
available: An estimate of the amount of memory available for starting new processes without swapping.
The values are displayed in kibibytes (units of 1,024 bytes) by default. You can view more precise values by using the -b or --bytes option, or you can display values in mebibytes (-m or --mebi) or gibibytes (-g or --gibi); older versions of free name these long options --mega and --giga. There is also an -h (or --human) option that picks an appropriate unit for each value to produce “human-readable sizes.”
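The available column is often the most useful single number. As a rough sketch, the following function expresses it as a percentage of total RAM. The name mem_available_pct is hypothetical, and the code assumes the seven-field Mem: line shown above (older versions of free that lack the available column would need a different field).

```shell
#!/usr/bin/env bash
# mem_available_pct (hypothetical helper): read "free" output on stdin and
# print the available column as a percentage of total RAM.
# Assumes the Mem: line layout shown above: $2 = total, $7 = available.
mem_available_pct() {
    awk '$1 == "Mem:" { printf "%.1f\n", $7 / $2 * 100 }'
}

# Sample output copied from the free command shown above.
mem_available_pct <<'EOF'
              total        used        free      shared  buff/cache   available
Mem:        3883128      774752      777300       18200     2331076     2784820
Swap:        839676           0      839676
EOF
```

For the sample data this prints 71.7. On a live system the equivalent call would be `free | mem_available_pct`.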
If you need more detail than the free command provides, you can execute the vmstat command. Consider the output in the following example:
[root@localhost ~]# vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2  0      0 776828   1704 2329456    0    0    49    89  115  120  6  1 93  0  0
It is not critical to memorize each line of this output, but you should be familiar with the output the vmstat command can provide. For example, in the preceding output, there is a column called -----io----. Under this column are two subcolumns:
bi: Also called “blocks in,” this value represents how many blocks have been received from a block device (like a hard disk).
bo: Also called “blocks out,” this value represents how many blocks have been sent to a block device (like a hard disk).
These values can be used to determine whether a performance issue is memory based or disk based. If the bi and bo values are high, it could mean that processes are being blocked on I/O.
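To pull just these two values out of the report, one approach is to look the columns up by name rather than by position. This is a minimal sketch; vmstat_io is a hypothetical helper name, and the sample is the vmstat output shown above.

```shell
#!/usr/bin/env bash
# vmstat_io (hypothetical helper): read vmstat output on stdin and print the
# bi and bo values, locating the columns by their header names.
vmstat_io() {
    awk '
        # The column-name header starts with "r"; remember each name position.
        $1 == "r" { for (i = 1; i <= NF; i++) col[$i] = i; next }
        # Data lines start with a number and follow the header.
        ("bi" in col) && $1 ~ /^[0-9]+$/ {
            print "bi=" $(col["bi"]), "bo=" $(col["bo"])
        }
    '
}

# Sample output copied from the vmstat command shown above.
vmstat_io <<'EOF'
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2  0      0 776828   1704 2329456    0    0    49    89  115  120  6  1 93  0  0
EOF
```

The vmstat command also accepts an interval and a count, so `vmstat 1 5 | vmstat_io` would print five pairs of values one second apart.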
You should also be aware that the vmstat command displays buffer cache output, which is included in the Linux+ exam objectives. The buffer (under the buff column) is a value that represents how much RAM is currently in use for disk block caching (specifically file metadata). The cache itself contains file contents that are temporarily stored in memory.
Note that most of the data that is displayed by the free and vmstat commands actually comes from the /proc/meminfo file, which can be viewed directly.
What happens when the system uses too much memory? The Linux kernel has a feature called the OOM killer (out of memory killer), which kills processes to clear up memory. Without this feature, the system could grind to a halt, and new processes could fail to execute.
The OOM killer determines which process to kill by assigning a score to each process (called a “badness score”) and then killing the worst process. Often the worst process is the process using the most memory and, very likely, a key process on a server, such as the mail service or the web server.
You can search log files (typically either /var/log/messages or /var/log/kern.log) to find evidence—like the following message—that the OOM killer has struck:
host kernel: Out of Memory: Killed process 1466 (httpd).
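Searching for this message can be scripted. The sketch below lists the PID and name of each killed process. The function name oom_victims is hypothetical, and the pattern assumes messages worded like the example above; the exact wording can differ between kernel versions.

```shell
#!/usr/bin/env bash
# oom_victims (hypothetical helper): print "PID name" for each OOM kill found
# in the files named as arguments, or in stdin when no file is given.
oom_victims() {
    grep -i 'out of memory: killed process' "$@" |
        sed -n 's/.*[Kk]illed process \([0-9][0-9]*\) (\([^)]*\)).*/\1 \2/p'
}

# Sample message copied from the text above.
printf '%s\n' 'host kernel: Out of Memory: Killed process 1466 (httpd).' | oom_victims
```

On a live system the function could be run against a log file, for example `oom_victims /var/log/messages`.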
Note
There are methods of configuring the OOM killer (using kernel parameters), but they are beyond the scope of the Linux+ exam. For the exam, you should be aware of the function of the OOM killer and how you can determine if the OOM killer has taken action.
The operating system uses swap space when available RAM runs low. Data in RAM not currently being used is “swapped” to the hard drive to make room for other processes to use RAM.
Typically you create a swap partition as part of the installation process. At some point after installation, you may decide to add additional swap space. This can be in the form of another swap partition or a swap file.
To see your currently active swap devices, execute the swapon command with the -s option:
[root@localhost ~]# swapon -s
Filename                Type            Size    Used    Priority
/dev/dm-1               partition       1048568 27100   -1
From the output of the swapon -s command, you can see the name of the device (/dev/dm-1) that holds the swap filesystem, the size (in kilobytes) of the swap filesystem, and how much of this space has been used. The priority indicates which swap filesystem should be used first.
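The Size and Used columns can be combined into a usage percentage for each swap device. This is a minimal sketch; swap_usage is a hypothetical helper name, and the sample is the swapon -s output shown above.

```shell
#!/usr/bin/env bash
# swap_usage (hypothetical helper): read "swapon -s" output on stdin and print
# each swap device with the percentage of its space that is in use.
swap_usage() {
    # NR > 1 skips the header line; $3 is Size and $4 is Used.
    awk 'NR > 1 && $3 > 0 { printf "%s %.1f%%\n", $1, $4 / $3 * 100 }'
}

# Sample output copied from the swapon -s command shown above.
swap_usage <<'EOF'
Filename                Type            Size    Used    Priority
/dev/dm-1               partition       1048568 27100   -1
EOF
```

For the sample data this prints `/dev/dm-1 2.6%`. The /proc/swaps file has the same layout, so `swap_usage < /proc/swaps` is an alternative to `swapon -s | swap_usage`.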
If you have an existing swap device, such as a swap file, you can add it to the currently used swap devices by using the swapon command. For example, the following command adds the /var/swap file:
[root@localhost ~]# swapon /var/swap
[root@localhost ~]# swapon -s
Filename                Type            Size    Used    Priority
/dev/dm-1               partition       1048568 27484   -1
/var/swap               file            51192   0       -2
To have this swap device enabled each time you boot the system, add a line like the following to the /etc/fstab file:
/var/swap swap swap defaults 0 0
If you decide that you want to manually remove a device from current swap space, use the swapoff command:
[root@localhost ~]# swapon -s
Filename                Type            Size    Used    Priority
/dev/dm-1               partition       1048568 27640   -1
/var/swap               file            51192   0       -2
[root@localhost ~]# swapoff /var/swap
[root@localhost ~]# swapon -s
Filename                Type            Size    Used    Priority
/dev/dm-1               partition       1048568 27640   -1
A swap device can be either a storage device (for example, partition, logical volume, RAID) or a file. Typically, storage devices are a bit quicker than swap files as the kernel doesn’t have to go through a filesystem to access swap space on a storage device. However, you aren’t always able to add a new device to the system, so you may find that swap files are more often used for secondary swap devices.
Assuming that you have already created a new partition of type 82 (Linux swap), which is /dev/sdb1 in this example, you can format it as a swap device by executing the following command:
[root@localhost ~]# mkswap /dev/sdb1
Setting up swapspace version 1, size = 203772 KiB
no label, UUID=039c21dc-c223-43c8-8176-da2f4d508402
Recall that you can add this to existing swap space with the swapon command:
[root@localhost ~]# swapon -s
Filename                Type            Size    Used    Priority
/dev/dm-1               partition       1048568 37636   -1
[root@localhost ~]# swapon /dev/sdb1
[root@localhost ~]# swapon -s
Filename                Type            Size    Used    Priority
/dev/dm-1               partition       1048568 37636   -1
/dev/sdb1               partition       203768  0       -2
Remember that the swapon command is only a temporary solution. Add an entry like the following to the /etc/fstab to make this a permanent swap device:
UUID=039c21dc-c223-43c8-8176-da2f4d508402 swap swap defaults 0 0
To create a swap file, you first need to create a large file. This is most easily accomplished by using the dd command. Example 16-3 demonstrates how to create a 200MB file named /var/extra_swap and then enable it as swap space:
Example 16-3 Creating a File and Enabling Swap Space
[root@localhost ~]# dd if=/dev/zero of=/var/extra_swap bs=1M count=200
200+0 records in
200+0 records out
209715200 bytes (210 MB) copied, 4.04572 s, 51.8 MB/s
[root@localhost ~]# mkswap /var/extra_swap
mkswap: /var/extra_swap: warning: don't erase bootbits sectors on whole disk. Use -f to force.
Setting up swapspace version 1, size = 204796 KiB
no label, UUID=44469984-0a99-45af-94b7-f7e97218d67a
[root@localhost ~]# swapon /var/extra_swap
[root@localhost ~]# swapon -s
Filename                Type            Size    Used    Priority
/dev/dm-1               partition       1048568 37664   -1
/dev/sdb1               partition       203768  0       -2
/var/extra_swap         file            204792  0       -3
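The same sequence can be wrapped in a small script. The sketch below defaults to a dry run that only prints the commands, so it can be reviewed before anything is changed. The names run, make_swap_file, and DRY_RUN are hypothetical, and the chmod step is an addition that does not appear in Example 16-3: it restricts the file to the root user, which current versions of swapon expect.

```shell
#!/usr/bin/env bash
# DRY_RUN=1 (the default in this sketch) only prints each command.
# Set DRY_RUN=0 and run as root to actually execute the commands.
DRY_RUN="${DRY_RUN:-1}"

# run (hypothetical helper): print or execute a command depending on DRY_RUN.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# make_swap_file (hypothetical helper):
#   $1 = path of the swap file, $2 = size in MiB
make_swap_file() {
    run dd if=/dev/zero of="$1" bs=1M count="$2"
    run chmod 600 "$1"   # addition to the example above: root-only access
    run mkswap "$1"
    run swapon "$1"
}

make_swap_file /var/extra_swap 200
```

As with the manual steps, the swap file is only active until the next reboot unless a matching entry is added to the /etc/fstab file.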
A variety of troubleshooting topics related to hardware devices are listed under Objective 4.4: “Given a scenario, analyze and troubleshoot application and hardware issues” for the Linux+ exam. The subsections in this section are taken directly from the exam objective.
Most of the useful information regarding troubleshooting memory issues is covered earlier in this chapter, in the “Memory Monitoring and Configuration” section. In addition to that information, always consider compatibility issues with any hardware, including memory sticks.
The printer service Common Unix Printing System (CUPS) is covered in Chapter 15, “Linux Devices.” Typically, the biggest issue regarding troubleshooting new printers is related to printer drivers; consult the cups.org website for new drivers.
Other common issues include
Paper jams: Read the user manual for the printer to learn how to clear paper from the printer.
Disabled printer queue: Consult Chapter 15 to learn how to change the printer state.
Invalid print jobs: Consult Chapter 15 to learn how to remove print jobs from the print queue.
Note
Chapter 15 also contains an entire section on troubleshooting printer problems.
Normally video hardware issues stem from broken hardware devices (such as the monitor, the video card, or monitor cables) or compatibility issues, which often arise when newer hardware is used. For compatibility issues, consult the website of your X server for possible new drivers.
A GPU (graphics processing unit) is a device used to process data related to the graphics card (also known as a video card). Typically such a device needs a driver in order to work correctly. This is especially true for any newer GPU. Consult the website of the vendor that provided the GPU for details.
A complete discussion of GPU management is beyond the scope of the Linux+ exam. If you want to learn more about this topic, visit https://www.kernel.org/doc/html/v4.20/gpu/index.html (replacing v4.20 in this URL with the current version of the kernel on your system for the most accurate documentation).
Often the term communications port refers to network ports used by services, but because this topic is related to hardware troubleshooting, in this case it refers to I/O ports, or “comm ports.” I/O ports are used to communicate with devices like your keyboard, mouse, and terminal devices.
To display your I/O ports, view the /proc/ioports file, shown in Example 16-4.
Example 16-4 Viewing I/O Ports
[root@localhost ~]# cat /proc/ioports
0000-001f : dma1
0020-0021 : pic1
0040-0043 : timer0
0050-0053 : timer1
0060-0060 : keyboard
0064-0064 : keyboard
0070-0071 : rtc0
0080-008f : dma page reg
00a0-00a1 : pic2
00c0-00df : dma2
00f0-00ff : fpu
0170-0177 : 0000:00:01.1
  0170-0177 : ata_piix
01f0-01f7 : 0000:00:01.1
  01f0-01f7 : ata_piix
0376-0376 : 0000:00:01.1
  0376-0376 : ata_piix
03c0-03df : vga+
03f6-03f6 : 0000:00:01.1
  03f6-03f6 : ata_piix
03f8-03ff : serial
0cf8-0cff : PCI conf1
afe0-afe3 : ACPI GPE0_BLK
b000-b03f : 0000:00:01.3
  b000-b003 : ACPI PM1a_EVT_BLK
  b004-b005 : ACPI PM1a_CNT_BLK
  b008-b00b : ACPI PM_TMR
  b010-b015 : ACPI CPU throttle
b100-b10f : 0000:00:01.3
  b100-b107 : piix4_smbus
c000-c00f : 0000:00:01.1
  c000-c00f : ata_piix
c020-c03f : 0000:00:01.2
  c020-c03f : uhci_hcd
c040-c07f : 0000:00:03.0
  c040-c07f : e1000
c080-c0bf : 0000:00:05.0
  c080-c0bf : virtio-pci
c0c0-c0df : 0000:00:06.0
  c0c0-c0df : virtio-pci
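When only one device or driver is of interest, the file can be filtered by name. This is a minimal sketch that assumes the "range : name" layout shown in Example 16-4; ports_for is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# ports_for (hypothetical helper): read /proc/ioports-style lines on stdin and
# print the port ranges assigned to the given name.
#   $1 = name exactly as it appears after the " : " separator
ports_for() {
    awk -F' : ' -v name="$1" '$2 == name {
        gsub(/^[ \t]+/, "", $1)   # drop the indentation used for nested entries
        print $1
    }'
}

# Sample lines copied from Example 16-4.
ports_for keyboard <<'EOF'
0050-0053 : timer1
0060-0060 : keyboard
0064-0064 : keyboard
0070-0071 : rtc0
EOF
```

On a live system the equivalent call would be `ports_for keyboard < /proc/ioports`.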
Another technique to display hardware information is to execute the lsdev command, as shown in Example 16-5.
Example 16-5 The lsdev Command
[root@localhost ~]# lsdev
Device            DMA   IRQ  I/O Ports
------------------------------------------------
0000:00:01.1                 0170-0177 01f0-01f7 0376-0376 03f6-03f6 d000-d00f
0000:00:03.0                 d010-d017
0000:00:04.0                 d020-d03f
0000:00:05.0                 d100-d1ff d200-d23f
0000:00:0d.0                 d240-d247 d250-d257 d260-d26f
ACPI                         4000-4003 4004-4005 4008-400b 4020-4021
acpi                      9
ahci                         d240-d247 d250-d257 d260-d26f
ata_piix              14 15  0170-0177 01f0-01f7 0376-0376 03f6-03f6 d000-d00f
cascade             4
dma                          0080-008f
dma1                         0000-001f
dma2                         00c0-00df
e1000                        d010-d017
eth0                     19
fpu                          00f0-00ff
i8042                  1 12
Intel                        d100-d1ff d200-d23f
keyboard                     0060-0060 0064-0064
ohci_hcd:usb1            22
PCI                          0cf8-0cff
pic1                         0020-0021
pic2                         00a0-00a1
rtc0                      8  0070-0071
rtc_cmos                     0070-0071
snd_intel8x0             21
timer                     0
timer0                       0040-0043
timer1                       0050-0053
vboxguest                20
vesafb                       03c0-03df
The primary focus of the lsdev command is to display hardware direct memory access (DMA), I/O ports, and interrupts. This command gathers information from the /proc/dma, /proc/ioports, and /proc/interrupts files and displays the information in an easy-to-read format.
To view information about USB devices attached to a system, execute the lsusb command. If no devices are currently attached, the output should look like the following:
[root@localhost ~]# lsusb
Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
This output shows the root hub, which is essentially the USB ports. Even if you have multiple USB ports, you will probably see only one root hub. In some cases, you might see more than one; for example, if you have both USB 2.0 and USB 1.1 ports, each set of ports might show up as a root hub.
If you attached a device to your system, the output of the lsusb command would look something like the following:
[root@localhost ~]# lsusb
Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 001 Device 005: ID 058f:6387 Alcor Micro Corp. Flash Drive
The output shown here includes a vendor number (in this case 058f) and a product number (in this case 6387) that are useful when you use the -v option to the lsusb command. The -v option shows verbose information, which is not something that you want to see for every USB device because it would produce a large amount of output. Use the -d option and pass the vendor and product numbers as arguments to limit the verbose output to just a single USB device, as demonstrated in Example 16-6.
Example 16-6 The lsusb -v -d Command
[root@localhost ~]# lsusb -v -d 058f:6387
Bus 001 Device 005: ID 058f:6387 Alcor Micro Corp. Flash Drive
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               1.10
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0
  bDeviceProtocol         0
  bMaxPacketSize0        64
  idVendor           0x058f Alcor Micro Corp.
  idProduct          0x6387 Flash Drive
  bcdDevice            1.04
  iManufacturer           1 Generic
  iProduct                2 Mass Storage
  iSerial                 3 04316D18
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           32
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0
    bmAttributes         0x80
      (Bus Powered)
    MaxPower              100mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass         8 Mass Storage
      bInterfaceSubClass      6 SCSI
      bInterfaceProtocol     80 Bulk-Only
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x01  EP 1 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x82  EP 2 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
Device Status:     0x0000
  (Bus Powered)
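The vendor and product numbers can be extracted from the short lsusb listing so that they do not have to be copied by hand. This is a minimal sketch that assumes the "Bus ... Device ...: ID vendor:product ..." layout shown above; usb_ids is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# usb_ids (hypothetical helper): read lsusb output on stdin and print the
# vendor:product pair of each device, one per line.
usb_ids() {
    awk '$5 == "ID" { print $6 }'
}

# Sample output copied from the lsusb command shown above.
usb_ids <<'EOF'
Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 001 Device 005: ID 058f:6387 Alcor Micro Corp. Flash Drive
EOF
```

Each value printed can then be passed to the -d option, for example `lsusb -v -d 058f:6387`.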
Whenever you add any hardware device to a system, it can be helpful to view the contents of the /var/log/messages file on Red Hat–based systems or the /var/log/syslog file on Debian-based systems as this file shows you how the device was recognized by the kernel. For example, the following messages appeared when a USB thumb drive was added to a system:
[root@localhost ~]# tail /var/log/messages
Nov 10 12:50:15 localhost kernel: scsi10 : SCSI emulation for USB Mass Storage devices
Nov 10 12:50:16 localhost kernel: scsi 10:0:0:0: Direct-Access Generic Flash Disk 8.07 PQ: 0 ANSI: 2
Nov 10 12:50:16 localhost kernel: sd 10:0:0:0: Attached scsi generic sg8 type 0
Nov 10 12:50:16 localhost kernel: sd 10:0:0:0: [sdg] 7831552 512-byte logical blocks: (4.00 GB/3.73 GiB)
Nov 10 12:50:16 localhost kernel: sd 10:0:0:0: [sdg] Write Protect is off
Nov 10 12:50:16 localhost kernel: sd 10:0:0:0: [sdg] Assuming drive cache: write through
Nov 10 12:50:16 localhost kernel: sd 10:0:0:0: [sdg] Assuming drive cache: write through
Nov 10 12:50:16 localhost kernel: sdg: sdg1
Nov 10 12:50:16 localhost kernel: sd 10:0:0:0: [sdg] Attached SCSI removable disk
By looking at this output, you can determine what type of device was attached (Generic Flash Disk) and what device name it was given ([sdg]). The sdg: sdg1 line shows that the drive contains one partition, so you can then mount the /dev/sdg1 partition and access the data on the USB drive.
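Finding the device name in a busy log can be scripted. The sketch below prints the name from each "Attached SCSI ... disk" message; attached_disks is a hypothetical helper name, and the pattern assumes kernel messages worded like the ones shown above.

```shell
#!/usr/bin/env bash
# attached_disks (hypothetical helper): print the device name (for example sdg)
# from each "Attached SCSI ... disk" kernel message. It reads the files named
# as arguments, or stdin when no file is given.
attached_disks() {
    sed -n 's/.*\[\(sd[a-z]*\)\] Attached SCSI.*disk.*/\1/p' "$@"
}

# Sample messages copied from the log output shown above.
attached_disks <<'EOF'
Nov 10 12:50:16 localhost kernel: sd 10:0:0:0: [sdg] Write Protect is off
Nov 10 12:50:16 localhost kernel: sdg: sdg1
Nov 10 12:50:16 localhost kernel: sd 10:0:0:0: [sdg] Attached SCSI removable disk
EOF
```

On a live system the function could be run against the log file for the distribution, for example `attached_disks /var/log/messages`.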
Keyboard mapping is the process of making sure the keys on the keyboard match the actions that you want the keys to take. Use the xev command to display the keycode and key symbol that each key press generates, which tells you whether a key is mapped as expected.
One of the questions you will often need to answer is “Is the current hardware error a result of an actual hardware issue or an issue with the software that is used to access the hardware?” There is no hard-and-fast rule you can follow to determine the answer, but there are some things you can try in order to determine the source of the problem, including
Move the device to another system to see if the problem persists.
Try another similar hardware device.
Reinstall the software.
Upgrade the hardware drivers.
If you forget the root password or come across a system where the root password is unknown, the general steps for fixing this problem are as follows:
Step 1. Reboot the system to single user mode. (See the following section, “Single User Mode,” for details.)
Step 2. Mount the root filesystem.
Step 3. Manually edit the /etc/shadow file and remove the root password.
Step 4. Reboot the system and log in as the root user (no password required).
Step 5. Set the root password.
Note
The steps may vary depending on your distribution and other configuration settings. Consult your distro documentation for additional details.
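To make step 3 concrete, here is a minimal sketch of what "removing the root password" means in a shadow-format file: the second colon-separated field of the root entry is emptied. The function name blank_root_password is hypothetical, the hash in the sample is a made-up placeholder, and the sketch only prints the result rather than editing any file.

```shell
#!/bin/sh
# Hypothetical helper: print a shadow-format file with the root
# password field (the second colon-separated field) emptied.
# It writes to standard output and never modifies the input file.
blank_root_password() {
    sed 's/^root:[^:]*:/root::/' "$1"
}

# Demo on a throwaway sample, not on the real /etc/shadow.
sample=$(mktemp)
printf '%s\n' 'root:$6$examplesalt$examplehash:19000:0:99999:7:::' \
              'student:!:19000:0:99999:7:::' > "$sample"
blank_root_password "$sample"
rm -f "$sample"
```

On a real recovery, the same change would be made to /etc/shadow itself, ideally after saving a backup copy, and step 5 would then set a new password with the passwd command.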
Single user mode is the operating system level in which only the root user can log in. This level has limited functionality (typically no networking, graphical user interface, and so on). Administrators use this level to fix system boot problems or to recover the root password.
For details on how to get into single user mode, see Chapter 4, “The Boot Process.”
The Linux+ exam objectives list two hardware commands that you should know. The first one, dmidecode, displays a description of the system's hardware components as reported in the firmware's DMI (SMBIOS) table. Example 16-7 shows sample output from this command.
Example 16-7 The dmidecode Command
[root@localhost ~]# dmidecode
# dmidecode 3.1
Getting SMBIOS data from sysfs.
SMBIOS 2.5 present.
10 structures occupying 450 bytes.
Table at 0x000E1000.

Handle 0x0000, DMI type 0, 20 bytes
BIOS Information
    Vendor: innotek GmbH
    Version: VirtualBox
    Release Date: 12/01/2006
    Address: 0xE0000
    Runtime Size: 128 kB
    ROM Size: 128 kB
    Characteristics:
        ISA is supported
        PCI is supported
        Boot from CD is supported
        Selectable boot is supported
        8042 keyboard services are supported (int 9h)
        CGA/mono video services are supported (int 10h)
        ACPI is supported

Handle 0x0001, DMI type 1, 27 bytes
System Information
    Manufacturer: innotek GmbH
    Product Name: VirtualBox
    Version: 1.2
    Serial Number: 0
    UUID: DB8F323E-EFB2-4815-880C-86C6E52E5C09
    Wake-up Type: Power Switch
    SKU Number: Not Specified
    Family: Virtual Machine

Handle 0x0008, DMI type 2, 15 bytes
Base Board Information
    Manufacturer: Oracle Corporation
    Product Name: VirtualBox
    Version: 1.2
    Serial Number: 0
    Asset Tag: Not Specified
    Features:
        Board is a hosting board
    Location In Chassis: Not Specified
    Chassis Handle: 0x0003
    Type: Motherboard
    Contained Object Handles: 0

Handle 0x0003, DMI type 3, 13 bytes
Chassis Information
    Manufacturer: Oracle Corporation
    Type: Other
    Lock: Not Present
    Version: Not Specified
    Serial Number: Not Specified
    Asset Tag: Not Specified
    Boot-up State: Safe
    Power Supply State: Safe
    Thermal State: Safe
    Security Status: None

Handle 0x0007, DMI type 126, 42 bytes
Inactive

Handle 0x0005, DMI type 126, 15 bytes
Inactive

Handle 0x0006, DMI type 126, 28 bytes
Inactive

Handle 0x0002, DMI type 11, 7 bytes
OEM Strings
    String 1: vboxVer_5.1.34
    String 2: vboxRev_121010

Handle 0x0008, DMI type 128, 8 bytes
OEM-specific Type
    Header and Data:
        80 08 08 00 05 8D 27 00

Handle 0xFEFF, DMI type 127, 4 bytes
End Of Table
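Because the full dmidecode listing is long, it can help to pull out a single value. The following sketch reads dmidecode-style text on standard input and prints the value of the first matching key. The function name dmi_field is hypothetical, and the sketch assumes the indented "Key: value" layout shown in Example 16-7.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the first "Key: value" line
# whose key matches $1, reading dmidecode-style text on standard input.
dmi_field() {
    awk -v key="$1" '
        { sub(/^[ \t]+/, "") }              # drop leading indentation
        index($0, key ": ") == 1 {          # line starts with "Key: "
            print substr($0, length(key) + 3)
            exit
        }
    '
}

# Demo on lines taken from Example 16-7.
printf '%s\n' 'System Information' \
              '    Manufacturer: innotek GmbH' \
              '    Product Name: VirtualBox' \
  | dmi_field 'Product Name'
```

With root privileges, real output can be piped in, for example dmidecode -t system | dmi_field 'Product Name'. The dmidecode command also has its own -s option (such as -s system-product-name) that returns a single value directly.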
The other command that displays hardware information is lshw. This command produces a vast amount of information, and the output for the following command is limited to just the first 10 lines:
[root@localhost ~]# lshw | head
student-virtualbox
    description: Computer
    product: VirtualBox
    vendor: innotek GmbH
    version: 1.2
    serial: 0
    width: 64 bits
    capabilities: smbios-2.5 dmi-2.5 vsyscall32
    configuration: family=Virtual Machine uuid=DB8F323E-EFB2-4815-880C-86C6E52E5C09
  *-core
Many problems can occur with a system. Hardware issues do crop up, and to solve them, you must know which tools to use and what to look for. This chapter described these tools and gave you a starting point for diving into the process of fixing a hardware issue.
As mentioned in the section “How to Use This Book” in the Introduction, you have a couple of choices for exam preparation: the exercises here, Chapter 30, “Final Preparation,” and the exam simulation questions in the Pearson Test Prep Software Online.
Review the most important topics in this chapter, noted with the Key Topic icon in the outer margin of the page. Table 16-2 lists these key topics.
Table 16-2 Key Topics for Chapter 16
Key Topic Element | Description
---|---
Note | The iostat, du, df, fsck, and partprobe commands and the LVM tools
Paragraph | The ioping command
Section | I/O scheduling
Paragraph | The uptime command
Paragraph | Load average
Paragraph | The /proc/cpuinfo file
Paragraph | The iostat command
Paragraph | The sar command
Paragraph | The free command
Paragraph | The vmstat command
Paragraph | Buffer cache output
Paragraph | The /proc/meminfo file
Paragraph | Out of memory killer
Paragraph | The swapon command
Paragraph | The swapoff command
Paragraph | The mkswap command
Section | Memory
Section | Printers
Section | Video
Section | GPU Drivers
Section | Communications Ports
Section | USB
Section | Keyboard Mapping
Section | Hardware or Software
Section | Lost Root Password
Section | Single User Mode
Paragraph | The dmidecode and lshw commands
Define the following key terms from this chapter and check your answers in the glossary:
Universal Serial Bus
The answers to these review questions are in Appendix A.
1. Which option to the lsusb command displays detailed information about USB devices?
a. -a
b. --all
c. -d
d. -v
2. Which of the following commands can display disk utilization data? (Choose all that apply.)
a. iostat
b. pstree
c. netstat
d. sar
3. The uptime command provides CPU load average over what time periods?
a. 1, 3, and 5 minutes
b. 5, 10, and 15 minutes
c. 1, 5, and 15 minutes
d. 10, 20, and 30 minutes
4. Which column from the output of the iostat -c command provides the percentage of time that the CPU is not handling requests?
a. %empty
b. %null
c. %inactive
d. %idle
5. Which of the following commands would display CPU statistics every 2 seconds for a total of 4 times?
a. iostat -c 2 4
b. iostat -c 4 2
c. iostat -d 2 4
d. iostat -d 4 2
6. Which command displays CPU statistics like the iostat command but displays historical values, not present values?
a. free
b. sar
c. vmstat
d. top
7. What data does the free command display? (Choose all that apply.)
a. RAM
b. Swap space
c. CPU
d. Network data
8. Which option to the swapon command is used to display currently used swap space?
_______________________________________
9. What command removes a swap device from current use?
_______________________________________
10. What file in the /proc/ folder contains details about the CPU?
_______________________________________