Before we can use a device, we need to create a device file for it. This harkens back to the notion that “everything in UNIX is a file.” Some devices format the data stored on them; a filesystem, for example, formats data for ease of management and better performance when reading and writing. Whether the data is formatted or not, the device file concept makes communicating with devices very simple: open the device file, and start reading or writing. As a CSA, you will be familiar with concepts such as hardware paths and the ioscan command. Here we look at some situations where communicating with peripherals is a little more involved: we restructure our IO tree using ioinit; we discuss how a switched-fabric Fibre Channel SAN affects the device files for disks; and we conclude by looking at OLA/R, the ability to replace a PCI interface card while the system is still running.
The motivation for reorganizing your IO tree is normally to standardize the configuration of a collection of servers. Managing a multitude of machines that are fundamentally the same type, with the same number and type of interface cards, is hard enough. When we throw a different IO tree into the mix, we introduce multiple sets of device files that all need interpreting. Take a situation where we have 16 servers all sharing a set of LUNs configured on a disk array. If each server has a different IO tree, we will have 16 different sets of device files all relating to the same devices. Standardizing device files can make managing a large collection of machines much easier. Some administrators I know will simply use mknod to create a device file that mirrors a device file on another node. There is nothing wrong with this, except it is another task we need to remember whenever we add an interface card or device to our systems.
A better approach is to standardize the underlying Instance numbers for interfaces. That way, when we add another device, HP-UX will automatically create the correct device files on all nodes. As well as standardizing the IO tree, we will probably want to standardize the process of modifying a given hardware configuration in the future; it is usually the process of adding and removing hardware components that disrupts an orderly grouping of Instance numbers. The strongest case for standardizing your IO tree arises when we have fundamentally the same hardware configuration on each server. This is not absolutely essential, but it does make life slightly easier when we have the same number of Instance numbers to manage. With dissimilar numbers of devices, we will be standardizing the common devices between the machines, while unique devices on a given server follow their own convention for Instance numbers; as you can see, having similar hardware configurations is desirable.
The process itself is not difficult, but the consequences of the change can be quite dramatic. When we reassign Instance numbers, we are creating a whole new list of device files. We need to document VERY CAREFULLY the current device files and the hardware paths they map to. We also need to record which device files are currently in use, e.g., which disk device files are currently members of volume groups. After the new configuration is in place, we will need to rework any applications, e.g., LVM, networking, and so on, to reference the new device files. Ideally, we would remap Instance numbers/device files before performing any system configuration, but life isn't always that easy. Here's a cookbook of the steps involved in reorganizing your IO tree and hence remapping your Instance numbers:
Consider making a System Recovery Tape.
Collect IO trees from all nodes concerned.
Decide on the format of the standardized IO tree.
Document the current device file—hardware path mapping.
Establish which system and user applications use current device files.
Create an ASCII file representing the new IO tree.
Shut down the system(s) to single user mode.
Apply the new IO tree configuration with the ioinit
command.
Reboot the system to single user mode.
Check that all new device files are created correctly.
Rework any user or system applications affected by the change in device file names.
Remove all old device files.
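Steps 2 through 4 of the cookbook lend themselves to scripting. The sketch below is illustrative only: the node names and the cut-down three-column excerpts (class, instance, hardware path) are hypothetical stand-ins for real ioscan -kf output, written to temporary files so the comparison logic can be shown end to end. Keying each line by hardware path lets diff expose exactly which slots carry different Instance numbers on each node.

```shell
#!/bin/sh
# Hypothetical cut-down ioscan excerpts from two nodes, modelled on the
# hpeos003/hpeos004 examples: "<class> <instance> <hw path>" per line.
cat > /tmp/ioscan.hpeos004 <<'EOF'
ext_bus 0 0/0/1/0
ext_bus 2 0/0/2/0
ext_bus 4 0/6/0/0
EOF
cat > /tmp/ioscan.hpeos003 <<'EOF'
ext_bus 0 0/0/1/0
ext_bus 6 0/0/2/0
ext_bus 5 0/6/0/0
EOF

# Re-key each line by hardware path so the same physical slot lines up
# on both nodes regardless of Instance number.
for node in hpeos003 hpeos004
do
    sort -k3,3 /tmp/ioscan.$node | awk '{ print $3, $1, $2 }' \
        > /tmp/iotree.$node
done

# Any line diff reports is a hardware path whose Instance number (or
# class) differs between the two nodes.
diff /tmp/iotree.hpeos003 /tmp/iotree.hpeos004 > /tmp/iotree.diff
[ -s /tmp/iotree.diff ] && echo "IO trees differ; review /tmp/iotree.diff"
```

On a real system, the input files would come from running ioscan -kf on each node and trimming the output, not from heredocs.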
I have (only) two systems that have mismatched IO trees. They both have identical hardware configurations; it's just that one of them has had numerous interface cards inserted and removed over the years. Consequently, the sequence of Instance numbers is dissimilar. Both nodes are connected to the same disk array, so I am going to attempt to ensure they have the same IO tree by the end of this process and hence the same device files mapping to the same devices.
As with any major operating system configuration change, it is worthwhile to have a consistent backup of your operating system, just in case. It is worth considering commands such as make_[tape|net]_recovery in order to re-establish the current system configuration as quickly as possible should something unexpected and catastrophic happen.
This can be as simple as an ioscan -f command. What we are trying to establish is which system(s) have dissimilar IO trees. On my systems, the problem lies with my interface cards and the disks attached to them. This part of the investigation can take some time on large systems, as you have to wade through the output of multiple ioscan commands. On node hpeos004, the sequence of Instance numbers is simple and straightforward:
root@hpeos004[] ioscan -fknC ext_bus
Class I H/W Path Driver S/W State H/W Type Description
=================================================================
ext_bus 0 0/0/1/0 c720 CLAIMED INTERFACE SCSI C896 Ultra Wide LVD
ext_bus 1 0/0/1/1 c720 CLAIMED INTERFACE SCSI C896 Ultra Wide Single-Ended
ext_bus 2 0/0/2/0 c720 CLAIMED INTERFACE SCSI C87x Fast Wide Single-Ended
ext_bus 3 0/0/2/1 c720 CLAIMED INTERFACE SCSI C87x Ultra Wide Single-Ended
ext_bus 4 0/6/0/0 c720 CLAIMED INTERFACE SCSI C896 Ultra2 Wide LVD
ext_bus 5 0/6/0/1 c720 CLAIMED INTERFACE SCSI C896 Ultra2 Wide LVD
root@hpeos004[]
I am using the -k option to ioscan throughout, as I am not expecting any new devices to appear during this process, and it is much quicker than actually probing every hardware path, especially if you have Fibre Channel attached disk arrays. On our other node, hpeos003, the IO tree is somewhat of a mess:
root@hpeos003[] ioscan -fknC ext_bus
Class I H/W Path Driver S/W State H/W Type Description
=================================================================
ext_bus 0 0/0/1/0 c720 CLAIMED INTERFACE SCSI C896 Ultra Wide LVD
ext_bus 1 0/0/1/1 c720 CLAIMED INTERFACE SCSI C896 Ultra Wide Single-Ended
ext_bus 6 0/0/2/0 c720 CLAIMED INTERFACE SCSI C87x Fast Wide Single-Ended
ext_bus 7 0/0/2/1 c720 CLAIMED INTERFACE SCSI C87x Ultra Wide Single-Ended
ext_bus 5 0/6/0/0 c720 CLAIMED INTERFACE SCSI C896 Ultra Wide LVD
ext_bus 4 0/6/0/1 c720 CLAIMED INTERFACE SCSI C896 Ultra Wide LVD
root@hpeos003[]
We should also check how the devices connected to each of these interfaces have been configured. We know, through our knowledge of these systems, that they are connected to a shared disk array. It is quite possible that one system can access more devices on that array than the other, due to some form of LUN security. If we are attempting to standardize the Instance numbers in the IO tree, we should at least perform the task for every affected device, not only the interface cards.
root@hpeos004[] ioscan -fkH 0/6/0/0
Class I H/W Path Driver S/W State H/W Type Description
======================================================================
ext_bus 4 0/6/0/0 c720 CLAIMED INTERFACE SCSI C896 Ultra2 Wide LVD
target 6 0/6/0/0.7 tgt CLAIMED DEVICE
ctl 4 0/6/0/0.7.0 sctl CLAIMED DEVICE Initiator
target 15 0/6/0/0.8 tgt CLAIMED DEVICE
disk 2 0/6/0/0.8.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
target 16 0/6/0/0.9 tgt CLAIMED DEVICE
disk 3 0/6/0/0.9.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
target 17 0/6/0/0.10 tgt CLAIMED DEVICE
disk 4 0/6/0/0.10.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
target 18 0/6/0/0.11 tgt CLAIMED DEVICE
disk 5 0/6/0/0.11.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
target 19 0/6/0/0.12 tgt CLAIMED DEVICE
disk 6 0/6/0/0.12.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
target 20 0/6/0/0.13 tgt CLAIMED DEVICE
disk 7 0/6/0/0.13.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
target 21 0/6/0/0.14 tgt CLAIMED DEVICE
disk 8 0/6/0/0.14.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
target 22 0/6/0/0.15 tgt CLAIMED DEVICE
ctl 6 0/6/0/0.15.0 sctl CLAIMED DEVICE HP A6491A
root@hpeos004[]
In the case of disk drives, the Instance number of the disk itself is not used in the device file name, but I think we should standardize this part of the IO tree as well. Here are the Instance numbers for the attached disks on node hpeos003:
root@hpeos003[] ioscan -fkH 0/6/0/0
Class I H/W Path Driver S/W State H/W Type Description
======================================================================
ext_bus 5 0/6/0/0 c720 CLAIMED INTERFACE SCSI C896 Ultra Wide LVD
target 14 0/6/0/0.6 tgt CLAIMED DEVICE
ctl 5 0/6/0/0.6.0 sctl CLAIMED DEVICE Initiator
target 15 0/6/0/0.8 tgt CLAIMED DEVICE
disk 2 0/6/0/0.8.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
target 16 0/6/0/0.9 tgt CLAIMED DEVICE
disk 3 0/6/0/0.9.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
target 17 0/6/0/0.10 tgt CLAIMED DEVICE
disk 24 0/6/0/0.10.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
target 18 0/6/0/0.11 tgt CLAIMED DEVICE
disk 25 0/6/0/0.11.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
target 19 0/6/0/0.12 tgt CLAIMED DEVICE
disk 26 0/6/0/0.12.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
target 20 0/6/0/0.13 tgt CLAIMED DEVICE
disk 27 0/6/0/0.13.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
target 21 0/6/0/0.14 tgt CLAIMED DEVICE
disk 28 0/6/0/0.14.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
target 22 0/6/0/0.15 tgt CLAIMED DEVICE
ctl 6 0/6/0/0.15.0 sctl CLAIMED DEVICE HP A6491A
root@hpeos003[]
As you can see, they are somewhat different from the Instance numbers on node hpeos004. This disparity is simply due to the fact that this machine was used for testing new interface cards and devices. When we added a new card, we simply installed it and never worried about the consequences for device file names.
In our case, the answer is relatively simple: We are going to use the sequence of Instance numbers we find on node hpeos004. Where you have significant differences between nodes, the process of establishing a consistent numbering sequence can be somewhat time consuming. When remapping Instance numbers, we cannot use an Instance number that is currently assigned to an existing device; this can significantly increase the number of affected devices.
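That constraint is worth checking mechanically before handing anything to ioinit. The sketch below is a first-pass sanity check under hypothetical sample data: it flags any proposed class/Instance pair that is currently assigned to a different hardware path. (A pair flagged here may still be fine if that same run also moves the current owner elsewhere, so treat hits as items to review, not hard errors.)

```shell
#!/bin/sh
# Hypothetical current mapping, "<class> <instance> <hw path>" per line
# (as could be trimmed out of ioscan -kf output).
cat > /tmp/current.map <<'EOF'
ext_bus 0 0/0/1/0
ext_bus 6 0/0/2/0
ext_bus 5 0/6/0/0
EOF
# Proposed ioinit input: "<hw path> <class> <instance>".
cat > /tmp/iotree.new <<'EOF'
0/0/2/0 ext_bus 2
0/6/0/0 ext_bus 4
0/6/0/1 ext_bus 6
EOF

# Pass 1 records which hardware path currently owns each class/instance
# pair; pass 2 flags proposed pairs owned by a DIFFERENT path.
awk 'NR==FNR { owner[$1,$2] = $3; next }
     (($2,$3) in owner) && owner[$2,$3] != $1 {
         print "conflict:", $2, $3, "currently assigned to", owner[$2,$3]
     }' /tmp/current.map /tmp/iotree.new
```

Here the proposed `0/6/0/1 ext_bus 6` is flagged because Instance 6 currently belongs to 0/0/2/0.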
When we apply our new configuration, a plethora of new device files will be created. It is a good idea to print out a complete listing from ioscan -fn to ensure that you know exactly what you had before you started. Another good idea is to keep a backup copy of the files /etc/ioconfig and /stand/ioconfig, which hold your current IO tree with hardware path to Instance number mappings. Both files should be the same. The file /stand/ioconfig is accessible at boot time, even for NFS-diskless clients, and the kernel IO tree is initialized from it. Once booted, the kernel IO tree is updated from /etc/ioconfig. If this file is missing at boot time, the sysinit process /sbin/ioinitrc (run from /etc/inittab) will bring the system to single user mode, whereby we can restore /etc/ioconfig from a backup tape or recreate it using /sbin/ioinit -c.
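A small sketch of that precaution: take dated copies of both ioconfig files and confirm they agree before changing anything. The real files live at /etc/ioconfig and /stand/ioconfig; the variables below point at throwaway stand-in copies purely so the sketch can run anywhere.

```shell
#!/bin/sh
# Stand-in paths so this sketch is runnable outside HP-UX; on a real
# system these would be /etc/ioconfig and /stand/ioconfig.
ETC_IOCONFIG=/tmp/etc_ioconfig
STAND_IOCONFIG=/tmp/stand_ioconfig
printf 'fake-ioconfig-data\n' > $ETC_IOCONFIG
cp $ETC_IOCONFIG $STAND_IOCONFIG

# The two files should always agree; investigate before proceeding if not.
if cmp -s $ETC_IOCONFIG $STAND_IOCONFIG
then
    DATE=$(date +%Y%m%d)
    cp $ETC_IOCONFIG $ETC_IOCONFIG.$DATE
    cp $STAND_IOCONFIG $STAND_IOCONFIG.$DATE
    echo "ioconfig files match; backups taken"
else
    echo "WARNING: ioconfig files differ" >&2
fi
```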
This part of the process can be the most time consuming and the scariest. In my case, we will be recreating device files for LVM disks. This means I need to establish which disks are affected, because when the system tries to activate a particular volume group, it will need to know about the new device files. In the case of LVM, I am going to have to vgexport and then vgimport all affected volume groups once the new device files have been created. Ensure that you consider every possible system and user application that might refer to a device file. I suppose if you forget one, you will know soon enough when that particular application stops working, but it doesn't do your résumé any good to make changes to a system that have catastrophic consequences for user applications. Here I can see that a particular disk device file does relate to the same device, as I can read the LVM header off the disk.
root@hpeos004[] echo "0x2008?4D" | adb /dev/dsk/c4t8d0
2008: 2007116332 1016986841 2007116332 1016986870
root@hpeos004[]
If this is a disk used by both systems, when I read the LVM header off the disk from the other node, I should see the same information even though it uses a different device file.
root@hpeos003[] echo "0x2008?4D" | adb /dev/dsk/c5t8d0
2008: 2007116332 1016986841 2007116332 1016986870
root@hpeos003[]
Remember that we could quite happily exist like this with different device files pointing to the same hardware; it's just that it might make our ever-busier lives easier if they were the same.
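Because the LVM header survives the renaming, it makes a useful fingerprint for each disk. The sketch below records a header per device into a simple table before the remap; afterward, regenerating the same table lets you match old names to new by header. The read_lvm_header function and its data are hypothetical stand-ins for the real `echo "0x2008?4D" | adb /dev/dsk/<disk>` invocation.

```shell
#!/bin/sh
# read_lvm_header: stand-in for reading the LVM header with adb, e.g.
#   echo "0x2008?4D" | adb /dev/dsk/$1
# The values returned here are fake, purely for illustration.
read_lvm_header()
{
    case $1 in
        c4t8d0) echo "2007116332 1016986841 2007116332 1016986870" ;;
        c4t9d0) echo "2007116332 1016986841 2007116399 1016986901" ;;
    esac
}

# Build a "<device> <header words>" table to keep with the documentation.
for disk in c4t8d0 c4t9d0
do
    echo "$disk $(read_lvm_header $disk)"
done > /tmp/lvm_headers.before
cat /tmp/lvm_headers.before
```

After the Instance numbers change, the same loop run over the new device files produces a second table; joining the two on the header columns maps each old device file to its new name.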
This is where we need to work out which devices are affected and construct a file that reflects the new configuration. First, we construct a file in the correct format for the job. The format of this file needs to be:
<hardware path> <class> <instance number>
This is relatively simple to do with an ioscan command and some fancy footwork with awk:
root@hpeos003[] ioscan -kf| tail +3 | awk '{print $3" "$1" "$2}' > /tmp/iotree
root@hpeos003[]
We can now edit this file so that it contains only the devices whose Instance numbers we want to change. Here's the file I constructed for node hpeos003:
root@hpeos003[] cat /tmp/iotree
0/0/2/0 ext_bus 2
0/0/2/1 ext_bus 3
0/6/0/0 ext_bus 4
0/6/0/0.10.0 disk 4
0/6/0/0.11.0 disk 5
0/6/0/0.12.0 disk 6
0/6/0/0.13.0 disk 7
0/6/0/0.14.0 disk 8
0/6/0/1 ext_bus 5
root@hpeos003[]
The order of individual lines is not important; just ensure that you include all affected devices. We will need to perform the same task on all affected nodes.
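Since a malformed line will simply be rejected (or worse, misread) by ioinit, a quick format check of the edited file is cheap insurance. The sketch below validates a hypothetical sample copy of the file; on a real system you would point it at the actual /tmp/iotree.

```shell
#!/bin/sh
# Hypothetical sample of the edited ioinit input file.
cat > /tmp/iotree.check <<'EOF'
0/0/2/0 ext_bus 2
0/6/0/0.10.0 disk 4
0/6/0/1 ext_bus 5
EOF

# Every line must be exactly "<hw path> <class> <instance>", with a
# numeric instance; report any line that breaks the pattern.
awk 'NF != 3 || $3 !~ /^[0-9]+$/ {
         print "bad line " FNR ": " $0; bad = 1
     }
     END { exit bad }' /tmp/iotree.check && echo "iotree format OK"
```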
It's a good idea to shut the system down to single user mode to ensure that all user and non-essential system applications are stopped. Technically, this is not necessary, but the command that actually applies the new configuration may have to reboot the system for the changes to take effect. In doing so, it will issue a reboot command, so being in single user mode is probably a wise move.
Now the moment of truth: The ioinit command will attempt to remap the Instance numbers listed in the file /etc/ioconfig based on the content of our file /tmp/iotree. For these changes to take effect, we may have to perform a reboot. In reality, we will probably have to reboot in every case, as the kernel IO tree has not yet been directly affected by these changes. This is why we use the -r option to the ioinit command, which will perform the reboot if the changes we make warrant one. We can see the changes made by ioinit if we can decipher the binary nonsense that is the /etc/ioconfig file. I have included a program called dump_ioconfig in Appendix B that reads the ioconfig file, and I will use it in this demonstration.
Let's run the ioinit command with the /tmp/iotree file we created earlier:
root@hpeos003[] ioscan -fnkC ext_bus
Class I H/W Path Driver S/W State H/W Type Description
=================================================================
ext_bus 0 0/0/1/0 c720 CLAIMED INTERFACE SCSI C896 Ultra Wide LVD
ext_bus 1 0/0/1/1 c720 CLAIMED INTERFACE SCSI C896 Ultra Wide Single-Ended
ext_bus 6 0/0/2/0 c720 CLAIMED INTERFACE SCSI C87x Fast Wide Single-Ended
ext_bus 7 0/0/2/1 c720 CLAIMED INTERFACE SCSI C87x Ultra Wide Single-Ended
ext_bus 5 0/6/0/0 c720 CLAIMED INTERFACE SCSI C896 Ultra Wide LVD
ext_bus 4 0/6/0/1 c720 CLAIMED INTERFACE SCSI C896 Ultra Wide LVD
root@hpeos003[] ioinit -f /tmp/iotree
root@hpeos003[] ~/dump_ioconfig | grep ext_bus
ext_bus 0 0 0 1 0 c720
ext_bus 1 0 0 1 1 c720
ext_bus 2 0 0 2 0 c720
ext_bus 3 0 0 2 1 c720
ext_bus 4 0 6 0 0 c720
ext_bus 5 0 6 0 1 c720
root@hpeos003[]
As we can see from the output of dump_ioconfig, the Instance numbers in the /etc/ioconfig file have changed; the Instance number is the second field from the left. If we now run ioinit again, it will first notice that the changes in the file /tmp/iotree have already been made; but this time, with the -r option, ioinit will realize that the kernel is not consistent with the /etc/ioconfig file and call a reboot:
root@hpeos003[] ioinit -r -f /tmp/iotree
ioinit: Input is identical to kernel, line ignored
Input line 1: 0/0/2/0 ext_bus 2
ioinit: Input is identical to kernel, line ignored
Input line 2: 0/0/2/1 ext_bus 3
ioinit: Input is identical to kernel, line ignored
Input line 3: 0/6/0/0 ext_bus 4
ioinit: Input is identical to kernel, line ignored
Input line 4: 0/6/0/0.10.0 disk 4
ioinit: Input is identical to kernel, line ignored
Input line 5: 0/6/0/0.11.0 disk 5
ioinit: Input is identical to kernel, line ignored
Input line 6: 0/6/0/0.12.0 disk 6
ioinit: Input is identical to kernel, line ignored
Input line 7: 0/6/0/0.13.0 disk 7
ioinit: Input is identical to kernel, line ignored
Input line 8: 0/6/0/0.14.0 disk 8
ioinit: Input is identical to kernel, line ignored
Input line 9: 0/6/0/1 ext_bus 5
ioinit:Rebooting the system to reassign instance numbers
Shutdown at 14:46 (in 0 minutes)
*** FINAL System shutdown message from root@hpeos003 ***
System going down IMMEDIATELY
System shutdown time has arrived
reboot: redirecting error messages to /dev/console
The reason I am suggesting rebooting to single user mode is simply to check that all necessary device files have been created. In our case, I would suspect that insf will not create our new device files, as the hardware paths for these devices already exist, i.e., they are not new devices.
It also gives us the opportunity to rework any user and system applications before they start up and consequently fail due to the change in device file names.
As part of the /sbin/ioinitrc startup script, ioinit -i -r is run, which invokes insf to create any new device files. Our devices already exist as far as their hardware paths are concerned, so they are not regarded as new devices. We have to supply the -e option to insf to create device files for devices that existed previously, i.e., that already have an Instance number.
root@hpeos003[] ioscan -fnkC disk
Class I H/W Path Driver S/W State H/W Type Description
======================================================================
disk 9 0/0/1/0.0.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
/dev/dsk/c0t0d0 /dev/rdsk/c0t0d0
disk 10 0/0/1/0.1.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
/dev/dsk/c0t1d0 /dev/rdsk/c0t1d0
disk 11 0/0/1/0.2.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
/dev/dsk/c0t2d0 /dev/rdsk/c0t2d0
disk 12 0/0/1/0.3.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
/dev/dsk/c0t3d0 /dev/rdsk/c0t3d0
disk 13 0/0/1/0.4.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
/dev/dsk/c0t4d0 /dev/rdsk/c0t4d0
disk 14 0/0/1/0.5.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
/dev/dsk/c0t5d0 /dev/rdsk/c0t5d0
disk 15 0/0/1/0.6.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
/dev/dsk/c0t6d0 /dev/rdsk/c0t6d0
disk 0 0/0/1/1.15.0 sdisk CLAIMED DEVICE HP 36.4GST336753LC
/dev/dsk/c1t15d0 /dev/rdsk/c1t15d0
disk 1 0/0/2/1.15.0 sdisk CLAIMED DEVICE HP 36.4GST336753LC
disk 2 0/6/0/0.8.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
disk 3 0/6/0/0.9.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
disk 4 0/6/0/0.10.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
disk 5 0/6/0/0.11.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
disk 6 0/6/0/0.12.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
disk 7 0/6/0/0.13.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
disk 8 0/6/0/0.14.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
root@hpeos003[]
You can see that we are missing our device files. We will create them manually with insf -ve:
root@hpeos003[] insf -ve
insf: Installing special files for btlan instance 0 address 0/0/0/0
insf: Installing special files for sdisk instance 9 address 0/0/1/0.0.0
insf: Installing special files for sdisk instance 10 address 0/0/1/0.1.0
insf: Installing special files for sdisk instance 11 address 0/0/1/0.2.0
insf: Installing special files for sdisk instance 12 address 0/0/1/0.3.0
insf: Installing special files for sdisk instance 13 address 0/0/1/0.4.0
insf: Installing special files for sdisk instance 14 address 0/0/1/0.5.0
insf: Installing special files for sdisk instance 15 address 0/0/1/0.6.0
insf: Installing special files for sctl instance 0 address 0/0/1/0.7.0
insf: Installing special files for sctl instance 7 address 0/0/1/0.15.0
insf: Installing special files for sctl instance 1 address 0/0/1/1.7.0
insf: Installing special files for sdisk instance 0 address 0/0/1/1.15.0
insf: Installing special files for sctl instance 4 address 0/0/2/0.7.0
        making rscsi/c2t7d0 c 203 0x027000
insf: Installing special files for sctl instance 5 address 0/0/2/1.7.0
        making rscsi/c3t7d0 c 203 0x037000
insf: Installing special files for sdisk instance 8 address 0/0/2/1.15.0
        making dsk/c3t15d0 b 31 0x03f000
        making rdsk/c3t15d0 c 188 0x03f000
insf: Installing special files for asio0 instance 0 address 0/0/4/1
insf: Installing special files for btlan instance 1 address 0/2/0/0/4/0
insf: Installing special files for btlan instance 2 address 0/2/0/0/5/0
insf: Installing special files for btlan instance 3 address 0/2/0/0/6/0
insf: Installing special files for btlan instance 4 address 0/2/0/0/7/0
insf: Installing special files for sctl instance 2 address 0/6/0/0.6.0
        making rscsi/c4t6d0 c 203 0x046000
insf: Installing special files for sdisk instance 1 address 0/6/0/0.8.0
        making dsk/c4t8d0 b 31 0x048000
        making rdsk/c4t8d0 c 188 0x048000
insf: Installing special files for sdisk instance 2 address 0/6/0/0.9.0
        making dsk/c4t9d0 b 31 0x049000
        making rdsk/c4t9d0 c 188 0x049000
insf: Installing special files for sdisk instance 3 address 0/6/0/0.10.0
        making dsk/c4t10d0 b 31 0x04a000
        making rdsk/c4t10d0 c 188 0x04a000
insf: Installing special files for sdisk instance 4 address 0/6/0/0.11.0
        making dsk/c4t11d0 b 31 0x04b000
        making rdsk/c4t11d0 c 188 0x04b000
insf: Installing special files for sdisk instance 5 address 0/6/0/0.12.0
        making dsk/c4t12d0 b 31 0x04c000
        making rdsk/c4t12d0 c 188 0x04c000
insf: Installing special files for sdisk instance 6 address 0/6/0/0.13.0
        making dsk/c4t13d0 b 31 0x04d000
        making rdsk/c4t13d0 c 188 0x04d000
insf: Installing special files for sdisk instance 7 address 0/6/0/0.14.0
        making dsk/c4t14d0 b 31 0x04e000
        making rdsk/c4t14d0 c 188 0x04e000
insf: Installing special files for sctl instance 6 address 0/6/0/0.15.0
        making rscsi/c4t15d0 c 203 0x04f000
insf: Installing special files for sctl instance 3 address 0/6/0/1.6.0
        making rscsi/c5t6d0 c 203 0x056000
insf: Installing special files for pseudo driver cn
insf: Installing special files for pseudo driver mm
insf: Installing special files for pseudo driver devkrs
insf: Installing special files for pseudo driver ptym
insf: Installing special files for pseudo driver ptys
insf: Installing special files for pseudo driver ip
insf: Installing special files for pseudo driver arp
insf: Installing special files for pseudo driver rawip
insf: Installing special files for pseudo driver tcp
insf: Installing special files for pseudo driver udp
insf: Installing special files for pseudo driver stcpmap
insf: Installing special files for pseudo driver nuls
insf: Installing special files for pseudo driver netqa
insf: Installing special files for pseudo driver dmem
insf: Installing special files for pseudo driver diag0
insf: Installing special files for pseudo driver telm
insf: Installing special files for pseudo driver tels
insf: Installing special files for pseudo driver tlclts
insf: Installing special files for pseudo driver tlcots
insf: Installing special files for pseudo driver iomem
insf: Installing special files for pseudo driver tlcotsod
insf: Installing special files for pseudo driver dev_config
insf: Installing special files for pseudo driver strlog
insf: Installing special files for pseudo driver sad
insf: Installing special files for pseudo driver echo
insf: Installing special files for pseudo driver dlpi
insf: Installing special files for pseudo driver ptm
insf: Installing special files for pseudo driver pts
insf: Installing special files for pseudo driver diag1
insf: Installing special files for pseudo driver klog
insf: Installing special files for pseudo driver sy
insf: Installing special files for pseudo driver kepd
insf: Installing special files for pseudo driver diag2
insf: Installing special files for pseudo driver root
        making rroot c 255 0xffffff
        making root b 255 0xffffff
root@hpeos003[]
We now have our device files. We can check this with another ioscan:
root@hpeos003[] ioscan -fnkC disk
Class I H/W Path Driver S/W State H/W Type Description
======================================================================
disk 9 0/0/1/0.0.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
/dev/dsk/c0t0d0 /dev/rdsk/c0t0d0
disk 10 0/0/1/0.1.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
/dev/dsk/c0t1d0 /dev/rdsk/c0t1d0
disk 11 0/0/1/0.2.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
/dev/dsk/c0t2d0 /dev/rdsk/c0t2d0
disk 12 0/0/1/0.3.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
/dev/dsk/c0t3d0 /dev/rdsk/c0t3d0
disk 13 0/0/1/0.4.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
/dev/dsk/c0t4d0 /dev/rdsk/c0t4d0
disk 14 0/0/1/0.5.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
/dev/dsk/c0t5d0 /dev/rdsk/c0t5d0
disk 15 0/0/1/0.6.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
/dev/dsk/c0t6d0 /dev/rdsk/c0t6d0
disk 0 0/0/1/1.15.0 sdisk CLAIMED DEVICE HP 36.4GST336753LC
/dev/dsk/c1t15d0 /dev/rdsk/c1t15d0
disk 1 0/0/2/1.15.0 sdisk CLAIMED DEVICE HP 36.4GST336753LC
/dev/dsk/c3t15d0 /dev/rdsk/c3t15d0
disk 2 0/6/0/0.8.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
/dev/dsk/c4t8d0 /dev/rdsk/c4t8d0
disk 3 0/6/0/0.9.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
/dev/dsk/c4t9d0 /dev/rdsk/c4t9d0
disk 4 0/6/0/0.10.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
/dev/dsk/c4t10d0 /dev/rdsk/c4t10d0
disk 5 0/6/0/0.11.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
/dev/dsk/c4t11d0 /dev/rdsk/c4t11d0
disk 6 0/6/0/0.12.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
/dev/dsk/c4t12d0 /dev/rdsk/c4t12d0
disk 7 0/6/0/0.13.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
/dev/dsk/c4t13d0 /dev/rdsk/c4t13d0
disk 8 0/6/0/0.14.0 sdisk CLAIMED DEVICE HP 73.4GST373307LC
/dev/dsk/c4t14d0 /dev/rdsk/c4t14d0
root@hpeos003[]
This is where our previous documentation is important. Our user and system applications will need reworking to reference the newly created device files. We need to be able to trace each new device file name back to the old one in order to update our user and system applications. It is crucial that we get this right, because a mistake here can render entire applications unusable. In my case, I need to vgexport and vgimport my existing volume groups, as the /etc/lvmtab file currently references the old device file names.
root@hpeos003[] strings /etc/lvmtab
/dev/vg00
/dev/dsk/c1t15d0
/dev/vgora
/dev/dsk/c5t8d0
root@hpeos003[]
root@hpeos003[] vgchange -a y /dev/vgora
vgchange: Warning: Couldn't attach to the volume group physical volume "/dev/dsk/c5t8d0":
The path of the physical volume refers to a device that does not exist, or is not configured into the kernel.
vgchange: Warning: couldn't query physical volume "/dev/dsk/c5t8d0":
The specified path does not correspond to physical volume attached to this volume group
vgchange: Warning: couldn't query all of the physical volumes.
vgchange: Couldn't activate volume group "/dev/vgora":
Quorum not present, or some physical volume(s) are missing.
root@hpeos003[]
In my case, there isn't much work to do. First, I could check the LVM headers on what I think is the new disk. If I had recorded this information as part of my documentation stage, this would assist in identifying the new device files.
root@hpeos003[] echo "0x2008?4D" | adb /dev/dsk/c4t8d0
2008: 2007116332 1016986841 2007116332 1016986870
root@hpeos003[]
If we compare this with the LVM header we can read from our other node, we will discover whether we have correctly identified the new device file name.
root@hpeos004[] echo "0x2008?4D" | adb /dev/dsk/c4t8d0
2008: 2007116332 1016986841 2007116332 1016986870
root@hpeos004[]
All I need to do now is perform the vgexport and vgimport:
root@hpeos003[] ll /dev/vgora
total 0
crw-rw-rw-   1 root   sys   64 0x010000 May  2 09:37 group
brw-r-----   1 root   sys   64 0x010001 May  2 09:37 ora1
brw-r-----   1 root   sys   64 0x010002 May  2 09:37 ora2
crw-r-----   1 root   sys   64 0x010001 May  2 09:37 rora1
crw-r-----   1 root   sys   64 0x010002 May  2 09:37 rora2
root@hpeos003[] vgexport -m /tmp/vgora.map /dev/vgora
root@hpeos003[]
root@hpeos003[] mkdir /dev/vgora
root@hpeos003[] mknod /dev/vgora/group c 64 0x010000
root@hpeos003[] vgimport -m /tmp/vgora.map /dev/vgora /dev/dsk/c4t8d0
Warning: A backup of this volume group may not exist on this machine.
Please remember to take a backup using the vgcfgbackup command after activating the volume group.
root@hpeos003[] ll /dev/vgora
total 0
crw-rw-rw-   1 root   sys   64 0x010000 Sep 20 16:22 group
brw-r-----   1 root   sys   64 0x010001 Sep 20 16:23 ora1
brw-r-----   1 root   sys   64 0x010002 Sep 20 16:23 ora2
crw-r-----   1 root   sys   64 0x010001 Sep 20 16:23 rora1
crw-r-----   1 root   sys   64 0x010002 Sep 20 16:23 rora2
root@hpeos003[]
root@hpeos003[] vgchange -a y /dev/vgora
Activated volume group
Volume group "/dev/vgora" has been successfully changed.
root@hpeos003[] vgcfgbackup /dev/vgora
Volume Group configuration for /dev/vgora has been saved in /etc/lvmconf/vgora.conf
root@hpeos003[] vgdisplay /dev/vgora
--- Volume groups ---
VG Name                     /dev/vgora
VG Write Access             read/write
VG Status                   available
Max LV                      255
Cur LV                      2
Open LV                     2
Max PV                      16
Cur PV                      1
Act PV                      1
Max PE per PV               1016
VGDA                        2
PE Size (Mbytes)            4
Total PE                    250
Alloc PE                    200
Free PE                     50
Total PVG                   0
Total Spare PVs             0
Total Spare PVs in use      0
root@hpeos003[]
This process can take considerable time when a number of devices are involved.
Technically, you don't need to remove the old device files, but I consider the job complete only when we have rid ourselves of all vestiges of the old IO tree. There is also the issue of you or a new colleague finding the old device files and wondering why they are there. Here we can see some of the old device files; notice that the address they refer to is now “?”:
root@hpeos003[] lssf /dev/dsk/c5t*
sdisk card instance 5 SCSI target 10 SCSI LUN 0 section 0 at address ? /dev/dsk/c5t10d0
sdisk card instance 5 SCSI target 11 SCSI LUN 0 section 0 at address ? /dev/dsk/c5t11d0
sdisk card instance 5 SCSI target 12 SCSI LUN 0 section 0 at address ? /dev/dsk/c5t12d0
sdisk card instance 5 SCSI target 13 SCSI LUN 0 section 0 at address ? /dev/dsk/c5t13d0
sdisk card instance 5 SCSI target 14 SCSI LUN 0 section 0 at address ? /dev/dsk/c5t14d0
sdisk card instance 5 SCSI target 8 SCSI LUN 0 section 0 at address ? /dev/dsk/c5t8d0
sdisk card instance 5 SCSI target 9 SCSI LUN 0 section 0 at address ? /dev/dsk/c5t9d0
root@hpeos003[]
Because there is no device associated with these device files, we could simply remove them with the rm command. I will use rmsf instead, which gives me a warning if I delete a device file with no device associated with it. This is a sanity check to ensure that I don't accidentally delete real device files:
root@hpeos003[] rmsf -a /dev/dsk/c5*
Warning: No device associated with "/dev/dsk/c5t10d0"
Warning: No device associated with "/dev/dsk/c5t11d0"
Warning: No device associated with "/dev/dsk/c5t12d0"
Warning: No device associated with "/dev/dsk/c5t13d0"
Warning: No device associated with "/dev/dsk/c5t14d0"
Warning: No device associated with "/dev/dsk/c5t8d0"
Warning: No device associated with "/dev/dsk/c5t9d0"
root@hpeos003[]
I would continue with this process until I have cleaned up all the old device files.
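That cleanup pass can also be scripted: parse lssf output and keep only the device files whose hardware address is now "?", which are the safe rmsf candidates. In the sketch below, a heredoc stands in for real `lssf /dev/dsk/*` output so the parsing logic is runnable as shown.

```shell
#!/bin/sh
# Hypothetical lssf output: one live device file and two orphans whose
# hardware address has become "?" after the Instance remap.
cat > /tmp/lssf.out <<'EOF'
sdisk card instance 4 SCSI target 8 SCSI LUN 0 section 0 at address 0/6/0/0.8.0 /dev/dsk/c4t8d0
sdisk card instance 5 SCSI target 8 SCSI LUN 0 section 0 at address ? /dev/dsk/c5t8d0
sdisk card instance 5 SCSI target 9 SCSI LUN 0 section 0 at address ? /dev/dsk/c5t9d0
EOF

# "at address ?" marks an orphaned device file; print its path, which is
# the last field on the line.
awk '/ at address \? / { print $NF }' /tmp/lssf.out > /tmp/stale_devs
cat /tmp/stale_devs
# On a real system, the list could then be fed to rmsf, e.g.:
#   xargs rmsf -a < /tmp/stale_devs
```

The live device file (c4t8d0) is excluded, so only genuinely orphaned files reach the removal step.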
The problem we have here is that when we first encounter a Fibre Channel SAN, most of us think that the Fibre Channel card is in some way akin to a SCSI card. I can understand that, because at the end of our SCSI card are our disks, and at the end of our Fibre Channel card are our disks. What we need to remember is that Fibre Channel is a link technology that supports other protocols running on top of it. In relation to disks, this means the SCSI-2 protocol is running over Fibre Channel. This is one of the beauties of Fibre Channel: it gives us the best of both worlds, the distances of LANs and the ease of use of SCSI. In some ways, I think of Fibre Channel as a mile-long SCSI cable (I know it can be longer than that … it was a joke). In Chapter 23, we discuss Fibre Channel in more detail. Here, we concentrate on deciphering device files for Fibre Channel attached disks.
The most important thing to remember here is that Fibre Channel is supporting SCSI-2 protocols, so we will use SCSI-2 addressing to reference our disks. This means that the Instance number used in the device file for a disk is the Instance number of the ext_bus
device to which we are connected. Let's take an example of a small SAN with two disk arrays attached to two interconnected switches:
The two switches are connected together, and each HP server has a single connection to each switch. The two disk arrays have been configured with two LUNs (LUN0 = 4GB and LUN1 = 2GB) and three LUNs (LUN0 = 0GB, LUN1 = 10GB and LUN200 = 5GB), respectively. Here, we have the output from ioscan
showing the ext_bus
interfaces for node hp1
:
root@hp1[] ioscan -fnC ext_bus
Class     I  H/W Path            Driver    S/W State  H/W Type   Description
========================================================================
ext_bus   0  0/0/1/0             c720      CLAIMED    INTERFACE  SCSI C896 Ultra Wide LVD
ext_bus   1  0/0/1/1             c720      CLAIMED    INTERFACE  SCSI C896 Ultra Wide Single-Ended
ext_bus   2  0/0/2/0             c720      CLAIMED    INTERFACE  SCSI C87x Fast Wide Single-Ended
ext_bus   3  0/0/2/1             c720      CLAIMED    INTERFACE  SCSI C87x Ultra Wide Single-Ended
ext_bus  10  0/2/0/0.1.1.0.0     fcparray  CLAIMED    INTERFACE  FCP Array Interface
ext_bus  12  0/2/0/0.1.1.255.0   fcpdev    CLAIMED    INTERFACE  FCP Device Interface
ext_bus  14  0/2/0/0.2.1.0.0     fcparray  CLAIMED    INTERFACE  FCP Array Interface
ext_bus  15  0/2/0/0.2.1.0.1     fcparray  CLAIMED    INTERFACE  FCP Array Interface
ext_bus  17  0/2/0/0.2.1.255.0   fcpdev    CLAIMED    INTERFACE  FCP Device Interface
ext_bus  18  0/2/0/0.2.2.0.0     fcparray  CLAIMED    INTERFACE  FCP Array Interface
ext_bus  20  0/2/0/0.2.2.0.1     fcparray  CLAIMED    INTERFACE  FCP Array Interface
ext_bus  21  0/2/0/0.2.2.255.0   fcpdev    CLAIMED    INTERFACE  FCP Device Interface
ext_bus  11  0/4/0/0.1.1.0.0     fcparray  CLAIMED    INTERFACE  FCP Array Interface
ext_bus  13  0/4/0/0.1.1.255.0   fcpdev    CLAIMED    INTERFACE  FCP Device Interface
ext_bus  16  0/4/0/0.2.1.0.0     fcparray  CLAIMED    INTERFACE  FCP Array Interface
ext_bus  19  0/4/0/0.2.1.0.1     fcparray  CLAIMED    INTERFACE  FCP Array Interface
ext_bus  22  0/4/0/0.2.1.255.0   fcpdev    CLAIMED    INTERFACE  FCP Device Interface
ext_bus   4  0/4/0/0.2.2.0.0     fcparray  CLAIMED    INTERFACE  FCP Array Interface
ext_bus  23  0/4/0/0.2.2.0.1     fcparray  CLAIMED    INTERFACE  FCP Array Interface
ext_bus   5  0/4/0/0.2.2.255.0   fcpdev    CLAIMED    INTERFACE  FCP Device Interface
root@hp1[]
You can see that I have underlined the ext_bus
interfaces that are acting as our Array Interface; these are the interfaces used by the SCSI protocol to communicate with LUNs configured on the disk array. The entry described as Device Interface is simply a reference for the disk array itself; it plays no part in the addressing of individual LUNs. I suppose we need to start by explaining a little about the hardware path itself. Let's take one of the addresses as an example:
0/2/0/0.1.1.0.0
The format of the hardware path can be broken into three major components:
<HBA hardware path><Port ID><SCSI address>
<HBA hardware path>
: An HBA in the world of Fibre Channel can be thought of as an interface card: HBA = Host Bus Adapter. In this example, the hardware path is
0/2/0/0
<Port ID>
: Officially this is known as the N_Port ID
. For our discussion, we can simplify it to just Port ID
. The Port ID identifies the end-point in our SAN closest to our disks (LUNs). No matter how many switches we have traversed to get there, the Port ID
is effectively the point at which I leave the SAN and start talking to my disk device. The Port ID is a 24-bit address assigned by a switch in our SAN. The 24-bit address is broken into three components:
A Domain (switch number) = 1
An Area (physical port on the switch) = 1
A Port (the FC-AL Loop ID, hence always 0 in a switched fabric SAN) = 0
These are the next three components in our hardware path:
1.1.0
A quick word of caution: in some (older) SANs, you may see the Area in our example as being 17. The firmware in a switch has a setting known as Core Switch PID Format. When it is turned off, we add 16 to the physical port number, giving Area = 17 in the example above. When Core Switch PID Format is turned on, we simply use the physical port number on the switch. Most new switches come with this firmware setting turned on. A SAN will fragment (switches won't talk to each other) if switches use different settings.
<SCSI address>
: The last component in the address of my ext_bus
interface is known as the Virtual SCSI Bus (VSB) address. As we mentioned previously, Fibre Channel supports SCSI-2 protocols. To picture what this last component of the address is doing, I think of it as the end of a SCSI cable. A Virtual SCSI Bus (the end of that imaginary cable) can support up to 128 devices, so at the end of this particular cable there could be up to 128 LUNs. This will become important later.
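To keep the components straight, here is a short Python sketch (my own illustration, not an HP-UX tool) that splits an fcparray ext_bus hardware path into the fields just described. The function name and dictionary keys are my inventions for the example:

```python
# Sketch: decompose an HP-UX Fibre Channel ext_bus hardware path of the
# form <HBA path>.<Domain>.<Area>.<Port>.<VSB> into named components.

def parse_fc_ext_bus_path(hw_path):
    # Split off the HBA portion ("0/2/0/0") at the first dot;
    # the remainder is the Port ID plus the Virtual SCSI Bus.
    hba, rest = hw_path.split(".", 1)
    domain, area, port, vsb = (int(x) for x in rest.split("."))
    return {
        "hba": hba,        # HBA (interface card) hardware path
        "domain": domain,  # switch number
        "area": area,      # physical port on the switch
        "port": port,      # FC-AL Loop ID; always 0 in a switched fabric
        "vsb": vsb,        # Virtual SCSI Bus
    }

print(parse_fc_ext_bus_path("0/2/0/0.1.1.0.0"))
```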
Our example address of 0/2/0/0.1.1.0.0
can be seen in Figure 4-1 as the line near the top of the figure. The other colored lines relate to the hardware address of an ext_bus
as follows in Table 4-1.
The actual Fibre Channel cards themselves have their own Instance numbers:
root@hp1[] ioscan -fnkC fc
Class I H/W Path Driver S/W State H/W Type Description
=================================================================
fc 1 0/2/0/0 td CLAIMED INTERFACE HP Tachyon TL/TS Fibre Channel Mass Storage Adapter
/dev/td1
fc 0 0/4/0/0 td CLAIMED INTERFACE HP Tachyon XL2 Fibre Channel Mass Storage Adapter
/dev/td0
The Instance number of the Fibre Channel card plays no part in the device file name for a disk; remember that a SCSI disk is attached to an ext_bus
interface.
We can now talk about the address of the actual disks themselves. HP-UX hardware paths follow the SCSI-2 standard for addressing, so we use the idea of a target and a SCSI logical unit number (LUN). The LUN number we see in Figure 4-1 is the LUN number assigned by the disk array. This is a common concept for disk arrays. HP-UX has to translate the LUN address assigned by the disk array into a SCSI address. If we look at the components of a SCSI address, we can start to work out how to convert a LUN number into a SCSI address:
Target: 4-bit address means valid values = 0 through 15
LUN: 3-bit address means valid values = 0 through 7
If we take an example of a LUN number of 20, there's a simple(ish) formula for calculating the SCSI target and LUN address. Here's the formula:
Ensure that the LUN number is represented in decimal; an HP XP disk array uses hexadecimal LUN numbers while an HP VA disk array uses decimal LUN numbers.
Divide the LUN number by 8. This gives us the SCSI target address.
The remainder gives us the SCSI LUN.
For our simple example of a LUN=20:
LUN (on disk array) = 20.
Divide 20 by 8 … SCSI target = 2.
Remainder = 4 … SCSI LUN = 4.
SCSI address = 2.4.
This would give us a complete hardware path of:
0/2/0/0.1.1.0.0.2.4
Component of address | Value
---|---
HBA/Interface card | 0/2/0/0
Domain (=switch number) | 1
Area (=physical port number) | 1
Port (always 0 in a switched fabric) | 0
Virtual SCSI Bus | 0
SCSI target | 2
SCSI LUN | 4
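We can check the divide-by-8 arithmetic with a couple of lines of Python (my own illustration, not an HP-UX utility); divmod performs the division and the remainder in one step:

```python
# Simple case (array LUN < 128): SCSI target = LUN // 8, SCSI LUN = LUN % 8.
lun = 20  # LUN number assigned by the disk array, in decimal
target, scsi_lun = divmod(lun, 8)
print(f"SCSI address = {target}.{scsi_lun}")  # → SCSI address = 2.4
```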
The problem we've got here is that LUN numbers assigned by disk arrays can be 128 or higher! This poses an interesting problem for HP-UX. The way we get around it is simply to increase the Virtual SCSI Bus address every time we go beyond a multiple of 128. This explains the additional ext_bus
interfaces we can see in the ioscan
output above:
ext_bus  15  0/2/0/0.2.1.0.1  fcparray  CLAIMED  INTERFACE  FCP Array Interface
ext_bus  20  0/2/0/0.2.2.0.1  fcparray  CLAIMED  INTERFACE  FCP Array Interface
ext_bus  19  0/4/0/0.2.1.0.1  fcparray  CLAIMED  INTERFACE  FCP Array Interface
ext_bus  23  0/4/0/0.2.2.0.1  fcparray  CLAIMED  INTERFACE  FCP Array Interface
The original ext_bus
interfaces ran out of address space because we have a LUN number of 200 on our disk array. HP-UX will create additional ext_bus
interfaces by increasing the Virtual SCSI Bus address by 1. These new interfaces are assigned their own Instance numbers. The Virtual SCSI Bus address is a 4-bit value, so we can support 16 VSB interfaces x 128 LUNs per VSB = 2048 LUNs per physical Fibre Channel card. This has a bearing on our formula for calculating our SCSI target and SCSI LUN address. We now have to accommodate the Virtual SCSI Bus in our formula. Here goes:
Ensure that the LUN number is represented in decimal; an HP XP disk array uses hexadecimal LUN numbers while an HP VA disk array uses decimal LUN numbers.
Virtual SCSI Bus starts at address 0.
If the LUN number is < 128, go to step 7, below.
Increment the Virtual SCSI Bus by 1.
Subtract 128 from the LUN number.
If the LUN number is still 128 or greater, go back to step 4, above.
Divide the LUN number by 8. This gives us the SCSI target address.
The remainder gives us the SCSI LUN.
In our SAN, we have a LUN = 200. Let's work out the three components of the SCSI address:
LUN (on disk array) = 200
Virtual SCSI Bus = 0
Is the LUN number 128 or greater?
YES
Virtual SCSI Bus = 1
LUN number = 200 – 128 = 72
Is the LUN number 128 or greater?
NO
Divide 72 by 8 … SCSI target = 9
Remainder = 0 … SCSI LUN = 0
SCSI address = 1.9.0
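The whole procedure above can be sketched in a few lines of Python (again, my own illustration; the function name is hypothetical). It peels off blocks of 128 LUN addresses, incrementing the Virtual SCSI Bus each time, then divides by 8:

```python
# Sketch of the full conversion: array LUN number -> (VSB, target, LUN),
# assuming the LUN number is already in decimal.

def lun_to_scsi_address(lun):
    vsb = 0
    while lun >= 128:   # each Virtual SCSI Bus addresses 128 LUNs
        vsb += 1
        lun -= 128
    target, scsi_lun = divmod(lun, 8)
    return vsb, target, scsi_lun

print(lun_to_scsi_address(200))  # → (1, 9, 0), i.e., SCSI address 1.9.0
```

Note the loop condition is "128 or greater": a LUN of exactly 128 already belongs to the next Virtual SCSI Bus.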
We can now work out how specific device files are created. Let's take an example disk array LUN=200 connected via the Fibre Channel card at H/W Path 0/2/0/0. We know the ext_bus
interface can see the LUN through two ports on Switch2 (ports 1 and 2). Because we are dealing with LUN=200, which is beyond the first block of 128 LUN addresses, we are dealing with the ext_bus
interfaces which utilize a new Virtual SCSI Bus interface:
ext_bus  15  0/2/0/0.2.1.0.1  fcparray  CLAIMED  INTERFACE  FCP Array Interface
ext_bus  20  0/2/0/0.2.2.0.1  fcparray  CLAIMED  INTERFACE  FCP Array Interface
As you can see, the Virtual SCSI Bus address has been increased by one, giving us a new interface and consequently additional Instance numbers. The Instance numbers for these two new interfaces are 15 and 20 respectively, giving device file components, c15
and c20
. From our formula for calculating the SCSI target and SCSI LUN addresses, we know the other two components of the device file name, t9
and d0
. Here's an extract from an ioscan
that shows these disks and their associated device file names:
root@hp1[] ioscan -fnC disk
Class I H/W Path Driver S/W State H/W Type Description
===========================================================================
...
disk 15 0/2/0/0.2.1.0.1.9.0 sdisk CLAIMED DEVICE HP A6189A
/dev/dsk/c15t9d0 /dev/rdsk/c15t9d0
...
disk 19 0/2/0/0.2.2.0.1.9.0 sdisk CLAIMED DEVICE HP A6189A
/dev/dsk/c20t9d0 /dev/rdsk/c20t9d0
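Putting the pieces together, the block device file name is just the ext_bus Instance number plus the SCSI target and LUN we calculated. A tiny Python sketch (the helper name is my own) shows the cXtYdZ convention for the two paths to LUN 200 on this system:

```python
# Sketch: build the /dev/dsk device file name from the ext_bus Instance
# number (assigned by the kernel, reported by ioscan) and the SCSI
# target and LUN derived from the array LUN number.

def dsk_device_file(instance, target, scsi_lun):
    return f"/dev/dsk/c{instance}t{target}d{scsi_lun}"

for instance in (15, 20):            # the two VSB-1 interfaces seen above
    print(dsk_device_file(instance, 9, 0))
```

Note that only the target and LUN portions are computable; the Instance number itself comes from the kernel's ioconfig database, which is why standardizing Instance numbers across servers matters.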
I must admit that this was not entirely straightforward when I first looked at it, especially when you have a server connected to a large disk array with hundreds of LUNs, each with multiple paths. In the example above, we have a single LUN (=200) configured on a disk array, with four paths to it. Trying to identify it can be a long and complex task. I would remind you of the adb
command we have seen earlier, which reads the LVM header off the beginning of the disk. This can prove very useful in identifying disks that have multiple PV links to them.
Below is the ioscan
output for the node hp1
with all the paths and device files for the five LUNs in Figure 4-1:
root@hp1[] ioscan -fnC disk
Class I H/W Path Driver S/W State H/W Type Description
===========================================================================
disk 0 0/0/1/1.15.0 sdisk CLAIMED DEVICE HP 36.4GST336753LC
/dev/dsk/c1t15d0 /dev/rdsk/c1t15d0
disk 1 0/0/2/1.15.0 sdisk CLAIMED DEVICE HP 36.4GST336753LC
/dev/dsk/c3t15d0 /dev/rdsk/c3t15d0
disk 20 0/2/0/0.1.1.0.0.0.0 sdisk CLAIMED DEVICE HP A6188A
/dev/dsk/c10t0d0 /dev/rdsk/c10t0d0
disk 21 0/2/0/0.1.1.0.0.0.1 sdisk CLAIMED DEVICE HP A6188A
/dev/dsk/c10t0d1 /dev/rdsk/c10t0d1
disk 9 0/2/0/0.2.1.0.0.0.0 sdisk CLAIMED DEVICE HP A6189A
/dev/dsk/c14t0d0 /dev/rdsk/c14t0d0
disk 10 0/2/0/0.2.1.0.0.0.1 sdisk CLAIMED DEVICE HP A6189A
/dev/dsk/c14t0d1 /dev/rdsk/c14t0d1
disk 15 0/2/0/0.2.1.0.1.9.0 sdisk CLAIMED DEVICE HP A6189A
/dev/dsk/c15t9d0 /dev/rdsk/c15t9d0
disk 16 0/2/0/0.2.2.0.0.0.0 sdisk CLAIMED DEVICE HP A6189A
/dev/dsk/c18t0d0 /dev/rdsk/c18t0d0
disk 17 0/2/0/0.2.2.0.0.0.1 sdisk CLAIMED DEVICE HP A6189A
/dev/dsk/c18t0d1 /dev/rdsk/c18t0d1
disk 19 0/2/0/0.2.2.0.1.9.0 sdisk CLAIMED DEVICE HP A6189A
/dev/dsk/c20t9d0 /dev/rdsk/c20t9d0
disk 11 0/4/0/0.1.1.0.0.0.0 sdisk CLAIMED DEVICE HP A6188A
/dev/dsk/c11t0d0 /dev/rdsk/c11t0d0
disk 12 0/4/0/0.1.1.0.0.0.1 sdisk CLAIMED DEVICE HP A6188A
/dev/dsk/c11t0d1 /dev/rdsk/c11t0d1
disk 13 0/4/0/0.2.1.0.0.0.0 sdisk CLAIMED DEVICE HP A6189A
/dev/dsk/c16t0d0 /dev/rdsk/c16t0d0
disk 14 0/4/0/0.2.1.0.0.0.1 sdisk CLAIMED DEVICE HP A6189A
/dev/dsk/c16t0d1 /dev/rdsk/c16t0d1
disk 18 0/4/0/0.2.1.0.1.9.0 sdisk CLAIMED DEVICE HP A6189A
/dev/dsk/c19t9d0 /dev/rdsk/c19t9d0
disk 2 0/4/0/0.2.2.0.0.0.0 sdisk CLAIMED DEVICE HP A6189A
/dev/dsk/c4t0d0 /dev/rdsk/c4t0d0
disk 3 0/4/0/0.2.2.0.0.0.1 sdisk CLAIMED DEVICE HP A6189A
/dev/dsk/c4t0d1 /dev/rdsk/c4t0d1
disk 22 0/4/0/0.2.2.0.1.9.0 sdisk CLAIMED DEVICE HP A6189A
/dev/dsk/c23t9d0 /dev/rdsk/c23t9d0
root@hp1[]
This should give you some examples to try to work out which disk Instance number is associated with each LUN, e.g., disk Instance = 22 (H/W Path = 0/4/0/0.2.2.0.1.9.0) is LUN 200 connected via /dev/td0 through the SAN via Switch2, port 2.
The ability to replace interface cards, online, was new for the HP-UX 11i release. It became a necessity for the operating system to perform this task with the advent of new, highly available servers such as Superdome. In fact, servers such as Superdome can support the online addition and replacement of not only PCI interface cards but also cell components at the hardware level. Unfortunately, at this time, the operating system is slightly behind the hardware. In the not too distant future, it is anticipated that HP-UX will be able to perform OLA/R on cell components as well. For the moment, we will have to satisfy ourselves with the OLA/R for PCI interface cards.
There are a number of servers that support OLA/R for PCI interface cards (including Superdome). I won't list here the current supported servers because it will no doubt be out of date by the time you read this. Be sure to check your system documentation as to whether your server supports OLA/R.
The motivation for using OLA/R is to avoid rebooting a server in order to add a new PCI interface card or to replace a failed PCI interface card. These are the two tasks we can perform with OLA/R. In the near future, we will be able to perform the deletion of PCI cards. I make a point of mentioning this because it will have a bearing on the task of replacing a failed PCI card, as we see later. We start with replacing a failed PCI card, as this is probably the more involved task.
If the card that has failed is supporting our root disk, then we will certainly have a problem on our hands. In today's IT climate, more and more installations utilize some form of disk/data mirroring in order to protect not only their user data but also their operating system software; downtime can be expensive these days. What is a little scary is the number of installations that use the same interface card to attach both the original (primary) disk and the additional (secondary and possibly tertiary) disk(s). In these times of high availability, we need to ensure that all of our mirrored disks are on separate interfaces to avoid a single point of failure (SPOF), and that we don't fall into the trap of just using any other interface that happens to be free. I know some customers who used the second SCSI port on a dual-port SCSI card to attach their mirror disks. Not until someone pointed out that the entire PCI card was a SPOF did they think twice about how their mirroring was configured. I will make the assumption that you have your mirrored disks on a separate interface, which is an interface on a separate PCI card. I suspect that you have noticed a certain amount of hardware diagnostic messages in syslog
since the PCI card failed. We look at what some of those diagnostic messages mean later on. In my demonstration, I simulate a PCI card that has failed and perform the necessary steps to replace that card.
You can either use SAM to perform OLA/R or perform the steps manually. If you perform the steps manually, you will have to ensure that you have additional resources online when you disable the affected PCI card. SAM will perform a task known as Critical Resource Analysis before disabling an affected PCI card. If your system fails Critical Resource Analysis, SAM will not disable the affected PCI card. As such, SAM is the preferred method for performing this task. In my demonstration, I will perform the steps manually in order to explain each step in turn as well as to highlight any potential pitfalls.
I have all the logical volumes on my root disk mirrored (except my dump device(s); you can't mirror a dump device); you can see this from the output from the lvlnboot
and lvdisplay
commands below:
root @uksd3 #lvlnboot -v vg00
Boot Definitions for Volume Group /dev/vg00:
Physical Volumes belonging in Root Volume Group:
        /dev/dsk/c0t0d0 (2/0/1/0/0.0.0) -- Boot Disk
        /dev/dsk/c0t1d0 (2/0/1/0/0.1.0)
        /dev/dsk/c3t8d0 (2/0/4/0/0.8.0) -- Boot Disk
        /dev/dsk/c3t10d0 (2/0/4/0/0.10.0)
Boot: lvol1     on:     /dev/dsk/c0t0d0
                        /dev/dsk/c3t8d0
Root: lvol3     on:     /dev/dsk/c0t0d0
                        /dev/dsk/c3t8d0
Swap: lvol2     on:     /dev/dsk/c0t0d0
                        /dev/dsk/c3t8d0
Dump: lvol2     on:     /dev/dsk/c0t0d0, 0
root @uksd3 #
root @uksd3 #lvdisplay -v /dev/vg00/lvol1
--- Logical volumes ---
LV Name                     /dev/vg00/lvol1
VG Name                     /dev/vg00
LV Permission               read/write
LV Status                   available/syncd
Mirror copies               1
Consistency Recovery        MWC
Schedule                    parallel
LV Size (Mbytes)            300
Current LE                  75
Allocated PE                150
Stripes                     0
Stripe Size (Kbytes)        0
Bad block                   off
Allocation                  strict/contiguous
IO Timeout (Seconds)        default

   --- Distribution of logical volume ---
   PV Name            LE on PV  PE on PV
   /dev/dsk/c0t0d0    75        75
   /dev/dsk/c3t8d0    75        75

   --- Logical extents ---
   LE    PV1                PE1   Status 1 PV2                PE2   Status 2
   00000 /dev/dsk/c0t0d0   00000 current  /dev/dsk/c3t8d0   00000 current
   00001 /dev/dsk/c0t0d0   00001 current  /dev/dsk/c3t8d0   00001 current
   00002 /dev/dsk/c0t0d0   00002 current  /dev/dsk/c3t8d0   00002 current
   ...
   00073 /dev/dsk/c0t0d0   00073 current  /dev/dsk/c3t8d0   00073 current
   00074 /dev/dsk/c0t0d0   00074 current  /dev/dsk/c3t8d0   00074 current
root @uksd3 #
The process to replace a failed PCI card can be summarized as follows:
Identify the failed PCI card.
Perform Critical Resource Analysis on the affected PCI card.
Turn on the attention light for the affected PCI card slot.
Check that the affected PCI slot is in its own power domain.
Check that the affected PCI card is not a multi-function card.
Run any associated driver scripts before suspending the driver.
Suspend the kernel driver for the affected PCI slot.
Turn off the power to the affected PCI slot.
Replace the PCI card.
Turn on the power to the affected PCI slot.
Run any associated driver scripts before resuming the driver.
Resume the driver for the affected PCI slot.
Check functionality of the newly replaced PCI card.
Turn off the attention light for the affected PCI slot.
I will go through each of these steps on a live system. This system is a Superdome system running HP-UX 11i version 1. I will replace the PCI card attached to the root disk from which I booted.
Using adb
, I can see which disk and hence which interface card I booted from:
root @uksd3 #echo "boot_string/S" | adb /stand/vmunix /dev/kmem
boot_string:
boot_string: disk(2/0/1/0/0.0.0.0.0.0.0;0)/stand/vmunix
root @uksd3 #
The interface card I am going to disable is the Ultra2 Wide LVD interface at hardware path 2/0/1/0/0:
root @uksd3 #ioscan -fnkH 2/0/1/0
Class I H/W Path Driver S/W State H/W Type Description
========================================================================
ext_bus 0 2/0/1/0/0 c720 CLAIMED INTERFACE SCSI C895 Ultra2 Wide LVD
target 0 2/0/1/0/0.0 tgt CLAIMED DEVICE
disk 0 2/0/1/0/0.0.0 sdisk CLAIMED DEVICE FUJITSU MAJ3182MC
/dev/dsk/c0t0d0 /dev/rdsk/c0t0d0
target 1 2/0/1/0/0.1 tgt CLAIMED DEVICE
disk 1 2/0/1/0/0.1.0 sdisk CLAIMED DEVICE FUJITSU MAJ3182MC
/dev/dsk/c0t1d0 /dev/rdsk/c0t1d0
target 2 2/0/1/0/0.2 tgt CLAIMED DEVICE
disk 2 2/0/1/0/0.2.0 sdisk CLAIMED DEVICE SEAGATE ST118202LC
/dev/dsk/c0t2d0 /dev/rdsk/c0t2d0
target 3 2/0/1/0/0.7 tgt CLAIMED DEVICE
ctl 0 2/0/1/0/0.7.0 sctl CLAIMED DEVICE Initiator
/dev/rscsi/c0t7d0
target 4 2/0/1/0/0.15 tgt CLAIMED DEVICE
ctl 1 2/0/1/0/0.15.0 sctl CLAIMED DEVICE HP A5272A
/dev/rscsi/c0t15d0
root @uksd3 #
You will normally have some idea of which disks have failed from the numerous SCSI and LVM error messages in syslog.log
. You should be able to work out the hardware path of the interface card from the hardware path of the disks that have failed. In this example, we will see the SCSI and LVM error messages when I disable the PCI card in question. I will then decipher the error messages to trace back to the hardware path of the failed interface card. Once we know the hardware address of the PCI interface card, we need to translate this into a PCI slot-id. The type of partitioned server you have will determine how HP-UX converts a hardware path into a PCI slot-id. We need the slot-id in order to communicate with the PCI slot, regardless of what type of interface card is in the slot. The most consistent way to translate an HP-UX hardware path into a PCI slot-id is to use the rad
command (on HP-UX 11.23 we can use the olrad
or pdweb
commands). Below, you can see the output from the rad –q
command on my system:
root @uksd3 #rad -q
Driver(s)
Slot Path Bus Speed Power Occupied Suspended Capable
0-1-1-0 2/0/0 0 33 On Yes No No
0-1-1-1 2/0/1/0 8 33 On Yes No Yes
0-1-1-2 2/0/2/0 16 33 On Yes No Yes
0-1-1-3 2/0/3/0 24 33 On Yes No Yes
0-1-1-4 2/0/4/0 32 33 On Yes No Yes
0-1-1-5 2/0/6/0 48 33 On No N/A N/A
0-1-1-6 2/0/14/0 112 33 On No N/A N/A
0-1-1-7 2/0/12/0 96 33 On Yes No Yes
0-1-1-8 2/0/11/0 88 33 On Yes No Yes
0-1-1-9 2/0/10/0 80 33 On Yes No Yes
0-1-1-10 2/0/9/0 72 33 On Yes No Yes
0-1-1-11 2/0/8/0 64 33 On Yes No Yes
root @uksd3 #
I have underlined the line for our interface card. Notice on the right side of the output the column headed Capable
. This will tell us whether the driver is capable of being disabled or, as we'll call it in OLA/R, suspended
. Currently, our card is not suspended; it is capable, and power to the slot is on.
If you are performing OLA/R on a failed PCI interface card, it is absolutely crucial that you perform an exhaustive Critical Resource Analysis. As the name suggests, we are analyzing the system to ensure that we have enough additional resource online to allow our critical resource to be disabled without having any noticeable effect on system availability. In my example, I will need to be absolutely sure that my mirror disks are on separate interfaces and that if the system is rebooted, it will reboot from the mirror disks. I can see the mirror disks from the output of the lvlnboot
command:
root @uksd3 #lvlnboot -v vg00
Boot Definitions for Volume Group /dev/vg00:
Physical Volumes belonging in Root Volume Group:
        /dev/dsk/c0t0d0 (2/0/1/0/0.0.0) -- Boot Disk
        /dev/dsk/c0t1d0 (2/0/1/0/0.1.0)
        /dev/dsk/c3t8d0 (2/0/4/0/0.8.0) -- Boot Disk
        /dev/dsk/c3t10d0 (2/0/4/0/0.10.0)
Boot: lvol1     on:     /dev/dsk/c0t0d0
                        /dev/dsk/c3t8d0
Root: lvol3     on:     /dev/dsk/c0t0d0
                        /dev/dsk/c3t8d0
Swap: lvol2     on:     /dev/dsk/c0t0d0
                        /dev/dsk/c3t8d0
Dump: lvol2     on:     /dev/dsk/c0t0d0, 0
root @uksd3 #
From this I can also see the hardware path to those disks. Using ioscan
, I can ensure that those interfaces are on separate PCI cards:
root @uksd3 #ioscan -fnkC ext_bus
Class I H/W Path Driver S/W State H/W Type Description
===========================================================================
ext_bus 0 2/0/1/0/0 c720 CLAIMED INTERFACE SCSI C895 Ultra2 Wide LVD
ext_bus 1 2/0/3/0/0 c720 CLAIMED INTERFACE SCSI C87x Ultra Wide Differential
ext_bus 2 2/0/3/0/1 c720 CLAIMED INTERFACE SCSI C87x Ultra Wide Differential
ext_bus 3 2/0/4/0/0 c720 CLAIMED INTERFACE SCSI C895 Ultra2 Wide LVD
ext_bus 4 2/0/8/0/0 c720 CLAIMED INTERFACE SCSI C87x Ultra Wide Differential
ext_bus 5 2/0/8/0/1 c720 CLAIMED INTERFACE SCSI C87x Ultra Wide Differential
ext_bus 6 2/0/11/0/0 c720 CLAIMED INTERFACE SCSI C87x Ultra Wide Differential
ext_bus 7 2/0/12/0/0.8.0.4.0 fcparray CLAIMED INTERFACE FCP Array Interface
ext_bus 8 2/0/12/0/0.8.0.5.0 fcparray CLAIMED INTERFACE FCP Array Interface
ext_bus 9 2/0/12/0/0.8.0.255.0 fcpdev CLAIMED INTERFACE FCP Device Interface
root @uksd3 #
I need to ensure that the system will reboot from those disks in the event of a system reboot. I can do this with the setboot
command, and on a partitioned server I can also use the parstatus
command:
root @uksd3 #setboot
Primary bootpath : 2/0/1/0/0.0.0
Alternate bootpath : 2/0/4/0/0.8.0
Autoboot is ON (enabled)
Autosearch is ON (enabled)
Note: The interpretation of Autoboot and Autosearch has changed for
systems that support hardware partitions. Please refer to the manpage.
root @uksd3 #
root @uksd3 #parstatus -w
The local partition number is 2.
root @uksd3 #parstatus -Vp 2
[Partition]
Partition Number       : 2
Partition Name         : uksd3
Status                 : active
IP address             : 0.0.0.0
Primary Boot Path      : 2/0/1/0/0.0.0
Alternate Boot Path    : 2/0/4/0/0.8.0
HA Alternate Boot Path : 2/0/4/0/0.8.0
PDC Revision           : 35.4
IODCH Version          : 5C70
CPU Speed              : 552 MHz
Core Cell              : cab0,cell2

[Cell]
                         CPU      Memory                              Use
                         OK/      (GB)                        Core    On
 Hardware    Actual      Deconf/  OK/                         Cell    Next Par
 Location    Usage       Max      Deconf    Connected To      Capable Boot Num
 =========== =========== ======== ========= ================= ======= ==== ===
 cab0,cell2  active core 4/0/4    4.0/ 0.0  cab0,bay1,chassis1 yes    yes  2

[Chassis]
                               Core Connected   Par
 Hardware Location    Usage    IO   To          Num
 ==================== ======== ==== =========== ===
 cab0,bay1,chassis1   active   yes  cab0,cell2  2
root @uksd3 #
Finally, we will check that the correct ISL boot command is stored in the AUTO file on all mirror disks to ensure that we override LVM quorum specifications if necessary:
root @uksd3 #lifcp /dev/rdsk/c0t0d0:AUTO -
hpux -lq
root @uksd3 #lifcp /dev/rdsk/c3t8d0:AUTO -
hpux -lq
root @uksd3 #
We are now ready to proceed.
The attention light is an orange blinking light located on the front of the plastic divider in our PCI card-cage (see Figure 4-2).
A solid green light tells us that there is power to the slot. The attention light will remain blinking orange even if we turn power to the slot off.
root @uksd3 #rad -f on 0-1-1-1
root @uksd3 #
I am not talking about any attention lights on the body of the card itself. This attention light shines from the light pipes located on the plastic separator between individual PCI card slots. Turning this attention light on has nothing to do with the attention light located on the front and back doors of a server complex.
Technically, we don't need to turn the attention light on. It is a good sanity check to ensure that both you and the engineer who is going to replace the card know exactly which card it is by flashing the attention light. If you mention the slot-id to an HP Hardware Customer Engineer, both of you can decipher it separately and confirm each other's diagnosis as to which card the slot-id refers to.
Note: I haven't found a way to programmatically tell if the attention light is currently on for a PCI card slot. If you find out, please let me know.
The fact that we are using a Superdome means that each PCI card is in its own power domain. A power domain is one or more PCI card slots sharing a common power source. If we had an rp5400 series machine, some PCI slots share a common power source. If we turn off power to one PCI slot, we turn off power on all slots sharing the common power source. Even on a Superdome system, I still check:
root @uksd3 #rad -a 0-1-1-1
0-1-1-1
root @uksd3 #
If this PCI card slot were sharing a common power source with other PCI card slots, we would see other slot-ids listed in the above output.
Multi-function cards are cards with more than one interface on the card itself. This could be anything from a dual-port SCSI card to a four-port LAN card. If we turn power off to the entire card, then we will affect all ports on the card:
root @uksd3 #rad -h 0-1-1-1
2/0/1/0/0
root @uksd3 #
The above output shows us the hardware address(es) of all ports on the card. The fact that we have only one hardware address associated with this card is good; we don't have a multi-port or multi-function card.
Both the rad –a
and rad -h
commands can be seen as part of our Critical Resource Analysis. As such, we have not affected anything in the system yet, except to turn a PCI card slot attention light on. The next step is the first disruptive step in the process. If our Critical Resource Analysis has not been thorough enough, we could render the system unusable after using the next command.
The programmer who wrote the device driver for this card may have supplied an associated shell script to run whenever we are performing OLA/R on a PCI card. First, we need to know the driver name and the associated HP-UX hardware path. We have probably seen this information previously, but we can just confirm it right here:
root @uksd3 #rad -h 0-1-1-1
2/0/1/0/0
root @uksd3 #
root @uksd3 #ioscan -fnkH 2/0/1/0/0
Class     I  H/W Path        Driver  S/W State  H/W Type   Description
========================================================================
ext_bus   0  2/0/1/0/0       c720    CLAIMED    INTERFACE  SCSI C895 Ultra2 Wide LVD
target    0  2/0/1/0/0.0     tgt     CLAIMED    DEVICE
disk      0  2/0/1/0/0.0.0   sdisk   CLAIMED    DEVICE     FUJITSU MAJ3182MC
                            /dev/dsk/c0t0d0   /dev/rdsk/c0t0d0
target    1  2/0/1/0/0.1     tgt     CLAIMED    DEVICE
disk      1  2/0/1/0/0.1.0   sdisk   CLAIMED    DEVICE     FUJITSU MAJ3182MC
                            /dev/dsk/c0t1d0   /dev/rdsk/c0t1d0
target    2  2/0/1/0/0.2     tgt     CLAIMED    DEVICE
disk      2  2/0/1/0/0.2.0   sdisk   CLAIMED    DEVICE     SEAGATE ST118202LC
                            /dev/dsk/c0t2d0   /dev/rdsk/c0t2d0
target    3  2/0/1/0/0.7     tgt     CLAIMED    DEVICE
ctl       0  2/0/1/0/0.7.0   sctl    CLAIMED    DEVICE     Initiator
                            /dev/rscsi/c0t7d0
target    4  2/0/1/0/0.15    tgt     CLAIMED    DEVICE
ctl       1  2/0/1/0/0.15.0  sctl    CLAIMED    DEVICE     HP A5272A
                            /dev/rscsi/c0t15d0
root @uksd3 #
The driver name for this card is c720
. We can look in the directory /usr/sbin/olrad.d
for a script of the same name as the kernel driver. This shell script may ask us for a timeout value for the driver; this is entirely up to the programmer who wrote the kernel driver and supplied this script. We can determine any timeout values associated with this driver using the rad -V command:
root @uksd3 #rad -V 2/0/1/0/0
Name State Suspend_time Resume_time Remove_time Error time
c720 RUNNING 120.000000 120.000000 0.000000 0.000000
root @uksd3 #
You might have noticed this is the first time the rad
command has taken an HP-UX hardware path as an argument. The convention with the rad
command is that an uppercase option needs an HP-UX hardware path while a lowercase option requires a PCI slot-id. Now that we have those timeout values, we can run the associated driver script with the appropriate command-line arguments:
root @uksd3 #ll /usr/sbin/olrad.d
total 80
-r-xr-xr-x   1 bin        bin           2889 Nov 14  2000 c720
-r-xr-xr-x   1 bin        bin           2977 Dec 12  2001 c8xx
-r-xr-xr-x   1 bin        bin           2236 Jun 19  2001 fddi4
-r-xr-xr-x   1 bin        bin           4542 Dec 21  2000 iop_drv
-r-xr-xr-x   1 bin        bin           2124 Jun 19  2002 td
root @uksd3 #
root @uksd3 #/usr/sbin/olrad.d/c720 prep_replace 2/0/1/0/0
root @uksd3 #
This script allows the driver developer to perform any task he feels necessary before all further requests are suspended for the affected PCI card slot. In this instance, I wasn't prompted for any of the timeout values obtained previously. This does not mean that a script you run will not ask you for one or all of those timeout values. We should run these scripts to ensure that anything that needs to be done is actually done.
Without this command, the kernel will produce a plethora of diagnostic messages when we turn off power to the specified PCI slot. This command will effectively suspend the PCI card slot from operating. Initially, kernel subsystems may complain that the card is no longer functioning; in our case, the SCSI and LVM subsystems will complain because outstanding IO requests will not be completed. Let's suspend the driver:
root @uksd3 #rad -s 0-1-1-1
The following interface driver node(s) will be suspended:
2/0/1/0/0    c720
Warning: rad does not perform critical resource analysis. Please ensure that no critical resources are affected by this operation before proceeding.
Do you wish to continue(Y/N)? y
root @uksd3 #
root @uksd3 #rad -q
                                                          Driver(s)
Slot      Path        Bus   Speed   Power   Occupied   Suspended   Capable
0-1-1-0   2/0/0       0     33      On      Yes        No          No
0-1-1-1   2/0/1/0     8     33      On      Yes        Yes         Yes
0-1-1-2   2/0/2/0     16    33      On      Yes        No          Yes
0-1-1-3   2/0/3/0     24    33      On      Yes        No          Yes
0-1-1-4   2/0/4/0     32    33      On      Yes        No          Yes
0-1-1-5   2/0/6/0     48    33      On      No         N/A         N/A
0-1-1-6   2/0/14/0    112   33      On      No         N/A         N/A
0-1-1-7   2/0/12/0    96    33      On      Yes        No          Yes
0-1-1-8   2/0/11/0    88    33      On      Yes        No          Yes
0-1-1-9   2/0/10/0    80    33      On      Yes        No          Yes
0-1-1-10  2/0/9/0     72    33      On      Yes        No          Yes
0-1-1-11  2/0/8/0     64    33      On      Yes        No          Yes
root @uksd3 #
We will now look at the associated output from syslog
:
root @uksd3 #more /var/adm/syslog/syslog.log
...
Nov  4 18:15:50 uksd3 vmunix: SCSI: Write error -- dev: b 31 0x000000, errno: 126, resid: 8192,
Nov  4 18:15:50 uksd3 vmunix:    blkno: 7835584, sectno: 15671168, offset: -566296576, bcount: 8192.
Nov  4 18:15:50 uksd3 vmunix: SCSI: Async write error -- dev: b 31 0x000000, errno: 126, resid: 2048,
Nov  4 18:15:50 uksd3 vmunix:    blkno: 6389028, sectno: 12778056, offset: -2047569920, bcount: 2048.
Nov  4 18:15:50 uksd3 vmunix:    blkno: 4701432, sectno: 9402864, offset: 519299072, bcount: 2048.
Nov  4 18:15:50 uksd3 vmunix:    blkno: 4701464, sectno: 9402928, offset: 519331840, bcount: 2048.
Nov  4 18:15:50 uksd3 vmunix:    blkno: 4701448, sectno: 9402896, offset: 519315456, bcount: 2048.
Nov  4 18:15:50 uksd3 vmunix:    blkno: 4701444, sectno: 9402888, offset: 519311360, bcount: 2048.
Nov  4 18:15:50 uksd3 vmunix:    blkno: 4572074, sectno: 9144148, offset: 386836480, bcount: 2048.
Nov  4 18:15:50 uksd3 vmunix:    blkno: 4572098, sectno: 9144196, offset: 386861056, bcount: 2048.
Nov  4 18:15:50 uksd3 vmunix:    blkno: 4572090, sectno: 9144180, offset: 386852864, bcount: 2048.
Nov  4 18:15:50 uksd3 vmunix: SCSI: Read error -- dev: b 31 0x000000, errno: 126, resid: 2048,
Nov  4 18:15:50 uksd3 vmunix:    blkno: 8, sectno: 16, offset: 8192, bcount: 2048.
Nov  4 18:15:50 uksd3 vmunix: SCSI: Async write error -- dev: b 31 0x000000, errno: 126, resid: 2048,
Nov  4 18:15:50 uksd3 above message repeats 7 times
Nov  4 18:15:50 uksd3 vmunix: LVM: Path (device 0x1f001000) to PV 1 in VG 0 Failed!
Nov  4 18:15:50 uksd3 vmunix: LVM: vg[0]: pvnum=0 (dev_t=0x1f000000) is POWERFAILED
Nov  4 18:15:50 uksd3 vmunix:
Nov  4 18:15:51 uksd3 above message repeats 9 times
Nov  4 18:15:50 uksd3 vmunix: LVM: vg[0]: pvnum=1 (dev_t=0x1f001000) is POWERFAILED
Nov  4 18:15:55 uksd3 vmunix: DIAGNOSTIC SYSTEM WARNING:
Nov  4 18:15:55 uksd3 vmunix:    The diagnostic logging facility is no longer receiving excessive
Nov  4 18:15:55 uksd3 vmunix:    errors from the I/O subsystem. 15 I/O error entries were lost.
root @uksd3 #
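On a busy system, a short awk filter can pull the affected device pointers out of a syslog extract like this. The here-document below simply reproduces two of the LVM lines from the output above; on a live system you would read /var/adm/syslog/syslog.log instead.

```shell
# Print the dev_t pointer from each LVM POWERFAILED line.
awk '/POWERFAILED/ {
    for (i = 1; i <= NF; i++)
        if ($i ~ /dev_t=/) {
            sub(/.*dev_t=/, "", $i)   # strip everything up to "dev_t="
            sub(/\)/, "", $i)         # strip the trailing parenthesis
            print $i
        }
}' <<'EOF'
Nov  4 18:15:50 uksd3 vmunix: LVM: vg[0]: pvnum=0 (dev_t=0x1f000000) is POWERFAILED
Nov  4 18:15:50 uksd3 vmunix: LVM: vg[0]: pvnum=1 (dev_t=0x1f001000) is POWERFAILED
EOF
```

This prints 0x1f000000 and 0x1f001000, the two pointers we go on to decode into major and minor numbers.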
As we can see, SCSI has produced lbolt messages associated with some devices. Hopefully, in the lbolt message, or in an LVM POWERFAIL message, we get a pointer to the affected device. The LVM POWERFAIL messages have given us dev_t pointers of 0x1f000000 and 0x1f001000. If we look at one of these addresses, 1f000000, the first two characters represent the major number of the affected device: 1f (hexadecimal) = 31 (decimal). The remainder of the address is the minor number of the affected device. If we look at the disks under /dev/dsk, we are looking for a disk with a major number of 31 and a minor number of 0x000000:
root @uksd3 #ll /dev/dsk
total 0
brw-r-----   1 bin        sys         31 0x000000 Jul  7 11:17 c0t0d0
brw-r-----   1 bin        sys         31 0x001000 Jul  7 11:17 c0t1d0
brw-r-----   1 bin        sys         31 0x002000 Jul  7 11:17 c0t2d0
brw-r-----   1 bin        sys         31 0x03a000 Oct 31 11:17 c3t10d0
brw-r-----   1 bin        sys         31 0x038000 Oct 31 11:17 c3t8d0
brw-r-----   1 bin        sys         31 0x061000 Jul  7 11:17 c6t1d0
brw-r-----   1 bin        sys         31 0x070000 Jul  7 11:17 c7t0d0
brw-r-----   1 bin        sys         31 0x070100 Jul  7 11:17 c7t0d1
brw-r-----   1 bin        sys         31 0x070200 Jul  7 11:17 c7t0d2
brw-r-----   1 bin        sys         31 0x070300 Jul  7 11:17 c7t0d3
brw-r-----   1 bin        sys         31 0x070400 Jul  7 11:17 c7t0d4
brw-r-----   1 bin        sys         31 0x071000 Jul  7 11:17 c7t1d0
brw-r-----   1 bin        sys         31 0x072000 Jul  7 11:17 c7t2d0
brw-r-----   1 bin        sys         31 0x073000 Jul  7 11:17 c7t3d0
brw-r-----   1 bin        sys         31 0x080000 Jul  7 11:17 c8t0d0
brw-r-----   1 bin        sys         31 0x080100 Jul  7 11:17 c8t0d1
brw-r-----   1 bin        sys         31 0x080200 Jul  7 11:17 c8t0d2
brw-r-----   1 bin        sys         31 0x080300 Jul  7 11:17 c8t0d3
brw-r-----   1 bin        sys         31 0x080400 Jul  7 11:17 c8t0d4
brw-r-----   1 bin        sys         31 0x081000 Jul  7 11:17 c8t1d0
brw-r-----   1 bin        sys         31 0x082000 Jul  7 11:17 c8t2d0
brw-r-----   1 bin        sys         31 0x083000 Jul  7 11:17 c8t3d0
root @uksd3 #
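The major/minor arithmetic can be checked with a couple of lines of shell. This is only a sketch; the dev_t value is the one LVM reported in syslog above.

```shell
# Split a dev_t value into its major (top byte) and minor (bottom 24 bits) parts.
dev_t=0x1f000000
major=$(( dev_t >> 24 ))                          # 0x1f -> 31 decimal
minor=$(printf '0x%06x' $(( dev_t & 0xffffff )))  # remaining six hex digits
echo "major=$major minor=$minor"                  # major=31 minor=0x000000
```

The result, major 31 and minor 0x000000, matches the device file c0t0d0 in the listing above.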
As you can see, we have pinpointed the affected devices. With lssf, we can identify the interface card to which each device pertains:
root @uksd3 #lssf /dev/dsk/c0t0d0
sdisk card instance 0 SCSI target 0 SCSI LUN 0 section 0 at address 2/0/1/0/0.0.0 /dev/dsk/c0t0d0
root @uksd3 #lssf /dev/dsk/c0t1d0
sdisk card instance 0 SCSI target 1 SCSI LUN 0 section 0 at address 2/0/1/0/0.1.0 /dev/dsk/c0t1d0
root @uksd3 #
We can also see that our root disk now has stale extents associated with the IO requests that never completed:
root @uksd3 #lvdisplay -v /dev/vg00/lvol* | grep stale | wc -l
52
root @uksd3 #
With the driver now suspended, the kernel will no longer process any requests or messages for this device. This should be borne in mind if we use commands such as ioscan at this point:
root @uksd3 #ioscan -fnC ext_bus
Class I H/W Path Driver S/W State H/W Type Description
===========================================================================
ext_bus 0 2/0/1/0/0 c720 CLAIMED INTERFACE SCSI C895 Ultra2 Wide LVD
ext_bus 1 2/0/3/0/0 c720 CLAIMED INTERFACE SCSI C87x Ultra Wide Differential
ext_bus 2 2/0/3/0/1 c720 CLAIMED INTERFACE SCSI C87x Ultra Wide Differential
ext_bus 3 2/0/4/0/0 c720 CLAIMED INTERFACE SCSI C895 Ultra2 Wide LVD
ext_bus 4 2/0/8/0/0 c720 CLAIMED INTERFACE SCSI C87x Ultra Wide Differential
ext_bus 5 2/0/8/0/1 c720 CLAIMED INTERFACE SCSI C87x Ultra Wide Differential
ext_bus 6 2/0/11/0/0 c720 CLAIMED INTERFACE SCSI C87x Ultra Wide Differential
ext_bus 7 2/0/12/0/0.8.0.4.0 fcparray CLAIMED INTERFACE FCP Array Interface
ext_bus 8 2/0/12/0/0.8.0.5.0 fcparray CLAIMED INTERFACE FCP Array Interface
ext_bus 9 2/0/12/0/0.8.0.255.0 fcpdev CLAIMED INTERFACE FCP Device Interface
root @uksd3 #
As you can see, the state of the driver for this card is still CLAIMED
and we are not seeing a NO-HW
state anywhere. The reason is that before we suspended the driver, the card was in a CLAIMED
state. Now that the driver has been suspended, it will remain in this state until the driver is resumed. We can proceed with turning off power to the PCI card slot.
Now that the kernel driver is no longer processing requests for this PCI card slot, we can turn off the power to the slot itself.
root @uksd3 #rad -o 0-1-1-1
root @uksd3 #
root @uksd3 #rad -q
                                                          Driver(s)
Slot      Path        Bus   Speed   Power   Occupied   Suspended   Capable
0-1-1-0   2/0/0       0     33      On      Yes        No          No
0-1-1-1   2/0/1/0     8     33      Off     Yes        Yes         Yes
0-1-1-2   2/0/2/0     16    33      On      Yes        No          Yes
0-1-1-3   2/0/3/0     24    33      On      Yes        No          Yes
0-1-1-4   2/0/4/0     32    33      On      Yes        No          Yes
0-1-1-5   2/0/6/0     48    33      On      No         N/A         N/A
0-1-1-6   2/0/14/0    112   33      On      No         N/A         N/A
0-1-1-7   2/0/12/0    96    33      On      Yes        No          Yes
0-1-1-8   2/0/11/0    88    33      On      Yes        No          Yes
0-1-1-9   2/0/10/0    80    33      On      Yes        No          Yes
0-1-1-10  2/0/9/0     72    33      On      Yes        No          Yes
0-1-1-11  2/0/8/0     64    33      On      Yes        No          Yes
root @uksd3 #
We are now in a position to have the card replaced by a qualified HP engineer. You might want to tell the engineer that it is the PCI card in cabinet 0, Bay 1, IO cardcage 1, slot 1; it's the card with the attention light flashing.
In most cases, you will need an HP Hardware Customer Engineer to replace hardware in your server. If you perform this task yourself, there is a possibility of rendering the entire system unusable as well as nullifying your Support Contract; check this before proceeding. If you are going to replace a PCI card on your own, please make sure you follow all electrostatic discharge guidelines. The PCI cards do not have retention screws but are a tight fit in the PCI slot. When we turn power back on to the slot, we will receive an error message if the card has not been inserted properly.
The card that we use must be of the same product type as the one currently in the PCI card slot. If one of your colleagues has told you that “it is a Fibre Channel card, it will be OK,” that's not necessarily true. Hewlett-Packard will only support the replacement of a PCI card with another PCI card of the same product number. This needs to be emphasized and explained a little further.
If we have an A6795A Fibre Channel card, we must replace it with another A6795A Fibre Channel card. The fact that a replacement is also a Fibre Channel card does not mean that the assumptions of predetermined behavior established by the kernel driver will be maintained if we substitute a different model for the A6795A.
Another aspect of this is that if we have a PCI card slot occupied with a card, e.g., our A6795A Fibre Channel card, and we want to replace it with an ATM card, for example, we would have to go through a reboot of the server to effect this change. The reason for this is that OLA/R does not support deleting the old driver from the kernel IO tree before adding a new card into the kernel IO tree.
Now that the card has been replaced, we can now turn on power to the slot. If we have not inserted the replacement card properly, we will get an error message:
root @uksd3 #rad -i 0-1-1-1
root @uksd3 #rad -q
                                                          Driver(s)
Slot      Path        Bus   Speed   Power   Occupied   Suspended   Capable
0-1-1-0   2/0/0       0     33      On      Yes        No          No
0-1-1-1   2/0/1/0     8     33      On      Yes        Yes         Yes
0-1-1-2   2/0/2/0     16    33      On      Yes        No          Yes
0-1-1-3   2/0/3/0     24    33      On      Yes        No          Yes
0-1-1-4   2/0/4/0     32    33      On      Yes        No          Yes
0-1-1-5   2/0/6/0     48    33      On      No         N/A         N/A
0-1-1-6   2/0/14/0    112   33      On      No         N/A         N/A
0-1-1-7   2/0/12/0    96    33      On      Yes        No          Yes
0-1-1-8   2/0/11/0    88    33      On      Yes        No          Yes
0-1-1-9   2/0/10/0    80    33      On      Yes        No          Yes
0-1-1-10  2/0/9/0     72    33      On      Yes        No          Yes
0-1-1-11  2/0/8/0     64    33      On      Yes        No          Yes
root @uksd3 #
The script we ran in Step 6 has a number of command-line options, depending on whether we are adding or replacing a card. One of the options is post_replace. Like the prep_replace option, it allows the driver developer to do something to the card at the specified hardware path before the kernel driver is resumed. An example cited to me was a Fibre Channel card whose laser can be turned on and tested independently of the rest of the card. We should run the associated script to ensure that anything that needs to be done is done.
root @uksd3 #ll /usr/sbin/olrad.d
total 80
-r-xr-xr-x   1 bin        bin           2889 Nov 14  2000 c720
-r-xr-xr-x   1 bin        bin           2977 Dec 12  2001 c8xx
-r-xr-xr-x   1 bin        bin           2236 Jun 19  2001 fddi4
-r-xr-xr-x   1 bin        bin           4542 Dec 21  2000 iop_drv
-r-xr-xr-x   1 bin        bin           2124 Jun 19  2002 td
root @uksd3 #/usr/sbin/olrad.d/c720 post_replace 2/0/1/0/0
root @uksd3 #
Now that the card has been replaced and prepared for operation, we use the rad command to resume the kernel driver:
root @uksd3 #rad -r 0-1-1-1
root @uksd3 #
root @uksd3 #rad -q
                                                          Driver(s)
Slot      Path        Bus   Speed   Power   Occupied   Suspended   Capable
0-1-1-0   2/0/0       0     33      On      Yes        No          No
0-1-1-1   2/0/1/0     8     33      On      Yes        No          Yes
0-1-1-2   2/0/2/0     16    33      On      Yes        No          Yes
0-1-1-3   2/0/3/0     24    33      On      Yes        No          Yes
0-1-1-4   2/0/4/0     32    33      On      Yes        No          Yes
0-1-1-5   2/0/6/0     48    33      On      No         N/A         N/A
0-1-1-6   2/0/14/0    112   33      On      No         N/A         N/A
0-1-1-7   2/0/12/0    96    33      On      Yes        No          Yes
0-1-1-8   2/0/11/0    88    33      On      Yes        No          Yes
0-1-1-9   2/0/10/0    80    33      On      Yes        No          Yes
0-1-1-10  2/0/9/0     72    33      On      Yes        No          Yes
0-1-1-11  2/0/8/0     64    33      On      Yes        No          Yes
root @uksd3 #
What happens next depends on the original function of the card. If the card was a primary LAN interface in an MC/ServiceGuard cluster, we would expect MC/ServiceGuard to relocate the appropriate IP addresses back from the standby interface to the primary interface. In our case, we will check on how our stale extents are behaving.
We can use the rad -c command to check the status of a particular PCI slot:
root @uksd3 #rad -c 0-1-1-1
Path :2/0/1/0/0
Name :c720
Device_ID :000c
Vendor_ID :1000
Subsystem_ID :10f5
Subsystem_Vendor_ID :103c
Revision_ID :2
Class :010000
Status :0200
Command :0157
Multi_func :No
Bridge :No
Capable_66Mhz :No
Power_Consumption :75
root @uksd3 #
Normally, we will want to check with commands such as lanscan
(for replaced LAN cards), LVM commands (for replaced Fibre Channel, SCSI cards), and the content of syslog.log
to ensure that functionality has been restored. In our case, we should check up on the state of our stale extents:
root @uksd3 #lvdisplay -v /dev/vg00/lvol* | grep stale | wc -l
0
root @uksd3 #
As we can see, LVM has resynchronized all the stale extents. This is the normal behavior of LVM. From syslog.log
, we can see that the disks attached to that interface have now been returned to the volume group:
root @uksd3 #tail -5 /var/adm/syslog/syslog.log
Nov  4 18:32:45 uksd3 EMS [2738]: ------ EMS Event Notification ------ Value: "CRITICAL (5)" for Resource: "/storage/events/disks/default/2_0_1_0_0.2.0" ( Threshold: >= " 3") Execute the following command to obtain event details: /opt/resmon/bin/resdata -R 179437576 -r storage/events/disks/default/2_0_1_0_0.2.0 -n 179437571 -a
Nov  4 18:36:02 uksd3 vmunix: LVM: Recovered Path (device 0x1f001000) to PV 1 in VG 0.
Nov  4 18:36:02 uksd3 vmunix: LVM: Recovered Path (device 0x1f000000) to PV 0 in VG 0.
Nov  4 18:36:02 uksd3 vmunix: LVM: Restored PV 1 to VG 0.
Nov  4 18:36:03 uksd3 vmunix: LVM: Restored PV 0 to VG 0.
root @uksd3 #
At the beginning, we turned on the attention light for the affected PCI slot. We would be remiss to leave it blinking; it would alert someone to a hardware problem that no longer exists. It is good practice to turn the attention light off. We do so with the rad -f command:
root @uksd3 #rad -f off 0-1-1-1
root @uksd3 #
Before we add a new PCI card, we must make sure that the driver for the new card is currently in the kernel. If we don't have the driver loaded in the kernel, chances are good that we will have to reboot the server to include the driver into the kernel. If the card is a multi-function card, we need to ensure that all the drivers for all the functions are loaded in the kernel.
From HP-UX 11.0 onward, HP-UX has supported Dynamically Loadable Kernel Modules (DLKM). If the driver you need to load is not a DLKM, then it needs to be statically linked into the kernel. I will not cover this in great detail here, but just to remind you:
root @uksd3 #kmadmin -s
Name ID Status Type
=====================================================
krm 1 UNLOADED WSIO
root @uksd3 #
This command lists all the DLKM modules installed and compiled on the system. If the driver in question is a DLKM module, we will need to install and compile it (usually performed as part of a swinstall of the driver software). When completed, we should see the driver listed in the output of kmadmin -s. As yet, few drivers for HP-UX are DLKM modules, although new software products, e.g., IPFilter, are introducing more DLKM modules all the time. It is worth checking the documentation for the driver concerned. We can then load the driver into memory simply with this command:
root @uksd3 #kmadmin -L krm
kmadmin: Module krm loaded, ID = 1
root @uksd3 #kmadmin -Q krm
Module Name             krm
Module ID               1
Module Path             /stand/dlkm/mod.d/krm
Status                  LOADED
Size                    61440
Base Address            0xe84000
BSS Size                53248
BSS Base Address        0xe85000
Hold Count              1
Dependent Count         0
Unload Delay            0 seconds
Description             krm
Type                    WSIO
Block Major             -1
Character Major         76
Flags                   a5
root @uksd3 #
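Deciding whether a module still needs loading can be scripted by parsing a kmadmin -s listing. This is a sketch only: the here-document stands in for live `kmadmin -s` output, and krm is simply the module from the example above.

```shell
# Report whether a DLKM module is loaded; suggest the load command if not.
module=krm
status=$(awk -v m="$module" '$1 == m { print $3 }' <<'EOF'
Name            ID      Status          Type
=====================================================
krm             1       UNLOADED        WSIO
EOF
)
echo "$module status: $status"
[ "$status" = "UNLOADED" ] && echo "next step: kmadmin -L $module"
```

In real use, you would replace the here-document with a pipe from `kmadmin -s`.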
We are now in a position to proceed with adding the new PCI card. The process is quite similar to the process for replacing a failed PCI card, so I will simply list the relevant bullet points and make any additional comments:
Identify an empty PCI card slot.
We would use the rad –q
command to identify an empty slot.
Perform Critical Resource Analysis on the affected PCI card slot.
There are currently no resources using this slot, but we do have to consider the consequences of turning off power to this slot.
Check that the affected PCI slot is in its own power domain.
Turn on the attention light for the affected PCI card slot.
Turn off the power to the affected PCI slot.
Add the PCI card and attach any associated devices.
Turn on the power to the affected PCI slot.
Run any associated driver scripts before resuming the driver.
The script located in /usr/sbin/olrad.d
has a command line option of post_add
. We should run the associated script to ensure that anything that needs to be done is done.
Check the functionality of the newly added PCI card.
In this case, we will most likely have to assign an Instance number to the card (run ioscan
) and create the necessary device files (run insf –ve
) before we can use any of the new devices.
Turn off the attention light for the affected PCI slot.
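The card-addition steps above can be sketched as a short shell script. This is a hedged outline only: the slot-id, hardware path, and driver name are hypothetical values for illustration, and the commands are echoed rather than executed (rad, ioscan, and insf exist only on HP-UX).

```shell
#!/bin/sh
# Dry-run sketch of the OLA/R card-addition sequence described above.
# SLOT, HWPATH, and DRIVER are hypothetical values for illustration.
SLOT=0-1-1-5            # an empty slot, as reported by "rad -q"
HWPATH=2/0/6/0          # hardware path corresponding to that slot
DRIVER=c720             # kernel driver for the new card

run() { echo "+ $*"; }  # change the body to "$@" to execute for real

run rad -f on "$SLOT"                              # attention light on
run rad -o "$SLOT"                                 # power off the slot
# ... the card is physically inserted and cabled at this point ...
run rad -i "$SLOT"                                 # power the slot back on
run /usr/sbin/olrad.d/"$DRIVER" post_add "$HWPATH" # run the driver script
run ioscan -fC ext_bus                             # assign an Instance number
run insf -ve                                       # create device files
run rad -f off "$SLOT"                             # attention light off
```

Remember that a Critical Resource Analysis is still your responsibility; rad performs none.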
That concludes our discussions regarding Managing HP-UX Peripherals.
You are attempting to ensure that the IO tree on all your HP-UX machines is the same from one server to another. All the servers currently have exactly the same hardware configuration and are all connected to the same number of shared devices. All the servers have recently been reinstalled with the most recent version of HP-UX. Explain how you can ensure that all the servers in your network continue to create device files following the same device file naming convention for every shared device. Give at least one example when the device file names could become out of sync.
You have successfully remapped your IO tree to reflect the IO tree on other servers in your network. You can identify all the new device files and can identify individual devices (using commands such as
Your SAN administrator has decided to perform a firmware upgrade on all of your Fibre Channel switches because they are rather old and have experienced intermittent problems that have been identified with the particular firmware revision used on the switches. The upgrade was entirely successful, and the SAN is working as expected. Unfortunately, the main database application cannot access its data held on a disk array located within the SAN. The SAN administrator has checked the zoning of the SAN and all appears to be okay (they even changed the GBICs just in case). The disk array administrator has checked the LUN security on the disk array, and all appears okay. What could be causing the application to be unable to see the LUNs on the disk array? List two solutions to rectify the problem.
Given the simplified diagram of a Switched Fabric SAN in Figure 4-3, work out the full hardware path and associated device files (/dev/*dsk/?) that represent the path taken by the red and blue lines to the LUN on the disk array. Note: The
You are about to replace an interface card using OLA/R commands instead of using SAM. You know the hardware path and the associated slot-id of the interface card in question. Why is it important that you know the name of the kernel driver associated with the interface card? |