4.2. SAN Storage

Centralizing your storage for multiple hosts can have a number of positive effects for many environments. Many centralized storage environments are SANs. Proper planning will have long-lasting positive effects on your installation. And in the planning stage, one of the most critical considerations is what your SAN will be doing. That determines the requirements for performance, total capacity, concurrent access, uptime—even future expansion (another area where thinking in advance is imperative). Once you've completed planning, you need to gather the necessary equipment, software, and documentation for the installation. Whatever form your SAN takes, it'll be a bit of an investment, but a decent one if you plan appropriately for your environment. Fail to do that, though, and you may end up with a large bill and a system that doesn't work for you.

What constitutes a SAN can be different to different people. Because Apple distributes and supports Xsan, though, we'll cover it first. In the course of this, we'll also look at setting up standard file-sharing services on the Xsan volume, to provide high availability beyond what is possible with a traditional file server that uses direct-attached storage. We'll set up multiple server heads in an active/active configuration. Once we've covered Xsan, we'll look at using iSCSI initiators to interface with common SAN solutions that your organization may already have in production.

4.2.1. Xsan

Apple has made Xsan one of the easiest, most cost-effective and versatile storage area networking platforms on the market. This powerful software integrates Mac OS X Server (as well as other Apple offerings), Fibre Channel (which we'll discuss shortly), and RAID architecture. It binds all the components together to provide performance and flexibility that pushes centralized storage for heterogeneous networks to the next level. To grasp the power and flexibility of this solution, you must first understand how it organizes and provides access to data, which we describe throughout this chapter. The combination of Mac OS X Server and other Apple offerings gives multiple computers concurrent access to large amounts of media, organized into pools of storage and interconnected over Fibre Channel, which provides fast, virtualized connectivity to the target storage.

SAN installations are as diverse as the businesses they serve. And although Xsan was developed primarily for professional video, admins can leverage it to provide storage for a wide array of uses including file sharing, mail clustering, and calendar-server clustering. In an Xsan installation you'll find a variety of components. These typically include Apple Xserve RAID or Promise RAID storage, client systems with Fibre Channel cards, transceivers, Xsan software, fiber cabling, an FC switch, a dedicated Ethernet network for management, and one or more systems (known as metadata controllers or MDCs) devoted to running the Xsan. Each node must also run either Xsan for client connectivity and administration or, on Windows and Linux PCs, Quantum StorNext (which you can purchase at www.quantum.com).

Fibre Channel is an extension to SCSI that allows connections to a wide variety of devices and multiple petabytes of data via copper and optical cabling. Client nodes must be connected via FC to access an Xsan installation. Fibre-attached servers can reshare data over various file- or Web-sharing protocols, but the exclusive direct access to storage through Fibre Channel provides security and much better performance. Each file or Web server added to the SAN adds aggregate throughput, provided you haven't saturated the back-end storage.

4.2.1.1. Cabling and Transceivers

Apple FC cards that come with two SFP (small form-factor pluggable) ports (most do) also include two 2.9-meter copper cables with SFP connections on each end. You shouldn't use longer copper cables.

For connecting systems that are farther than 2.9 meters from the SAN, Apple advocates converting from standard copper cables to LC (Lucent Connector) or SC (Subscriber Connector) optical cabling and highly recommends that it be multimode, which, throughout this chapter, we'll assume to be in use. Because most devices added into an FC network use LC adapters, going that route typically offers the path of least resistance when installing your SAN. LC multimode cables are typically orange, indicating a maximum throughput of 4Gbps, or light blue (for 10Gbps), and contain two optical cables per sheath, making them easy to identify. To use LC cabling, the SFP connection built into systems must be adapted from SFP to LC using a transceiver.

Not all LC cabling can support the maximum speeds, so it's important to ensure you're using the proper type. The maximum length of an optical cable is determined by its diameter. You can run cables that are 9μm in diameter up to 10km. This is typically referred to as long haul. The only long-haul transceiver supported by Apple for the Xsan is the Finisar FTRJ-1319-P1BTL. Short-haul cables are 50μm and can run 500 meters; 62.5μm cables can span 300 meters.

If you'll be using transceivers, Apple recommends sticking with the same manufacturer and model for all devices connected to your fabric (an interconnection of FC host ports). It's also worth noting that the online Apple Store sells Finisar transceivers.

4.2.1.2. Storage

We refer to a single device chassis containing a number of drives as a shelf of storage. You can combine the disks on a shelf of storage into a logical RAID that, depending on its type, offers a variety of features such as redundancy or faster access times. The most common RAID products used in an Xsan environment are the Apple Xserve RAID in legacy installations and the Promise Vtrak in newer setups. (You can use other Fibre Channel RAID devices, including those from EMC and Active Storage, but they will likely not be supported by Apple.) Each RAID unit, or shelf, will typically have multiple controllers.

Each RAID can provide a number of LUNs (Logical Unit Numbers). A LUN is a logical partition of the storage that resides on a given shelf. On an Xserve RAID device, a LUN is restricted to drives managed by a given controller. A Promise RAID product, by contrast, can fail a LUN over between controllers as long as they're within the same shelf or in a connected expansion chassis. Each controller is then plugged into the Fibre Channel switch.

NOTE

While there are many hardware vendors that supply components you can use in an Xsan, the devices must all be approved by Apple if you want support from Apple.

4.2.1.3. Virtualized Storage

As you add more RAID devices to your environment, you aggregate the storage. Xsan can combine a set of LUNs into a storage pool. A storage pool can span multiple shelves or be on a single unit but should typically contain at most four LUNs. Because the storage pool will reduce the capacity of all its LUNs to that of the smallest one, LUNs you choose to pool should be of similar capacity.

Combining multiple storage pools creates volumes, and with Xsan you can mount and unmount these on client systems. For most purposes, such a collection will present as any other local hard drive despite running the Apple clustered file system (ACFS) rather than the default OS X file system, HFS+. Servers treat Xsan volumes much as they would direct-attached external storage despite the significantly more-complex back-end infrastructure.

Once configured, Fibre Channel is the network that interconnects all of the clients, servers, and storage. In FC jargon, the first two are referred to as FC initiators. Storage devices such as disk-based raids, tape libraries, or other storage media are referred to as targets. Built on top of this storage and communications infrastructure, the Xsan software provides the virtualized, logical constructs used to provide maximum speed, redundancy, and concurrent access.

NOTE

Although not strictly considered a component of an Xsan, a UPS capable of powering the equipment is absolutely necessary. This one item can save an administrator some painful headaches. If your entire data center is powered by a UPS, you're probably covered; if not, you should certainly invest in one and set up the automated shutdown software to unmount clients, stop volumes, and gracefully shut down the computers that manage the SAN.

4.2.1.4. Initiators

Now that we've covered the physical components of an Xsan, it's important to understand those that reside on Mac OS X. For the purposes of this chapter, Xsan clients are systems that log into an Xsan and mount volumes. Metadata controllers are systems that manage those same volumes. The Xsan software installs and runs a number of services on the computer that manages the actual Xsan.

All of the computers that act as Xsan initiators (non-storage devices, such as clients and controllers) run the Xsan software and have a host bus adapter (HBA). Apple sells rebranded LSI Logic HBAs for use with Xsan or for connecting directly to an Xserve RAID or Promise RAID component. Each FC port on these cards has a factory-assigned WWPN (WorldWide Port Name) and WWN (WorldWide Name), the equivalent of an Ethernet adapter's MAC address. In a standard setup you probably won't need to customize any of the card's settings. If necessary, though, you can do so using the FC System Preference pane. Configuration choices are: Automatic, Point-to-Point, and Arbitrated Loop. With Xsan initiators, use Automatic or Point-to-Point. Each of the host adapters gets plugged into the Fibre Channel switch.

NOTE

The release tab on the cable is very close to the chassis when the Apple FC PCI card is installed in the upper PCI slot of the dual PCI riser card. The limited space can make it difficult to press the tab on the connector to release the cable. In this case, use a flat object such as a screwdriver or knife to depress the tab before pulling on the cable. Do not force the connectors.

You can use Quantum StorNext to set up non-Apple clients on the Xsan. The software supports AIX, IRIX, Linux, Solaris, and Windows clients. Many of these non-Apple machines can connect to the switch with Fibre Channel cards manufactured by ATTO Technology, LSI, Qlogic, and other suppliers (ATTO and Qlogic both have drivers for Mac OS X as well).

4.2.1.5. Switches

For a SAN to be considered a fabric, it must have an FC switch. With Xsan, we strongly recommend that you use one supported by Apple. Such devices include the Brocade Silkworm 200E, 4100, and 4900, the Cisco MDS 9000 Series, and the QLogic SANbox 2-64, 1400, 5200, 5600, 9100, and 9200 series. If, for an earlier Xsan release, Apple certified a switch that you're using, the company will likely continue its support even if the device isn't in the current list of qualified switches (which you'll find at www.apple.com/xsan/specs.html).

Whatever the brand of switch, in an Xsan deployment, some parts of the configuration process are identical. Before anything else, you should upgrade the firmware (which you should do with most any device). Even after setup, continually updating firmware is important. (Of course, in many cases, you don't want to do so at the expense of bringing a SAN down unless there's a compelling reason for the upgrade.) Once your switches are running the latest firmware, you can administer most through a Web-based interface.

You also want to prevent any interruption in communications with your targets. This includes Registered State Change Notifications (RSCNs), which should be suppressed on initiator ports for all switches. Typically, a client sends an RSCN when connecting to a fabric, and that can cause communications interrupts. Because client workstations tend to reboot often, suppressing RSCN on initiator ports ensures that communication between initiators and targets remains uninterrupted.

You should also make sure that communications occur at the appropriate speed. If a switch and a target or a switch and a LUN are both capable of running at 4Gbps, you should verify that the link appears as 4Gbps on both ends. Switches, targets, and initiators assign speeds automatically (in much the way most Ethernet cards and switches auto-sense), so you don't usually have to statically set a port's speed. But as you add new devices to your fabric, verify that they communicate at the proper rate. When a SAN client displays poor performance or high latency, statically assigning link type and speed can sometimes address the issue. Also, Promise support advises that you statically configure controllers with these settings to reduce latency.

When dealing with FC link negotiation, having some basic knowledge about various port topologies is important. These are broken down according to type and use. The FC spec calls for several initiator port topologies:

  • N_port (node port): Specifies a point-to-point topology.

  • NL_port (node-loop port): Refers to a client port that will negotiate as an arbitrated loop device. Generally, you should avoid this configuration, but many tape drives support NL_port topologies only.

On the fabric-switch side, you'll find these topologies include:

  • E_port (expansion port): Used to connect two switches together via an ISL (Inter-Switch Link) connection.

  • F_port (fabric port): Negotiates a point-to-point connection with an N_port device.

  • FL_port (fabric loop port): Can operate as an F port but can also connect via arbitrated loop to NL_port devices.

  • G_port (generic port): Can operate either as an N_port or E_port, as needed.

  • GL_port (generic loop port): This is a generic port that can act as a G_port or an FL_port. This is the topology used by Qlogic switches out of the box.

When setting up FC switches and storage, also set the NTP (Network Time Protocol) service, and when possible, centralize logging and set e-mail alerts. Each of these steps can help down the road if you ever need to troubleshoot your Xsan or have issues that you need to be alerted about.

NOTE

Xsan environments do not support switching hubs. When added (and sometimes on rebooting), a device sends out a loop initialization primitive (LIP) to request an address. All activity on the loop can cease as each device establishes a connection within the newly enumerated fabric. A hub-based SAN consists of one loop and therefore must be entirely rebuilt every time any device is added or removed. This wreaks havoc on an Xsan and can even cause a LIP storm, an endless stream of initialization requests. FC switches can also respond poorly to LIP requests, which are sent when a computer with an improperly set startup disk reboots: the FC port is queried for a startup disk and a LIP occurs. Because of this, for all clients that are Xsan initiators (yes, that includes your metadata controllers) you should go to the Startup Disk System Preference pane and set the startup disk. We also recommend that you statically set your FC connections to point-to-point (N_port) using the FC System Preference pane (in 10.4, this is found in /Applications/Server).

4.2.1.5.1. Brocade Switches

You administer Brocade Switches through a Web portal (at the IP address 10.77.77.77) using admin for the user name and password for the password. The first time you use the Web Tools you'll have to enter a license.

4.2.1.5.2. Emulex Switches

Some older Emulex SAN switches—the 12-port 355, the 375, and the 9200—maintain legacy support for Xsan. Emulex switches require that you set the host-machine ports to Initiator with Stealth and the storage-device ports to Target with Stealth. You can access Emulex switches at the IP address 169.254.10.10. They require no user name, and the default password is password.

4.2.1.5.3. QLogic Switches

The latest QLogic firmware supports administration through a Web portal only. With devices using old versions of the firmware, install the included configuration application on a workstation attached to the switch and go to the IP address 10.0.0.1. The company has updated and enhanced the software, which it now calls the Qlogic Fabric Suite. To authenticate with the switch for the first time, use admin for the user name and password for the password.

In the past, Apple supported the QLogic SANbox 2-8 and 2-16. As noted previously, the company currently certifies the newer SANbox 2-64, the 5000 series (which offers devices with 4 10-gigabit stacking ports and 16 2-gigabit device ports), and the QLogic 9000 series. QLogic switches are common in Xsan environments.

4.2.1.5.4. Cisco Switches

The most recent brand added to the line of supported switches with Xsan is Cisco. The Cisco MDS 9000 family supports 16- and 32-port modules. The Cisco FC switch is the most highly configurable and feature-rich of the FC switches supported by Xsan. The tradeoff of flexibility is that the Cisco switch is the most complicated of the bunch. While there is a Web-based utility for the system, it is only for monitoring. Initial setup of the switch is performed through the serial port on the system.

4.2.1.6. Zones

You can control access to the SAN using either LUN masking or switch zoning. All of the switches we've mentioned so far support FC zoning. Zoning is similar to creating a VLAN on an Ethernet switch. With LUN masking, you slice the physical storage into partitions (LUNs) and establish filters based on the LUNs' World Wide Names to ensure that only the intended servers have access. When using LUN masking, you can either use the switch to designate a LUN as accessible to one system only, or put both target and initiator ports in a larger zone with other devices and then use the software on a target to restrict access to a given initiator.

Admins who decide to use zones can go about it a few different ways. With the first, port zoning, a zone is defined by physical switch ports, so whatever device is plugged into an included port becomes part of that zone. You can find port zoning helpful in environments where administrators simply need a map of which ports are in which zones but don't need access to make changes. Note, though, that if you add new targets, they will appear on a number of clients and could be formatted accidentally by an unwitting user. The second method zones by the address (WWN) of devices. Across brands, you'll find different terms associated with these approaches.

Opinions about zone management of clients also differ, as do methods. Some people create a new zone for every initiator, restricting what targets each can access. Others leave all their initiators and targets in one big zone and simply let initiators access each target as needed. Still others choose to create two zones, one for metadata controllers and one for client initiators. Each approach has merits, but given that these methods will have similar effects, in most cases your choice boils down to doing whatever fits the security policy and the logic of your environment.

In general, zoning based on Fibre Channel WWNs provides the most resilient setup, eliminating port lock-in and providing a generally less-ambiguous management environment, provided you assign proper nicknames. If your switch supports aliases, grouping target WWNs into a single alias container can greatly simplify deploying a large number of targets across multiple zones. If storage is grouped and aliased in logical divisions, adding new storage is a much more efficient process, as you need only update the group to have the addition applied across all zones that reference the alias.

Generally, when using a tape library that's directly attached to the FC fabric, you'll need to zone the tape drive to be accessible by only a single host port (basically, the backup server). This is often necessary to ensure consistent functionality and to prevent the backup software from producing odd errors.


4.2.2. Configuring Storage

Whether you're configuring switches or storage, the basic precept is pretty much the same: Don't impede the ability of the initiator to write data to the target. Do everything you can to maximize the likelihood that data will be efficiently delivered to the appropriate location.

Whichever vendor you choose, when setting up storage you'll have these options, or some combination of these options for configuring logical RAID constructs:

RAID 0: Offers no redundancy. Gives the fastest data access speeds and is the most inexpensive option but can't guarantee data availability, since it offers no fault tolerance.

RAID 1: Provides data mirroring. Highest-cost option with regard to data capacity. A RAID 1 mirror set typically provides significantly better read speeds than a single member, though write speeds will be roughly equivalent.

RAID 3: Utilizes data striping with one drive dedicated to parity. RAID 3 sets can suffer a single drive loss without data loss.

RAID 5: Stripes both user and parity data across all of the drives, yet produces a relatively small reduction in capacity (generally the equivalent of a single drive). Provides redundancy at the lowest cost in drive space. RAID 5 sets can suffer the loss of a single drive without loss of data.

RAID 6: Similar to RAID 5 but with a second parity drive so that if two drives go down concurrently, the RAID setup isn't compromised.

4.2.2.1. Promise Vtrak

Promise ships two types of RAID devices that Apple has certified to work with Xsan: the E-class, which has RAID controllers, and the J-class, an expansion chassis with a SAS (Serial-Attached SCSI) interconnect that you can hook up to an E-class unit. For approved Promise storage, Apple provides a collection of scripts that configure the Vtrak automatically. They include the following, along with the Web pages where you'll find them:

Metadata and Data on one E-class: http://support.apple.com/kb/HT1160

Data Only on one E-class: http://support.apple.com/kb/HT1161

Metadata and Data on an E-class and J-class (with SAS interconnect): http://support.apple.com/kb/HT1162

Data only on an E-class and J-class (with SAS interconnect): http://support.apple.com/kb/HT1163

Data only on one J-class: http://support.apple.com/kb/HT1121

The scripts provided by Apple are meant to offer a starting point. You can easily tweak the settings according to the Xsan-specific configuration parameters and by following the instructions published at the http://support.apple.com/kb/HT1200 Web page. If you use Apple's scripts to configure Promise RAID systems, make sure that the metadata LUN has a Read Policy of ReadCache and a Write Policy of WriteThru. This ensures that any pending writes to a metadata LUN get written to disk immediately. Data storage LUNs, on the other hand, should have a Read Policy of ReadAhead and a Write Policy of WriteBack. These settings ensure that data buffering during read operations is more aggressive and that write buffers fill before their contents are committed to disk.

Promise RAID hardware that has the latest firmware ships with Bonjour enabled. When you plug the device into your Ethernet network, it pulls an IP from DHCP and becomes accessible through Safari using Bonjour (in Safari's Bookmarks view, click the Bonjour entry under COLLECTIONS). Once you've connected, you'll be able to upload the Vtrak scripts you've downloaded from the Apple site. Click the Administrative Tools icon and select Import. Change the Type: drop-down list to Configuration Script, click the Browse button and choose the script you want to upload, then click Submit when you're ready. When the RAID system finishes formatting, you can label the LUNs (a process we'll get to later in this chapter).

4.2.2.2. Xserve RAID

To set up the Xserve RAID (a legacy device employed prior to the Apple-Promise relationship), you use the RAID Admin utility, which lets you configure multiple LUNs or RAIDs in each RAID device. Use the program to specify the drives that go in each array and their RAID levels, and to configure RAID settings and notifications.

Before you can do anything, you have to add a RAID to administer, so open RAID Admin (in /Applications/Server) and you'll see the dialog box shown in Figure 4-7. Click the Add System button, and from the list of available Xserve RAIDs, choose the one you'd like to configure. Enter the password (the default is public) to view the RAID system's settings, then click Add and you should see your choice appear in the utility's RAID list.

Figure 4.7. The RAID Admin utility

Once you've added all of the Xserve RAIDs, you'll want to make the settings of each conform to the Apple standards. To do that, click Settings in the RAID Admin utility toolbar, enter the management password for the Xserve RAID you're customizing, and then, under the System tab, make the following adjustments to the settings:

  • Enter the Name for the Xserve RAID in the System Name field.

  • Select a Time Synchronization Method (hopefully you'll just be able to use an NTP server), and if appropriate, enter an NTP server to use for clock synchronization.

  • Use the Change button in the Passwords section to change the monitoring password, the management password, or both for the Xserve RAID to something other than the default settings. The monitoring password allows access to the main window UI to view stats, configurations, and logs. The management password lets you access the advanced configuration options presented in the toolbar.

  • Check the box for Restart automatically after a power failure to have the Xserve RAID reboot on its own after a loss of power.

Each controller on the Xserve RAID has its own network interface. By default these receive DHCP addresses. We suggest that you give each network interface a static IP address or use static mappings in your DHCP pool to assign controller addresses; if the addresses change, the controllers become unavailable in RAID Admin and you must re-add them, so letting them float generally isn't a good idea. Xserve RAIDs do support Bonjour discovery in case you forget the configured IP address. You can modify network settings from the Network tab.

You'll find the FC settings under the Fibre Channel tab. This is where you can view the WWN, create hard loops, set speeds to static, and define the topology. Generally you can leave the default settings unless you'll be using arbitrated loops in your FC topology (see Figure 4-8). These settings will be detected automatically, for the most part, but sometimes you may need to assign them manually. For example, if you use an FC switch that doesn't detect the speed of the FC on the Xserve RAID automatically, you may need to set this value by hand.

Figure 4.8. Xserve RAID Fibre Channel Configuration

Under the Performance tab you can customize certain features to enhance the performance of the Xserve RAID (see Figure 4-9) in an Xsan environment. You can:

  • Enable Controller Write Cache (recommended for performance only if a UPS provides power protection to the unit)

  • Enable Host Cache Flushing (recommended to have disabled for best performance)

  • Enable or disable the drive write cache (recommended for performance only if a UPS provides power protection to the unit)

  • Set read prefetch to 1, 8 or 128 stripes for each controller

    Figure 4.9. Xserve RAID Performance Settings

Now you'll want to set up the RAID system's LUNs. The admin utility refers to these logical portions of an Xserve RAID as arrays. You assign each a RAID level according to your requirements. The Xserve RAID supports levels 0, 1, 3, 5, and 0+1. With Xsan you should use level 1 for your metadata LUN and 5 for data LUNs. In the admin utility, select the Xserve RAID on which you'll be creating a LUN and click the Create Array button in the utility's toolbar. When prompted, enter the management password for the Xserve RAID and click OK, then select the RAID level, as shown in Figure 4-10.

Figure 4.10. Creating a new LUN

Next, select the drives you want included in the LUN—simply click in the box that represents each drive in the Step 2 diagram of Figure 4-10. If you wish to begin writing data onto the Xsan during the drives' initialization period, leave the background initialization option enabled. With drives used in an Xsan environment, we prefer to leave the Use drive cache option enabled. Check this box and then click the Create Array button.

With most Xserve RAIDs, fully formatting the drives takes 36 to 72 hours. If you'll be working on the Xsan during its first day and a half of deployment, you might want to consider using the background initialization option. The performance of the Xsan won't be optimal during the initialization process, but that's the only issue with the feature.


4.2.3. Configuring Ethernet

Yes, we said it, Ethernet. But why do you need to worry about a bunch of category 5e or 6 cables if you have fiber now? Because in an Xsan environment, Fibre Channel is used only for streaming actual data. We need a dedicated network for command and control. This means that most Xsan clients will have two Ethernet networks. The first is the standard corporate LAN, which allows for managing network and storage devices, directory services data, and network volumes. It also provides general IP connectivity.

The second Ethernet network is dedicated to metadata. An Xsan client uses this network to communicate with an Xsan metadata controller to request access to an existing asset or ask for access to write data. The MDC is responsible for informing the Xsan client about where data will be streamed to a drive. Whenever a SAN client wants to access a SAN resource, it must first request that resource from the SAN's active MDC. This prevents conflicts with other clients on the SAN. The metadata controller is responsible for ensuring the data integrity of the volume and the filesystem objects on it.

Because a lot of IO requests may be occurring concurrently, it's critical that the metadata network be very fast and have minimal latency. Nearly every operation performed on an Xsan requires filesystem queries, so any latency introduced between the Xsan clients and MDCs will result in perceivable performance degradation on the volume. This can become particularly problematic if you use your SAN for basic file-server storage and it contains mostly smaller files.

This means you need a good switch. Most often you'll use a managed switch with the management features disabled (especially spanning-tree PortFast). The switch and cabling should be Gigabit and there should be very little latency.

You also want very little traffic on the metadata network to help reduce collisions. This means there should be no DHCP server. Also, you shouldn't manage SAN-connected devices over the metadata network. You need no router or default gateway, and DNS shouldn't be running. In addition, keep the subnet as small as possible; a class C (/24) is about the largest you should use.

The configuration on the client systems will also be stripped down. You need only an IP address and a subnet mask. List the metadata network second in the Ethernet stack with your organization's main network listed first.
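
As a quick sketch of that ordering step (the service names Ethernet and Ethernet 2 are hypothetical; substitute whatever your interfaces are called), you can check and set the service order from the command line:

networksetup -listnetworkserviceorder
sudo networksetup -ordernetworkservices "Ethernet" "Ethernet 2"

Note that -ordernetworkservices expects every network service on the system, listed in the order you want, so append any remaining services (AirPort, FireWire, and so on) after the two Ethernet entries.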

Lastly, though not officially required, we recommend you always set up forward and reverse DNS specifically for the metadata network, and when possible, make that data available over it. Creating a new top-level domain, such as metadata.xsan, for this purpose is one common practice. So a primary metadata controller may have a public hostname mdc.myco.com that resolves to the corporate 10.0.2.10 IP address, and its secondary Xsan interface has an IP address of 192.168.2.10, which resolves to mdc.metadata.xsan. As an alternative, you can simply create another subdomain, such as xsan.myco.com, in your organization. In this case the metadata controller would have resolution on its secondary interface point to mdc.xsan.myco.com.

If possible, each client's secondary interface should have DNS entries configured for a DNS server local to the Xsan subnet, such as a backup metadata controller. Having DNS services configured on the metadata networks can help prevent DNS-related timeouts should the primary interface fail. These failures can be particularly detrimental on OS X metadata controllers, causing extremely laggy performance and possibly complete SAN downtime. With Xsan 2, this is less of an issue, but Xsan 1 is fairly sensitive to DNS problems.

4.2.4. Setting up the Xsan

When building an Xsan, the installation of the Xsan software is typically one of the last tasks. It's important to verify DNS operation, TCP/IP connectivity, and connectivity to FC LUNs prior to configuring the software.

To verify DNS functionality, first use changeip, as covered in Chapter 1, to check that the forward and reverse DNS of the primary interfaces on your future metadata controllers resolve properly. Next, use dig to test DNS for the metadata network as well. For example, the following will look up the hostname for the IP address 192.168.210.2:

dig +short -x 192.168.210.2

To test forward lookups, the syntax is:

dig +short myhost.myco.com

Alternatively, you can use the host command to perform both forward and reverse lookups. The command requires no additional flags for either type of lookup:

host 192.168.210.2
host myhost.myco.com

You'll also find other utilities that accomplish the same purpose. Windows users will be familiar with the nslookup utility, which Apple has deprecated as of OS X 10.4. Another command, changeip_ds (the full command path is /usr/libexec/changeip/changeip_ds), can perform reverse DNS lookups when used with the -nameforaddress flag. Additionally, you can simply ping the hostname, in which case the host will use internal facilities to resolve and display the appropriate address.
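
As an example, a quick preflight check on a future metadata controller might look like the following (the address is the sample one used above, and the argument form for changeip_ds is our assumption based on its -nameforaddress flag):

sudo changeip -checkhostname
/usr/libexec/changeip/changeip_ds -nameforaddress 192.168.210.2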

Once you have all of your storage, hosts, and switches in place and have cabled them and verified connectivity, you're finally ready for Xsan software installation and configuration. You'll want to start with the metadata controllers, the aforementioned traffic cops of the SAN. Make one final check to ensure that you're satisfied with the DNS naming and have a good working installation of Mac OS X on the proposed metadata controller before proceeding (we recommend redundant storage for the host OS in the form of a RAID 1 internal volume).

4.2.5. Installation

The first step in setting up Xsan is to install the package file that comes with the software, keeping the default settings. Next, run the update to ensure that the Xsan software and admin tools are the latest versions.

Once you've completed those steps, you'll find the Admin tool in the /Applications/Server directory (which you can remove when doing a custom installation). The bin, config, debug, examples, man, and ras folders will appear in the /Library/FileSystems/Xsan folder. The bin folder will contain the Xsan command-line binary files, which allow you to do everything you can do within Xsan Admin and more. The config folder, at first, will have only a uuid (Universally Unique Identifier) file, but will collect more once you set up the SAN.
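
To confirm that the command-line tools landed where you expect, list the bin folder; you should see the cv-prefixed utilities (cvadmin, cvlabel, cvfsck, and the like) that we'll refer to later in this chapter:

ls /Library/FileSystems/Xsan/bin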

Now you're ready to place the Xsan Admin application in your dock and open it. The first time, you'll see the SAN Setup dialog, which will show an introduction screen. Click Continue to go to the Initial SAN Setup dialog, where you can choose from two options. If this isn't the first MDC you're installing (we assume it is), select Connect to Existing SAN. Otherwise, pick Configure new SAN and click Continue.

At the next screen, name the SAN (see Figure 4-11). What you decide on can be somewhat arbitrary and isn't the same as a volume name. In this screen you can also enter the administrator's name and e-mail address. (This is purely to provide administrator contact information to Xsan users.) Click Continue.

Figure 4.11. SAN Settings

You're now at the Add Computers screen, where you'll see a list of client systems that already have Xsan software installed (see Figure 4-12). You can easily add these later, so click the Select None button to clear the check boxes for all the computers on the subnet. For now, check only the box for the MDC you're currently establishing, then click Continue.

Figure 4.12. Add Computers screen

Next, at the Authenticate SAN Computers screen, type the username and password for the MDC you're working on and click Continue. The system will briefly present an authenticating window and then will ask you to enter your serial numbers into the Serial Numbers screen, seen in Figure 4-13.

Figure 4.13. Enter Xsan 2 Serial Numbers into a global pool

As of Xsan 2, you no longer need to associate a serial number with a specific client. Rather, you build a pool of serial numbers which Xsan provisions to clients automatically as they join the SAN. (Each client needs a unique serial number to be able to mount the volume). When you purchase Xsan, Apple will often distribute serial numbers via e-mail. In this case, you can simply lasso them and drag them into the License area of this screen. Otherwise you can use the Add Serial Number screen to type each one in manually. In either case, Xsan 2 will then dynamically allocate licenses to clients when needed.

Hit Continue to go to the SAN metadata network screen. Per best-practice guidelines, the metadata network should be dedicated—in other words, solely for Xsan traffic with no other devices attached. To keep dropped packets and collisions to a minimum, it should have a good switch (no D-Link, LinkSys, or the like) and shouldn't be a VLAN from a bigger switch or have any managed switching services (such as link aggregation or spanning tree protocol) enabled.

Also per best practices, each client should have two connected network interfaces. One is for your standard network and must be able to provide directory services, Internet access, file server access, and so forth. The second, the metadata interface, should be dedicated to Xsan traffic and need not be routable or running any DHCP services. As a result, clients on this interface will need static IP addresses. In the SAN metadata network screen, choose the network your metadata will run on and click on the Continue button, as seen in Figure 4-14.

Figure 4.14. Xsan Choose Metadata Network

This brings you to the Summary screen. Review the settings carefully, and if they're correct, click on the Continue button. You'll then be brought to the Create Volume screen, where you can make one or more volumes. First, let's take a brief detour. For now, choose to bypass the process (we'll go through it momentarily), which will send you to the main Xsan Admin Screen. At this point, if you look in your config folder, you'll see these files in the /Library/FileSystems/Xsan/Config directory:

  • Config.plist: This XML file contains licenses, the SAN name, controller settings, and the like.

  • Fsnameservers: You'll find a listing of metadata controllers here.

  • Notifications.plist: This file holds XML data used for e-mail notifications.

Later, as you create volumes, Xsan will add other files (each volume will have its own CFG file and an accompanying FSM (Finite State Machine) process spawned by Xsan). Now let's create a Volume.

4.2.6. Creating a Volume

Once you've set up your SAN, you need to build a volume. This is the logical entity end users see, and you can configure it to mount for them automatically when they log into their Xsan clients. Once you understand the different components that make up a volume, creating one is straightforward.

To begin, open Xsan Admin and click on the Volumes section in your SAN Assets sidebar. At this point, you should have a blank listing of Volumes. In the bottom right-hand corner of the screen, click on the plus sign (+) to begin the volume-creation wizard, which will open to the SAN Setup screen, seen in Figure 4-15.

Figure 4.15. SAN Setup screen

In this screen, type the volume name and choose what type of data will reside on the volume. Note that you can't use spaces or special characters in the name, nor can you change it, so ensure that whatever you specify will last through the ages. Other than the name, the options you choose during volume creation will directly impact the performance of the SAN in a variety of ways. To see and adjust the settings that the wizard applied (based on your selection of data types), click on the Advanced Settings... button, which pops up the window shown in Figure 4-16.

Figure 4.16. Advanced SAN settings

The most important setting here is the block allocation size. Xsan uses the storage-pool stripe breadth and volume block-allocation size to decide how to write data to a volume. Writes typically impact performance more than reads, so it's important to match these in a manner that makes sense given the type of data the SAN will be storing. As of the time of this writing, Apple hasn't released a tuning guide for Xsan 2.x, but you can find the one for Xsan 1.x at http://images.apple.com/my/server/docs/20050901_Xsan_Tuning_Guide.pdf. Per the Apple Xsan Deployment and Tuning Guide:

In general, smaller file system block sizes are best in cases where there are many small, random reads and writes, as when a volume is used for home directories or general file sharing. In cases such as these, the default 4KB block size is best. If, however, the workflow supported by the volume consists mostly of sequential reads or writes, as is the case for audio or video streaming or capture, you can get better performance with a larger block size. Try a 64KB block size in such cases.

There are other options for this portion of the volume setup. Here is a bit of information about them:

  • Allocation Strategy: This determines how data is written to the affinity tags (a collection of storage pools—we'll cover them shortly). Round robin is the default strategy and the recommended one in most cases. It works by simply iterating through storage pools during writes. For instance, when writing file1, round robin will go to storage pool1; when writing file2, it will go to storage pool2. If you want to ensure that your data gets evenly distributed across pools, you can use another approach: balance allocation. This configuration uses the pool that has the most available capacity. Fill allocation, on the other hand, loads storage pools to capacity in sequence; it won't use storagepool2 until storagepool1 runs out of room. Both of these options, though, can degrade performance, as it's possible for a single LUN to bear a higher percentage of the load over time.

  • Spotlight: This option lets you enable and disable the OS X system search tool (Spotlight) for volumes. Given that it doesn't currently work effectively in Xsan 2.x, you should disable it.

  • Access Control Lists: You can enable and disable ACLs on the volume using this setting, but typically you should leave it enabled, as ACLs provide for extensible permissions management.

  • Windows ID Mapping: If Windows machines will participate in the SAN, leave this option at its default—enabled.

Allocation Settings: Always check with a technical project manager before customizing any of the following three settings in this section of the dialog box.

  • File Expansion Min: This value determines the minimum number of blocks written to a SAN file for the first expansion request. Increasing it speeds up writes for large files; decreasing it speeds them up for a large number of smaller files.

  • File Expansion Increment: To change the number of blocks used for each expansion request after the first, adjust this number.

  • File Expansion Max: The maximum number of blocks used for the file. Can help to reduce fragmentation on your volume.

Cache Settings: Always check with a technical project manager before customizing the two settings (listed below) in this section of the dialog box.

  • iNode Cache Size: In Unix, data structures called inodes hold information about files. A file, which always has an inode, is uniquely identified by the file system it's on and its inode number on that system. Xsan, which is itself just a file system, uses inodes to track the data that resides on it. This setting specifies the maximum number of these data structures that a metadata controller can cache for a volume.

  • Buffer Cache Size: Altering the size of the buffer cache changes the amount of memory the metadata controller can use for storing a volume's metadata. The cache is useful when you have a system with high latency—buffers act to mitigate latency issues.

When you've finished with the advanced settings, click OK to return to the listing of volumes. At this point, if the software detects any unlabeled LUNs, it will give you the option to label them—unless a LUN has a unique identifier, you can't integrate it into an Xsan volume. We recommend using a labeling practice that indicates the specific shelf the LUN is in as well as its controller. For example, PROM_03J_C1 would specify the Promise Jclass expansion unit and controller affinity 1. Having good labels helps to easily identify storage in the event that troubleshooting is needed.
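
If you'd rather inventory LUNs from the shell, the cvlabel tool installed with Xsan can list them; as a rough sketch, and assuming cvlabel behaves as documented for StorNext, the following shows the LUNs the host can see along with any existing labels:

sudo /Library/FileSystems/Xsan/bin/cvlabel -l

Writing labels with cvlabel involves generating and editing a label template (cvlabel -c), so for most installations the prompt in Xsan Admin is the safer route.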

Once you click Continue, the wizard will take you to the Configure Volume Affinities pane, where you can start setting up affinity tags (the collection of storage pools mentioned earlier). The tags, which allow Xsan to share bandwidth, are best used to group storage pools that have similar characteristics. For instance, you might want to create a tag called Video that contains LUNs optimized for sustained high throughput. Alternatively, you might have a Files tag containing LUNs tuned for efficiently handling random I/O rather than for delivering raw sustained throughput. (If they'll receive heavy use, it's often better to keep items like these in separate volumes).

When building an affinity tag, it's important that the LUNs which it contains are of similar capacity—as mentioned earlier, the storage pool will reduce the capacity of all its LUNs to that of the smallest. With Xsan 2, building tags is complicated a bit, as the GUI does configuration by Affinity Tags rather than storage pools. When adding LUNs to a tag, Xsan 2 automatically groups LUNs into storage pools. It determines the composition of the pools based on the usage pattern you defined when creating the volume. If you chose any of the media usage types, then an affinity tag will contain storage pools consisting of four LUNs each. If you selected the File Server option, pools will consist of two LUNs.

As such, when building an Xsan primarily for video, the best approach is to add LUNs in numbers divisible by four. To see why, suppose you add nine LUNs to an affinity tag. The system will create three storage pools, two composed of four LUNs. But the third, with just a single LUN, will have performance drastically inferior to that of the other two. As a result, you'll have intermittent bandwidth problems on your SAN. If you can't add LUNs in groups of four, at least try to avoid adding an odd number to a storage pool. If you think you'll be creating an unbalanced setup, please consult a technical project manager first.

Once configured, you can view the volume's composition by consulting the XML file located at /Library/FileSystems/Xsan/config/VolumeName-auxdata.plist (replacing VolumeName, of course, with the volume's actual name). The Xsan Admin software uses this file during volume expansion. Specifically the software employs the values specified at StoragePoolIdealLUNCount and StoragePoolStripeBreadth, which it applies to any newly added pools.
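
Because the file is plain XML, you can inspect those values without any special tooling; for example (substituting your volume's actual name):

grep -A1 StoragePoolStripeBreadth /Library/FileSystems/Xsan/config/VolumeName-auxdata.plist
grep -A1 StoragePoolIdealLUNCount /Library/FileSystems/Xsan/config/VolumeName-auxdata.plist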

The metadata affinity tag is a special beast and has a unique makeup. Optimally, the tag will sit on its own RAID controller to prevent congestion that could negatively affect performance across the entire volume.

The affinity tag options are:

  • Any data: Selecting this lets the Affinity Tag house metadata or user data.

  • Journaling and metadata only: This limits the tag to the volume's journal and metadata, the information that tracks where the pieces of user data reside on the SAN.

  • User data only: Choose this, and writing data to the volume will be the only use allowed for the tag.

  • Only data with affinity: This restricts the tag to data that has been explicitly assigned to it, for example via a folder with a matching affinity.

  • Stripe Breadth: Use this option to set the size of each data stripe (in blocks) written to one LUN in a storage pool before moving on to the next.

Setting the proper stripe breadth requires a bit of consideration, and is dependent upon the volume's make-up. Specifically, we need to know the volume's storage-pool block-allocation size, which we defined earlier.

This setting is applied specifically to the storage pools that make up the affinity tag. Its purpose is to properly tune the transfer sizes to best coincide with I/O characteristics of the host OS. OS X transfers data in 1MB chunks, referred to as the transfer size. Apple explains that the goal is to configure the stripe breadth such that each transaction with a storage pool is equal to your transfer size.

Thus, in the default Xsan configuration, a 16KB block size is used with a 4-LUN storage pool. To ensure a 1MB stripe across all 4 LUNs, Apple recommends a 16-block breadth, which writes 256KB to each LUN. If you increase the block size to 64KB (say, to suit data streaming), set the stripe breadth to 4 blocks, so that each LUN again receives a 256KB write. If you choose the File Server purpose when creating your Xsan volume, the standard storage pool size is 2 LUNs; as such, you'd use a default setting of 32 blocks.

Apple's Xsan 2 Admin guide recommends using the formula:

stripe breadth = (transfer size / number of LUNs) / block allocation size

where stripe breadth is expressed in blocks, and transfer size and block-allocation size in bytes. But this formula differs from our recommendations (and the recommendations Apple provides in its Xsan 1.4 tuning guide). In our findings, the best approach is to configure transfer settings such that the OS X transfer size directly correlates to each transaction with a LUN (rather than a storage pool).

Given that, we suggest setting this value so that the stripe breadth itself is equal to 1MB (rather than the aggregate across all LUNs). So in our previous example, where the File System block size is 16KB, we'd use a stripe breadth of 64 blocks, resulting in a 1MB transfer to each LUN before moving on to the next. In this case, we don't really care how many LUNs are in a storage pool. From Apple's Xsan Deployment and Tuning Guide:

The Mac OS X (or Mac OS X Server) operating system, which handles file data transfers for Xsan, performs 1 megabyte (MB) data transfers. As a result, Xsan gets maximum efficiency from the operating system when it transfers blocks of data that are a multiple of 1 MB.

At the other end of the transfer, the LUN also works well when it receives 1 MB of data at a time. So, when data is written to a storage pool, you get the best performance if 1 MB of data is written to each LUN in the storage pool. The amount of data Xsan writes to a LUN is determined by the product of two values you specify when you set up a volume:

  • The volume's block allocation size (in kilobytes)

  • The stripe breadth of the storage pools that make up the volume (in number of allocation blocks)

transfer size = block size × stripe breadth
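
To make the two approaches concrete, here is the arithmetic both ways for the 16KB block size and 4-LUN storage pool discussed above (1MB = 1,048,576 bytes, 16KB = 16,384 bytes):

Apple's formula: stripe breadth = (1,048,576 ÷ 4) ÷ 16,384 = 16 blocks, or 256KB per LUN (1MB across the pool)

Our approach: stripe breadth = 1,048,576 ÷ 16,384 = 64 blocks, or 1MB per LUN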

It's worth noting that fundamentally, not much changed between Xsan 1 and Xsan 2. Certainly, Apple made great improvements to its GUI admin tools, but the underlying functionality of LUNs, storage pools, and data allocation did not drastically change. As such, we feel that many of the tuning principles prescribed by Apple for Xsan 1.4 still fundamentally apply to Xsan 2.

The Apple tuning guide says little about metadata stripe breadths. The default breadth on a metadata storage pool is 256 blocks. But according to Quantum, this is too high. The company recommends a 16- or 64-block stripe breadth for metadata storage pools. Therefore, if you have a relatively small volume with a small number of files, consider using 16. If you have a larger environment with big files, think about trying 64 rather than the default of 256.

The calculation to make the stripe breadth times the block size equal 1MB is more important for your data storage pools than for your metadata pool. As with many things Xsan, tuning stripe breadths and block sizes to match the environment is where you'll get the biggest bang for your buck. This is just a starting place, though. You should plan on tearing down and rebuilding your volume a few times to maximize speed (after all, doing so usually takes less than five minutes).

After setting up volume affinities, you can configure your volume's metadata controllers. The wizard will show you each system that's set as a metadata controller. Generally two such controllers should suffice for any given volume or set of volumes. Once configured, metadata controllers run an instance of the FSM process. You can track performance, in terms of memory and processing required, as you do with any other service using tools such as Activity Monitor or the Unix top command.
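
For example, on an active metadata controller you can confirm that an FSM process is running for each hosted volume, and watch its CPU and memory use, with:

ps aux | grep "[f]sm"

The bracketed pattern simply keeps the grep process itself out of the results.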

To start setting up the metadata controllers, when you're done with the Configure Volume Affinities pane, click on Continue. In the Volume Failover Priority pane that appears, you can customize the priority assigned to a controller. Simply drag items higher or lower in the list to raise or lower priority. The top item, which has the highest priority, should always be the default MDC. Also, more than three MDCs can cause unneeded latency on the volume, so feel free to deactivate any that aren't needed.

When you click Continue, you'll see your new volume, as illustrated in Figure 4-17 below. The Affinity Tags and the percentage of each that is consumed with data will also appear along with the capacity of each (in the Size column), the available disk space (under the Available column), and the active MDC (in the Hosted By column). You can also see the structure of the volume in terms of affinity tags and LUNs.

Figure 4.17. Xsan Volume overview

Click on the volume name and then the gear, as shown in Figure 4-18. Here, you'll see a number of available options.

Figure 4.18. Volume options

Use Edit Notification Settings to configure Xsan to send SAN administrators e-mail updates (we recommend triggering alerts when the volume reaches 75 percent of capacity). Use Edit Failover Priority to add or remove MDCs or just change their priority. Use Force Failover to test failover between MDCs, and use Start and Stop to start and stop the volume.
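
Most of these operations also have command-line equivalents in cvadmin. As a hedged sketch (MyVolume is a hypothetical volume name, and the subcommands follow the StorNext cvadmin documentation, so verify them against your version's man page), the following lists the FSMs the controller knows about and then forces a failover of one volume:

sudo /Library/FileSystems/Xsan/bin/cvadmin -e "select"
sudo /Library/FileSystems/Xsan/bin/cvadmin -e "fail MyVolume"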

When you created the volume, Xsan automatically started and mounted it on the metadata controller. You've now completed the volume setup and can move on to adding computers that can see the volume.

4.2.6.1. Adding a Computer

Now that you've created your volume, it's time to add clients that can access it. This is one place where Xsan 2 is very different from Xsan 1.x. To add a computer that has access to your volume, click on Computers in your SAN Assets list and then on the plus (+) sign in the lower right-hand corner of the screen.

This invokes a list of computers that have the Xsan software installed and have been discovered via Bonjour (as seen previously in Figure 4-12). You can add clients running version 1.4.2 or 2.x to your SAN. You can also add clients using 10.5.0 through 10.5.2, but you'll receive an amber warning indicator for systems running software that precedes 10.5.3.

If the client you're trying to add doesn't appear in the list, you can click on the Remote Computer button and enter the machine's DNS name or IP address. If the DNS name won't resolve, you'll receive an error saying that the information is invalid.

If you use the Select All button on this screen to select all the clients, then when you click Continue you can enter authentication credentials for all clients at once or for each individually.

Authenticate into all of your clients and click on Continue. You've now added them all and can click on the Mount icon to mount the volume for each (if it hasn't mounted already). If you've been following along with this walk-through then you now have an Xsan with a mounted volume and clients.

NOTE

In order to add a new controller, promote an existing client to a controller, or demote a controller to a client, all clients on the SAN must be online and reachable via the Xsan Admin Application.

4.2.7. Resharing the Volume

To view the shared folders on a system, open Server Admin and click on the name of the system you want in the Servers list. Now click the File Sharing button in the Server Admin toolbar and you'll get a list of the logical volumes that your server can see along with a handy disk space image that shows how full the various volumes are. At this point, you can select Share Points to see which folders are currently being shared over SMB, AFP, NFS or FTP. If you click on Volumes and then the Browse button, you'll be able to configure new folders that you want others to access as share points. Browse to the folder you want shared, then click on the Share button in the upper right-hand corner below the tool bar.

With the Xsan volume selected, three tabs appear along the bottom of the screen: Share Point, Permissions, and Quotas. Click on Share Point to review and modify these settings:

  • Enable AutoMount: This gives you choices for setting up an Open Directory automount for the share.

  • Enable Spotlight Searching: Selecting this allows the volume to be searchable using the OS X system search tool.

  • Enable as Time Machine Backup Destination: Turn on this setting if you want to let client computers use the OS X utility for safely storing copies of their data.

  • Protocol Options: This brings up the screen that allows you to configure SMB, AFP, NFS, and FTP settings (it looks very much like the old screen in Workgroup Manager).

Once you've set the options for your share point, click over to the Permissions tab, where you can determine who has access to shared data. From this point, access to share points is controlled by file system permissions. You'll see ACLs listed above POSIX permissions, and when you drag a user or group into the window, a blue line will appear, indicating that the object will stay if you drop it on the window.

Finally, if you click on the Quotas tab and enable quotas, you'll find that you can't drag users and groups into the window. Using Server Admin, you can't configure users that don't have a home folder on the volume. You can, however, configure quotas at the command line.

4.2.7.1. Xsan Block Sizes

Xsan can act as the back-end storage to provide front-end network file-sharing services for a Mac OS X environment. This isn't to say that it'll work like a charm without some fine-tuning, though. In tuning any Xsan volume, one of your most important tools is the block size. As mentioned previously, the stripe breadth multiplied by the block size should come to about 1MB. You'll have to customize the stripe breadth on the storage pools whenever you change the block sizes for the volume.
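
For example (the numbers are purely illustrative), a volume using the 16KB block size shown later in this chapter's cvadmin output would call for a stripe breadth of 64 blocks, since 16KB × 64 = 1MB; doubling the block size to 32KB would mean halving the stripe breadth to 32.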

If you're using Xsan as a repository for data to be shared over clustered file storage, then it's important to maintain a small block size. How small? That depends on your data. If it's in large files, you may be able to stick with the default settings for clustered file storage in the volume setup wizard. If it's in small files, though, consider going even lower.

4.2.7.2. AFP Tuning

When an AFP client gets disconnected from a share point, it attempts to look for a token in order to reconnect automatically. If the server doesn't have the token, the client can't reconnect, because the server compares the presented token with its own cache (stored, by default, in /etc/AFP.conf). If you're using an Xsan, though, you'll want your servers to share a token location.

The token store is the reconnectKeyLocation key in the /Library/Preferences/com.apple.AppleFileServer.plist property list. You can use the defaults command to move the tokens to an Xsan volume. Follow the command with the appropriate option switch (in this case, write to put data into the property list), followed by the name of the property list and then the key that we'll be writing into. We want to write text into the key, with the text string being a path. If the volume name is bighonkinvolume, the command will be:

defaults write com.apple.AppleFileServer reconnectKeyLocation \
/Volumes/bighonkinvolume/AFP.conf
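
To confirm the key took, you can read it back with the same tool:

defaults read com.apple.AppleFileServer reconnectKeyLocation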

Additionally, you'll want to customize file-locking mechanisms in some cases. AFP locks files at the application layer by default. With Xsan, where multiple file-server heads are involved, it's best to use locks at the file-system level. Therefore, you can use the AFP settings for the daemon to prevent AFP from locking files itself. The command is:

serveradmin settings afp:lock_manager = no
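
If you want to confirm the setting and make sure the AFP service picks it up, a minimal sketch (assuming AFP is managed through serveradmin, as elsewhere in this chapter) is to list the current settings and then restart the service:

serveradmin settings afp | grep lock_manager
serveradmin stop afp
serveradmin start afp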

4.2.7.3. Tickle Times

In Windows, when you've connected to a share using, say, a mapped drive letter, that share shows as active. If at some point the client can't communicate with the SMB/CIFS server hosting the open session, the drive will appear offline to the Windows client. AFP does something similar, but the result, as perceived from the Finder, appears to be constant communication with the AFP server. In reality, the server verifies the connection every 30 seconds. With the AFP client, if no poll arrives from the server, the client will also attempt to reach out to the server to verify that the connection is available. This process is known as tickling.

AFP uses the tickle to verify that clients are still connected to a server. This communication causes a small amount of network congestion, but in some environments you want to keep even that to a minimum. This is similar to the concept of disabling protocols in a network stack that aren't being used. Such protocols aren't likely to cause issues on one machine, but when employed by thousands of hosts, they can effectively cause a denial of service on the server.

Provided that you use AFP, you're just not going to want to disable the tickle. By default, it happens every 30 seconds with no user intervention (other than connecting to a share point and having a session greater than 30 seconds). But while you might not want to disable it, you can reduce traffic by increasing the number of seconds between updates. For example, the following command will up the time between tickles to 60 seconds:

serveradmin settings afp:tickleTime = 60

In order to set the tickleTime value back to 30 seconds, you would simply issue the following command:

serveradmin settings afp:tickleTime = 30

Setting a tickle time isn't for everyone. In fact, you'll rarely need to take this kind of step. But if your Mac servers are causing a lot of collisions on the network, and packet analysis shows the traffic comes from DSI/AFP (Data Stream Interface/AFP) packets, that's a fine time to test out tickleTime as a solution. If it doesn't resolve your issue, you can always move it back to 30 seconds. Increasing the tickleTime variable can cause beach balls to spin for a fair amount of time when you lose a server connection, but it also slightly reduces traffic, a saving that adds up in larger environments.

Even to a seasoned Windows admin, the concept of a constant communication channel for file services may be foreign; it's just a reality of playing in the Apple sandbox.

4.2.8. Using Third-Party Clients

With the Quantum StorNext client, you can connect systems running other OSes to Xsan. StorNext also provides controllers that Mac Xsan clients can connect to. Not all versions of Xsan are compatible with all versions of StorNext; Apple provides a compatibility page at http://support.apple.com/kb/HT1517. When adding third-party clients to an Xsan, you should still follow the usual best practices for infrastructure, such as using a dedicated metadata connection with a static IP assignment and, ideally, forward and reverse DNS resolution.

4.2.8.1. Installing Linux Clients

Linux running Helios can make for a good alternative to running AFP or SMB shares off of an Xsan using Mac OS X Server. A Linux system can run Quantum StorNext software, mount a Mac OS X-based Xsan, and then share data out. Currently the product supports only Red Hat Enterprise Linux (versions 4 and 5) and SUSE Linux Enterprise Server (version 10).

Once you've bought and registered Quantum's software, make sure your StorNext client can ping the metadata controller over the metadata network by whatever IP address is used for metadata. With that done, you can get underway.

For starters, run the cvfsid command on both your metadata controller and your backup metadata controller, copy the text string each produces, go to http://Prodreg.quantum.treehousei.com/login.aspx, and complete the form there using the strings. Once you get the necessary information back from Quantum, add it to the /Library/FileSystems/Xsan/config/license.dat file on each metadata controller and reboot them. Now you're ready to set up your clients.

To do so you'll need the auth_secret and fsnameservers files from one of the metadata controllers (an Xsan's metadata controllers will have identical auth_secret files) and the StorNext rpm client installer.

From each client, first verify that you can ping the metadata controllers. Then, extract the rpm with the command:

tar xf sn_dsm_linuxRedHat40AS_x86_64_client.tar.gz

or use gunzip to decompress it first. Next, install the rpm that was extracted from the tarball (the exact filename will vary by StorNext version) by issuing a command along the lines of:

rpm -ivh sn_dsm_linuxRedHat40AS_x86_64_client.rpm

Now copy the auth_secret and fsnameservers files to the /usr/cvfs/config directory created by the installer, then use the cvlabel -l command to verify that you can see all of the LUNs that make up the volume. And for good measure, make sure you can ping each of the metadata controllers by IP address one more time. Finally, add cvfs to the list of file systems in the PRUNEFS field of the /etc/updatedb.conf file.
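
The PRUNEFS edit simply tells the locate database builder to skip cvfs volumes so the nightly updatedb run doesn't crawl the entire SAN. A minimal sketch of the resulting line (the other file systems listed in your /etc/updatedb.conf will differ) looks like this:

# /etc/updatedb.conf (excerpt): append cvfs to whatever is already listed
PRUNEFS="cvfs nfs nfs4 proc smbfs autofs iso9660 udf"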

With those tasks finished, you're ready to mount the volume. In the /mnt directory, create a folder with the same name as your volume (for this example we'll use myXsan). Next, open /etc/fstab and add this line:

Xsan        /mnt/myXsan       cvfs     verbose=yes 0 0

Now try to mount the volume using the command:

mount -t cvfs Xsan /mnt/myXsan

Provided it works, you can reboot and proceed with setting up Helios.

NOTE

All metadata controllers need to have that license .dat file. If they don't, your clients won't fail over properly. When you're finished with the integration, we recommend backing up the entire /Library/FileSystems/Xsan/config directory and running cvgather to make a tar file of your Xsan configuration.
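
As a sketch of that backup (the volume name myXsan is just this walk-through's example, and the destination path is arbitrary), you might run the following on each metadata controller:

sudo tar czf /var/root/xsan-config-backup.tar.gz /Library/FileSystems/Xsan/config
cd /Library/Filesystems/Xsan/bin && sudo ./cvgather -f myXsan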

4.2.8.2. Windows Clients

As with Linux, Windows can access Xsan via StorNext. This allows you to install and configure Microsoft UAM-based volumes, standard SMB/CIFS shares, and even ExtremeZ-IP using an Xsan volume—the same back-end storage can serve a number of different platforms. To install Xsan support on a Windows client you'll actually be installing StorNext, and you'll need a version of StorNext appropriate for your version of Windows.

The first step in setting up StorNext is to register the software with Quantum. To do so, go to http://prodreg.quantum.treehousei.com/Login.aspx. Now wait to get the registration information back in an e-mail, which usually takes about an hour but can take up to 24. While waiting, you need to get some other information before you can get your actual license: unique identifiers for each Xsan metadata controller. To get them, you must run cvfsid on each of the metadata controllers (but not on any other systems). Go to the primary MDC, cd into the /Library/FileSystems/Xsan/bin directory and then run the ./cvfsid command. The output you get will look something like:

C1A1B97A11 MAC 0 mdc.domain.com

Copy the text string, and repeat the process on your backup MDC for volumes accessible by StorNext clients, then go to www.quantum.com/swlicense. In the form you see, enter both the serial number for each host that will run StorNext as well as the output of the cvfsid commands from earlier.

Quantum usually responds to these requests in about the same time frame as with the initial request. Once you receive the e-mail response, open the license.dat file from /Library/FileSystems/Xsan/config and paste in the message content that is indicated as required, using pico or vi. After the files are updated, reboot the metadata controllers. Each will then create a file called .auth_secret in its /Library/FileSystems/Xsan/config directory. This file is hidden, so to access it through the Finder, you need to cp it to removable storage or copy it to another location that's not hidden.

Now you can install your Fibre Channel card into your Windows system. If you've patched the StorNext client into your network environment, you'll see a prompt to install the Promise drivers. If you're installing a VTrak from Apple on Microsoft Windows, you can download the Promise drivers from http://www.promise.com/support/download/download_eng.asp.

You can also use the Promise drivers (or generic ones) when the Promise is serving as a target and you're connecting to those LUNs (managed by Xsan) via StorNext. Although generic drivers work because StorNext is managing the LUNs, most Windows administrators won't want to use them (nor should they). To see the LUNs, check Windows Device Manager.

Now install the StorNext software on the client, following the defaults and rebooting when the installation completes. Then copy the .auth_secret file to the C:\Program Files\StorNext\config folder and reboot the StorNext computer. When you log back in, go to Start ➤ All Programs ➤ File System Services, a new entry added by the base StorNext installation.

The main application you'll use here is Client Configuration, which allows you to interact with an Xsan as a client. In many cases we'll remove most of the other applications from the Start menu so that users don't accidentally do more to the SAN than we'd like, which, regrettably, is an option otherwise.

From the Client Configuration application, click on the fsnameservers tab, and type the IP address of each metadata controller. The hostname in .auth_secret tells the StorNext client which MDCs it can talk to and which hosts can communicate with it, based on a pre-shared key. Now click on the Drive Mapping tab. This is where you set the way in which Microsoft Windows interprets the volumes seen by StorNext: the Windows client software becomes aware of which volumes reside on the SAN and can mount them as needed. The Drive Map option simply allows you to specify which Windows drive letter will be mapped to each volume. When you've finished the mapping, you should be able to browse as needed.

If you create a folder called debug in C:\Program Files\StorNext, then after you restart, the StorNext FSS (File System Service) will create a file called fssdebug.out in C:\Program Files\StorNext\debug, which contains very verbose logs from the perspective of the StorNext system. These can be useful, for example, in debugging connectivity issues with other StorNext systems, Xsan, or both.

StorNext for Windows includes many of the commands available with Xsan on Mac OS X. The default location for the commands is C:\Program Files\StorNext\bin. You can use the cv-based commands (explained further later in this chapter) in much the same way as on a Mac, which can help with troubleshooting.

For example, if you're having problems getting a volume to mount even though it shows up when you go to map the drive in Client Configuration, you can use cvlabel -l (assuming your working directory is the StorNext bin directory) to see the LUNs your host can access. If you can't see the LUNs, you also can't map a drive to them (you can map one in the Client Configuration utility, but you won't be able to see the volume in Windows Explorer or from a command prompt). Once you confirm that you can see the Xsan LUNs from StorNext and that you can communicate with the metadata controller, stop and start the FSS to see if the volume then appears in Windows Explorer.

If you're using StorNext systems as actual metadata controllers, you'll find a number of other commands you can leverage; again, in much the way you would with Xsan. For example, to start a volume, you can use the cvadmin command followed by start and then the name of the volume. For example, if your volume is bighonkinvolume you'd use:

cvadmin start bighonkinvolume

4.2.9. Xsan Management

If set up properly, Xsan is typically very stable and healthy when first installed. As with any system, though, over time various maintenance and troubleshooting tasks will need attending to. Volumes fill, frames start dropping, response times slow, files corrupt, and many more problems occur with increasing frequency. The purpose of this section is to help you figure out what's going on and how you can nurse your Xsan back to health.

4.2.9.1. Reinstalling the Software

A number of client-configuration issues seem to call for uninstalling and reinstalling the Xsan software. But do you actually have to uninstall the software, reinstall it, run Software Update, reboot, re-add the client to Xsan Admin on the MDC, and then attempt to mount when, for example, a single client isn't mounting a volume? No. That's a lot of work when one step will reset a client back to the way it was before it ever joined its first Xsan. Just delete the contents of the /Library/Filesystems/Xsan/config folder (but not the folder itself) and then reboot.

On reboot, the Xsan process is waiting to be controlled by a metadata controller, so use Xsan Admin to add the client—it should receive the same serial number it had before and mount the SAN volumes automatically. This will fix a few different client-specific issues. Don't, however, try this with a metadata controller unless you know what you're doing!

4.2.10. Upgrades to your Xsan

Once your Xsan is installed and working perfectly, chances are you won't want to do anything to it. But eventually you'll have to perform software updates, volume expansion, the occasional (and regrettable) change of IP addresses, and other maintenance. Also, make sure the metadata controllers are running the most recent version of the operating system in use in your environment.

4.2.10.1. Operating System Upgrades

All MDCs should run the same version of Mac OS X. When upgrading one to a new version, you should upgrade all the others to the same version. During upgrades, do not make any modifications to the Xsan using Xsan Admin or the command-line utilities; upgrade all MDCs before you make any configuration changes. And be aware that if you perform a clean installation of Mac OS X rather than an upgrade, all of the volume configuration data and SAN contents could be lost.

If you can take Xsan volumes down during the upgrades, you should—you'll greatly reduce the possibility of problems. But if the volumes must stay up, it is possible to upgrade while they're accessible to client systems. When doing this, run the upgrades on the metadata controller and restart, then run the upgrades on the backup metadata controller and restart. If you have no backup metadata controller, promote a client system to become one, fail the volume over to it, and then upgrade the metadata controller. By following this procedure, you prevent data loss due to a single point of failure during an upgrade.

If you choose to upgrade the Xsan without interrupting the availability of the volumes and you have two metadata controllers, it's also a good idea to temporarily promote a client system into the role of backup metadata controller to mitigate the risk of a single point of failure during the upgrade of each metadata controller.

4.2.11. Upgrading the Volume

The process of adding storage to an Xsan volume, known as volume expansion, not only provides the benefit of increased capacity, it can also increase bandwidth. On an Xsan 2.0 volume you can perform two types of volume expansion: storage expansion and bandwidth expansion.

With the latter, you add LUNs to an existing storage pool. This is relatively intrusive, though, and you can only do it on volumes built with the Custom data type. You may have to do this when a storage pool isn't configured in a manner consistent with others on the volume, and performance is paramount.

This type of expansion will result in an imbalance of information across the storage pool's LUNs, and you'll have to defragment to avoid severe performance degradation. As such, you should always defrag shortly after performing the expansion. Remember, though, that because you're modifying existing datastores that the volume uses, bandwidth expansion is inherently risky, and you should avoid it if possible.

With storage expansion, you add new LUNs to an existing affinity tag, which in turn creates a new storage-pool member. Because you're simply introducing elements, this type of expansion isn't as intrusive as a bandwidth expansion, and is less prone to problems.

When performing volume expansion, generally you want to add storage in increments equivalent to existing storage pools (typically four LUNs per pool). Xsan 2 will do its best to determine existing pool utilization based on values set in the volume's auxdata.plist file, and you can carry out more granular edits of these as needed.

Prior to running the expansion, we recommend you follow a few procedures. First, ensure that all new LUNs consist of RAID sets that are consistent with the designated affinity tag's current LUNs.

You can determine your volume's ideal LUN count by consulting its respective auxdata.plist file, found at /Library/Filesystems.


Next, verify recent backups of the volume by performing a test restore. Now stop the volume, and perform a repair on it using the command:

cvfsck -wv VolumeName

When that completes, back up the metadata by issuing the command:

snmetadump -d VolumeName

Also, perform a cvgather on the volume (which, among other procedures, backs up the volume's configuration files) by entering this at the command line:

cvgather -f volumeName

After these steps, you're ready to perform the expansion, which you can do using Xsan Admin by dragging your new, labeled LUN into the desired affinity tag. As mentioned, it's best to add LUNs in numbers compatible with existing settings. So if the volume uses 4 LUN storage pools, the number of LUNs you add during expansion should be divisible by 4. If you have a custom volume, you'll need to manually create the storage pools and assign 4 LUNs to each. After dragging all desired LUNs into the proper affinity tag or storage pools, click Save. The expansion will proceed, generally taking 20 to 30 minutes.

Following the actual expansion, it's a good idea to carry out additional maintenance on the volume to ensure proper health. In particular, to ensure that data is properly re-striped across the pool, an snfsdefrag is an absolute requirement. Running the command

snfsdefrag -dr /Volumes/VolumeName

will rebalance the data.

Even after performing a volume expansion, it may be desirable to rebalance the data onto the newly added storage. By spreading it across the storage more evenly, you not only help prevent slowdowns, you actually get a net gain in speed. That might not be the case if you don't rebalance. In Xsan 1.4, this was usually a straightforward task, because each storage pool had a different affinity name, so you could balance data using, for example:

snfsdefrag -r -k newstoragepoolaffinitykey -m 1 /Volumes/VolumeName

This command would relocate any files with more than one extent to the new pool's affinity key. Unfortunately, this technique gives you no clean way to completely balance out data, because it relocates all fragmented data, a process that could easily exceed the capacity of the pool. So when using this method, it's important to monitor the process and ensure that pools don't over-balance, so to speak.

There's a better alternative: Change the volume's allocation strategy to Balance and then defragment the volume (the options are Fill, Balance, and Round). This relocates fragmented files to the lowest-capacity pool, an extremely effective method for balancing data. In Xsan 2.x, you can change a volume's allocation strategy at the GUI level, which results in a quick restart of the volume.

In our experience, the quick restart does not result in Xsan client service interruption to the volume, and active transfers proceed with no disruption. Even so, it's best to perform the switch at a time when there's minimal activity (preferably none) on the volume and no active transfers in progress.

Xsan 1.4 doesn't officially support changing allocation strategies on a volume. To do so, you must completely stop the volume, then change its strategy to Balance in its configuration file before restarting. Once you've converted the volume to the new strategy, you can proceed with the optimization, which is a fairly straightforward defrag performed with the command

snfsdefrag -r /Volumes/VolumeName

This will defragment any files with more than one extent, re-provisioning the optimized files to the next LUN in the allocation strategy. And because we're now using the Balance strategy, the next LUN will always be the one with the lowest capacity—our new LUNs, in this case. If, however, you have a healthy Xsan volume, this command may not properly balance data, because fragmented files will be rare. In such an event, run the command

snfsdefrag -r -m 0 /Volumes/VolumeName

This will defragment files with more than 0 extents, which is every file on the system, letting you rest assured that the volume will be nicely balanced at the end of the operation. Given that using the -m 0 flag with snfsdefrag can avert improper balancing, you may want to use it from the get-go, rather than excluding it.

The main trade-off here is that doing so reprovisions all files on the volume, which can be a very time-consuming task. If the volume has standard levels of fragmentation, running the command without the flag should do a decent job of balancing without having to operate against non-fragmented files as well. The second problem caused by the -m 0 switch is that the operation will flag the files it touches for backup, so any following incremental backup will essentially be a full backup. Save yourself the trouble and set your backup system to perform a new full backup after the migration.

4.2.11.1. Changing IP Addresses

Because Xsan retrieves the locations of files and the status of information on the SAN using the metadata network, it's important to keep this network as free from interference as possible. File-sharing, backup operations, and other bandwidth-intensive tasks should occur on your organization's standard network.

Using DHCP servers on an Xsan metadata network is not a good idea, because it can make clients fail to respond to the administrative commands sent from the Xsan Admin utility. In general, DHCP is inappropriate for Xsan metadata controllers and Xsan clients. Certain environments sometimes require DHCP-supplied static IP addresses, though. Provided those addresses don't change, DHCP is acceptable in an Xsan installation—but on the production network only, not on the metadata network.

If you ever need to change the metadata controller's metadata IP address, the best option is to first demote the metadata controller to a standard Xsan client, then remove it from the SAN. After you change the IP address of the former metadata controller, re-add it to the SAN, specifying it as a metadata controller.

4.2.12. Common Xsan Repair and Troubleshooting Procedures

Proactive maintenance is essential, but despite your best efforts, problems will crop up. Certain types occur more frequently than others, though, so you'll often find yourself repeating the same repair procedures. Here are some of the more common ones.

4.2.12.1. Resetting Xsan Client settings

As mentioned earlier, when remedying Xsan issues, you should rarely have to uninstall and reinstall the software. Often, simply returning it to its default settings will do the trick. All you need do is delete the contents of the /Library/Filesystems/Xsan/config folder (but not the folder itself), and then reboot. If this doesn't resolve a client-specific issue, read on for additional measures to try.
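
As a minimal sketch of that reset (again, on a client only, never on a metadata controller):

sudo rm -rf /Library/Filesystems/Xsan/config/*
# the glob above skips hidden files; remove any (such as .auth_secret) separately if present
sudo reboot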

4.2.12.2. Rebuilding an Array on an Xserve RAID

Sometimes a drive fails, or a controller goes down in a RAID setup with redundant drives (RAID 5 or RAID 3, for example), and you have to rebuild parity. You should do so as quickly as possible, but the rebuild itself can result in data loss, and if a second drive in the array fails, you could lose most of the data. Although the failure has caused parity problems, the data itself may still be intact, so you should back it up first, and as soon as you can. With that precaution in place, you can carry out the parity rebuild.

Start the process by opening RAID Admin from /Applications/Server, then selecting the RAID containing the damaged array, and clicking on the Advanced button in the toolbar. Enter the management password for the Xserve RAID device in question, click on the button to Verify or Rebuild Parity, then on Continue, selecting the array. Select Rebuild Array and the process will start. In a few hours, when it completes, perform a Verify Array. Finally, verify the data on the volumes.

If the rebuild doesn't go well and you lose the array, you'll likely need to delete and re-add it. In many cases, this will cause you to lose the data stored on that array and, therefore, on the volume—one of the many good reasons to have a backup.

4.2.12.3. Rebuilding an Array on a Promise RAID

Promise RAID setups, like those of the Xserve RAID (or any other vendor), will eventually suffer a drive failure. But the Promise products contain a few features that differentiate them from the Xserve RAID's. One feature, Media Patrol, is a failure-detection routine that watches for bad blocks. A feature called Predictive Data Migration (PDM) will preemptively rebuild a RAID set on a global hot-spare drive. If a drive does fail, this significantly accelerates the parity rebuild.

However, under certain circumstances, these features can be detrimental to performance. For instance, PDM will kick in if it detects even a minor drive malfunction. In certain instances, this results in a data parity rebuild but never activates the global hot spare as a replacement. And with the default notification settings, none of this activity will trigger an e-mail notification. During the rebuild, performance is degraded.

If you experience poor performance with an Xsan that uses Promise hardware, you can look for clues to the source in a few places. First and foremost, make sure to check the event logs on all your Promise RAID sets. If you see a lot of Media Patrol or PDM events, you likely have a failing drive. PDM, as noted earlier, attempts to intelligently detect drive failures and will begin to build a hot-spare drive into the array. But while this means that when the drive ultimately fails, the overall rebuild process will take a very short time, in the meantime the process can seriously degrade performance.

If you continue to experience performance problems on Promise equipment, consider disabling Forced Read Ahead on your Promise controllers. Apple's publicly available configuration scripts turn this option on by default, but it's truly needed only in high-throughput environments, such as those that process uncompressed HD. In the majority of scenarios, you can greatly increase Xsan read performance by disabling Forced Read Ahead.

4.2.12.4. Latency

In Xsan, the PIO HiPriWr value in logs (specifically the sysavg value) shows you how latent the connection to your metadata LUNs is. HiPriWr values are written on an hourly basis to a volume's cvlog file found at /Library/Filesystems/Xsan/data/volume/log/cvlog. Alternatively, you can summon these values as needed by using the tool cvadmin:

cvadmin
>select MyVolume
(MyVolume)>debug 0x01000000

For a metadata LUN on an Xserve RAID set, the average latency, shown by sysavg, is usually 500ms or less. Promise RAID's active/active controllers introduce additional latency and will typically produce values between 800ms and 2,000ms.

If the physical Fibre Channel connection to a system's LUNs is too slow (or latent), it can cause instability and, worse, volume-integrity issues; such latency usually points to problems in the fabric itself. To address the issue, look into statically assigning FC port configurations on targets and initiators. Specifically, ensure that connections are of type N_Port, often referred to as PTP (point-to-point). On its boxes, Promise support recommends always statically configuring Fibre interfaces to N_Port 4GB static settings to help reduce latency. After ensuring you have static settings, assign an ALPA ID of 255 to prevent Fibre Channel LIPs from being sent.

In situations where latency is excessive, you can deal with it programmatically by increasing the buffer cache size. This allows Xsan to cache more data, helping mitigate the effect of latent LUNs on the overall performance, health, and viability of the SAN. Additionally, if you have latency on your metadata LUNs, you should increase the inode cache, allowing Xsan to write inodes more effectively. You define these settings in the volume setup wizard, but you can update them later in your volume's volumename.cfg file in /Library/Filesystems/Xsan/config.
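
As a sketch of what those entries can look like, here's an excerpt from a hypothetical MyVolume.cfg; the parameter names follow the StorNext-style configuration format that Xsan volumes use, but verify them against your own file and treat the values as purely illustrative:

# MyVolume.cfg (excerpt): larger caches help absorb metadata-LUN latency
BufferCacheSize       64M
InodeCacheSize        32K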

4.2.12.5. Schedules

3:15 a.m. Most of us may be asleep, but plenty of people are hard at work and need data access. Unfortunately, those attempting to get it from an Xsan may end up a little frustrated—Mac OS X system software runs its weekly or daily scripts at 3:15 a.m. To reduce user irritation, you can disable the periodic scripts by editing their launchd calls, which you'll find in the following files:

  • /System/Library/LaunchDaemons/com.apple.periodic-daily.plist

  • /System/Library/LaunchDaemons/com.apple.periodic-monthly.plist

  • /System/Library/LaunchDaemons/com.apple.periodic-weekly.plist

If you disable these scripts, though, you should still let them run every once in a while. Chances are that, with a little planning, you'll be able to run the process at regular intervals.
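
Rather than editing each plist by hand, one approach (a sketch using the standard launchctl tool) is to unload the job during critical periods:

sudo launchctl unload -w /System/Library/LaunchDaemons/com.apple.periodic-daily.plist

and load it again when you're ready to let the maintenance scripts run:

sudo launchctl load -w /System/Library/LaunchDaemons/com.apple.periodic-daily.plist

Repeat for the weekly and monthly plists as needed.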

4.2.13. Fragmentation

You'll find the snfsdefrag tool, which is part of Xsan, in the /Library/Filesystems/Xsan/bin directory. You can use the utility to look up fragmentation statistics as well as to perform defragmentation operations. If you're using Xsan as back-end storage, you may need to perform defragmentation operations routinely.

NOTE

When you're defragmenting a volume we recommend that you always use the -v switch to enable verbose mode.

The snfsdefrag utility can defrag individual files or recurse through directories or volumes. Before you initiate the actual operation, though, you should run it with the -c switch to perform an extent count, so you can see how many extents each fragmented file has. To do so on bighonkinvolume, type

snfsdefrag -c -r /Volumes/bighonkinvolume

The -r option causes the utility to recursively search through the volume. Additionally, you can specify a single directory (likely one deeper in the hierarchy). You can also select files based on their number of extents by using the -m option followed by a threshold count; only files with more than that many extents are considered. For instance, to output a summary of all fragmented files with more than 2 extents, you'd use

snfsdefrag -c -r -m 2 /Volumes/bighonkinvolume

There's also a -p option, which you can use to free up blocks that were allocated (according to the way you configured the File Expansion Min value during volume setup) but not used.

The -k option is one of the most useful for environments in the midst of migrating. You can use it to specify an affinity to which you'll move a file following the defragmentation process. That lets you move data between affinities and allows for the safe (or as safe as possible) removal of storage pools during migrations.

4.2.14. Backup

An Xsan has a special file system: it's case sensitive, accepts characters that some backup tools don't recognize, and can hold data sets of over 100TB. All of these factors make for a fairly complicated backup paradigm. You can't use just any application, but a number of third-party tools on the market have been developed to do the job. Here are some, along with the URLs for their web sites:

Archiware's PresSTORE: www.archiware.com

Atempo Time Navigator: www.atempo.com

BakBone's NetVault: Backup: www.bakbone.com

TOLIS Group's BRU Server Backup & Restore Software: www.tolisgroup.com

Maintaining regular backups of an Xsan volume is an absolute must. A cluster file system performs a delicate dance with many members, and things can go badly in a variety of scenarios; the file system itself is completely reliant on the health of the back-end as a whole. If you run your business on an Xsan, not having protection is a huge mistake.

4.2.15. The Xsan Command Line

A number of command-line utilities let you perform Xsan management. You'll find these in the /Library/Filesystems/Xsan/bin folder. We recommend adding this folder to the search path of your preferred shell so you can use the commands without having to type the full path or change into the Xsan bin directory every time.
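
A minimal sketch of that change, assuming bash (adapt the file and syntax to whatever shell you prefer):

echo 'export PATH="$PATH:/Library/Filesystems/Xsan/bin"' >> ~/.bash_profile
source ~/.bash_profile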

NOTE

Whether or not you make substantial use of the Xsan command line, having a fundamental understanding of it will increase the depth of your Xsan knowledge. If you'll be putting an Xsan into production, we highly recommend that you read this section.

4.2.15.1. Fibreconfig

Although not a part of Xsan, you'll use fibreconfig often because it mirrors the functionality of the Fibre Channel System Preference pane, but it's faster, very verbose, and has more options for configuring Apple-branded FC cards. To get started, use the -l option to query fibreconfig for all information about your FC environment by typing fibreconfig -l, which will produce output like this:

Controllers

  PortWWN 10:00:00:05:1C:B2:90:1A
    Port Status: Link Established
       Speed: Automatic (2 Gigabit)
     Topology: Automatic (N_Port)
       Slot: Slot-2
       Port: 1

  PortWWN 10:00:00:05:1C:B2:90:1B
    Port Status: Link Established
       Speed: Automatic (2 Gigabit)
     Topology: Automatic (N_Port)
       Slot: Slot-2
       Port: 0

Targets

  NodeWWN 20:05:00:B0:A1:19:9B:14
      Status: Connected
       LUNs: 0, 1, 2, 3

  NodeWWN 20:05:00:B0:A1:20:2A:1A
      Status: Connected
       LUNs: 0, 1, 2, 3

  NodeWWN 20:05:00:B0:A1:13:9B:14
      Status: Connected
       LUNs: 0, 1, 2, 3

  NodeWWN 13:05:00:B0:A1:19:2A:1A
      Status: Connected
       LUNs: 0, 1, 2, 3

Notice that the PortWWN of the controller is listed as well as an indication as to whether the port is connected.

Immediately below that, you'll see the card's speed and topology—the only two controller settings you can customize. When you alter them, you need to use the -c option followed by the controller's PortWWN to identify the port on which you're making the change. This means that to change both of the controller's ports, you have to run the command twice.

Available topologies for the card include nport, nlport, and auto (the default). Occasionally you'll have an issue that requires you to set the topology manually. You can automate the process for a number of hosts by sending them the fibreconfig command using the -t option followed by the topology to set. For most Xsan environments, you'll want to use N Port. To customize the topology you can use the following two commands (one per controller) as part of a script (or more likely, convert the address to a variable and use the variable instead):

fibreconfig -c 10:00:00:05:1C:B2:90:1A -t nport
fibreconfig -c 10:00:00:05:1C:B2:90:1B -t nport

You can statically assign speed from the command line as well. To do so, use the -s option followed by one of four speed choices: 1 Gb, 2 Gb, 4 Gb, or auto. To customize the speed you can use the following two commands (one per Controller, and again, using a variable in the place of the address if you're doing so programmatically):

fibreconfig -c 10:00:00:05:1C:B2:90:1A -s 4gigabit
fibreconfig -c 10:00:00:05:1C:B2:90:1B -s 4gigabit

The other setting you can customize from the command line is the Arbitrated Loop Physical Address (the AL_PA). Using this setting with an Xsan can cause some serious issues long term; but if you must set the AL_PA with fibreconfig, use the -a option followed by an address.

NOTE

To implement changes you make to any of these settings, you must reboot.

The fibreconfig command is very useful for automating reporting with Xsan, especially when used en masse through Apple Remote Desktop (ARD). You can use it to display which targets are available to metadata controllers and clients by focusing on the NodeWWN information. This can be incredibly useful in triangulating zoning and RAID-controller issues quickly and effectively. For example, you can obtain a listing from fibreconfig but constrain the output to NodeWWN items with grep as follows:

fibreconfig -l | grep NodeWWN

You can also obtain the unique address information from all of your clients concurrently without touching each system, again using a combination of fibreconfig and ARD. This can be a very useful way to get a list of addresses by node name so that you can label your FC switch ports, allow access if you're LUN-masking on a Promise device, or just document settings. To grab the PortWWN, simply send the following command through ARD:

fibreconfig -l | grep PortWWN

Overall, there aren't a lot of settings available with the fibreconfig command. Of those settings, most that are useful in an Xsan environment are also available from the GUI. But when managing many Xsan clients, fibreconfig can help speed up the process of narrowing down issues, reporting, setting up RAIDs, and FC switch configuration.

4.2.15.2. Labeling LUNs

You can label LUNs using the cvlabel command rather than doing so within Xsan Admin. If you want to list all your available LUNs first, simply type cvlabel -l. The command cvlabel -c > labels will dump your label information out to a standard text file called labels.

Next, open the file in your favorite text editor, and change the very first text field to the name that you want for your LUN within Xsan Admin. Edit any other lines you'd like labeled, and save the file. Now run the command cvlabel labels, which will read the file you just edited and label the LUNs for use with ACFS using the names you provided, making them appear in Xsan Admin.
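
Putting that workflow together, a minimal sketch (assuming the Xsan bin directory is in your path, per the earlier recommendation) looks like this:

sudo cvlabel -l            # list the LUNs this host can see
sudo cvlabel -c > labels   # dump the current label information to a file named labels
# edit the first field of the relevant lines in "labels" to the names you want
sudo cvlabel labels        # apply the edited labels so the LUNs appear in Xsan Admin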

Xsan Admin (2.x) will show you only the LUNs from the Fibre Channel controller, but you can use cvlabel to label LUNs on local hard drives and even removable media. Though you should use this for testing only, it does give you the ability to test Xsan commands that you otherwise might not be able to run in a lab environment.


4.2.15.3. cvadmin

The cvadmin command allows an administrator to view and change volume and storage-pool settings. Options include -H, which specifies a host to run against (if you don't indicate a host, the command attempts to run on the localhost), and -F, which sets a volume name to run against. There are also the -f and -e options, which load commands from a file and from stdin, respectively. Or you can run cvadmin interactively by simply typing sudo cvadmin at the command prompt, which will provide output similar to the following:

Enter command(s)
For command help, enter "help" or "?".

In the following example, we have one volume and two metadata controllers. When we first invoke cvadmin, it displays all of the valid file-system services (which in this context means volumes per metadata controller) and selects our only volume. Notice that, in the output shown below, MyVolume has two entries. This is completely normal, because you should see one entry per volume per MDC. In this case, we have one volume and two metadata controllers, so we have two entries. The asterisk denotes the active FSS (or active metadata controller), 192.168.56.5.

List FSS

File System Services (* indicates service is in control of FS):
1>*MyVolume[0] located on 192.168.56.5:51520 (pid 512)
2> MyVolume[1] located on 192.168.56.6:51520 (pid 509)

To perform any worthwhile tasks using these tools, you need to select a volume. In this particular instance, there's only one volume, so cvadmin selected the active one and displayed the statistics for it. But when there are multiple volumes, you must select one before you use cvadmin. For example:

Select FSM "MyVolume"

Created :       Tue Jan 13 15:33:57 2009
Active Connections:     1
Fs Block Size : 16K
Msg Buffer Size :       4K

Disk Devices :  2
Stripe Groups : 2
Fs Blocks :     61277616 (935.02 GB)
Fs Blocks Free :        61006893 (930.89 GB) (99%)

So there's no confusion as to which volume you're administering, the cvadmin prompt always displays the active volume—MyVolume, in our case, as you can see here:

Xsanadmin (MyVolume) >

To get a full list of available commands, you can look in the cvadmin man page or type help in an interactive cvadmin session. Below we've listed the most frequently used commands and what they do.

>fail VolumeName: Entering fail and the volume name starts failover of the volume by initiating an FSS vote among your metadata controllers. The MDC that provides services for this volume and has the highest failover priority should win the election. If no failover is available, the volume will fail back to its original host.

>fsmlist: This command outputs a list of FSM processes on the machine that's selected, which is useful when determining which volumes the machine is capable of hosting as a metadata controller.

>repof: If you need an open file report, repof will generate one, saving it to /Library/Filesystems/Xsan/data/MyVolume/open_file_report.txt. The output contains a slew of information, but the actual file name is noticeably absent. Argh! You do get an inode number for the file in question though, so you can use a command such as find /Volumes/MyVolume -inum X to determine the actual file from the published inode number. The repof command can be very useful when attempting to determine why a client will not unmount a volume.

>start: The start command is equivalent to starting the volume in Xsan. However, by specifying a hostname/ip, you can start file system services on just that particular MDC, which can be handy for maintenance purposes.

>stats: Issuing this command produces volume statistics.

>stop: The stop command is equivalent to stopping the volume in Xsan. But by supplying a hostname/ip, you can stop file system services on just that particular MDC, which can be handy for maintenance purposes.

>who: You can list all metadata controller, client, and administration sessions open against this volume using who. Nodes with the volume mounted will be indicated with a [CLI] entry.

If you need help with more-complex troubleshooting, you can try these commands:

>activate VolumeName xxx.xxx.xxx.xxx: This command activates FSS services for the named volume on the MDC whose hostname or IP you supply in place of xxx.xxx.xxx.xxx. Alternatively, you can leave off the IP to activate the volume on the local server (if applicable). You can also run activate on an MDC if it's not showing appropriate FSS services available. If you see errors to the effect that an MDC is on standby, activating the volume on the respective server will often address this issue.

>debug 0x01000000: Entering this debug command will immediately generate I/O latency numbers and save them to /Library/Filesystems/Xsan/data/MyVolume/log/cvlog (a process that normally occurs only hourly). The key figure in the output is the sysavg number for PIO HiPriWr SUMMARY. If your metadata is hosted on an Xserve RAID volume, this number should be below 500ms. If you're using storage systems from Promise, the active/active controller setup introduces additional latency, so the numbers should be in the 800 to 1,000ms range.

>latency-test: Run latency tests between the FSM and clients. It can be used to isolate problematic clients on the SAN.

>paths: Output a list of LUNs visible to the node and the corresponding HBA port used to communicate with that LUN. This option can be helpful when you are getting those pesky "stripe group not found" errors.

>show: This will output information about the stripe groups/storage pools used by this volume. It is useful for cross-referencing index numbers output in system.log to human-readable storage pool names. It also provides various statistics and configuration details, such as stripe group role, corresponding LUNs, affinity tags, multipath method, and other useful bits of information.

Overall, the cvadmin tool is very useful when troubleshooting metadata controller behavior. But you don't use it when you want to perform Xsan setup or client-management operations. To label LUNs, use cvlabel. To mount and unmount volumes, you'd likely use the new xsanctl tool or mount -t acfs. To perform defrag operations and volume maintenance, use the snfsdefrag and cvfsck tools, respectively. And while you can add serial numbers and create volumes from the command line, you'll probably find it much easier to continue performing these operations through the Xsan Admin GUI tool.

4.2.15.4. Repairing Volumes

When checking for volume-integrity and repair issues on Xsan volumes, don't use the standard fsck command; use its replacement, cvfsck. And if you're going to repair a volume, check the Apple Knowledge Base article at http://support.apple.com/kb/HT1081.
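
As a sketch of the usual sequence (the volume name is illustrative, and the volume should be stopped with a current backup on hand, as described in the expansion section earlier): running cvfsck without the -w flag performs a read-only check, and adding -w writes the repairs:

sudo /Library/Filesystems/Xsan/bin/cvfsck -v MyVolume
sudo /Library/Filesystems/Xsan/bin/cvfsck -wv MyVolume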

4.2.15.5. Other Commands

You can also leverage other Xsan commands for use in Xsan management. Here are some, along with descriptions of what they do:

cvcp: To copy files or directories in and out of an acfs volume (one managed by Xsan), use cvcp. It has Xsan-specific options and runs faster than the standard cp command. During the initial migration of data into the Xsan, we recommend using this command rather than copying with the Finder.

cvmkdir: This command lets you create a new directory on an Xsan volume with an affinity.

cvmkfile: You can use cvmkfile to make a file on an Xsan volume, a procedure that's useful for testing speed.

cvmkfs: If you need to create a new Xsan volume, cvmkfs will do it.

cvupdatefs: You use cvupdatefs during some upgrades of Xsan software (for example, from 1.x to 2.x).

fsm and fsmpm: These are the Xsan processes that you invoke from launchd instead of running manually.

Additionally, you'll find a number of useful files. This list describes some of the types and where to look for them.

Logs for volumes: /Library/Filesystems/Xsan/data/volume name/log/cvlog

Configuration files for volumes: /Library/FileSystems/Xsan/config/VOLUME.cfg

Configuration file for the volume auto-start list: /Library/Filesystems/Xsan/config/fsmlist

Configuration file for the controller list: /Library/FileSystems/Xsan/config/fsnameservers

Default volume configuration file: located in /Library/Filesystems/Xsan/config, each volume has a corresponding .cfg file.
