© Damian Wojsław 2017

Damian Wojsław, Introducing ZFS on Linux, https://doi.org/10.1007/978-1-4842-3306-1_2

2. Hardware

Damian Wojsław

(1)ul. Duńska 27i/8, Szczecin, 71-795 Zachodniopomorskie, Poland

Before you buy hardware for your storage, there are a few things to consider. How much disk space will you need? How many client connections (sessions) will your storage serve? Which protocol will you use? What kind of data do you plan to serve?

Don’t Rush

The first piece of advice, one you should always keep in mind: don't rush it. You are about to invest your money and time. While you can later modify the storage according to your needs, some changes will require that you recreate the ZFS pool, which means all data on it will be lost. If you buy the wrong disks (e.g., ones that are too small), you will need to add more and may run out of free slots or power.

Considerations

There are a few questions you should ask yourself before starting to scope the storage. The answers you give here will play a key role in the later deployment.

How Much Data?

The amount of data you expect to store will determine the number and size of disks you need to buy. It will also affect other factors, such as server size. To scope your space needs, assess how much data you currently have and how quickly it grows. Consider how long you are going to run the storage you are building. It may be that you plan to replace it completely in three years and thus don't have to be very careful. It may be that you don't know when new storage will be implemented and thus need to add some margin. Look at your organization's growth plans. Are you going to double the number of office personnel within three years? Are they all going to produce data? That would mean that three years from now, data will grow at least three times as fast as it does today.
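To make that concrete, a rough back-of-the-envelope calculation helps. The numbers below are hypothetical placeholders; substitute your own measurements of current data and observed yearly growth.

  current_tb=6          # data stored today
  yearly_growth_tb=2    # observed growth per year
  years=3               # planned lifetime of the storage
  margin=2              # headroom for staff growth, snapshots, and surprises
  echo "$(( (current_tb + yearly_growth_tb * years) * margin )) TB usable, before redundancy overhead"

With these example figures you would plan for roughly 24 TB of usable space, before adding the redundancy overhead discussed later in this chapter.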

How Many Concurrent Clients?

The number of concurrent client connections determines the amount of RAM that you'll need. You could buy SSDs to serve as a level 2 cache for your storage and, if you were considering SATA disks, skip them entirely. Even if you are going to store hundreds of terabytes, if only a few client machines will ever use it, and not very intensively, you may be able to get by with a small amount of memory. This will also determine the kind of network interface in your server and the kind of switch it should be attached to.

How Critical Is the Data?

How critical is your data? If it's mission-critical, look at certified and probably more costly hardware, known to perform well and for longer. The importance of your data will also tell you which redundancy level you should use, which influences the final cost. My personal experience from various data centers suggests that SATA disks fail much faster than SAS disks.

What Types of Data?

The kind of data you will serve may affect the architecture of your storage pool. Streaming video files for a considerable number of clients, or servicing virtual machines and data files, will most probably mean you need to use mirrors, which directly influences the final capacity of your array and the final cost.
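As a quick illustration, a pool built from mirrored pairs might be created as sketched below. The pool name tank and the device names are placeholders, not recommendations.

  # Two mirrored pairs: ZFS stripes reads and writes across both vdevs.
  zpool create tank mirror /dev/sdb /dev/sdc mirror /dev/sdd /dev/sde
  zpool status tank   # verify the layout before putting data on it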

What Kind of Scope?

First, let's set upper bounds for what will be considered SoHo storage in this guide (a pool sketch matching these bounds follows the list):

  • Given current disk sizes, up to 12 slots in a single node and up to 30 TB of raw capacity.

  • Either internal SAS or SATA drives.

  • One or two slots for optional SSDs to speed up reads.

  • Possibly a mirrored ZIL device to speed up and group writes to disks.

  • A system drive, possibly mirrored, although currently setting up a Linux system on ZFS is not trivial and booting from ZFS is not recommended.

  • Up to 128 GB of RAM, possibly 64.

  • A 64-bit CPU with four or more cores. While running ZFS on 32-bit systems is possible, it certainly is not recommended.
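One hypothetical way these bounds could translate into a pool layout is sketched below: eight data disks in RAIDZ-2, two small SSDs as a mirrored log device, and one larger SSD as L2ARC. All device names are placeholders and the geometry is an assumption, not a prescription.

  zpool create tank \
    raidz2 /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi \
    log mirror /dev/sdj /dev/sdk \
    cache /dev/sdl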

If you intend to use external disk enclosures (JBODs) connected through SAS or FibreChannel, this book is probably not intended for you. It is possible to set up and administer such storage manually, and many people have done so, but it may involve additional steps not covered in this guide. If you want to run tens or hundreds of disks, do yourself a favor and consider FreeNAS or even commercial solutions with paid support. Keeping track of system performance, memory usage, disks, controllers, and cable health is probably best managed by specialized products.

Hardware Purchase Guidelines

Hardware is usually a long-term investment. Try to remember the following points.

Same Vendor, Different Batch

When buying disks, a common practice is to make sure you buy each disk of the same vendor and model, to keep geometry and firmware the same, but from different batches, so you minimize the risk of several disks dying at the same time. For a small-time buyer (a few to 20 disks), the simplest way to achieve this is to buy the disks from different shops. It might be cumbersome, but storage operators have seen whole disk batches fail at the same time many times in their lives.

Buy a Few Pieces for Spares

Storage system lifetime is usually counted in years and is often longer than the time a disk model stays on the market, especially if you decide to use consumer-grade SATA disks. When one of them fails in a few years, you may be surprised to find that you cannot buy that model any more. Introducing a different model into a pool is always a performance risk. If that happens, don't despair: ZFS lets you exchange all disks in a pool. This trick has been used in the past to increase the size of a pool when it became insufficient. Be aware that replacing all disks in a 10-disk pool can take weeks on a filled and busy system.
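When the day comes, the swap itself is straightforward; a hedged sketch follows, with tank and the device names as placeholders. With autoexpand enabled, the pool grows once every disk in a vdev has been replaced with a larger one.

  zpool set autoexpand=on tank
  zpool replace tank /dev/sdc /dev/sdj   # resilver the new disk in place of the old one
  zpool status tank                      # wait for the resilver to finish before the next swap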

Scope Power Supply Properly

If your power unit is unstable or insufficient, you may encounter mysterious failures (disks disappearing, disk connection dropping, or random I/O errors to the pool) or may not be able to use your disks at all.

Consider Performance, Plan for RAM

Performance-wise, the more disks the better, and the smaller the disks, the better. ZFS spreads writes and reads among vdevs; the more vdevs, the more read/write threads. Plan for plenty of RAM. ZFS needs at least 2 GB of RAM to work sensibly, but for any real-life use, don't go below 8 GB. For a SoHo storage system, I recommend looking at 64 GB or more. ZFS caches data very aggressively, so it will try to use as much RAM as possible. It will, however, yield when the system demands RAM for normal operations (such as new programs being run). So the more data it can fit in memory, the better.
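On ZFS on Linux you can watch how much of that RAM the cache is actually using; the ARC statistics are exposed under /proc, as in this small sketch.

  # Current ARC size and its upper limit, in bytes.
  awk '$1 == "size" || $1 == "c_max"' /proc/spl/kstat/zfs/arcstats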

Plan for SSDs (At Least Three)

You don't need to buy them upfront. ZFS is a hybrid storage file system, which means that it can use SSD disks as a level 2 cache. It's going to be much slower than your RAM, but it's cheaper and still much faster than your platter disks. For a fraction of the RAM price, you can get a 512 GB SSD drive, which should allow for another speed improvement. That's one SSD. Two more SSDs would be for an external ZFS Intent Log. The file system doesn't flush all the data to physical storage all the time. It groups writes into transactions and flushes several at the same time, to minimize file system fragmentation and real I/O to disks.

If you give ZFS external devices for the ZIL, it can speed things up by grouping even more data before flushing it to disk. This additional pool device should be mirrored, because it is a place where you can lose data: in case of power failure, the data on the external ZIL must survive. There are battery-backed DRAM devices that emulate small SSD disks, e.g., ZeusRAM. They come in 8 and 16 GB sizes, which is enough for a ZIL. They are as fast as memory, but they are costly. You can think of mirroring your L2ARC too (the level 2 cache), but losing this device won't endanger your data.
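Because these devices can be added to a live pool, retrofitting them later is easy. A sketch, with the pool name and NVMe device names as placeholders: the log vdev is mirrored because losing it can cost recent synchronous writes, while the cache device is not, since losing it only costs cached copies.

  zpool add tank log mirror /dev/nvme0n1 /dev/nvme1n1   # mirrored ZIL (SLOG)
  zpool add tank cache /dev/nvme2n1                     # single L2ARC device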

Consider SATA

While the SAS standard will certainly get better performance and life expectancy from your disks, for SoHo solutions SATA is enough, especially considering that there are enterprise-class SATA disks. The price difference for such a deployment shouldn't be very high. If you're unsure and your budget allows, choose SAS.

Do Not Buy Hardware or Soft RAID Controllers

While in the past RAID cards were necessary to offload both the CPU and RAM, both of those resources are now abundant and cheap. Your CPU and RAM will be more than enough for the workload, and RAID cards take away one important capability of ZFS. ZFS ensures data safety by talking directly to the disk: getting reliable information on when data is flushed to the physical disks and what block sizes are being used.

RAID controllers mediate in between and can make their own “optimizations” to the I/O, which may lower ZFS's reliability. The other thing is that RAID controllers are incompatible between vendors, and even the same card with a different firmware revision may be unable to access your RAID set. This means that in case of controller failure, you lose the whole setup and need to restore data from a backup. Soft RAIDs are even worse, in that they need special software (often limited to only one operating system) to actually work.

ZFS is superior in all of these areas. Not only can it use all the processing power and all the RAM you can give it to speed up your I/O, but the disks in a pool can also be migrated between all software platforms that implement the same OpenZFS version. Also, the exact sequence of disks in the disk slots is not important, as the pool remembers its configuration by disk device names (e.g., /dev/sdb) as well as by the GUIDs given to the disks by ZFS during pool creation.
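Moving a pool between machines or controllers boils down to an export and an import; ZFS finds its member disks by their on-disk labels, not by slot order. The pool name and the by-id directory are the usual ones on Linux, but treat the exact paths as examples.

  zpool export tank                        # on the old system
  zpool import                             # on the new system: scan for importable pools
  zpool import -d /dev/disk/by-id tank     # import using stable device names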

Networking Cards: At Least 1 Gb Speed

Remember that this server's networking card bandwidth will be shared among all the machines that simultaneously use the storage. It is quite sensible to consider 10 Gb, but then you also need to consider the rest of your networking gear: switches, cabling, etc. Remember that the network plays a role in performance analysis, and quite a large share of performance issues are caused not by the storage itself but by the networking layer. For serious work in an office, I would suggest going no lower than 10 Gb network cards. 1 Gb cards are acceptable in a very small environment where the storage won't be used extensively. Anything less will quickly become inconvenient at best.

Plan for Redundancy

Always. This means that for high-speed read pools, you need to consider mirrored storage, effectively halving the total capacity of the disks you buy. A RAIDZ setup means your capacity will be lowered by one disk per vdev you create; for RAIDZ-2, it will be two disks per vdev.
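A back-of-the-envelope comparison makes the overhead tangible. The figures below assume ten hypothetical 4 TB disks and a single RAIDZ or RAIDZ-2 vdev.

  disks=10; size_tb=4
  echo "mirror : $(( disks / 2 * size_tb )) TB usable"      # five mirrored pairs
  echo "raidz  : $(( (disks - 1) * size_tb )) TB usable"    # one parity disk in the vdev
  echo "raidz2 : $(( (disks - 2) * size_tb )) TB usable"    # two parity disks in the vdev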

Data Security

You are going to use ZFS storage to keep data and serve it to various people in your company. Be it two, ten, or fifty people, always put some thought into planning the layout. Separating data that varies by kind, sensitivity, and compressibility into different directories will pay off in the future. A well-designed directory structure simplifies both the organizational side, like access control, and the technical side, like compression being enabled or disabled, access-time options, etc.

ZFS file systems behave like directories. It is quite common to create a separate ZFS file system per user home directory, for example, so that each can have fine-grained backup policies, ACLs, and compression settings.
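A minimal sketch of that approach, assuming a pool called tank; each file system looks like a directory but carries its own properties and snapshot history.

  zfs create -p tank/home/alice                   # -p creates missing parents
  zfs set compression=lz4 tank/home/alice         # per-user compression policy
  zfs snapshot tank/home/alice@before-cleanup     # per-user point-in-time copy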

You need to consider your company size, the number of employees accessing the storage, growth prospects, data sensitivity, etc. Whatever you do, however, don't skip this point. I've seen quite a few companies that missed the moment when their freely evolving infrastructure needed to become something deliberately engineered.

CIA

There are many data security methodologies, and one of them, I believe the most classic, uses the acronym CIA to describe the aspects of data security: Confidentiality, Integrity, and Availability. While it focuses on the InfoSec side of things, it is a pretty good lens for storage administration as well. The next sections introduce these concepts from the point of view of the storage administrator.

Confidentiality

Data must be available only to people who are entrusted with it. No one who is not explicitly allowed to view the data should be able to access it. This side of security is covered by many infrastructural tools, from policies and NDAs that people allowed to view the data should read and sign, through network access separation (VPNs, VLANs, access control through credentials). There are also aspects directly related to the storage itself: Access Control Lists (ACLs), sharing through secure protocols and on secure networks, working with storage firewalls, etc.
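On the storage side, some of this is expressed as dataset properties. The sketch below is only an assumption about one possible setup: the dataset name, the subnet, and the choice of POSIX ACLs are placeholders.

  zfs set acltype=posixacl tank/projects                  # enable POSIX ACLs on this dataset
  zfs set sharenfs="rw=@192.168.10.0/24" tank/projects    # export only to a trusted subnet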

Integrity

It must be guaranteed that the data is genuine and was not changed by anyone who is not entrusted with it. Nor should changes be introduced by software or hardware, intentionally or not, when they are not supposed to happen. Throughout the whole data lifecycle, only people with sufficient privileges should be allowed to modify the data. An unintentional integrity breach may be a disk failure that corrupts data blocks. While with text data this is usually easily spotted, with other data, like sound or video, it is harder because there can be subtle differences from the original. As with all aspects of security, it is also only partially handled by the storage. Data integrity is covered by ACLs, but also by ZFS checksumming data blocks to detect corruption. If your setup uses any redundancy, ZFS can, to a great extent, fix such errors for you using the redundant copies.
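Checksums are verified on every read, but you can also ask ZFS to walk the entire pool and repair anything the redundant copies allow; this is a scrub. The pool name is a placeholder.

  zpool scrub tank
  zpool status -v tank   # shows scrub progress and lists any unrecoverable files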

Availability

The data should be available whenever it is required. This is probably one of the most obvious aspects of storage: any time you expect your data to be up, it should be up. Typically, storage redundancy comes into play here (mirror, RAIDZ, and RAIDZ-2), but so do network card trunking, switch stacking, and the redundancy of whatever credential-checking solution you are using (a primary and secondary Active Directory server, for example).

Types of Workload

The workload you are going to run on the storage will play a major role in how you should plan the pool layout.

If you are mostly going to host databases and they are going to be the dominant consumers of the space, an L2ARC SSD device may not provide you with special performance gains. Databases are very good at caching their own data, and if the data happens to fit into the database server's RAM, the ARC will not have much to do. On the other hand, if the data in your database changes often and needs to be reread from disks, you are going to have a high miss ratio anyway and, again, the L2ARC device will not fulfill its purpose.

Snapshotting database data is also going to be tricky. Databases need much more than a snapshot of the file system to be able to work on the data. This is why they come with their own dump commands: a full working backup usually contains more than what lives in the database files. Hosting a database would usually mean you run the engine on the same host as your ZFS. Again, the database will use the RAM more efficiently than the file system itself. Consider, though, whether you will serve data from the same server for other purposes, such as a CIFS or NFS share. In that case, the database and the file system cache may compete for RAM. While this shouldn't affect system stability, it may adversely affect performance.
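If you do end up hosting a database on ZFS, a few per-dataset properties address exactly this overlap. The sketch below is an assumption, not a tuning guide: the dataset name and the 16K record size are placeholders to be matched against your database's page size.

  zfs create -o recordsize=16K tank/db
  zfs set primarycache=metadata tank/db   # let the database, not the ARC, cache table data
  zfs set secondarycache=none tank/db     # keep the L2ARC free for other datasets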

If you host documents and pictures for office workers, such as procedures and technical documentation, an L2ARC device is something to seriously consider. Snapshotting is then a reliable way of capturing data at certain points in time. If your data is not being accessed 24 hours a day and you can afford just a few seconds of off-time, a snapshot can reliably capture your data at a specified point in time; it usually takes about a second to create. You can later mount this snapshot (remember, it is read-only) and transfer it to a backup location without worrying about data integrity.
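A minimal sketch of that workflow, with dataset names, the snapshot name, and the backup host as placeholders:

  zfs snapshot tank/docs@2023-01-15
  ls /tank/docs/.zfs/snapshot/2023-01-15/    # read-only view of the data at that moment
  zfs send tank/docs@2023-01-15 | ssh backup-host zfs receive backup/docs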

Above all, don't rush it. You can always add an L2ARC to your pool later on, if performance tests prove unsatisfactory.

Other Components To Pay Attention To

It is important to pay attention to other infrastructure elements. The network is of special interest. In a small company of a few people, a small switch and a workstation refit as a storage server might perform without any issue, but once the number of data consumers starts to grow, this kind of network may soon become a bottleneck. Switches may not be the only limiting factor; the network cards in your storage server may prove to be another. Also, if you serve your data over VPNs from a remote location, it may turn out that the interlink is too slow. Quite often in storage performance analysis cases, we were able to point to the networking infrastructure as the faulty element.

Hardware Checklist

Don't rush. Before buying your hardware, sit down with a piece of paper or your laptop and make a list. Think about how much space you will need and how this need may grow over several years. Think about your budget: how much can you spend? Count the number of machines you will be connecting to the storage and describe the kind of traffic that will be served. Lots of small files? Big, several-gigabyte files? Plan some tests and assume you'll need a few days to run them.
