Network benchmarking

There are a number of areas of the network that we need to benchmark in order to understand any limitations and make sure there are no misconfigurations.

A standard Ethernet frame carries a payload of up to 1500 bytes, whereas a jumbo frame typically carries 9000 bytes. The larger frame size reduces the per-frame overhead of sending data. If you have configured your network to use jumbo frames, the first thing to check is that they are configured correctly across all your servers and networking devices. If jumbo frames are configured incorrectly, Ceph will exhibit strange, random behavior that is very hard to trace; therefore, it is essential that jumbo frames are configured correctly and confirmed to be working before deploying Ceph over the top of your network.

To confirm whether jumbo frames are working correctly, you can use ping to send large packets with the don't fragment flag set:

    ping -M do -s 8972 <destination IP>

This command should be run across all your nodes to make sure they can ping each other using jumbo frames. If it fails, investigate the issue and resolve it before deploying Ceph. It is also worth automating this test, as future network changes could break this behavior, and diagnosing misconfigured jumbo frames through Ceph is almost impossible.
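The value 8972 in the ping command is the 9000-byte MTU less the 20-byte IPv4 header and the 8-byte ICMP header. As a minimal sketch of how this test could be automated, the following script derives the payload size from the MTU and pings each node with the don't fragment flag set. The function names and the host addresses are hypothetical, and the script assumes the Linux iputils version of ping and IPv4 without IP options:

```shell
#!/usr/bin/env bash
# Sketch: verify that jumbo frames pass unfragmented between nodes.
# Assumes Linux iputils ping and IPv4 without IP options.

# Largest ICMP payload that fits in one packet of the given MTU:
# MTU minus 20 bytes of IPv4 header and 8 bytes of ICMP header.
icmp_payload_for_mtu() {
    local mtu=$1
    echo $(( mtu - 20 - 8 ))
}

# Ping every host listed after the MTU with the don't fragment flag set.
# Prints one OK/FAIL line per host; returns non-zero if any host fails.
check_jumbo_frames() {
    local mtu=$1
    shift
    local size failed=0 host
    size=$(icmp_payload_for_mtu "$mtu")
    for host in "$@"; do
        if ping -M do -c 3 -W 2 -s "$size" "$host" > /dev/null 2>&1; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    return "$failed"
}

# Example (hypothetical addresses):
#   check_jumbo_frames 9000 192.168.1.11 192.168.1.12 192.168.1.13
```

Because the function returns a non-zero status when any host fails, it could be run from each node by a scheduler or a monitoring check, so that a later network change that breaks jumbo frames is caught early.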

The next test to undertake is to measure the round trip time, also with the ping tool. Using the packet size parameter again, but this time without the don't fragment flag, it is possible to test the round trip time of packet sizes up to 64 KB, which is the maximum IP packet size. The flag must be dropped because any packet larger than the MTU has to be fragmented.

Here are some example readings between two hosts on a 10GBASE-T network:

  • 32 B = 85 microseconds
  • 4 KB = 112 microseconds
  • 16 KB = 158 microseconds
  • 64 KB = 248 microseconds

As you can see, larger packet sizes impact the round trip time; this is one reason why larger I/O sizes will see a decrease in IOPS in Ceph.
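Readings like these can be collected with a small loop around ping. The following is a sketch only: the function names, the host address, and the choice of 20 pings per size are assumptions, and the parsing relies on the summary line printed by the Linux iputils ping (rtt min/avg/max/mdev = ...). The largest payload that can be used is 65507 bytes, which is the 65535-byte IPv4 maximum less the 28 bytes of IPv4 and ICMP headers:

```shell
#!/usr/bin/env bash
# Sketch: average round trip time for several ICMP payload sizes.

# Read ping output on stdin and print the average RTT in microseconds.
# Relies on the iputils summary line, for example:
#   rtt min/avg/max/mdev = 0.080/0.085/0.095/0.004 ms
# Split on "/", the fifth field is the average in milliseconds.
avg_rtt_us() {
    awk -F'/' '/^rtt / { printf "%.0f\n", $5 * 1000 }'
}

# Ping the host with each payload size and print the average RTT.
# No -M do here: payloads above the MTU have to be fragmented.
measure_rtt() {
    local host=$1
    shift
    local size
    for size in "$@"; do
        printf '%s bytes: %s us\n' "$size" \
            "$(ping -q -c 20 -s "$size" "$host" | avg_rtt_us)"
    done
}

# Example (hypothetical address):
#   measure_rtt 192.168.1.11 32 4096 16384 65507
```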

Finally, let's test the bandwidth between two hosts to confirm whether we get the expected performance or not.

Run iperf -s on the host that will act as the iPerf server.

Then, on the other host, run iperf -c <address of iperf server>.

In this example, the two hosts are connected via a 10G network and iPerf reports a bandwidth close to the maximum theoretical throughput. If you do not see the expected throughput, then you need to investigate the network, including the host configuration.
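The maximum theoretical throughput of a TCP test is somewhat lower than the line rate, because every frame carries header overhead. As a rough sketch, the function below (a hypothetical helper, not part of iPerf) estimates it from the link speed and MTU. It assumes IPv4 and TCP headers of 20 bytes each with no TCP options, and 38 bytes of Ethernet overhead per frame (8 bytes of preamble, 14 bytes of header, 4 bytes of frame check sequence, and a 12-byte interframe gap), with no VLAN tag:

```shell
#!/usr/bin/env bash
# Sketch: estimate the best-case TCP throughput over Ethernet.
# Usage: max_tcp_goodput_gbps <link speed in Gbit/s> <MTU in bytes>

max_tcp_goodput_gbps() {
    awk -v rate="$1" -v mtu="$2" 'BEGIN {
        payload = mtu - 40   # MTU less IPv4 (20) and TCP (20) headers
        wire    = mtu + 38   # MTU plus preamble, header, FCS, and gap
        printf "%.2f\n", rate * payload / wire
    }'
}

# max_tcp_goodput_gbps 10 1500   # prints 9.49
# max_tcp_goodput_gbps 10 9000   # prints 9.91
```

Under these assumptions, a result a little below 9.5 Gbit/s on a 10G link with a 1500-byte MTU is already close to the ceiling. TCP options such as timestamps lower the figure slightly further.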
