Ceph node running out of resources during recovery

During day-to-day operations, a Ceph node uses relatively few resources, such as CPU and memory. During a cluster recovery, however, Ceph redistributes a large amount of data between OSDs, which consumes a large portion of the node's resources. If a node is constantly running out of resources during recovery, first check whether any VMs are running on that node; those VMs will need to be powered off or migrated to another node until the rebalancing finishes. If that is not the case, check the resources available on the node itself: it may simply not have enough to keep up with the Ceph recovery. Another common cause is that Ceph has been configured with aggressive recovery values, such as the number of simultaneous recovery operations allowed per OSD or the maximum number of backfills. A great feature of Ceph is that much of its configuration can be changed at runtime and takes effect immediately. The following are some of the configuration options to check if nodes are running out of resources during recovery.
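
Before changing any settings, it can help to confirm that recovery or backfill is actually in progress and to see how loaded the node is. The following are standard commands, not specific to this setup: ceph -s reports cluster health and any ongoing recovery or backfill activity, while top and free show CPU and memory pressure on the node itself:

# ceph -s
# top
# free -h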

To check the recovery values of an OSD, run a command in the following format on the node that hosts the OSD (here, osd.0):

# ceph daemon osd.0 config show | grep recovery

This command will show all the recovery-related options currently set for that OSD.
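
An abridged, illustrative example of that output is shown below. The option names are genuine Ceph settings; apart from osd_recovery_max_active, which matches our example cluster, the values are typical defaults and will vary by release and configuration:

    "osd_recovery_delay_start": "0",
    "osd_recovery_max_active": "3",
    "osd_recovery_max_chunk": "8388608",
    "osd_recovery_op_priority": "3",
    "osd_recovery_sleep": "0",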

From this output, we can see that osd_recovery_max_active is currently set to 3, meaning each OSD will handle up to three simultaneous recovery operations. If the Ceph node is struggling, we can drop this value to 1 using the following command:

# ceph tell osd.* injectargs '--osd-recovery-max-active 1'

The previous command changes the limit to 1 for all OSDs, because we used a wildcard as the OSD ID instead of specifying one particular OSD. The injectargs syntax changes values in real time, without needing to restart any OSD or node.
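
If only one node is struggling, the same change can be limited to a single OSD by giving its ID instead of the wildcard, and the new value can be confirmed by rerunning the config show command against that OSD's admin socket (osd.0 is used here purely as an example ID):

# ceph tell osd.0 injectargs '--osd-recovery-max-active 1'
# ceph daemon osd.0 config show | grep recovery_max_active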

If we want to check the value currently set for max backfills, we can run a similar command, as follows:

# ceph daemon osd.0 config show | grep backfills

For our example cluster, the command shows osd_max_backfills set to 6.
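
The relevant line of the output looks similar to the following (the value 6 is simply what our example cluster currently has configured):

    "osd_max_backfills": "6",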

A value of 6 may be too high for a smaller node, and should be dropped to 1 if the node is running out of resources during recovery. We can change it using the same injectargs mechanism we have already seen for the recovery limit:

# ceph tell osd.* injectargs '--osd-max-backfills 1'
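
As before, the change takes effect immediately and can be verified against any OSD's admin socket:

# ceph daemon osd.0 config show | grep backfills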

It is important to note that, besides a node running out of resources, high recovery values can also create a network bottleneck, slowing connectivity to the point where users can no longer access their VMs. In such a scenario, these recovery settings are just as helpful. Lower recovery values ensure that user requests are not interrupted, at the cost of recovery proceeding at a slower pace. If user connectivity is not a priority, for example overnight, we can inject higher values to speed up recovery and then drop them back down before the working day starts.
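
As an illustration, both settings can be raised together for an overnight recovery window and lowered again in the morning with a single injectargs call each time. The higher values below simply restore the 3 and 6 we saw earlier and should be tuned to whatever the hardware and network can sustain:

# ceph tell osd.* injectargs '--osd-recovery-max-active 3 --osd-max-backfills 6'
# ceph tell osd.* injectargs '--osd-recovery-max-active 1 --osd-max-backfills 1'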
