This setting allows play developers to define the percentage of hosts that can fail before the whole operation is aborted. At the end of each task, Ansible calculates how many of the hosts targeted by the play have reached a failure state, and if that share exceeds the allowed percentage, Ansible aborts the playbook. This is similar to any_errors_fatal; in fact, any_errors_fatal just internally expresses a max_fail_percentage parameter of 0, where any failure is considered fatal. Let's edit our play from the preceding example and remove any_errors_fatal, replacing it with the max_fail_percentage parameter set to 20:
---
- name: any errors fatal
  hosts: failtest
  gather_facts: false
  max_fail_percentage: 20
By making that change, our play should complete both tasks without aborting, because the failures stay within the 20 percent limit.
Now, if we change the condition on our first task so that we fail on over 20 percent of the hosts, we'll see the playbook abort early:
- name: fail last host
  fail:
    msg: "I am last"
  when: inventory_hostname in play_hosts[0:3]
We're setting up three hosts to fail, which gives us a failure rate of 30 percent. The max_fail_percentage value is the maximum allowed and must be exceeded, not merely reached, so our setting of 20 allows 2 of the 10 hosts to fail. With three hosts failing, we see a fatal error before the second task runs.
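The threshold arithmetic can be sketched in a few lines of Python. This is only an illustration of the comparison described in the text, not Ansible's own code: the helper name exceeds_max_fail and its parameters are hypothetical, and it assumes all ten hosts run in a single batch.

```python
def exceeds_max_fail(failed_hosts: int, total_hosts: int, max_fail_percentage: float) -> bool:
    """Return True when the share of failed hosts is strictly above the limit.

    Hypothetical helper for illustration only; it is not part of Ansible.
    """
    if total_hosts <= 0:
        raise ValueError("total_hosts must be positive")
    # Cross-multiply instead of dividing so the comparison stays exact:
    # failed / total * 100 > limit  is the same as  failed * 100 > limit * total
    return failed_hosts * 100 > max_fail_percentage * total_hosts


# The ten-host example from the text, with max_fail_percentage set to 20
print(exceeds_max_fail(2, 10, 20))  # False: 20 percent only reaches the limit
print(exceeds_max_fail(3, 10, 20))  # True: 30 percent exceeds the limit
print(exceeds_max_fail(1, 10, 0))   # True: a limit of 0 makes any failure fatal
```

The strict comparison is the point of the sketch: two failures out of ten sit exactly on the limit and are tolerated, while a third failure tips the play over it.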
With this combination of parameters, we can easily set up and control fail-fast conditions on a group of hosts, which is incredibly valuable when the goal is to maintain the integrity of an environment during an Ansible deployment.