Fuzzing the simulation

Feruscore has been changed in this chapter to include raw memory access, both in Rust and through FFI. The responsible thing to do is fuzz Mars and make sure we're not causing segmentation faults. We'll use AFL, which we discussed back in Chapter 2, Sequential Rust Performance and Testing, and again in Chapter 5, Locks – Mutex, Condvar, Barriers, and RWLock. The fuzz target is src/bin/fuzz_target.rs.

The trick with fuzzing is ensuring stability. That is, AFL can't really do its work if the same input, applied multiple times, takes different paths through the program. Fuzzing is more efficient against a deterministic system. We were careful to make Mars::compete_inner deterministic, whereas Mars::compete uses randomness to determine warrior positions. Fuzzing, then, will go through compete_inner only. The preamble for fuzz_target doesn't contain any new crates:

extern crate byteorder;
extern crate feruscore;

use byteorder::{BigEndian, ReadBytesExt};
use feruscore::individual::*;
use feruscore::instruction::*;
use feruscore::mars::*;
use std::io::{self, Cursor, Read};
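
To make the stability point concrete, here is a minimal sketch of the split described above: a randomized entry point that only picks positions, wrapping a deterministic inner function that does the real work. The Arena type, its toy scoring logic, and the toy generator are hypothetical stand-ins, not Feruscore's implementation; only the shape, compete delegating to compete_inner, mirrors the text.

```rust
/// Hypothetical stand-in for Mars; not Feruscore's real type.
struct Arena {
    core_size: u16,
}

impl Arena {
    /// Deterministic: the same arguments always give the same result,
    /// so a fuzzer sees the same execution path for the same input.
    fn compete_inner(&self, left_pos: u16, right_pos: u16) -> u16 {
        // Toy "simulation": absolute distance between the wrapped positions.
        let l = left_pos % self.core_size;
        let r = right_pos % self.core_size;
        if l >= r { l - r } else { r - l }
    }

    /// Non-deterministic wrapper: picks positions, then delegates.
    /// `next_random` stands in for a real random number generator.
    fn compete(&self, next_random: &mut dyn FnMut() -> u16) -> u16 {
        let left_pos = next_random();
        let right_pos = next_random();
        self.compete_inner(left_pos, right_pos)
    }
}

fn main() {
    let arena = Arena { core_size: 8000 };
    // Calling the inner function twice with the same input agrees with itself.
    let first = arena.compete_inner(10, 250);
    let second = arena.compete_inner(10, 250);
    println!("inner: {} then {}", first, second);

    // A toy generator; a real program would draw from a properly seeded RNG,
    // so the wrapper's result would differ from run to run.
    let mut state: u16 = 7;
    let mut toy_rng = move || {
        state = state.wrapping_mul(25_173).wrapping_add(13_849);
        state
    };
    println!("outer: {}", arena.compete(&mut toy_rng));
}
```

Pointing the fuzzer at the inner function means every byte that influences the run comes from AFL's input, which is what lets AFL attribute a new path to a specific mutation.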

Remember that AFL passes a byte slice through stdin, and the fuzzing target is responsible for deserializing those bytes into something sensible for itself. We'll build up a Config structure:

#[derive(Debug)]
struct Config {
    pub rounds: u16,
    pub core_size: u16,
    pub cycles: u16,
    pub processes: u16,
    pub max_warrior_length: u16,
    pub left_chromosome_size: u16,
    pub right_chromosome_size: u16,
    pub left: Individual,
    pub right: Individual,
    pub left_pos: u16,
    pub right_pos: u16,
}

Hopefully, these fields are familiar. The rounds, core_size, cycles, and processes each affect the MARS environment. The max_warrior_length, left_chromosome_size, and right_chromosome_size affect the two individuals that will be made to compete. left and right are those Individual instances. left_pos and right_pos set where the warriors will be placed in the MARS core memory. The numbers that we'll deserialize from the byte slice won't always be entirely sensible, so there'll be some cleanup needed, like so:

impl Config {
    pub fn new(rdr: &mut Cursor<Vec<u8>>) -> io::Result<Config> {
        let rounds = (rdr.read_u16::<BigEndian>()? % 1000).max(1);
        let core_size = (rdr.read_u16::<BigEndian>()? % 24_000).max(256);
        let cycles = (rdr.read_u16::<BigEndian>()? % 10_000).max(100);
        let processes = (rdr.read_u16::<BigEndian>()? % 1024).max(2);
        let max_warrior_length = (rdr.read_u16::<BigEndian>()? % 256).max(4);
        let left_chromosome_size = (rdr.read_u16::<BigEndian>()? 
                                       % max_warrior_length).max(2);
        let right_chromosome_size = (rdr.read_u16::<BigEndian>()? 
                                        % max_warrior_length).max(2);
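
The read-then-clamp pattern above can be tried in isolation. The sketch below uses only the standard library: read_u16_be is a hypothetical stand-in for byteorder's read_u16::<BigEndian>, and clamp_field is an illustrative helper name, not part of fuzz_target. It shows that any raw value lands between the floor and the modulus minus one, and that a truncated input surfaces as an Err rather than a panic.

```rust
use std::io::{self, Cursor, Read};

/// Stand-in for byteorder's `read_u16::<BigEndian>()`, using only std.
fn read_u16_be<R: Read>(rdr: &mut R) -> io::Result<u16> {
    let mut buf = [0u8; 2];
    rdr.read_exact(&mut buf)?; // returns an error if fewer than 2 bytes remain
    Ok(u16::from_be_bytes(buf))
}

/// Wrap `raw` below `modulus`, then raise it to at least `floor`.
fn clamp_field(raw: u16, modulus: u16, floor: u16) -> u16 {
    (raw % modulus).max(floor)
}

fn main() {
    // Three bytes: one full u16 (0xFFFF) followed by one dangling byte.
    let mut rdr = Cursor::new(vec![0xFF, 0xFF, 0x00]);
    let raw = read_u16_be(&mut rdr).expect("two bytes are available");

    // 65_535 % 1000 = 535, which is already above the floor of 1.
    println!("rounds = {}", clamp_field(raw, 1000, 1));
    // A raw value of 0 would mean zero rounds; the floor lifts it to 1.
    println!("rounds = {}", clamp_field(0, 1000, 1));
    // Only one byte remains, so the next read reports an error.
    println!("short read is_err = {}", read_u16_be(&mut rdr).is_err());
}
```

Because every read goes through `?`, an input that is too short makes Config::new return early with an error instead of crashing, so AFL does not record it as a finding.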

It doesn't make any sense for there to be 0 rounds, as an example, so we say that there must be at least one round. Likewise, we need at least two processes, a maximum warrior length of at least four instructions, and so forth. Creating the left and right warriors is a matter of passing the byte reader into Config::mk_individual:

        let left = Config::mk_individual(rdr, 
                                         max_warrior_length,
                                         left_chromosome_size, 
                                         core_size)?;
        let right =
            Config::mk_individual(rdr, 
                                  max_warrior_length, 
                                  right_chromosome_size, 
                                  core_size)?;

Config::mk_individual deserializes bytes into InstructionBuilder calls. The whole thing is kind of awkward. While we can cast a field-less enum into an integer, it's not possible to go from an integer back to a field-less enum without some hairy match statements:

    fn mk_individual(
        rdr: &mut Cursor<Vec<u8>>,
        max_chromosome_size: u16,
        chromosome_size: u16,
        core_size: u16,
    ) -> io::Result<Individual> {
        assert!(chromosome_size <= max_chromosome_size);
        let mut indv = IndividualBuilder::new();
        for _ in 0..(chromosome_size as usize) {
            let builder = InstructionBuilder::new(core_size);
            let a_field = rdr.read_i8()?;
            let b_field = rdr.read_i8()?;
            let a_mode = match rdr.read_u8()? % 5 {
                0 => Mode::Direct,
                1 => Mode::Immediate,
                2 => Mode::Indirect,
                3 => Mode::Decrement,
                _ => Mode::Increment,
            };
            let b_mode = match rdr.read_u8()? % 5 {
                0 => Mode::Direct,
                1 => Mode::Immediate,
                2 => Mode::Indirect,
                3 => Mode::Decrement,
                _ => Mode::Increment,
            };

Here, we've established the InstructionBuilder and read the Mode for the a-field and b-field out of the byte slice. If a variant is ever added to Mode, we'll have to come through here and update the fuzzing code. It's a real pain. Reading the Modifier out works the same way:

            let modifier = match rdr.read_u8()? % 7 {
                0 => Modifier::F,
                1 => Modifier::A,
                2 => Modifier::B,
                3 => Modifier::AB,
                4 => Modifier::BA,
                5 => Modifier::X,
                _ => Modifier::I,
            };

As does reading out the OpCode:

            let opcode = match rdr.read_u8()? % 16 {
                0 => OpCode::Dat,   // 0
                1 => OpCode::Spl,   // 1
                2 => OpCode::Mov,   // 2
                3 => OpCode::Djn,   // 3
                4 => OpCode::Add,   // 4
                5 => OpCode::Jmz,   // 5
                6 => OpCode::Sub,   // 6
                7 => OpCode::Seq,   // 7
                8 => OpCode::Sne,   // 8
                9 => OpCode::Slt,   // 9
                10 => OpCode::Jmn,  // 10
                11 => OpCode::Jmp,  // 11
                12 => OpCode::Nop,  // 12
                13 => OpCode::Mul,  // 13
                14 => OpCode::Modm, // 14
                _ => OpCode::Div,   // 15
            };
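
Since the match blocks above all share one shape, one way to cut the repetition is a lookup table per enum. The sketch below is an alternative under stated assumptions, not the approach fuzz_target takes: it defines its own local Mode enum (Feruscore's real one lives in feruscore::instruction) and a hypothetical from_byte helper that indexes a constant array of variants.

```rust
/// Local stand-in for Feruscore's Mode; variant names follow the text.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Mode {
    Direct,
    Immediate,
    Indirect,
    Decrement,
    Increment,
}

impl Mode {
    /// Every variant, in the order the fuzz target maps them from 0 upward.
    const ALL: [Mode; 5] = [
        Mode::Direct,
        Mode::Immediate,
        Mode::Indirect,
        Mode::Decrement,
        Mode::Increment,
    ];

    /// Map an arbitrary byte onto a variant. The modulus is taken from the
    /// table length, so it cannot drift out of sync with the table.
    fn from_byte(byte: u8) -> Mode {
        Mode::ALL[usize::from(byte) % Mode::ALL.len()]
    }
}

fn main() {
    for &byte in &[0u8, 4, 5, 255] {
        println!("{} -> {:?}", byte, Mode::from_byte(byte));
    }
}
```

Adding a variant still means touching the table, and the compiler will not flag a forgotten entry, but the change is confined to one place per enum rather than every call site.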

Producing an instruction is simple enough, thanks to the builder pattern in use here:

            let inst = builder
                .a_field(a_field)
                .b_field(b_field)
                .a_mode(a_mode)
                .b_mode(b_mode)
                .modifier(modifier)
                .opcode(opcode)
                .freeze();
            indv = indv.push(inst);
        }
        Ok(indv.freeze())
    }
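
For readers who have not met the consuming-builder style used by the chain above, here is a minimal sketch of it. Instr and InstrBuilder are hypothetical types invented for illustration, and the wrapping of negative offsets in freeze is an assumption of this sketch, not a description of what InstructionBuilder does internally.

```rust
/// Hypothetical frozen value; not Feruscore's Instruction.
#[derive(Debug, PartialEq)]
struct Instr {
    a_field: u16,
    b_field: u16,
}

/// Hypothetical consuming builder in the style of the chain above.
struct InstrBuilder {
    core_size: u16,
    a_field: i8,
    b_field: i8,
}

impl InstrBuilder {
    fn new(core_size: u16) -> Self {
        InstrBuilder { core_size, a_field: 0, b_field: 0 }
    }

    // Each setter takes `self` by value and hands it back, which is what
    // makes the calls chainable without any intermediate bindings.
    fn a_field(mut self, v: i8) -> Self {
        self.a_field = v;
        self
    }

    fn b_field(mut self, v: i8) -> Self {
        self.b_field = v;
        self
    }

    /// Consume the builder and produce the final value.
    /// Assumption for this sketch: negative offsets wrap around the core.
    fn freeze(self) -> Instr {
        let wrap = |v: i8| i32::from(v).rem_euclid(i32::from(self.core_size)) as u16;
        Instr {
            a_field: wrap(self.a_field),
            b_field: wrap(self.b_field),
        }
    }
}

fn main() {
    let inst = InstrBuilder::new(8000).a_field(-1).b_field(2).freeze();
    println!("{:?}", inst);
}
```

Because freeze takes the builder by value, a frozen instruction cannot be mutated through a builder that is still lying around, which suits a fuzz target that builds each instruction once and moves on.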

Moving back up to Config::new, we create the left and right positions:

        let left_pos =
            Config::adjust_pos(core_size, 
                               rdr.read_u16::<BigEndian>()?, 
                               max_warrior_length);
        let right_pos =
            Config::adjust_pos(core_size, 
                               rdr.read_u16::<BigEndian>()?, 
                               max_warrior_length);

The adjust_pos function is a small thing, intended to keep warrior positions properly in bounds:

    fn adjust_pos(core_size: u16, mut pos: u16, space: u16) -> u16 {
        pos %= core_size;
        if (pos + space) > core_size {
            let past = (pos + space) - core_size;
            pos - past
        } else {
            pos
        }
    }
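
A few worked values make the behaviour of adjust_pos easier to see. The function body below is copied from the listing above, lifted out of impl Config into a free function so the example stands alone; the core size of 8000 and the space of 100 are arbitrary illustrative numbers.

```rust
/// Copied from the fuzz target: keep `pos..pos + space` inside the core.
fn adjust_pos(core_size: u16, mut pos: u16, space: u16) -> u16 {
    pos %= core_size;
    if (pos + space) > core_size {
        let past = (pos + space) - core_size;
        pos - past
    } else {
        pos
    }
}

fn main() {
    let (core_size, space) = (8000, 100);
    // Fits already, so it comes back unchanged.
    println!("{}", adjust_pos(core_size, 100, space));
    // Larger than the core: the modulus wraps it first (16_000 % 8000 = 0).
    println!("{}", adjust_pos(core_size, 16_000, space));
    // Would run 90 cells past the end, so it is pulled back by 90.
    println!("{}", adjust_pos(core_size, 7990, space));
    // A different raw position, pulled back by 50 to the very same cell.
    println!("{}", adjust_pos(core_size, 7950, space));
}
```

The last two calls both settle on 7900, a case where two distinct raw positions put both warriors on exactly the same cells.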

It's entirely possible that the warriors will overlap with this calculation. That is okay. Our ambition with fuzzing is not to check the logic of the program, only to seek out crashes. In fact, if overlapping two warriors causes a crash, that's a fact we need to know. The close of Config::new is fairly straightforward:

        Ok(Config {
            rounds,
            core_size,
            cycles,
            processes,
            max_warrior_length,
            left_chromosome_size,
            right_chromosome_size,
            left,
            right,
            left_pos,
            right_pos,
        })
    }

After all that, the main function of fuzz_target is minimal:

fn main() {
    let mut input: Vec<u8> = Vec::with_capacity(1024);
    let result = io::stdin().read_to_end(&mut input);
    if result.is_err() {
        return;
    }
    let mut rdr = Cursor::new(input);
    if let Ok(config) = Config::new(&mut rdr) {
        let mut mars = MarsBuilder::default()
            .core_size(config.core_size)
            .cycles(config.cycles)
            .processes(u32::from(config.processes))
            .max_warrior_length(config.max_warrior_length)
            .freeze();
        mars.compete_inner(
            &config.left,
            config.left_pos,
            &config.right,
            config.right_pos,
        );
    }
}

Stdin is captured and a Cursor built from it, which we pass into Config::new, as explained earlier. The resulting Config is used to fuel MarsBuilder, and the Mars is then the arena for competition between the two fuzzing Individual instances, which may or may not overlap.

Remember, before running AFL, be sure to run cargo afl build --release and not cargo build --release. Both will work, but the first is significantly faster at discovering crashes, as AFL's instrumentation will be inlined in the executable. I've found that even a single instance of cargo afl fuzz -i /tmp/in -o /tmp/out target/release/fuzz_target will run through AFL cycles at a good clip. There aren't many branches in the code and so there are few paths for AFL to probe.
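
AFL needs at least one starting file in its input directory. As a sketch of how such a seed could be produced, the program below emits bytes in the order Config::new reads them: seven big-endian u16 values, six bytes per instruction for each warrior, then two big-endian u16 positions. The specific values are arbitrary choices for illustration, picked so that none of them is altered by the clamping, and the helper names push_u16 and seed are hypothetical.

```rust
use std::io::{self, Write};

/// Append a u16 in big-endian order, matching read_u16::<BigEndian>.
fn push_u16(buf: &mut Vec<u8>, v: u16) {
    buf.extend_from_slice(&v.to_be_bytes());
}

/// Build one seed in the byte layout that Config::new consumes.
fn seed() -> Vec<u8> {
    let mut buf = Vec::new();
    // rounds, core_size, cycles, processes, max_warrior_length,
    // left_chromosome_size, right_chromosome_size
    for &v in &[10u16, 8000, 1000, 8, 16, 2, 2] {
        push_u16(&mut buf, v);
    }
    // Six bytes per instruction: a_field, b_field, a_mode, b_mode,
    // modifier, opcode. Under the mappings in mk_individual this is
    // Mov with modifier I and both modes Direct.
    let inst = [0u8, 1, 0, 0, 6, 2];
    for _ in 0..4 {
        // Two instructions for the left warrior, then two for the right.
        buf.extend_from_slice(&inst);
    }
    push_u16(&mut buf, 0); // left_pos
    push_u16(&mut buf, 4000); // right_pos
    buf
}

fn main() {
    // Raw bytes go to stdout so the output can be redirected into a file.
    io::stdout()
        .write_all(&seed())
        .expect("failed to write seed to stdout");
}
```

Redirecting the output into a file under /tmp/in (any file name will do) gives AFL a well-formed 42-byte input to start mutating from.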
