Benchmarking and Profiling Genetic Algorithms

The first step in any optimization process is doing some investigative work by establishing baselines and determining where your code is slow. Benchmarking is the process of evaluating your code from a performance point of view. Benchmarking is usually done to assess the performance of units of code in the context of space (memory usage) or time (execution time). Profiling is the process of evaluating specific aspects of a program in space or time to aid in performance optimization. Profiling helps you identify specific bottlenecks in your code.

The difference between benchmarking and profiling is subtle. The purpose of benchmarking is to establish performance metrics for an entire operation to compare between operations. This could mean comparing operations performed with different settings or comparing operations on different hardware.

The purpose of profiling is to understand the behavior of a program. For example, a profiler might tell you where a program spends most of its time or what functions a program invokes the most. Profiling offers detailed insights into where you should try to optimize your code the most.

You can use a number of tools to benchmark and profile your code. In this section, you’ll use Benchee to benchmark your algorithms and ExProf to profile them. But before you begin optimizing, you need to decide if it’s even worth it.

Deciding When to Optimize

Oftentimes, programmers make the decision to optimize code prematurely and unnecessarily. Optimization isn’t a bad thing; however, choosing to optimize at the wrong time—or for the wrong reasons—hinders the development process.

Practical genetic algorithms are computationally expensive—real-world problems will often require you to work with significant amounts of data. Fortunately, modern hardware alleviates a lot of the pressure on software to be optimized. Often, the available computing power far exceeds computational requirements. In a lot of situations, optimization isn’t necessary.

Of course, when you’re dealing with a lot of data, optimization can be necessary to prevent algorithms from crashing. On a moderately powerful machine, your genetic algorithms will become noticeably slower when the population size, and the size of each chromosome, multiplies to around 1,000,000. In most other circumstances, you’ll see no noticeable performance dips and optimization is unnecessary.

You can reference the Erlang documentation[15] for explicit information on the limits of the BEAM and memory usage of different data types. This might help you in determining if you need to devote time to optimization.

Using Benchee to Analyze Performance

If you decide that optimization is necessary, the first thing you need to do is establish a baseline. Benchmarking your algorithm allows you to determine if the optimizations you’re making are having any impact on the overall performance of your algorithm. Benchmarking is a necessary first step to determine if your optimizations are meaningful.

Benchee[16] is an Elixir benchmarking package that’s easy to use and provides a lot of information out of the box. You’ll use Benchee to benchmark the different aspects of your genetic algorithms.

Start by adding Benchee to your dependencies in mix.exs:

 defp​ deps ​do
 # ... other deps
  {​:benchee​, ​~​> ​"​​1.0.1"​}
 end

Next, create a new file benchmark.exs in a new directory bench. You’ll declare basic benchmarks of each of your core functions in this file.

Benchmarks are typically done on large parts of programs. In this example, you’ll benchmark each of the core functions in your genetic algorithm framework, and you’ll also benchmark a single evolution. To do this, you need to declare a DummyProblem that serves as a baseline problem for your genetic algorithm to work on. The problem you declare won’t solve anything useful, but it will serve as a means to benchmark the different functions in your algorithm. In benchmark.exs, declare your DummyProblem, like this:

 defmodule​ DummyProblem ​do
  @behaviour Problem
  alias Types.Chromosome
 
  @impl true
 def​ genotype ​do
  genes = for _ <- 1..100, ​do​: Enum.random(0..1)
  %Chromosome{​genes:​ genes, ​size:​ 100}
 end
 
  @impl true
 def​ fitness_function(chromosome), ​do​: Enum.sum(chromosome.genes)
 
  @impl true
 def​ terminate?(_population, generation), ​do​: generation == 1
 end

initialize requires a genotype that returns a new chromosome. Here, you declare a genotype that’s identical to one you’ve used in other problems. You can change this to whatever you like, but the idea is to use a typical genotype so that you understand how initialize performs in an average scenario.

evaluate requires a fitness_function. In the same spirit as the genotype you declared for initialize, you’ll use a fitness function that’s identical to other ones you’ve used before. Once again, you can change the fitness function, but the function you defined here will represent a typical fitness function for any problem.

terminate? is a dummy termination function. It terminates after a single generation to ensure you can get one full iteration of the evolve function. You could create a dummy evolve function and see negligible differences in performance; however, this methodology ensures you get a true picture of the performance of your functions.

Next, you need to declare dummy populations to pass to your other functions. You can use initialize and the genotype you declared, like this:

 dummy_population = Genetic.initialize(&DummyProblem.genotype/0,
 population_size:​ 100)
 {dummy_selected_population, _} = Genetic.select(dummy_population,
 selection_rate:​ 1.0)

You need two different populations for the functions in your algorithm. dummy_population is a standard population that can be passed to select, evaluate, and mutation. dummy_selected_population represents a population of parents for use in crossover. If you recall from Chapter 2, Breaking Down Genetic Algorithms, crossover requires a list of tuples. Calling select on the dummy_population takes care of this for you. Additionally, you declare a :selection_rate of 1.0. You can change this, but understand that the performance of crossover changes with respect to the selection rate because the selection rate determines how many parents are in the selected population.

The final step is to declare your benchmarks for each function. At the bottom of benchmark.exs, add the following:

 Benchee.run(
  %{
 "​​initialize"​ => ​fn​ -> Genetic.initialize(&DummyProblem.genotype/0) ​end​,
 "​​evaluate"​ =>
 fn​ ->
  Genetic.evaluate(dummy_population, &DummyProblem.fitness_function/1)
 end​,
 "​​select"​ => ​fn​ -> Genetic.select(dummy_population) ​end​,
 "​​crossover"​ => ​fn​ -> Genetic.crossover(dummy_selected_population) ​end​,
 "​​mutation"​ => ​fn​ -> Genetic.mutation(dummy_population) ​end​,
 "​​evolve"​ => ​fn​ -> Genetic.evolve(dummy_population, DummyProblem, 0) ​end
  },
 memory_time:​ 2
 )

Here, you declare the functions you’re trying to benchmark. Benchee will run each of these functions and report the performance of each. The :memory_time option tells Benchee how long memory tests should be performed for. The default is 0, so to turn it on, you need to declare a positive number.

Next, run your benchmarks, like this:

 $ ​​mix​​ ​​run​​ ​​bench/benchmark.exs
 ...
 
 Comparison:
 evaluate 8.94 K
 select 7.84 K - 1.14x slower +15.72 μs
 mutation 5.59 K - 1.60x slower +67.12 μs
 crossover 2.30 K - 3.88x slower +322.20 μs
 evolve 0.88 K - 10.17x slower +1025.02 μs
 initialize 0.26 K - 34.72x slower +3770.38 μs
 
 ...
 
 Comparison:
 evaluate 24.29 KB
 select 31.66 KB - 1.30x memory usage +7.37 KB
 mutation 117.23 KB - 4.83x memory usage +92.94 KB
 crossover 264.77 KB - 10.90x memory usage +240.48 KB
 evolve 412.39 KB - 16.98x memory usage +388.10 KB
 initialize 2201.65 KB - 90.64x memory usage +2177.36 KB

After some system information, you’ll see the results of the benchmarks. Your results might differ slightly from what you see here. You should see some statistics related to each benchmark. That information has been left out here. You can also see the baseline performance of each of your functions in both memory usage and overall performance. You might notice that initialize/2 is the slowest function. Your immediate reaction here might be to aggressively optimize initialize/2; however, that wouldn’t make much sense because initialize/2 only runs once. You wouldn’t get much benefit out of optimizing it.

In the next section, you’ll learn how to use a profiler to determine where you should optimize.

Identifying Bottlenecks with ExProf

ExProf[17] is an Elixir profiling tool that wraps around Erlang’s :eprof. A profiler tells you about the behavior of your code and helps you determine where to optimize.

To start using ExProf, first add it as a dependency:

 defp​ deps ​do
  ...
  {​:exprof​, ​"​​~> 0.2.0"​}
  ...
 end

Next, create a new file profile.exs in bench. In the newly created file, add a DummyProblem module identical to the one you declared in the last section, like this:

 defmodule​ DummyProblem ​do
  @behaviour Problem
  alias Types.Chromosome
 
  @impl true
 def​ genotype ​do
  genes = for _ <- 1..100, ​do​: Enum.random(0..1)
  %Chromosome{​genes:​ genes, ​size:​ 100}
 end
 
  @impl true
 def​ fitness_function(chromosome), ​do​: Enum.sum(chromosome.genes)
 
  @impl true
 def​ terminate?(_population, generation), ​do​: generation == 1
 end

You’ll use this problem to benchmark the performance of your algorithm. Next, create a Profiler module like this:

 defmodule​ Profiler ​do
 import​ ExProf.Macro
 
 def​ do_analyze ​do
  profile ​do
  Genetic.run(DummyProblem)
 end
 end
 
 def​ run ​do
  {records, _block_result} = do_analyze
  total_percent = Enum.reduce(records, 0.0, &(&1.percent + &2))
  IO.inspect ​"​​total = ​​#{​total_percent​}​​"
 end
 end

This code uses the profile macro from ExProf to profile the performance of your genetic algorithm. You wrap whatever functions you want to profile inside of the profile macro. The rest of the code is required to run the overall profile.

Next, you need to run the profiler in the script:

 Profiler.run()

Now, run the profiler:

 $ ​​mix​​ ​​run​​ ​​scripts/profile.exs
 ...

You should see a long list of results indicating where your genetic algorithm spends most of its time. Overall, you’ll notice that most of the work in the algorithm happens in list functions and random number generation. So how do you best address these? You can speed up list functions by writing algorithmically efficient code and by parallelizing the code. You’ll learn about both of those concepts in the next two sections. You can speed up RNG by rolling simpler, custom RNGs with NIFs. You’ll learn about that in Improving Performance with NIFs.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.145.151.141