Let’s take everything you learned in the previous chapter on measurements and apply that knowledge here to write a benchmark function. To reiterate, here’s what such a function should do:
Run the code multiple times to gather measurements. It’s best if we can do 30 runs or more.
Discard the results of the first run to reduce warm-up effects and give caches a chance to fill.
Force GC before each run.
Fork the process before measurement to make sure all runs are isolated and don’t interfere with each other.
Store all measurements somewhere (in the file, on S3, etc.) to be processed later.
Calculate and report average performance and its standard deviation.
This list makes for a pretty detailed spec, so let’s go ahead and write the benchmark function.
chp8/performance_benchmark.rb
require 'benchmark'

def performance_benchmark(name, &block)
  # 31 runs, we'll discard the first result
  (0..30).each do |i|
    # force GC in the parent process to make sure we reclaim
    # any memory taken by forking in the previous run
    GC.start

    # fork to isolate our run
    pid = fork do
      # again run GC to reduce the effects of forking
      GC.start
      # disable GC if you want to see the raw performance of your code
      GC.disable if ENV["RUBY_DISABLE_GC"]

      # because we are in a forked process, we need to store
      # results in some shared space;
      # a local file is the simplest way to do that
      benchmark_results = File.open("benchmark_results_#{name}", "a")

      elapsed_time = Benchmark.realtime do
        yield
      end

      # do not count the first run
      if i > 0
        # we use the system clock for measurements,
        # so a microsecond is the last significant figure
        benchmark_results.puts elapsed_time.round(6)
      end
      benchmark_results.close

      GC.enable if ENV["RUBY_DISABLE_GC"]
    end
    Process.waitpid pid
  end

  measurements = File.readlines("benchmark_results_#{name}").map do |value|
    value.to_f
  end
  File.delete("benchmark_results_#{name}")

  average = measurements.inject(0) do |sum, x|
    sum + x
  end.to_f / measurements.size
  stddev = Math.sqrt(
    measurements.inject(0) { |sum, x| sum + (x - average)**2 }.to_f /
    (measurements.size - 1)
  )

  # return both the average and the standard deviation,
  # this time with millisecond precision;
  # for all practical purposes that should be enough
  [name, average.round(3), stddev.round(3)]
end
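The last few lines of the listing compute the sample mean and the sample standard deviation (note the `n - 1` divisor, Bessel's correction). A quick standalone sketch with made-up measurements shows the same formulas in isolation:

```ruby
# Sanity check of the mean and sample standard deviation math
# used in performance_benchmark, on a made-up set of measurements.
measurements = [1.0, 2.0, 3.0]

average = measurements.inject(0) { |sum, x| sum + x }.to_f / measurements.size
stddev  = Math.sqrt(
  measurements.inject(0) { |sum, x| sum + (x - average)**2 }.to_f /
  (measurements.size - 1)
)

puts "average: #{average}, stddev: #{stddev}"
# => average: 2.0, stddev: 1.0
```

For these three values the mean is exactly 2.0 and the squared deviations sum to 2.0, so dividing by n − 1 = 2 and taking the square root gives a standard deviation of exactly 1.0.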
We made three simplifications in the benchmarking function. First, we used Ruby's round function, which doesn't follow the round-half-to-even tie-breaking rule: when the first dropped digit is exactly 5 with nothing but zeros after it, round always rounds away from zero instead of rounding to the nearest even digit. Second, we decreased precision to milliseconds despite the system clock being able to measure times with microsecond precision. Finally, we hard-coded the number of measurements to 30.
You can easily undo the first and the last simplifications, but I recommend you keep the second. Ruby isn’t a systems programming language, so we usually don’t care about microseconds of execution time. In fact, in most cases we don’t care about milliseconds or even tens of milliseconds—that’s why we rounded off our measurements in our example.
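For instance, to undo the first simplification you can ask round for the round-half-to-even behavior explicitly: since Ruby 2.4, Float#round accepts a half: option. A quick illustration (assuming Ruby 2.4 or newer):

```ruby
# Ruby's default rounding rounds exact ties away from zero:
puts 2.5.round                    # => 3
puts 0.125.round(2)               # => 0.13  (0.125 is exactly representable)

# The half: :even option gives round-half-to-even (banker's rounding):
puts 2.5.round(half: :even)       # => 2
puts 0.125.round(2, half: :even)  # => 0.12
```

With banker's rounding, ties go to the nearest even digit, so repeated rounding doesn't bias results upward.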
Now let’s see how our benchmarking function works. Run this simple program:
chp8/test_performance_benchmark.rb
require 'performance_benchmark'

result = performance_benchmark("sleep 1 second") do
  sleep 1
end
puts "%-28s %0.3f ± %0.3f" % result

$ cd code/chp8
$ ruby -I . test_performance_benchmark.rb
sleep 1 second               1.000 ± 0.000
As expected, sleep(1) takes one second on average, with no deviation measurable at millisecond precision, so we can trust our measurements. Now it's time to write the function to assert the performance.