Optimize Your Iterators

Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

Optimize Your Iterators

To a Ruby newcomer, Ruby iterators typically look like a convenient syntax for loops. In fact, iterators are such a good abstraction that even seasoned developers often forget that they really are nothing more than methods of Array and Hash classes with a block argument.

However, keeping this in mind is important for performance. We talked in Modify Arrays and Hashes in Place about the importance of in-place operations on hashes and arrays. But that’s not the end of the story.

Because a Ruby iterator is a function of an object (Array, Range, Hash, etc.), it has two characteristics that affect performance:

Ruby GC will not garbage collect the object you’re iterating before the iterator is finished. This means that when you have a large list in memory, that whole list will stay in memory even if you no longer need the parts you’ve already traversed.
Iterators, being functions, can and will create temporary objects behind the scenes. This adds work for the garbage collector and hurts performance.

Compounding these performance hits, iterators (just like loops) are sensitive to the algorithmic complexity of the code. An operation that by itself is just a tad slow becomes a huge time sink when repeated hundreds of thousands of times.

So let’s see when exactly iterators become slow and what can we do about that.

Free Objects from Collections During Iteration

Let’s assume we have a list of objects, say one thousand elements of class Thing. We iterate over the list, do something useful, and discard the list. I’ve seen and written a lot of such code in production applications. For example, you read data from a file, calculate some stats, and return only the stats.

	class Thing; end
	list = Array.new(1000) { Thing.new }

	list.each do \|item\|
	# do something with the item
	end
	list = nil

Obviously we can’t deallocate list before each finishes. So it will stay in memory even if we no longer need access to previously traversed items. Let’s prove that by counting the number of Thing instances before each iteration.

chp2/each_bang.rb
	class Thing; end
	list = Array.new(1000) { Thing.new }
	puts ObjectSpace.each_object(Thing).count # 1000 objects

	list.each do \|item\|
	GC.start
	puts ObjectSpace.each_object(Thing).count # same count as before
	# do something with the item
	end

	list = nil
	GC.start
	puts ObjectSpace.each_object(Thing).count # everything has been deallocated

	$ ruby -I . each_bang.rb
	1000
	1000
	...
	1000
	1000
	0

As expected, only when we clear the list reference does the whole list get garbage collected. We can do better by using a while loop and removing elements from the list as we process them, like this:

chp2/each_bang.rb
	class Thing; end
	list = Array.new(1000) { Thing.new } # allocate 1000 objects again
	puts ObjectSpace.each_object(Thing).count

	while list.count > 0
	GC.start # this will garbage collect item from previous iteration
	puts ObjectSpace.each_object(Thing).count # watch the counter decreasing
	item = list.shift
	end

	GC.start # this will garbage collect item from previous iteration
	puts ObjectSpace.each_object(Thing).count # watch the counter decreasing

	$ ruby -I . each_bang.rb
	1000
	999
	...
	2
	1

See how the object counter decreases as we loop through the list? I’m again running GC before each iteration to show you that all previous elements are garbage and will be collected. In the real world you wouldn’t want to force GC. Just let it do its job and your loop will neither take too much time nor run out of memory.

Don’t worry about negative effects of list modification inside the loop. GC time savings will outweight them if you process lots of objects. That happens both when your list is large and when you load linked data from these objects—for example, Rails associations.

Use the Each! Pattern

If we wrap our loop that removes items from an array during iteration into a Ruby iterator, we’ll get what its creator, Alexander Goldstein, called “Each!”. This is how the simplest each! iterator looks:

chp2/each_bang_pattern.rb
	class Array
	def each!
	while count > 0
	yield(shift)
	end
	end
	end

	Array.new(10000).each! { \|element\| puts element.class }

This implementation is not 100% idiomatic Ruby because it doesn’t return an Enumerator if there’s no block passed. But it illustrates the concept well enough. Also note how it avoids creating Proc objects from anonymous blocks (there’s no &block argument).

Avoid Iterators That Create Additional Objects

It turns out that some Ruby iterators (not all of them as we will see) internally create additional Ruby objects. Compare these two examples:

chp2/iterator_each1.rb
	GC.disable
	before = ObjectSpace.count_objects

	Array.new(10000).each do \|i\|
	[0,1].each do \|j\|
	end
	end

	after = ObjectSpace.count_objects
	puts "# of arrays: %d" % (after[:T_ARRAY] - before[:T_ARRAY])
	puts "# of nodes: %d" % (after[:T_NODE] - before[:T_NODE])

	$ ruby -I . iterator_each1.rb
	# of arrays: 10001
	# of nodes: 0

chp2/iterator_each2.rb
	GC.disable
	before = ObjectSpace.count_objects

	Array.new(10000).each do \|i\|
	[0,1].each_with_index do \|j, index\|
	end
	end

	after = ObjectSpace.count_objects
	puts "# of arrays: %d" % (after[:T_ARRAY] - before[:T_ARRAY])
	puts "# of nodes: %d" % (after[:T_NODE] - before[:T_NODE])

	$ ruby -I . iterator_each2.rb
	# of arrays: 10001
	# of nodes: 20000

As you’d expect, the code creates 10,000 temporary [0,1] arrays. But something fishy is going on with the number of T_NODE objects. Why would each_with_index create 20,000 extra objects?

The answer is in the Ruby source code. Here’s the implementation of each:

	VALUE
	rb_ary_each(VALUE array)
	{
	long i;
	volatile VALUE ary = array;
	RETURN_SIZED_ENUMERATOR(ary, 0, 0, ary_enum_length);
	for (i=0; i<RARRAY_LEN(ary); i++) {
	rb_yield(RARRAY_AREF(ary, i));
	}
	return ary;
	}

Compare it to the implementation of and each_with_index.

	enum_each_with_index(int argc, VALUE *argv, VALUE obj)
	{
	NODE *memo;
	RETURN_SIZED_ENUMERATOR(obj, argc, argv, enum_size);
	memo = NEW_MEMO(0, 0, 0);
	rb_block_call(obj, id_each, argc, argv, each_with_index_i, (VALUE)memo);
	return obj;
	}

	static VALUE
	each_with_index_i(RB_BLOCK_CALL_FUNC_ARGLIST(i, memo))
	{
	long n = RNODE(memo)->u3.cnt++;
	return rb_yield_values(2, rb_enum_values_pack(argc, argv), INT2NUM(n));
	}

Even if your C-fu is not that strong, you’ll still see that each_with_index creates an additional NODE *memo variable. Because our each_with_index loop is nested in another loop, we get to create 10,000 additional nodes. Worse, the internal function each_with_index_i allocates one more node. Thus we end up with the 20,000 extra T_NODE objects that you see in our example output.

How does that affect performance? Imagine your nested loop is executed not 10,000 times, but 1 million times. You’ll get 2 million objects created. And while they can be freed during the iteration, GC still gets way too much work to do. How’s that for an iterator that you would otherwise easily mistake for a syntactic construct?

It would be nice to know which iterators are bad for performance and which are not, wouldn’t it? I thought so, and so I calculated the number of additional T_NODE objects created per iterator. The table summarizes the results for commonly used iterators.

Table 1. Number of additional T_NODE objects created by an iterator

† Enum is Enumerable

Iterator	Enum†	Array	Range	Iterator	Enum†	Array	Range
all?	3	3	3	fill	0	—	—
any?	2	2	2	find	2	2	2
collect	0	1	1	find_all	1	1	1
cycle	0	1	1	grep	2	2	2
delete_if	0	—	0	inject	2	2	2
detect	2	2	2	map	0	1	1
each	0	0	0	none?	2	2	2
each_index	0	—	—	one?	2	2	2
each_key	—	—	0	reduce	2	2	2
each_pair	—	—	0	reject	0	1	0
each_value	—	—	0	reverse	0	—	—
each_with_index	2	2	2	reverse_each	0	1	1
each_with_object	1	1	1	select	0	1	0

Iterators that create 0 additional objects are safe to use in nested loops. But be careful with those that allocate two or even three additional objects: all?, each_with_index, inject, and others.

Looking at the table, we can also spot that iterators of the Array class, and in some cases the Hash class, behave differently. It turns out that Range and Hash use default iterator implementations from the Enumerable module, while Array reimplements most of them. That not only results in better algorithmical performance (that was the reason behind the reimplementation), but also in better memory consumption. This means that most of Array’s iterators are safe to use, with the notable exceptions of each_with_index and inject.

Watch for Iterator-Unsafe Ruby Standard Library Functions

Iterators are where the algorithmic complexity of the functions you use matters, even in Ruby. One millisecond lost in a loop with one thousand iterations translates to a one-second slowdown. Let me show which commonly used Ruby functions are slow and how to replace them with faster analogs.

Date#parse

Date parsing in Ruby has been traditionally slow, but this function is especially harmful for performance. Let’s see how much time it uses in a loop with 100,000 iterations:

chp2/date_parsing1.rb
	require 'date'
	require 'benchmark'

	date = "2014-05-23"
	time = Benchmark.realtime do
	100000.times do
	Date.parse(date)
	end
	end
	puts "%.3f" % time

	$ ruby date_parsing1.rb
	0.833

Each Date#parse call takes a minuscule 0.02 ms. But in a moderately large loop, that translates into almost one second of execution time.

A better solution is to let the date parser know which date format to use, like this:

chp2/date_parsing2.rb
	require 'date'
	require 'benchmark'

	date = "2014-05-23"
	time = Benchmark.realtime do
	100000.times do
	Date.strptime(date, '%Y-%m-%d')
	end
	end
	puts "%.3f" % time

	$ ruby date_parsing2.rb
	0.182

That is already 4.6 times faster. But avoiding date string parsing altogether is even faster:

chp2/date_parsing3.rb
	require 'date'
	require 'benchmark'

	date = "2014-05-23"
	time = Benchmark.realtime do
	100000.times do
	Date.civil(date[0,4].to_i, date[5,2].to_i, date[8,2].to_i)
	end
	end
	puts "%.3f" % time

	$ ruby date_parsing3.rb
	0.108

While slightly uglier, that code is almost eight times faster than the original, and almost two times faster than the Date#strptime version.

Object#class, Object#is_a?, Object#kind_of?

These have considerable performance overhead when used in loops or frequently used functions like constructors or == comparison operators.

chp2/class_check1.rb
	require 'benchmark'

	obj = "sample string"
	time = Benchmark.realtime do
	100000.times do
	obj.class == String
	end
	end
	puts time

	$ ruby class_check1.rb
	0.022767841

chp2/class_check2.rb
	require 'benchmark'

	obj = "sample string"
	time = Benchmark.realtime do
	100000.times do
	obj.is_a?(String)
	end
	end
	puts time

	$ ruby class_check2.rb
	0.019568893

In a moderately large loop, again 100,000 iterations, such checks take 19--22 ms. That doesn’t sound bad, except that, for example, a Rails application can call comparison operators more than 1 million times per request and spend longer than 200 ms doing type checks.

It’s a good idea to move type checking away from iterators or frequently called functions and operators. If you can’t, unfortunately there’s not much you can do about that.

BigDecimal::==(String)

Code that gets data from databases uses big decimals a lot. That is especially true for Rails applications. Such code often creates a BigDecimal from a string that it reads from a database, and then compares it directly with strings.

The catch is that the natural way to do this comparison is unbelievably slow in Ruby version 1.9.3 and lower:

chp2/bigdecimal1.rb
	require 'bigdecimal'
	require 'benchmark'

	x = BigDecimal("10.2")
	time = Benchmark.realtime do
	100000.times do
	x == "10.2"
	end
	end
	puts time

	$ rbenv shell 1.9.3-p551
	$ ruby bigdecimal1.rb
	0.773866128

	$ rbenv shell 2.0.0-p598
	$ ruby bigdecimal1.rb
	0.025224029
	$ rbenv shell 2.1.5
	$ ruby bigdecimal1.rb
	0.027570681
	$ rbenv shell 2.2.0
	$ ruby bigdecimal1.rb
	0.02474011096637696

Older Rubys have unacceptably slow implementations of the BigDecimal::== function. This performance problem goes away with a Ruby 2.0 upgrade. But if you can’t upgrade, use this smart trick. Convert a BigDecimal to a String before comparison:

chp2/bigdecimal2.rb
	require 'bigdecimal'
	require 'benchmark'

	x = BigDecimal("10.2")
	time = Benchmark.realtime do
	100000.times do
	x.to_s == "10.2"
	end
	end
	puts time

	$ rbenv shell 1.9.3-p545
	$ ruby bigdecimal2.rb
	0.195041792

This hack is three to four times faster—not forty times faster, as in the Ruby 2.x implementation, but still an improvement.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for Optimize Your Iterators

Create new playlist

Sign In

Sign Up

Optimize Your Iterators

Free Objects from Collections During Iteration

Use the Each! Pattern

Avoid Iterators That Create Additional Objects

Watch for Iterator-Unsafe Ruby Standard Library Functions

Date#parse

Object#class, Object#is_a?, Object#kind_of?

BigDecimal::==(String)

Table of Contents for
Optimize Your Iterators