Save Memory

The first step to making your application faster is to save memory. Every time you create or copy something in memory, you add work for the garbage collector (GC). Let's look at best practices for writing code that doesn't use too much memory.

Modify Strings in Place

Ruby programs use a lot of strings, and copy them a lot. In most cases all that copying is unnecessary. You can do most string manipulations in place, meaning that instead of making a changed copy, you change the original.

Ruby has a bunch of “bang!” functions for in-place modification. Those are gsub!, capitalize!, downcase!, upcase!, delete!, reverse!, slice!, and others. It’s always a good idea to use them as much as you can when you no longer need the original string.

chp2/string_in_place1.rb

require 'wrapper'
str = "X" * 1024 * 1024 * 10 # 10 MB string

measure do
  str = str.downcase
end

measure do
  str.downcase!
end

$ ruby -I . string_in_place1.rb --no-gc
{"2.2.0":{"gc":"disabled","time":0.02,"gc_count":0,"memory":"9 MB"}}
{"2.2.0":{"gc":"disabled","time":0.01,"gc_count":0,"memory":"0 MB"}}

The String#downcase call in the first measure block allocates another 10 MB in memory to hold the lowercased copy of the string. The bang version in the second block needs no extra memory. And that's exactly what we see in the measurements.

Another useful in-place modification method is String#<<. It concatenates strings by appending a new string to the original. When asked to append one string to another, most developers write this:

 
x = "foo"
x += "bar"

This code is equivalent to

 
x = "foo"
y = x + "bar"
x = y

Here Ruby allocates extra memory to store the result of the concatenation. The same code using the append operator (<<) needs no additional memory as long as the resulting string is shorter than 40 bytes (on a 64-bit architecture; more on that later). If the string is longer than that, Ruby will only allocate enough memory to store the appended part. So next time, write this instead:

 
x = "foo"
x << "bar"

Behind the scenes, String#<< may not be able to grow the original string enough to do a true in-place modification. In that case it has to move the string data to a new location in memory. However, that happens in the realloc() C library function, behind Ruby's back, and does not trigger GC.
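A quick way to see the difference between the two forms is to compare object identities: += rebinds the variable to a brand-new String, while << mutates the receiver. A minimal sketch:

```ruby
# += rebinds x to a freshly allocated String.
x = "foo"
id_before = x.object_id
x += "bar"                       # allocates a new object
puts x.object_id == id_before    # => false

# << appends into the very same object.
y = "foo"
id_before = y.object_id
y << "bar"                       # mutates the receiver in place
puts y.object_id == id_before    # => true
```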

Another thing worth pointing out is that “bang!” functions are not guaranteed to do an in-place modification. Most of them do, but that’s implementation dependent. So don’t be surprised when one of them doesn’t optimize anything.
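A related gotcha, and a fact you can rely on: most bang methods return nil when there was nothing to change, so never chain on or assign from their return value. For example:

```ruby
# Bang methods signal "no change" by returning nil, not the string.
str = "ABC"
puts str.upcase!.inspect  # => nil (already uppercase, nothing to do)
puts str                  # the receiver itself is untouched and valid
```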

Modify Arrays and Hashes in Place

Like strings, arrays and hashes can be modified in place. If you look at the Ruby API documentation, you’ll again see “bang!” functions like map!, select!, reject!, and others. The idea is the same: do not create a modified copy of the same array unless really necessary.

String, array, and hash in-place modification functions are extremely powerful when used together. Compare these two examples:

chp2/combined_in_place1.rb

require 'wrapper'

data = Array.new(100) { "x" * 1024 * 1024 }

measure do
  data.map { |str| str.upcase }
end

chp2/combined_in_place2.rb

require 'wrapper'

data = Array.new(100) { "x" * 1024 * 1024 }

measure do
  data.map! { |str| str.upcase! }
end

                 map and upcase    map! and upcase!
Total time       0.22 s            0.14 s
Extra memory     100 MB            0 MB
# of GC calls    3                 0

See how this code got 35% faster by simply adding two “!” characters? Easy optimization, isn’t it? The second example gives no work to GC at all despite crunching through 100 MB of data.

Read Files Line by Line

Reading a whole file into memory takes, naturally, at least as much memory as the file itself. We expect that, of course, and sometimes willingly accept it for convenience. How big is the overhead on top of that? It's insignificant if you just read the file. For example, reading the 26 MB data.csv file[2] takes almost exactly 26 MB of memory.

chp2/file_reading1.rb

require 'wrapper'

measure do
  File.read("data.csv")
end

$ ruby -I . file_reading1.rb --no-gc
{"2.2.0":{"gc":"disabled","time":0.02,"gc_count":0,"memory":"25 MB"}}

Here we simply create one File object (it takes just 40 bytes on a 64-bit architecture) and store the 26 MB string there. No extra memory is used.

Things rapidly become less perfect when we try to parse the file. For example, it takes 158 MB to split the same CSV file into lines and columns.

chp2/file_reading2.rb

require 'wrapper'

measure do
  File.readlines("data.csv").map! { |line| line.split(",") }
end

$ ruby -I . file_reading2.rb --no-gc
{"2.2.0":{"gc":"disabled","time":0.45,"gc_count":0,"memory":"186 MB"}}

What does Ruby use this memory for? The file has about 163,000 rows of data in 11 columns. So, to store the parsed contents we need to allocate 163,000 objects for rows and 1,793,000 objects for columns—1,956,000 objects in total. On a 64-bit architecture that requires approximately 75 MB. Together with the 26 MB necessary to read the file, our program needs at least 101 MB of memory. On top of that, not all strings are small enough to fit into 40-byte Ruby objects, so Ruby allocates more memory to store them. That's what the remaining 85 MB are used for. As a result, our simple program takes seven times the size of the data after parsing.
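You can see this split between embedded and heap-allocated strings yourself with ObjectSpace.memsize_of from the standard objspace extension. A sketch (the exact byte counts vary between Ruby versions and platforms, so only the relative difference matters):

```ruby
require 'objspace'

# memsize_of reports a single object's heap footprint. A short string
# fits entirely inside the fixed-size object slot; a long one needs an
# additional malloc'ed buffer for its character data.
short = "abc"
long  = "x" * 1024

puts ObjectSpace.memsize_of(short)  # just the object slot
puts ObjectSpace.memsize_of(long)   # slot plus ~1 KB external buffer
```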

The Ruby CSV parser takes even more. It needs 346 MB of memory, 13 times the data size.

chp2/file_reading3.rb

require 'wrapper'
require 'csv'

measure do
  CSV.read("data.csv")
end

$ ruby -I . file_reading3.rb --no-gc
{"2.2.0":{"gc":"disabled","time":2.66,"gc_count":0,"memory":"368 MB"}}

This memory consumption math is really disturbing. In my experience with real-world applications, the parsed data takes anywhere from two to ten times more memory than its raw form, depending on the nature of the data. That's a lot of work for Ruby GC.

The solution? Read and parse data files line by line as much as possible. In the previous chapter we did that for the CSV file and got a two times speedup.

Whenever you can, read files line by line, as in this example:

chp2/file_reading4.rb

require 'wrapper'

measure do
  file = File.open("data.csv", "r")
  while line = file.gets
    line.split(",")
  end
end

And do the same with CSV files:

chp2/file_reading5.rb

require 'csv'
require 'wrapper'

measure do
  file = CSV.open("data.csv")
  while line = file.readline
  end
end

Now, let’s measure these examples with our wrapper code. To our surprise, memory allocation is about the same as before: 171 MB and 367 MB.

 
$ ruby -I . file_reading4.rb --no-gc
{"2.2.0":{"gc":"disabled","time":0.45,"gc_count":0,"memory":"171 MB"}}

$ ruby -I . file_reading5.rb --no-gc
{"2.2.0":{"gc":"disabled","time":2.64,"gc_count":0,"memory":"367 MB"}}

But if you think about this a little more, you’ll understand. It doesn’t matter how we parse the file—in one go, or line by line. We’ll end up allocating the same amount of memory anyway. And look at execution time. It’s the same as before. What’s the deal?

We’ve been measuring the total amount of memory allocated. That makes sense when we want to know exactly how much memory in total a certain snippet of code needs. But it doesn’t tell us anything about peak memory consumption. During program execution, GC will deallocate unused memory. This will reduce both peak memory consumption and GC time because there’s much less data held in memory at any given moment.

When we read a file line by line, we’re telling Ruby that we don’t need the previous lines anymore. GC will then collect them as your program executes. So, to see the optimization, you need to turn on GC. Let’s do that and compare before and after numbers.

Before optimization:

 
$ ruby -I . file_reading2.rb
{"2.2.0":{"gc":"enabled","time":0.68,"gc_count":11,"memory":"144 MB"}}

$ ruby -I . file_reading3.rb
{"2.2.0":{"gc":"enabled","time":3.25,"gc_count":17,"memory":"175 MB"}}

After optimization:

 
$ ruby -I . file_reading4.rb
{"2.2.0":{"gc":"enabled","time":0.44,"gc_count":106,"memory":"0 MB"}}

$ ruby -I . file_reading5.rb
{"2.2.0":{"gc":"enabled","time":2.62,"gc_count":246,"memory":"1 MB"}}

Now you see why reading files line by line is such a good idea. First, you’ll end up using almost no additional memory. In fact, you’ll end up storing just the line you are processing and any previous lines that were allocated after the last GC call. Second, the program will run faster. Speedup depends on the data size; in our examples it is 35% for plain file reading and 20% for CSV parsing.
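As a side note, the standard library offers an even more compact way to get the same line-by-line behavior for CSV files: CSV.foreach parses and yields one row at a time, so between GC runs only the recently allocated rows stay in memory. A minimal sketch (it builds a tiny sample file so it runs standalone; data.csv from the examples above would work the same way):

```ruby
require 'csv'
require 'tempfile'

# Build a tiny sample CSV so the sketch is self-contained.
sample = Tempfile.new(['sample', '.csv'])
sample.write("id,name\n1,foo\n2,bar\n")
sample.close

# CSV.foreach yields one parsed row at a time instead of
# materializing the whole file as an array of arrays.
rows = 0
CSV.foreach(sample.path) { |row| rows += 1 }
puts rows # => 3
```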

Joe asks:
Optimized CSV Parsing Example Runs GC Way More Often but Still Finishes Faster. What Gives?

Ruby 2.2 has incremental garbage collection: it runs more often, but each run collects only a small part of the object space. That's why you see several hundred GC runs. For our example this works out well, as GC runs about once per 1,600 rows processed (163,000 rows divided by 106 collections in the plain file parsing example). This amounts to only 260k of additional memory needed for the parsing at any given moment during program execution. Our example reports 0 MB of additional memory because it rounds the number down.

The math will be different for older Rubies, but expect the end result to be similar. You will see the optimization with any Ruby version. Go check it yourself!
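You can watch this incremental behavior from inside your own program: on Ruby 2.1+ GC.stat splits collections into cheap minor runs and full major runs. A quick sketch:

```ruby
# GC.stat exposes the counters behind the numbers above.
before = GC.stat

200_000.times { Object.new }   # churn through short-lived objects

after = GC.stat
# Minor (incremental) collections fire far more often than majors.
puts "minor GCs: #{after[:minor_gc_count] - before[:minor_gc_count]}"
puts "major GCs: #{after[:major_gc_count] - before[:major_gc_count]}"
```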

Watch for Memory Leaks Caused by Callbacks

Rails developers know and use callbacks a lot. But when done wrong, callbacks can hurt performance. For example, let’s write a logger object that will lazily record object creation. For that, instead of writing the output right away, it will log events and replay them later all at once. It is tempting to implement the event logger using Ruby closures (lambdas or Procs) like this:

chp2/callbacks1.rb

module Logger
  extend self
  attr_accessor :output, :log_actions

  def log(&event)
    self.log_actions ||= []
    self.log_actions << event
  end

  def play
    output = []
    log_actions.each { |e| e.call(output) }
    puts output.join(" ")
  end
end

class Thing
  def initialize(id)
    Logger.log { |output| output << "created thing #{id}" }
  end
end

def do_something
  1000.times { |i| Thing.new(i) }
end

do_something
GC.start
Logger.play
puts ObjectSpace.each_object(Thing).count

We log an event by storing a block of code that gets executed later. The code actually looks quite cool. At least I feel cool every time I use bits of functional programming in Ruby.

Unfortunately, when I write something cool or smart, it tends to turn out slow and inefficient. The same thing happens here. Such logging keeps references to all created objects even after we no longer need them. So add the following lines to the end of the program and run it:

 
GC.start # collect all unused objects
puts ObjectSpace.each_object(Thing).count

$ ruby -I . callbacks1.rb
created thing 0
created thing 1
...
created thing 999
1000

After we’re done with the do_something, we don’t really need all one thousand of these Thing objects. But even an explicit GC.start call does not collect them. What’s going on?

Callbacks stored in the Logger module are the reason the objects are still there. When you pass an anonymous block from the Thing constructor to the Logger#log function, Ruby converts it into a Proc object and stores references to all objects in the block's execution context. That includes the Thing instance itself. This way we end up keeping references from the Logger to all one thousand instances of Thing. It's a classic example of a memory leak.
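The capture is easy to demonstrate in isolation: a Proc drags its entire creation context along, even locals the block never touches, and Proc#binding makes the captured scope visible. A sketch (the names here are illustrative, not from the Logger example):

```ruby
# The returned Proc never mentions big_local, yet it still closes
# over the whole enclosing scope, keeping the string alive.
def capture_context
  big_local = "x" * 1_000_000   # ~1 MB string the block never uses
  proc { :noop }
end

callback = capture_context
# The string is still reachable through the callback's binding:
leaked = callback.binding.local_variable_get(:big_local)
puts leaked.bytesize # => 1000000
```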

A dumbed-down version of the Logger module will look less cool, but will prevent the memory leak. You could of course write an even simpler version that doesn't use callbacks at all, but I'll keep them for this example.

chp2/callbacks2.rb

module Logger
  extend self
  attr_accessor :output

  def log(&event)
    self.output ||= []
    event.call(output)
  end

  def play
    puts output.join(" ")
  end
end

class Thing
  def initialize(id)
    Logger.log { |output| output << "created thing #{id}" }
  end
end

def do_something
  1000.times { |i| Thing.new(i) }
end

do_something
GC.start
Logger.play
puts ObjectSpace.each_object(Thing).count

$ ruby -I . callbacks2.rb
created thing 0
created thing 1
...
created thing 999
0

In this case no memory is leaked and all Thing objects are garbage collected.

So be careful every time you create a block or Proc callback. Remember: if you store it somewhere, you also keep references to its execution context. That not only hurts performance but can also leak memory.

Are All Anonymous Blocks Dangerous to Performance?

Anonymous blocks do not store execution context unless they are converted to Proc objects. When calling a function that takes an anonymous block, Ruby stores a reference to the caller’s stack frame. It’s OK to do that since the callee is guaranteed to exit before the caller’s stack frame is popped. When calling a function that takes a named block, Ruby assumes that this block is long-lived and clones the execution context right there.

An obvious case of anonymous block to Proc conversion is when your receiving function defines the &block argument.

 
def take_block(&block)
  block.call(args)
end

take_block { |args| do_something(args) }

It’s a good idea to change such code to use anonymous blocks. We don’t really need the Proc conversion since the block is simply executed, and never stored as in the logger example in the previous section.

 
def take_block
  yield(args)
end

take_block { |args| do_something(args) }

However, it’s not always clear when conversion happens. It may be hidden well down into the call stack, or even happen in C code inside the Ruby interpreter. Let’s look at this example:

chp2/signal1.rb

class LargeObject
  def initialize
    @data = "x" * 1024 * 1024 * 20
  end
end

def do_something
  obj = LargeObject.new
  trap("TERM") { puts obj.inspect }
end

do_something

# force major GC to make sure we free all objects that can be freed
GC.start(full_mark: true, immediate_sweep: true)

puts "LargeObject instances left in memory: %d" %
  ObjectSpace.each_object(LargeObject).count

$ ruby -I . signal1.rb
LargeObject instances left in memory: 1

This example behaves suspiciously like the smart logger from the previous section: it leaves one large object behind. There's only one place in the code that could cause that. Inside do_something we pass an anonymous block to the trap function. A quick look at the source code[3] reveals that the trap implementation calls cmd = rb_block_proc(); that indeed converts the block to a Proc behind the scenes.

If you comment out the trap call, the program will report 0 large objects left after execution.

So, if you suspect that a block leaks memory, you'll have to review the code down the call stack—at least down to the Ruby standard library, including functions implemented in C. It's not as hard as it sounds. You can always look up a function's implementation in the Ruby API docs. Ruby's source code is well written and clean; you'll be able to make sense of it even if you don't know any C, as with the trap example earlier.
