© Carleton DiLeo, Peter Cooper 2021
C. DiLeo, P. Cooper, Beginning Ruby 3, https://doi.org/10.1007/978-1-4842-6324-2_11

11. Advanced Ruby Features

Carleton DiLeo (Boulder, CO, USA) and Peter Cooper (Louth, UK)

In this chapter, we’re going to look at some advanced Ruby techniques that have not been covered in prior chapters. This chapter is the last instructional chapter in the second part of the book, and although we’ll be covering useful libraries, frameworks, and Ruby-related technologies in Part 3, this chapter rounds off the mandatory knowledge that any proficient Ruby programmer should have. This means that although this chapter will jump between several different topics, each is essential to becoming a professional Ruby developer.

The topics covered in this chapter include how to create Ruby code dynamically on the fly, how to issue commands to the operating system and run other programs, how to work with threads and fibers, and how to handle Unicode and character encodings. Essentially, this chapter is designed to cover a range of discrete, important topics that you might find you need to use, but that fall outside the immediate scope of other chapters.

Dynamic Code Execution

As a dynamic, interpreted language, Ruby is able to execute code created dynamically. The way to do this is with the eval method, for example:
eval "puts 2 + 2"
4
Note that while 4 is displayed, 4 is not returned as the result of the whole eval expression. puts always returns nil. To return 4 from eval, you can do this:
puts eval("2 + 2")
4
Here’s a more complex example that uses strings and interpolation:
my_number = 15
my_code = %{#{my_number} * 2}
puts eval(my_code)
30

The eval method simply executes (or evaluates) the code passed to it and returns the result. The first example made eval execute puts 2 + 2, whereas the second used string interpolation to build an expression of 15 * 2, which was then evaluated and printed to the screen using puts.
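
As a small sketch of what "returns the result" means in practice: eval hands back the value of the last expression in the string, and a malformed string raises SyntaxError. A bare rescue does not catch that error, because SyntaxError is not a StandardError. The helper name try_eval is hypothetical and is used only for illustration:

```ruby
# Hypothetical helper: evaluates a string and reports syntax errors
# instead of letting them terminate the program.
def try_eval(code)
  eval(code)
rescue SyntaxError => e
  # SyntaxError descends from ScriptError, not StandardError,
  # so it has to be named explicitly in the rescue clause.
  "syntax error: #{e.class}"
end

puts try_eval("a = 3; a * 7")   # 21, the value of the last expression
puts try_eval("2 +")            # syntax error: SyntaxError
```

Because eval runs whatever it is given, a helper like this is best reserved for strings your own program has built, not for text received from users.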

Bindings

In Ruby, a binding is a reference to a context, scope, or state of execution. A binding includes things such as the current value of variables and other details of the execution environment.

It’s possible to pass a binding to eval and to have eval execute the supplied code under that binding rather than the current one. In this way, you can keep things that happen with eval separate from the main execution context of your code.

Here’s an example:
def binding_elsewhere
  x = 20
  return binding
end
remote_binding = binding_elsewhere
x = 10
eval("puts x")
eval("puts x", remote_binding)
10
20

This code demonstrates that eval accepts an optional second parameter, a binding, which in this case is returned from the binding_elsewhere method. The variable remote_binding contains a reference to the execution context within the binding_elsewhere method rather than in the main code. Therefore, when you print x using that binding, 20 is shown, as x is defined as equal to 20 in binding_elsewhere!

Note

You can obtain the binding of the current scope at any point with the Kernel module’s binding method.

Let’s build on the previous example:
eval("x = 10")
eval("x = 50", remote_binding)
eval("puts x")
eval("puts x", remote_binding)
10
50

In this example, two bindings are in play: the default binding and the remote_binding (from the binding_elsewhere method).

Therefore, even though you set x first to 10, and then to 50, you’re not dealing with the same x in each case. One x is a local variable in the current context, and the other x is a variable in the context of binding_elsewhere.
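
A binding can also be inspected and changed directly, without going through eval. The following is a minimal sketch using the Binding methods local_variables, local_variable_get, local_variable_set, and local_variable_defined?; the method name counter_binding and the variable count are made up for this example:

```ruby
def counter_binding
  count = 1
  binding   # Kernel#binding captures this method's local scope
end

b = counter_binding
puts b.local_variables.inspect             # [:count]
puts b.local_variable_get(:count)          # 1

b.local_variable_set(:count, 5)            # changes count inside the binding
puts eval("count * 2", b)                  # 10
puts b.local_variable_defined?(:missing)   # false
```

Each call to counter_binding returns a fresh binding, so changes made through one binding do not leak into another.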

Other Forms of eval

Although eval executes code within the current context (or the context supplied with a binding), class_eval, module_eval, and instance_eval can evaluate code within the context of classes, modules, and object instances, respectively.

class_eval is ideal for adding methods to a class dynamically:
class Person
end
def add_accessor_to_person(accessor_name)
  Person.class_eval %{
    attr_accessor :#{accessor_name}
  }
end
person = Person.new
add_accessor_to_person :name
add_accessor_to_person :gender
person.name = "Carleton DiLeo"
person.gender = "male"
puts "#{person.name} is #{person.gender}"
Carleton DiLeo is male

In this example, you use the add_accessor_to_person method to add accessors dynamically to the Person class. Prior to using the add_accessor_to_person method, neither the name nor gender accessors exist within Person.

Note that the key part of the code, the class_eval method, operates by using string interpolation to create the desired code for Person:
Person.class_eval %{
  attr_accessor :#{accessor_name}
}

String interpolation makes the eval methods powerful tools for generating different features on the fly. This ability is a power unseen in the majority of programming languages, and is one that’s used to great effect in systems such as Ruby on Rails (covered in Chapter 13).

It’s possible to take the previous example a lot further and add an add_accessor method to every class by putting your class_eval cleverness in a new method, defined within the Class class (of which every class is an instance):
class Class
  def add_accessor(accessor_name)
    self.class_eval %{
      attr_accessor :#{accessor_name}
    }
  end
end
class Person
end
person = Person.new
Person.add_accessor :name
Person.add_accessor :gender
person.name = "Carleton DiLeo"
person.gender = "male"
puts "#{person.name} is #{person.gender}"

In this example, you add the add_accessor method to the Class class, thereby adding it to every other class defined within your program. This makes it possible to add accessors to any class dynamically, by calling add_accessor. (If the logic of this approach isn’t clear, make sure to try this code yourself, step through each process, and establish what is occurring at each step of execution.)

The technique used in the previous example also lets you define classes like this:
class SomethingElse
  add_accessor :whatever
end

Because add_accessor is being used within a class, the method call will work its way up to the add_accessor method defined in class Class.

Moving back to simpler techniques, using instance_eval is somewhat like using regular eval, but within the context of an object (rather than a method). In this example, you use instance_eval to execute code within the scope of an object:
class MyClass
  def initialize
    @my_variable = 'Hello, world!'
  end
end
obj = MyClass.new
obj.instance_eval { puts @my_variable }
Hello, world!

Creating Your Own Version of attr_accessor

So far, you’ve used the attr_accessor method within your classes to generate accessor functions for instance variables quickly. For example, in longhand you might have this code:
class Person
  def name
    @name
  end
  def name=(name)
    @name = name
  end
end
This allows you to do things such as puts person.name and person.name = 'Fred'. Alternatively, however, you can use attr_accessor:
class Person
  attr_accessor :name
end

This version of the class is more concise and has exactly the same functionality as the longhand version. Now it’s time to ask the question, how does attr_accessor work?

It turns out that attr_accessor isn’t as magical as it looks, and it’s extremely easy to implement your own version using eval. Consider this code:
class Class
  def add_accessor(accessor_name)
    self.class_eval %{
      def #{accessor_name}
        @#{accessor_name}
      end
      def #{accessor_name}=(value)
        @#{accessor_name} = value
      end
    }
  end
end

At first, this code looks complex, but it’s very similar to the add_accessor code you created in the previous section. You use class_eval to define getter and setter methods dynamically for the attribute within the current class.

If accessor_name is equal to name, then the code that class_eval is executing is equivalent to this code:
def name
  @name
end
def name=(value)
  @name = value
end

Thus, you have duplicated the functionality of attr_accessor.

You can use this technique to create a multitude of different “code generators” and methods that can act as a “macro” language to perform things in Ruby that are otherwise lengthy to type out.
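
The same kind of "macro" can be written without building code as a string. The sketch below swaps in a different technique from the one used above: define_method together with instance_variable_get and instance_variable_set. The module name AccessorMacros and the class Pet are assumptions made for this example, and the module is mixed in with extend so that class Class is left untouched:

```ruby
# Hypothetical module; the name AccessorMacros is an assumption for this sketch.
module AccessorMacros
  def add_accessor(accessor_name)
    ivar = "@#{accessor_name}"
    # Getter: reads the instance variable by name.
    define_method(accessor_name) { instance_variable_get(ivar) }
    # Setter: writes the instance variable by name.
    define_method("#{accessor_name}=") { |value| instance_variable_set(ivar, value) }
  end
end

class Pet
  extend AccessorMacros   # makes add_accessor available as a class-level method
  add_accessor :name
end

pet = Pet.new
pet.name = "Rex"
puts pet.name             # Rex
```

Avoiding string evaluation means the accessor name never has to be valid Ruby source text, which removes one way for generated code to go wrong.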

Running Other Programs from Ruby

Often, it’s useful to be able to run other programs on the system from your own programs. In this way, you can reduce the number of features your program needs to implement, as you can pass off work to other programs that are already written. It can also be useful to hook up several of your own programs so that functionality is spread among them. Rather than using the RPC systems covered in the previous chapter, you can simply run other programs from your own with one of a few different methods made available by Ruby.

Getting Results from Other Programs

There are three simple ways to run another program from within Ruby: the system method (defined in the Kernel module), backtick syntax (``), and delimited input literals (%x{}). Using system is ideal when you want to run another program and aren’t concerned with its output, whereas you should use backticks when you want the output of the remote program returned.

These lines demonstrate two ways of running the system’s directory list program:

On OS X or Linux:
x = system("ls")
x = `ls`
On Windows:
x = system("dir")
x = `dir`

For the first line, the list program output displays in the console and x equals true. For the second line, x contains the output of the list command. Which method you use depends on what you’re trying to achieve. If you don’t want the output of the other program to show on the same screen as that of your Ruby script, then use backticks (or a literal, %x{}).

Note

%x{} is functionally equivalent to using backticks, for example, %x{ls} or %x{dir}.
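
To make the difference in return values concrete, here is a minimal sketch that runs the Ruby interpreter itself as the "other program", so it does not depend on ls or dir being available. The constant name RUBY_BIN is an assumption for this example; RbConfig.ruby and $? are part of Ruby:

```ruby
require "rbconfig"

RUBY_BIN = RbConfig.ruby   # path to the Ruby interpreter running this script

# system returns true for a zero exit status and false for a nonzero one.
ok     = system(RUBY_BIN, "-e", "exit 0")
failed = system(RUBY_BIN, "-e", "exit 3")
puts ok                # true
puts failed            # false
puts $?.exitstatus     # 3 (status of the most recent child process)

# Backticks return whatever the child wrote to standard output.
output = `"#{RUBY_BIN}" -e "print 6 * 7"`
puts output            # 42
```

system returns nil rather than false when the command could not be run at all, so checking for true is the safest test of success.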

Transferring Execution to Another Program

Sometimes it’s desirable to jump immediately to another program and cease execution of the current program. This is useful if you have a multistep process and have written an application for each. To end the current program and invoke another, simply use the exec method in place of system, for example:
exec "ruby another_script.rb"
puts "This will never be displayed"

In this example, execution is transferred to a different program, and the current program ceases immediately—the second line is never executed.

Running Two Programs at the Same Time

Forking is where an instance of a program (a process) duplicates itself, resulting in two processes of that program running concurrently. You can run other programs from this second process by using exec, and the first (parent) process will continue running the original program.

fork is a method provided by the Kernel module that creates a fork of the current process. It returns the child process’s process ID in the parent, but nil in the child process—you can use this to determine which process a script is in. The following example forks the current process into two processes and only executes the exec command within the child process (the process generated by the fork):
if fork.nil?
  exec "ruby some_other_file.rb"
end
puts "This Ruby script now runs alongside some_other_file.rb"
Caution

Don’t run the preceding code from irb. If irb forks, you’ll end up with two copies of irb running simultaneously, and the result will be unpredictable.

If the other program (being run by exec) is expected to finish at some point and you want to wait for it to finish executing before doing something in the parent program, you can use Process.wait to wait for a child process to finish before continuing. Passing a process ID, as shown here, waits for that specific child; Process.waitall waits for every child process. Here’s an example:
child = fork do
  sleep 3
  puts "Child says 'hi'!"
end
puts "Waiting for the child process..."
Process.wait child
puts "All done!"
Waiting for the child process...
<3 second delay>
Child says 'hi'!
All done!
Note

Forking is not possible with the Windows version of Ruby, as POSIX-style forking is not natively supported on that platform. You will use the spawn() method instead. More information at https://ruby-doc.org/core/Kernel.html#method-i-spawn.
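
As a rough sketch of the spawn approach, the following starts a child process, carries on in the parent, and then waits for the child. The child here is just the Ruby interpreter running a one-line script, which is an assumption chosen so the example does not rely on any particular external program:

```ruby
require "rbconfig"

# Start a child process without waiting for it; spawn returns its process ID.
pid = Process.spawn(RbConfig.ruby, "-e", "sleep 0.2; exit 7")
puts "Parent continues while child #{pid} runs..."

# Block until that specific child exits, then inspect its status.
Process.wait(pid)
puts "Child exit status: #{$?.exitstatus}"   # Child exit status: 7
```

Unlike fork, spawn always launches a separate program, so it cannot be used to continue running a block of the current script in the child.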

Interacting with Another Program

The previous methods are fine for simple situations where you just want to get basic results from a remote program and don’t need to interact directly with it in any way while it’s running. However, sometimes you might want to pass data back and forth between two separate programs.

Ruby’s IO class has a popen method that allows you to run another program and have an I/O stream between it and the current program. The I/O stream between programs works like the other types of I/O streams we looked at in Chapter 9, but instead of reading and writing to a file, you’re reading and writing to another program. Obviously, this technique only works successfully with programs that accept direct input and produce direct output at a command prompt level (so not GUI applications).

Here’s a simple read-only example:
ls = IO.popen("ls", "r")
while line = ls.gets
  puts line
end
ls.close

In this example, you open an I/O stream with ls (the UNIX command to list the contents of the current directory—try it with dir if you’re using Microsoft Windows). You read the lines one by one, as with other forms of I/O streams, and close the stream when you’re done.

Similarly, you can also open a program with a read/write I/O stream and handle data in both directions:
handle = IO.popen("other_program", "r+")
handle.puts "send input to other program"
handle.close_write
while line = handle.gets
  puts line
end
Note

The reason for handle.close_write is to close the I/O stream’s writing stream, thereby sending any data waiting to be written out to the remote program. IO also has a flush method that can be used if the write stream needs to remain open.
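
Since other_program is only a placeholder, here is a runnable sketch of the same read/write pattern. It assumes a stand-in child (the Ruby interpreter running a one-liner that upcases its input); the names CHILD_COMMAND and upcase_via_child are hypothetical:

```ruby
require "rbconfig"

# Stand-in for "other_program": a Ruby one-liner that upcases its input.
# (This child command is an assumption made for the sketch.)
CHILD_COMMAND = [RbConfig.ruby, "-e", "puts STDIN.read.upcase"]

def upcase_via_child(text)
  # The block form closes the stream automatically when the block ends.
  IO.popen(CHILD_COMMAND, "r+") do |handle|
    handle.puts text
    handle.close_write   # signals end of input to the child
    handle.read          # everything the child wrote to standard output
  end
end

puts upcase_via_child("send input to other program")
# SEND INPUT TO OTHER PROGRAM
```

Passing the command as an array avoids going through the shell, so arguments containing spaces or quotes need no extra escaping.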

Threads

Thread is short for thread of execution. You use threads to split the execution of a program into multiple parts that can be run concurrently. For example, a program designed to email thousands of people at once might split the task between 20 different threads that all send email at once. Such parallelism is faster than processing one item after another, especially on systems with more than one CPU, because different threads of execution can be run on different processors. It can also be faster because rather than wasting time waiting for a response from a remote machine, you can continue with other operations.

Ruby 1.8 didn’t support threads in the traditional sense. Typically, threading capabilities are provided by the operating system and vary from one system to another. However, Ruby 1.8 implemented its threading capabilities itself, inside the interpreter, which meant they lacked some of the power of traditional system-level threads. In Ruby 1.9, Ruby began to use system-based threads, and this is now the default expectation among Rubyists.

While Ruby 1.9 and later use system (native) threads, a global interpreter lock (GIL) has been left in place, in part to remain compatible with 1.8 code, so that threads do not truly run Ruby code simultaneously. This means that what is covered in this section is relevant to Ruby 1.8, 1.9, 2.x, and beyond. An alternative available only in Ruby 1.9 and beyond, fibers, is covered in the next primary section of this chapter; as of Ruby 3, fibers also support non-blocking concurrency.

Basic Ruby Threads in Action

Here’s a basic demonstration of Ruby threading in action:
threads = []
10.times do
  thread = Thread.new do
    10.times { |i| print i; $stdout.flush; sleep rand(2) }
  end
  threads << thread
end
threads.each { |thread| thread.join }

You create an array to hold your Thread objects so that you can easily keep track of them. Then you create ten threads, sending the block of code to be executed in each thread to Thread.new, and add each generated thread to the array.

Note

When you create a thread, it can access any variables that are within scope at that point. However, any local variables that are then created within the thread are entirely local to that thread. This is similar to the behavior of other types of code blocks.

Once you’ve created the threads, you wait for all of them to complete before the program finishes. You wait by looping through all the thread objects in threads and calling each thread’s join method. The join method makes the main program wait until a thread’s execution is complete before continuing. In this way, you make sure all the threads are complete before exiting.

The preceding program results in output similar to the following (the variation is due to the randomness of the sleeping):
001012000100101012123121242325123234532343366345443655467445487765578866897567656797
9789878889899999

The example has created ten Ruby threads whose sole job is to count and sleep randomly. This results in the preceding pseudo-random output.

Rather than sleeping, the threads could have been fetching web pages, performing math operations, or sending emails. In fact, Ruby threads are ideal for almost every situation where concurrency within a single Ruby program is desired.
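
When the threads are doing real work, you usually want their results back. A minimal sketch, assuming a made-up helper called parallel_squares: Thread#value waits for a thread to finish (as join does) and returns the value of its block.

```ruby
# Each thread computes a result; Thread#value joins the thread
# and returns the value of its block.
def parallel_squares(numbers)
  threads = numbers.map do |n|
    Thread.new(n) do |x|   # arguments to Thread.new are passed to the block
      sleep 0.01           # stand-in for slow work such as a network call
      x * x
    end
  end
  threads.map(&:value)
end

puts parallel_squares([1, 2, 3, 4]).inspect   # [1, 4, 9, 16]
```

The results come back in the order the threads were created, regardless of which thread happened to finish first.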

Note

In Chapter 15, you’ll be using threads to create a server that creates new threads of execution for each client that connects to it, so that you can develop a simple chat system.

Advanced Thread Operations

As you’ve seen, creating and running basic threads is fairly simple, but threads also offer a number of advanced features. These are discussed in the following subsections.

Waiting for Threads to Finish Redux

When you waited for your threads to finish by using the join method, you could have specified a timeout value (in seconds) for which to wait. If the thread doesn’t finish within that time, join returns nil. Here’s an example where each thread is given only one second to execute:
threads.each do |thread|
  puts "Thread #{thread.object_id} didn't finish in 1s" unless thread.join(1)
end
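
The two possible return values of join can be seen in a short, self-contained sketch (the variable name slow and the timings are arbitrary choices for this example):

```ruby
slow = Thread.new { sleep 0.5; :done }

# join(timeout) returns nil on timeout and the thread itself on completion.
puts slow.join(0.05).inspect   # nil
puts slow.alive?               # true
puts slow.join.equal?(slow)    # true
puts slow.value                # done
```

A timed-out join does not stop the thread; it simply returns control to the caller while the thread keeps running.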

Getting a List of All Threads

It’s possible to get a global list of all threads running within your program using Thread.list. In fact, if you didn’t want to keep your own store of threads, you could rewrite the earlier example from the section “Basic Ruby Threads in Action” down to these two lines:
10.times { Thread.new { 10.times { |i| print i; $stdout.flush; sleep rand(2) } } }
Thread.list.each { |thread| thread.join unless thread == Thread.main }

However, keeping your own list of threads is essential if you’re likely to have more than one group of threads working within an application and you want to keep them separate from one another when it comes to using join or other features.

The list of threads also includes the main thread representing the main program’s thread of execution, which is why we explicitly do not join it in the prior code.

Thread Operations from Within Threads Themselves

Threads aren’t just tiny, dumb fragments of code. They have the ability to talk with the Ruby thread scheduler and provide updates on their status. For example, a thread can stop itself:
Thread.new do
  10.times do |i|
    print i
    $stdout.flush
    Thread.stop
  end
end
Every time the thread created in this example prints a number to the screen, it stops itself. It can then only be restarted or resumed by the parent program calling the run method on the thread, like so:
Thread.list.each { |thread| thread.run }
A thread can also tell the Ruby thread scheduler that it wants to pass execution over to another thread. The technique of voluntarily ceding control to another thread is often known as cooperative multitasking, because the thread or process itself is saying that it’s okay to pass execution on to another thread or process. Used properly, cooperative multitasking can make threading even more efficient, as you can code in pass requests at ideal locations. Here’s an example showing how to cede control from a thread:
2.times { Thread.new { 10.times { |i| print i; $stdout.flush; Thread.pass } } }
Thread.list.each { |thread| thread.join unless thread == Thread.main }
00112233445566778899

In this example, execution flip-flops between the two threads, causing the pattern shown in the results.

Fibers

Fibers offer an alternative to threads in Ruby 1.9 and beyond. In Ruby 3, fibers gained support for non-blocking operation, so that a fiber waiting on I/O no longer has to hold up the rest of the program. Fibers are lightweight units of execution that control their own scheduling (often referred to as cooperative scheduling). Whereas threads will typically run continually, fibers hand over control once they have performed certain tasks. Unlike regular methods, however, once a fiber hands over control, it continues to exist and can be resumed at will.

In short, fibers are pragmatically similar to threads, but fibers aren’t scheduled to all run together. You have to manually control the scheduling.

A Fiber in Action

Nothing will demonstrate fibers as succinctly as a demonstration, so let’s look at a very simple implementation to generate a sequence of square numbers:
sg = Fiber.new do
  s = 0
  loop do
    square = s * s
    Fiber.yield square
    s += 1
  end
end
10.times { puts sg.resume }
0
1
4
9
16
25
36
49
64
81

In this example, we create a fiber using a block, much in the same style as we created threads earlier. The difference, however, is that the fiber will run solely on its own until the Fiber.yield method is used to yield control back to whatever last told the fiber to run (which, in this case, is the sg.resume method call). Alternatively, if the fiber “ends,” the value of the last executed expression is returned.

In this example, it’s worth noting that you don’t have to use the fiber forever, although since the fiber contains an infinite loop, it would certainly be possible to do so. Even though the fiber contains an infinite loop, however, the fiber is not continually running, so it results in no performance issues.

If you do develop a fiber that has a natural ending point, calling its resume method once it has concluded will result in an exception (which, of course, you can catch—refer to Chapter 8’s “Handling Exceptions” section) that states you are trying to resume a dead fiber.
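
Here is a brief sketch of a fiber with a natural ending point. The helper drain and the fiber countdown are names invented for this example; Fiber#alive? and FiberError are part of Ruby. The exact wording of the error message differs between Ruby versions, so it is printed, not assumed:

```ruby
require "fiber"   # needed for Fiber#alive? before Ruby 3.1; harmless afterwards

# Hypothetical helper: resumes a fiber until it has finished.
def drain(fiber)
  results = []
  # Fiber#alive? becomes false once the block has run to completion.
  results << fiber.resume while fiber.alive?
  results
end

countdown = Fiber.new do
  Fiber.yield 2
  Fiber.yield 1
  :liftoff              # final value, returned by the last resume
end

puts drain(countdown).inspect   # [2, 1, :liftoff]

begin
  countdown.resume      # the fiber has already finished
rescue FiberError => e
  puts "FiberError: #{e.message}"
end
```

Checking alive? before each resume avoids the exception entirely, which is usually simpler than rescuing it.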

Passing Data to a Fiber

It is possible to pass data back into a fiber when you resume its execution as well as receive data from it. For example, let’s tweak the square number generator fiber to support receiving back an optional new base from which to provide square numbers:
sg = Fiber.new do
  s = 0
  loop do
    square = s * s
    s += 1
    s = Fiber.yield(square) || s
  end
end
puts sg.resume
puts sg.resume
puts sg.resume
puts sg.resume
puts sg.resume 40
puts sg.resume
puts sg.resume
puts sg.resume 0
puts sg.resume
puts sg.resume
0
1
4
9
1600
1681
1764
0
1
4

In this case, we start out by getting back square numbers one at a time as before. On the fifth attempt, however, we pass back the number 40, which is then assigned to the fiber’s s variable and used to generate square numbers. After a couple of iterations, we then reset the counter to 0. The number is received by the fiber as the result of calling Fiber.yield.

It is not possible to send data into the fiber in this way with the first resume, however, since the first resume call does not follow on from the fiber yielding or concluding in any way. In that case, any data you passed is passed into the fiber block, much as if it were a method.
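
The difference between the first resume and later ones can be sketched as follows (make_greeter and its strings are illustrative names only):

```ruby
# Hypothetical factory so the same fiber definition can be reused.
def make_greeter
  Fiber.new do |name|
    # `name` receives the argument of the first resume call.
    reply = Fiber.yield "Hello, #{name}!"
    # `reply` receives the argument of the second resume call.
    "You said: #{reply}"
  end
end

greeter = make_greeter
puts greeter.resume("Ruby")     # Hello, Ruby!
puts greeter.resume("thanks")   # You said: thanks
```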

Non-blocking Fiber

Ruby 3 introduces the ability to create non-blocking fibers. A fiber is non-blocking when blocking: false is passed to the constructor (in Ruby 3 this is also the default for Fiber.new). When a fiber scheduler has been installed for the current thread with Fiber.set_scheduler, operations such as I/O and sleep inside a non-blocking fiber hand control to the scheduler instead of blocking; without a scheduler, they block as usual:
non_blocking = Fiber.new(blocking: false) do
  puts "Blocking Fiber? #{Fiber.current.blocking?}"
  # Will not block other fibers when a fiber scheduler is installed
  sleep 2
end
non_blocking.resume
Blocking Fiber? false

When used correctly, non-blocking fibers can increase performance, since other fibers can make progress while one is waiting on I/O. Because this behavior only takes effect once a scheduler is installed, Ruby 3 does not break existing code.
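
A quick way to see how the blocking: option affects a fiber, assuming Ruby 3.0 or later, is to ask each fiber about itself. The helper name blocking_flag is made up for this sketch; Fiber#blocking? and Fiber.scheduler are part of Ruby 3:

```ruby
require "fiber"   # only needed for Fiber.current before Ruby 3.1

# Hypothetical helper: creates a fiber with the given option and
# returns what the fiber reports about itself.
def blocking_flag(blocking)
  Fiber.new(blocking: blocking) { Fiber.current.blocking? }.resume
end

puts blocking_flag(false)       # false
puts blocking_flag(true)        # true

# No scheduler has been installed in this script, so this prints nil.
puts Fiber.scheduler.inspect    # nil
```

Writing a scheduler is beyond the scope of this sketch; in practice one is normally supplied by a library and not written by hand.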

Why Fibers?

A motivation to use fibers over threads in some situations is efficiency. Creating hundreds of fibers is a lot faster than creating the equivalent threads, since threads are created at the operating system level. There are also significant memory efficiency benefits.

One of the greatest benefits of fibers is in implementing lightweight I/O management routines within other libraries, so even if you don’t use fibers directly, you might still end up benefiting from their use elsewhere.

Unicode, Character Encodings, and UTF-8 Support

Unicode is the industry standard way of representing characters from every writing system (character set) in the world. It’s the only viable way to be able to manage multiple different alphabets and character sets in a reasonably standard context.

One of Ruby 1.8’s most cited flaws was in the way it dealt with character encodings: namely, hardly at all. There were some workarounds, but they were hackish. Ruby 1.8 treated strings as simple collections of bytes rather than true characters, which was just fine if you were using a standard English character set, but if you wanted to work with, say, Arabic or Japanese, you had problems!

Ruby 1.9 and beyond, on the other hand, support Unicode, alternative character sets, and encodings out of the box. In this section, we’ll focus on the direct support in Ruby 1.9 and up.

Note

For a full rundown of Unicode and how it works and relates to software development, read www.joelonsoftware.com/articles/Unicode.html. The official Unicode site, at http://unicode.org/, also has specifications and further details.

Ruby 1.9 and Beyond’s Character Encoding Support

Unlike with Ruby 1.8, no hacks or workarounds are necessary to work with multiple character sets and encodings in Ruby 1.9 and above. Ruby 1.9 supports a large number of encodings out of the box (over 100 at the time of writing), and the interface is seamless. You not only get character encoding support for strings within your programs, but for your source code itself too.

Note

Encoding.list returns an array of Encoding objects that represent the different character encodings that your Ruby interpreter supports.

Strings

Strings have encoding support out of the box. To determine the current encoding for a string, you can call its encoding method:
"this is a test".encoding
=> #<Encoding:US-ASCII>
By default, a regular ASCII string will be encoded using the US-ASCII, UTF-8, or CP850 encodings, depending on how your system is set up, but if you get a bit more elaborate, then UTF-8 (a character encoding that can be used to represent any Unicode character) will typically be used:
"ça va?".encoding
=> #<Encoding:UTF-8>
To convert a string into a different encoding, use its encode method:
"ça va?".encode("ISO-8859-1")
Not every character encoding can represent every character that exists in your text. For example, the cedilla character (ç) in the preceding example cannot be represented in plain US-ASCII. If we try to do a conversion to US-ASCII, therefore, we get an error:
"ça va?".encode("US-ASCII")
Encoding::UndefinedConversionError: "\xC3\xA7" from UTF-8 to US-ASCII

I would personally suggest that, where possible, you try and use the UTF-8 encoding exclusively in any apps that are likely to accept input from people typing in many different languages. UTF-8 is an excellent “global” encoding that can represent any character in the Unicode standard, so using it globally throughout your projects will ensure that everything works as expected.
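
The difference between characters and bytes, and one way to avoid the conversion error shown earlier, can be sketched like this. The helper name ascii_fallback is hypothetical; the undef: and replace: options belong to String#encode. The string is written with a \u escape so that it is UTF-8 regardless of how the source file is saved:

```ruby
text = "\u00E7a va?"   # "ça va?" as a UTF-8 string

puts text.length       # 6 characters
puts text.bytesize     # 7 bytes (ç takes two bytes in UTF-8)

latin1 = text.encode("ISO-8859-1")
puts latin1.bytesize   # 6 bytes (ç takes one byte in ISO-8859-1)

# Hypothetical helper: substitutes "?" for characters that US-ASCII
# cannot represent instead of raising Encoding::UndefinedConversionError.
def ascii_fallback(string)
  string.encode("US-ASCII", undef: :replace, replace: "?")
end

puts ascii_fallback(text)   # ?a va?
```

Replacing characters loses information, so this kind of fallback suits display or logging better than stored data.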

Tip

Make sure to refer to Chapter 9 to see how to open files and read data that is in different character encodings.

Source Code

As well as supporting character encodings out of the box for strings and files, Ruby 1.9 and beyond also allow you to use any of the supported character sets for your actual source code files.

All you need to do is include a comment on the first or second line (in case you’re using a shebang line) that contains coding: [format name], for example:
# coding: utf-8

The primary reason for doing this is so that you can use UTF-8 (or whichever encoding you choose to specify) within literal strings defined within your source files without running into snags with String#length, regular expressions, and the like.

Another fun (but not endorsed by me!) option is to use alternate non-ASCII characters in method names, variable names, and so forth. The danger of this, of course, is that you reduce the usability of your code with developers who might prefer to use other encodings.

Summary

In this chapter, we looked at an array of advanced Ruby topics, from dynamic code execution to threads, fibers, and character encodings. This is the last chapter that covers general Ruby-related knowledge that any intermediate Ruby programmer should be familiar with. In Chapter 12, we’ll be taking a different approach and will develop an entire Ruby application, much as we did in Chapter 4.

Let’s reflect on the main concepts covered in this chapter:
  • Binding: A representation of a scope (execution) context as an object.

  • Forking: When an instance of a program duplicates itself into two processes, one as a parent and one as a child, both continuing execution.

  • Threads: Separate “strands” of execution that run concurrently with each other. Ruby’s threads in 1.8 were implemented entirely by the Ruby interpreter, but since Ruby 1.9 they have been system-based threads, and they are a commonly used tool in application development.

  • Fibers: Lightweight cooperative alternatives to threads. They must yield execution in order to be scheduled.

  • Character encoding: This describes a system and code that pair characters (whether they’re Roman letters, Chinese symbols, Arabic letters, etc.) to a set of numbers that a computer can use to represent those characters.

  • UTF-8 (Unicode Transformation Format-8): This is a character encoding that can support any character in the Unicode standard. It supports variable-length characters, and is designed to support ASCII coding natively, while also providing the ability to use up to four bytes to represent characters from other character sets.

Now you can move on to Chapter 12, where you’ll develop an entire Ruby application using much of the knowledge obtained in this book so far.
