In this chapter, we’re going to step back from focusing on individual facets of Ruby and develop an entire program using much of the knowledge you’ve gained so far. We’ll focus on the structural concerns of developing a program and look at how a flexible structure can benefit you and other developers in the long run.
The important thing to remember while working through this chapter is that the program itself is not as important as the concepts used while developing it. We’ll be rapidly (and relatively shallowly) covering a number of areas relevant to most development you’ll do, such as testing and basic refactoring.
Let’s Build a Bot
Before we get to any code, we’re going to look at what we’re going to build, why we’re going to build it, and how we’re going to do it.
What Is a Bot?
In this chapter, we’re going to build a robot. Not a sci-fi type of robot, such as that in Lost in Space, but a computer program that can hold a conversation with us. These types of programs are commonly known as bots or chatterbots. Bots are present in a lot of different software and tools these days. You can ask them for gift ideas and movie times. In short, it’s a little like talking to a customer service agent, except the agent is entirely automated.
You might be familiar with bots on your own computer. Microsoft Office used to come with the “Clippy” bot turned on by default, and many websites have automated chatbots in an attempt to cut down on support costs and, supposedly, to improve usability.
The history of bots goes back to the 1960s, when a computer scientist at MIT named Joseph Weizenbaum developed a bot called ELIZA. It eventually became so popular that most computer users throughout the 1980s and 1990s were exposed to it in one form or another through the many “talk to your computer”–type programs that became popular.
Our bot won’t be exactly like ELIZA—that is, it won’t be an ELIZA clone—but will share some of the same features and use some similar techniques. We’ll also look at how to extend our bot with other features.
If you want to learn about or play with some Internet-hosted versions of ELIZA, visit https://en.wikipedia.org/wiki/ELIZA.
Why a Bot?
The good thing about developing a bot is that it can be as simple or as complex as you like. Toward the end of this chapter, we’ll be looking at ways you can extend the bot, but the initial construction is quite simple.
You’ll be using most of the techniques covered so far in this book to build your bot. You’ll be doing a bit of testing and documentation, as well as using classes and complex data structures. You’ll also be using files to store information the bot uses, and looking at how to make your bot available to the general public using HTTP servers and CGI scripts. This project also demands you use a lot of string and list-related functions, along with comparison logic. These are all things you’re likely to use in a larger development project, and as Ruby is a particularly good language for text processing, this project is perfect for demonstrating Ruby’s strengths.
A bot also allows you to have some fun and experiment. Working on a contact information management tool (for example) isn’t that much fun, even though such a system would use similar techniques to your bot. You can still implement testing, documentation, classes, and storage systems, but end up with a fun result that can be extended and improved indefinitely.
How?
The primary focus of this chapter is to keep each fragment of functionality in your bot loosely coupled to the others. This is an important decision when developing certain types of applications if you plan to extend them in the future. The plan for this bot is to make it as easy to extend, or change, as possible, allowing you to customize it, add features, and make it your own.
In terms of the general operation of the chatterbot, your bot will exist within a class, allowing you to replicate bots easily by creating new instances. When you create a bot, it will be “blank,” except for the logic contained within the class, and you’ll pass in a special data file to give it a set of knowledge and a set of responses it can use when conversing with users. User input will be via the keyboard, but the input mechanism will be kept flexible enough so that the bot could easily be used from a website or elsewhere.
Your bot will only have a few public methods to begin with. It needs to be able to load its data file into memory and accept input given by the user and then return its responses. Behind the scenes, the bot will need to parse what the users “say” and be able to build up a coherent reply. Therefore, the first step is to begin processing language and recognizing words.
Creating a Simple Text Processing Library
Several stages are required to accept input such as “I am bored” and turn it into a response such as “Why are you bored?” The first is to perform some preprocessing—tasks that make the text easier to parse—such as cleaning up the text, expanding terms such as “I’m” into “I am,” “you’re” into “you are,” and so forth. Next, you’ll split up the input into sentences and words, choose the best sentence to respond to, and finally look up responses from your data files that match the input.
Some of these language tasks are generic enough that they could be useful in other applications, so you’ll develop a basic library for them. This will make your bot code simpler and give you a library to use in other applications if you need. Logic and methods that are specific to bots can go in the bot’s source code, and generic methods that perform operations on text can go into the library.
This section covers the development of a simple library, including testing and documentation.
Building the WordPlay Library
Now that you’ve got the library’s main file set up, you’ll move on to implementing some of the text manipulation and processing features you know your bot will require, but are reasonably application agnostic. (I covered the construction of classes in depth in Chapter 6.)
Splitting Text into Sentences
Your bot, like most others, is only interested in single-sentence inputs. Therefore, it’s important to accept only the first sentence of each line of input. However, rather than specifically tear out the first sentence, you’ll split the input into sentences and then choose the first one. The reason for this approach is to have a generic sentence-splitting method, rather than to create a unique solution for each case.
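A generic splitting method of this kind can be sketched as a small extension to String. Treat this as a minimal sketch rather than the definitive implementation: the method name sentences follows the text, while the regular expressions are just one reasonable choice.

```ruby
# A minimal sketch: add a sentences method to every String.
class String
  # Turn line breaks into spaces, then split on each period
  # together with any whitespace that follows it.
  def sentences
    gsub(/\n|\r/, ' ').split(/\.\s*/)
  end
end

text = "I am bored. Tell me a story. Make it a good one."
puts text.sentences.first   # prints "I am bored"
```

Choosing the first sentence is then just a matter of calling first on the result, which keeps the splitting logic reusable elsewhere.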
The preceding sentences method only splits text into sentences based on a period followed by whitespace. A more accurate technique could involve dealing with other punctuation (e.g., question marks and semicolons).
Splitting Sentences into Words
This test picks out the second sentence with sentences[1] and then the fourth word with words[3]—remember, arrays are zero-based. (The splitting techniques covered in this section were also explained in Chapter 3.)
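To make that concrete, here is a small sketch that pairs a words method with the sentence splitter. The regular expression is an assumption: it treats a word as a run of word characters that may also contain apostrophes and hyphens.

```ruby
class String
  # Split on a period plus any trailing whitespace.
  def sentences
    gsub(/\n|\r/, ' ').split(/\.\s*/)
  end

  # Collect each run of word characters; apostrophes and hyphens
  # are allowed after the first character of a word.
  def words
    scan(/\w[\w'\-]*/)
  end
end

text = "This is a test. Splitting into words works nicely. The end."
puts text.sentences[1].words[3]   # prints "works"
```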
Word Matching
In this example, you define three “hot” words that you want to find within sentences, and you look through the sentences in my_string for any that contain any of your hot words. The way you do this is by seeing if, for any of the words in the sentence, it’s true that the hot_words array also contains that word.
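A sketch of that matching logic follows. The variable names hot_words and my_string come from the text; the sample sentences and the helper method sentences_with_hot_words are hypothetical, added only so the idea can be run on its own.

```ruby
class String
  def sentences
    gsub(/\n|\r/, ' ').split(/\.\s*/)
  end

  def words
    scan(/\w[\w'\-]*/)
  end
end

# Hypothetical helper: keep only the sentences in which at least
# one word also appears in the hot_words array.
def sentences_with_hot_words(text, hot_words)
  text.sentences.find_all do |sentence|
    sentence.downcase.words.any? { |word| hot_words.include?(word) }
  end
end

hot_words = %w[test ruby great]
my_string = "This is a test. Dull sentence here. Ruby is great. So is cake."

p sentences_with_hot_words(my_string, hot_words)
# prints ["This is a test", "Ruby is great"]
```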
This class method accepts an array of sentences and an array of “desired words” as arguments. Next, it sorts the sentences by how many of their words disappear when the desired words are taken away. If that difference is high, the sentence must contain many of the desired words. At the end of best_sentence, the sentence with the largest number of matching words is returned.
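One way to sketch such a method is shown here. It assumes the words method on String described earlier, and it scores each sentence by subtracting the count of non-desired words from the total word count.

```ruby
class String
  def words
    scan(/\w[\w'\-]*/)
  end
end

class WordPlay
  # Rank sentences by how many of their words appear in
  # desired_words, and return the highest-ranked sentence.
  def self.best_sentence(sentences, desired_words)
    ranked_sentences = sentences.sort_by do |s|
      # Total words minus the words left after removing the
      # desired ones equals the number of desired words present.
      s.words.length - (s.downcase.words - desired_words).length
    end
    ranked_sentences.last
  end
end

puts WordPlay.best_sentence(
  ['This is a test', 'This is another test', 'This is a great test'],
  %w[test great this]
)
# prints "This is a great test"
```

Because the method always returns the last element after sorting, it still returns a sentence when nothing matches at all.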
Switching Subject and Object Pronouns
Inputs Coupled with Potential Responses
Input | Response |
---|---|
My cat is sick. | Your cat is sick. |
I hate my car. | You hate your car. |
You are an awful bot. | I are an awful bot. |
These aren’t elaborate conversations, but the first two responses are valid English and are the sort of thing your bot can use. The third response highlights that you also need to pay attention to conjugating “am” to “are” and vice versa when using “I” and “you.”
This method accepts any text supplied as a string and performs a substitution on each instance of “I am,” “you are,” “I,” “you,” “your,” or “my.” Next, a case construction is used to substitute each pronoun with its opposing pronoun. (You first used the case/when syntax in Chapter 3, where you can also find a deeper explanation of how it works.)
The reason for performing a substitution in this way is so that you only change each pronoun once. If you’d used four gsubs to change all “I’s” to “you’s,” “you’s” to “I’s,” and so on, changes made by the previous gsub would be overwritten by the next. Therefore, it’s important to use one gsub that scans through the input pronoun by pronoun rather than making several blanket substitutions in succession.
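The single-pass idea can be sketched like this. It is an illustrative version under the assumptions in the text (one gsub, one case construction), and it ignores capitalization of the result.

```ruby
class WordPlay
  # Swap first- and second-person pronouns in a single pass, so
  # that a word changed once is never changed back again.
  def self.switch_pronouns(text)
    text.gsub(/\b(I am|You are|I|You|Your|My)\b/i) do |pronoun|
      case pronoun.downcase
      when "i"       then "you"
      when "you"     then "I"
      when "i am"    then "you are"
      when "you are" then "I am"
      when "your"    then "my"
      when "my"      then "your"
      end
    end
  end
end

puts WordPlay.switch_pronouns("my cat is sick")        # prints "your cat is sick"
puts WordPlay.switch_pronouns("You are an awful bot")  # prints "I am an awful bot"
```

Listing the two-word phrases before the single pronouns in the pattern matters: it lets “you are” be matched as a unit before “you” gets a chance to match on its own.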
The right replacement depends on the pronoun’s role. When “you” or “I” is the subject of the sentence, “I” becomes “you” and “you” becomes “I”; when it is the object, “you” becomes “me” and “me” becomes “you.”
What you do in this case seems odd on the surface. You let switch_pronouns process the pronouns and then correct it when it changes “you” to “me” at the start of a sentence by changing the “me” to “I.” This is done with the chained sub at the end.
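Here is a sketch of that approach. It assumes “you” is always switched to “me” first, with a chained sub then repairing a “me” that lands at the very start of the text. The capital “I” in the correction is a choice made for this sketch.

```ruby
class WordPlay
  def self.switch_pronouns(text)
    switched = text.gsub(/\b(I am|You are|I|You|Me|Your|My)\b/i) do |pronoun|
      case pronoun.downcase
      when "i"       then "you"
      when "you"     then "me"   # corrected below when it opens the text
      when "me"      then "you"
      when "i am"    then "you are"
      when "you are" then "I am"
      when "your"    then "my"
      when "my"      then "your"
      end
    end
    # A leading "me" was really the subject, so turn it into "I".
    switched.sub(/^me\b/i, 'I')
  end
end

puts WordPlay.switch_pronouns("You gave me hope")   # prints "I gave you hope"
puts WordPlay.switch_pronouns("I gave you hope")    # prints "you gave me hope"
```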
Success!
We clearly have some work to do!
Testing the Library
When building a larger application or libraries upon which other applications will depend, it’s important to make sure everything is fully tested. In Chapter 8, we looked at using Ruby’s unit testing features for simple testing. You can use the same methods here to test WordPlay. Make sure the Minitest gem is installed. If you need help, review Chapter 8.
Now let’s write some tests.
Testing Sentence Separation
The first assertion tests that the dummy sentence "a. b. c d. e f g." is successfully separated into the constituent “sentences.” The second assertion uses a longer predefined text string and makes sure that the third sentence is correctly identified.
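A test along those lines might look like the following sketch. It assumes Minitest is available and, to stay self-contained, defines the sentences method inline instead of loading wordplay.rb. The class name TestWordPlay and the longer sample text are illustrative choices.

```ruby
require 'minitest/autorun'

class String
  def sentences
    gsub(/\n|\r/, ' ').split(/\.\s*/)
  end
end

class TestWordPlay < Minitest::Test
  def test_sentences
    # A dummy string should break into its constituent "sentences".
    assert_equal(["a", "b", "c d", "e f g"], "a. b. c d. e f g.".sentences)

    # A longer text spread over several lines: check the third sentence.
    test_text = "Hello. This is a test\nof sentence separation. " \
                "This is the end\nof the test."
    assert_equal("This is the end of the test", test_text.sentences[2])
  end
end
```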
Ideally, you’d extend this basic set of assertions with several more to test more complex cases, such as sentences ending with multiple periods, commas, and other oddities. As these extra tests wouldn’t demonstrate any further Ruby functionality, they’re not covered here, but feel free to try some out!
Testing Word Separation
These assertions are simple. You split sentences into words and compare them with predefined arrays of those words. The assertions pass.
This highlights one reason why test-first development can be a good idea. It’s easy to see how you could develop these tests first and then use their passing or failure as an indicator that you’ve implemented words correctly. This is an advanced programming concept, but one worth keeping in mind if writing tests in this way “clicks” with you.
Testing Best Sentence Choice
This test method performs a simple assertion that the correct sentence is chosen from three options. Three sentences are provided to WordPlay.best_sentence, along with the desired keywords of “test,” “great,” and “this.” Therefore, the third sentence should be the best match. The second assertion makes sure that WordPlay.best_sentence returns a sentence even if there are no matches, because in this case, any sentence is a “best” match.
Testing Pronoun Switches
These basic assertions prove that the “you are,” “I am,” “you,” and “me” phrases are switched correctly.
These examples are more complex, but prove that switch_pronouns can handle a few more complex situations with multiple pronouns.
These tests both fail because they circumvent the trick you used to make sure that “you” is translated to “me” and “I” in the right situations. In these situations, they should become “I,” but because “I” isn’t at the start of the sentence, they become “me” instead. It’s important to notice that basic statements tend to work okay, whereas questions or more elaborate statements can fail. However, for your bot’s purposes, the basic substitutions suffice and you can remove these tests.
If you were to focus solely on producing an accurate language processor, you could use tests such as these to guide your development, and you’ll probably use this technique when developing libraries to deal with edge cases such as these in your own projects.
WordPlay’s Source Code
Your nascent WordPlay library is complete for now, and in a state that you can use its features to make your bot’s source code simpler and easier to read. Next, I’ll present the source code for the library as is, as well as its associated unit test file. As an addition, the code also includes comments prior to each class and method definition, so that you can use RDoc to produce HTML documentation files, as covered in Chapter 8.
Remember that source code for this book is available in the Source Code area at www.apress.com, so it isn’t necessary to type in code directly from the book.
wordplay.rb
test_wordplay.rb
Building the Bot’s Core
In the previous section, you put together the WordPlay library to provide some features you knew that your bot would need, such as basic sentence and word separation. Now you can get on with the task of fleshing out the logic of the bot itself.
You’ll use this barebones client program as a yardstick while creating the Bot class. In the previous example, you created a bot object and passed in some parameters, which enables you to use the bot’s methods, along with keyboard input, to make the bot converse with the user.
In certain situations, it’s useful to write an example of the higher-level, more abstracted code that you expect ultimately to write, and then write the lower-level code to satisfy it. This isn’t the same as test-first development, although the principle is similar. You write the easiest, most abstract code first and then work your way down to the details.
Next, let’s look at how you expect the bot to operate throughout a normal session and then begin to develop the required features one by one.
The Program’s Lifecycle and Parts
So far we have focused on verbal descriptions of what we want to do. In Figure 12-2, however, we take a more visual look at the overall lifecycle of the bot we’ll develop and of the client accessing it.
1. The Bot class, within bot.rb, containing all the bot’s logic and any subclasses.
2. The WordPlay library, within wordplay.rb, containing the WordPlay class and extensions to String.
3. Basic “client” applications that create bots and allow users to interact with them. You’ll first create a basic keyboard-entry client, but we’ll look at some alternatives later in the chapter.
4. A helper program to generate the bot’s data files easily.
You’ll begin putting together the Bot class and then look at how the bot will find and process its data.
Bot Data
One of your first concerns is where the bot will get its data. The bot’s data includes information about word substitutions to perform during preprocessing, as well as myriad keywords and phrases that the bot can use in its responses.
The Data Structure
The main hash has two parent elements, :presubs and :responses. The :presubs element references an array of arrays that contain substitutions to be made to the user’s input before the bot forms a response. In this instance, the bot will expand some contractions and also change any reference of “love” to “like.” The reason for this becomes clear when you look at :responses.
The preceding data structure is intentionally lightly populated to save space for discussion of the practicalities. By the end of this chapter, you’ll have a more complete set of data to use with your bot. This style of data structure was also covered in Chapter 3.
:responses references another hash: one that has elements with the names :default, :greeting, :farewell, 'hello', and 'i like *'. This hash contains all the different phrases the bot will use as responses, or templates used to create full phrases. The array assigned to :default contains some phrases to use at random when the bot cannot figure out what to say based on the input. Those associated with :greeting and :farewell contain generic greeting and farewell phrases.
More interesting are the arrays associated with 'hello' and 'i like *'. These phrases are used when the input matches the hash key for each array. For example, if a user says “hello computer,” then a match with 'hello' is made, and a response is chosen from the array at random. If a user says “i like computers,” then 'i like *' is matched and the asterisk is used to substitute the remainder of the user’s input (after “i like”) into the bot’s output phrase. This could result in output such as “Wow! I like computers too,” if the second phrase were to be used.
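The shape described above can be sketched as a nested hash. The keys follow the text; most of the individual phrases are illustrative placeholders, and the constant name BOT_DATA is used here only so the structure is easy to refer to.

```ruby
# Illustrative bot data: substitutions to apply first, then the
# phrase sets the bot can draw its responses from.
BOT_DATA = {
  :presubs => [
    ["dont", "do not"],      # expand contractions typed without apostrophes
    ["youre", "you are"],
    ["love", "like"]         # so that 'i like *' also covers "love"
  ],
  :responses => {
    :default  => ["I don't understand.", "What?", "Huh?"],
    :greeting => ["Hi. I'm [name]. Want to chat?"],
    :farewell => ["Good bye!"],
    'hello'    => ["How's it going?", "How do you do?"],
    'i like *' => ["Why do you like *?", "Wow! I like * too"]
  }
}

# Symbol keys hold general-purpose phrase sets; string keys are
# patterns to match against what the user says.
p BOT_DATA[:responses].keys
```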
Storing the Data Externally
Using a hash makes data access easy (rather than relying on, say, a database) and fast when it comes to choosing sentences and performing matches. However, because your bot class needs to be able to deal with multiple datasets, it’s necessary to store the hash of data for each bot within a file that can be chosen when a bot is started.
In Chapter 9, you learned about the concept of object persistence, where Ruby data structures can be “frozen” and stored. One library you used was called PStore, which stores Ruby data structures in a non-human-readable binary format; and the other was YAML, which is human-readable and represented as a specially formatted text file. For this project, you’ll use YAML, as you want to be able to make changes to the data files on the fly, to change things your bot will say, and to test out new phrases without constructing a whole new file each time.
Note that as the YAML data is plain text, you can edit it directly in the file or just tweak the bot_data structure and re-run bot_data_to_yaml.rb. From here on out, let’s assume you’ve run this and generated the preceding YAML file as bot_data in the current directory.
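As a rough sketch of the round trip, the following writes a small data hash out as YAML and reads it back. The helper names write_bot_data and read_bot_data are hypothetical, and the sketch uses YAML.safe_load with Symbol explicitly permitted, an assumption made so that the symbol keys load the same way regardless of Ruby version.

```ruby
require 'yaml'
require 'tmpdir'

# Hypothetical helper: dump the data structure to a YAML text file.
def write_bot_data(bot_data, filename)
  File.write(filename, bot_data.to_yaml)
end

# Hypothetical helper: load the structure back, allowing symbols.
def read_bot_data(filename)
  YAML.safe_load(File.read(filename), permitted_classes: [Symbol])
end

bot_data = {
  :presubs   => [["dont", "do not"]],
  :responses => { :default => ["What?"], 'hello' => ["How do you do?"] }
}

# A temporary directory keeps the example from touching real files.
Dir.mktmpdir do |dir|
  path = File.join(dir, 'bot_data')
  write_bot_data(bot_data, path)
  puts File.read(path)             # the human-readable YAML text
  p read_bot_data(path)[:presubs]  # the data, restored from the file
end
```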
Now that you have a basic data file, you need to construct the Bot class and get its initialize method to use it.
Constructing the Bot Class and Data Loader
The initialize method sets up each newly created object and uses the options hash to populate two instance variables, @name and @data. External access to @name is provided courtesy of attr_reader. File.open, along with the read method, opens the data file and reads in the full contents to be processed by the YAML library. YAML.load converts the YAML data into the original hash data structure and assigns it to the @data instance variable. If the data file opening or YAML processing fails, an exception is raised, as the bot cannot function without data.
This method simplifies the routine of taking a random phrase from a particular phrase set in @data. The second line of random_response performs a substitution so that any responses that contain [name] have [name] replaced with the bot’s name. For example, one of the demo greeting phrases is “Hi. I’m [name]. Want to chat?” However, if you created the bot object and specified a name of “Fred,” the output would appear as “Hi. I’m Fred. Want to chat?”
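A constructor of this kind can be sketched as follows. The option keys :name and :data_file, the fallback name, and the error message are assumptions for illustration. The text uses YAML.load; this sketch substitutes YAML.safe_load with Symbol permitted so that it behaves consistently across Ruby versions.

```ruby
require 'yaml'
require 'tmpdir'

class Bot
  attr_reader :name

  def initialize(options)
    @name = options[:name] || "Unnamed Bot"
    begin
      # Read the whole data file and turn the YAML back into a hash.
      @data = YAML.safe_load(File.read(options[:data_file]),
                             permitted_classes: [Symbol])
    rescue StandardError
      # Without data the bot cannot do anything useful.
      raise "Can't load bot data"
    end
  end
end

# Usage: write a tiny data file, then build a bot from it.
Dir.mktmpdir do |dir|
  path = File.join(dir, 'fred.bot')
  File.write(path, { :responses => { :default => ["What?"] } }.to_yaml)
  bot = Bot.new(name: "Fred", data_file: path)
  puts bot.name   # prints "Fred"
end
```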
Remember that a private method is a method that cannot be called from outside the class itself. As random_response is only needed internally to the class, it’s a perfect candidate to be a private method.
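The sketch below shows how greeting and farewell methods might lean on a private random_response helper. To keep it self-contained, this Bot takes its data as an in-memory hash through a :data option, which is an assumption made for illustration and not how the data file is loaded elsewhere in the chapter.

```ruby
class Bot
  attr_reader :name

  # Assumption for this sketch: the data hash is passed in directly.
  def initialize(options)
    @name = options[:name] || "Unnamed Bot"
    @data = options[:data]
  end

  def greeting
    random_response(:greeting)
  end

  def farewell
    random_response(:farewell)
  end

  private

  # Pick a random phrase from the given phrase set and fill in
  # the bot's name wherever [name] appears.
  def random_response(key)
    random_index = rand(@data[:responses][key].length)
    @data[:responses][key][random_index].gsub(/\[name\]/, @name)
  end
end

data = { :responses => { :greeting => ["Hi. I'm [name]. Want to chat?"],
                         :farewell => ["Good bye!"] } }
bot = Bot.new(name: "Fred", data: data)
puts bot.greeting   # prints "Hi. I'm Fred. Want to chat?"
```

Calling bot.random_response(:greeting) from outside the class raises NoMethodError, which is exactly the protection a private method provides.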
Isn’t separating common functionality into distinct methods great? These methods now look a lot simpler and make immediate sense compared to the jumble they contained previously.
This technique is also useful in situations where you have “ugly” or complex-looking code and you simply want to hide it inside a single method you can call from anywhere. Keep complex code in the background and make the rest of the code look as simple as possible.
The response_to Method
The core of the Bot class is the response_to method. It’s used to pass user input to the bot and get the bot’s response in return. However, the method itself should be simple and have one line per required operation to call private methods that perform each step.
1. Accept the user’s input.
2. Perform preprocessing substitutions, as described in the bot’s data file.
3. Split the input into sentences and choose the most keyword-rich sentence.
4. Search for matches against the response phrase set keys.
5. Perform pronoun switching against the user input.
6. Pick a random phrase that matches (or a default phrase if there are no matches) and perform any substitutions of the user input into the result.
7. Return the completed output phrase.
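The steps above can be sketched as a response_to method that delegates each stage to a private helper. The helpers here are deliberately simplified placeholders (no keyword ranking, no pronoun switching, no asterisk handling), and the :data option is an assumption for illustration; the point is only the one-line-per-step shape of response_to.

```ruby
class Bot
  # Assumption for this sketch: data is supplied as an in-memory hash.
  def initialize(options)
    @data = options[:data]
  end

  def response_to(input)
    prepared_input = preprocess(input.downcase)  # steps 1 and 2
    sentence = best_sentence(prepared_input)     # step 3
    responses = possible_responses(sentence)     # steps 4 to 6
    responses[rand(responses.length)]            # step 7
  end

  private

  def preprocess(input)
    @data[:presubs].each { |from, to| input.gsub!(from, to) }
    input
  end

  # Placeholder: just take the first sentence.
  def best_sentence(input)
    input.split(/\.\s*/).first.to_s
  end

  # Placeholder: plain substring matching against the string keys.
  def possible_responses(sentence)
    matches = @data[:responses].select do |key, _phrases|
      key.is_a?(String) && sentence.include?(key)
    end
    matches.empty? ? @data[:responses][:default] : matches.values.flatten
  end
end

data = { :presubs   => [["yeah", "yes"]],
         :responses => { :default => ["What?"], 'yes' => ["Great!"] } }
puts Bot.new(data: data).response_to("Yeah. Whatever")   # prints "Great!"
```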
Let’s look at each action in turn.
Accepting Input and Performing Substitutions
Then you move on to performing the preprocessing word and phrase substitutions as dictated by the :presubs array in the bot data file. You’ll recall the :presubs array is an array of arrays that specifies words and phrases that should be changed to another word or phrase. The reason for this is so that you can deal with multiple terms with a single phrase. For example, if you replace all instances of “yeah” with “yes,” a relevant phrase will be shown whether the user says “yeah” or “yes,” even though the phrase is only matching on “yes.”
This code loops through each substitution defined in the :presubs array and uses gsub! on the input.
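A sketch of that chain of methods is shown below. The method names preprocess and perform_substitutions follow the text; the :data option and the choice to leave preprocess public are assumptions made so the example can be run on its own.

```ruby
class Bot
  # Assumption for this sketch: data is supplied as an in-memory hash.
  def initialize(options)
    @data = options[:data]
  end

  # Left public here so it can be tried directly; inside the real
  # bot it would be called from response_to.
  def preprocess(input)
    perform_substitutions(input)
  end

  private

  # Apply every [from, to] pair in :presubs to the input in place.
  def perform_substitutions(input)
    @data[:presubs].each { |s| input.gsub!(s[0], s[1]) }
    input
  end
end

bot = Bot.new(data: { :presubs => [["yeah", "yes"], ["love", "like"]] })
puts bot.preprocess("Yeah I love cats".downcase)   # prints "yes i like cats"
```

Because gsub! changes the string it is called on, the example passes in the fresh copy returned by downcase and leaves the caller's original string alone.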
At this point, it’s worth wondering why you have a string of methods just to get to the perform_substitutions method. Why not just call it directly from response_to?
The rationale in this case is that you’re trying to keep logic separated from other logic within this program as much as possible. This is how larger applications work, as it allows you to extend them more easily. For example, if you wanted to perform more preprocessing tasks in the future, you could simply create methods for them and call them from preprocess without having to make any changes to response_to. Although this looks inefficient, it actually results in code that’s easy to extend and read in the long run. A little verbosity is the price for a lot of flexibility. You’ll see a lot of similar techniques used in other Ruby programs, which is why it’s demonstrated so forcefully here.
Choosing the Best Sentence
First, best_sentence collects an array of single words from the keys in the :responses hash. It looks for all keys that are strings (you don’t want the :default, :greeting, or :farewell symbols getting mixed in) and only a single word. You then use this list with the WordPlay.best_sentence method you developed earlier in this chapter to choose the sentence from the user input that matches the most “hot” words (if any).
Again, by having the tiny piece of logic of choosing the best sentence in a separate method, you can change the way the program works without meddling with larger methods.
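That selection step might be sketched as follows. It repeats simplified versions of the String helpers and WordPlay.best_sentence so that it runs on its own, and it leaves best_sentence public purely so the example can call it. The :data option is likewise an assumption.

```ruby
class String
  def sentences
    gsub(/\n|\r/, ' ').split(/\.\s*/)
  end

  def words
    scan(/\w[\w'\-]*/)
  end
end

class WordPlay
  def self.best_sentence(sentences, desired_words)
    sentences.sort_by do |s|
      s.words.length - (s.downcase.words - desired_words).length
    end.last
  end
end

class Bot
  def initialize(options)
    @data = options[:data]
  end

  # Gather the single-word string keys as "hot" words, then let
  # WordPlay choose the sentence that contains the most of them.
  def best_sentence(input)
    hot_words = @data[:responses].keys.select do |key|
      key.is_a?(String) && key.match?(/\A\w+\z/)
    end
    WordPlay.best_sentence(input.sentences, hot_words)
  end
end

data = { :responses => { :default => ["What?"], 'hello' => ["Hi!"],
                         'ruby' => ["Ruby is fun."], 'i like *' => ["Why?"] } }
puts Bot.new(data: data).best_sentence("the weather is nice. ruby is fun. bye")
# prints "ruby is fun"
```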
Looking for Matching Phrases
Now you have the sentence you want to parse and the substitutions have been performed. The next step is to find the phrases that are suitable as responses to the chosen sentence and to pick one at random.
possible_responses accepts a single sentence and then uses the string keys within the :responses hash to check for matches. Whenever the sentence has a match with a key from :responses, the various suitable responses are pushed onto the responses array. This array is flattened so that a single array is returned.
If no specifically matched responses are found, the default ones (found in :responses with the :default key) are used.
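A first version of possible_responses could look like this sketch. It builds a word-boundary pattern from each string key (with any asterisk removed) and falls back to the :default phrases. Escaping the key with Regexp.escape is a defensive choice made here, and the method is left public only so it can be tried directly.

```ruby
class Bot
  # Assumption for this sketch: data is supplied as an in-memory hash.
  def initialize(options)
    @data = options[:data]
  end

  def possible_responses(sentence)
    responses = []

    @data[:responses].each do |pattern, phrases|
      # Symbol keys such as :default are not patterns to match.
      next unless pattern.is_a?(String)

      # Drop the "*" placeholder before checking for a match.
      matcher = /\b#{Regexp.escape(pattern.delete('*').strip)}\b/
      responses << phrases if sentence.match?(matcher)
    end

    # Nothing matched, so use the default phrases instead.
    responses << @data[:responses][:default] if responses.empty?

    # Each push added a whole array, so flatten into one list.
    responses.flatten
  end
end

data = { :responses => { :default => ["What?", "Huh?"],
                         'hello' => ["How's it going?", "How do you do?"] } }
bot = Bot.new(data: data)
p bot.possible_responses("hello computer")   # prints the two 'hello' phrases
p bot.possible_responses("the sky is blue")  # prints the default phrases
```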
Putting Together the Final Phrase
This rule matches when the user says “I like.” The first possible response—“Why do you like *?”—contains an asterisk symbol that you’ll use to substitute in part of the user’s sentence in conjunction with the pronoun-switching method you developed in WordPlay earlier.
For example, a user might say, “I like to talk to you.” If the pronouns were switched, you’d get “You like to talk to me.” If the segment following “You like” were substituted into the first possible response, you’d end up with “Why do you like to talk to me?” This is a great response that compels the user to continue typing and demonstrates the power of the pronoun-switching technique.
Therefore, if the chosen response contains an asterisk (the character you’re using as a placeholder in response phrases), you’ll need to substitute the relevant part of the original sentence into the phrase and perform pronoun switching on that part.
This new version of possible_responses checks to see if the pattern contains an asterisk, and if so, extracts the correct part of the source sentence to use into matching_section, switches the pronouns on that section, and then substitutes that into each relevant phrase.
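The asterisk handling can be sketched as below. The name matching_section comes from the text; the regular expressions, the inline switch_pronouns, and the :data option are assumptions chosen so the example runs by itself.

```ruby
class WordPlay
  def self.switch_pronouns(text)
    text.gsub(/\b(I am|You are|I|You|Me|Your|My)\b/i) do |pronoun|
      case pronoun.downcase
      when "i" then "you"
      when "you" then "me"
      when "me" then "you"
      when "i am" then "you are"
      when "you are" then "I am"
      when "your" then "my"
      when "my" then "your"
      end
    end.sub(/^me\b/i, 'I')
  end
end

class Bot
  def initialize(options)
    @data = options[:data]
  end

  def possible_responses(sentence)
    responses = []
    @data[:responses].each do |pattern, phrases|
      next unless pattern.is_a?(String)
      prefix = Regexp.escape(pattern.delete('*').strip)
      next unless sentence.match?(/\b#{prefix}\b/)

      if pattern.include?('*')
        # Keep what follows the matched words, switch its pronouns,
        # and drop it into each phrase in place of the asterisk.
        matching_section = sentence.sub(/^.*?\b#{prefix}\s+/, '')
        switched = WordPlay.switch_pronouns(matching_section)
        responses << phrases.map { |phrase| phrase.sub('*') { switched } }
      else
        responses << phrases
      end
    end
    responses << @data[:responses][:default] if responses.empty?
    responses.flatten
  end
end

data = { :responses => { :default => ["What?"],
                         'i like *' => ["Why do you like *?"] } }
puts Bot.new(data: data).possible_responses("i like to talk to you")
# prints "Why do you like to talk to me?"
```

The block form of sub is used for the final substitution so that any backslashes in the user's text are inserted literally.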
Playing with the Bot
You have the basic methods implemented in the Bot class, so let’s play with it as is before looking at extending it any further. The first step is to prepare a better set of data for the bot to use so that your conversations can be more engaging than those with the dummy test data shown earlier in this chapter.
Fred: Your Bot’s Personality
If you run this with ruby bot_data_to_yaml.rb fred.bot, you’ll end up with a bot data file called fred.bot that contains the necessary data to converse with a basic bot.
The First Real Conversation
The full code for bot.rb is provided a little later in the chapter, so if you run into problems, check it out in case the code you have implemented is missing anything.
It’s almost the same as the script we thought of before implementing the Bot class. You used the method names dictated by that program and made it fit. (The completed source for the Bot class is provided in the next section if you want to refer to it.)
The bot works! The conversation might be a little stilted and manipulated to use some of the phrases and words covered by your dataset, but with this basic mechanism, and a dataset extended even further, significantly more complex conversations would be possible. Unfortunately, it’s outside the scope of this book to provide a large dataset.
In the next section, the final code for the basic bot is presented, and then you’ll see how you can extend the bot’s functionality further.
Main Bot Code Listing
This section makes available the full source code to the Bot class, bot.rb, including extra documentation that RDoc can use. Also included is the source to a basic bot client that you can use to converse with a bot on a one-on-one basis using the keyboard from the command line.
You will also need the WordPlay class we wrote earlier.
As this code is commented, as opposed to the examples so far in this chapter, I recommend you at least browse through the following code to get a feel for how the entire program operates as a set of parts.
You can also find these listings available to download in the Source Code/Download area of www.apress.com/.
bot.rb
basic_client.rb
You can find listings for basic web, bot-to-bot, and text file clients in the next section of this chapter, “Extending the Bot.”
Extending the Bot
One significant benefit of keeping all your bot’s functionality well separated within its own class and with multiple interoperating methods is that you can tweak and add functionality easily. In this section, we’re going to look at some ways we can easily extend the basic bot’s functionality to handle input sources other than the keyboard.
When you began to create the core Bot class, you looked at a sample client application that accepted input from the keyboard, passed it on to the bot, and printed the response. This simple structure demonstrated how abstracting separate sections of an application into loosely coupled classes makes applications easier to amend and extend. You can use this loose coupling to create clients that work with other forms of input.
When designing larger applications, it’s useful to keep in mind the usefulness of loosely coupling the different sections so that if the specifications or requirements change over time, it doesn’t require a major rewrite of any code to achieve the desired result.
Using Text Files As a Source of Conversation
This program accepts the bot’s name, data filename, and conversation filename as command-line arguments, reads in the user-side conversation into an array, and loops through the array, passing each line to the bot in turn.
Connecting the Bot to the Web
You also need to make sure you upload the bot.rb, wordplay.rb, and bot data file(s).
Bot-to-Bot Conversations
It’s not the greatest conversation ever seen, but it’s certainly entertaining to see two ersatz therapists getting along with each other. Of course, if you manage to develop two bots that actually have an engrossing conversation, you’ll be on the path to artificial intelligence stardom!
The key problem with your bot’s data is that none of the default data contains any keywords that can be picked up by other phrases, so both bots are locked in a loop of throwing default phrases at each other. That’s why it’s important to extend the basic set of data if you want to use the bot for anything that looks impressive!
Summary
In this chapter, we looked at developing a simple chatterbot, developed a library along the way, produced tests for the library, worked with storing our bot’s vocabulary in an external file, and looked at a number of ways to extend our project, such as feeding it conversations from text files, hooking it up to a website, or letting two bots talk to each other.
This chapter marks the end of the second part of this book, and you should now have enough Ruby knowledge to pass as a solid, yet still learning, Ruby developer. You should be able to understand the majority of Ruby documentation available online and be able to use Ruby productively either professionally or for fun.
Part 3 of this book digs a little deeper into Ruby’s libraries and frameworks, from Ruby on Rails and the Web to general networking and library use. Chapter 16, which looks at a plethora of different Ruby libraries and how to use them, will be particularly useful to refer to as you develop your own programs, so that you don’t reinvent the wheel too often!