CHAPTER 1. Exploring Bot Framework Architecture

When you pick up a new technology, it’s reasonable to look for what the benefit is. How does this toolset or framework help get the job done? Is it more productive than writing the same code from scratch? Why not create an app, website, or desktop program with existing (or familiar) technology? The goal of this chapter is to help answer some of these questions.

Beyond talking about the Bot Framework, this chapter starts with a discussion of what a chatbot is and why you might need to build one. This book takes you further into understanding and validating what you know. The Bot Framework has several major components, and this chapter shows you what they are, how they relate to each other, and their benefits.

What Is a Chatbot?

Though the Microsoft Bot Framework uses the term “bot,” this book uses the term “chatbot” to refer to the type of programs that the Bot Framework helps you build. This section explores what chatbots are, their benefits, and some pros and cons of whether a chatbot is appropriate for a given task.

Defining a Chatbot

There are many opinions, but no commonly agreed-upon formal definition of a chatbot. A couple of dictionaries mention bots, but don’t clearly capture the essence of what a modern-day chatbot is. The following definition might not find its way into a dictionary, though it contains a description of what a chatbot is, what it can do, and the type of platform it resides upon:

A chatbot is an application, often available via messaging platforms and using some form of intelligence, that interacts with a user via a conversational user interface (CUI).

Examining this definition in greater detail shows that a chatbot is an application, meaning it’s a program that a developer writes. The chatbot essentially takes input from a user, processes it, and responds to the user.

The next phrase, about being available via messaging platforms, is important because that’s where users find a lot of modern chatbots today. These messaging platforms include Skype, Messenger, Slack, and others. This matters because messaging applications are some of the most popular places for people to interact with each other. It makes sense, then, that applications, in the form of chatbots, are available where people normally are.

The definition includes intelligence because there is a whole set of Artificial Intelligence (AI) services available via Microsoft Cognitive Services and many third parties. Though it isn’t required, it’s common to see chatbots leveraging some type of AI. Notice that we use the term “often” with both messaging and intelligence to denote that they’re common, but not absolute or required.

The last part of the definition, about interacting with a user via CUI, is a distinguishing factor of chatbots. People communicate with chatbots via text and possibly even voice. Most of the chatbots in this book use text, but some use cards for quick interaction and others, as discussed in Chapter 15, Adding Voice Services, use Cortana Skills for a voice interface. The next section explains why conversation is so important.


Note

Although the community continues to enthusiastically debate the meaning of a chatbot, one thing most people seem to have coalesced around is to use the term “chatbot” rather than “bot.” In the past, bots have represented automated server processes, spiders, and web crawlers. The word is even shorthand for robot. Looking at the definition in this chapter and considering the potential for ambiguity, many people consider the term chatbot to be more descriptive for programs they interact with via plain language.


Why Conversation?

Conversation is an important characteristic of chatbots. Let’s dive deeper into what a conversation is and why it’s important.

Chatbots use Conversational User Interfaces (CUI) rather than Graphical User Interfaces (GUI). A GUI is a program that appears on screen and has graphical elements and pictures for mouse or touch interaction. CUI is text that a user types into a messaging app. Many of the chatbots available today are primarily text, but a chatbot can also use voice to interact with users. As you’ll learn, a chatbot can have some GUI elements, but a chatbot is still characterized primarily by its use of text to facilitate conversations.

Some software developers may look at this and say what’s old is new again because CUI sounds like the same thing as a command-line or console application. However, there’s a huge difference because CUI is conversational as opposed to command driven. Chatbots often use Natural Language Processing (NLP) to understand normal language text from users. On the other hand, a typical command-line application requires precise command syntax to understand what the user wants. You could certainly use NLP and converse with users in a command-line application, but then you’ve created a chatbot that happens to reside in a command-line application. In the past, command-line applications didn’t normally use NLP or engage in conversations, which are unique characteristics of chatbots. We say “normally” because there were several AI projects in the past that interacted with users via conversation, and it would be reasonable to consider those to be chatbots by the definition in this chapter.
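The contrast between command-driven and conversational input can be sketched in a few lines of Python. This is a minimal illustration only: the order command syntax, the item list, and the keyword matcher are all invented for this example, and the matcher is a toy stand-in for a real NLP service, not how any particular service works.

```python
import re

# Command-driven: the input must match an exact syntax or it is rejected.
def parse_command(line):
    match = re.fullmatch(r"order --item (\w+) --qty (\d+)", line.strip())
    if match is None:
        raise ValueError("usage: order --item <name> --qty <number>")
    return {"intent": "order", "item": match.group(1), "qty": int(match.group(2))}

# Conversational: a toy keyword matcher (hypothetical, not real NLP).
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3}
KNOWN_ITEMS = {"phone", "tablet", "laptop"}
ORDER_WORDS = {"buy", "order", "want", "need"}

def understand(utterance):
    words = re.findall(r"[a-z0-9]+", utterance.lower())
    # Accept simple plurals such as "phones" by trimming a trailing "s".
    item = next((w.rstrip("s") for w in words if w.rstrip("s") in KNOWN_ITEMS), None)
    # Take the first number, written as digits or as a known number word.
    qty = next(
        (int(w) if w.isdigit() else NUMBER_WORDS[w]
         for w in words if w.isdigit() or w in NUMBER_WORDS),
        1,
    )
    if item and any(w in ORDER_WORDS for w in words):
        return {"intent": "order", "item": item, "qty": qty}
    return {"intent": "unknown"}
```

Both functions produce the same structured result for an order, but only understand() accepts a sentence such as "I'd like to buy two phones, please"; parse_command() raises an error for anything that deviates from its syntax.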

The previous paragraphs explain how conversation makes chatbots different from earlier application types, namely command-line and GUI applications. To appreciate why that matters, think of how humans communicate with each other. They use conversation. Typing terse commands into a command-line application or touching a GUI screen are learned behaviors. Much has been researched and written on how to create great user experiences through GUIs. However, the core issue remains: people still have to learn how to use the interface. Not only do they have to learn the interface of one program, but they have to learn the interface of every other program they use. People familiar with computers often figure out how to use a GUI through common patterns across vendors or other conventions. However, non-technical people still encounter a learning curve, regardless of how intuitive a user interface might seem to technical people. The alternative, through chatbots, is conversation.

Conversation is natural. It’s what people have done to communicate since before recorded time. People start learning to communicate through conversation from the time they’re born and continue all their lives. Why can’t people use normal conversation when trying to get their computers to work? Now that goal is a lot closer because the primary interface of a chatbot is conversation.

Because messaging apps are so popular, entire generations are accustomed to interacting via text. Think of Skype messaging, Facebook Messenger, Slack, and many more applications that people use every day. In fact, people have been using chat applications and services, like AOL Instant Messenger, IRC, and SMS, for years. The advent of conversations with chatbots that provide service via text on messaging platforms is an advantageous and normal evolution of computing in the lives of many people.

Conversation is one of the benefits of chatbots and there are others, covered in the next section.

Chatbot Benefits

There are more reasons why chatbots are a good choice of platform for building applications, including: conversation, ease of deployment, device versatility, and platform independence.

Conversation

The previous section discussed conversation in-depth. Because conversation is natural, a chatbot is an ideal platform for creating an interface for anyone to use.


Note

In addition to conversation, messaging platforms have some GUI support. However, it’s minimal and supplementary to the conversational aspects of the chatbot.


Ease of Deployment

One problem with desktop applications or apps is all of the additional work associated with deployment. With desktop applications, you have to download and install the program. For apps, you visit a store and install the app on your device. A problem with websites is that they sometimes don’t work on a certain browser. The deployment experience for existing technologies has some subjective degree of friction that affects people in different ways.

In contrast, chatbots reside in messaging apps that people already use. The deployment process is as simple as inviting the bot into their workspace. There isn’t any heavy deployment or installation. People just say “Hi” to the chatbot and it begins communicating.

Any Device

Today, people often think of chatbots as something that resides in a messaging application. However, note that we used the term “often” in the earlier definition of a chatbot. This is intentional because there isn’t anything to say that a chatbot “must” reside in a messaging app. Further, there isn’t a requirement for interacting with a chatbot via typed text. The interaction could be voice.

Think of Cortana from Microsoft or Siri from Apple, each communicating via voice. Other commercial devices, such as Amazon’s Alexa or Google Home, also communicate with voice. Developers are even able to build chatbots that use voice, converted through text-to-speech and speech-to-text interfaces offered through APIs like the Cognitive Services offered by Microsoft.

Besides messaging, chatbots can reside in apps, websites, and desktop applications. Anyone could build their own hardware device that has audio/voice and communicates across the Internet to interact with a chatbot. A command-line or GUI app requires a screen, but a chatbot is more versatile because it isn’t constrained to traditional computing devices.

Platform Independence

Apps tend to adopt the design and functionality conventions of the platform they’re built for. For example, Universal Windows Platform (UWP) apps have their recommended design patterns, Google has Material Design, and Apple has its own recommendations. If you wanted to build a single app with a cross-platform tool like Xamarin, you would either throw away those conventions or tweak the UI to move it closer to the expectations of each platform. Furthermore, those platforms change over time, along with their conventions and capabilities, fragmenting the app base and confusing developers and users. The alternative for reaching different platforms is to build separate apps with technologies specific to those platforms, which is even more work. With the Microsoft Bot Framework, one chatbot serves all platforms.


Note

Xamarin is an excellent platform for developing cross-platform apps. The comments here serve to help illuminate the difference between apps and chatbots.


With chatbots you have one convention: plain language with the user. The conversations with the user are driven by the purpose of the chatbot and the desires of the user. Instead of following a call tree or hierarchy for a user to arrive at the desired functionality, users just ask the chatbot directly what they want. Logically, there needs to be some context for many types of requests, but again, the chatbot doesn’t have to climb back out of a hierarchy to answer the next question a user has. The chatbot isn’t tied to a corporate driven idea of what everyone’s user experience for a generation of a platform should be. The chatbot is free to converse based on how a company/developer builds it for intended users.

These are a few of the benefits of chatbots and there is much promise in their implementation. However, there might be times when a chatbot isn’t the right solution and the next section discusses that in more detail.

To Bot or Not

Most software developers who have been in the business for a while have seen their share of programming language debates. Some people enjoy these interactions of endless exchanges of bit, nuance, and theory, and it’s possible to learn a thing or two from the intellectual tidbits that occasionally surface in these threads. Often you’ll see a seasoned developer chime into these conversations and remind people that no programming language, platform, or tool is perfect for every scenario and that an informed approach is to consider what is the right tool for the job. In this spirit, there are valid reasons why a chatbot might not be the best solution to every problem. This section covers a few of them to give you an idea of when you might question using a chatbot while planning a project, including: need to change, appropriate UI, and criticality.

Need to Change

Consider that a certain app already provides a service that satisfies users. They’re happy, are trained to use the app, and the job is getting done. If a change is going to cause those users discomfort with little or no apparent benefit, why re-write it as a chatbot? On the other hand, if the users like new tools and there’s a clear benefit, a chatbot might be a possible solution.

Other times, you might have a lack of resources, such as no budget or not enough people to build the chatbot. Sometimes the technical desire to re-write an app as a chatbot doesn’t outweigh the business constraints to make what you have last a little longer. However, if you have the budget and the chatbot promises significant improvements, the decision could change.

Appropriate UI

A chatbot UI is conversational. Chapter 10, Attaching Cards, discusses some GUI elements you can use with a chatbot, but that’s limited. In particular, imagine needing to use a map where users need to pan, zoom, and perform other manipulations. It might be possible to talk to the chatbot and get it to re-render images, but that would be cumbersome for the user.

Another area where a chatbot wouldn’t be logical is as an augmented/virtual reality (AR/VR) application. It might sound obvious, but the point here is to mention a few items to facilitate a thinking process of whether a chatbot is appropriate for a scenario. In this case, a chatbot might not be the program, but maybe an AR program might have a character that is a chatbot.

Criticality

One of the things about working with chatbots is that conversations can be hard to design. There are so many different ways to say something, and so many different directions a conversation can take. By constraining the domain, it’s possible to manage this complexity. However, there are real-life situations where it isn’t practical to constrain a conversation, such as a 911 service, hospital emergency rooms, or first responder coordination. These are all situations that are unpredictable and conversations could go in any direction. In an emergency, it could also be catastrophic if a chatbot were unable to understand what a person is saying. Given all the emotion, unpredictability, and complexity of these situations, a chatbot is probably not the correct solution.

That said, a chatbot might not be the right solution today, but as technology and tools advance, a chatbot might not only be capable, but also desired. If the chatbot had the ability to properly navigate a conversation, it might actually be safer by not allowing emotion and other human mistakes to make incorrect decisions. A similar discussion is being had for autonomous cars right now in that they might not be 100 percent reliable today, but are likely to be much safer than human drivers in the future.

Bot Framework Architecture

When working with the Bot Framework, it’s important to know how the components fit together. For example, what paths does a message take, what does the bot communicate with, and which services are available? Throughout this book, you’ll read about these components, and having a visual of everything can help with the technical details of a concept. This section takes a high-level view and gradually digs into each component. You’ll learn about the relationships between each component and the services that each component offers.

Visualizing Chatbots, Connector, and Channels

A bird’s-eye view shows the major components of the Bot Framework: channels, the Bot Connector, and the chatbot. Figure 1-1 shows the communication flows between Bot Framework components.

FIGURE 1-1 Channels communicate with the Bot Connector and the Bot Connector communicates with the chatbot.

Channels are apps, like Skype or Facebook Messenger, that users communicate with your chatbot through. The Bot Connector is a Microsoft cloud component that channels send messages to and receive messages from. The Bot Connector, in turn, exchanges messages with the chatbot. The chatbot component is something the developer builds and this book goes into detail on how to do that.

Each of these components (channel, Bot Connector, and chatbot) offers its own features and services, which you learn about in the following sections.
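The message flow described above can be pictured with a small in-process simulation. This is only a sketch: the class and method names are hypothetical and are not Bot Framework types, and the real components exchange messages across the Internet rather than through direct method calls.

```python
class Chatbot:
    """Stands in for the application a developer writes."""

    def on_message(self, text):
        return f"You said: {text}"


class BotConnector:
    """Stands in for the cloud component that sits between channels and chatbot."""

    def __init__(self, chatbot):
        self.chatbot = chatbot
        self.log = []  # records each hop so the path of a message is visible

    def route(self, channel_name, text):
        self.log.append(("channel->connector", channel_name, text))
        reply = self.chatbot.on_message(text)
        self.log.append(("connector->channel", channel_name, reply))
        return reply


class Channel:
    """Stands in for a messaging app such as Skype or Facebook Messenger."""

    def __init__(self, name, connector):
        self.name = name
        self.connector = connector

    def user_says(self, text):
        # The channel never talks to the chatbot directly.
        return self.connector.route(self.name, text)


connector = BotConnector(Chatbot())
skype = Channel("skype", connector)
slack = Channel("slack", connector)
```

Calling skype.user_says("Hi") and slack.user_says("Hello") reaches the same Chatbot instance through the same connector, which mirrors the idea that one chatbot serves many channels.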

Overview of Channels

A channel is often associated with a messaging app, like Skype, Slack, or Facebook Messenger. While this is true, a channel can also be any program that sends and receives messages to and from the Bot Connector, as illustrated in Figure 1-2.

FIGURE 1-2 The Microsoft Bot Framework supports several third-party channels, email, SMS, and websites.

In addition to messaging applications, the Bot Framework supports email, SMS, and websites. The Bot Framework offers a webchat control that resides in a web page. Chapter 11, Configuring Channels, describes how to set up channels to work with a chatbot and Chapter 12, Creating Email, SMS, and Web Chatbots, shows how to set up and use email, SMS, and the webchat control.

Essentially, the channel represents a place where a user can interact with a chatbot. Microsoft integrates with several third-party apps and has its own channels, but chatbots aren’t limited as to what type of channel they can offer a user. Imagine a company that needs to expose a chatbot through its own application. That’s possible because the Bot Framework supports building custom channels and you can learn more about that in Chapter 14, Coding Custom Channels with the Direct Line API.

Bot Connector Services

Previous sections showed how the Bot Connector facilitates communication between channels and chatbots, which is called routing. However, there are more considerations for the Bot Connector’s role, as shown in Figure 1-3.

FIGURE 1-3 The Bot Connector offers several different services.

Besides routing, the Bot Connector stores state, which is custom information a chatbot can save. A chatbot is able to store custom information for a conversation, a user, or a user within a conversation. Each of these types of state can store up to 32 KB. Additionally, the Bot Framework will serialize the contents of Dialog types for storage in the Bot Connector state service.
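To make the three state scopes concrete, here is a minimal in-memory sketch. It is not the Bot Connector state service or its API: the class, the key helpers, and the way the size is measured (JSON-encoded bytes, with 32 KB read as 32 * 1024 bytes) are all assumptions made for illustration.

```python
import json

# Assumption: the 32 KB limit is interpreted as 32 * 1024 bytes of JSON.
MAX_STATE_BYTES = 32 * 1024


class InMemoryStateStore:
    """Hypothetical stand-in for a state service, keyed by scope."""

    def __init__(self):
        self._bags = {}

    def save(self, key, data):
        size = len(json.dumps(data).encode("utf-8"))
        if size > MAX_STATE_BYTES:
            raise ValueError(f"state is {size} bytes; limit is {MAX_STATE_BYTES}")
        self._bags[key] = data

    def load(self, key):
        return self._bags.get(key, {})


# One key builder per scope described in the text.
def user_key(channel_id, user_id):
    return ("user", channel_id, user_id)


def conversation_key(channel_id, conversation_id):
    return ("conversation", channel_id, conversation_id)


def user_in_conversation_key(channel_id, conversation_id, user_id):
    return ("user-in-conversation", channel_id, conversation_id, user_id)
```

Saving a preference under user_key("skype", "u1") makes it visible in every conversation that user has on that channel, while data saved under user_in_conversation_key("skype", "c1", "u1") is visible only to that user in that one conversation.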

Chatbots communicate with the Bot Connector via messages called activities. These activities can be text between the user and chatbot or other conversational events, like updating a conversation. These activities have various identifiers for a channel, conversation, and users. While the Bot Connector performs routing of these activities, it’s important to note that the Bot Connector doesn’t manage identifiers. Conversations and users, along with associated identities, are managed by channels. What the Bot Connector does manage is the wire format of messages between a channel and chatbot. Bot Framework chatbots always receive messages in the form of an Activity, deserialized from a JSON string object.
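As a rough picture of what deserializing an activity involves, the following sketch parses a simplified JSON message into a small Python object. The field names (type, channelId, conversation, from, text) are assumed here to follow the general shape of an activity and the sample values are invented; treat the structure as illustrative and check the Bot Framework documentation for the full schema.

```python
import json
from dataclasses import dataclass


@dataclass
class Activity:
    """A deliberately small subset of what a real activity carries."""
    type: str
    channel_id: str
    conversation_id: str
    from_id: str
    text: str


def parse_activity(raw_json):
    doc = json.loads(raw_json)
    return Activity(
        type=doc["type"],
        channel_id=doc["channelId"],
        conversation_id=doc["conversation"]["id"],
        from_id=doc["from"]["id"],
        text=doc.get("text", ""),  # non-message activities may carry no text
    )


# Invented sample payload for illustration.
SAMPLE = """
{
  "type": "message",
  "channelId": "skype",
  "conversation": {"id": "conv-123"},
  "from": {"id": "user-456", "name": "Pat"},
  "text": "Hi"
}
"""

activity = parse_activity(SAMPLE)
```

Note that the identifiers for the channel, conversation, and user arrive inside the message. The chatbot reads them; it doesn't create them.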

An interesting technical feature of the Bot Connector is that its interface is a REST API. The implication is that, while the Bot Framework SDK is written in C# and resides on Microsoft Azure, the Bot Connector is platform agnostic. Anyone can build a custom channel that communicates via the Direct Line API. While the Bot Framework SDK supports C# (.NET programming languages) and node.js, the Bot Connector supports any programming language because it exposes a Connector REST API. There’s also a State REST API that any programming language can access to manage chatbot state.
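Because the interface is plain HTTP and JSON, a language without an SDK can still take part. The sketch below uses only the Python standard library to build (but not send) a reply request. The URL path is an assumption modeled on the version 3 Connector REST API, and the service URL, identifiers, and token are placeholders; verify the route and the authentication flow against the current REST API reference before relying on them.

```python
import json
import urllib.parse
import urllib.request


def build_reply_request(service_url, conversation_id, activity_id, text, token):
    """Build an HTTP request that replies to an activity (assumed route)."""
    path = "/v3/conversations/{}/activities/{}".format(
        urllib.parse.quote(conversation_id, safe=""),
        urllib.parse.quote(activity_id, safe=""),
    )
    body = json.dumps({"type": "message", "text": text}).encode("utf-8")
    return urllib.request.Request(
        service_url.rstrip("/") + path,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + token,  # placeholder token
        },
        method="POST",
    )


# Placeholder values; nothing is sent over the network here.
request = build_reply_request(
    "https://connector.example.com/", "conv-123", "act-789", "Hello!", "TOKEN"
)
```

Passing the request to urllib.request.urlopen() would send it. The example stops short of that because a real call needs a valid service URL and access token.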


Tip

The Bot Framework SDK is an open-source project, hosted on GitHub, which is located at https://github.com/Microsoft/BotBuilder. You can clone this code, see how it works, and interact with the Microsoft Bot Framework team. We’ve seen a couple related open source projects in Java and Python, and there are more.


Characteristics of a Chatbot

The chatbot serves whatever purpose its creator decides upon. There’s currently a growing list of chatbots for nearly any imaginable domain, such as entertainment, information, retail, gaming, team management, and more. This book has several examples of chatbots and you can see more in the Skype bot directory. After forming an idea of what the bot should do, the next step is to start thinking about the design of the chatbot. Figure 1-4 shows a few characteristics of a chatbot that can help.

FIGURE 1-4 Chatbots have unique characteristics that guide their design and development.

From the time a user says ‘Hi’ until the conversation completes, the chatbot is responsible for the flow of the conversation. Certainly, the user drives the conversation and that conversation can go in a number of directions, many of which the chatbot was never designed for. For example, if a chatbot is built to sell phones and the user starts asking about desktop computers, the chatbot might not be able to talk about desktop computers if the developer didn’t program it to do so. It’s common for conversations to get off track and that means it takes significant thought to design a workflow that handles not only the path that the chatbot knows about, but to gracefully handle alternate paths. The definition of graceful depends on the developer and the nature of the application, though the need to map different conversation paths is a prime consideration when designing a chatbot.
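One way to think about handling alternate paths is a small conversation object that remembers its current step and redirects off-topic input without losing its place. The phone models, prompts, and class below are invented for illustration and don't use any Bot Framework types; this is one possible reading of "graceful," not a recommended design.

```python
import re

PHONE_MODELS = {"basic", "pro"}  # hypothetical product line


class PhoneSalesConversation:
    def __init__(self):
        self.step = "choose_model"
        self.model = None

    def _fallback(self, prompt):
        # Off-track input: acknowledge the limit, then repeat the current prompt.
        return "Sorry, I can only help with phones. " + prompt

    def handle(self, utterance):
        words = set(re.findall(r"[a-z]+", utterance.lower()))
        if self.step == "choose_model":
            chosen = sorted(PHONE_MODELS & words)
            if chosen:
                self.model = chosen[0]
                self.step = "confirm"
                return f"Great, the {self.model} model. Shall I place the order? (yes/no)"
            return self._fallback("Which phone would you like: basic or pro?")
        if self.step == "confirm":
            if "yes" in words:
                self.step = "done"
                return f"Your {self.model} phone is ordered."
            if "no" in words:
                self.step = "choose_model"
                self.model = None
                return "No problem. Which phone would you like: basic or pro?"
            return self._fallback("Shall I place the order? (yes/no)")
        return "This conversation is complete. Say hi to start again."
```

If the user asks about desktop computers at either step, the conversation stays on the same step and repeats its question, so the known path can resume as soon as the user returns to it.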

Besides the unique requirements of managing conversations, a chatbot should be designed like any other computer program. In general terms, there are normally different layers of an application, which correspond to the design philosophy of the developer (or team) doing the work. In most applications, there is a user interface layer, which could be an HTML page for websites, a window for desktop applications, or a touch screen for phone apps. These are typically graphical user interfaces (GUI). However, a chatbot is primarily a conversational user interface (CUI). From an architectural perspective, the chatbot is the user interface. The code that manages the conversation is part of the user interface (or presentation layer). The developer builds code that the user interface calls to perform business logic, just like layered applications in other technologies. In this book, some examples skip layered design to keep the code simple, while others separate chatbot CUI handling from business logic. Developers are free to design their chatbots any way they want; the concepts here are ideas to help you think about how a chatbot could be designed.
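The layering idea can be sketched by keeping the conversation code and the business logic in separate functions. Everything here is hypothetical, including the price list and the wording of the replies; the point is only that the business function knows nothing about conversations and the conversational function contains no pricing rules.

```python
# Business logic layer: no knowledge of chat, text parsing, or prompts.
PRICES = {"basic": 199.0, "pro": 499.0}  # invented price list


def quote_price(model, quantity):
    if model not in PRICES:
        raise KeyError(model)
    if quantity < 1:
        raise ValueError("quantity must be at least 1")
    return PRICES[model] * quantity


# CUI (presentation) layer: turns text into a call and a result into text.
def handle_message(text):
    words = text.lower().split()
    model = next((w for w in words if w in PRICES), None)
    quantity = next((int(w) for w in words if w.isdigit()), 1)
    if model is None:
        return "Which model would you like a price for: basic or pro?"
    try:
        total = quote_price(model, quantity)
    except ValueError:
        return "Please ask for at least one phone."
    return f"{quantity} x {model} costs ${total:.2f}."
```

With this split, quote_price() could be reused from a website or tested on its own, and the conversational wording in handle_message() could change without touching the pricing code.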

The final section of this chapter takes a step back and examines the inherent distributed architecture of a chatbot, providing information that affects chatbot design.

Chatbot Communications

A lot of applications are responsive because they reside on the same device or have tiers (e.g. database) on the same LAN. Even websites perform decently because they mostly pass text between a browser and a server, which contains most of the application. Chatbots are a bit different because they’re built on a distributed architecture. Figure 1-5 shows the communication paths between channel, Bot Connector, chatbot, and additional services.

FIGURE 1-5 Each component in the Bot Framework communicates across the Internet.

Chatbots are distributed applications that communicate across the Internet, which raises concerns around performance and scalability. Each of the arrows between objects in Figure 1-5 represents a message being passed between components and, more specifically, each of those arrows represents communication across the Internet. The communication occurring between channels, Bot Connector, and chatbot is a normal part of the Bot Framework architecture. If a user has a slow Internet connection, their experience suffers. Alternatively, if the Internet connection is good, the user experience is better.

Figure 1-5 also contains something else that developers must be aware of: the services (labeled Service 1 and Service 2) that the chatbot uses. It is normal to call external services, like Language Understanding Intelligent Service (LUIS) for natural language processing (NLP). There might be a couple of other services of interest, such as those offered by Microsoft Cognitive Services. If those services are necessary, by all means use them. However, be aware of their nature and their potential to affect a chatbot’s performance and scalability.
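One common way to limit the effect of a slow external service is to put a time limit on the call and fall back to a default answer. The sketch below simulates this with the Python standard library; the two service functions are stand-ins that merely sleep, the timeout values are arbitrary, and a real chatbot would use whatever timeout support its HTTP client provides.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

_executor = ThreadPoolExecutor(max_workers=4)


def call_with_timeout(service, text, timeout_s, fallback):
    """Run service(text) in a worker thread; give up after timeout_s seconds."""
    future = _executor.submit(service, text)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        # The worker keeps running in the background; we just stop waiting.
        return fallback


# Simulated external services (hypothetical stand-ins for remote calls).
def fast_language_service(text):
    return "intent:greeting"


def slow_language_service(text):
    time.sleep(0.3)  # pretend the network or the service is slow
    return "intent:greeting"
```

Calling call_with_timeout(slow_language_service, "Hi", 0.05, "intent:unknown") returns the fallback instead of keeping the user waiting, which trades some accuracy for responsiveness. Whether that trade is acceptable depends on the chatbot.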

Summary

The definition of a chatbot involves many features, including being another type of application, potential messaging platforms and intelligence services, and an inherent conversational nature. There are several benefits of using a chatbot over other application types like ease of deployment, multiple devices, and platform independence.

The Bot Framework architecture has components for channels, Bot Connector, and chatbots. The channels are the interfaces, such as messaging apps, websites, email, or SMS where a user interacts with a chatbot. The Bot Connector offers a set of routing and state management services. A chatbot is the application that a developer writes and contains the logic for interacting with the user.

Now you have the core knowledge needed to understand all the major components of the Bot Framework and where the chatbot fits in. The next chapter, Chapter 2, Setting Up a Project, shows how to build a simple chatbot and how to use the basic services of the Bot Framework SDK.
