Chapter 4. The Extra-Conversational Context

A core pillar of building an effective voicebot is the depth of information the designer possesses about the broader context within which the conversation takes place. The more the designer knows about, or is able to anticipate, this context while designing the voicebot, the more effective an experience they can design and deliver.

To differentiate this type of context from the one we covered in Chapter 1, which dealt with the context of a conversation in progress (for instance, who owns the conversational turn, whether the conversation is active or paused), we are calling this context Extra-Conversational.

By Extra-Conversational context we mean information such as: the state of the user, the physical conditions of the conversation, the social context within which the conversation is taking place, the recent context, known patterns of the individual user from past interactions with the voicebot, and known user-base patterns.

1. The State of the User

The Emotional State: In scenarios and use cases where we can infer something about the emotional state of the user (experiencing a power outage, inquiring about lab test results), design with that emotional state in mind. For instance, if users are likely to be agitated or under stress, a happy, bouncy voice with a cheerful jingle is probably not the appropriate persona. What is needed instead is a serious, low-key voice that echoes the sense of seriousness and urgency the user is probably experiencing.

Linguistic competence: If the voicebot speaks English and knows that the user is not a native speaker of the language (for instance, the voicebot is helping students learn English), the designer should write prompts that avoid expressions that trip up non-native speakers.

For example, non-native speakers are easily tripped up by phrasal verbs (verbs combined with adverbial particles, such as "hold on" or "pull up").

Instead of this:

  • Voicebot: Please hold on one second as I pull up your information.

Design this:

  • Voicebot: Please wait as I get your information.

Level of familiarity with engaging with the voicebot in question: Are users likely to be frequent or infrequent users of the voicebot?

The frequent user knows how to interact with the voicebot: what to say and how to interpret what the voicebot says. The infrequent user will need hand-holding. Build a mechanism that tracks whether the user is new and, if not, how often they use the voicebot, when they last used it, and how extensively, and act on that information. This will help you avoid annoying the frequent user with information they already know, and avoid frustrating the infrequent user by withholding the instructions they need to engage successfully with the voicebot.

With the first time or infrequent user:

  • Voicebot: <Chime>. Fairfax Power. Which of the following do you want me to help you with: report a power outage, ask a question about your bill, or something else?

  • Human: I have a question about my bill.

With a frequent user:

  • Voicebot: <Chime>. Fairfax Power. Is this about an outage, your bill, or something else?

  • Human: My bill.
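The tracking mechanism described above can be sketched in a few lines. This is a minimal illustration, not a production design: the `UserHistory` record, the session-count and recency thresholds, and the Fairfax Power prompt wording are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

@dataclass
class UserHistory:
    """Hypothetical per-user record of past voicebot usage."""
    total_sessions: int = 0
    last_session: Optional[datetime] = None

def pick_greeting(history: UserHistory, now: datetime) -> str:
    """Terse greeting for frequent, recent users; verbose hand-holding
    otherwise. The thresholds (5 sessions, 30 days) are illustrative."""
    is_frequent = (
        history.total_sessions >= 5
        and history.last_session is not None
        and now - history.last_session <= timedelta(days=30)
    )
    if is_frequent:
        return ("Fairfax Power. Is this about an outage, "
                "your bill, or something else?")
    return ("Fairfax Power. Which of the following do you want me to "
            "help you with: report a power outage, ask a question "
            "about your bill, or something else?")
```

Whatever thresholds you choose, the point is that the decision is made once, up front, from stored history rather than mid-conversation.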

Technical savviness: Are users at ease with technology in general, or do they come to the voicebot with a general sense of anxiety ("Here we go: one more gadget to deal with!")? Knowing this will enable you not only to design a bot that is gentle, but also to ensure that, prior to engaging the voicebot, the person is given information that tells them what voicebots are and what they do. Group training sessions and easy-to-follow videos are effective ways to provide such information.

The physical readiness: Voicebots talk and listen and expect their partners to be able to talk and listen in a flowing, back-and-forth exchange. So one of the effective voicebot's concerns should be to monitor whether the user is able to talk and listen comfortably. Usually, this information can be gathered through field research, but it can also be inferred from how much difficulty users are having with the voicebot. For example: the user is speaking but the voicebot can't hear them; or the user keeps asking the voicebot to repeat itself because they can't seem to hear it (maybe the environment is noisy, maybe the user is hard of hearing, or maybe the voicebot asked for a credit card number that the user doesn't have handy, and they are asking the voicebot to repeat itself to buy time while they fumble to retrieve it).

Instead of this:

  • Voicebot: Ok. To help you with your bill, I need your account number. Please go ahead and give it to me, one digit at a time.

  • Human: [Human doesn’t speak].

  • Voicebot: Sorry. I didn’t hear you. I need your account number. Please go ahead and give it to me, one digit at a time.

Design this:

  • Voicebot: Ok. To help you with your bill, I need your account number. Are you ready to give it to me?

  • Human: Er.. No. Hang on. I’m getting it.

  • Voicebot: Ok. I will wait. When you are ready, just say, “Hey Google, I am ready.” [Voicebot plays wait music]

Task-relevant properties: For example, if the voicebot is helping users engage with their bank, it would be useful for the voicebot to determine whether the user is a student with a low balance who usually engages the voicebot to make sure that balance is not dangerously low, or a homeowner who never asks about overdraft protection and may instead be interested in refinancing.

2. The Physical Context

When the voicebot is engaging the user in a noisy environment (for instance, in a car), it should try to detect whether the user is speaking from surroundings where they cannot speak naturally (they would be speaking loudly to overcome the noise around them). Usually, this can be inferred from the frequency of "out of scope" events during an interaction: the user says something that the voicebot cannot map to anything it expects to hear. The more such events occur, the more likely it is that the interaction is taking place in a noisy environment. In such cases, the voicebot should explicitly alert the user that it is not able to hear them well, so that the user can act on the complaint when they are able to do something about it (for instance, lower the volume of the music or mute the TV).

Instead of this:

  • Human: My account number is 1224314

  • Voicebot: Sorry, I didn’t understand you. What’s your account number again?

  • Human: That’s, 1224314

  • Voicebot: Still not getting it. Can you try it one more time?

Design this:

  • Human: My account number is 1224314

  • Voicebot: Sorry, I didn’t understand you. What’s your account number again?

  • Human: That’s, 1224314

  • Voicebot: Still not getting it. You may be in a noisy environment or perhaps the connection is not good. If you are in a noisy place, can you go somewhere quieter?
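One minimal way to implement the out-of-scope heuristic above is a failure counter that resets on every successful recognition and escalates to the noise alert after too many consecutive misses. The threshold of two failures and the prompt wording are illustrative assumptions.

```python
from typing import Optional

class NoiseMonitor:
    """Count consecutive "out of scope" events; a run of them suggests a
    noisy environment or a poor connection."""

    def __init__(self, threshold: int = 2):
        self.threshold = threshold  # illustrative: escalate on 2nd miss
        self.failures = 0

    def record_turn(self, understood: bool) -> Optional[str]:
        """Call after each user turn. Returns a reprompt when the turn
        was not understood, or None when it was."""
        if understood:
            self.failures = 0
            return None
        self.failures += 1
        if self.failures >= self.threshold:
            return ("Still not getting it. You may be in a noisy place or "
                    "perhaps the connection is not good. If you are in a "
                    "noisy place, can you go somewhere quieter?")
        return "Sorry, I didn't understand you. Can you say that again?"
```

The same counter can also feed analytics: a session with many escalations is evidence, after the fact, that the physical context was hostile.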

3. The Social Context

Unlike most information technologies (smartphones, for instance), smart speakers are often communal devices: many people have access to the same device, whether in the home, in the car, in a hotel lobby or room, or in the office. As a result, given that both the user and the voicebot must speak openly and can be overheard by others, knowing whether a given interaction can be heard by people other than the person the voicebot is engaging could be a critical piece of information, especially when the information being exchanged is private or sensitive. The designer should tag sensitive information as such, and when information carries that tag, the voicebot should alert the user and ask for permission before speaking it out loud.

Instead of this:

  • Voicebot: <Chime>. HMU Grades Gateway. Welcome back, Drew. Which class?

  • Human: Postmodern philosophy.

  • Voicebot: I have your final exam results. You have passed with a grade of 93.

Design this:

  • Voicebot: <Chime>. HMU Grades Gateway. Welcome back, Drew. Which class?

  • Human: Postmodern philosophy.

  • Voicebot: Your final exam results are ready. Do you want me to speak them to you out loud now?

  • Human: No.

  • Voicebot: Ok. I have sent you an email with instructions on how to access your results.
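The tagging logic behind this dialogue can be sketched as follows. `BotResult`, the three-way consent state, and the email fallback are assumptions for illustration; the key idea is that sensitivity is a property attached to the data by the designer, not decided ad hoc at speech time.

```python
from typing import NamedTuple, Optional

class BotResult(NamedTuple):
    text: str          # what the voicebot would say
    sensitive: bool    # tagged by the designer as private information

def next_prompt(result: BotResult, consent: Optional[bool]) -> str:
    """Decide what to speak. Sensitive results require explicit consent;
    consent=None means the voicebot has not asked yet."""
    if not result.sensitive:
        return result.text
    if consent is None:
        return ("Your results are ready. Do you want me to speak them "
                "to you out loud now?")
    if consent:
        return result.text
    return ("Ok. I have sent you an email with instructions on how to "
            "access your results.")
```

Non-sensitive results skip the consent question entirely, so the extra turn is only paid where privacy is actually at stake.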

4. The Recent Context

A low-hanging-fruit source of information that the voicebot can leverage is the set of actions the user may have taken, or events that may have happened, that are relevant to the conversation between the voicebot and the human.

Here are some examples:

  • The user just took an exam and the results of that exam are now ready. Knowing this, the voicebot may offer to provide them with those results.

  • The user recently logged a new trouble ticket: knowing this, the voicebot may volunteer to give the user information about that ticket.

  • The user’s bank balance is running low or there was an overdraft: knowing this, the voicebot may alert the user and offer to sign them up for the overdraft protection program.

  • The user just downgraded or cancelled the service: knowing this, the voicebot may make the user a special offer to upgrade or re-subscribe.

Instead of this:

  • Voicebot: <Chime>. HMU Grades Gateway. Welcome back, Drew. Which class?

  • Human: Postmodern philosophy.

  • Voicebot: Your final exam results are ready. Do you want me to speak them to you out loud now?

  • Human: Yes.

  • Voicebot: You have passed with a grade of 93.

Design this:

  • Voicebot: <Chime>. HMU Grades Gateway. Welcome back, Drew. Your Postmodern Philosophy exam results are in. Do you want me to speak them to you out loud now?

  • Human: Yes.

  • Voicebot: You have passed with a grade of 93.

5. User Patterns

A rich and actionable source of information is the pattern of behavior a user has followed in the past, especially behavior that is time-related. For instance, if the user engages the voicebot every Saturday morning to find out their balance, then whenever the user engages the voicebot on a Saturday morning, the voicebot could offer to help them obtain their balance in the opening prompt.

On Saturdays with this particular user, instead of this:

  • Voicebot: <Chime>. First Capital. I can help you check your balance, transfer funds, or something else.

  • User: Check balance.

  • Voicebot: From which account: “Checking,” “Savings,” or “Money Market”?

  • User: Savings.

Design this:

  • Voicebot: <Chime>. First Capital. Do you want me to give you your Savings balance?

  • Human: Yes.
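Detecting this kind of time-anchored habit can be sketched with a simple frequency count over past sessions, bucketed by weekday and coarse day-part. The bucketing scheme, the three-session minimum, and the intent labels are assumptions made for the example.

```python
from collections import Counter
from datetime import datetime
from typing import List, Optional, Tuple

def habitual_intent(past_sessions: List[Tuple[datetime, str]],
                    now: datetime, min_count: int = 3) -> Optional[str]:
    """Return the intent this user most often requests in the current
    (weekday, day-part) slot, or None if no habit is established yet."""
    def slot(dt: datetime) -> Tuple[int, int]:
        # weekday 0-6 plus three coarse day-parts (night/day/evening)
        return (dt.weekday(), dt.hour // 8)
    counts = Counter(intent for dt, intent in past_sessions
                     if slot(dt) == slot(now))
    if not counts:
        return None
    intent, n = counts.most_common(1)[0]
    return intent if n >= min_count else None
```

When this returns an intent, the opening prompt can lead with it; when it returns None, the voicebot falls back to the generic menu, so an occasional Saturday caller is never railroaded into the shortcut.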

6. User-base Patterns

Another rich and actionable source of information is data about what other users have been engaging the voicebot about. Such information is readily available and, if used judiciously, could delight the user.

An example: if on Sunday mornings the majority of interactions between users and the voicebot pertain to store hours, the voicebot can offer that information before moving on to the usual menu.

On Sundays, with all users, instead of this:

  • Voicebot: <Chime>. The Two Dollar Store. I can give you the address of our location, our hours, or connect you with someone. Which one would you like?

  • Human: Store hours.

  • Voicebot: Our hours today are from 10:00 am to 6:00 pm. Did you need anything else?

  • Human: No.

Design this:

  • Voicebot: <Chime>. The Two Dollar Store. Our hours today are from 10:00 am to 6:00 pm. Anything else?

  • Human: No.
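The user-base version of the pattern is even simpler: look at recent traffic for the current time window and check whether a single intent dominates. The majority-share threshold and intent labels below are illustrative assumptions; the caller is assumed to have already filtered the intents to the relevant window (e.g., Sunday mornings).

```python
from collections import Counter
from typing import List, Optional

def popular_intent(recent_intents: List[str],
                   share_threshold: float = 0.5) -> Optional[str]:
    """If one intent accounts for more than share_threshold of the
    recent traffic in this window, surface its answer up front."""
    if not recent_intents:
        return None
    intent, n = Counter(recent_intents).most_common(1)[0]
    return intent if n / len(recent_intents) > share_threshold else None
```

Used judiciously, this lets the Two Dollar Store bot answer the Sunday-morning question before it is asked, while leaving the menu untouched whenever traffic is mixed.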
