Preparing dataset

To build a chatbot with the TF-IDF approach, we first need a data structure that holds the training data together with its labels. Take the example of a chatbot built to answer questions from users. Using historical data, we can form a dataset with two columns: the first holds the question and the second holds the answer to that question, as shown in the following table:

| Question | Answer |
|---|---|
| When does your shop open? | Our shop timings are 9:00 am - 9:00 pm on weekdays and 11:00 am - 12:00 midnight on weekends. |
| What is today's special? | Today we have a variety of Italian pasta with special sauce, and many more options in the bakery. |
| What is the cost of an americano? | Americano with a single shot will cost $1.40 and the double shot will cost $2.30. |
| Do you sell Ice-creams? | We do have desserts like ice-cream, brownies, and pastries. |


Let’s treat the previous example as a sample dataset. It is very small; in a real scenario we would have a much larger dataset to work with. The typical process is as follows: the user interacts with the bot and writes an arbitrary query about the store. The bot sends that query to the NLP engine through an API, and the NLP model then decides what to return for the new query (the test data). With reference to our dataset, the questions are the training data and the answers are the labels. For a new query, the TF-IDF algorithm matches it to one of the stored questions with a confidence score, which tells us how close the user's question is to that question in the dataset. The answer stored against the matched question is the answer that our bot returns.
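The matching step described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not a production NLP engine: the names `FAQ`, `tokenize`, `tfidf`, `cosine`, and `best_match` are hypothetical, the tokenizer is a simple regular expression, the IDF formula uses one common smoothed variant, and cosine similarity stands in for the confidence score.

```python
import math
import re
from collections import Counter

# Sample rows from the table: (question, answer) pairs.
FAQ = [
    ("When does your shop open?",
     "Our shop timings are 9:00 am - 9:00 pm on weekdays and "
     "11:00 am - 12:00 midnight on weekends."),
    ("What is today's special?",
     "Today we have a variety of Italian pasta with special sauce, "
     "and many more options in the bakery."),
    ("What is the cost of an americano?",
     "Americano with a single shot will cost $1.40 and the double shot "
     "will cost $2.30."),
    ("Do you sell Ice-creams?",
     "We do have desserts like ice-cream, brownies, and pastries."),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


# The questions are the training data; tokenize them once up front.
questions = [tokenize(q) for q, _ in FAQ]

# Document frequency: how many stored questions contain each word.
doc_freq = Counter(word for tokens in questions for word in set(tokens))


def idf(word):
    """Smoothed inverse document frequency: rarer words score higher."""
    return math.log((1 + len(questions)) / (1 + doc_freq[word])) + 1


def tfidf(tokens):
    """Map each word to (term frequency) * (inverse document frequency)."""
    counts = Counter(tokens)
    return {w: (c / len(tokens)) * idf(w) for w, c in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


question_vectors = [tfidf(tokens) for tokens in questions]


def best_match(query):
    """Return (matched question, its answer, similarity score)."""
    query_vector = tfidf(tokenize(query))
    scores = [cosine(query_vector, v) for v in question_vectors]
    best = max(range(len(scores)), key=scores.__getitem__)
    return FAQ[best][0], FAQ[best][1], scores[best]


question, answer, score = best_match("Do you have ice cream?")
print(f"{score:.2f}  {question} -> {answer}")
```

With this sketch, the query "Do you have ice cream?" shares the words "do", "you", and "ice" only with the ice-cream question, so that row's answer is returned. In practice one would probably also set a minimum score below which the bot asks the user to rephrase; that threshold is a design choice not covered here.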

Let’s take the above example further. Suppose the user asks:

"Can I get an Americano, btw how much it will cost?"

Words like 'I', 'an', and 'it' occur frequently in other questions as well, so TF-IDF gives them little weight. If we match on the remaining, more important words, we see that this question is closest to: "What is the cost of an americano?"

So our bot responds with the historical answer to this type of question:

Americano with a single shot will cost $1.40 and the double shot will cost $2.30.
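To see why the rarer words drive this match, the following sketch computes an IDF weight for each word in the four sample questions and lists the words the query shares with each of them. The helper names (`QUESTIONS`, `tokenize`, `idf`, `shared`) and the smoothed IDF formula are assumptions made for illustration. Note that in a sample this small only words such as "what" and "is" repeat across questions; words like 'I' and 'it' would be down-weighted in the same way once the dataset is large enough to contain them often.

```python
import math
import re
from collections import Counter

# The four sample questions from the dataset.
QUESTIONS = [
    "When does your shop open?",
    "What is today's special?",
    "What is the cost of an americano?",
    "Do you sell Ice-creams?",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


tokenized = [tokenize(q) for q in QUESTIONS]

# Number of questions in which each word appears at least once.
doc_freq = Counter(word for tokens in tokenized for word in set(tokens))


def idf(word):
    """Smoothed inverse document frequency: rarer words score higher."""
    return math.log((1 + len(QUESTIONS)) / (1 + doc_freq[word])) + 1


query = "Can I get an Americano, btw how much it will cost?"
query_words = set(tokenize(query))

# Words that the query has in common with each stored question.
shared = {
    question: sorted(query_words & set(tokens))
    for question, tokens in zip(QUESTIONS, tokenized)
}

for question, words in shared.items():
    print(f"{question!r}: shared words = {words}")

# Words found in several questions get a lower weight than rare ones.
for word in ("what", "is", "cost", "americano"):
    print(f"idf({word!r}) = {idf(word):.3f}")
```

Only "What is the cost of an americano?" shares any words with the query ("americano", "an", and "cost"), which is why it is selected. The IDF values also show that "what" and "is", which appear in two questions, weigh less than "americano", which appears in one.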
