Now let's initialize the TF-IDF vectorizer and define a few parameters, such as:
- min_df: when building the vocabulary, ignore terms whose document frequency is strictly lower than this threshold.
- ngram_range: the range of n-gram sizes to extract; for example, (2, 4) captures word sequences of two to four tokens.
- norm: the norm ('l1' or 'l2') used to normalize term vectors.
- encoding: the character encoding used to decode the input text.
There are many more parameters that you can explore and tune.
```python
from sklearn.feature_extraction.text import TfidfVectorizer

vectorizer = TfidfVectorizer(min_df=0, ngram_range=(2, 4),
                             strip_accents='unicode', norm='l2',
                             encoding='ISO-8859-1')
```
Now we fit the vectorizer on the question corpus.
```python
# Build the TF-IDF matrix for our train data set (the stored questions)
X_train = vectorizer.fit_transform(question_list)

# Next, transform the query sent by the user to the bot (test data);
# transform() expects an iterable of documents, so wrap a single
# query string in a list
X_query = vectorizer.transform([query])
```
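With both the questions and the query in the same vector space, the bot can pick the stored question most similar to the query. A minimal sketch using cosine similarity (the `question_list` and `query` values below are hypothetical stand-ins, and the vectorizer settings are simplified for the toy data):

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical stored questions
question_list = ["how do I reset my password",
                 "what are your opening hours",
                 "how can I contact support"]

vectorizer = TfidfVectorizer(ngram_range=(1, 2), strip_accents='unicode',
                             norm='l2')
X_train = vectorizer.fit_transform(question_list)

# A sample user query
query = "reset my password"
X_query = vectorizer.transform([query])

# Cosine similarity between the query and every stored question
scores = cosine_similarity(X_query, X_train).ravel()
best = int(np.argmax(scores))
print(question_list[best])  # how do I reset my password
```

Because the vectors are L2-normalized, the dot product of two rows equals their cosine similarity, so the highest-scoring row is the closest stored question.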