The algorithms of this chapter

In this chapter, we will focus on two feature learning areas:

  • Restricted Boltzmann Machines (RBM): A simple deep learning architecture set up to learn a fixed number of new dimensions based on a probabilistic model that the data is assumed to follow. RBMs are in fact a family of algorithms, only one of which is implemented in scikit-learn. BernoulliRBM can serve as a non-parametric feature learner; however, as its name suggests, it places expectations on the values of the dataset's cells: they should be binary, or scaled into the range [0, 1]. A minimal usage sketch follows this list.
  • Word embeddings: Likely one of the biggest contributors to the recent deep learning-fueled advances in natural language processing/understanding/generation is the ability to project strings (words and phrases) into an n-dimensional feature space in order to capture context and minute detail in wording. We will use the gensim Python package to train our own word embeddings, and then work with pre-trained embeddings to see how they can enhance the way we interact with text. A gensim sketch also follows this list.
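To make the RBM bullet concrete, here is a minimal sketch of BernoulliRBM used as a feature learner. The digits dataset, the choice of n_components=10, and the other hyperparameter values are illustrative assumptions, not recommendations from the text:

    from sklearn.datasets import load_digits
    from sklearn.neural_network import BernoulliRBM
    from sklearn.preprocessing import minmax_scale

    # Load the digits dataset and scale pixel values into [0, 1],
    # since BernoulliRBM expects binary or binary-like inputs
    X, _ = load_digits(return_X_y=True)
    X = minmax_scale(X)

    # Learn 10 new latent dimensions from the 64 raw pixel features
    # (n_components=10 is an arbitrary, illustrative choice)
    rbm = BernoulliRBM(n_components=10, learning_rate=0.01,
                       n_iter=20, random_state=0)
    X_new = rbm.fit_transform(X)

    print(X_new.shape)  # (1797, 10)

Here, fit_transform returns the activations of the 10 hidden units for each sample; those activations are the newly learned features.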

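Likewise, here is a minimal sketch of training word embeddings with gensim, assuming gensim 4.x (where the dimensionality parameter is named vector_size). The toy corpus is purely illustrative and far too small to produce meaningful embeddings:

    from gensim.models import Word2Vec

    # A toy corpus: each document is a tokenized list of words.
    # In practice you would train on a much larger corpus.
    sentences = [
        ["machine", "learning", "extracts", "features"],
        ["deep", "learning", "learns", "features", "from", "raw", "data"],
        ["word", "embeddings", "capture", "context"],
    ]

    # Train a small Word2Vec model: each word becomes a 50-dimensional vector
    model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, seed=0)

    # Look up the learned vector for a word, and its nearest neighbors
    vector = model.wv["learning"]
    print(vector.shape)  # (50,)
    print(model.wv.most_similar("learning", topn=2))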
Both of these techniques have something in common: they learn brand new features from raw data, and they then use those new features to enhance the way we interact with that data. For the word embedding work, we will have to move away from scikit-learn, as these more advanced techniques are not (yet) implemented in its latest versions. Instead, we will see examples of deep learning neural architectures implemented in TensorFlow and Keras.

For both techniques, we will focus less on the very low-level inner workings of the models and more on how they interpret data. We will go in order, starting with the only algorithm that has a scikit-learn implementation: the restricted Boltzmann machine family of algorithms.
