Ashwin Pajankar and Aditya Joshi

Hands-on Machine Learning with Python

Implement Neural Network Solutions with Scikit-learn and PyTorch

Ashwin Pajankar
Nashik, Maharashtra, India
Aditya Joshi
Haldwani, Uttarakhand, India
ISBN 978-1-4842-7920-5
e-ISBN 978-1-4842-7921-2
© Ashwin Pajankar and Aditya Joshi 2022
Apress Standard
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Apress imprint is published by the registered company APress Media, LLC part of Springer Nature.

The registered company address is: 1 New York Plaza, New York, NY 10004, U.S.A.

This book is dedicated to the memory of our teacher, Prof. Govindarajulu Regeti (July 9, 1945 – March 18, 2021)

Popularly known to everyone as RGR, Prof. Govindarajulu obtained his B.Tech. in Electrical and Electronics Engineering from JNTU Kakinada. He also earned his M.Tech. and Ph.D. from IIT Kanpur. Prof. Govindarajulu was an early faculty member of IIIT Hyderabad and played a significant role in making it the top-class institution it has grown to become today. He was by far the most loved and cheered-for faculty member of the institute. He was full of energy to teach and full of old-fashioned charm. There is no doubt he cared for every student as an individual, taking care to know about and to guide them. He taught, guided, and mentored many batches of students at IIIT Hyderabad (including one of the authors of the book, Ashwin Pajankar).

Introduction

We have long been planning to collaborate on a book about machine learning. This field has grown and expanded immensely since we started learning these topics almost a decade ago. As lifelong learners ourselves, we realized that the first few steps in any field call for a source that lays out a clear path. They also call for crisp explanations and occasional ideas for extending the learning experience by reading, learning, and applying what you have learned. We have used Python for a long time in our academic lives and in our professional careers in software development, data science, and machine learning. With this book, we have made a humble attempt to write a step-by-step guide to machine learning for absolute beginners. Every chapter explains the concepts used and includes code examples, explanations of those examples, and screenshots of the outputs.

The first chapter covers the setup of the Python environment on different platforms. The second chapter covers NumPy and Ndarrays. The third chapter explores visualization with Matplotlib. The fourth chapter introduces the Pandas data science library. Together, these initial chapters build the programming and basic data-crunching foundations that are a prerequisite for learning machine learning.

The next section discusses traditional machine learning approaches. In Chapter 5, we start with a bird’s-eye view of the field of machine learning, followed by the installation of Scikit-learn and a short example of a machine learning solution built with it. Chapter 6 elaborates on methods that help you understand structural, textual, and image data and transform it into a format that machine learning libraries accept. In Chapter 7, we introduce supervised learning methods, starting with linear regression for regression problems and logistic regression and decision trees for classification problems. In each of the experiments, we also show how to visualize what the algorithm has learned using decision boundary plots. The eighth chapter looks at further fine-tuning of machine learning models. We explain some ideas for measuring the performance of the models, the issues of overfitting and underfitting, and approaches for handling such issues and improving model performance. The ninth chapter continues the discussion of supervised learning methods, focusing on naive Bayes and Support Vector Machines. The tenth chapter explains ensemble learning methods, which combine multiple simpler models to achieve better performance than any of them might offer individually. In the eleventh chapter, we discuss unsupervised learning methods, specifically dimensionality reduction, clustering, and frequent pattern mining. Each part contains a complete example of implementing the discussed methods using Scikit-learn.

The last section begins by introducing the basic ideas of neural networks and deep learning in the twelfth chapter. We introduce a highly popular open source machine learning framework, PyTorch, which is used in the examples in the subsequent chapters. The thirteenth chapter begins with an explanation of artificial neural networks and thoroughly discusses the theoretical foundations of feedforward and backpropagation, followed by a short discussion of loss functions and an example of a simple neural network. In the second half of that chapter, we explain how to create a multilayer neural network that is capable of identifying handwritten digits. In the fourteenth chapter, we discuss convolutional neural networks and work through an example of image classification. The fifteenth chapter discusses recurrent neural networks and walks you through a sequence modeling problem. In the final, sixteenth chapter, we discuss strategies for planning, managing, and engineering machine learning and data science projects. We also present a short end-to-end example of sentiment analysis using deep learning.

If you are new to the subject, we highly encourage you to follow the chapters sequentially as the ideas build upon each other. Follow through all the code sections, and feel free to modify and tweak the code structure, datasets, and hyperparameters. If you already know some of the topics, feel free to skip to the topics of your interest and examine the relevant sections thoroughly. We wish you the best for your learning experience.

Acknowledgments

I would like to express my gratitude toward Aditya Joshi, my junior from IIIT Hyderabad and now an esteemed colleague, who has written the major and most important section of this book. I also wish to thank my mentors from Apress, Celestin, Aditee, James Markham, and the editorial team. I wish to thank the reviewers who helped me make this book better. I also thank Prof. Govindarajulu’s family, Srinivas (son) and Amy (daughter-in-law), for allowing me to dedicate this book to his memory and for sharing his biographical information and his photograph for publication.

—Ashwin Pajankar

My work on this book started with a lot of encouragement and support from my father, Ashok Kumar Joshi, who couldn’t live long enough to see it through to completion. I am extremely grateful to friends and family – especially my mother, Bhavana Joshi, and many others, whose constant support was the catalyst that helped me work on this project. I also want to extend my heartiest thanks to my wife, Neha Pandey, who was supportive and patient when my work extended into the weekends. I would like to thank Ashwin Pajankar, who’s been not just a coauthor but a guide throughout this journey. I’d also like to extend my gratitude to the Innomatics team, Kalpana Katiki Reddy, Vishwanath Nyathani, and Raghuram Aduri, for giving me opportunities to interact with hundreds of students who are learning data science and machine learning. I’d also like to thank Akshaj Verma for his support with code examples in one of the advanced chapters. I also thank the editorial team at Apress, especially Celestin Suresh John, Aditee Mirashi, James Markham, and everyone who was involved in the process.

—Aditya Joshi

Table of Contents
Section 2: Machine Learning Approaches
About the Authors
Ashwin Pajankar is an author, an online instructor, a content creator, and a YouTuber. He has earned a Bachelor of Engineering from SGGSIE&T Nanded and an M.Tech. in Computer Science and Engineering from IIIT Hyderabad. He was introduced to the amazing world of electronics and computer programming at the age of seven. BASIC was the very first programming language he learned. He has a lot of experience in programming with Assembly Language, C, C++, Visual Basic, Java, Shell Scripting, Python, SQL, and JavaScript. He also loves to work with single-board computers and microcontrollers like Raspberry Pi, Banana Pro, Arduino, BBC Microbit, and ESP32.

He is currently focusing on developing his YouTube channel on computer programming, electronics, and microcontrollers.

 
Aditya Joshi is a machine learning engineer who has worked in the data science and ML teams of early to mid-stage startups. He has earned a Bachelor of Engineering from Pune University and an M.S. in Computer Science and Engineering from IIIT Hyderabad. He became interested in machine learning during his master’s studies and became associated with the Search and Information Extraction Lab at IIIT Hyderabad. He loves to teach, and he has been involved in training workshops, meetups, and short courses.

 
About the Technical Reviewer
Joos Korstanje is a data scientist with over five years of industry experience in developing machine learning tools, a large part of which are forecasting models. He currently works at Disneyland Paris, where he develops machine learning for a variety of tools.

 