Preface

The field of machine learning has changed significantly in the last few years, affecting our daily lives and the way business decisions are made. If there is one thing that the biotechnology and life sciences industries have in abundance, it is data. As we move toward more data-driven models, the intersection of life sciences and machine learning has grown rapidly, uncovering vast quantities of information and hidden patterns that give companies major competitive advantages.

Over the course of this book, we will touch on some of the most important elements of machine learning from both a supervised and unsupervised perspective. We will not only learn to develop and train robust models, but also deploy them in the cloud using AWS and GCP, allowing us to make them immediately available for end users.

Who this book is for

This book caters specifically to scientific professionals in both academia and industry who are looking to transition into the data science domain. Individual contributors and managers alike who are already established within the pharmaceutical, life sciences, and biotechnology sectors will find this book not only useful, but directly applicable to current-day projects. Although an introduction to Python and machine learning is provided, a basic understanding of Python programming and a beginner-level background in data science are recommended to get the most out of this book.

What this book covers

Chapter 1, Introducing Machine Learning for Biotechnology, provides a brief introduction to the field of biotechnology and some of the areas in which machine learning can be applied, in addition to some of the technology this book will use.

Chapter 2, Introducing Python and the Command Line, comprises a summary of some of the must-know techniques and commands in Bash and the Python programming language, in addition to some of the most common Python libraries.

Chapter 3, Getting Started with SQL and Relational Databases, is where you will gain knowledge of the SQL querying language and learn how to create a remote database using MySQL and AWS RDS.

Chapter 4, Visualizing Data with Python, introduces you to some of the most common methods for visualizing and representing data using the Python programming language.

Chapter 5, Understanding Machine Learning, comprises some of the most important elements of standard machine learning pipelines, introducing you to supervised and unsupervised methods, as well as saving models for future use.

Chapter 6, Unsupervised Machine Learning, is where you will learn about unsupervised models and dive into clustering and dimensionality reduction methods with tutorials relating to breast cancer.

Chapter 7, Supervised Machine Learning, is where you will learn about supervised learning models and dive into classification and regression methods.

Chapter 8, Understanding Deep Learning, provides an overview of the deep learning space, where we will explore the elements of a deep learning model, as well as two tutorials relating to protein classification using Keras and anomaly detection using AWS.

Chapter 9, Natural Language Processing, teaches you some of the most common NLP options as we explore popular libraries and tools, in addition to two tutorials relating to clustering as well as semantic searching using transformers.

Chapter 10, Exploring Time Series Analysis, explores data using a time-based approach in which we break down the components of a time series dataset and develop two forecasting models using Prophet and LSTMs.

Chapter 11, Deploying Models with Flask Applications, provides an introduction to one of the most popular frameworks for deploying models and applications to end users.

Chapter 12, Deploying Applications to the Cloud, provides an introduction to two of the most popular cloud computing platforms, in addition to three tutorials allowing users to deploy their work to AWS Lightsail, GCP App Engine, and GitHub.

To get the most out of this book

To maximize the value of your time, a very basic knowledge of the Python programming language and the Bash command line is recommended. In addition, some background in the biotechnology and life sciences spheres is recommended to best understand the tutorials and use cases.

If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book's GitHub repository (a link is available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.

Download the example code files

You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Machine-Learning-in-Biotechnology-and-Life-Sciences. Any updates to the code will be made in the GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Download the color images

We also provide a PDF file that has color images of the screenshots and diagrams used in this book. You can download it here: https://static.packt-cdn.com/downloads/9781801811910_ColorImages.pdf.

Conventions used

There are a number of text conventions used throughout this book.

Code in text: Indicates code words in the text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "Mount the downloaded WebStorm-10*.dmg disk image file as another disk in your system."

A block of code is set as follows:

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_scaled = scaler.fit_transform(dfx.drop(columns=["annotation"]))

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

>>> heterogenousList[0]
dichloromethane
>>> heterogenousList[1]
3.14

Any command-line input or output is written as follows:

$ mkdir machine-learning-biotech

Bold: Indicates a new term, an important word, or words that you see on screen. For instance, words in menus or dialog boxes appear in bold. Here is an example: "Select System info from the Administration panel."

Tips or Important notes

Appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, email us at [email protected] and mention the book title in the subject of your message.

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.

Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Share your thoughts

Once you've read Machine Learning in Biotechnology and Life Sciences, we'd love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.

Your review is important to us and the tech community and will help us make sure we're delivering excellent quality content.
