Book Description

Understand the constructs of the Python programming language and use them to build data science projects

Key Features

  • Learn the basics of developing applications with Python and deploy your first data application
  • Take your first steps in Python programming by understanding and using data structures, variables, and loops
  • Delve into Jupyter, NumPy, Pandas, SciPy, and sklearn to explore the data science ecosystem in Python

Book Description

Python is one of the most widely used programming languages for building data science applications. Complete with step-by-step instructions, this book contains easy-to-follow tutorials to help you learn Python and develop real-world data science projects. The "secret sauce" of the book is its curated list of topics and solutions, put together using a range of real-world projects, covering initial data collection, data analysis, and production.

This Python book starts by taking you through the basics of programming, right from variables and data types to classes and functions. You'll learn how to write idiomatic code and test and debug it, and discover how you can create packages or use the range of built-in ones. You'll also be introduced to the extensive ecosystem of Python data science packages, including NumPy, Pandas, scikit-learn, Altair, and Datashader. Furthermore, you'll be able to perform data analysis, train models, and interpret and communicate the results. Finally, you'll get to grips with structuring and scheduling scripts using Luigi and sharing your machine learning models with the world as a microservice.

By the end of the book, you'll have learned not only how to implement Python in data science projects, but also how to maintain and design them to meet high programming standards.

What you will learn

  • Code in Python using Jupyter and VS Code
  • Explore the basics of coding - loops, variables, functions, and classes
  • Deploy continuous integration with Git, Bash, and DVC
  • Get to grips with Pandas, NumPy, and scikit-learn
  • Perform data visualization with Matplotlib, Altair, and Datashader
  • Create a package out of your code using Poetry and test it with PyTest
  • Make your machine learning model accessible to anyone through a web API

Who this book is for

If you want to learn Python or data science in a fun and engaging way, this book is for you. You'll also find this book useful if you're a high school student, researcher, analyst, or anyone with little or no coding experience who has an interest in the subject and the courage to learn, fail, and learn from failing. A basic understanding of how computers work will be useful.

Downloading the example code for this ebook: You can download the example code files for this ebook from GitHub at the following link: https://github.com/PacktPublishing/Python-Programming-Projects-Learn-Python-3.7-by-building-applications. If you require support, please email: [email protected]

Table of Contents

  1. Title Page
  2. Copyright and Credits
    1. Learn Python by Building Data Science Applications
  3. About Packt
    1. Why subscribe?
  4. Contributors
    1. About the authors
    2. About the reviewers
    3. Packt is searching for authors like you
  5. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
      1. Download the example code files
      2. Download the color images
      3. Code in Action
      4. Conventions used
    4. Get in touch
      1. Reviews
  6. Section 1: Getting Started with Python
  7. Preparing the Workspace
    1. Technical requirements
    2. Installing Python
    3. Downloading materials for running the code
      1. Installing Python packages
    4. Working with VS Code
      1. The VS Code interface
    5. Beginning with Jupyter
      1. Notebooks
      2. The Jupyter interface
    6. Pre-flight check
    7. Summary
    8. Questions
    9. Further reading
  8. First Steps in Coding - Variables and Data Types
    1. Technical requirements
    2. Assigning variables
    3. Naming the variable 
    4. Understanding data types
      1. Floats and integers
        1. Operations with self-assignment
        2. Order of execution
      2. Strings
        1. Formatting
          1. Format method
          2. F-strings
          3. Legacy formatting
          4. Formatting mini-language
        2. Strings as sequences
      3. Booleans
        1. Logical operators
    5. Converting the data types
    6. Exercise
    7. Summary
    8. Questions
    9. Further reading
  9. Functions
    1. Technical requirements
    2. Understanding a function
      1. Interface functions
        1. The input function
        2. The eval function
      2. Variable properties
        1. The help function
        2. The type function
        3. The isinstance function
        4. dir
      3. Math
        1. abs
        2. The round function
      4. Iterables
        1. The len function
        2. The sorted function
        3. The range function
        4. The all and any functions
        5. The max, min, and sum functions
    3. Defining the function
      1. Default values
      2. Var-positional and var-keyword
      3. Docstrings
      4. Type annotations
    4. Refactoring the temperature conversion
    5. Understanding anonymous (lambda) functions
    6. Understanding recursion
    7. Summary
    8. Questions
    9. Further reading
  10. Data Structures
    1. Technical requirements
    2. What are data structures?
      1. Lists
      2. Slicing
      3. Tuples
      4. Immutability
      5. Dictionaries
      6. Sets
    3. More data structures
      1. frozenset
      2. defaultdict
      3. Counter
      4. Queue
      5. deque
      6. namedtuple
      7. Enumerations
    4. Using generators
    5. Useful functions to use with data structures
      1. The sum, max, and min functions
      2. The all and any functions
      3. The zip function
      4. The map, filter, and reduce functions
    6. Comprehensions
    7. Summary
    8. Questions
    9. Further reading 
  11. Loops and Other Compound Statements
    1. Technical requirements
    2. Understanding if, else, and elif statements
      1. Inline if statements
      2. Using if in a comprehension
    3. Running code many times with loops
      1. The for loop
      2. itertools
        1. cycle
        2. chain
        3. product
      3. Enumeration
      4. The while loop
      5. Additional loop functionality – break and continue
    4. Handling exceptions with try/except and try/finally 
      1. Exceptions
      2. try/except
      3. try/except/finally
    5. Understanding the with statements
    6. Summary
    7. Questions
    8. Further reading
  12. First Script – Geocoding with Web APIs
    1. Technical requirements
    2. Geocoding as a service
    3. Learning about web APIs
      1. Working with HTTPS
    4. Working with the Nominatim API
      1. The requests library
      2. Starting to code
    5. Caching with decorators
    6. Reading and writing data
      1. Geocoding the addresses
    7. Moving code to a separate module
    8. Collecting NYC Open Data from the Socrata service
    9. Summary
    10. Questions
    11. Further reading
  13. Scraping Data from the Web with Beautiful Soup 4
    1. Technical requirements
    2. When there is no API
      1. HTML in a nutshell
      2. Scraping with Beautiful Soup 4
      3. CSS and XPath selectors
        1. Developer console
    3. Scraping WWII battles
      1. Step 1 – Scraping the list of battles
        1. Unordered list
      2. Step 2 – Scraping information from the Wiki page
        1. Key information
        2. Additional information
      3. Step 3 – Scraping data as a whole
      4. Quality control
    4. Beyond Beautiful Soup
    5. Summary
    6. Questions
    7. Further reading
  14. Simulation with Classes and Inheritance
    1. Technical requirements
    2. Understanding classes
      1. Special (dunder) methods
        1. __init__
        2. __repr__ and __str__ 
        3. Arithmetical and logical operations
        4. Equality/relationship methods
        5. __len__
        6. __getitem__
        7. __class__
      2. Inheritance
      3. Using super()
      4. Data classes
    3. Using classes in simulation
      1. Writing the base classes
      2. Writing the Island class
      3. Herbivore haven
      4. Harsh islands
      5. Visualization
    4. Summary
    5. Questions
    6. Further reading
  15. Shell, Git, Conda, and More – at Your Command
    1. Technical requirements
    2. Shell
      1. Pipes
      2. Executing Python scripts
      3. Command-line interface
    3. Git
      1. Concept
      2. GitHub
      3. Practical example
      4. gitignore
    4. Conda
      1. Conda for virtual environments
      2. Conda and Jupyter
    5. Make
    6. Cookiecutter
    7. Summary
    8. Questions
  16. Section 2: Hands-On with Data
  17. Python for Data Applications
    1. Technical requirements
    2. Introducing Python for data science
    3. Exploring NumPy
    4. Beginning with pandas
    5. Trying SciPy and scikit-learn
    6. Understanding Jupyter
    7. Summary
    8. Questions
  18. Data Cleaning and Manipulation
    1. Technical requirements
    2. Getting started with pandas
      1. Selection – by columns, indices, or both
      2. Masking
      3. Data types and data conversion
      4. Math
      5. Merging
    3. Working with real data
      1. Initial exploration
      2. Defining the scope of work to be done
    4. Getting to know regular expressions
    5. Parsing locations
      1. Geocoding
    6. Time
    7. Belligerents
    8. Understanding casualties
      1. Multilevel slicing
    9. Quality assurance
    10. Writing the file
    11. Summary
    12. Questions
    13. Further reading
  19. Data Exploration and Visualization
    1. Technical requirements
    2. Exploring the dataset
      1. Descriptive statistics
      2. Data visualization with matplotlib (and its pandas interface)
      3. Aggregating the data to calculate summary statistics 
        1. Resampling
      4. Mapping
    3. Declarative visualization with vega and altair
      1. Drawing maps with Altair
      2. Storing the Altair chart
    4. Big data visualization with datashader
    5. Summary
    6. Questions
    7. Further reading
  20. Training a Machine Learning Model
    1. Technical requirements
    2. Understanding the basics of ML
      1. Exploring unsupervised learning
      2. Moving on to supervised learning
        1. k-nearest neighbors
        2. Linear regression
        3. Decision trees
    3. Summary
    4. Questions
    5. Further reading
  21. Improving Your Model – Pipelines and Experiments
    1. Technical requirements
    2. Understanding cross-validation
    3. Exploring feature engineering
      1. Failed attempts
    4. Optimizing the hyperparameters
      1. Using a random forest model
    5. Tracking your data and metrics with version control
      1. Starting with data
      2. Adding code to the equation
      3. Metrics
    6. Summary
    7. Questions
    8. Further reading
  22. Section 3: Moving to Production
  23. Packaging and Testing with Poetry and PyTest
    1. Technical requirements
    2. Building a package
      1. Bringing your own package
      2. Using a package manager – pip and conda
      3. Creating a package scaffolding
    3. A few ways to build your package
      1. Trying out code with Poetry
      2. Adding actual code
      3. Defining dependencies
      4. Non-code resources
      5. Publishing the package
      6. Development workflow
    4. Testing the code so far
      1. Testing with PyTest
      2. Writing our own tests
    5. Automating the process with CI services
    6. Generating documentation with Sphinx
    7. Installing a package in editable mode
    8. Summary
    9. Questions
    10. Further reading
  24. Data Pipelines with Luigi
    1. Technical requirements
    2. Introducing the ETL pipeline
      1. Redesigning your code as a pipeline
    3. Building our first task in Luigi
      1. Connecting the dots
    4. Understanding time-based tasks
      1. Scheduling with cron
    5. Exploring the different output formats
      1. Writing to an S3 bucket
      2. Writing to SQL
    6. Expanding Luigi with custom template classes
    7. Summary
    8. Questions
    9. Further reading
  25. Let's Build a Dashboard
    1. Technical requirements
    2. Building a dashboard – three types of dashboard
      1. Static dashboards
      2. Debugging Altair
      3. Connecting your app to the Luigi pipeline
    3. Understanding dynamic dashboards
      1. First try with panel
      2. Reading data from the database
      3. Creating an interactive dashboard in Jupyter
    4. Summary
    5. Questions
    6. Further reading
  26. Serving Models with a RESTful API
    1. Technical requirements
    2. What is a RESTful API?
      1. Python web frameworks
    3. Building a basic API service
      1. Exploring service with OpenAPI
      2. Finalizing our naive first iteration
      3. Data validation
      4. Sending data in with POST requests
      5. Adding features to our service
    4. Building a web page
    5. Speeding up with asynchronous calls
    6. Deploying and testing your API loads with Locust
    7. Summary
    8. Questions
    9. Further reading
  27. Serverless API Using Chalice
    1. Technical requirements
    2. Understanding serverless
    3. Getting started with Chalice
    4. Setting up a simple model
      1. Externalizing medians
    5. Building a serverless API for an ML model
      1. When we're still out of memory
    6. Building a serverless function as a data pipeline
      1. S3-triggered events
    7. Summary
    8. Questions
    9. Further reading
  28. Best Practices and Python Performance
    1. Technical requirements
    2. Speeding up your Python code
      1. Rewriting the code with NumPy
      2. Specialized data structures and algorithms
      3. Dask
        1. Dask-ML
      4. Numba
      5. Concurrency and parallelism
        1. Different types of concurrency
        2. Two types of problems
        3. Before you start rewriting your code
    3. Using best practices for coding in your project
      1. Code formatting with black
      2. Measuring code quality with Wily
      3. Writing tests with hypothesis
    4. Beyond this book – packages and technologies to look out for
      1. Different Python flavors
      2. Docker containers
      3. Kubernetes
    5. Summary
    6. Questions
    7. Further reading
  29. Assessments
    1. Chapter 1
    2. Chapter 2
    3. Chapter 3
    4. Chapter 4
    5. Chapter 5
    6. Chapter 6
    7. Chapter 7
    8. Chapter 8
    9. Chapter 9
    10. Chapter 10
    11. Chapter 11
    12. Chapter 12
    13. Chapter 13
    14. Chapter 14
    15. Chapter 15
    16. Chapter 16
    17. Chapter 17
    18. Chapter 18
    19. Chapter 19
    20. Chapter 20
  30. Other Books You May Enjoy
    1. Leave a review - let other readers know what you think