Distributed Processing

In the last chapter, we introduced the concept of parallel processing and learned how to leverage multicore processors and GPUs. Now, we can step up our game a bit and turn our attention to distributed processing, which involves executing tasks across multiple machines to solve a problem.

In this chapter, we will illustrate the challenges, use cases, and examples of how to run code on a cluster of computers. Python offers easy-to-use and reliable packages for distributed processing, which will allow us to implement scalable and fault-tolerant code with relative ease.

The list of topics for this chapter is as follows:

  • Distributed computing and the MapReduce model
  • Directed Acyclic Graphs with Dask
  • Writing parallel code with Dask's array, Bag, and DataFrame data structures
  • Distributing parallel algorithms with Dask Distributed
  • An introduction to PySpark
  • Spark's Resilient Distributed Datasets and DataFrame
  • Scientific computing with mpi4py
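Before diving into the tools above, it helps to see the MapReduce model itself in miniature. The following is a toy sketch in pure Python: a word count expressed as a map phase, a shuffle (group-by-key) phase, and a reduce phase. Real engines such as Spark distribute these phases across machines; here everything runs locally, and the sample documents are purely illustrative:

```python
from functools import reduce
from itertools import groupby

documents = ["the cat sat", "the dog sat"]  # illustrative input

# Map phase: emit a (word, 1) pair for every word in every document.
mapped = [(word, 1) for doc in documents for word in doc.split()]

# Shuffle phase: group the pairs by key (the word).
mapped.sort(key=lambda pair: pair[0])
grouped = {key: [count for _, count in pairs]
           for key, pairs in groupby(mapped, key=lambda pair: pair[0])}

# Reduce phase: combine the counts for each word.
counts = {word: reduce(lambda a, b: a + b, values)
          for word, values in grouped.items()}

print(counts)  # {'cat': 1, 'dog': 1, 'sat': 2, 'the': 2}
```

Because each phase only needs local information (a single document in map, a single key's values in reduce), the work can be split across many machines, which is exactly what the frameworks covered in this chapter automate.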