Model/data parallelism

There are two main ways to achieve parallelism and scale a task across multiple servers:

  • Model Parallelism: When the model does not fit on a single GPU, its layers are split up and computed on different servers.
  • Data Parallelism: The same model is replicated on every server, but each server processes a different batch. Each server therefore computes a different gradient, so the servers need some form of synchronization.
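
To make the synchronization step in data parallelism concrete, here is a minimal pure-Python sketch. It is a simulation only, with no real servers or GPUs: the one-parameter model y = w * x, the toy data, and the helper names local_gradient and data_parallel_step are all assumptions made for illustration, not code from this book. Each "worker" computes a gradient on its own shard of the batch, and the gradients are then averaged before the shared weight is updated.

```python
# Pure-Python simulation of synchronous data parallelism (illustrative only).

def local_gradient(w, shard):
    """Gradient of mean squared error for the model y = w * x on one shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)

def data_parallel_step(w, shards, lr):
    """One synchronous step: each worker computes a gradient, then they are averaged."""
    grads = [local_gradient(w, shard) for shard in shards]  # one gradient per worker
    avg_grad = sum(grads) / len(grads)  # synchronization (an all-reduce in a real system)
    return w - lr * avg_grad            # every replica applies the same update

# Hypothetical toy data generated from y = 3x, split into equal shards for two workers.
data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0), (4.0, 12.0)]
shards = [data[:2], data[2:]]

w = 0.0
for _ in range(50):
    w = data_parallel_step(w, shards, lr=0.05)
print(round(w, 3))
```

Because the shards here are the same size, averaging the per-worker gradients gives the same result as computing the gradient over the full batch on one machine, which is why every replica stays in sync after each step.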

In this section, we will focus on data parallelism, which is the simpler of the two to implement.
