Comparing ML workloads across GPU configurations

We can compare the same ML workload by running it under three different GPU configurations; these are as follows:

  • GPU using DirectPath I/O on vSphere
  • GRID vGPU on vSphere
  • Native GPU on bare metal host

Our tests found that the virtualization layer (DirectPath I/O and GRID vGPU) introduced only about a 4% overhead for the tested ML application. To compare learning times for a specific model, we use two virtual machines with different configurations.

The resources and OS of the two VMs, with and without a GPU, are as follows:

  • NVIDIA GRID Configuration: 1 vGPU, 12 vCPUs, 60 GB memory, 96 GB of SSD storage, CentOS 7.2
  • No GPU configuration: No GPU, 12 vCPUs, 60 GB memory, 96 GB of SSD storage, CentOS 7.2

Let's look at the following table:

  MNIST workload                    1 vGPU    No GPU
  Normalized learning time (sec)    1.1       10.01
  CPU utilization                   9%        45%
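The speedup figures quoted in the text can be derived directly from the table's values; a quick sketch:

```python
# Derive the speedup factors from the table values.
vgpu_time, no_gpu_time = 1.1, 10.01   # normalized learning time (sec)
vgpu_cpu, no_gpu_cpu = 9, 45          # CPU utilization (%)

training_speedup = no_gpu_time / vgpu_time
cpu_reduction = no_gpu_cpu / vgpu_cpu

print(f"training speedup: {training_speedup:.1f}x")        # ~9.1x, i.e. roughly 10x
print(f"CPU utilization reduction: {cpu_reduction:.0f}x")  # 5x
```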

As the preceding table shows, the vGPU reduces training time by a factor of roughly 10, and CPU utilization drops by a factor of 5. The ML workload consists of two components; these are as follows:

  • A convolutional neural network model built with the TensorFlow library.
  • The Canadian Institute For Advanced Research (CIFAR)-10 dataset, a labeled image dataset widely used in ML and computer vision algorithms.
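A minimal sketch of such a workload, assuming a small CNN in the style of TensorFlow's CIFAR-10 tutorials (the exact model used in the benchmark is not specified here), plus a check of whether TensorFlow can see a GPU, whether presented via DirectPath I/O, GRID vGPU, or bare metal:

```python
import numpy as np
import tensorflow as tf

# Report whether a GPU is visible to TensorFlow in this configuration.
gpus = tf.config.list_physical_devices("GPU")
print(f"GPUs visible to TensorFlow: {len(gpus)}")

# CIFAR-10 images are 32x32 RGB with 10 classes.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10),
])
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)

# Smoke-test on random data standing in for CIFAR-10
# (tf.keras.datasets.cifar10.load_data() would fetch the real set).
x = np.random.rand(8, 32, 32, 3).astype("float32")
y = np.random.randint(0, 10, size=(8,))
model.fit(x, y, epochs=1, verbose=0)
print("output shape:", model(x).shape)
```

Timing `model.fit()` on the full dataset under each GPU configuration is how learning-time comparisons like the table above are produced.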