ND4J - high performance linear algebra for the JVM

Let's have a look at ND4J first. As already mentioned, ND4J is a tensor and linear algebra library: its main purpose is to provide multidimensional arrays (also called tensors) and operations on them. The operations are simple but fast, much like those of NumPy for Python, if you are familiar with it.

So what's the advantage of using ND4J over NumPy?

  • First of all, when using Apache Spark, we stay in the same JVM process and don't have to pay the overhead of IPC (inter-process communication) between the JVM and a separate Python process.
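  • Then, ND4J is capable of using SIMD instruction sets on modern CPUs through native BLAS libraries such as OpenBLAS. This can give a substantial speedup, although the actual gain over NumPy depends on the workload and on which BLAS library NumPy itself is linked against.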
  • Finally, ND4J can take advantage of the GPUs present on your machine without any code changes, by selecting its CUDA backend instead of the native CPU backend, provided a recent version of the CUDA drivers and toolkit is installed on your system.

So, let's have a taste of what this looks like, syntax-wise, in Scala:

import org.nd4j.linalg.factory.Nd4j
import org.nd4j.linalg.api.ndarray.INDArray

// v is a 2x3 matrix, w is a 3x2 matrix
val v: INDArray = Nd4j.create(Array(Array(1d, 2d, 3d), Array(4d, 5d, 6d)))
val w: INDArray = Nd4j.create(Array(Array(1d, 2d), Array(3d, 4d), Array(5d, 6d)))
// mmul is the matrix product; mul would multiply element-wise
println(v.mmul(w))

As you can see, we are creating two matrices v and w of type INDArray using the Nd4j.create method. In order to do so, we have to provide a nested Scala array of doubles (Array[Array[Double]]), which we can create inline like this:

Array(Array(1d, 2d, 3d), Array(4d, 5d, 6d))

Finally, the v.mmul(w) call triggers the matrix multiplication, on either a CPU or a GPU; which of the two is used is completely transparent to us. Note that mul, by contrast, multiplies element-wise and therefore expects operands of matching shape, so it cannot be applied to a 2x3 and a 3x2 matrix.
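To make the difference between a matrix product and an element-wise product concrete, here is a minimal plain-Scala sketch that does not use ND4J at all. The names MatMulSketch, matMul and elementWise are hypothetical helpers introduced only for illustration; in ND4J the corresponding INDArray methods are mmul (matrix product) and mul (element-wise product). The sketch assumes small, non-empty, rectangular matrices and makes no attempt to be fast.

```scala
object MatMulSketch {
  type Matrix = Array[Array[Double]]

  // Matrix product: entry (i, j) is the dot product of row i of a and column j of b.
  def matMul(a: Matrix, b: Matrix): Matrix = {
    require(a.forall(_.length == b.length), "columns of a must equal rows of b")
    Array.tabulate(a.length, b(0).length) { (i, j) =>
      b.indices.map(k => a(i)(k) * b(k)(j)).sum
    }
  }

  // Element-wise product: both operands must have exactly the same shape.
  def elementWise(a: Matrix, b: Matrix): Matrix = {
    require(
      a.length == b.length && a.zip(b).forall { case (x, y) => x.length == y.length },
      "shapes must match")
    a.zip(b).map { case (x, y) => x.zip(y).map { case (p, q) => p * q } }
  }

  def main(args: Array[String]): Unit = {
    // The same values as v (2x3) and w (3x2) in the listing above.
    val v: Matrix = Array(Array(1d, 2d, 3d), Array(4d, 5d, 6d))
    val w: Matrix = Array(Array(1d, 2d), Array(3d, 4d), Array(5d, 6d))
    // The product of a 2x3 and a 3x2 matrix is a 2x2 matrix.
    println(matMul(v, w).map(_.mkString(" ")).mkString("\n"))
  }
}
```

For the values used here, the 2x2 result is 22 28 in the first row and 49 64 in the second. Calling elementWise(v, w) instead fails the shape check, which mirrors why an element-wise product is not defined for these two matrices.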
