Basic Breeze data types

Breeze is an extensive library providing fast and easy manipulation of arrays of data, routines for optimization, interpolation, linear algebra, signal processing, and numerical integration.

The basic linear algebra operations underlying Breeze rely on the netlib-java library, which can use system-optimized BLAS and LAPACK libraries, if present. Thus, linear algebra operations in Breeze are often extremely fast. Breeze is still undergoing rapid development and can, therefore, be somewhat unstable.

Vectors

Breeze makes manipulating one- and two-dimensional data structures easy. To start, open a Scala console through SBT and import Breeze:

$ sbt console
scala> import breeze.linalg._
import breeze.linalg._

Let's dive straight in and define a vector:

scala> val v = DenseVector(1.0, 2.0, 3.0)
breeze.linalg.DenseVector[Double] = DenseVector(1.0, 2.0, 3.0)

We have just defined a three-element vector, v. Vectors are just one-dimensional arrays of data exposing methods tailored to numerical uses. They can be indexed like other Scala collections:

scala> v(1)
Double = 2.0

They support element-wise operations with a scalar:

scala> v :* 2.0 // :* is 'element-wise multiplication'
breeze.linalg.DenseVector[Double] = DenseVector(2.0, 4.0, 6.0)

They also support element-wise operations with another vector:

scala> v :+ DenseVector(4.0, 5.0, 6.0) // :+ is 'element-wise addition'
breeze.linalg.DenseVector[Double] = DenseVector(5.0, 7.0, 9.0)

Breeze makes writing vector operations intuitive and considerably more readable than the native Scala equivalent.

Note that Breeze will refuse (at compile time) to coerce operands to the correct type:

scala> v :* 2 // element-wise multiplication by integer
<console>:15: error: could not find implicit value for parameter op:
...

It will also refuse (at runtime) to add vectors together if they have different lengths:

scala> v :+ DenseVector(8.0, 9.0)
java.lang.IllegalArgumentException: requirement failed: Vectors must have same length: 3 != 2
...

Basic manipulation of vectors in Breeze will feel natural to anyone used to working with NumPy, MATLAB, or R.

So far, we have only looked at element-wise operators. These are all prefixed with a colon. All the usual suspects are present: :+, :*, :-, :/, :% (remainder), and :^ (power) as well as Boolean operators. To see the full list of operators, have a look at the API documentation for DenseVector or DenseMatrix (https://github.com/scalanlp/breeze/wiki/Linear-Algebra-Cheat-Sheet).

Besides element-wise operations, Breeze vectors support the operations you might expect of mathematical vectors, such as the dot product:

scala> val v2 = DenseVector(4.0, 5.0, 6.0)
breeze.linalg.DenseVector[Double] = DenseVector(4.0, 5.0, 6.0)

scala> v dot v2
Double = 32.0

Tip

Pitfalls of element-wise operators

Besides the :+ and :- operators for element-wise addition and subtraction that we have seen so far, we can also use the more traditional + and - operators:

scala> v + v2
breeze.linalg.DenseVector[Double] = DenseVector(5.0, 7.0, 9.0)

One must, however, be very careful with operator precedence rules when mixing :+ or :* with :+ operators. The :+ and :* operators have very low operator precedence, so they will be evaluated last. This can lead to some counter-intuitive behavior:

scala> 2.0 :* v + v2 // !! equivalent to 2.0 :* (v + v2)
breeze.linalg.DenseVector[Double] = DenseVector(10.0, 14.0, 18.0)

By contrast, if we use :+ instead of +, the mathematical precedence of operators is respected:

scala> 2.0 :* v :+ v2 // equivalent to (2.0 :* v) :+ v2
breeze.linalg.DenseVector[Double] = DenseVector(6.0, 9.0, 12.0)

In summary, one should avoid mixing the :+ style operators with the + style operators as much as possible.

Dense and sparse vectors and the vector trait

All the vectors we have looked at thus far have been dense vectors. Breeze also supports sparse vectors. When dealing with arrays of numbers that are mostly zero, it may be more computationally efficient to use sparse vectors. The point at which a vector has enough zeros to warrant switching to a sparse representation depends strongly on the type of operations, so you should run your own benchmarks to determine which type to use. Nevertheless, a good heuristic is that, if your vector is about 90% zero, you may benefit from using a sparse representation.

Sparse vectors are available in Breeze as the SparseVector and HashVector classes. Both these types support many of the same operations as DenseVector but use a different internal implementation. The SparseVector instances are very memory-efficient, but adding non-zero elements is slow. HashVector is more versatile, at the cost of an increase in memory footprint and computational time for iterating over non-zero elements. Unless you need to squeeze the last bits of memory out of your application, I recommend using HashVector. We will not discuss these further in this book, but the reader should find them straightforward to use if needed. DenseVector, SparseVector, and HashVector all implement the Vector trait, giving them a common interface.

Tip

Breeze remains very experimental and, as of this writing, somewhat unstable. I have found dealing with specific implementations of the Vector trait, such as DenseVector or SparseVector, to be more reliable than dealing with the Vector trait directly. In this chapter, we will explicitly type every vector as DenseVector.

Matrices

Breeze allows the construction and manipulation of two-dimensional arrays in a similar manner:

scala> val m = DenseMatrix((1.0, 2.0, 3.0), (4.0, 5.0, 6.0))
breeze.linalg.DenseMatrix[Double] =
1.0  2.0  3.0
4.0  5.0  6.0

scala> 2.0 :* m
breeze.linalg.DenseMatrix[Double] =
2.0  4.0   6.0
8.0  10.0  12.0

Building vectors and matrices

We have seen how to explicitly build vectors and matrices by passing their values to the constructor (or rather, to the companion object's apply method): DenseVector(1.0, 2.0, 3.0). Breeze offers several other powerful ways of building vectors and matrices:

scala> DenseVector.ones[Double](5)
breeze.linalg.DenseVector[Double] = DenseVector(1.0, 1.0, 1.0, 1.0, 1.0)

scala> DenseVector.zeros[Int](3)
breeze.linalg.DenseVector[Int] = DenseVector(0, 0, 0)

The linspace method (available in the breeze.linalg package object) creates a Double vector of equally spaced values. For instance, to create a vector of 10 values distributed uniformly between 0 and 1, perform the following:

scala> linspace(0.0, 1.0, 10)
breeze.linalg.DenseVector[Double] = DenseVector(0.0, 0.1111111111111111, ..., 1.0)

The tabulate method lets us construct vectors and matrices from functions:

scala> DenseVector.tabulate(4) { i => 5.0 * i }
breeze.linalg.DenseVector[Double] = DenseVector(0.0, 5.0, 10.0, 15.0)

scala> DenseMatrix.tabulate[Int](2, 3) { 
  (irow, icol) => irow*2 + icol 
}
breeze.linalg.DenseMatrix[Int] =
0  1  2
2  3  4

The first argument to DenseVector.tabulate is the size of the vector, and the second is a function returning the value of the vector at a particular position. This is useful for creating ranges of data, among other things.

The rand function lets us create random vectors and matrices:

scala> DenseVector.rand(2)
breeze.linalg.DenseVector[Double] = DenseVector(0.8072865137359484, 0.5566507203838562)

scala> DenseMatrix.rand(2, 3)
breeze.linalg.DenseMatrix[Double] =
0.5755491874682879   0.8142161471517582  0.9043780212739738
0.31530195124023974  0.2095094278911871  0.22069103504148346

Finally, we can construct vectors from Scala arrays:

scala> DenseVector(Array(2, 3, 4))
breeze.linalg.DenseVector[Int] = DenseVector(2, 3, 4)

To construct vectors from other Scala collections, you must use the splat operator, :_ *:

scala> val l = Seq(2, 3, 4)
l: Seq[Int] = List(2, 3, 4)

scala> DenseVector(l :_ *)
breeze.linalg.DenseVector[Int] = DenseVector(2, 3, 4)

Advanced indexing and slicing

We have already seen how to select a particular element in a vector v by its index with, for instance, v(2). Breeze also offers several powerful methods for selecting parts of a vector.

Let's start by creating a vector to play around with:

scala> val v = DenseVector.tabulate(5) { _.toDouble }
breeze.linalg.DenseVector[Double] = DenseVector(0.0, 1.0, 2.0, 3.0, 4.0)

Unlike native Scala collections, Breeze vectors support negative indexing:

scala> v(-1) // last element
Double = 4.0

Breeze lets us slice the vector using a range:

scala> v(1 to 3)
breeze.linalg.DenseVector[Double] = DenseVector(1.0, 2.0, 3.0)

scala v(1 until 3) // equivalent to Python v[1:3]
breeze.linalg.DenseVector[Double] = DenseVector(1.0, 2.0)

scala> v(v.length-1 to 0 by -1) // reverse view of v
breeze.linalg.DenseVector[Double] = DenseVector(4.0, 3.0, 2.0, 1.0, 0.0)

Tip

Indexing by a range returns a view of the original vector: when running val v2 = v(1 to 3), no data is copied. This means that slicing is extremely efficient. Taking a slice of a huge vector does not increase the memory footprint at all. It also means that one should be careful updating a slice, since it will also update the original vector. We will discuss mutating vectors and matrices in a subsequent section in this chapter.

Breeze also lets us select an arbitrary set of elements from a vector:

scala> val vSlice = v(2, 4) // Select elements at index 2 and 4
breeze.linalg.SliceVector[Int,Double] = breeze.linalg.SliceVector@9c04d22

This creates a SliceVector, which behaves like a DenseVector (both implement the Vector interface), but does not actually have memory allocated for values: it just knows how to map from its indices to values in its parent vector. One should think of vSlice as a specific view of v. We can materialize the view (give it its own data rather than acting as a lens through which v is viewed) by converting it to DenseVector:

scala> vSlice.toDenseVector
breeze.linalg.DenseVector[Double] = DenseVector(2.0, 4.0)

Note that if an element of a slice is out of bounds, an exception will only be thrown when that element is accessed:

scala> val vSlice = v(2, 7) // there is no v(7)
breeze.linalg.SliceVector[Int,Double] = breeze.linalg.SliceVector@2a83f9d1

scala> vSlice(0) // valid since v(2) is still valid
Double = 2.0

scala> vSlice(1) // invalid since v(7) is out of bounds
java.lang.IndexOutOfBoundsException: 7 not in [-5,5)
  ...

Finally, one can index vectors using Boolean arrays. Let's start by defining an array:

scala> val mask = DenseVector(true, false, false, true, true)
breeze.linalg.DenseVector[Boolean] = DenseVector(true, false, false, true, true)

Then, v(mask) results in a view containing the elements of v for which mask is true:

scala> v(mask).toDenseVector
breeze.linalg.DenseVector[Double] = DenseVector(0.0, 3.0, 4.0)

This can be used as a way of filtering certain elements in a vector. For instance, to select the elements of v which are less than 3.0:

scala> val filtered = v(v :< 3.0) // :< is element-wise "less than"
breeze.linalg.SliceVector[Int,Double] = breeze.linalg.SliceVector@2b1edef3

scala> filtered.toDenseVector
breeze.linalg.DenseVector[Double] = DenseVector(0.0, 1.0, 2.0)

Matrices can be indexed in much the same way as vectors. Matrix indexing functions take two arguments—the first argument selects the row(s) and the second one slices the column(s):

scala> val m = DenseMatrix((1.0, 2.0, 3.0), (5.0, 6.0, 7.0))
m: breeze.linalg.DenseMatrix[Double] =
1.0  2.0  3.0
5.0  6.0  7.0

scala> m(1, 2)
Double = 7.0

scala> m(1, -1)
Double = 7.0

scala> m(0 until 2, 0 until 2)
breeze.linalg.DenseMatrix[Double] =
1.0  2.0
5.0  6.0

You can also mix different slicing types for rows and columns:

scala> m(0 until 2, 0)
breeze.linalg.DenseVector[Double] = DenseVector(1.0, 5.0)

Note how, in this case, Breeze returns a vector. In general, slicing returns the following objects:

  • A scalar when single indices are passed as the row and column arguments
  • A vector when the row argument is a range and the column argument is a single index
  • A vector transpose when the column argument is a range and the row argument is a single index
  • A matrix otherwise

The symbol :: can be used to indicate every element along a particular direction. For instance, we can select the second column of m:

scala> m(::, 1)
breeze.linalg.DenseVector[Double] = DenseVector(2.0, 6.0)

Mutating vectors and matrices

Breeze vectors and matrices are mutable. Most of the slicing operations described above can also be used to set elements of a vector or matrix:

scala> val v = DenseVector(1.0, 2.0, 3.0)
v: breeze.linalg.DenseVector[Double] = DenseVector(1.0, 2.0, 3.0)

scala> v(1) = 22.0 // v is now DenseVector(1.0, 22.0, 3.0)

We are not limited to mutating single elements. In fact, all the indexing operations outlined above can be used to set the elements of vectors or matrices. When mutating slices of vectors or matrices, use the element-wise assignment operator, :=:

scala> v(0 until 2) := DenseVector(50.0, 51.0) // set elements at position 0 and 1
breeze.linalg.DenseVector[Double] = DenseVector(50.0, 51.0)

scala> v
breeze.linalg.DenseVector[Double] = DenseVector(50.0, 51.0, 3.0)

The assignment operator, :=, works like other element-wise operators in Breeze. If the right-hand side is a scalar, it will automatically be broadcast to a vector of the given shape:

scala> v(0 until 2) := 0.0 // equivalent to v(0 until 2) := DenseVector(0.0, 0.0)
breeze.linalg.DenseVector[Double] = DenseVector(0.0, 0.0)

scala> v
breeze.linalg.DenseVector[Double] = DenseVector(0.0, 0.0, 3.0)

All element-wise operators have an update counterpart. For instance, the :+= operator acts like the element-wise addition operator :+, but also updates its left-hand operand:

scala> val v = DenseVector(1.0, 2.0, 3.0)
v: breeze.linalg.DenseVector[Double] = DenseVector(1.0, 2.0, 3.0)

scala> v :+= 4.0
breeze.linalg.DenseVector[Double] = DenseVector(5.0, 6.0, 7.0)

scala> v
breeze.linalg.DenseVector[Double] = DenseVector(5.0, 6.0, 7.0)

Notice how the update operator updates the vector in place and returns it.

We have learnt how to slice vectors and matrices in Breeze to create new views of the original data. These views are not independent of the vector they were created from—updating the view will update the underlying vector and vice-versa. This is best illustrated with an example:

scala> val v = DenseVector.tabulate(6) { _.toDouble }
breeze.linalg.DenseVector[Double] = DenseVector(0.0, 1.0, 2.0, 3.0, 4.0, 5.0)

scala> val viewEvens = v(0 until v.length by 2)
breeze.linalg.DenseVector[Double] = DenseVector(0.0, 2.0, 4.0)

scala> viewEvens := 10.0 // mutate viewEvens
breeze.linalg.DenseVector[Double] = DenseVector(10.0, 10.0, 10.0)

scala> viewEvens
breeze.linalg.DenseVector[Double] = DenseVector(10.0, 10.0, 10.0)

scala> v  // v has also been mutated!
breeze.linalg.DenseVector[Double] = DenseVector(10.0, 1.0, 10.0, 3.0, 10.0, 5.0)

This quickly becomes intuitive if we remember that, when we create a vector or matrix, we are creating a view of an underlying data array rather than creating the data itself:

Mutating vectors and matrices

A vector slice v(0 to 6 by 2) of the v vector is just a different view of the array underlying v. The view itself contains no data. It just contains pointers to the data in the original array. Internally, the view is just stored as a pointer to the underlying data and a recipe for iterating over that data: in the case of this slice, the recipe is just "start at the first element of the underlying data and go to the seventh element of the underlying data in steps of two".

Breeze offers a copy function for when we want to create independent copies of data. In the previous example, we can construct a copy of viewEvens as:

scala> val copyEvens = v(0 until v.length by 2).copy
breeze.linalg.DenseVector[Double] = DenseVector(10.0, 10.0, 10.0)

We can now update copyEvens independently of v.

Matrix multiplication, transposition, and the orientation of vectors

So far, we have mostly looked at element-wise operations on vectors and matrices. Let's now look at matrix multiplication and related operations.

The matrix multiplication operator is *:

scala> val m1 = DenseMatrix((2.0, 3.0), (5.0, 6.0), (8.0, 9.0))
breeze.linalg.DenseMatrix[Double] =
2.0  3.0
5.0  6.0
8.0  9.0

scala> val m2 = DenseMatrix((10.0, 11.0), (12.0, 13.0))
breeze.linalg.DenseMatrix[Double] 
10.0  11.0
12.0  13.0

scala> m1 * m2
56.0   61.0
122.0  133.0
188.0  205.0

Besides matrix-matrix multiplication, we can use the matrix multiplication operator between matrices and vectors. All vectors in Breeze are column vectors. This means that, when multiplying matrices and vectors together, a vector should be viewed as an (n * 1) matrix. Let's walk through an example of matrix-vector multiplication. We want the following operation:

Matrix multiplication, transposition, and the orientation of vectors
scala> val v = DenseVector(1.0, 2.0)
breeze.linalg.DenseVector[Double] = DenseVector(1.0, 2.0)

scala> m1 * v 
breeze.linalg.DenseVector[Double] = DenseVector(8.0, 17.0, 26.0)

By contrast, if we wanted:

Matrix multiplication, transposition, and the orientation of vectors

We must convert v to a row vector. We can do this using the transpose operation:

scala> val vt = v.t
breeze.linalg.Transpose[breeze.linalg.DenseVector[Double]] = Transpose(DenseVector(1.0, 2.0))

scala> vt * m2
breeze.linalg.Transpose[breeze.linalg.DenseVector[Double]] = Transpose(DenseVector(34.0, 37.0))

Note that the type of v.t is Transpose[DenseVector[_]]. A Transpose[DenseVector[_]] behaves in much the same way as a DenseVector as far as element-wise operations are concerned, but it does not support mutation or slicing.

Data preprocessing and feature engineering

We have now discovered the basic components of Breeze. In the next few sections, we will apply them to real examples to understand how they fit together to form a robust base for data science.

An important part of data science involves preprocessing datasets to construct useful features. Let's walk through an example of this. To follow this example and access the data, you will need to download the code examples for the book (www.github.com/pbugnion/s4ds).

You will find, in directory chap02/data/ of the code attached to this book, a CSV file with true heights and weights as well as self-reported heights and weights for 181 men and women. The original dataset was collected as part of a study on body image. Refer to the following link for more information: http://vincentarelbundock.github.io/Rdatasets/doc/car/Davis.html.

There is a helper function in the package provided with the book to load the data into Breeze arrays:

scala> val data = HWData.load
HWData [ 181 rows ]

scala> data.genders
breeze.linalg.Vector[Char] = DenseVector(M, F, F, M, ... )

The data object contains five vectors, each 181 element long:

  • data.genders: A Char vector describing the gender of the participants
  • data.heights: A Double vector of the true height of the participants
  • data.weights: A Double vector of the true weight of the participants
  • data.reportedHeights: A Double vector of the self-reported height of the participants
  • data.reportedWeights: A Double vector of the self-reported weight of the participants

Let's start by counting the number of men and women in the study. We will define an array that contains just 'M' and do an element-wise comparison with data.genders:

scala> val maleVector = DenseVector.fill(data.genders.length)('M')
breeze.linalg.DenseVector[Char] = DenseVector(M, M, M, M, M, M,... )

scala> val isMale = (data.genders :== maleVector)
breeze.linalg.DenseVector[Boolean] = DenseVector(true, false, false, true ...)

The isMale vector is the same length as data.genders. It is true where the participant is male, and false otherwise. We can use this Boolean array as a mask for the other arrays in the dataset (remember that vector(mask) selects the elements of vector where mask is true). Let's get the height of the men in our dataset:

scala> val maleHeights = data.heights(isMale)
breeze.linalg.SliceVector[Int,Double] = breeze.linalg.SliceVector@61717d42

scala> maleHeights.toDenseVector
breeze.linalg.DenseVector[Double] = DenseVector(182.0, 177.0, 170.0, ...

To count the number of men in our dataset, we can use the indicator function. This transforms a Boolean array into an array of doubles, mapping false to 0.0 and true to 1.0:

scala> import breeze.numerics._
import breeze.numerics._

scala> sum(I(isMale))
Double: 82.0

Let's calculate the mean height of men and women in the experiment. We can calculate the mean of a vector using mean(v), which we can access by importing breeze.stats._:

scala> import breeze.stats._
import breeze.stats._

scala> mean(data.heights)
Double = 170.75690607734808

To calculate the mean height of the men, we can use our isMale array to slice data.heights; data.heights(isMale) is a view of the data.heights array with all the height values for the men:

scala> mean(data.heights(isMale)) // mean male height
Double = 178.0121951219512

scala> mean(data.heights(!isMale)) // mean female height
Double = 164.74747474747474

As a somewhat more involved example, let's look at the discrepancy between real and reported weight for both men and women in this experiment. We can get an array of the percentage difference between the reported weight and the true weight:

scala> val discrepancy = (data.weights - data.reportedWeights) / data.weights
breeze.linalg.Vector[Double] = DenseVector(0.0, 0.1206896551724138, -0.018867924528301886, -0.029411764705882353, ... )

Notice how Breeze's overloading of mathematical operators allows us to manipulate data arrays easily and elegantly.

We can now calculate the mean and standard deviation of this array for men:

scala> mean(discrepancy(isMale))
res6: Double = -0.008451852933123775

scala> stddev(discrepancy(isMale))
res8: Double = 0.031901519634244195

We can also calculate the fraction of men who overestimated their height:

scala> val overReportMask = (data.reportedHeights :> data.heights).toDenseVector
breeze.linalg.DenseVector[Boolean] = DenseVector(false, false, false, false...

scala> sum(I(overReportMask :& isMale))
Double: 10.0

There are thus ten men who believe they are taller than they actually are. The element-wise AND operator :& returns a vector that is true for all indices for which both its arguments are true. The vector overReportMask :& isMale is thus true for all participants that are male and over-reported their height.

Breeze – function optimization

Having studied feature engineering, let's now look at the other end of the data science pipeline. Typically, a machine learning algorithm defines a loss function that is a function of a set of parameters. The value of the loss function represents how well the model fits the data. The parameters are then optimized to minimize (or maximize) the loss function.

In Chapter 12, Distributed Machine Learning with MLlib, we will look at MLlib, a machine learning library that contains many well-known algorithms. Often, we don't need to worry about optimizing loss functions directly since we can rely on the machine learning algorithms provided by MLlib. It is nevertheless useful to have a basic knowledge of optimization.

Breeze has an optimize module that contains functions for finding a local minimum:

scala> import breeze.optimize._
import breeze.optimize._

Let's create a toy function that we want to optimize:

Breeze – function optimization

We can represent this function in Scala as follows:

scala> def f(xs:DenseVector[Double]) = sum(xs :^ 2.0)
f: (xs: breeze.linalg.DenseVector[Double])Double

Most local optimizers also require the gradient of the function being optimized. The gradient is a vector of the same dimension as the arguments to the function. In our case, the gradient is:

Breeze – function optimization

We can represent the gradient in Breeze with a function that takes a vector argument and returns a vector of the same length:

scala> def gradf(xs:DenseVector[Double]) = 2.0 :* xs
gradf: (xs:breeze.linalg.DenseVector[Double])breeze.linalg.DenseVector[Double]

For instance, at the point (1, 1, 1), we have:

scala> val xs = DenseVector.ones[Double](3)
breeze.linalg.DenseVector[Double] = DenseVector(1.0, 1.0, 1.0)

scala> f(xs)
Double = 3.0

scala> gradf(xs)
breeze.linalg.DenseVector[Double] = DenseVector(2.0, 2.0, 2.0)

Let's set up the optimization problem. Breeze's optimization methods require that we pass in an implementation of the DiffFunction trait with a single method, calculate. This method must return a tuple of the function and its gradient:

scala> val optTrait = new DiffFunction[DenseVector[Double]] {
  def calculate(xs:DenseVector[Double]) = (f(xs), gradf(xs))
}
breeze.optimize.DiffFunction[breeze.linalg.DenseVector[Double]] = <function1>

We are now ready to run the optimization. The optimize module provides a minimize function that does just what we want. We pass it optTrait and a starting point for the optimization:

scala> val minimum = minimize(optTrait, DenseVector(1.0, 1.0, 1.0))
breeze.linalg.DenseVector[Double] = DenseVector(0.0, 0.0, 0.0)

The true minimum is at (0.0, 0.0, 0.0). The optimizer therefore correctly finds the minimum.

The minimize function uses the L-BFGS method to run the optimization by default. It takes several additional arguments to control the optimization. We will explore these in the next sections.

Numerical derivatives

In the previous example, we specified the gradient of f explicitly. While this is generally good practice, calculating the gradient of a function can often be tedious. Breeze provides a gradient approximation function using finite differences. Reusing the same objective function def f(xs:DenseVector[Double]) = sum(xs :^ 2.0) as in the previous section:

scala> val approxOptTrait = new ApproximateGradientFunction(f)
breeze.optimize.ApproximateGradientFunction[Int,breeze.linalg.DenseVector[Double]] = <function1>

The trait approxOptTrait has a gradientAt method that returns an approximation to the gradient at a point:

scala> approxOptTrait.gradientAt(DenseVector.ones(3))
breeze.linalg.DenseVector[Double] = DenseVector(2.00001000001393, 2.00001000001393, 2.00001000001393)

Note that this can be quite inaccurate. The ApproximateGradientFunction constructor takes an epsilon optional argument that controls the size of the step taken when calculating the finite differences. Changing the value of epsilon can improve the accuracy of the finite difference algorithm.

The ApproximateGradientFunction instance implements the DiffFunction trait. It can therefore be passed to minimize directly:

scala> minimize(approxOptTrait, DenseVector.ones[Double](3))
breeze.linalg.DenseVector[Double] = DenseVector(-5.000001063126813E-6, -5.000001063126813E-6, -5.000001063126813E-6)

This, again, gives a result close to zero, but somewhat further away than when we specified the gradient explicitly. In general, it will be significantly more efficient and more accurate to calculate the gradient of a function analytically than to rely on Breeze's numerical gradient. It is probably best to only use the numerical gradient during data exploration or to check analytical gradients.

Regularization

The minimize function takes many optional arguments relevant to machine learning algorithms. In particular, we can instruct the optimizer to use a regularization parameter when performing the optimization. Regularization introduces a penalty in the loss function to prevent the parameters from growing arbitrarily. This is useful to avoid overfitting. We will discuss regularization in greater detail in Chapter 12, Distributed Machine Learning with MLlib.

For instance, to use L2Regularization with a hyperparameter of 0.5:

scala> minimize(optTrait, DenseVector(1.0, 1.0, 1.0), L2Regularization(0.5))
breeze.linalg.DenseVector[Double] = DenseVector(0.0, 0.0, 0.0)

The regularization makes no difference in this case, since the parameters are zero at the minimum.

To see a list of optional arguments that can be passed to minimize, consult the Breeze documentation online.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.16.137.117