
To create a neural network with the Shogun library, we start by defining the architecture of the network. We do this with the CNeuralLayers class, which aggregates the network layers. It provides the following methods for creating layers:

  • input: Creates the input layer with a specified number of dimensions
  • logistic: Creates a fully connected hidden layer with the logistic (sigmoid) activation function
  • linear: Creates a fully connected hidden layer with the linear activation function
  • rectified_linear: Creates a fully connected hidden layer with the ReLU activation function
  • leaky_rectified_linear: Creates a fully connected hidden layer with the Leaky ReLU activation function
  • softmax: Creates a fully connected hidden layer with the softmax activation function
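
To make the differences between these layer types concrete, the activation functions named above can be sketched in plain C++. This is an illustration only and does not use Shogun: the function names simply mirror the method names in the list, and the leaky ReLU slope of 0.01 is an assumed default.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Logistic (sigmoid): squashes any real input into the range (0, 1).
inline double logistic(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// Linear: the identity function, typically used for regression outputs.
inline double linear(double x) { return x; }

// ReLU: passes positive inputs through and maps negative inputs to zero.
inline double rectified_linear(double x) { return std::max(0.0, x); }

// Leaky ReLU: like ReLU, but negative inputs keep a small slope.
// The default slope of 0.01 is an assumption made for this sketch.
inline double leaky_rectified_linear(double x, double alpha = 0.01) {
  return x > 0.0 ? x : alpha * x;
}

// Softmax: turns a vector of scores into probabilities that sum to 1.
// Subtracting the maximum first keeps std::exp from overflowing.
inline std::vector<double> softmax(const std::vector<double>& z) {
  std::vector<double> out(z.size());
  if (z.empty()) return out;
  const double max_z = *std::max_element(z.begin(), z.end());
  double sum = 0.0;
  for (std::size_t i = 0; i < z.size(); ++i) {
    out[i] = std::exp(z[i] - max_z);
    sum += out[i];
  }
  for (double& v : out) v /= sum;
  return out;
}
```

Note that softmax operates on the whole layer output at once, whereas the other functions are applied to each neuron independently.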

Each of these methods returns a new object of the CNeuralLayers class, which contains all the previous layers with an added new one. So, to add a new layer, we can write the following code:

// create the initial object
auto layers = some<CNeuralLayers>();
// add the input layer
layers = wrap(layers->input(dimensions));
// add the hidden layer
layers = wrap(layers->logistic(32));
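
The "return a new object that holds the previous layers plus one more" behavior described above can be sketched independently of Shogun. The LayerSpec and LayerListBuilder types below are hypothetical and exist only for this illustration; they are not part of the Shogun API.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of one layer; not a Shogun type.
struct LayerSpec {
  std::string kind;
  int num_neurons;
};

// Hypothetical builder: every method returns a NEW builder that holds the
// previous layers plus one more, leaving the original builder untouched.
class LayerListBuilder {
 public:
  std::shared_ptr<LayerListBuilder> input(int dims) const {
    return with({"input", dims});
  }
  std::shared_ptr<LayerListBuilder> logistic(int neurons) const {
    return with({"logistic", neurons});
  }
  std::shared_ptr<LayerListBuilder> linear(int neurons) const {
    return with({"linear", neurons});
  }
  // Returns the accumulated list of layers.
  std::vector<LayerSpec> done() const { return layers_; }

 private:
  std::shared_ptr<LayerListBuilder> with(LayerSpec spec) const {
    auto next = std::make_shared<LayerListBuilder>(*this);  // copy old layers
    next->layers_.push_back(std::move(spec));               // append new one
    return next;
  }
  std::vector<LayerSpec> layers_;
};
```

Because each call produces a fresh object, the caller has to reassign the pointer after every step, which is the same pattern as the layers = wrap(...) lines in the listing above.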

Each time we add a new layer, we overwrite the pointer to the CNeuralLayers object. After all the layers have been added, we have to call the done method of the CNeuralLayers class. It returns an array of configured layers, which can be used to create a CNeuralNetwork object. The CNeuralNetwork class implements the functionality for network initialization and training. After we've created the CNeuralNetwork object, we have to connect all the layers by calling the quick_connect method. Then, we can initialize the weights of all the layers by calling the initialize_neural_network method. This method takes an optional parameter, sigma, which is the standard deviation of the Gaussian distribution that's used to initialize the parameters randomly.
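
To illustrate what the sigma parameter controls, here is a minimal standalone sketch of Gaussian weight initialization that uses only the C++ standard library. The function name init_gaussian_weights, the default sigma of 0.01, and the fixed seed are assumptions made for this example; Shogun's internal implementation may differ.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Draws `count` weights from a zero-mean Gaussian with standard deviation
// `sigma` (which must be positive). The default sigma and the fixed seed
// are assumptions made for this sketch.
inline std::vector<double> init_gaussian_weights(std::size_t count,
                                                 double sigma = 0.01,
                                                 unsigned seed = 42) {
  std::mt19937 gen(seed);
  std::normal_distribution<double> dist(0.0, sigma);
  std::vector<double> weights(count);
  for (double& w : weights) w = dist(gen);
  return weights;
}
```

A small sigma keeps the initial weights close to zero, so the activations start in a well-behaved range, while the randomness breaks the symmetry between neurons in the same layer.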

After we've configured the neural network, we should configure the optimization algorithm. This configuration is also done through the CNeuralNetwork object. First, we should specify the optimization method. This class supports the gradient descent and Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithms. BFGS is a quasi-Newton iterative method for solving unconstrained, nonlinear optimization problems: instead of computing second derivatives directly, it uses gradient information to build up an approximation of them.

For this sample, we chose the gradient descent method by calling the set_optimization_method method with the NNOM_GRADIENT_DESCENT enumeration value as its argument. The other settings are standard for configuring gradient descent. The set_gd_mini_batch_size method sets the size of a mini-batch. The set_l2_coefficient method sets the value of the L2 regularization (weight decay) parameter. The set_gd_learning_rate method sets the learning rate parameter. The set_gd_momentum method sets the momentum parameter value. With the set_max_num_epochs method, we can set the maximum number of training epochs, and with the set_epsilon method, we can define the convergence criterion value for the loss function.

The loss function can't be explicitly configured in the Shogun library. It is selected automatically based on the type of labels specified with the set_labels method. In this example, we used the CRegressionLabels type for the labels because we are solving a regression task. Network training is done with the train method, which takes an object of the CDenseFeatures type. This object contains the set of all the training samples.

The source code for this example is as follows:

size_t n = 10000;
SGMatrix<float64_t> x_values(1, static_cast<index_t>(n));
SGVector<float64_t> y_values(static_cast<index_t>(n));

auto x = some<CDenseFeatures<float64_t>>(x_values);
auto y = some<CRegressionLabels>(y_values);

auto dimensions = x->get_num_features();
auto layers = some<CNeuralLayers>();
layers = wrap(layers->input(dimensions));
layers = wrap(layers->rectified_linear(32));
layers = wrap(layers->rectified_linear(16));
layers = wrap(layers->rectified_linear(8));
layers = wrap(layers->linear(1));
auto all_layers = layers->done();

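auto network = some<CNeuralNetwork>(all_layers);
network->quick_connect(); // connect the layers to each other
network->initialize_neural_network(); // random Gaussian weight initialization

network->set_optimization_method(NNOM_GRADIENT_DESCENT);
// the mini-batch size, epoch count, learning rate, and momentum values
// below are illustrative and should be tuned for the task at hand
network->set_gd_mini_batch_size(64);
network->set_l2_coefficient(0.0001); // regularization
network->set_max_num_epochs(500);
network->set_epsilon(0.0); // convergence criteria
network->set_gd_learning_rate(0.01);
network->set_gd_momentum(0.9);

network->set_labels(y);
network->train(x);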


To see the training's progress, we can set a higher logging level for the Shogun library with the following call:

shogun::sg_io->set_loglevel(shogun::MSG_DEBUG);

This function allows us to see a lot of additional information about the overall training process, which can help us debug and find problems in the network we train.

