Boltzmann machine architecture

The Boltzmann machine architecture is based on input, output, and hidden nodes. The connection weights are symmetrical, that is, wij = wji.

Because of this symmetry, Boltzmann machines are highly recurrent, and this recurrence eliminates any basic difference between input and output nodes: any node can be treated as an input or an output as needed. The Boltzmann machine is a network of units with an energy defined for the overall network. Its units produce binary results (values of 1 or 0). Outputs are computed probabilistically and depend on the temperature variable T.

The consensus function of the Boltzmann machine is given by the following formula:

In the previous formula, the terms are defined as follows:

  • Si is the state of unit i (1 or 0)
  • wij is the connection strength between unit j and unit i
  • uj is the output of unit j

The calculation proceeds within the machine in a stochastic manner so that the consensus is increased. Thus, if wij is positive, there is a tendency to have units i and j both activated or both deactivated, while if the weight is negative, there is a tendency to have them with different activations (one activated and the other not). When a weight is positive, it is called excitatory; otherwise, it is called inhibitory.
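The effect of the sign of a weight can be checked numerically. The following is a minimal sketch, assuming the common pairwise form of the consensus function in which each connection contributes wij times the product of the two unit states; the function name `consensus` and the two-unit weight matrices are hypothetical and chosen only for illustration.

```python
import numpy as np

def consensus(weights, states):
    """Sum of w_ij * s_i * s_j over each pair i < j (assumed pairwise form)."""
    # Keep only the upper triangle so each symmetric connection is counted once.
    upper = np.triu(weights, k=1)
    return float(states @ upper @ states)

# Hypothetical two-unit networks: one excitatory and one inhibitory connection.
w_exc = np.array([[0.0, 2.0], [2.0, 0.0]])
w_inh = np.array([[0.0, -2.0], [-2.0, 0.0]])

for name, w in [("excitatory", w_exc), ("inhibitory", w_inh)]:
    for s in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        print(name, s, consensus(w, np.array(s, dtype=float)))
```

With states of 0 and 1, only the configuration in which both units are active contributes to the sum, so a positive weight rewards joint activation and a negative weight penalizes it.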

Each binary unit makes a stochastic decision to be either 1 (with probability pi) or 0 (with probability 1 - pi). This probability is given by the following formula:
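This probability is commonly written as the logistic function of the unit's total input divided by the temperature T. The sketch below assumes that form, with no bias terms and no self-connections; the function names `activation_probability` and `sample_unit` and the example weights are hypothetical.

```python
import numpy as np

def activation_probability(weights, states, i, temperature):
    """p_i = 1 / (1 + exp(-net_i / T)), with net_i = sum_j w_ij * s_j (assumed form)."""
    net_input = weights[i] @ states  # assumes a zero diagonal (no self-connection)
    return 1.0 / (1.0 + np.exp(-net_input / temperature))

def sample_unit(weights, states, i, temperature, rng):
    """Set unit i to 1 with probability p_i, otherwise to 0, and return the new state."""
    p_i = activation_probability(weights, states, i, temperature)
    states[i] = 1.0 if rng.random() < p_i else 0.0
    return states[i]

# Hypothetical two-unit network in which unit 1 is active and excites unit 0.
weights = np.array([[0.0, 2.0], [2.0, 0.0]])
states = np.array([0.0, 1.0])
rng = np.random.default_rng(0)

for temperature in (10.0, 1.0, 0.1):
    print(temperature, activation_probability(weights, states, 0, temperature))
print(sample_unit(weights, states, 0, 1.0, rng))
```

At high T the probability stays close to 0.5 whatever the input, while at low T the decision becomes almost deterministic.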

At the equilibrium state of the network, the probability of a state is proportional to the exponentiated negative energy of that state, which is known as the Boltzmann distribution. You can imagine that by supplying energy, you can get the system out of a local minimum. This must be done gently, because a violent shock can also drive the system away from the global minimum. The best method is to supply energy and then slowly reduce it. The same concept is used in metallurgy, where an ordered state of the metal is obtained by first melting it and then slowly reducing the temperature. This gradual reduction in temperature is called annealing, and its computational counterpart is called simulated annealing.

This method can be reproduced by adding a probabilistic update rule to the Hopfield network (refer to Chapter 13, Beyond Feedforward Networks – CNN and RNN); the network that reproduces it is called a Boltzmann machine. The parameter that varies is the temperature T: at high T, the probability of jumping to a higher energy state is much greater than at low T.
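The idea of starting hot and cooling slowly can be sketched as a short loop. This is an illustration under stated assumptions, not a reference implementation: the geometric cooling schedule, the logistic update rule, the energy written as minus one half of the weighted sum of state products, and all names and default values (`anneal`, `t_start`, `t_end`, `cooling`, `sweeps`) are choices made here.

```python
import numpy as np

def energy(weights, states):
    """Network energy, assuming symmetric weights with a zero diagonal."""
    return -0.5 * float(states @ weights @ states)

def anneal(weights, t_start=10.0, t_end=0.1, cooling=0.9, sweeps=20, seed=0):
    """Stochastic unit updates while the temperature is lowered geometrically."""
    rng = np.random.default_rng(seed)
    n = weights.shape[0]
    states = rng.integers(0, 2, size=n).astype(float)  # random binary start
    temperature = t_start
    while temperature > t_end:
        for _ in range(sweeps):
            for i in rng.permutation(n):  # visit the units in random order
                net_input = weights[i] @ states
                p_on = 1.0 / (1.0 + np.exp(-net_input / temperature))
                states[i] = 1.0 if rng.random() < p_on else 0.0
        temperature *= cooling  # slow cooling step
    return states, energy(weights, states)

# Hypothetical three-unit network: units 0 and 1 excite each other, unit 2 inhibits both.
weights = np.array([[0.0, 2.0, -1.0],
                    [2.0, 0.0, -1.0],
                    [-1.0, -1.0, 0.0]])
final_states, final_energy = anneal(weights)
print(final_states, final_energy)
```

A value of `cooling` closer to 1 gives a slower schedule, trading run time for a better chance of settling into a low-energy state.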

As the temperature drops, the probability of settling into the correct minimum-energy state approaches 1, and the network reaches thermal equilibrium. Each unit of the network makes an energy leap given by the following formula:
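In the usual formulation, assumed here because it matches the symmetric weights and binary states described earlier, the energy leap of unit i is the difference between the network energy with the unit off and with the unit on. The state symbol s_j and the bias term \theta_i are notation introduced for this sketch; the bias can be set to zero.

```latex
\Delta E_i = E_{s_i = 0} - E_{s_i = 1} = \sum_{j} w_{ij}\, s_j + \theta_i
```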

The system changes to a state of lower energy according to the following probabilistic rule (transition function):
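One common form of such a transition rule accepts a move whose energy change is ΔE with probability 1 / (1 + exp(ΔE / T)). The sketch below assumes that form; the function name `transition_probability` and the energy values are hypothetical.

```python
import math

def transition_probability(delta_e, temperature):
    """Probability of accepting a move with delta_e = E_new - E_old (assumed logistic rule)."""
    return 1.0 / (1.0 + math.exp(delta_e / temperature))

uphill = 1.0  # hypothetical move to a higher energy state
for temperature in (10.0, 1.0, 0.1):
    p_up = transition_probability(uphill, temperature)
    p_down = transition_probability(-uphill, temperature)
    print(f"T={temperature}: uphill={p_up:.5f}, downhill={p_down:.5f}")
```

Under this rule, a move that leaves the energy unchanged is accepted with probability 0.5, and the uphill and downhill probabilities for the same energy difference always sum to 1.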

It is seen that the probability of transition to a higher energy state is greater at high T than at low T. The network can assume a configuration of stable states according to the following Boltzmann distribution:
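The Boltzmann distribution in its usual form is sketched below. The normalizing sum Z over all states c is notation introduced here, and the Boltzmann constant is assumed to be absorbed into T. The ratio on the right follows by dividing the expression for P_a by the one for P_b, so that Z cancels.

```latex
P_a = \frac{e^{-E_a / T}}{Z}, \qquad
Z = \sum_{c} e^{-E_c / T}, \qquad
\frac{P_a}{P_b} = e^{-(E_a - E_b) / T}
```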

That is, the probability of a state depends on its energy and on the temperature of the system. Lower energy states are more likely: if Ea < Eb, then Pa/Pb > 1, and therefore Pa > Pb. So the system tends toward a state of minimum energy.
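The relation between Pa and Pb can be verified with a few lines of code. This is a minimal sketch, assuming the standard Boltzmann distribution with the Boltzmann constant absorbed into T; the function name `boltzmann_probabilities` and the two energy values are hypothetical.

```python
import numpy as np

def boltzmann_probabilities(energies, temperature):
    """Normalized probabilities proportional to exp(-E / T)."""
    energies = np.asarray(energies, dtype=float)
    # Subtracting the minimum energy avoids overflow and leaves the ratios unchanged.
    unnormalized = np.exp(-(energies - energies.min()) / temperature)
    return unnormalized / unnormalized.sum()

energies = [1.0, 2.0]  # hypothetical energies with E_a < E_b
for temperature in (10.0, 1.0, 0.1):
    p_a, p_b = boltzmann_probabilities(energies, temperature)
    print(f"T={temperature}: P_a={p_a:.4f}, P_b={p_b:.4f}, P_a/P_b={p_a / p_b:.4g}")
```

The ratio Pa/Pb stays above 1 at every temperature and grows as T decreases, which is why cooling concentrates the probability on the lowest energy state.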
