Multivariate Bernoulli classification

So far, our investigation of Naïve Bayes has focused on features that are essentially binary {UP=1, DOWN=0}. The mean value is computed as the ratio of the number of observations for which xi = UP to the total number of observations.

As stated in the first section, the Gaussian distribution is more appropriate either for continuous features or for binary features in the case of very large labeled datasets. This example, with its binary features, is therefore a good candidate for the Bernoulli model.

Model

The Bernoulli model differs from the Naïve Bayes classifier in that it penalizes a feature x that has no observations, whereas the Naïve Bayes classifier simply ignores it [5:10].

Note

The Bernoulli mixture model

M8: For a feature function fk, with fk = 1 if the feature is observed and 0 otherwise, and p the probability that the observed feature xk belongs to the class Cj, the posterior probability is computed as follows:

p(xk | Cj) = p.fk + (1 - p).(1 - fk)

Implementation

The implementation of the Bernoulli model consists of modifying the score function in the Likelihood class using the Bernoulli density method, bernoulli, defined in the Stats object:

// Bernoulli density (formula M8): mean*p + (1 - mean)*(1 - p)
def bernoulli(mean: Double, p: Int): Double = 
    mean*p + (1 - mean)*(1 - p)

// Adapter matching the Density type, (Double*) => Double
def bernoulli(x: Double*): Double = bernoulli(x(0), x(1).toInt)
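A short, self-contained sketch of how the density behaves (the method bodies are copied from the listing above; the surrounding object is illustrative, not the book's Stats object):

```scala
object BernoulliDemo extends App {
  // Bernoulli density (formula M8): mean*p + (1 - mean)*(1 - p)
  def bernoulli(mean: Double, p: Int): Double =
    mean*p + (1 - mean)*(1 - p)

  // Adapter matching the Density type, (Double*) => Double
  def bernoulli(x: Double*): Double = bernoulli(x(0), x(1).toInt)

  // If a feature was UP for 70% of the training observations (mean = 0.7),
  // an UP observation has likelihood 0.7 and a DOWN observation 0.3.
  assert(math.abs(bernoulli(0.7, 1) - 0.7) < 1e-9)
  assert(math.abs(bernoulli(0.7, 0) - 0.3) < 1e-9)
  assert(math.abs(bernoulli(0.7, 0.0) - 0.3) < 1e-9) // varargs version
}
```

Note that, unlike a Gaussian density, this expression simply selects the empirical frequency of the observed binary value, so it never rewards an unobserved feature.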

The first version of the bernoulli method is a direct implementation of the mathematical formula M8. The second version matches the signature of the Density type, (Double*) => Double.

The mean value is computed in the same way as for the Gaussian density function. The binary feature is implemented as an Int with the value UP = 1 (respectively DOWN = 0) for the upward (respectively downward) direction of the financial technical indicator.
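The modified score function itself is not shown here; a minimal sketch of a Bernoulli-based log-likelihood score, assuming a class is summarized by its prior and a per-feature vector of means (the names score, means, and prior are illustrative, not the book's Likelihood class API):

```scala
// Log-likelihood of a binary observation under the Bernoulli model:
// log(prior) + sum over features of log(mean*x + (1 - mean)*(1 - x)).
def score(obs: Array[Int], means: Array[Double], prior: Double): Double =
  means.zip(obs).foldLeft(math.log(prior)) { case (logP, (mean, x)) =>
    logP + math.log(mean*x + (1 - mean)*(1 - x))
  }

// Two features: the first UP in 70% of training data, the second in 40%.
// The observation {UP, DOWN} scores log(0.5) + log(0.7) + log(0.6).
val s = score(Array(1, 0), Array(0.7, 0.4), 0.5)
```

Working in log space keeps the product of many per-feature densities numerically stable, exactly as for the Gaussian version of the classifier.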
