Bayes' theorem refresher

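Bayes' theorem expresses the conditional probability of one event (for instance, that an email is spam as opposed to benign ham) given another event (for example, that the email contains certain words), as follows:

$$P(\text{spam} \mid \text{words}) = \frac{P(\text{words} \mid \text{spam})\, P(\text{spam})}{P(\text{words})}$$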

The posterior probability that an email is in fact spam, given it contains certain words, depends on the interplay of three factors:

  • The prior probability that an email is spam
  • The likelihood of encountering these words in a spam email
  • The evidence; that is, the probability of seeing these words in an email
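
As a minimal sketch of how these three factors combine, the following computes the posterior for a two-class problem. The function name `posterior_spam` and all numbers are illustrative assumptions, not values taken from a real corpus.

```python
def posterior_spam(prior_spam, lik_spam, lik_ham):
    """Return P(spam | words) via Bayes' theorem for a two-class problem."""
    prior_ham = 1.0 - prior_spam
    # Evidence: total probability of seeing the words in any email,
    # summed over both classes.
    evidence = lik_spam * prior_spam + lik_ham * prior_ham
    return lik_spam * prior_spam / evidence


# Hypothetical values: 20% of emails are spam; the words appear in
# 30% of spam emails and in 1% of ham emails.
p = posterior_spam(prior_spam=0.2, lik_spam=0.3, lik_ham=0.01)
print(round(p, 4))
```

With these made-up inputs, the posterior comes out much higher than the 20% prior, because the words are far more likely to appear in spam than in ham.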

To decide which outcome has the higher posterior, we can ignore the evidence because it is the same for all outcomes (spam versus ham), and the unconditional prior is usually easy to estimate.
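
To illustrate why the evidence can be dropped, here is a small sketch that scores each class by prior times likelihood only. The helper `unnormalized_scores` and the numbers are hypothetical and chosen purely for illustration.

```python
def unnormalized_scores(prior_spam, lik_spam, lik_ham):
    """Return prior * likelihood for each class, skipping the evidence."""
    return {
        "spam": lik_spam * prior_spam,
        "ham": lik_ham * (1.0 - prior_spam),
    }


# Hypothetical inputs: 20% spam prior, words seen in 30% of spam, 1% of ham.
scores = unnormalized_scores(prior_spam=0.2, lik_spam=0.3, lik_ham=0.01)

# The class with the larger score also has the larger posterior,
# because both scores would be divided by the same evidence.
predicted = max(scores, key=scores.get)

# If actual probabilities are needed, normalizing the scores recovers them.
total = sum(scores.values())
posteriors = {label: score / total for label, score in scores.items()}
print(predicted, posteriors)
```

The normalizing constant `total` is exactly the evidence for this two-class case, so nothing is lost by computing it only when calibrated probabilities are required.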

However, the likelihood poses insurmountable challenges for a reasonably sized vocabulary and a real-world corpus of emails. The reason is the combinatorial explosion of word combinations that did or did not appear jointly in different documents, which prevents the evaluation required to compute a probability table and assign a value to the likelihood.
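
A rough way to see the scale of the problem, assuming each vocabulary word is treated as a binary present/absent feature (a simplifying assumption for this sketch), is to count the rows a full joint likelihood table would need per class. The function name `joint_table_rows` is hypothetical.

```python
def joint_table_rows(vocab_size):
    """Count the distinct present/absent patterns over a vocabulary."""
    # Each word is either present or absent, so the count doubles per word.
    return 2 ** vocab_size


for vocab_size in (10, 20, 100):
    print(vocab_size, joint_table_rows(vocab_size))
```

Even a 20-word vocabulary already yields more than a million combinations, and most of them would never be observed in any realistic corpus, so their probabilities could not be estimated from counts.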
