A discrete Bayesian network via bnlearn

Bayesian networks are probabilistic graphical models used for understanding how different variables interact with each other. They are built by exploiting the conditional dependencies of each variable using Bayesian theory. For example, let's assume that we have three variables: sleep quality, diet quality, and work performance. For the sake of simplicity, let's also assume that each variable can only take two values: high and low. In our usual regression or classification framework, we would model one of these variables in terms of all the rest. Of course, we would need to take care to choose a dependent variable that is caused by the covariates in some way (in order to make an inference in a regression context, causality needs to flow from the covariates to the dependent variable). BNs operate differently, and in this case, we could define several networks (for example, these two):

Note that we are, in principle, establishing a causal ordering between the variables, but causality in BNs (as in any other model actually) needs to be treated with a lot of care.  The main problem is that we are imposing it, but not formally deriving it mathematically. The mechanics of BNs can work regardless of whether sleep quality causes diet quality and whether this last one causes performance, or vice versa. This is why deducing causality from a BN is usually wrong.

A nice aspect of BNs is that we can easily map how conditional probabilities change when modeled as an interconnected system. This can be done in two ways: as a bottom-up strategy, given a high or low performance, we can find out what's the most likely sleep-diet status to have caused it; or, as a top-down (predictive) approach: given a sleep-diet combination, find out what the most likely performance? There are essentially two ways of defining BNs, and there is a third one that is a combination of these two:

  • Expert-defined: External knowledge is used to build the structure of the network. This is what we did in the previous example.
  • Data-driven: The data is used to dictate what an appropriate configuration of the network would be by using an appropriate algorithm.
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.224.54.136