402 Handbook of Big Data
learned from observational data. A prominent algorithm for this purpose is the fast causal
inference (FCI) algorithm, an adaptation of the PC algorithm [59–61,70]. Adaptations
of FCI that are computationally more efficient include RFCI and FCI+ [13,15]. High-
dimensional consistency of FCI and RFCI is shown in [15]. The order-dependence issues
studied in [14] (see Section 21.3.1) apply to all these algorithms, and order-independent
versions can be easily derived. The algorithms FCI, RFCI, and FCI+ are available in the
R-package pcalg [29]. There is also an adaptation of LiNGAM that allows for hidden
variables [25]. Causal structure learning methods that allow for feedback loops can be found
in [36,37,50].
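The constraint-based idea underlying PC and FCI can be illustrated with a minimal skeleton search: start from a complete undirected graph and delete the edge between any pair of variables that tests as conditionally independent given some subset of the remaining neighbors. The following Python sketch is our own illustration, not the pcalg implementation; it assumes Gaussian data so that Fisher-z partial-correlation tests apply, and the toy chain is simulated:

```python
import itertools
import numpy as np
from scipy import stats

def gauss_ci_test(data, i, j, cond, alpha):
    """Fisher-z partial-correlation test of X_i independent of X_j given X_cond."""
    sub = np.corrcoef(data[:, [i, j] + list(cond)], rowvar=False)
    prec = np.linalg.inv(sub)                 # precision matrix of the submatrix
    r = -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])
    r = np.clip(r, -0.999999, 0.999999)
    z = 0.5 * np.log((1 + r) / (1 - r)) * np.sqrt(data.shape[0] - len(cond) - 3)
    pval = 2 * (1 - stats.norm.cdf(abs(z)))
    return pval > alpha                       # True means "looks independent"

def pc_skeleton(data, alpha=0.001):
    """PC-style skeleton search: start complete, delete edges that test independent."""
    p = data.shape[1]
    adj = {i: set(range(p)) - {i} for i in range(p)}
    level = 0                                 # size of the conditioning sets
    while any(len(adj[i]) - 1 >= level for i in range(p)):
        for i in range(p):
            for j in list(adj[i]):
                others = adj[i] - {j}
                if len(others) < level:
                    continue
                for cond in itertools.combinations(sorted(others), level):
                    if gauss_ci_test(data, i, j, cond, alpha):
                        adj[i].discard(j)
                        adj[j].discard(i)
                        break
        level += 1
    return adj

# Toy chain X0 -> X1 -> X2: the skeleton should keep 0-1 and 1-2 but drop 0-2,
# since X0 and X2 are independent given X1.
rng = np.random.default_rng(0)
x0 = rng.normal(size=5000)
x1 = 2 * x0 + rng.normal(size=5000)
x2 = 2 * x1 + rng.normal(size=5000)
data = np.column_stack([x0, x1, x2])
skel = pc_skeleton(data)
print(skel)
```

FCI and its variants elaborate on this scheme to remain correct when hidden variables are present; in practice one would use the implementations in pcalg rather than a sketch like this.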
21.5.3 Time Series Data
Time series data are well suited to causal inference, because the time order of the
observations carries important causal information. There are adaptations of the PC and FCI algorithms for
time series data [11,16,18]. These are computationally intensive when considering several
time lags, because they replicate variables for the different time lags.
Another approach for discrete time series data consists of modeling the system as
a structural vector autoregressive model. One can then use a two-step approach, first
estimating the vector autoregressive model and its residuals, and then applying a causal
structure learning method to the residuals to learn the contemporaneous causal structure.
This approach is for example used in [27].
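The two-step idea can be sketched as follows: fit the lagged part of the vector autoregressive model by least squares, then hand the residuals to a causal structure learning method to recover the contemporaneous structure. The Python sketch below uses a toy model of our own; in step 2 we only display the residual correlation that such a method (e.g., LiNGAM as in [27]) would analyze:

```python
import numpy as np

def fit_var1(x):
    """Step 1: OLS fit of x_t = A x_{t-1} + e_t; returns A and the residuals e_t."""
    past, present = x[:-1], x[1:]
    B, *_ = np.linalg.lstsq(past, present, rcond=None)  # present ~ past @ B
    return B.T, present - past @ B

# Toy 2-variable VAR(1) whose noise has a contemporaneous effect e0 -> e1.
rng = np.random.default_rng(1)
T = 20000
A_true = np.array([[0.5, 0.0], [0.2, 0.3]])
x = np.zeros((T, 2))
for t in range(1, T):
    e0 = rng.normal()
    e1 = 0.8 * e0 + rng.normal()      # contemporaneous structure lives in the noise
    x[t] = A_true @ x[t - 1] + np.array([e0, e1])

A_hat, resid = fit_var1(x)
print(np.round(A_hat, 2))             # step 1 recovers the lagged coefficients
# Step 2 would now apply a causal structure learning method to the residuals;
# their strong correlation reflects the contemporaneous link e0 -> e1.
print(np.round(np.corrcoef(resid.T)[0, 1], 2))
```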
Finally, an approach based on Bayesian time series models, applicable to large-scale
systems, was proposed in [7].
21.5.4 Causal Structure Learning from Heterogeneous Data
There is interesting work on causal structure learning from heterogeneous data. For example,
one can consider a mix of observational and various experimental datasets [22,47], or
different datasets with overlapping sets of variables [63,64], or a combination of both [67].
A related line of work is concerned with transportability of causal effects [5].
21.5.5 Covariate Adjustment
Given a DAG, a set of intervention variables X, and a set of target variables Y, Pearl’s
backdoor criterion is a sufficient graphical criterion for determining whether a given set of
variables can be used for adjustment when computing the effect of X on Y. This result was
strengthened to a necessary and sufficient condition for DAGs in [56] and for MAGs in [68].
Pearl’s backdoor criterion was generalized to CPDAGs, MAGs and PAGs in [31], and the
necessary and sufficient condition of [56] was generalized to all these graph types in [45].
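In the linear Gaussian case, the backdoor criterion has a simple regression interpretation: if a set Z satisfies the criterion for (X, Y), then the coefficient of X in the regression of Y on X and Z equals the total causal effect of X on Y. A small Python sketch on a toy confounded model (the variable names and coefficients are illustrative):

```python
import numpy as np

# Toy confounded linear model: Z -> X, Z -> Y, and X -> Y with causal effect 1.0.
# The set {Z} blocks the backdoor path X <- Z -> Y.
rng = np.random.default_rng(2)
n = 100_000
z = rng.normal(size=n)
x = 0.8 * z + rng.normal(size=n)
y = 1.0 * x + 1.5 * z + rng.normal(size=n)

def ols(y, *cols):
    """OLS coefficients of y on the given columns (an intercept is added)."""
    X = np.column_stack([np.ones_like(y)] + list(cols))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1:]                   # drop the intercept

biased = ols(y, x)[0]                 # no adjustment: confounded by Z
adjusted = ols(y, x, z)[0]            # adjusting for the backdoor set {Z}
print(biased, adjusted)               # the first is biased upward; the second
                                      # recovers the causal effect 1.0
```

The graphical criteria discussed above answer the harder question of which sets Z are valid when the graph is a CPDAG, MAG, or PAG rather than a known DAG.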
21.5.6 Measures of Uncertainty
The estimates of IDA come without a measure of uncertainty. (The regressions in
IDA do produce standard errors, but these assume that the estimated CPDAG is correct;
hence, they underestimate the true uncertainty.) Asymptotically valid confidence intervals
could be obtained using sample splitting methods (cf. [35]), but their performance is not
satisfactory for small samples. Another approach that provides a measure of uncertainty
for the presence of direct effects is given in [47]. More work toward quantifying uncertainty
would be highly desirable.
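The sample-splitting idea can be sketched in a deliberately simplified regression setting: select the adjustment model on one half of the data and compute confidence intervals on the other half, so that the standard errors do not condition on a model chosen from the same data. The Python illustration below uses a crude screening rule and toy data of our own, far simpler than the CPDAG estimation underlying IDA:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4000
z = rng.normal(size=(n, 5))                       # candidate covariates
x = 0.9 * z[:, 0] + rng.normal(size=n)            # only z_0 affects x ...
y = 1.0 * x + 1.2 * z[:, 0] + rng.normal(size=n)  # ... and y, so z_0 confounds

half = n // 2
a, b = slice(0, half), slice(half, None)          # selection half / inference half

# Step 1 (selection half): keep covariates clearly correlated with both x and y.
keep = [j for j in range(5)
        if abs(np.corrcoef(z[a, j], x[a])[0, 1]) > 0.1
        and abs(np.corrcoef(z[a, j], y[a])[0, 1]) > 0.1]

# Step 2 (inference half): OLS of y on x and the selected covariates; the
# standard error is valid because the selection used independent data.
X = np.column_stack([np.ones(n - half), x[b], z[b][:, keep]])
beta, *_ = np.linalg.lstsq(X, y[b], rcond=None)
resid = y[b] - X @ beta
sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
ci = (beta[1] - 1.96 * se, beta[1] + 1.96 * se)
print(keep, beta[1], ci)
```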
A Review of Some Recent Advances in Causal Inference 403
21.6 Summary
In this chapter, we discussed the estimation of causal effects from observational data.
This problem is relevant in many fields of science, because understanding cause–effect
relationships is fundamental and randomized controlled experiments are not always possible.
There has been a lot of recent progress in this field. We have tried to give an overview of some of
the theory behind selected methods, as well as some pointers to further literature.
Finally, we want to emphasize that the estimation of causal effects based on observational
data cannot replace randomized controlled experiments. Ideally, such predictions from
observational data are followed up by validation experiments. In this sense, such predictions
could help in the design of experiments, by prioritizing experiments that are likely to show
a large effect.
References
1. C.F. Aliferis, A. Statnikov, I. Tsamardinos, S. Mani, and X.D. Koutsoukos. Local
causal and Markov blanket induction for causal discovery and feature selection for
classification. Part I: Algorithms and empirical evaluation. J. Mach. Learn. Res. 11:
171–234, 2010.
2. C.F. Aliferis, A. Statnikov, I. Tsamardinos, S. Mani, and X.D. Koutsoukos. Local
causal and Markov blanket induction for causal discovery and feature selection for
classification. Part II: Analysis and extensions. J. Mach. Learn. Res. 11:235–284, 2010.
3. S.A. Andersson, D. Madigan, and M.D. Perlman. A characterization of Markov
equivalence classes for acyclic digraphs. Ann. Stat. 25:505–541, 1997.
4. J.D. Angrist, G.W. Imbens, and D.B. Rubin. Identification of causal effects using
instrumental variables. J. Am. Stat. Assoc. 91:444–455, 1996.
5. E. Bareinboim and J. Pearl. Transportability from multiple environments with limited
experiments: Completeness results. In Advances in Neural Information Processing
Systems 27 (NIPS 2014), pp. 280–288. Curran Associates, Inc., 2014.
6. K. Bollen. Structural Equations with Latent Variables. Wiley, New York, 1989.
7. K.H. Brodersen, F. Gallusser, J. Koehler, N. Remy, and S.L. Scott. Inferring causal
impact using Bayesian structural time-series models. Ann. Appl. Stat. 9:247–274, 2015.
8. D. Chicharro and S. Panzeri. Algorithms of causal inference for the analysis of effective
connectivity among brain regions. Front. Neuroinform. 8:64, 2014.
9. D.M. Chickering. Learning equivalence classes of Bayesian-network structures. J. Mach.
Learn. Res. 2:445–498, 2002.
10. D.M. Chickering. Optimal structure identification with greedy search. J. Mach. Learn.
Res. 3:507–554, 2003.
11. T. Chu and C. Glymour. Search for additive nonlinear time series causal models.
J. Mach. Learn. Res. 9:967–991, 2008.
12. T. Chu, C. Glymour, R. Scheines, and P. Spirtes. A statistical problem for inference
to regulatory structure from associations of gene expression measurements with
microarrays. Bioinformatics 19:1147–1152, 2003.
13. T. Claassen, J. Mooij, and T. Heskes. Learning sparse causal models is not NP-hard. In
Proceedings of the 29th Conference on Uncertainty in Artificial Intelligence (UAI-13),
pp. 172–181. AUAI Press, Corvallis, OR, 2013.
14. D. Colombo and M.H. Maathuis. Order-independent constraint-based causal structure
learning. J. Mach. Learn. Res. 15:3741–3782, 2014.
15. D. Colombo, M.H. Maathuis, M. Kalisch, and T.S. Richardson. Learning high-
dimensional directed acyclic graphs with latent and selection variables. Ann. Stat.
40:294–321, 2012.
16. I. Ebert-Uphoff and Y. Deng. Causal discovery for climate research using graphical
models. J. Climate 25:5648–5665, 2012.
17. I. Ebert-Uphoff and Y. Deng. Using causal discovery algorithms to learn about our
planet’s climate. In Machine Learning and Data Mining Approaches to Climate Science. Proceedings
of the 4th International Workshop on Climate Informatics, pp. 113–126. Springer, New
York, 2015.
18. D. Entner and P.O. Hoyer. On causal discovery from time series data using FCI. In
Proceedings of the 5th European Workshop on Probabilistic Graphical Models (PGM
2010), pp. 121–129. HIIT Publications, 2010.
19. N. Friedman, M. Linial, I. Nachman, and D. Pe’er. Using Bayesian networks to analyze
expression data. J. Comput. Biol. 7:601–620, 2000.
20. C. Hanson, S.J. Hanson, J. Ramsey, and C. Glymour. Atypical effective connectivity of
social brain networks in individuals with autism. Brain Connect. 3:578–589, 2013.
21. N. Harris and M. Drton. PC algorithm for nonparanormal graphical models. J. Mach.
Learn. Res. 14:3365–3383, 2013.
22. A. Hauser and P. Bühlmann. Characterization and greedy learning of interventional
Markov equivalence classes of directed acyclic graphs. J. Mach. Learn. Res. 13:
2409–2464, 2012.
23. M.Á. Hernán, B. Brumback, and J.M. Robins. Marginal structural models to estimate
the causal effect of zidovudine on the survival of HIV-positive men. Epidemiology
11:561–570, 2000.
24. P.O. Hoyer, D. Janzing, J. Mooij, J. Peters, and B. Schölkopf. Nonlinear causal discovery
with additive noise models. In Advances in Neural Information Processing Systems 21
(NIPS 2008), pp. 689–696. Curran Associates, Inc., 2008.
25. P.O. Hoyer, S. Shimizu, A.J. Kerminen, and M. Palviainen. Estimation of causal effects
using linear non-Gaussian causal models with hidden variables. Int. J. Approx. Reason.
49:362–378, 2008.
26. T.R. Hughes, M.J. Marton, A.R. Jones, C.J. Roberts, R. Stoughton, C.D. Armour,
H.A. Bennett et al. Functional discovery via a compendium of expression profiles. Cell
102:109–126, 2000.
27. A. Hyvärinen, K. Zhang, S. Shimizu, and P.O. Hoyer. Estimation of a structural vector
autoregression model using non-Gaussianity. J. Mach. Learn. Res. 11:1709–1731, 2010.
28. M. Kalisch and P. Bühlmann. Estimating high-dimensional directed acyclic graphs with
the PC-algorithm. J. Mach. Learn. Res. 8:613–636, 2007.
29. M. Kalisch, M. Mächler, D. Colombo, M.H. Maathuis, and P. Bühlmann. Causal
inference using graphical models with the R package pcalg. J. Stat. Softw. 47(11):1–26,
2012.
30. S. Ma, P. Kemmeren, D. Gresham, and A. Statnikov. De-novo learning of genome-scale
regulatory networks in S. cerevisiae. PLoS ONE 9:e106479, 2014.
31. M.H. Maathuis and D. Colombo. A generalized back-door criterion. Ann. Stat. 43:
1060–1088, 2015.
32. M.H. Maathuis, D. Colombo, M. Kalisch, and P. Bühlmann. Predicting causal effects
in large-scale systems from observational data. Nat. Methods 7:247–248, 2010.
33. M.H. Maathuis, M. Kalisch, and P. Bühlmann. Estimating high-dimensional interven-
tion effects from observational data. Ann. Stat. 37:3133–3164, 2009.
34. D. Marbach, T. Schaffter, C. Mattiussi, and D. Floreano. Generating realistic in silico
gene networks for performance assessment of reverse engineering methods. J. Comput.
Biol. 16:229–239, 2009.
35. N. Meinshausen, L. Meier, and P. Bühlmann. P-values for high-dimensional regression.
J. Am. Stat. Assoc. 104:1671–1681, 2009.
36. J.M. Mooij and T. Heskes. Cyclic causal discovery from continuous equilibrium data. In
Proceedings of the 29th Conference on Uncertainty in Artificial Intelligence (UAI-13),
pp. 431–439. AUAI Press, Corvallis, OR, 2013.
37. J.M. Mooij, D. Janzing, T. Heskes, and B. Schölkopf. On causal discovery with cyclic
additive noise models. In Advances in Neural Information Processing Systems 24 (NIPS
2011), pp. 639–647. Curran Associates, Inc., 2011.
38. P. Nandy, A. Hauser, and M.H. Maathuis. Understanding consistency in hybrid causal
structure learning. arXiv:1507.02608, 2015.
39. P. Nandy, M.H. Maathuis, and T.S. Richardson. Estimating the effect of joint inter-
ventions from observational data in sparse high-dimensional settings. arXiv:1407.2451,
2014.
40. R. Opgen-Rhein and K. Strimmer. From correlation to causation networks: A simple
approximate learning algorithm and its application to high-dimensional plant gene
expression data. BMC Syst. Biol. 1:37, 2007.
41. J. Pearl. Comment: Graphical models, causality and intervention. Stat. Sci. 8:266–269,
1993.
42. J. Pearl. Causal diagrams for empirical research. Biometrika 82:669–710, 1995. (With
discussion and a rejoinder by the author.)
43. J. Pearl. Causal inference in statistics: An overview. Stat. Surv. 3:96–146, 2009.
44. J. Pearl. Causality: Models, Reasoning and Inference, 2nd edition. Cambridge University
Press, Cambridge, 2009.
45. E. Perković, J. Textor, M. Kalisch, and M.H. Maathuis. A complete generalized
adjustment criterion. In Proceedings of the 31st Conference on Uncertainty in Artificial
Intelligence (UAI-15), pp. 682–691. AUAI Press, Corvallis, OR, 2015.
46. J. Peters and P. Bühlmann. Identifiability of Gaussian structural equation models with
equal error variances. Biometrika 101:219–228, 2014.
47. J. Peters, P. Bühlmann, and N. Meinshausen. Causal inference using invariant
prediction: Identification and confidence intervals. J. Roy. Stat. Soc. B, to appear, 2015.
48. J. Ramsey. A PC-style Markov blanket search for high dimensional datasets. Technical
Report CMU-PHIL-177, Carnegie Mellon University, Pittsburgh, PA, 2006.
49. J.D. Ramsey, S.J. Hanson, C. Hanson, Y.O. Halchenko, R.A. Poldrack, and C. Glymour.
Six problems for causal inference from fMRI. Neuroimage 49:1545–1558, 2010.
50. T.S. Richardson. A discovery algorithm for directed cyclic graphs. In Proceedings of
the 12th Conference on Uncertainty in Artificial Intelligence (UAI-96), pp. 454–461.
Morgan Kaufmann, San Francisco, CA, 1996.
51. T.S. Richardson and P. Spirtes. Ancestral graph Markov models. Ann. Stat. 30:
962–1030, 2002.
52. J.M. Robins. A new approach to causal inference in mortality studies with a sustained
exposure period-application to control of the healthy worker survivor effect. Math.
Model. 7:1393–1512, 1986.
53. J.M. Robins, M.Á. Hernán, and B. Brumback. Marginal structural models and causal
inference in epidemiology. Epidemiology 11:550–560, 2000.
54. S. Shimizu, P.O. Hoyer, A. Hyvärinen, and A. Kerminen. A linear non-Gaussian acyclic
model for causal discovery. J. Mach. Learn. Res. 7:2003–2030, 2006.
55. S. Shimizu, A. Hyvärinen, Y. Kawahara, and T. Washio. A direct method for estimating
a causal ordering in a linear non-Gaussian acyclic model. In Proceedings of the 25th
Conference on Uncertainty in Artificial Intelligence (UAI-09), pp. 506–513. AUAI
Press, Corvallis, OR, 2009.
56. I. Shpitser and J. Pearl. Identification of conditional interventional distributions. In
Proceedings of the 22nd Conference on Uncertainty in Artificial Intelligence (UAI-06),
pp. 437–444. AUAI Press, Arlington, VA, 2006.
57. I. Shpitser, T.J. VanderWeele, and J.M. Robins. On the validity of covariate adjustment
for estimating causal effects. In Proceedings of the 26th Conference on Uncertainty in
Artificial Intelligence (UAI-10), pp. 527–536. AUAI Press, Corvallis, OR, 2010.
58. S.M. Smith, K.L. Miller, G. Salimi-Khorshidi, M. Webster, C.F. Beckmann, T.E.
Nichols, J.D. Ramsey, and M.W. Woolrich. Network modelling methods for fMRI.
Neuroimage 54:875–891, 2011.
59. P. Spirtes, C. Glymour, and R. Scheines. Causation, Prediction, and Search. Springer-
Verlag, New York, 1993.