402 Handbook of Big Data
learned from observational data. A prominent algorithm for this purpose is the fast causal
inference (FCI) algorithm, an adaptation of the PC algorithm [59–61,70]. Adaptations
of FCI that are computationally more efficient include RFCI and FCI+ [13,15]. High-
dimensional consistency of FCI and RFCI is shown in [15]. The order-dependence issues
studied in [14] (see Section 21.3.1) apply to all these algorithms, and order-independent
versions can be easily derived. The algorithms FCI, RFCI, and FCI+ are available in the
R-package pcalg [29]. There is also an adaptation of LiNGAM that allows for hidden
variables [25]. Causal structure learning methods that allow for feedback loops can be found
in [36,37,50].
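The constraint-based idea underlying PC and FCI can be illustrated with a minimal skeleton search: start from a complete undirected graph and delete the edge between any pair of variables that tests as conditionally independent given some subset of the remaining neighbors. The following Python sketch is our own illustration, not the pcalg implementation; it assumes Gaussian data so that Fisher-z partial-correlation tests apply, and the toy chain is simulated:

```python
import itertools
import numpy as np
from scipy import stats

def gauss_ci_test(data, i, j, cond, alpha):
    """Fisher-z partial-correlation test of X_i independent of X_j given X_cond."""
    sub = np.corrcoef(data[:, [i, j] + list(cond)], rowvar=False)
    prec = np.linalg.inv(sub)                 # precision matrix of the submatrix
    r = -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])
    r = np.clip(r, -0.999999, 0.999999)
    z = 0.5 * np.log((1 + r) / (1 - r)) * np.sqrt(data.shape[0] - len(cond) - 3)
    pval = 2 * (1 - stats.norm.cdf(abs(z)))
    return pval > alpha                       # True means "looks independent"

def pc_skeleton(data, alpha=0.001):
    """PC-style skeleton search: start complete, delete edges that test independent."""
    p = data.shape[1]
    adj = {i: set(range(p)) - {i} for i in range(p)}
    level = 0                                 # size of the conditioning sets
    while any(len(adj[i]) - 1 >= level for i in range(p)):
        for i in range(p):
            for j in list(adj[i]):
                others = adj[i] - {j}
                if len(others) < level:
                    continue
                for cond in itertools.combinations(sorted(others), level):
                    if gauss_ci_test(data, i, j, cond, alpha):
                        adj[i].discard(j)
                        adj[j].discard(i)
                        break
        level += 1
    return adj

# Toy chain X0 -> X1 -> X2: the skeleton should keep 0-1 and 1-2 but drop 0-2,
# since X0 and X2 are independent given X1.
rng = np.random.default_rng(0)
x0 = rng.normal(size=5000)
x1 = 2 * x0 + rng.normal(size=5000)
x2 = 2 * x1 + rng.normal(size=5000)
data = np.column_stack([x0, x1, x2])
skel = pc_skeleton(data)
print(skel)
```

FCI and its variants elaborate on this scheme to remain correct when hidden variables are present; in practice one would use the implementations in pcalg rather than a sketch like this.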
21.5.3 Time Series Data
Time series data are well suited to causal inference, because the time order of the
observations carries important causal information. There are adaptations of the PC and FCI algorithms for
time series data [11,16,18]. These are computationally intensive when considering several
time lags, because they replicate variables for the different time lags.
Another approach for discrete time series data consists of modeling the system as
a structural vector autoregressive model. One can then use a two-step approach, first
estimating the vector autoregressive model and its residuals, and then applying a causal
structure learning method to the residuals to learn the contemporaneous causal structure.
This approach is for example used in [27].
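The two-step idea can be sketched as follows: fit the lagged part of the vector autoregressive model by least squares, then hand the residuals to a causal structure learning method to recover the contemporaneous structure. The Python sketch below uses a toy model of our own; in step 2 we only display the residual correlation that such a method (e.g., LiNGAM as in [27]) would analyze:

```python
import numpy as np

def fit_var1(x):
    """Step 1: OLS fit of x_t = A x_{t-1} + e_t; returns A and the residuals e_t."""
    past, present = x[:-1], x[1:]
    B, *_ = np.linalg.lstsq(past, present, rcond=None)  # present ~ past @ B
    return B.T, present - past @ B

# Toy 2-variable VAR(1) whose noise has a contemporaneous effect e0 -> e1.
rng = np.random.default_rng(1)
T = 20000
A_true = np.array([[0.5, 0.0], [0.2, 0.3]])
x = np.zeros((T, 2))
for t in range(1, T):
    e0 = rng.normal()
    e1 = 0.8 * e0 + rng.normal()      # contemporaneous structure lives in the noise
    x[t] = A_true @ x[t - 1] + np.array([e0, e1])

A_hat, resid = fit_var1(x)
print(np.round(A_hat, 2))             # step 1 recovers the lagged coefficients
# Step 2 would now apply a causal structure learning method to the residuals;
# their strong correlation reflects the contemporaneous link e0 -> e1.
print(np.round(np.corrcoef(resid.T)[0, 1], 2))
```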
Finally, an approach based on Bayesian time series models, applicable to large-scale
systems, was proposed in [7].
21.5.4 Causal Structure Learning from Heterogeneous Data
There is interesting work on causal structure learning from heterogeneous data. For example,
one can consider a mix of observational and various experimental datasets [22,47], or
different datasets with overlapping sets of variables [63,64], or a combination of both [67].
A related line of work is concerned with transportability of causal effects [5].
21.5.5 Covariate Adjustment
Given a DAG, a set of intervention variables X, and a set of target variables Y, Pearl’s
backdoor criterion is a sufficient graphical criterion for determining whether a given set of
variables can be used for adjustment when computing the effect of X on Y. This result was
strengthened to a necessary and sufficient condition for DAGs in [56] and for MAGs in [68].
Pearl’s backdoor criterion was generalized to CPDAGs, MAGs and PAGs in [31], and the
necessary and sufficient condition of [56] was generalized to all these graph types in [45].
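In the linear Gaussian case, the backdoor criterion has a simple regression interpretation: if a set Z satisfies the criterion for (X, Y), then the coefficient of X in the regression of Y on X and Z equals the total causal effect of X on Y. A small Python sketch on a toy confounded model (the variable names and coefficients are illustrative):

```python
import numpy as np

# Toy confounded linear model: Z -> X, Z -> Y, and X -> Y with causal effect 1.0.
# The set {Z} blocks the backdoor path X <- Z -> Y.
rng = np.random.default_rng(2)
n = 100_000
z = rng.normal(size=n)
x = 0.8 * z + rng.normal(size=n)
y = 1.0 * x + 1.5 * z + rng.normal(size=n)

def ols(y, *cols):
    """OLS coefficients of y on the given columns (an intercept is added)."""
    X = np.column_stack([np.ones_like(y)] + list(cols))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1:]                   # drop the intercept

biased = ols(y, x)[0]                 # no adjustment: confounded by Z
adjusted = ols(y, x, z)[0]            # adjusting for the backdoor set {Z}
print(biased, adjusted)               # the first is biased upward; the second
                                      # recovers the causal effect 1.0
```

The graphical criteria discussed above answer the harder question of which sets Z are valid when the graph is a CPDAG, MAG, or PAG rather than a known DAG.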
21.5.6 Measures of Uncertainty
The estimates of IDA come without a measure of uncertainty. (The regressions in
IDA do produce standard errors, but these assume that the estimated CPDAG is correct;
hence, they underestimate the true uncertainty.) Asymptotically valid confidence intervals
could be obtained using sample splitting methods (cf. [35]), but their performance is not
satisfactory for small samples. Another approach that provides a measure of uncertainty
for the presence of direct effects is given in [47]. More work toward quantifying uncertainty
would be highly desirable.
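The sample-splitting idea can be sketched in a deliberately simplified regression setting: select the adjustment model on one half of the data and compute confidence intervals on the other half, so that the standard errors do not condition on a model chosen from the same data. The Python illustration below uses a crude screening rule and toy data of our own, far simpler than the CPDAG estimation underlying IDA:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4000
z = rng.normal(size=(n, 5))                       # candidate covariates
x = 0.9 * z[:, 0] + rng.normal(size=n)            # only z_0 affects x ...
y = 1.0 * x + 1.2 * z[:, 0] + rng.normal(size=n)  # ... and y, so z_0 confounds

half = n // 2
a, b = slice(0, half), slice(half, None)          # selection half / inference half

# Step 1 (selection half): keep covariates clearly correlated with both x and y.
keep = [j for j in range(5)
        if abs(np.corrcoef(z[a, j], x[a])[0, 1]) > 0.1
        and abs(np.corrcoef(z[a, j], y[a])[0, 1]) > 0.1]

# Step 2 (inference half): OLS of y on x and the selected covariates; the
# standard error is valid because the selection used independent data.
X = np.column_stack([np.ones(n - half), x[b], z[b][:, keep]])
beta, *_ = np.linalg.lstsq(X, y[b], rcond=None)
resid = y[b] - X @ beta
sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
ci = (beta[1] - 1.96 * se, beta[1] + 1.96 * se)
print(keep, beta[1], ci)
```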
A Review of Some Recent Advances in Causal Inference 403
21.6 Summary
In this chapter, we discussed the estimation of causal effects from observational data.
This problem is relevant in many fields of science, because understanding cause–effect
relationships is fundamental and randomized controlled experiments are not always possible.
There has been a lot of recent progress in this field. We have tried to give an overview of some of
the theory behind selected methods, as well as some pointers to further literature.
Finally, we want to emphasize that the estimation of causal effects based on observational
data cannot replace randomized controlled experiments. Ideally, such predictions from
observational data are followed up by validation experiments. In this sense, such predictions
could help in the design of experiments, by prioritizing experiments that are likely to show
a large effect.
References
1. C.F. Aliferis, A. Statnikov, I. Tsamardinos, S. Mani, and X.D. Koutsoukos. Local
causal and Markov blanket induction for causal discovery and feature selection for
classification. Part I: Algorithms and empirical evaluation. J. Mach. Learn. Res. 11:
171–234, 2010.
2. C.F. Aliferis, A. Statnikov, I. Tsamardinos, S. Mani, and X.D. Koutsoukos. Local
causal and Markov blanket induction for causal discovery and feature selection for
classification. Part II: Analysis and extensions. J. Mach. Learn. Res. 11:235–284, 2010.
3. S.A. Andersson, D. Madigan, and M.D. Perlman. A characterization of Markov
equivalence classes for acyclic digraphs. Ann. Stat. 25:505–541, 1997.
4. J.D. Angrist, G.W. Imbens, and D.B. Rubin. Identification of causal effects using
instrumental variables. J. Am. Stat. Assoc. 91:444–455, 1996.
5. E. Bareinboim and J. Pearl. Transportability from multiple environments with limited
experiments: Completeness results. In Advances in Neural Information Processing
Systems 27 (NIPS 2014), pp. 280–288. Curran Associates, Inc., 2014.
6. K. Bollen. Structural Equations with Latent Variables. Wiley, New York, 1989.
7. K.H. Brodersen, F. Gallusser, J. Koehler, N. Remy, and S.L. Scott. Inferring causal
impact using Bayesian structural time-series models. Ann. Appl. Stat. 9:247–274, 2015.
8. D. Chicharro and S. Panzeri. Algorithms of causal inference for the analysis of effective
connectivity among brain regions. Front. Neuroinform. 8:64, 2014.
9. D.M. Chickering. Learning equivalence classes of Bayesian-network structures. J. Mach.
Learn. Res. 2:445–498, 2002.
10. D.M. Chickering. Optimal structure identification with greedy search. J. Mach. Learn.
Res. 3:507–554, 2003.
11. T. Chu and C. Glymour. Search for additive nonlinear time series causal models.
J. Mach. Learn. Res. 9:967–991, 2008.
12. T. Chu, C. Glymour, R. Scheines, and P. Spirtes. A statistical problem for inference
to regulatory structure from associations of gene expression measurements with
microarrays. Bioinformatics 19:1147–1152, 2003.
13. T. Claassen, J. Mooij, and T. Heskes. Learning sparse causal models is not NP-hard. In
Proceedings of the 29th Conference on Uncertainty in Artificial Intelligence (UAI-13),
pp. 172–181. AUAI Press, Corvallis, OR, 2013.
14. D. Colombo and M.H. Maathuis. Order-independent constraint-based causal structure
learning. J. Mach. Learn. Res. 15:3741–3782, 2014.
15. D. Colombo, M.H. Maathuis, M. Kalisch, and T.S. Richardson. Learning high-
dimensional directed acyclic graphs with latent and selection variables. Ann. Stat.
40:294–321, 2012.
16. I. Ebert-Uphoff and Y. Deng. Causal discovery for climate research using graphical
models. J. Climate 25:5648–5665, 2012.
17. I. Ebert-Uphoff and Y. Deng. Using causal discovery algorithms to learn about our
planet’s climate. In Machine Learning and Data Mining Approaches to Climate Science. Proceedings
of the 4th International Workshop on Climate Informatics, pp. 113–126. Springer, New
York, 2015.
18. D. Entner and P.O. Hoyer. On causal discovery from time series data using FCI. In
Proceedings of the 5th European Workshop on Probabilistic Graphical Models (PGM
2010), pp. 121–129. HIIT Publications, 2010.
19. N. Friedman, M. Linial, I. Nachman, and D. Pe’er. Using Bayesian networks to analyze
expression data. J. Comput. Biol. 7:601–620, 2000.
20. C. Hanson, S.J. Hanson, J. Ramsey, and C. Glymour. Atypical effective connectivity of
social brain networks in individuals with autism. Brain Connect. 3:578–589, 2013.
21. N. Harris and M. Drton. PC algorithm for nonparanormal graphical models. J. Mach.
Learn. Res. 14:3365–3383, 2013.
22. A. Hauser and P. Bühlmann. Characterization and greedy learning of interventional
Markov equivalence classes of directed acyclic graphs. J. Mach. Learn. Res. 13:
2409–2464, 2012.
23. M.Á. Hernán, B. Brumback, and J.M. Robins. Marginal structural models to estimate
the causal effect of zidovudine on the survival of HIV-positive men. Epidemiology
11:561–570, 2000.
24. P.O. Hoyer, D. Janzing, J. Mooij, J. Peters, and B. Schölkopf. Nonlinear causal discovery
with additive noise models. In Advances in Neural Information Processing Systems 21
(NIPS 2008), pp. 689–696. Curran Associates, Inc., 2008.
25. P.O. Hoyer, S. Shimizu, A.J. Kerminen, and M. Palviainen. Estimation of causal effects
using linear non-Gaussian causal models with hidden variables. Int. J. Approx. Reason.
49:362–378, 2008.
26. T.R. Hughes, M.J. Marton, A.R. Jones, C.J. Roberts, R. Stoughton, C.D. Armour,
H.A. Bennett et al. Functional discovery via a compendium of expression profiles. Cell
102:109–126, 2000.
27. A. Hyvärinen, K. Zhang, S. Shimizu, and P.O. Hoyer. Estimation of a structural vector
autoregression model using non-Gaussianity. J. Mach. Learn. Res. 11:1709–1731, 2010.
28. M. Kalisch and P. Bühlmann. Estimating high-dimensional directed acyclic graphs with
the PC-algorithm. J. Mach. Learn. Res. 8:613–636, 2007.
29. M. Kalisch, M. Mächler, D. Colombo, M.H. Maathuis, and P. Bühlmann. Causal
inference using graphical models with the R package pcalg. J. Stat. Softw. 47(11):1–26,
2012.
30. S. Ma, P. Kemmeren, D. Gresham, and A. Statnikov. De-novo learning of genome-scale
regulatory networks in S. cerevisiae. PLoS ONE 9:e106479, 2014.
31. M.H. Maathuis and D. Colombo. A generalized back-door criterion. Ann. Stat. 43:
1060–1088, 2015.
32. M.H. Maathuis, D. Colombo, M. Kalisch, and P. Bühlmann. Predicting causal effects
in large-scale systems from observational data. Nat. Methods 7:247–248, 2010.
33. M.H. Maathuis, M. Kalisch, and P. Bühlmann. Estimating high-dimensional interven-
tion effects from observational data. Ann. Stat. 37:3133–3164, 2009.
34. D. Marbach, T. Schaffter, C. Mattiussi, and D. Floreano. Generating realistic in silico
gene networks for performance assessment of reverse engineering methods. J. Comput.
Biol. 16:229–239, 2009.
35. N. Meinshausen, L. Meier, and P. Bühlmann. P-values for high-dimensional regression.
J. Am. Stat. Assoc. 104:1671–1681, 2009.
36. J.M. Mooij and T. Heskes. Cyclic causal discovery from continuous equilibrium data. In
Proceedings of the 29th Conference on Uncertainty in Artificial Intelligence (UAI-13),
pp. 431–439. AUAI Press, Corvallis, OR, 2013.
37. J.M. Mooij, D. Janzing, T. Heskes, and B. Schölkopf. On causal discovery with cyclic
additive noise models. In Advances in Neural Information Processing Systems 24 (NIPS
2011), pp. 639–647. Curran Associates, Inc., 2011.
38. P. Nandy, A. Hauser, and M.H. Maathuis. Understanding consistency in hybrid causal
structure learning. arXiv:1507.02608, 2015.
39. P. Nandy, M.H. Maathuis, and T.S. Richardson. Estimating the effect of joint inter-
ventions from observational data in sparse high-dimensional settings. arXiv:1407.2451,
2014.
40. R. Opgen-Rhein and K. Strimmer. From correlation to causation networks: A simple
approximate learning algorithm and its application to high-dimensional plant gene
expression data. BMC Syst. Biol. 1:37, 2007.
41. J. Pearl. Comment: Graphical models, causality and intervention. Stat. Sci. 8:266–269,
1993.
42. J. Pearl. Causal diagrams for empirical research. Biometrika 82:669–710, 1995. (With
discussion and a rejoinder by the author.)
43. J. Pearl. Causal inference in statistics: An overview. Stat. Surv. 3:96–146, 2009.
44. J. Pearl. Causality: Models, Reasoning and Inference, 2nd edition. Cambridge University
Press, Cambridge, 2009.
45. E. Perković, J. Textor, M. Kalisch, and M.H. Maathuis. A complete generalized
adjustment criterion. In Proceedings of the 31st Conference on Uncertainty in Artificial
Intelligence (UAI-15), pp. 682–691. AUAI Press, Corvallis, OR, 2015.
46. J. Peters and P. Bühlmann. Identifiability of Gaussian structural equation models with
equal error variances. Biometrika 101:219–228, 2014.
47. J. Peters, P. Bühlmann, and N. Meinshausen. Causal inference using invariant
prediction: Identification and confidence intervals. J. Roy. Stat. Soc. B, to appear, 2015.
48. J. Ramsey. A PC-style Markov blanket search for high dimensional datasets. Technical
Report CMU-PHIL-177, Carnegie Mellon University, Pittsburgh, PA, 2006.
49. J.D. Ramsey, S.J. Hanson, C. Hanson, Y.O. Halchenko, R.A. Poldrack, and C. Glymour.
Six problems for causal inference from fMRI. Neuroimage 49:1545–1558, 2010.
50. T.S. Richardson. A discovery algorithm for directed cyclic graphs. In Proceedings of
the 12th Conference on Uncertainty in Artificial Intelligence (UAI-96), pp. 454–461.
Morgan Kaufmann, San Francisco, CA, 1996.
51. T.S. Richardson and P. Spirtes. Ancestral graph Markov models. Ann. Stat. 30:
962–1030, 2002.
52. J.M. Robins. A new approach to causal inference in mortality studies with a sustained
exposure period-application to control of the healthy worker survivor effect. Math.
Model. 7:1393–1512, 1986.
53. J.M. Robins, M.Á. Hernán, and B. Brumback. Marginal structural models and causal
inference in epidemiology. Epidemiology 11:550–560, 2000.
54. S. Shimizu, P.O. Hoyer, A. Hyvärinen, and A. Kerminen. A linear non-Gaussian acyclic
model for causal discovery. J. Mach. Learn. Res. 7:2003–2030, 2006.
55. S. Shimizu, A. Hyvärinen, Y. Kawahara, and T. Washio. A direct method for estimating
a causal ordering in a linear non-Gaussian acyclic model. In Proceedings of the 25th
Conference on Uncertainty in Artificial Intelligence (UAI-09), pp. 506–513. AUAI
Press, Corvallis, OR, 2009.
56. I. Shpitser and J. Pearl. Identification of conditional interventional distributions. In
Proceedings of the 22nd Conference on Uncertainty in Artificial Intelligence (UAI-06),
pp. 437–444. AUAI Press, Arlington, VA, 2006.
57. I. Shpitser, T.J. VanderWeele, and J.M. Robins. On the validity of covariate adjustment
for estimating causal effects. In Proceedings of the 26th Conference on Uncertainty in
Artificial Intelligence (UAI-10), pp. 527–536. AUAI Press, Corvallis, OR, 2010.
58. S.M. Smith, K.L. Miller, G. Salimi-Khorshidi, M. Webster, C.F. Beckmann, T.E.
Nichols, J.D. Ramsey, and M.W. Woolrich. Network modelling methods for fMRI.
Neuroimage 54:875–891, 2011.
59. P. Spirtes, C. Glymour, and R. Scheines. Causation, Prediction, and Search. Springer-
Verlag, New York, 1993.