learned from observational data. A prominent algorithm for this purpose is the fast causal
inference (FCI) algorithm, an adaptation of the PC algorithm [59–61,70]. Adaptations
of FCI that are computationally more efficient include RFCI and FCI+ [13,15]. High-
dimensional consistency of FCI and RFCI is shown in [15]. The order-dependence issues
studied in [14] (see Section 21.3.1) apply to all these algorithms, and order-independent
versions can be easily derived. The algorithms FCI, RFCI, and FCI+ are available in the
R-package pcalg [29]. There is also an adaptation of LiNGAM that allows for hidden
variables [25]. Causal structure learning methods that allow for feedback loops can be found
in [36,37,50].
21.5.3 Time Series Data
Time series data are well suited to causal inference, because temporal order carries
causal information: an effect cannot precede its cause. There are adaptations of the PC
and FCI algorithms for time series data [11,16,18]. These become computationally intensive
when several time lags are considered, because each variable is replicated once per lag.
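To illustrate why the number of lags matters, the following minimal sketch builds the kind of lag-expanded data matrix that such adaptations operate on. It is not taken from [11,16,18]; the function name `add_lagged_copies` and the simulated data are our own assumptions, chosen only for illustration.

```python
import numpy as np

def add_lagged_copies(series, max_lag):
    """Stack lagged copies of a (T, p) series side by side.

    Hypothetical helper for illustration. The result has shape
    (T - max_lag, p * (max_lag + 1)); column block k holds the
    variables at lag k, with k = 0 being the current time point.
    """
    series = np.asarray(series, dtype=float)
    n_time = series.shape[0]
    if not 0 <= max_lag < n_time:
        raise ValueError("max_lag must satisfy 0 <= max_lag < number of time points")
    # Block k is the series shifted back by k steps, trimmed to a common length.
    blocks = [series[max_lag - k : n_time - k] for k in range(max_lag + 1)]
    return np.hstack(blocks)

# Simulated example: 200 time points, 5 variables, 3 lags.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 5))
expanded = add_lagged_copies(x, max_lag=3)
print(expanded.shape)  # (197, 20)
```

With p variables and L lags, a structure learning algorithm is run on p(L + 1) columns, so the number of candidate edges and conditioning sets grows quickly with L.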
Another approach for discrete time series data consists of modeling the system as
a structural vector autoregressive model. One can then use a two-step approach, first
estimating the vector autoregressive model and its residuals, and then applying a causal
structure learning method to the residuals to learn the contemporaneous causal structure.
This approach is used, for example, in [27].
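The first step of this two-step approach can be sketched as follows. This is a minimal illustration under our own assumptions, not the implementation of [27]: the function name `fit_var_residuals`, the least-squares fit, the lag order, and the simulated coefficients are all hypothetical. The second step (a causal structure learning method applied to the residuals) is deliberately left out.

```python
import numpy as np

def fit_var_residuals(series, n_lags=1):
    """Fit a reduced-form VAR(n_lags) by least squares; return (coef, residuals).

    Hypothetical helper for illustration. `coef` has shape
    (1 + p * n_lags, p): an intercept row followed by one p x p block
    per lag, each block being the transpose of the lag matrix.
    """
    series = np.asarray(series, dtype=float)
    n_time = series.shape[0]
    # Regressors: intercept plus the n_lags previous observations.
    lagged = [series[n_lags - k : n_time - k] for k in range(1, n_lags + 1)]
    design = np.hstack([np.ones((n_time - n_lags, 1))] + lagged)
    target = series[n_lags:]
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    residuals = target - design @ coef
    return coef, residuals

# Simulate x_t = B0 x_t + B1 x_{t-1} + e_t with assumed coefficients.
rng = np.random.default_rng(1)
n_time, n_vars = 5000, 3
b0 = np.array([[0.0, 0.0, 0.0],   # contemporaneous effects: 0 -> 1 -> 2
               [0.8, 0.0, 0.0],
               [0.0, 0.5, 0.0]])
b1 = 0.4 * np.eye(n_vars)          # lagged effects
mix = np.linalg.inv(np.eye(n_vars) - b0)
x = np.zeros((n_time, n_vars))
for t in range(1, n_time):
    x[t] = mix @ (b1 @ x[t - 1] + rng.normal(size=n_vars))

# Step 1: estimate the VAR and keep its residuals.
coef, resid = fit_var_residuals(x, n_lags=1)
a_hat = coef[1:].T                 # estimated reduced-form lag matrix

# Step 2 (not shown): pass `resid` to a causal structure learning method.
resid_corr = np.corrcoef(resid, rowvar=False)
```

The residuals estimate (I - B0)^{-1} e_t, so any dependence that remains among them reflects the contemporaneous structure B0, which is what the second step tries to recover.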
Finally, an approach based on Bayesian time series models, applicable to large-scale
systems, was proposed in [7].
21.5.4 Causal Structure Learning from Heterogeneous Data
Several lines of work address causal structure learning from heterogeneous data. For example,
one can consider a mix of observational and various experimental datasets [22,47], or
different datasets with overlapping sets of variables [63,64], or a combination of both [67].
A related line of work is concerned with transportability of causal effects [5].
21.5.5 Covariate Adjustment
Given a DAG, a set of intervention variables X, and a set of target variables Y, Pearl’s
backdoor criterion is a sufficient graphical criterion for deciding whether a given set of
variables can be used for adjustment when computing the causal effect of X on Y. This result was
strengthened to a necessary and sufficient condition for DAGs in [56] and for MAGs in [68].
Pearl’s backdoor criterion was generalized to CPDAGs, MAGs and PAGs in [31], and the
necessary and sufficient condition of [56] was generalized to all these graph types in [45].
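For a single intervention variable and a single target variable in a DAG, the backdoor criterion can be checked mechanically. The sketch below is our own illustration, not code from the cited references: the function names, the edge-list representation, and the example graph are assumptions. It uses the standard formulation that Z satisfies the criterion if no node in Z is a descendant of X and Z d-separates X and Y once all edges out of X are removed; d-separation is tested via the moralized ancestral graph.

```python
def _closure(adjacency, start):
    """Return all nodes reachable from `start` via `adjacency` (excluding start nodes)."""
    seen, stack = set(), list(start)
    while stack:
        node = stack.pop()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def d_separated(edges, x, y, z):
    """Test whether z d-separates x and y in the DAG given by (parent, child) edges."""
    parents = {}
    for a, b in edges:
        parents.setdefault(b, set()).add(a)
    start = {x, y} | set(z)
    keep = start | _closure(parents, start)      # ancestral set
    # Moralize: link each node to its parents and marry the parents.
    nbrs = {v: set() for v in keep}
    for child in keep:
        pa = sorted(parents.get(child, ()))
        for i, p in enumerate(pa):
            nbrs[child].add(p)
            nbrs[p].add(child)
            for q in pa[i + 1:]:
                nbrs[p].add(q)
                nbrs[q].add(p)
    # x and y are d-separated iff z disconnects them in the moral graph.
    seen, stack = {x}, [x]
    while stack:
        node = stack.pop()
        if node == y:
            return False
        for nb in nbrs[node]:
            if nb not in seen and nb not in z:
                seen.add(nb)
                stack.append(nb)
    return True

def satisfies_backdoor(edges, x, y, z):
    """Check Pearl's backdoor criterion for single nodes x and y and a set z."""
    z = set(z)
    children = {}
    for a, b in edges:
        children.setdefault(a, set()).add(b)
    if z & _closure(children, [x]):              # z contains a descendant of x
        return False
    pruned = [(a, b) for a, b in edges if a != x]  # drop edges out of x
    return d_separated(pruned, x, y, z)

# Hypothetical example: Z confounds X and Y, M mediates, C is a collider.
edges = [("Z", "X"), ("Z", "Y"), ("X", "M"), ("M", "Y"), ("X", "C"), ("Y", "C")]
print(satisfies_backdoor(edges, "X", "Y", {"Z"}))       # True
print(satisfies_backdoor(edges, "X", "Y", set()))       # False
print(satisfies_backdoor(edges, "X", "Y", {"Z", "M"}))  # False
```

In this example graph, adjusting for the confounder Z is valid, the empty set leaves the backdoor path through Z open, and adding the mediator M violates the criterion because M is a descendant of X. The sketch covers only DAGs; the generalizations to CPDAGs, MAGs, and PAGs require additional conditions that are not implemented here.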
21.5.6 Measures of Uncertainty
The estimates of IDA come without a measure of uncertainty. (The regression estimates in
IDA do produce standard errors, but these assume that the estimated CPDAG is correct.
Hence, they underestimate the true uncertainty.) Asymptotically valid confidence intervals
could be obtained using sample splitting methods (cf. [35]), but their performance is not
satisfactory for small samples. Another approach that provides a measure of uncertainty
for the presence of direct effects is given in [47]. More work toward quantifying uncertainty
would be highly desirable.
21.6 Summary
In this chapter, we discussed the estimation of causal effects from observational data.
This problem is relevant in many fields of science, because understanding cause–effect
relationships is fundamental and randomized controlled experiments are not always possible.
The field has seen considerable recent progress. We have tried to give an overview of some of
the theory behind selected methods, as well as some pointers to further literature.
Finally, we want to emphasize that the estimation of causal effects based on observational
data cannot replace randomized controlled experiments. Ideally, predictions from
observational data are followed up by validation experiments. In this sense, such predictions
can help in the design of experiments, by prioritizing those that are likely to show
a large effect.
