with noise, so treating it as fixed and known (as most of the methods in Section 11.3 do)
may be inappropriate. This makes the already-difficult project of causal inference even more
challenging. The naive approach to causal inference using incomplete network data would
be to impute missing data in a first step and then to proceed with causal inference as if
the data estimated in the first step were fixed and known. The primary downside of this
procedure is that it does not incorporate the uncertainty from the network fitting into the
uncertainty about the causal effects; a procedure that performs both tasks simultaneously
is highly desirable.
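The gap between the two-step procedure and one that carries network uncertainty forward can be illustrated with a small simulation. The sketch below is a toy under stated assumptions, not the method of any work cited here: dyads are assumed to be missing completely at random, the network is assumed to follow an Erdős–Rényi model with a Beta(1, 1) prior on the edge probability, and the estimand (`exposure_contrast`, a hypothetical name) is a simple difference in mean outcomes between nodes with and without a treated neighbor. It is also not a joint model, since the outcomes do not inform the imputed edges; it only shows how much the estimate moves across plausible completions of the network.

```python
import random
import statistics


def exposure_contrast(adj, z, y):
    """Mean outcome of nodes with a treated neighbor minus mean outcome of the rest."""
    n = len(z)
    exposed, unexposed = [], []
    for i in range(n):
        has_treated_nb = any(adj[i][j] and z[j] for j in range(n) if j != i)
        (exposed if has_treated_nb else unexposed).append(y[i])
    if not exposed or not unexposed:
        return None  # contrast undefined when one group is empty
    return statistics.mean(exposed) - statistics.mean(unexposed)


def impute(adj_obs, rng):
    """Draw one completed network: p from its Beta posterior, then missing dyads."""
    n = len(adj_obs)
    dyads = [(i, j) for i in range(n) for j in range(i + 1, n)]
    known = [adj_obs[i][j] for i, j in dyads if adj_obs[i][j] is not None]
    edges = sum(known)
    # Beta(1, 1) prior on the edge probability (an assumption of this sketch)
    p = rng.betavariate(1 + edges, 1 + len(known) - edges)
    adj = [row[:] for row in adj_obs]
    for i, j in dyads:
        if adj[i][j] is None:
            adj[i][j] = adj[j][i] = int(rng.random() < p)
    return adj


# Simulated data: all numbers below are illustrative choices.
rng = random.Random(0)
n = 40
z = [rng.randint(0, 1) for _ in range(n)]
true_adj = [[0] * n for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        true_adj[i][j] = true_adj[j][i] = int(rng.random() < 0.05)
# Outcome shifts by 1 when a node has at least one treated neighbor.
y = [1.0 * any(true_adj[i][j] and z[j] for j in range(n)) + rng.gauss(0, 1)
     for i in range(n)]

# Hide roughly 30% of dyads (None marks an unobserved dyad).
adj_obs = [row[:] for row in true_adj]
for i in range(n):
    for j in range(i + 1, n):
        if rng.random() < 0.3:
            adj_obs[i][j] = adj_obs[j][i] = None

# Naive two-step: impute once, then treat the completed network as known.
naive = exposure_contrast(impute(adj_obs, rng), z, y)

# Propagating uncertainty: repeat the imputation and track the estimate.
draws = [exposure_contrast(impute(adj_obs, rng), z, y) for _ in range(200)]
draws = [d for d in draws if d is not None]
spread = statistics.stdev(draws)

print("single-imputation estimate:", naive)
print("std. dev. of estimate across imputations:", spread)
```

The standard deviation across imputations is variability that the two-step analysis silently discards when it reports a standard error conditional on a single completed network.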
Lunagomez and Airoldi (2014) tackle the problem of jointly modeling the sampling
mechanism for causal inference and the underlying network on which the data were
collected. The network model selected in that work is the simple Erdős–Rényi
model, which depends on a single parameter p. Since the network is not fully observed under
the sampling scheme discussed there (respondent-driven sampling), the network
model is chosen to accommodate marginalizing out the missing network information in a
Bayesian framework. The use of a simple network model makes computation tractable, but
the framework proposed by Lunagomez and Airoldi (2014) can in principle be extended to
incorporate any network model.
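To see why a single-parameter model makes the marginalization tractable, consider the following simplified derivation. It is a sketch under assumptions that are not taken from Lunagomez and Airoldi (2014): the graph is undirected, the missingness is treated as ignorable (which respondent-driven sampling generally is not), and the prior on p is Beta(a, b). The symbols M (number of observed dyads) and m (number of observed edges) are introduced here for illustration.

```latex
% Dyads are independent Bernoulli(p) under the Erdos-Renyi model, so the
% unobserved dyads A_mis sum out of the likelihood term by term:
\begin{align*}
  \Pr(A_{\mathrm{obs}} \mid p)
    &= \sum_{A_{\mathrm{mis}}} \Pr(A_{\mathrm{obs}}, A_{\mathrm{mis}} \mid p) \\
    &= p^{m} (1-p)^{M-m}
       \prod_{(i,j) \in \mathrm{mis}} \; \sum_{A_{ij} \in \{0,1\}}
       p^{A_{ij}} (1-p)^{1-A_{ij}} \\
    &= p^{m} (1-p)^{M-m}.
\end{align*}
% With the conjugate prior p ~ Beta(a, b), the posterior and the
% predictive probability of an edge at a missing dyad are
\begin{align*}
  p \mid A_{\mathrm{obs}} &\sim \mathrm{Beta}(a + m,\; b + M - m), \\
  \Pr(A_{ij} = 1 \mid A_{\mathrm{obs}}) &= \frac{a + m}{a + b + M}
    \qquad \text{for } (i,j) \in \mathrm{mis}.
\end{align*}
```

Under a richer network model the dyads are no longer independent given the parameters, so the sum over missing dyads does not factor in this way and typically has to be approximated, for example by sampling.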
An alternative approach could be based on the proposal of Fosdick and Hoff (2013).
While these authors do not discuss estimation of causal effects, their procedure for the joint
modeling of network and nodal attributes can be adapted to a model-based causal analysis.
In particular, the authors use the approach of Section 11.2.3 to first test for a relationship
between nodal attributes Y and the latent position vector lat and, when evidence
for such a relationship is found, to jointly model the vector (Y, lat). Considering the nodal
attributes as potential outcomes and writing (Y(1), lat), it should in principle be
possible to jointly model the full data vector using the same Markov chain Monte Carlo
procedure as in Fosdick and Hoff (2013).
Without fully observing the network it is difficult to precisely define, let alone to
estimate, the causal effects discussed in Section 11.3. Jointly modeling network topology (to
account for missing data or subsampling) and causal effects for observations sampled from
network nodes is one of the most important and challenging areas for future research. The
work of Lunagomez and Airoldi (2014) and Fosdick and Hoff (2013) points toward powerful
and promising solutions, but much work remains to be done.
