11. Networks (4/4)


186 Handbook of Big Data

with noise, so treating it as fixed and known (as most of the methods in Section 11.3 do) may be inappropriate. This makes the already difficult project of causal inference even more challenging. The naive approach to causal inference using incomplete network data would be to impute missing data in a first step and then to proceed with causal inference as if the data estimated in the first step were fixed and known. The primary downside of this procedure is that it does not incorporate the uncertainty from the network fitting into the uncertainty about the causal effects; a procedure that performs both tasks simultaneously is highly desirable.
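The gap between the naive two-step analysis and one that carries imputation uncertainty forward can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions, not the joint procedure the text calls for: it assumes an Erdős–Rényi network with dyads missing completely at random, a hypothetical "at least one treated neighbor" exposure, and it swaps in multiple imputation with Rubin's combining rules as a stand-in for a fully joint model. All variable and function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p_true, n_imputations = 200, 0.02, 50

# Work with one entry per dyad (upper triangle of the adjacency matrix).
iu = np.triu_indices(n, k=1)
n_dyads = iu[0].size
edges = rng.random(n_dyads) < p_true      # true edges (simulated, hypothetical data)
observed = rng.random(n_dyads) < 0.7      # dyads whose status was actually observed

def to_adjacency(edge_vec):
    """Build a symmetric 0/1 adjacency matrix from a dyad vector."""
    adj = np.zeros((n, n), dtype=int)
    adj[iu] = edge_vec
    return adj + adj.T

# Outcome depends on having at least one treated neighbor in the TRUE network.
treat = (rng.random(n) < 0.3).astype(int)
exposed_true = (to_adjacency(edges) @ treat) > 0
y = 1.0 * exposed_true + rng.normal(size=n)

def estimate(edge_vec):
    """Difference in means (exposed minus unexposed) and its variance estimate."""
    exposed = (to_adjacency(edge_vec) @ treat) > 0
    y1, y0 = y[exposed], y[~exposed]
    diff = y1.mean() - y0.mean()
    var = y1.var(ddof=1) / y1.size + y0.var(ddof=1) / y0.size
    return diff, var

def impute():
    """Draw p from its Beta(1, 1)-prior posterior, then fill in unobserved dyads."""
    k = edges[observed].sum()
    p_draw = rng.beta(1 + k, 1 + observed.sum() - k)
    filled = edges.copy()
    filled[~observed] = rng.random((~observed).sum()) < p_draw
    return filled

# Naive two-step analysis: impute once, then treat the network as known.
naive_estimate, naive_var = estimate(impute())

# Multiple imputation: repeat the imputation and combine with Rubin's rules.
draws = np.array([estimate(impute()) for _ in range(n_imputations)])
q, u = draws[:, 0], draws[:, 1]
within = u.mean()                          # average within-imputation variance
between = q.var(ddof=1)                    # variance of estimates across imputations
total = within + (1 + 1 / n_imputations) * between
```

The naive variance reflects only sampling noise in the outcomes given one filled-in network, whereas `total` adds the between-imputation component, which is the network uncertainty the single-imputation analysis leaves out.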

Lunagomez and Airoldi (2014) tackle the problem of jointly modeling the sampling mechanism for causal inference and the underlying network on which the data were collected. The network model selected in that paper is the simple Erdős–Rényi model, which depends on a single parameter p. Because the network is not fully observed under the sampling scheme they consider (respondent-driven sampling), the network model is chosen so that the missing network information can be marginalized out in a Bayesian framework. The use of a simple network model makes computation tractable, but the framework proposed by Lunagomez and Airoldi (2014) can in principle be extended to incorporate any network model.
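To see why a single-parameter model makes marginalization easy, consider a deliberately simplified case. The derivation below is a sketch under assumptions that are not taken from the source: dyads are independent Bernoulli(p), missingness is ignorable (which respondent-driven sampling generally is not), and p is given a Beta(α, β) prior. The symbols e, m, α, and β are introduced here for illustration only.

```latex
% Assumptions (not from the source): dyads A_{ij} are i.i.d. Bernoulli(p),
% missingness is ignorable, and p has a Beta(\alpha, \beta) prior.
% e = number of observed edges, m = number of observed dyads.
\begin{align*}
P(A^{\text{obs}} \mid p)
  &= \sum_{A^{\text{mis}}} P(A^{\text{obs}}, A^{\text{mis}} \mid p)
   = p^{e}\,(1-p)^{m-e}, \\
p \mid A^{\text{obs}}
  &\sim \text{Beta}(\alpha + e,\; \beta + m - e).
\end{align*}
```

Because the dyads are independent, the sum over the unobserved entries factors out and equals one, leaving a closed-form posterior. Under a non-ignorable design the sampling mechanism enters the likelihood as well, so the marginalization is no longer this simple.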

An alternative approach could be based on the proposal of Fosdick and Hoff (2013). While these authors do not discuss estimation of causal effects, their procedure for the joint modeling of network and nodal attributes can be adapted to a model-based causal analysis. In particular, the authors leverage the methods of Section 11.2.3 to first test for a relationship between nodal attributes $Y$ and the latent position vector $\mathrm{lat} = (a_{i\cdot})$ and, when evidence for such a relationship is found, to jointly model the vector $(Y, \mathrm{lat})$. Considering the nodal attributes as potential outcomes and writing $(Y(0), Y(1), \mathrm{lat})$, it should in principle be possible to jointly model the full data vector using the same Markov chain Monte Carlo procedure as in Fosdick and Hoff (2013).
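One way such an adaptation might look is a data-augmentation step that imputes the unobserved potential outcome inside the sampler. The sketch below is purely illustrative and rests on assumptions not stated in the source: a unit index i, a binary treatment indicator Z_i, no interference between units, and a multivariate normal joint model with hypothetical parameters μ and Σ.

```latex
% Illustrative assumptions (not from the source): unit index i, binary
% treatment Z_i, no interference, multivariate normal joint model.
% v_i = (Y_i^{obs}, lat_i); subscripts m and v pick out the blocks of
% (\mu, \Sigma) for the missing outcome and the conditioning variables.
\begin{align*}
(Y_i(0),\, Y_i(1),\, \mathrm{lat}_i) &\sim N(\mu, \Sigma), \\
Y_i^{\text{obs}} &= Z_i\, Y_i(1) + (1 - Z_i)\, Y_i(0), \\
Y_i^{\text{mis}} \mid v_i, \mu, \Sigma
  &\sim N\!\big(\mu_m + \Sigma_{mv}\Sigma_{vv}^{-1}(v_i - \mu_v),\;
               \Sigma_{mm} - \Sigma_{mv}\Sigma_{vv}^{-1}\Sigma_{vm}\big).
\end{align*}
```

Under these assumptions, each sweep of the sampler would alternate between drawing the missing potential outcomes from this conditional distribution and updating the latent positions and model parameters. Since Y_i(0) and Y_i(1) are never observed together, their covariance is not identified by the data and would have to be fixed or varied in a sensitivity analysis.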

Without fully observing the network, it is difficult to precisely define, let alone estimate, the causal effects discussed in Section 11.3. Jointly modeling network topology (to account for missing data or subsampling) and causal effects for observations sampled from network nodes is one of the most important and challenging areas for future research. The work of Lunagomez and Airoldi (2014) and Fosdick and Hoff (2013) points toward powerful and promising solutions, but much work remains to be done.

References

E.M. Airoldi, D.M. Blei, S.E. Fienberg, and E.P. Xing. Mixed membership stochastic blockmodels. The Journal of Machine Learning Research, 9:1981–2014, 2008.

E.M. Airoldi, T.B. Costa, and S.H. Chan. Stochastic blockmodel approximation of a graphon: Theory and consistent estimation. In Advances in Neural Information Processing Systems, pp. 692–700, 2013.

R. Albert and A.-L. Barabási. Statistical mechanics of complex networks. Reviews of Modern Physics, 74(1):47, 2002.

M.M. Ali and D.S. Dwyer. Estimating peer effects in adolescent smoking behavior: A longitudinal analysis. Journal of Adolescent Health, 45(4):402–408, 2009.


J.D. Angrist, G.W. Imbens, and D.B. Rubin. Identification of causal effects using instrumental variables. Journal of the American Statistical Association, 91(434):444–455, 1996.

J.D. Angrist and J.-S. Pischke. Mostly Harmless Econometrics: An Empiricist’s Companion. Princeton University Press, Princeton, NJ, 2008.

P.M. Aronow and C. Samii. Estimating average causal effects under general interference. Technical report, http://arxiv.org/abs/1305.6156, 2013.

A.-L. Barabási and R. Albert. Emergence of scaling in random networks. Science, 286(5439):509–512, 1999.

P.J. Bickel and P. Sarkar. Hypothesis testing for automated community detection in networks. arXiv preprint arXiv:1311.2694, 2013.

J. Blitzstein and P. Diaconis. A sequential importance sampling algorithm for generating random graphs with prescribed degrees. Internet Mathematics, 6(4):489–522, 2011.

J. Bowers, M.M. Fredrickson, and C. Panagopoulos. Reasoning about interference between units: A general framework. Political Analysis, 21(1):97–124, 2013.

J.T. Cacioppo, J.H. Fowler, and N.A. Christakis. Alone in the crowd: The structure and spread of loneliness in a large social network. Journal of Personality and Social Psychology, 97(6):977, 2009.

D.S. Choi. Estimation of monotone treatment effects in network experiments. arXiv preprint arXiv:1408.4102, 2014.

D.S. Choi, P.J. Wolfe, and E.M. Airoldi. Stochastic blockmodels with a growing number of classes. Biometrika, 99(2):273–284, 2012.

N.A. Christakis and J.H. Fowler. The spread of obesity in a large social network over 32 years. New England Journal of Medicine, 357(4):370–379, 2007.

N.A. Christakis and J.H. Fowler. The collective dynamics of smoking in a large social network. New England Journal of Medicine, 358(21):2249–2258, 2008.

N.A. Christakis and J.H. Fowler. Social network sensors for early detection of contagious outbreaks. PLoS One, 5(9):e12948, 2010.

N.A. Christakis and J.H. Fowler. Social contagion theory: Examining dynamic social networks and human behavior. Statistics in Medicine, 32(4):556–577, 2013.

A. Clauset, C.R. Shalizi, and M.E.J. Newman. Power-law distributions in empirical data. SIAM Review, 51(4):661–703, 2009.

E. Cohen-Cole and J.M. Fletcher. Is obesity contagious? Social networks vs. environmental factors in the obesity epidemic. Journal of Health Economics, 27(5):1382–1387, 2008.

R. Durrett. Random Graph Dynamics, vol. 200. Cambridge University Press, Cambridge, New York, 2007.

D. Eckles, B. Karrer, and J. Ugander. Design and analysis of experiments in networks: Reducing bias from interference. arXiv preprint arXiv:1404.7530, 2014.


P. Erdős and A. Rényi. On random graphs I. Publicationes Mathematicae Debrecen, 6:290–297, 1959.

L. Euler. Solutio problematis ad geometriam situs pertinentis. Commentarii Academiae Scientiarum Petropolitanae, 8:128–140, 1741.

R.A. Fisher. On the mathematical foundations of theoretical statistics. In Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character, pp. 309–368, 1922.

B.K. Fosdick and P.D. Hoff. Testing and modeling dependencies between a network and nodal attributes. arXiv preprint arXiv:1306.4708, 2013.

J.H. Fowler and N.A. Christakis. Estimating peer effects on health in social networks: A response to Cohen-Cole and Fletcher; Trogdon, Nonnemaker, Pais. Journal of Health Economics, 27(5):1400, 2008.

E.N. Gilbert. Random graphs. The Annals of Mathematical Statistics, 30:1141–1144, 1959.

A. Goldenberg, A.X. Zheng, S.E. Fienberg, and E.M. Airoldi. A survey of statistical network models. Foundations and Trends in Machine Learning, 2(2):129–233, 2010.

S. Greenland. An introduction to instrumental variables for epidemiologists. International Journal of Epidemiology, 29(4):722–729, 2000.

M.E. Halloran and C.J. Struchiner. Causal inference in infectious diseases. Epidemiology, 6(2):142–151, 1995.

M.A. Hernan. A definition of causal effect for epidemiological research. Journal of Epidemiology and Community Health, 58(4):265–271, 2004.

P. Hoff, B. Fosdick, A. Volfovsky, and K. Stovel. Likelihoods for fixed rank nomination networks. Network Science, 1(03):253–277, 2013.

P.D. Hoff. Bilinear mixed-effects models for dyadic data. Journal of the American Statistical Association, 100(469):286–295, 2005.

P.D. Hoff. Discussion of “Model-based clustering for social networks,” by Handcock, Raftery and Tantrum. Journal of the Royal Statistical Society, Series A, 170(2):339, 2007.

P.D. Hoff. Modeling homophily and stochastic equivalence in symmetric relational data. In J.C. Platt, D. Koller, Y. Singer, and S. Roweis (eds.), Advances in Neural Information Processing Systems 20, pp. 657–664. MIT Press, Cambridge, MA, 2008. http://cran.r-project.org/web/packages/eigenmodel/.

P.D. Hoff, A.E. Raftery, and M.S. Handcock. Latent space approaches to social network analysis. Journal of the American Statistical Association, 97(460):1090–1098, 2002.

P.W. Holland, K.B. Laskey, and S. Leinhardt. Stochastic blockmodels: First steps. Social Networks, 5(2):109–137, 1983.

B. Karrer and M.E.J. Newman. Stochastic blockmodels and community structure in networks. Physical Review E, 83(1):016107, 2011.

E.D. Kolaczyk. Statistical Analysis of Network Data. Springer, New York, 2009.

E.D. Kolaczyk and P.N. Krivitsky. On the question of effective sample size in network modeling. arXiv preprint arXiv:1112.0840, 2011.


D. Lazer, B. Rubineau, C. Chetkovich, N. Katz, and M. Neblo. The coevolution of networks and political attitudes. Political Communication, 27(3):248–274, 2010.

S. Lunagomez and E. Airoldi. Bayesian inference from non-ignorable network sampling designs. arXiv preprint arXiv:1401.4718, 2014.

R. Lyons. The spread of evidence-poor medicine via flawed social-network analysis. Statistics, Politics, and Policy, 2(1), 2011.

C.F. Manski. Identification of endogenous social effects: The reflection problem. The Review of Economic Studies, 60(3):531–542, 1993.

C.F. Manski. Identification of treatment response with social interactions. The Econometrics Journal, 16(1):S1–S23, 2013.

H. Noel and B. Nyhan. The unfriending problem: The consequences of homophily in friendship retention for causal estimates of social influence. Social Networks, 33(3):211–218, 2011.

K. Nowicki and T.A.B. Snijders. Estimation and prediction for stochastic blockstructures. Journal of the American Statistical Association, 96(455):1077–1087, 2001.

E.L. Ogburn and T.J. VanderWeele. Causal diagrams for interference. Statistical Science, 29(4):559–578, 2014a.

E.L. Ogburn and T.J. VanderWeele. Vaccines, contagion, and social networks. arXiv preprint arXiv:1403.1241, 2014b.

A.J. O’Malley, F. Elwert, J.N. Rosenquist, A.M. Zaslavsky, and N.A. Christakis. Estimating peer effects in longitudinal dyadic data using instrumental variables. Biometrics, 70(3):506–515, 2014.

J. Pearl. Causality: Models, Reasoning and Inference, vol. 29. Cambridge University Press, New York, 2000.

K. Rohe, S. Chatterjee, and B. Yu. Spectral clustering and the high-dimensional stochastic blockmodel. The Annals of Statistics, 39(4):1878–1915, 2011.

P.R. Rosenbaum. Interference between units in randomized experiments. Journal of the American Statistical Association, 102(477):191–200, 2007.

J.N. Rosenquist, J. Murabito, J.H. Fowler, and N.A. Christakis. The spread of alcohol consumption behavior in a large social network. Annals of Internal Medicine, 152(7):426–433, 2010.

D.B. Rubin. Causal inference using potential outcomes. Journal of the American Statistical Association, 100(469), 2005.

C.R. Shalizi. Comment on “why and when ‘flawed’ social network analyses still yield valid tests of no contagion.” Statistics, Politics, and Policy, 3(1):1–3, 2012.

C.R. Shalizi and A.C. Thomas. Homophily and contagion are generically confounded in observational social network studies. Sociological Methods & Research, 40(2):211–239, 2011.

E.A. Thompson and C.J. Geyer. Fuzzy p-values in latent variable problems. Biometrika, 94(1):49–60, 2007.


P. Toulis and E. Kao. Estimation of causal peer influence effects. In Proceedings of the 30th International Conference on Machine Learning, pp. 1489–1497, 2013.

J. Ugander, B. Karrer, L. Backstrom, and J. Kleinberg. Graph cluster randomization: Network exposure to multiple universes. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 329–337. ACM, 2013.

M.J. van der Laan. Causal inference for a population of causally connected units. Journal of Causal Inference, 2(1):13–74, 2014.

M.J. van der Laan, E.L. Ogburn, and I. Diaz. Causal inference for social networks. (forthcoming).

M.J. van der Laan and S. Rose. Targeted Learning: Causal Inference for Observational and Experimental Data. Springer, New York, 2011.

T.J. VanderWeele. Sensitivity analysis for contagion effects in social networks. Sociological Methods & Research, 40(2):240–255, 2011.

T.J. VanderWeele, E.L. Ogburn, and E.J. Tchetgen Tchetgen. Why and when “flawed” social network analyses still yield valid tests of no contagion. Statistics, Politics, and Policy, 3(1):1–11, 2012.

A. Volfovsky and E. Airoldi. Characterization of finite group invariant distributions. arXiv preprint arXiv:1407.6092, 2014.

A. Volfovsky and P.D. Hoff. Testing for nodal dependence in relational data matrices. Journal of the American Statistical Association, 2014.

Y.J. Wang and G.Y. Wong. Stochastic blockmodels for directed graphs. Journal of the American Statistical Association, 82(397):8–19, 1987.

R.M. Warner, D.A. Kenny, and M. Stoto. A new round robin analysis of variance for social interaction data. Journal of Personality and Social Psychology, 37(10):1742, 1979.

D.J. Watts and S.H. Strogatz. Collective dynamics of “small-world” networks. Nature, 393(6684):440–442, 1998.

J.J. Yang, Q. Han, and E.M. Airoldi. Nonparametric estimation and testing of exchangeable graph models. In Proceedings of the 17th International Conference on Artificial Intelligence and Statistics, pp. 1060–1067, 2014.
