Low-order outcomes and clustered designs: combining design and analysis for causal inference under network interference

Bibliographic details
Title: Low-order outcomes and clustered designs: combining design and analysis for causal inference under network interference
Authors: Eichhorn, Matthew; Khan, Samir; Ugander, Johan; Yu, Christina Lee
Publication year: 2024
Collection: Mathematics; Statistics
Subject terms: Statistics - Methodology, Mathematics - Statistics Theory
Description: Variance reduction for causal inference in the presence of network interference is often achieved through either outcome modeling, which is typically analyzed under unit-randomized Bernoulli designs, or clustered experimental designs, which are typically analyzed without strong parametric assumptions. In this work, we study the intersection of these two approaches and consider the problem of estimation in low-order outcome models using data from a general experimental design. Our contributions are threefold. First, we present an estimator of the total treatment effect (also called the global average treatment effect) in a low-degree outcome model when the data are collected under general experimental designs, generalizing previous results for Bernoulli designs. We refer to this estimator as the pseudoinverse estimator and give bounds on its bias and variance in terms of properties of the experimental design. Second, we evaluate these bounds for the case of cluster randomized designs with both Bernoulli and complete randomization. For clustered Bernoulli randomization, we find that our estimator is always unbiased and that its variance scales like the smaller of the variance obtained from a low-order assumption and the variance obtained from cluster randomization, showing that combining these variance reduction strategies is preferable to using either individually. For clustered complete randomization, we find a notable bias-variance trade-off mediated by specific features of the clustering. Third, when choosing a clustered experimental design, our bounds can be used to select a clustering from a set of candidate clusterings. Across a range of graphs and clustering algorithms, we show that our method consistently selects clusterings that perform well on a range of response models, suggesting that our bounds are useful to practitioners.
Document type: Working Paper
Access URL: http://arxiv.org/abs/2405.07979
Accession number: edsarx.2405.07979
Database: arXiv
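
Illustrative note: the description refers to estimating the total treatment effect in a low-order outcome model, with earlier results stated for unit-randomized Bernoulli designs. The sketch below is not the paper's pseudoinverse estimator. It is a minimal illustration of the simplest special case, assuming a degree-1 (linear) outcome model and independent Bernoulli(p) assignment, and it checks unbiasedness by exact enumeration on a toy graph. All names, coefficients, and the four-unit graph are hypothetical and chosen only for illustration.

```python
from itertools import product


def linear_outcomes(z, c0, C):
    """Assumed degree-1 outcome model: Y_i(z) = c0[i] + sum_j C[i][j] * z[j]."""
    n = len(z)
    return [c0[i] + sum(C[i][j] * z[j] for j in range(n)) for i in range(n)]


def total_treatment_effect(c0, C):
    """Average of Y_i(all treated) - Y_i(none treated) over units."""
    n = len(c0)
    y1 = linear_outcomes([1] * n, c0, C)
    y0 = linear_outcomes([0] * n, c0, C)
    return sum(a - b for a, b in zip(y1, y0)) / n


def bernoulli_linear_estimate(y, z, neighborhoods, p):
    """Weighted-outcome estimate for the degree-1 model under Bernoulli(p).

    Each unit's outcome is weighted by the sum, over its neighborhood,
    of z_j / p - (1 - z_j) / (1 - p).
    """
    n = len(y)
    total = 0.0
    for i in range(n):
        w = sum(z[j] / p - (1 - z[j]) / (1 - p) for j in neighborhoods[i])
        total += y[i] * w
    return total / n


def expected_estimate(c0, C, neighborhoods, p):
    """Exact expectation of the estimate, enumerating all 2^n assignments."""
    n = len(c0)
    expectation = 0.0
    for z in product([0, 1], repeat=n):
        prob = 1.0
        for zj in z:
            prob *= p if zj else 1 - p
        y = linear_outcomes(z, c0, C)
        expectation += prob * bernoulli_linear_estimate(y, z, neighborhoods, p)
    return expectation


# Hypothetical path graph on 4 units; each neighborhood includes the unit itself.
neighborhoods = [[0, 1], [0, 1, 2], [1, 2, 3], [2, 3]]
c0 = [1.0, 2.0, 0.5, 1.5]
# C[i][j] is nonzero only when j is in neighborhoods[i].
C = [
    [1.0, 0.5, 0.0, 0.0],
    [0.2, 1.0, 0.3, 0.0],
    [0.0, 0.4, 1.0, 0.1],
    [0.0, 0.0, 0.6, 1.0],
]

tte = total_treatment_effect(c0, C)
mean_estimate = expected_estimate(c0, C, neighborhoods, p=0.3)
print(tte, mean_estimate)
```

Under these assumptions the two printed values agree up to floating-point error, because each weight has mean zero and its product with the matching treatment indicator has mean one. Handling clustered or other general designs, as the description outlines, requires the paper's own construction and is not attempted here.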