Design-Based Anytime-Valid Causal Inference
Randomized experiments are the gold standard for inferring a causal effect. Consequently, many organizations run thousands of randomized experiments to quantify the impact of product changes, which managers then use to inform deployment and investment decisions. Often, these experiments are conducted on customers arriving sequentially; however, the analysis is performed only at the end of the study. This is undesirable because large effects can be detected before the end of the study, which is especially important if the treatment effect is negative. Alternatively, analysts could perform hypothesis tests more frequently and stop the experiment when the estimated causal effect is statistically significant; this practice is often called ``peeking.'' Unfortunately, peeking invalidates the statistical guarantees and inflates the type-1 error. Our paper provides valid design-based confidence sequences, i.e., sequences of confidence intervals with uniform type-1 error guarantees over time, for various sequential experiments in an assumption-light manner. In particular, our results apply to the average treatment effect for different individuals arriving sequentially, the mean reward difference in multi-arm bandit settings with adaptive treatment assignments, the contemporaneous treatment effect in a single time series experiment with carryover effects, and the average contemporaneous treatment effect in panel experiments. We further provide a variance reduction technique that incorporates modeling assumptions and covariates to reduce the width of the confidence sequence in proportion to how well we can predict the next outcome. Our work constructs both exact and asymptotic design-based confidence sequences; however, our main results focus on the asymptotic regime because of its general applicability and attractive properties.
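To make the notion of a confidence sequence concrete, the following is a minimal sketch of an asymptotic confidence sequence for a running mean, using the normal-mixture boundary popularized by Waudby-Smith et al. This is an illustration of the general idea, not the paper's own design-based construction; the tuning parameter `rho` and the plug-in variance estimate are assumptions of this sketch.

```python
import numpy as np

def asymptotic_cs(x, alpha=0.05, rho=1.0):
    """Running asymptotic confidence sequence for the mean of x.

    Uses a normal-mixture boundary (a sketch after Waudby-Smith et al.);
    rho is a user-chosen tuning parameter controlling where the
    sequence is tightest, and the variance is estimated by plug-in.
    """
    x = np.asarray(x, dtype=float)
    t = np.arange(1, len(x) + 1)
    mean = np.cumsum(x) / t
    # Running plug-in variance estimate, floored to avoid log/sqrt issues
    # at very small t where the empirical variance can be zero.
    var = np.maximum(np.cumsum(x**2) / t - mean**2, 1e-12)
    # Normal-mixture boundary: valid uniformly over time (asymptotically),
    # so the interval may be inspected after every observation.
    width = np.sqrt(
        2 * (t * var * rho**2 + 1) / (t**2 * rho**2)
        * np.log(np.sqrt(t * var * rho**2 + 1) / alpha)
    )
    return mean - width, mean + width

# Simulated sequential outcomes with true mean 0.3 (hypothetical data).
rng = np.random.default_rng(0)
lo, hi = asymptotic_cs(rng.normal(loc=0.3, size=5000))
```

Because the coverage guarantee is uniform over time, an analyst may stop the experiment the first time the sequence excludes zero without inflating the type-1 error; the width shrinks (up to a logarithmic factor) at the usual root-t rate.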