Accurate and scalable large-scale variational inference for mixed models
Generalized linear mixed models are a workhorse of applied statistics. In modern applications, from political science to electronic marketing, it is common to have categorical factors with a large number of levels. This arises naturally when considering interaction terms in survey-type data, or in recommender-system-type applications. In such contexts it is important to have a scalable computational framework, that is, one whose complexity scales linearly with the number of observations $n$ and parameters $p$ in the model. Popular implementations, such as lmer, although highly optimized, involve costs that scale polynomially with $n$ and $p$. We adopt a Bayesian approach (although the essence of our arguments applies more generally) for inference in such contexts and design families of variational approximations for approximate Bayesian inference with provable scalability. We also provide guarantees on the resulting approximation error and, in fact, link it to the rate of convergence of the numerical schemes used to obtain the variational approximation.
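To illustrate what "complexity linear in $n$ and $p$" means in the simplest possible setting, the following is a hypothetical toy sketch (not the authors' method): a Gaussian random-intercept model with known variances, where the posterior over the $p$ level effects factorizes across levels, so a mean-field Gaussian approximation is exact and can be computed in one $O(n)$ pass over the data plus one $O(p)$ pass over the levels.

```python
import numpy as np

# Toy illustration (assumed setup, not the talk's actual model):
#   y_i ~ N(u_{g[i]}, sigma2),  u_j ~ N(0, tau2),  j = 1..p levels.
# With sigma2 and tau2 known, the posterior over u factorizes across
# levels, so the mean-field Gaussian approximation is exact here and
# each update costs O(n + p).

rng = np.random.default_rng(0)
n, p = 100_000, 5_000          # many observations, many factor levels
sigma2, tau2 = 1.0, 0.5        # assumed-known noise / random-effect variances

u_true = rng.normal(0.0, np.sqrt(tau2), size=p)
g = rng.integers(0, p, size=n)             # factor level of each observation
y = u_true[g] + rng.normal(0.0, np.sqrt(sigma2), size=n)

# Per-level sufficient statistics in O(n) via bincount
counts = np.bincount(g, minlength=p)
sums = np.bincount(g, weights=y, minlength=p)

# Variational (here exact) Gaussian posterior for each level, in O(p)
post_prec = counts / sigma2 + 1.0 / tau2   # posterior precision per level
post_mean = (sums / sigma2) / post_prec    # shrunken posterior mean
post_var = 1.0 / post_prec                 # posterior variance per level
```

In a genuine GLMM the likelihood is non-Gaussian and the factors are crossed, so the posterior no longer factorizes exactly; the point of the sketch is only that the sufficient-statistics pattern above is the kind of per-iteration cost a scalable variational scheme must preserve.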
This is joint work with Giacomo Zanella (Bocconi) and Max Goplerud (Pittsburgh).