
Controlled Discovery and Localization of Signals via Bayesian Linear Programming

Friday, April 01, 2022
3:30 pm - 4:30 pm
Lucas Janson, Assistant Professor, Statistics and Affiliate, Computer Science, Harvard University
Statistical Science Seminar Series

In many high-dimensional statistical problems, it is necessary to simultaneously discover signals and localize them as precisely as possible. For instance, genetic fine-mapping aims to discover causal genetic variants, but the strong local dependence structure of the genome makes it hard to identify the exact locations of those variants. The statistical task is therefore to output as many regions as possible, and to make those regions as small as possible, while controlling how many outputted regions contain no signal. The same type of problem arises in any signal discovery application where signals cannot be perfectly localized, such as locating stars in astronomical sky surveys and change-point detection in time series. However, there are two competing objectives: maximizing the number of discoveries and minimizing the size of those discoveries, all while controlling false discoveries. Our first contribution is therefore a single unified measure, which we call the resolution-adjusted power, that formally trades off these two objectives and hence, at least in principle, can be maximized subject to a constraint on false discoveries. We take a Bayesian approach, but the resulting constrained posterior optimization over candidate discovery regions is non-convex and extremely high-dimensional. Our second contribution is thus Bayesian Linear Programming (BLiP), which uses linear programming to find a feasible solution (i.e., one that controls false discoveries) that verifiably nearly maximizes the expected resolution-adjusted power. BLiP is remarkably computationally efficient and can wrap around any Bayesian model and algorithm for approximating the posterior distribution over signal locations. Applying BLiP on top of existing state-of-the-art Bayesian analyses of UK Biobank data (for genetic fine-mapping) and the Sloan Digital Sky Survey (for astronomical point source detection) increased the resolution-adjusted power by 30-120% with just a few minutes of computation.
BLiP is implemented in the new packages pyblip (Python) and blipr (R).
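
To make the linear-programming idea in the abstract concrete, here is a minimal sketch of one way such a problem could be posed. It is not the pyblip or blipr API, and it is not claimed to be the speaker's exact formulation. The function name `select_regions`, the region weight 1/|G|, and the linearized false discovery rate constraint are all assumptions made for illustration. Each candidate region G comes with a posterior probability p_G that it contains a signal. The program maximizes the weighted sum of p_G over selected regions while requiring selected regions to be disjoint and the expected fraction of false discoveries to stay below a level q. The relaxation can return fractional values, which this sketch does not round.

```python
import numpy as np
from scipy.optimize import linprog


def select_regions(regions, pips, q=0.1):
    """Illustrative LP relaxation for choosing disjoint discovery regions.

    regions: list of sets of location indices (candidate regions).
    pips:    posterior probability that each region contains a signal.
    q:       target level for the expected false discovery proportion.
    Returns a vector x in [0, 1]; x[i] near 1 means region i is selected.
    """
    pips = np.asarray(pips, dtype=float)
    n = len(regions)

    # Assumed weight: smaller regions are worth more (1 / region size).
    weights = np.array([1.0 / len(g) for g in regions])

    # linprog minimizes, so negate the expected weighted number of true discoveries.
    c = -(pips * weights)

    # Linearized FDR constraint: sum((1 - p_G) x_G) <= q * sum(x_G),
    # rewritten as sum((1 - p_G - q) x_G) <= 0.
    fdr_row = (1.0 - pips - q)[None, :]

    # Disjointness: each location may be covered by at most one selected region.
    locations = sorted(set().union(*regions))
    disjoint = np.array(
        [[1.0 if j in g else 0.0 for g in regions] for j in locations]
    )

    A_ub = np.vstack([fdr_row, disjoint])
    b_ub = np.concatenate([[0.0], np.ones(len(locations))])

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, 1)] * n, method="highs")
    return res.x


# Toy example: locations 0 and 1 are hard to tell apart individually,
# but the region {0, 1} almost surely contains a signal.
toy_regions = [{0}, {1}, {0, 1}, {2}]
toy_pips = [0.2, 0.2, 0.95, 0.99]
print(select_regions(toy_regions, toy_pips, q=0.1))
```

In this toy input the two single-location regions have low posterior probability, so the program favors the wider region {0, 1} together with the well-localized region {2}, which mirrors the trade-off between the number and the size of discoveries described above.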

This event will be held in 116 Old Chemistry and will also be available on Zoom.
Join Zoom Meeting
Meeting ID: 923 9738 2385
Passcode: 425966