
Deep Neural Networks Really Are the Right Way for a Biostatistician to Analyze Biological Data or: How I Learned to Stop Worrying and Love the DNN

David Page, PhD, JB Duke Professor and Chair of the Department of Biostatistics & Bioinformatics at Duke University
Monday, September 16, 2024
12:00 pm - 1:00 pm
CBB Monday Seminar Series

I admit the title intentionally overstates the case. But many (most?) high-throughput biology datasets are based on aggregates, where aggregation occurs either during the experiment or during data post-processing. As a result, any node in a graphical model of the data (e.g., a Bayes net, dynamic Bayes net, Markov net, point process, or CRF) is really an aggregate of many idealized single-measurement nodes, so the real model can be viewed as a high-dimensional tree-structured graphical model. We prove that such models correspond to neural networks, and conversely that every neural network can be viewed as such a model. Based on this theoretical result, we discuss potential applications, including causal neural networks and a possible future "foundation model" for health. We draw examples from clinical data (such as electronic health records, EHRs) as well as biological data.