Matteo Sesia (Stanford University)
19 December 2018 @ 11:00 - 12:00
“New tools for reproducible variable selection with knockoffs”
Abstract
Model-X knockoffs [1] is a new statistical framework that allows scientists to investigate the relationship between a response of interest and hundreds or thousands of explanatory variables. In particular, model-X knockoffs can be used to identify, from a larger pool, a subset of important variables that could potentially explain a phenomenon under study, while rigorously controlling the false discovery rate [2] in very complex statistical models. In this talk we will briefly review the fundamentals of knockoffs and their use in two different scenarios. First, we will discuss how knockoffs can exploit prior knowledge of genetic variation to obtain a powerful tool for genome-wide association studies [3]. Then, we will see how the information contained in large unsupervised datasets can be harnessed to perform what is effectively “model-free” variable selection [4].
References:
[1] E. J. Candès, Y. Fan, L. Janson, and J. Lv, “Panning for gold: ‘model-X’ knockoffs for high dimensional controlled variable selection”. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 2018.
[2] Y. Benjamini and Y. Hochberg, “Controlling the false discovery rate: a practical and powerful approach to multiple testing”. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 1995.
[3] M. Sesia, C. Sabatti, and E. J. Candès, “Gene hunting with hidden Markov model knockoffs”. Biometrika, 2018.
[4] Y. Romano, M. Sesia, and E. J. Candès, “Deep Knockoffs”. arXiv:1811.06687, 2018.