A knockoff filter for high-dimensional selective inference

By projecteuclid.org
Published On :: Fri, 02 Aug 2019 22:04 EDT

Rina Foygel Barber, Emmanuel J. Candès.

Source: The Annals of Statistics, Volume 47, Number 5, 2504–2537.

Abstract:
This paper develops a framework for testing for associations in a possibly high-dimensional linear model where the number of features/variables may far exceed the number of observational units. In this framework, the observations are split into two groups, where the first group is used to screen for a set of potentially relevant variables, whereas the second is used for inference over this reduced set of variables; we also develop strategies for leveraging information from the first part of the data at the inference step for greater power. In our work, the inferential step is carried out by applying the recently introduced knockoff filter, which creates a knockoff copy—a fake variable serving as a control—for each screened variable. We prove that this procedure controls the directional false discovery rate (FDR) in the reduced model controlling for all screened variables; this says that our high-dimensional knockoff procedure “discovers” important variables as well as the directions (signs) of their effects, in such a way that the expected proportion of wrongly chosen signs is below the user-specified level (thereby controlling a notion of Type S error averaged over the selected set). This result is nonasymptotic, and holds for any distribution of the original features and any values of the unknown regression coefficients, so that inference is not calibrated under hypothesized values of the effect sizes. We demonstrate the performance of our general and flexible approach through numerical studies, showing more power than existing alternatives. Finally, we apply our method to a genome-wide association study to find locations on the genome that are possibly associated with a continuous phenotype.
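The abstract does not spell out how the knockoff filter turns its statistics into a selected set, so the following is only a minimal sketch of that final step. It assumes the "knockoff+" threshold from the original knockoff filter paper and assumes that one statistic W_j per screened variable has already been computed on the second part of the data, with large positive values favoring the real variable over its knockoff copy. The function names `knockoff_plus_threshold` and `knockoff_select` are hypothetical, and the sketch leaves out the screening step, the construction of the knockoffs, and the sign estimates.

```python
import numpy as np

def knockoff_plus_threshold(W, q):
    """Return the smallest t among the nonzero |W_j| such that
    (1 + #{j: W_j <= -t}) / max(1, #{j: W_j >= t}) <= q,
    or infinity if no such t exists (assumed knockoff+ rule)."""
    W = np.asarray(W, dtype=float)
    # Candidate thresholds: distinct nonzero magnitudes, in increasing order.
    candidates = np.sort(np.unique(np.abs(W[W != 0])))
    for t in candidates:
        # Negative statistics act as a control group for estimating false discoveries.
        fdp_hat = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            return float(t)
    return np.inf

def knockoff_select(W, q):
    """Indices of variables whose statistic reaches the threshold."""
    W = np.asarray(W, dtype=float)
    t = knockoff_plus_threshold(W, q)
    return np.flatnonzero(W >= t)

# Illustrative statistics only; these numbers are made up for the example.
W_example = [5, 4, 3, 2.5, -1, 2, 0.5, -0.3, 6, 7]
selected = knockoff_select(W_example, q=0.2)
```

With a small target level q and few variables, the "+1" in the numerator can make the threshold infinite, in which case nothing is selected.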

Property testing in high-dimensional Ising models

Isotonic regression in general dimensions

The two-to-infinity norm and singular subspace geometry with applications to high-dimensional statistics

Cross validation for locally stationary processes

Dynamic network models and graphon estimation

On testing conditional qualitative treatment effects

Convergence complexity analysis of Albert and Chib’s algorithm for Bayesian probit regression

Convergence rates of least squares regression estimators with heavy-tailed errors

On deep learning as a remedy for the curse of dimensionality in nonparametric regression

Negative association, ordering and convergence of resampling methods

Bayes and empirical-Bayes multiplicity adjustment in the variable-selection problem

Correction: Sensitivity analysis for an unobserved moderator in RCT-to-target-population generalization of treatment effects

Bayesian mixed effects models for zero-inflated compositions in microbiome data analysis

A hierarchical dependent Dirichlet process prior for modelling bird migration patterns in the UK

Estimating causal effects in studies of human brain function: New models, methods and estimands

A comparison of principal component methods between multiple phenotype regression and multiple SNP regression in genetic association studies

Estimating and forecasting the smoking-attributable mortality fraction for both genders jointly in over 60 countries

Regression for copula-linked compound distributions with applications in modeling aggregate insurance claims

Modeling wildfire ignition origins in southern California using linear network point processes

Optimal asset allocation with multivariate Bayesian dynamic linear models

Feature selection for generalized varying coefficient mixed-effect models with application to obesity GWAS

Estimating the health effects of environmental mixtures using Bayesian semiparametric regression and sparsity inducing priors

A hierarchical Bayesian model for predicting ecological interactions using scaled evolutionary relationships

Modifying the Chi-square and the CMH test for population genetic inference: Adapting to overdispersion

TFisher: A powerful truncation and weighting procedure for combining $p$-values

Assessing wage status transition and stagnation using quantile transition regression

Surface temperature monitoring in liver procurement via functional variance change-point analysis

Modeling microbial abundances and dysbiosis with beta-binomial regression

Efficient real-time monitoring of an emerging influenza pandemic: How feasible?

Integrative survival analysis with uncertain event times in application to a suicide risk study

SHOPPER: A probabilistic model of consumer choice with substitutes and complements

A general theory for preferential sampling in environmental networks

Hierarchical infinite factor models for improving the prediction of surgical complications for geriatric patients

Bayesian indicator variable selection to incorporate hierarchical overlapping group structure in multi-omics applications

On Bayesian new edge prediction and anomaly detection in computer networks

Scalable high-resolution forecasting of sparse spatiotemporal events with kernel methods: A winning solution to the NIJ “Real-Time Crime Forecasting Challenge”

A simple, consistent estimator of SNP heritability from genome-wide association studies

New formulation of the logistic-Gaussian process to analyze trajectory tracking data

Empirical Bayes analysis of RNA sequencing experiments with auxiliary information

Outline analyses of the called strike zone in Major League Baseball

Predicting paleoclimate from compositional data using multivariate Gaussian process inverse prediction

A nonparametric spatial test to identify factors that shape a microbiome

A latent discrete Markov random field approach to identifying and classifying historical forest communities based on spatial multivariate tree species counts

Objective Bayes model selection of Gaussian interventional essential graphs for the identification of signaling pathways

Microsimulation model calibration using incremental mixture approximate Bayesian computation

Prediction of small area quantiles for the conservation effects assessment project using a mixed effects quantile regression model

Joint model of accelerated failure time and mechanistic nonlinear model for censored covariates, with application in HIV/AIDS

The Finish Line: Katrina One Year After

The Finish Line: Cast Stone and EIFS

The Finish Line: EPS Vs. Polyisocyanurate Insulation

The Finish Line: A (Faux) Monument for the Ages

The Finish Line: EIFS Inspection

The Finish Line: Right Solutions for the Right Problems

Anti-LEED Legislation

Hydronic Floor Heating

Green Advocacy vs. Informed Consent

The Greenest Low Slope Roofing Solution

Farming with Shipping Containers

Cost-Effective, Energy Efficient Concrete Sandwich Panels

Exoskeleton in the Job Site Closet

Tech giant’s philanthropic arm gives almost £500,000 to two London charities

Only 12 per cent of leading charities publicly recognise a trade union, analysis suggests
