
A Tale of Two Parasites: Statistical Modelling to Support Disease Control Programmes in Africa

Peter J. Diggle, Emanuele Giorgi, Julienne Atsame, Sylvie Ntsame Ella, Kisito Ogoussan, Katherine Gass.

Source: Statistical Science, Volume 35, Number 1, 42--50.

Abstract:
Vector-borne diseases have long presented major challenges to the health of rural communities in the wet tropical regions of the world, but especially in sub-Saharan Africa. In this paper, we describe the contribution that statistical modelling has made to the global elimination programme for one vector-borne disease, onchocerciasis. We explain why information on the spatial distribution of a second vector-borne disease, Loa loa, is needed before communities at high risk of onchocerciasis can be treated safely with mass distribution of ivermectin, an antifilarial medication. We show how a model-based geostatistical analysis of Loa loa prevalence survey data can be used to map the predictive probability that each location in the region of interest meets a WHO policy guideline for safe mass distribution of ivermectin and describe two applications: one is to data from Cameroon that assesses prevalence using traditional blood-smear microscopy; the other is to Africa-wide data that uses a low-cost questionnaire-based method. We describe how a recent technological development in image-based microscopy has resulted in a change of emphasis from prevalence alone to the bivariate spatial distribution of prevalence and the intensity of infection among infected individuals. We discuss how statistical modelling of the kind described here can contribute to health policy guidelines and decision-making in two ways. One is to ensure that, in a resource-limited setting, prevalence surveys are designed, and the resulting data analysed, as efficiently as possible. The other is to provide an honest quantification of the uncertainty attached to any binary decision by reporting predictive probabilities that a policy-defined condition for action is or is not met.
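Where prevalence is modelled geostatistically, the policy-relevant output is, for each prediction location, the predictive probability that prevalence falls below the action threshold. Below is a minimal Python sketch of that final step only, assuming posterior prevalence samples are already available from a fitted binomial geostatistical model; the locations, sample values and the 20% threshold are illustrative, not the paper's data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical posterior samples of Loa loa prevalence at 3 prediction
# locations (rows: posterior draws, columns: locations), e.g. drawn from
# a fitted binomial geostatistical model.
prev_samples = rng.beta(a=[2, 8, 15], b=[38, 32, 25], size=(5000, 3))

threshold = 0.20  # illustrative policy threshold for safe mass treatment
prob_safe = (prev_samples < threshold).mean(axis=0)

for j, p in enumerate(prob_safe):
    print(f"location {j}: P(prevalence < {threshold:.0%}) = {p:.3f}")
```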





Risk Models for Breast Cancer and Their Validation

Adam R. Brentnall, Jack Cuzick.

Source: Statistical Science, Volume 35, Number 1, 14--30.

Abstract:
Strategies to prevent cancer and diagnose it early when it is most treatable are needed to reduce the public health burden from rising disease incidence. Risk assessment is playing an increasingly important role in targeting individuals in need of such interventions. For breast cancer many individual risk factors have been well understood for a long time, but the development of a fully comprehensive risk model has not been straightforward, in part because there have been limited data where joint effects of an extensive set of risk factors may be estimated with precision. In this article we first review the approach taken to develop the IBIS (Tyrer–Cuzick) model, and describe recent updates. We then review and develop methods to assess calibration of models such as this one, where the risk of disease allowing for competing mortality over a long follow-up time or lifetime is estimated. The breast cancer risk model and calibration assessment methods are demonstrated using a cohort of 132,139 women attending mammography screening in the State of Washington, USA.
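A basic building block of such calibration assessments is the ratio of observed to expected events (O/E) over follow-up. The Python sketch below illustrates only this calibration-in-the-large check on simulated risks, ignoring censoring and competing mortality, which the methods developed in the paper explicitly account for.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical cohort: model-predicted 10-year breast cancer risks and
# observed event indicators, simulated so that the model is calibrated.
predicted_risk = rng.beta(2, 60, size=10_000)          # mean risk ~3%
observed_event = rng.random(10_000) < predicted_risk

O = observed_event.sum()           # observed events
E = predicted_risk.sum()           # expected events under the model
oe_ratio = O / E

# Approximate 95% CI for O/E, treating O as Poisson
ci = (oe_ratio * np.exp(-1.96 / np.sqrt(O)),
      oe_ratio * np.exp(1.96 / np.sqrt(O)))
print(f"O/E = {oe_ratio:.2f}, 95% CI ({ci[0]:.2f}, {ci[1]:.2f})")
```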





Statistical Theory Powering Data Science

Junhui Cai, Avishai Mandelbaum, Chaitra H. Nagaraja, Haipeng Shen, Linda Zhao.

Source: Statistical Science, Volume 34, Number 4, 669--691.

Abstract:
Statisticians are finding their place in the emerging field of data science. However, many issues considered “new” in data science have long histories in statistics. Examples of statistical thinking are illustrated, ranging from exploratory data analysis to measuring uncertainty to accommodating nonrandom samples. These examples are then applied to service networks, baseball predictions and official statistics.





Larry Brown’s Work on Admissibility

Iain M. Johnstone.

Source: Statistical Science, Volume 34, Number 4, 657--668.

Abstract:
Many papers in the early part of Brown’s career focused on the admissibility or otherwise of estimators of a vector parameter. He established that inadmissibility of invariant estimators in three and higher dimensions is a general phenomenon, and found deep and beautiful connections between admissibility and other areas of mathematics. This review touches on several of his major contributions, with a focus on his celebrated 1971 paper connecting admissibility, recurrence and elliptic partial differential equations.
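For readers new to the topic, the canonical instance of the inadmissibility phenomenon in three and higher dimensions is Stein's, stated here for reference (standard material, not a result specific to the 1971 paper):

```latex
% Stein's phenomenon: for X ~ N_p(theta, I_p) with p >= 3, the natural
% invariant estimator delta_0(X) = X is inadmissible under squared error
% loss, being dominated by the James--Stein estimator:
\[
  \delta_{\mathrm{JS}}(X) = \Bigl(1 - \frac{p-2}{\|X\|^{2}}\Bigr) X,
  \qquad
  \mathbb{E}_{\theta}\bigl\|\delta_{\mathrm{JS}}(X) - \theta\bigr\|^{2}
  < \mathbb{E}_{\theta}\bigl\|X - \theta\bigr\|^{2}
  \quad \text{for all } \theta \in \mathbb{R}^{p},\; p \ge 3.
\]
```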





Gaussianization Machines for Non-Gaussian Function Estimation Models

T. Tony Cai.

Source: Statistical Science, Volume 34, Number 4, 635--656.

Abstract:
A wide range of nonparametric function estimation models have been studied individually in the literature. Among them, the homoscedastic nonparametric Gaussian regression is arguably the best known and understood. Inspired by the asymptotic equivalence theory, Brown, Cai and Zhou (Ann. Statist. 36 (2008) 2055–2084; Ann. Statist. 38 (2010) 2005–2046) and Brown et al. (Probab. Theory Related Fields 146 (2010) 401–433) developed a unified approach to turn a collection of non-Gaussian function estimation models into a standard Gaussian regression, so that any good Gaussian nonparametric regression method can then be used. These Gaussianization Machines have two key components, binning and transformation. When combined with BlockJS, a wavelet thresholding procedure for Gaussian regression, the procedures are computationally efficient with strong theoretical guarantees. Technical analysis given in Brown, Cai and Zhou (Ann. Statist. 36 (2008) 2055–2084; Ann. Statist. 38 (2010) 2005–2046) and Brown et al. (Probab. Theory Related Fields 146 (2010) 401–433) shows that the estimators attain the optimal rate of convergence adaptively over a large set of Besov spaces and across a collection of non-Gaussian function estimation models, including robust nonparametric regression, density estimation, and nonparametric regression in exponential families. The estimators are also spatially adaptive. The Gaussianization Machines significantly extend the flexibility and scope of the theories and methodologies originally developed for the conventional nonparametric Gaussian regression. This article aims to provide a concise account of the Gaussianization Machines developed in Brown, Cai and Zhou (Ann. Statist. 36 (2008) 2055–2084; Ann. Statist. 38 (2010) 2005–2046) and Brown et al. (Probab. Theory Related Fields 146 (2010) 401–433).
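As a concrete illustration of the binning-and-transformation recipe in the density estimation case, the Python sketch below bins the data, applies the root transform sqrt(count + 1/4) so that the binned counts behave approximately like Gaussian observations with variance 1/4, smooths in the Gaussian regression world, then squares and renormalizes. A simple moving average stands in for the BlockJS wavelet thresholding step used in the papers.

```python
import numpy as np

rng = np.random.default_rng(2)

# Density estimation via binning + root transform ("Gaussianization").
x = rng.gamma(shape=2.0, scale=1.0, size=2000)

T = 256
edges = np.linspace(x.min(), x.max(), T + 1)
counts, _ = np.histogram(x, bins=edges)

y = np.sqrt(counts + 0.25)                       # variance-stabilizing root transform
kernel = np.ones(9) / 9
y_smooth = np.convolve(y, kernel, mode="same")   # stand-in for a Gaussian-regression smoother

density = y_smooth ** 2                          # back-transform
widths = np.diff(edges)
density /= (density * widths).sum()              # renormalize to integrate to 1
print("density integrates to:", (density * widths).sum().round(3))
```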





Larry Brown’s Contributions to Parametric Inference, Decision Theory and Foundations: A Survey

James O. Berger, Anirban DasGupta.

Source: Statistical Science, Volume 34, Number 4, 621--634.

Abstract:
This article gives a panoramic survey of the general area of parametric statistical inference, decision theory and foundations of statistics for the period 1965–2010 through the lens of Larry Brown’s contributions to varied aspects of this massive area. The article goes over sufficiency, shrinkage estimation, admissibility, minimaxity, complete class theorems, estimated confidence, conditional confidence procedures, Edgeworth and higher order asymptotic expansions, variational Bayes, Stein’s SURE, differential inequalities, geometrization of convergence rates, asymptotic equivalence, aspects of empirical process theory, inference after model selection, unified frequentist and Bayesian testing, and Wald’s sequential theory. A reasonably comprehensive bibliography is provided.





Comment: “Models as Approximations I: Consequences Illustrated with Linear Regression” by A. Buja, R. Berk, L. Brown, E. George, E. Pitkin, L. Zhan and K. Zhang

Roderick J. Little.

Source: Statistical Science, Volume 34, Number 4, 580--583.





Models as Approximations II: A Model-Free Theory of Parametric Regression

Andreas Buja, Lawrence Brown, Arun Kumar Kuchibhotla, Richard Berk, Edward George, Linda Zhao.

Source: Statistical Science, Volume 34, Number 4, 545--565.

Abstract:
We develop a model-free theory of general types of parametric regression for i.i.d. observations. The theory replaces the parameters of parametric models with statistical functionals, to be called “regression functionals,” defined on large nonparametric classes of joint ${x\textrm{-}y}$ distributions, without assuming a correct model. Parametric models are reduced to heuristics to suggest plausible objective functions. An example of a regression functional is the vector of slopes of linear equations fitted by OLS to largely arbitrary ${x\textrm{-}y}$ distributions, without assuming a linear model (see Part I). More generally, regression functionals can be defined by minimizing objective functions, solving estimating equations, or with ad hoc constructions. In this framework, it is possible to achieve the following: (1) define a notion of “well-specification” for regression functionals that replaces the notion of correct specification of models, (2) propose a well-specification diagnostic for regression functionals based on reweighting distributions and data, (3) decompose sampling variability of regression functionals into two sources, one due to the conditional response distribution and another due to the regressor distribution interacting with misspecification, both of order $N^{-1/2}$, (4) exhibit plug-in/sandwich estimators of standard error as limit cases of ${x\textrm{-}y}$ bootstrap estimators, and (5) provide theoretical heuristics to indicate that ${x\textrm{-}y}$ bootstrap standard errors may generally be preferred over sandwich estimators.
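Points (4) and (5) can be made concrete with a small simulation: fit OLS to data from a nonlinear truth and compare the plug-in sandwich standard error of the slope with the x-y (pairs) bootstrap standard error. The Python sketch below uses simulated data, not the paper's examples.

```python
import numpy as np

rng = np.random.default_rng(3)

# OLS slope as a regression functional of an arbitrary x-y distribution:
# compare the plug-in sandwich standard error with the x-y (pairs)
# bootstrap standard error when no linear model actually holds.
n = 500
x = rng.uniform(-2, 2, n)
y = np.sin(2 * x) + rng.normal(scale=0.5, size=n)
X = np.column_stack([np.ones(n), x])

beta = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta

bread = np.linalg.inv(X.T @ X)
meat = X.T @ (X * resid[:, None] ** 2)           # sum of e_i^2 x_i x_i'
se_sandwich = np.sqrt(np.diag(bread @ meat @ bread))[1]

boot = []
for _ in range(2000):
    idx = rng.integers(0, n, n)                  # resample (x, y) pairs
    boot.append(np.linalg.lstsq(X[idx], y[idx], rcond=None)[0][1])
se_boot = np.std(boot)

print(f"sandwich SE: {se_sandwich:.4f}, x-y bootstrap SE: {se_boot:.4f}")
```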





Conditionally Conjugate Mean-Field Variational Bayes for Logistic Models

Daniele Durante, Tommaso Rigon.

Source: Statistical Science, Volume 34, Number 3, 472--485.

Abstract:
Variational Bayes (VB) is a common strategy for approximate Bayesian inference, but simple methods are only available for specific classes of models including, in particular, representations having conditionally conjugate constructions within an exponential family. Models with logit components are an apparently notable exception to this class, due to the absence of conjugacy between the logistic likelihood and the Gaussian priors for the coefficients in the linear predictor. To facilitate approximate inference within this widely used class of models, Jaakkola and Jordan (Stat. Comput. 10 (2000) 25–37) proposed a simple variational approach which relies on a family of tangent quadratic lower bounds of the logistic log-likelihood, thus restoring conjugacy between these approximate bounds and the Gaussian priors. This strategy is still implemented successfully, but few attempts have been made to formally understand the reasons underlying its excellent performance. Following a review of VB for logistic models, we cover this gap by providing a formal connection between the above bound and a recent Pólya-gamma data augmentation for logistic regression. Such a result places the computational methods associated with the aforementioned bounds within the framework of variational inference for conditionally conjugate exponential family models, thereby allowing recent advances for this class to be inherited also by the methods relying on Jaakkola and Jordan (Stat. Comput. 10 (2000) 25–37).
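For concreteness, here is a minimal Python sketch of the Jaakkola and Jordan tangent-bound updates for Bayesian logistic regression with a zero-mean Gaussian prior; these are the same conditionally conjugate CAVI updates that the Pólya-gamma connection places in the exponential-family framework. The data and the prior variance are simulated/assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

# Simulated logistic-regression data with a N(0, tau2 * I) prior on beta.
n, p, tau2 = 300, 3, 10.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta_true = np.array([-0.5, 1.0, -2.0])
y = (rng.random(n) < 1 / (1 + np.exp(-X @ beta_true))).astype(float)

def lam(xi):
    # lambda(xi) = tanh(xi / 2) / (4 xi), with limit 1/8 at xi = 0
    return np.where(xi == 0, 0.125, np.tanh(xi / 2) / (4 * xi))

xi = np.ones(n)
prior_prec = np.eye(p) / tau2
for _ in range(100):
    S = np.linalg.inv(prior_prec + 2 * (X * lam(xi)[:, None]).T @ X)  # q(beta) covariance
    m = S @ (X.T @ (y - 0.5))                                         # q(beta) mean
    xi = np.sqrt(np.einsum("ij,jk,ik->i", X, S + np.outer(m, m), X))  # local bound parameters

print("approximate posterior mean:", np.round(m, 2))
```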





User-Friendly Covariance Estimation for Heavy-Tailed Distributions

Yuan Ke, Stanislav Minsker, Zhao Ren, Qiang Sun, Wen-Xin Zhou.

Source: Statistical Science, Volume 34, Number 3, 454--471.

Abstract:
We provide a survey of recent results on covariance estimation for heavy-tailed distributions. By unifying ideas scattered in the literature, we propose user-friendly methods that facilitate practical implementation. Specifically, we introduce elementwise and spectrumwise truncation operators, as well as their $M$-estimator counterparts, to robustify the sample covariance matrix. Different from the classical notion of robustness that is characterized by the breakdown property, we focus on the tail robustness which is evidenced by the connection between nonasymptotic deviation and confidence level. The key insight is that estimators should adapt to the sample size, dimensionality and noise level to achieve optimal tradeoff between bias and robustness. Furthermore, to facilitate practical implementation, we propose data-driven procedures that automatically calibrate the tuning parameters. We demonstrate their applications to a series of structured models in high dimensions, including the bandable and low-rank covariance matrices and sparse precision matrices. Numerical studies lend strong support to the proposed methods.
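To illustrate the elementwise truncation operator, the Python sketch below robustifies each entry of the sample covariance by averaging truncated products of coordinates, with a simple plug-in truncation level that scales like sqrt(n / log d); the data-driven calibration proposed in the paper is more refined than this.

```python
import numpy as np

rng = np.random.default_rng(5)

# Heavy-tailed data; centering by the sample mean is a simplification.
n, d = 200, 10
X = rng.standard_t(df=3, size=(n, d))
X = X - X.mean(axis=0)

prods = X[:, :, None] * X[:, None, :]            # products x_ij * x_ik, shape (n, d, d)
v = (prods ** 2).mean(axis=0)                    # second moments of the products
tau = np.sqrt(v * n / (2 * np.log(d)))           # plug-in truncation level per entry
trunc = np.sign(prods) * np.minimum(np.abs(prods), tau)
cov_robust = trunc.mean(axis=0)                  # elementwise truncated covariance estimate
```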





The Geometry of Continuous Latent Space Models for Network Data

Anna L. Smith, Dena M. Asta, Catherine A. Calder.

Source: Statistical Science, Volume 34, Number 3, 428--453.

Abstract:
We review the class of continuous latent space (statistical) models for network data, paying particular attention to the role of the geometry of the latent space. In these models, the presence/absence of network dyadic ties are assumed to be conditionally independent given the dyads’ unobserved positions in a latent space. In this way, these models provide a probabilistic framework for embedding network nodes in a continuous space equipped with a geometry that facilitates the description of dependence between random dyadic ties. Specifically, these models naturally capture homophilous tendencies and triadic clustering, among other common properties of observed networks. In addition to reviewing the literature on continuous latent space models from a geometric perspective, we highlight the important role the geometry of the latent space plays on properties of networks arising from these models via intuition and simulation. Finally, we discuss results from spectral graph theory that allow us to explore the role of the geometry of the latent space, independent of network size. We conclude with conjectures about how these results might be used to infer the appropriate latent space geometry from observed networks.
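A minimal simulation of the simplest member of this class, the Euclidean latent distance model, is sketched below in Python: tie probabilities decay with latent distance, which is what induces homophily and transitivity in the generated networks. Parameter values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(6)

# Latent distance model: ties form independently given latent positions,
# with probability decreasing in Euclidean latent distance.
n, alpha = 50, 1.0
z = rng.normal(size=(n, 2))                              # latent positions in R^2
dist = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=-1)
p_tie = 1 / (1 + np.exp(-(alpha - dist)))                # logistic link in distance

A = (rng.random((n, n)) < p_tie).astype(int)
A = np.triu(A, 1)
A = A + A.T                                              # undirected, no self-loops
print("edge density:", A.mean().round(3))
```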





Lasso Meets Horseshoe: A Survey

Anindya Bhadra, Jyotishka Datta, Nicholas G. Polson, Brandon Willard.

Source: Statistical Science, Volume 34, Number 3, 405--427.

Abstract:
The goal of this paper is to contrast and survey the major advances in two of the most commonly used high-dimensional techniques, namely, the Lasso and horseshoe regularization. Lasso is a gold standard for predictor selection while horseshoe is a state-of-the-art Bayesian estimator for sparse signals. Lasso is fast and scalable and uses convex optimization whilst the horseshoe is nonconvex. Our novel perspective focuses on three aspects: (i) theoretical optimality in high-dimensional inference for the Gaussian sparse model and beyond, (ii) efficiency and scalability of computation and (iii) methodological development and performance.
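One way to see the contrast between the two approaches is to draw from the corresponding priors: the horseshoe is a normal scale mixture with half-Cauchy local scales, giving a much taller spike at zero and much heavier tails than the Laplace prior underlying the Lasso. A quick Python sketch:

```python
import numpy as np

rng = np.random.default_rng(7)

# Horseshoe prior: beta_j ~ N(0, tau^2 * lambda_j^2), lambda_j ~ half-Cauchy(0, 1).
m, tau = 100_000, 1.0
lam = np.abs(rng.standard_cauchy(m))            # half-Cauchy local scales
beta_hs = rng.normal(scale=tau * lam)           # horseshoe draws
beta_lasso = rng.laplace(scale=1.0, size=m)     # Laplace (Lasso) draws

for name, b in [("horseshoe", beta_hs), ("laplace", beta_lasso)]:
    print(name,
          "P(|b| < 0.1) =", np.mean(np.abs(b) < 0.1).round(3),
          "P(|b| > 10) =", np.mean(np.abs(b) > 10).round(4))
```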





The Importance of Being Clustered: Uncluttering the Trends of Statistics from 1970 to 2015

Laura Anderlucci, Angela Montanari, Cinzia Viroli.

Source: Statistical Science, Volume 34, Number 2, 280--300.

Abstract:
In this paper, we retrace the recent history of statistics by analyzing all the papers published in five prestigious statistical journals since 1970, namely: The Annals of Statistics, Biometrika, Journal of the American Statistical Association, Journal of the Royal Statistical Society, Series B and Statistical Science. The aim is to construct a kind of “taxonomy” of the statistical papers by organizing and clustering them in main themes. In this sense, being identified in a cluster means being important enough to be uncluttered in the vast and interconnected world of statistical research. Since the main statistical research topics are naturally born, evolve or die over time, we also develop a dynamic clustering strategy, where a group in a time period is allowed to migrate or to merge into different groups in the following one. Results show that statistics is a very dynamic and evolving science, stimulated by the rise of new research questions and types of data.





Rejoinder: Bayes, Oracle Bayes, and Empirical Bayes

Bradley Efron.

Source: Statistical Science, Volume 34, Number 2, 234--235.





Comment: Bayes, Oracle Bayes and Empirical Bayes

Aad van der Vaart.

Source: Statistical Science, Volume 34, Number 2, 214--218.





Comment: Bayes, Oracle Bayes, and Empirical Bayes

Nan Laird.

Source: Statistical Science, Volume 34, Number 2, 206--208.





Comment: Bayes, Oracle Bayes, and Empirical Bayes

Thomas A. Louis.

Source: Statistical Science, Volume 34, Number 2, 202--205.





Bayes, Oracle Bayes and Empirical Bayes

Bradley Efron.

Source: Statistical Science, Volume 34, Number 2, 177--201.

Abstract:
This article concerns the Bayes and frequentist aspects of empirical Bayes inference. Some of the ideas explored go back to Robbins in the 1950s, while others are current. Several examples are discussed, real and artificial, illustrating the two faces of empirical Bayes methodology: “oracle Bayes” shows empirical Bayes in its most frequentist mode, while “finite Bayes inference” is a fundamentally Bayesian application. In either case, modern theory and computation allow us to present a sharp finite-sample picture of what is at stake in an empirical Bayes analysis.
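Robbins' Poisson formula is the simplest worked example of the empirical Bayes idea referred to here: the Bayes rule E[theta | X = x] = (x + 1) f(x + 1) / f(x) depends only on the marginal pmf f, which can be replaced by observed frequencies. A Python sketch with simulated gamma-distributed means:

```python
import numpy as np

rng = np.random.default_rng(8)

# Robbins' nonparametric empirical Bayes estimator for Poisson means.
theta = rng.gamma(shape=2.0, scale=1.0, size=10_000)   # unknown means
x = rng.poisson(theta)                                 # observed counts

counts = np.bincount(x, minlength=x.max() + 2).astype(float)
f = counts / counts.sum()                              # empirical marginal pmf

x_grid = np.arange(x.max() + 1)
with np.errstate(divide="ignore", invalid="ignore"):
    robbins = (x_grid + 1) * f[x_grid + 1] / f[x_grid]  # estimated E[theta | X = x]
print("Robbins estimates for x = 0..7:", np.round(robbins[:8], 2))
```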





Generalized Multiple Importance Sampling

Víctor Elvira, Luca Martino, David Luengo, Mónica F. Bugallo.

Source: Statistical Science, Volume 34, Number 1, 129--155.

Abstract:
Importance sampling (IS) methods are broadly used to approximate posterior distributions or their moments. In the standard IS approach, samples are drawn from a single proposal distribution and weighted accordingly. However, since the performance of IS depends on the mismatch between the target and the proposal distributions, several proposal densities are often employed for the generation of samples. Under this multiple importance sampling (MIS) scenario, extensive literature has addressed the selection and adaptation of the proposal distributions, interpreting the sampling and weighting steps in different ways. In this paper, we establish a novel general framework with sampling and weighting procedures when more than one proposal is available. The new framework encompasses most relevant MIS schemes in the literature, and novel valid schemes appear naturally. All the MIS schemes are compared and ranked in terms of the variance of the associated estimators. Finally, we provide illustrative examples revealing that, even with a good choice of the proposal densities, a careful interpretation of the sampling and weighting procedures can make a significant difference in the performance of the method.
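The effect of the weighting interpretation is easy to see in a toy example: for the same samples drawn from two proposals, "standard" weights use only each sample's own proposal density, whereas deterministic-mixture (balance heuristic) weights use the full mixture in the denominator and typically yield a lower-variance estimator. The Python sketch below uses an arbitrary target, proposals and moment; it illustrates two of the weighting choices the framework encompasses.

```python
import numpy as np

rng = np.random.default_rng(9)

def normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

target = lambda x: normal_pdf(x, 0.0, 1.0)             # standard normal target
mus, sigma, m = np.array([-3.0, 3.0]), 1.0, 5000       # two mismatched proposals

xs = np.concatenate([rng.normal(mu, sigma, m) for mu in mus])
own = np.concatenate([normal_pdf(xs[i * m:(i + 1) * m], mu, sigma)
                      for i, mu in enumerate(mus)])    # each sample's own proposal density
mix = np.mean([normal_pdf(xs, mu, sigma) for mu in mus], axis=0)  # equal-weight mixture

w_std = target(xs) / own                               # "standard" MIS weights
w_dm = target(xs) / mix                                # deterministic-mixture weights
for name, w in [("standard", w_std), ("det. mixture", w_dm)]:
    print(name, "self-normalized estimate of E[X^2]:",
          round(float(np.sum(w * xs ** 2) / np.sum(w)), 3))
```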





Comment: Contributions of Model Features to BART Causal Inference Performance Using ACIC 2016 Competition Data

Nicole Bohme Carnegie.

Source: Statistical Science, Volume 34, Number 1, 90--93.

Abstract:
With a thorough exposition of the methods and results of the 2016 Atlantic Causal Inference Competition, Dorie et al. have set a new standard for reproducibility and comparability of evaluations of causal inference methods. In particular, the open-source R package aciccomp2016, which permits reproduction of all datasets used in the competition, will be an invaluable resource for evaluation of future methodological developments. Building upon results from Dorie et al., we examine whether a set of potential modifications to Bayesian Additive Regression Trees (BART)—multiple chains in model fitting, using the propensity score as a covariate, targeted maximum likelihood estimation (TMLE), and computing symmetric confidence intervals—have a stronger impact on bias, RMSE, and confidence interval coverage in combination than they do alone. We find that bias in the estimate of SATT is minimal, regardless of the BART formulation. For purposes of CI coverage, however, all proposed modifications are beneficial—alone and in combination—but use of TMLE is least beneficial for coverage and results in considerably wider confidence intervals.
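For readers unfamiliar with the estimand, the SATT is obtained from a fitted outcome model by contrasting each treated unit's observed outcome with the model's prediction under control. The Python sketch below uses scikit-learn's gradient boosting purely as a stand-in for BART, omits the propensity-score, TMLE and interval modifications discussed above, and runs on simulated data.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(10)

# Simulated observational data with a constant treatment effect of 2.
n = 2000
X = rng.normal(size=(n, 3))
ps = 1 / (1 + np.exp(-X[:, 0]))                          # true propensity score
z = (rng.random(n) < ps).astype(float)                   # treatment assignment
y = X @ np.array([1.0, 0.5, -0.5]) + 2.0 * z + rng.normal(size=n)

# Outcome model on (covariates, treatment); BART would play this role.
model = GradientBoostingRegressor().fit(np.column_stack([X, z]), y)

treated = z == 1
y0_hat = model.predict(np.column_stack([X[treated], np.zeros(treated.sum())]))
satt_hat = np.mean(y[treated] - y0_hat)                  # SATT estimate
print(f"estimated SATT: {satt_hat:.2f} (truth: 2.00)")
```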





Comment on “Automated Versus Do-It-Yourself Methods for Causal Inference: Lessons Learned from a Data Analysis Competition”

Susan Gruber, Mark J. van der Laan.

Source: Statistical Science, Volume 34, Number 1, 82--85.

Abstract:
Dorie and co-authors (DHSSC) are to be congratulated for initiating the ACIC Data Challenge. Their project engaged the community and accelerated research by providing a level playing field for comparing the performance of a priori specified algorithms. DHSSC identified themes concerning characteristics of the DGP, properties of the estimators, and inference. We discuss these themes in the context of targeted learning.





Matching Methods for Causal Inference: A Review and a Look Forward

Elizabeth A. Stuart.

Source: Statistical Science, Volume 25, Number 1, 1--21.

Abstract:
When estimating causal effects using observational data, it is desirable to replicate a randomized experiment as closely as possible by obtaining treated and control groups with similar covariate distributions. This goal can often be achieved by choosing well-matched samples of the original treated and control groups, thereby reducing bias due to the covariates. Since the 1970s, work on matching methods has examined how to best choose treated and control subjects for comparison. Matching methods are gaining popularity in fields such as economics, epidemiology, medicine and political science. However, until now the literature and related advice have been scattered across disciplines. Researchers who are interested in using matching methods—or developing methods related to matching—do not have a single place to turn to learn about past and current research. This paper provides a structure for thinking about matching methods and guidance on their use, coalescing the existing research (both old and new) and providing a summary of where the literature on matching methods is now and where it should be headed.
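As a concrete example of the simplest workflow the review covers, here is a Python sketch of 1:1 nearest-neighbour matching on the estimated propensity score (greedy, with replacement, on the logit scale); the data, the propensity model and the effect size are simulated for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(11)

# Simulated observational data with a constant treatment effect of 1.5.
n = 1000
X = rng.normal(size=(n, 2))
z = (rng.random(n) < 1 / (1 + np.exp(-(X[:, 0] + 0.5 * X[:, 1])))).astype(int)
y = X[:, 0] + 1.5 * z + rng.normal(size=n)

# Estimated propensity scores, on the logit scale.
logit = LogisticRegression().fit(X, z)
log_p = logit.predict_log_proba(X)
score = log_p[:, 1] - log_p[:, 0]

# Match each treated unit to the nearest control (with replacement).
treated, control = np.where(z == 1)[0], np.where(z == 0)[0]
matches = control[np.abs(score[control][None, :] - score[treated][:, None]).argmin(axis=1)]
att = np.mean(y[treated] - y[matches])
print(f"matched ATT estimate: {att:.2f} (truth: 1.50)")
```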





Heteromodal Cortical Areas Encode Sensory-Motor Features of Word Meaning

The capacity to process information in conceptual form is a fundamental aspect of human cognition, yet little is known about how this type of information is encoded in the brain. Although the role of sensory and motor cortical areas has been a focus of recent debate, neuroimaging studies of concept representation consistently implicate a network of heteromodal areas that seem to support concept retrieval in general rather than knowledge related to any particular sensory-motor content. We used predictive machine learning on fMRI data to investigate the hypothesis that cortical areas in this "general semantic network" (GSN) encode multimodal information derived from basic sensory-motor processes, possibly functioning as convergence–divergence zones for distributed concept representation. An encoding model based on five conceptual attributes directly related to sensory-motor experience (sound, color, shape, manipulability, and visual motion) was used to predict brain activation patterns associated with individual lexical concepts in a semantic decision task. When the analysis was restricted to voxels in the GSN, the model was able to identify the activation patterns corresponding to individual concrete concepts significantly above chance. In contrast, a model based on five perceptual attributes of the word form performed at chance level. This pattern was reversed when the analysis was restricted to areas involved in the perceptual analysis of written word forms. These results indicate that heteromodal areas involved in semantic processing encode information about the relative importance of different sensory-motor attributes of concepts, possibly by storing particular combinations of sensory and motor features.

SIGNIFICANCE STATEMENT: The present study used a predictive encoding model of word semantics to decode conceptual information from neural activity in heteromodal cortical areas. The model is based on five sensory-motor attributes of word meaning (color, shape, sound, visual motion, and manipulability) and encodes the relative importance of each attribute to the meaning of a word. This is the first demonstration that heteromodal areas involved in semantic processing can discriminate between different concepts based on sensory-motor information alone. This finding indicates that the brain represents concepts as multimodal combinations of sensory and motor representations.
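The logic of the analysis (fit an encoding model from a handful of attribute ratings to multi-voxel patterns, then test whether held-out concepts can be identified from their predicted patterns) can be sketched in a few lines of Python. Everything below, including attribute ratings, voxel weights and noise level, is simulated; it mirrors the pairwise identification test only in outline.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(12)

# Simulated words: 5 attribute ratings per word and a 200-voxel pattern
# generated from hypothetical attribute-to-voxel weights plus noise.
n_words, n_attr, n_vox = 60, 5, 200
attrs = rng.random((n_words, n_attr))
W = rng.normal(size=(n_attr, n_vox))
patterns = attrs @ W + rng.normal(scale=2.0, size=(n_words, n_vox))

train, test = np.arange(0, 50), np.arange(50, 60)
model = Ridge(alpha=1.0).fit(attrs[train], patterns[train])
pred = model.predict(attrs[test])

def corr(a, b):
    return np.corrcoef(a, b)[0, 1]

# Pairwise identification: does each held-out word's observed pattern
# correlate more with its own prediction than with another word's?
wins, total = 0, 0
for i in range(len(test)):
    for j in range(len(test)):
        if i != j:
            total += 1
            wins += corr(patterns[test[i]], pred[i]) > corr(patterns[test[i]], pred[j])
print(f"pairwise identification accuracy: {wins / total:.2f} (chance = 0.50)")
```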





We thank you for not smoking / design : Biman Mullick.

London (33 Stillness Rd, London, SE23 1NG) : Cleanair, Campaign for a Smoke-free Environment, [198-?]





No smoking is the norm / design : Biman Mullick.

London : Cleanair, Smoke-free Environment (33 Stillness Rd, London, SE23 1NG), [198-?]





We thank you for not smoking / design : Biman Mullick.

London (33 Stillness Rd, London, SE23 1NG) : Cleanair, Campaign for a Smoke-free Environment, [198-?]





We thank you for not smoking / Biman Mullick.

London : Cleanair, [1988?]





Muchas gracias por no fumar [Thank you very much for not smoking] / Biman Mullick.

London : Cleanair, [1988?]





We thank you for not smoking / design : Biman Mullick.

London (33 Stillness Rd, London, SE23 1NG) : Cleanair, Campaign for a Smoke-free Environment, [198-?]





No smoking is the norm / design : Biman Mullick.

London : Cleanair, Smoke-free Environment (33 Stillness Rd, London, SE23 1NG), [198-?]





Gracias por no fumar [Thank you for not smoking] / design : Biman Mullick.

[London] : Cleanair, Campaña para un Medio Ambiente Libre de Humo [Campaign for a Smoke-free Environment], [198-?]





Muchas gracias por no fumar [Thank you very much for not smoking] / Biman Mullick.

[London] : Cleanair, [1989?]





Smoking is bad for your image / design : Biman Mullick.

[London?], [199-?]





How can the smoker and the nonsmoker be equally free in the same place? George Bernard Shaw / Biman Mullick.

[London?], [199-?]





Danny Smith from No Human Being Is Illegal (in all our glory). Collaged photograph by Deborah Kelly and collaborators, 2014-2018.

[London], 2019.





The 2019 Victoria’s Secret Fashion Show Is Canceled After Facing Backlash for Lack of Body Diversity

The reaction on social media has been fierce.





Blake Lively's Favorite Affordable Jeans Brand Is Having a Major Sale Right Now

Here's everything you need to know about Old Navy's Black Friday and Cyber Monday plans.





Healthy Holiday Gift Ideas for Women

Treat the babe in your life to one (or two or three) of these indulgent gifts.





Editor’s Pick: Gifts for Your Tech-Obsessed Friend

A guide to the tech gadgets even your hard-to-shop-for friends and family members will love.





Taylor Swift, Hailey Bieber, and Tons of Other Celebs’ Favorite Leggings Are on Sale Ahead of Black Friday

Here’s where you can snag their Alo Yoga Moto leggings for less.





Gabrielle Union's Mesmerizing Tie Dye Activewear Set Is On Sale for Black Friday

The rainbow sports bra and leggings set from Splits59 is a must-have for anyone craving a pop of color in their workout wardrobe.





Kourtney Kardashian's Favorite Leggings Are So Good, Everyone Should Own A Pair

And they're on sale for Black Friday. 





These Nordstrom Cyber Monday Deals Are Giving Black Friday a Run for Its Money

This is not a drill: You can get up to 50% off at Nordstrom right now.





Katie Holmes’s Affordable Sneakers Are the Star of Her Latest Outfit

Meghan Markle is also a fan of the comfy shoes.





These Clarks Booties Are Actually Comfortable Enough to Wear All Day—and They’re on Sale

You can save 50% right now. 





The Comfy Sneakers That Kate Middleton, Kelly Ripa, and More Celebs Love Are on Sale at Amazon

Keep your feet comfy and your wallet fat.





Forget Black Booties, Amal Clooney and J.Lo Are Wearing This Weather-Resistant Boot Trend Instead

And it’s on sale at Nordstrom.





Sweatsuits Should Be Your Cozy Day Uniform—and These Are Our Favorites From Amazon

This retro style is making a comeback for a reason.





Nike Launches Zoom Pulse Sneakers for Medical Workers Who Are On Their Feet All Day

The new style is available to shop today.





Shoppers Swear These $30 Colorfulkoala Leggings Are the Ultimate Lululemon Dupes

And they’re available in 19 fun prints.