cl

Extra-coronal restorations : concepts and clinical application

9783319790930 (electronic bk.)




cl

Encyclopedia of social insects

9783319903064 electronic book




cl

Encyclopedia of signaling molecules

9781461464389 (electronic bk.)




cl

Encyclopedia of renewable and sustainable materials

9780128131961 (print)




cl

Encyclopedia of molecular pharmacology

9783030215736 (electronic bk.)




cl

Encyclopedia of cancer

9783642278419 (electronic bk.)




cl

Early onset scoliosis : a clinical casebook

9783319715803 (electronic bk.)




cl

Clinical manual of fever in children

El-Radhi, A. Sahib, author.
9783319923369 (electronic book)




cl

Clinical approaches in endodontic regeneration : current and emerging therapeutic perspectives

9783319968483 (electronic bk.)




cl

Clinical Manual of Dermatology

Huang, William W. author.
9783030239404




cl

Clinical Cases in Disorders of Melanocytes

9783030227579




cl

Climate change and soil interactions

9780128180334 (electronic bk.)




cl

Climate change and food security with emphasis on wheat

9780128195277




cl

Atlas of ulcers in systemic sclerosis : diagnosis and management

9783319984773 (electronic bk.)




cl

Atlas of sexually transmitted diseases : clinical aspects and differential diagnosis

9783319574707 (electronic bk.)




cl

Anomalies of the Developing Dentition : a Clinical Guide to Diagnosis and Management

Soxman, Jane A., author.
9783030031640 (electronic bk.)




cl

An encyclopaedia of British bridges

McFetrich, David, author.
9781526752963 (electronic bk.)




cl

A handbook of nuclear applications in humans' lives

Tabbakh, Farshid, author.
9781527544512 (electronic bk.)




cl

100 cases in clinical pharmacology, therapeutics and prescribing

Layne, Kerry, author.
9780429624537 electronic book




cl

Asymptotic genealogies of interacting particle systems with an application to sequential Monte Carlo

Jere Koskela, Paul A. Jenkins, Adam M. Johansen, Dario Spanò.

Source: The Annals of Statistics, Volume 48, Number 1, 560--583.

Abstract:
We study weighted particle systems in which new generations are resampled from current particles with probabilities proportional to their weights. This covers a broad class of sequential Monte Carlo (SMC) methods, widely used in applied statistics and cognate disciplines. We consider the genealogical tree embedded into such particle systems, and identify conditions, as well as an appropriate time-scaling, under which they converge to the Kingman $n$-coalescent in the infinite system size limit in the sense of finite-dimensional distributions. Thus, the tractable $n$-coalescent can be used to predict the shape and size of SMC genealogies, as we illustrate by characterising the limiting mean and variance of the tree height. SMC genealogies are known to be connected to algorithm performance, so that our results are likely to have applications in the design of new methods as well. Our conditions for convergence are strong, but we show by simulation that they do not appear to be necessary.
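As a rough illustration of the tree-height moments mentioned in the abstract, the sketch below computes the mean and variance of the height of a standard Kingman $n$-coalescent in coalescent time units. It uses the textbook fact that, while $k$ lineages remain, the waiting time to the next merger is exponential with rate $\binom{k}{2}$, independently across levels. The function name is hypothetical, and the paper's time-scaling from SMC generations to coalescent time is not reproduced here.

```python
from fractions import Fraction
from math import comb


def kingman_tree_height_moments(n):
    """Exact mean and variance of the Kingman n-coalescent tree height.

    The height is a sum of independent exponential waiting times with
    rates C(k, 2) for k = n, n-1, ..., 2, so the means 1/C(k, 2) and the
    variances 1/C(k, 2)^2 simply add up.
    """
    if n < 2:
        raise ValueError("need at least two lineages")
    rates = [comb(k, 2) for k in range(2, n + 1)]
    mean = sum(Fraction(1, r) for r in rates)
    variance = sum(Fraction(1, r * r) for r in rates)
    return mean, variance


if __name__ == "__main__":
    for n in (2, 10, 100):
        mean, variance = kingman_tree_height_moments(n)
        print(n, float(mean), float(variance))
```

The mean telescopes to $2(1 - 1/n)$, so it stays below 2 coalescent time units however many lineages are sampled.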




cl

Model assisted variable clustering: Minimax-optimal recovery and algorithms

Florentina Bunea, Christophe Giraud, Xi Luo, Martin Royer, Nicolas Verzelen.

Source: The Annals of Statistics, Volume 48, Number 1, 111--137.

Abstract:
The problem of variable clustering is that of estimating groups of similar components of a $p$-dimensional vector $X=(X_{1},\ldots ,X_{p})$ from $n$ independent copies of $X$. There exists a large number of algorithms that return data-dependent groups of variables, but their interpretation is limited to the algorithm that produced them. An alternative is model-based clustering, in which one begins by defining population level clusters relative to a model that embeds notions of similarity. Algorithms tailored to such models yield estimated clusters with a clear statistical interpretation. We take this view here and introduce the class of $G$-block covariance models as a background model for variable clustering. In such models, two variables in a cluster are deemed similar if they have similar associations with all other variables. This can arise, for instance, when groups of variables are noise corrupted versions of the same latent factor. We quantify the difficulty of clustering data generated from a $G$-block covariance model in terms of cluster proximity, measured with respect to two related, but different, cluster separation metrics. We derive minimax cluster separation thresholds, which are the metric values below which no algorithm can recover the model-defined clusters exactly, and show that they are different for the two metrics. We therefore develop two algorithms, COD and PECOK, tailored to $G$-block covariance models, and study their minimax-optimality with respect to each metric. Of independent interest is the fact that the analysis of the PECOK algorithm, which is based on a corrected convex relaxation of the popular $K$-means algorithm, provides the first statistical analysis of such algorithms for variable clustering. Additionally, we compare our methods with another popular clustering method, spectral clustering. Extensive simulation studies, as well as our data analyses, confirm the applicability of our approach.




cl

Generalized cluster trees and singular measures

Yen-Chi Chen.

Source: The Annals of Statistics, Volume 47, Number 4, 2174--2203.

Abstract:
In this paper we study the $\alpha$-cluster tree ($\alpha$-tree) under both singular and nonsingular measures. The $\alpha$-tree uses probability contents within a set created by the ordering of points to construct a cluster tree so that it is well defined even for singular measures. We first derive the convergence rate for a density level set around critical points, which leads to the convergence rate for estimating an $\alpha$-tree under nonsingular measures. For singular measures, we study how the kernel density estimator (KDE) behaves and prove that the KDE is not uniformly consistent but pointwise consistent after rescaling. We further prove that the estimated $\alpha$-tree fails to converge in the $L_{\infty}$ metric but is still consistent under the integrated distance. We also observe a new type of critical point of a singular measure, the dimensional critical points (DCPs). DCPs are points that contribute to cluster tree topology but cannot be defined using density gradient. Building on the analysis of the KDE and DCPs, we prove the topological consistency of an estimated $\alpha$-tree.




cl

Regression for copula-linked compound distributions with applications in modeling aggregate insurance claims

Peng Shi, Zifeng Zhao.

Source: The Annals of Applied Statistics, Volume 14, Number 1, 357--380.

Abstract:
In actuarial research a task of particular interest and importance is to predict the loss cost for individual risks so that informed decisions are made in various insurance operations such as underwriting, ratemaking and capital management. The loss cost is typically viewed as following a compound distribution where the summation of the severity variables is stopped by the frequency variable. A challenging issue in modeling such outcomes is to accommodate the potential dependence between the number of claims and the size of each individual claim. In this article we introduce a novel regression framework for compound distributions that uses a copula to accommodate the association between the frequency and the severity variables and, thus, allows for arbitrary dependence between the two components. We further show that the new model is very flexible and is easily modified to account for incomplete data due to censoring or truncation. The flexibility of the proposed model is illustrated using both simulated and real data sets. In the analysis of granular claims data from property insurance, we find a substantive negative relationship between the number and the size of insurance claims. In addition, we demonstrate that ignoring the frequency-severity association could lead to biased decision-making in insurance operations.




cl

Predicting paleoclimate from compositional data using multivariate Gaussian process inverse prediction

John R. Tipton, Mevin B. Hooten, Connor Nolan, Robert K. Booth, Jason McLachlan.

Source: The Annals of Applied Statistics, Volume 13, Number 4, 2363--2388.

Abstract:
Multivariate compositional count data arise in many applications including ecology, microbiology, genetics and paleoclimate. A frequent question in the analysis of multivariate compositional count data is what underlying values of a covariate(s) give rise to the observed composition. Learning the relationship between covariates and the compositional count allows for inverse prediction of unobserved covariates given compositional count observations. Gaussian processes provide a flexible framework for modeling functional responses with respect to a covariate without assuming a functional form. Many scientific disciplines use Gaussian process approximations to improve prediction and make inference on latent processes and parameters. When prediction is desired on unobserved covariates given realizations of the response variable, this is called inverse prediction. Because inverse prediction is often mathematically and computationally challenging, predicting unobserved covariates often requires fitting models that are different from the hypothesized generative model. We present a novel computational framework that allows for efficient inverse prediction using a Gaussian process approximation to generative models. Our framework enables scientific learning about how the latent processes co-vary with respect to covariates while simultaneously providing predictions of missing covariates. The proposed framework is capable of efficiently exploring the high dimensional, multi-modal latent spaces that arise in the inverse problem. To demonstrate flexibility, we apply our method in a generalized linear model framework to predict latent climate states given multivariate count data. Based on cross-validation, our model has predictive skill competitive with current methods while simultaneously providing formal, statistical inference on the underlying community dynamics of the biological system previously not available.




cl

A latent discrete Markov random field approach to identifying and classifying historical forest communities based on spatial multivariate tree species counts

Stephen Berg, Jun Zhu, Murray K. Clayton, Monika E. Shea, David J. Mladenoff.

Source: The Annals of Applied Statistics, Volume 13, Number 4, 2312--2340.

Abstract:
The Wisconsin Public Land Survey database describes historical forest composition at high spatial resolution and is of interest in ecological studies of forest composition in Wisconsin just prior to significant Euro-American settlement. For such studies it is useful to identify recurring subpopulations of tree species known as communities, but standard clustering approaches for subpopulation identification do not account for dependence between spatially nearby observations. Here, we develop and fit a latent discrete Markov random field model for the purpose of identifying and classifying historical forest communities based on spatially referenced multivariate tree species counts across Wisconsin. We show empirically for the actual dataset and through simulation that our latent Markov random field modeling approach improves prediction and parameter estimation performance. For model fitting we introduce a new stochastic approximation algorithm which enables computationally efficient estimation and classification of large amounts of spatial multivariate count data.




cl

Incorporating conditional dependence in latent class models for probabilistic record linkage: Does it matter?

Huiping Xu, Xiaochun Li, Changyu Shen, Siu L. Hui, Shaun Grannis.

Source: The Annals of Applied Statistics, Volume 13, Number 3, 1753--1790.

Abstract:
The conditional independence assumption of the Fellegi and Sunter (FS) model in probabilistic record linkage is often violated when matching real-world data. Ignoring conditional dependence has been shown to seriously bias parameter estimates. However, in record linkage, the ultimate goal is to inform the match status of record pairs and therefore, record linkage algorithms should be evaluated in terms of matching accuracy. In the literature, more flexible models have been proposed to relax the conditional independence assumption, but few studies have assessed whether such accommodations improve matching accuracy. In this paper, we show that incorporating the conditional dependence appropriately yields matching accuracy comparable to or better than that of the FS model using three real-world data linkage examples. Through a simulation study, we further investigate when conditional dependence models provide improved matching accuracy. Our study shows that the FS model is generally robust to violations of the conditional independence assumption and provides matching accuracy comparable to that of the more complex conditional dependence models. However, when the match prevalence approaches 0% or 100% and conditional dependence exists in the dominating class, it is necessary to address conditional dependence as the FS model produces suboptimal matching accuracy. The need to address conditional dependence becomes less important when highly discriminating fields are used. Our simulation study also shows that conditional dependence models with misspecified dependence structure could produce less accurate record matching than the FS model and therefore we caution against the blind use of conditional dependence models.




cl

A hierarchical Bayesian model for single-cell clustering using RNA-sequencing data

Yiyi Liu, Joshua L. Warren, Hongyu Zhao.

Source: The Annals of Applied Statistics, Volume 13, Number 3, 1733--1752.

Abstract:
Understanding the heterogeneity of cells is an important biological question. The development of single-cell RNA-sequencing (scRNA-seq) technology provides high resolution data for such inquiry. A key challenge in scRNA-seq analysis is the high variability of measured RNA expression levels and frequent dropouts (missing values) due to limited input RNA compared to bulk RNA-seq measurement. Existing clustering methods do not perform well for these noisy and zero-inflated scRNA-seq data. In this manuscript we propose a Bayesian hierarchical model, called BasClu, to appropriately characterize important features of scRNA-seq data in order to more accurately cluster cells. We demonstrate the effectiveness of our method with extensive simulation studies and applications to three real scRNA-seq datasets.




cl

Network classification with applications to brain connectomics

Jesús D. Arroyo Relión, Daniel Kessler, Elizaveta Levina, Stephan F. Taylor.

Source: The Annals of Applied Statistics, Volume 13, Number 3, 1648--1677.

Abstract:
While statistical analysis of a single network has received a lot of attention in recent years, with a focus on social networks, analysis of a sample of networks presents its own challenges which require a different set of analytic tools. Here we study the problem of classification of networks with labeled nodes, motivated by applications in neuroimaging. Brain networks are constructed from imaging data to represent functional connectivity between regions of the brain, and previous work has shown the potential of such networks to distinguish between various brain disorders, giving rise to a network classification problem. Existing approaches tend to either treat all edge weights as a long vector, ignoring the network structure, or focus on graph topology as represented by summary measures while ignoring the edge weights. Our goal is to design a classification method that uses both the individual edge information and the network structure of the data in a computationally efficient way, and that can produce a parsimonious and interpretable representation of differences in brain connectivity patterns between classes. We propose a graph classification method that uses edge weights as predictors but incorporates the network nature of the data via penalties that promote sparsity in the number of nodes, in addition to the usual sparsity penalties that encourage selection of edges. We implement the method via efficient convex optimization and provide a detailed analysis of data from two fMRI studies of schizophrenia.




cl

The classification permutation test: A flexible approach to testing for covariate imbalance in observational studies

Johann Gagnon-Bartsch, Yotam Shem-Tov.

Source: The Annals of Applied Statistics, Volume 13, Number 3, 1464--1483.

Abstract:
The gold standard for identifying causal relationships is a randomized controlled experiment. In many applications in the social sciences and medicine, the researcher does not control the assignment mechanism and instead may rely upon natural experiments or matching methods as a substitute for experimental randomization. The standard testable implication of random assignment is covariate balance between the treated and control units. Covariate balance is commonly used to validate the claim of as-good-as-random assignment. We propose a new nonparametric test of covariate balance. Our Classification Permutation Test (CPT) is based on a combination of classification methods (e.g., random forests) with Fisherian permutation inference. We revisit four real data examples and present Monte Carlo power simulations to demonstrate the applicability of the CPT relative to other nonparametric tests of equality of multivariate distributions.
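The idea of combining a classifier with permutation inference can be sketched as follows. This is a minimal illustration, not the authors' implementation: it swaps in a nearest-centroid classifier where the abstract mentions random forests, uses in-sample accuracy as the test statistic, and all function names are hypothetical. If treatment labels can be predicted from covariates noticeably better than under random relabelling, covariate balance is in doubt.

```python
import random


def centroid(rows):
    """Componentwise mean of a non-empty list of equal-length tuples."""
    dim = len(rows[0])
    return [sum(r[j] for r in rows) / len(rows) for j in range(dim)]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def in_sample_accuracy(X, t):
    """Accuracy of a nearest-centroid classifier fit and scored on (X, t)."""
    c0 = centroid([x for x, ti in zip(X, t) if ti == 0])
    c1 = centroid([x for x, ti in zip(X, t) if ti == 1])
    pred = [1 if sq_dist(x, c1) < sq_dist(x, c0) else 0 for x in X]
    return sum(p == ti for p, ti in zip(pred, t)) / len(t)


def cpt_p_value(X, t, n_perm=199, seed=0):
    """Permutation p-value for the accuracy statistic.

    Treatment labels are shuffled (group sizes stay fixed) and the
    classifier is refit on each shuffled copy; the p-value is the share
    of shuffles whose accuracy is at least the observed accuracy,
    with the usual +1 correction.
    """
    rng = random.Random(seed)
    observed = in_sample_accuracy(X, t)
    labels = list(t)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if in_sample_accuracy(X, labels) >= observed:
            hits += 1
    return (1 + hits) / (1 + n_perm)


if __name__ == "__main__":
    # Toy data: one covariate, clearly imbalanced between the two groups.
    X = [(0.1 * i,) for i in range(10)] + [(5 + 0.1 * i,) for i in range(10)]
    t = [0] * 10 + [1] * 10
    print(cpt_p_value(X, t))
```

Refitting the classifier inside every permutation matters: it keeps the permuted statistics comparable to the observed one, including any optimism from in-sample scoring.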




cl

On Sobolev tests of uniformity on the circle with an extension to the sphere

Sreenivasa Rao Jammalamadaka, Simos Meintanis, Thomas Verdebout.

Source: Bernoulli, Volume 26, Number 3, 2226--2252.

Abstract:
Circular and spherical data arise in many applications, especially in biology, Earth sciences and astronomy. In dealing with such data, one of the preliminary steps before any further inference is to test if such data is isotropic, that is, uniformly distributed around the circle or the sphere. In view of its importance, there is a considerable literature on the topic. In the present work, we provide new tests of uniformity on the circle based on original asymptotic results. Our tests are motivated by the shape of locally and asymptotically maximin tests of uniformity against generalized von Mises distributions. We show that they are uniformly consistent. Empirical power comparisons with several competing procedures are presented via simulations. The new tests detect multimodal alternatives, such as mixtures of von Mises distributions, particularly well. A practically-oriented combination of the new tests with already existing Sobolev tests is proposed. An extension to testing uniformity on the sphere, along with some simulations, is included. The procedures are illustrated on a real dataset.
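For readers unfamiliar with uniformity testing on the circle, the classical Rayleigh test, one of the existing Sobolev-type tests, gives a feel for the setting. The sketch below is not one of the new tests proposed in the paper. It computes the statistic $2n\bar{R}^{2}$, where $\bar{R}$ is the mean resultant length, and a large-sample p-value from its limiting $\chi^{2}_{2}$ distribution under uniformity. The function name is hypothetical.

```python
import math


def rayleigh_test(angles):
    """Rayleigh test of uniformity for angles given in radians.

    Returns (statistic, p_value), where statistic = 2 * n * Rbar^2 and the
    p-value uses the asymptotic chi-square(2) law, whose survival function
    at x is exp(-x / 2).
    """
    n = len(angles)
    if n == 0:
        raise ValueError("need at least one angle")
    c_bar = sum(math.cos(a) for a in angles) / n
    s_bar = sum(math.sin(a) for a in angles) / n
    r_bar_sq = c_bar * c_bar + s_bar * s_bar
    statistic = 2 * n * r_bar_sq
    p_value = math.exp(-statistic / 2)
    return statistic, p_value


if __name__ == "__main__":
    evenly_spread = [2 * math.pi * i / 12 for i in range(12)]
    concentrated = [0.1 * i for i in range(10)]
    print(rayleigh_test(evenly_spread))
    print(rayleigh_test(concentrated))
```

Because the Rayleigh statistic depends only on the first trigonometric moment, it has little power against symmetric multimodal alternatives, which is the kind of case the abstract says the new tests handle well.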




cl

Optimal functional supervised classification with separation condition

Sébastien Gadat, Sébastien Gerchinovitz, Clément Marteau.

Source: Bernoulli, Volume 26, Number 3, 1797--1831.

Abstract:
We consider the binary supervised classification problem with the Gaussian functional model introduced in (Math. Methods Statist. 22 (2013) 213–225). Taking advantage of the Gaussian structure, we design a natural plug-in classifier and derive a family of upper bounds on its worst-case excess risk over Sobolev spaces. These bounds are parametrized by a separation distance quantifying the difficulty of the problem, and are proved to be optimal (up to logarithmic factors) through matching minimax lower bounds. Using the recent works of (In Advances in Neural Information Processing Systems (2014) 3437–3445, Curran Associates) and (Ann. Statist. 44 (2016) 982–1009), we also derive a logarithmic lower bound showing that the popular $k$-nearest neighbors classifier is far from optimality in this specific functional setting.




cl

Reliable clustering of Bernoulli mixture models

Amir Najafi, Seyed Abolfazl Motahari, Hamid R. Rabiee.

Source: Bernoulli, Volume 26, Number 2, 1535--1559.

Abstract:
A Bernoulli Mixture Model (BMM) is a finite mixture of random binary vectors with independent dimensions. The problem of clustering BMM data arises in a variety of real-world applications, ranging from population genetics to activity analysis in social networks. In this paper, we analyze the clusterability of BMMs from a theoretical perspective, when the number of clusters is unknown. In particular, we stipulate a set of conditions on the sample complexity and dimension of the model in order to guarantee the Probably Approximately Correct (PAC)-clusterability of a dataset. To the best of our knowledge, these findings are the first non-asymptotic bounds on the sample complexity of learning or clustering BMMs.




cl

Prediction and estimation consistency of sparse multi-class penalized optimal scoring

Irina Gaynanova.

Source: Bernoulli, Volume 26, Number 1, 286--322.

Abstract:
Sparse linear discriminant analysis via penalized optimal scoring is a successful tool for classification in high-dimensional settings. While the variable selection consistency of sparse optimal scoring has been established, the corresponding prediction and estimation consistency results have been lacking. We bridge this gap by providing probabilistic bounds on out-of-sample prediction error and estimation error of multi-class penalized optimal scoring allowing for diverging number of classes.




cl

Cliques in rank-1 random graphs: The role of inhomogeneity

Kay Bogerd, Rui M. Castro, Remco van der Hofstad.

Source: Bernoulli, Volume 26, Number 1, 253--285.

Abstract:
We study the asymptotic behavior of the clique number in rank-1 inhomogeneous random graphs, where edge probabilities between vertices are roughly proportional to the product of their vertex weights. We show that the clique number is concentrated on at most two consecutive integers, for which we provide an expression. Interestingly, the order of the clique number is primarily determined by the overall edge density, with the inhomogeneity only affecting multiplicative constants or adding at most a $\log \log (n)$ multiplicative factor. For sparse enough graphs the clique number is always bounded and the effect of inhomogeneity completely vanishes.




cl

The story of Thomas & Ann Stone family : including Helping Hobart's Orphans, the King's Orphan School for Boys 1831-1836 / Alexander E.H. Stone.

King's Orphan Schools (New Town, Tas.)




cl

Living through English history : stories of the Urlwin, Brittridge, Vasper, Partridge and Ellerby families / Janet McLeod.

Urlwin (Family).




cl

Item 01: Notebooks (2) containing hand written copies of 123 letters from Major William Alan Audsley to his parents, ca. 1916-ca. 1919, transcribed by his father. Also includes original letters (2) written by Major Audsley.




cl

U.S. chief justice puts hold on disclosure of Russia investigation materials

U.S. Chief Justice John Roberts on Friday put a temporary hold on the disclosure to a Democratic-led House of Representatives committee of grand jury material redacted from former Special Counsel Robert Mueller's report on Russian interference in the 2016 election. The U.S. Court of Appeals for the District of Columbia Circuit ruled in March that the materials had to be disclosed to the House Judiciary Committee and refused to put that decision on hold. The appeals court said the materials had to be handed over by May 11 if the Supreme Court did not intervene.





cl

Chaffetz: I don't understand why Adam Schiff continues to have a security clearance

Fox News contributor Jason Chaffetz and Andy McCarthy react to House Intelligence transcripts on Russia probe.





cl

Meet the Ohio health expert who has a fan club — and Republicans trying to stop her

Some Buckeyes are not comfortable being told by a "woman in power" to quarantine, one expert said.





cl

Spatial Disease Mapping Using Directed Acyclic Graph Auto-Regressive (DAGAR) Models

Abhirup Datta, Sudipto Banerjee, James S. Hodges, Leiwen Gao.

Source: Bayesian Analysis, Volume 14, Number 4, 1221--1244.

Abstract:
Hierarchical models for regionally aggregated disease incidence data commonly involve region specific latent random effects that are modeled jointly as having a multivariate Gaussian distribution. The covariance or precision matrix incorporates the spatial dependence between the regions. Common choices for the precision matrix include the widely used ICAR model, which is singular, and its nonsingular extension which lacks interpretability. We propose a new parametric model for the precision matrix based on a directed acyclic graph (DAG) representation of the spatial dependence. Our model guarantees positive definiteness and, hence, in addition to being a valid prior for regional spatially correlated random effects, can also directly model the outcome from dependent data like images and networks. Theoretical results establish a link between the parameters in our model and the variance and covariances of the random effects. Simulation studies demonstrate that the improved interpretability of our model reaps benefits in terms of accurately recovering the latent spatial random effects as well as for inference on the spatial covariance parameters. Under modest spatial correlation, our model far outperforms the CAR models, while the performances are similar when the spatial correlation is strong. We also assess sensitivity to the choice of the ordering in the DAG construction using theoretical and empirical results which testify to the robustness of our model. We also present a large-scale public health application demonstrating the competitive performance of the model.




cl

Extrinsic Gaussian Processes for Regression and Classification on Manifolds

Lizhen Lin, Niu Mu, Pokman Cheung, David Dunson.

Source: Bayesian Analysis, Volume 14, Number 3, 907--926.

Abstract:
Gaussian processes (GPs) are very widely used for modeling of unknown functions or surfaces in applications ranging from regression to classification to spatial processes. Although there is an increasingly vast literature on applications, methods, theory and algorithms related to GPs, the overwhelming majority of this literature focuses on the case in which the input domain corresponds to a Euclidean space. However, particularly in recent years with the increasing collection of complex data, it is commonly the case that the input domain does not have such a simple form. For example, it is common for the inputs to be restricted to a non-Euclidean manifold, a case which forms the motivation for this article. In particular, we propose a general extrinsic framework for GP modeling on manifolds, which relies on embedding of the manifold into a Euclidean space and then constructing extrinsic kernels for GPs on their images. These extrinsic Gaussian processes (eGPs) are used as prior distributions for unknown functions in Bayesian inferences. Our approach is simple and general, and we show that the eGPs inherit fine theoretical properties from GP models in Euclidean spaces. We consider applications of our models to regression and classification problems with predictors lying in a large class of manifolds, including spheres, planar shape spaces, a space of positive definite matrices, and Grassmannians. Our models can be readily used by practitioners in biological sciences for various regression and classification problems, such as disease diagnosis or detection. Our work is also likely to have impact in spatial statistics when spatial locations are on the sphere or other geometric spaces.




cl

Bayes Factor Testing of Multiple Intraclass Correlations

Joris Mulder, Jean-Paul Fox.

Source: Bayesian Analysis, Volume 14, Number 2, 521--552.

Abstract:
The intraclass correlation plays a central role in modeling hierarchically structured data, such as educational data, panel data, or group-randomized trial data. It represents relevant information concerning the between-group and within-group variation. Methods for Bayesian hypothesis tests concerning the intraclass correlation are proposed to improve decision making in hierarchical data analysis and to assess the grouping effect across different group categories. Estimation and testing methods for the intraclass correlation coefficient are proposed under a marginal modeling framework where the random effects are integrated out. A class of stretched beta priors is proposed on the intraclass correlations, which is equivalent to shifted $F$ priors for the between groups variances. Through a parameter expansion it is shown that this prior is conditionally conjugate under the marginal model yielding efficient posterior computation. A special improper case results in accurate coverage rates of the credible intervals even for minimal sample size and when the true intraclass correlation equals zero. Bayes factor tests are proposed for testing multiple precise and order hypotheses on intraclass correlations. These tests can be used when prior information about the intraclass correlations is available or absent. For the noninformative case, a generalized fractional Bayes approach is developed. The method enables testing the presence and strength of grouped data structures without introducing random effects. The methodology is applied to a large-scale survey study on international mathematics achievement at fourth grade to test the heterogeneity in the clustering of students in schools across countries and assessment cycles.
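To make the quantity under test concrete: the intraclass correlation is the share of total variance attributable to between-group variation, $\rho = \sigma_{b}^{2} / (\sigma_{b}^{2} + \sigma_{w}^{2})$. The sketch below computes the classical one-way ANOVA moment estimator for balanced groups. It is not the Bayesian procedure proposed in the paper, and the function name is hypothetical.

```python
def anova_icc(groups):
    """One-way ANOVA estimator of the intraclass correlation.

    `groups` is a list of equally sized lists of observations (balanced
    design, at least two groups with at least two observations each).
    Returns (MSB - MSW) / (MSB + (k - 1) * MSW), where k is the group size.
    """
    g = len(groups)
    k = len(groups[0])
    if g < 2 or k < 2 or any(len(grp) != k for grp in groups):
        raise ValueError("need a balanced design with g >= 2 and k >= 2")
    means = [sum(grp) / k for grp in groups]
    grand = sum(means) / g
    # Between-group and within-group mean squares.
    msb = k * sum((m - grand) ** 2 for m in means) / (g - 1)
    msw = sum(
        (x - m) ** 2 for grp, m in zip(groups, means) for x in grp
    ) / (g * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


if __name__ == "__main__":
    print(anova_icc([[1, 1], [3, 3]]))   # all variation is between groups
    print(anova_icc([[0, 2], [0, 2]]))   # all variation is within groups
```

This estimator can fall below zero when the within-group mean square exceeds the between-group one, which shows why an intraclass correlation at or near zero is a delicate case for estimation and testing.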




cl

Some Statistical Issues in Climate Science

Michael L. Stein.

Source: Statistical Science, Volume 35, Number 1, 31--41.

Abstract:
Climate science is a field that is arguably both data-rich and data-poor. Data rich in that huge and quickly increasing amounts of data about the state of the climate are collected every day. Data poor in that important aspects of the climate are still undersampled, such as the deep oceans and some characteristics of the upper atmosphere. Data rich in that modern climate models can produce climatological quantities over long time periods with global coverage, including quantities that are difficult to measure and under conditions for which there is no data presently. Data poor in that the correspondence between climate model output and the actual climate, especially for future climate change due to human activities, is difficult to assess. The scope for fruitful interactions between climate scientists and statisticians is great, but requires serious commitments from researchers in both disciplines to understand the scientific and statistical nuances arising from the complex relationships between the data and the real-world problems. This paper describes a small fraction of some of the intellectual challenges that occur at the interface between climate science and statistics, including inferences for extremes for processes with seasonality and long-term trends, the use of climate model ensembles for studying extremes, the scope for using new data sources for studying space-time characteristics of environmental processes and a discussion of non-Gaussian space-time process models for climate variables. The paper concludes with a call to the statistical community to become more engaged in one of the great scientific and policy issues of our time, anthropogenic climate change and its impacts.




cl

The Importance of Being Clustered: Uncluttering the Trends of Statistics from 1970 to 2015

Laura Anderlucci, Angela Montanari, Cinzia Viroli.

Source: Statistical Science, Volume 34, Number 2, 280--300.

Abstract:
In this paper, we retrace the recent history of statistics by analyzing all the papers published in five prestigious statistical journals since 1970, namely: The Annals of Statistics, Biometrika, Journal of the American Statistical Association, Journal of the Royal Statistical Society, Series B and Statistical Science. The aim is to construct a kind of “taxonomy” of the statistical papers by organizing and clustering them in main themes. In this sense being identified in a cluster means being important enough to be uncluttered in the vast and interconnected world of the statistical research. Since the main statistical research topics are naturally born, evolve or die over time, we will also develop a dynamic clustering strategy, where a group in a time period is allowed to migrate or to merge into different groups in the following one. Results show that statistics is a very dynamic and evolving science, stimulated by the rise of new research questions and types of data.




cl

Rejoinder: Bayes, Oracle Bayes, and Empirical Bayes

Bradley Efron.

Source: Statistical Science, Volume 34, Number 2, 234--235.




cl

Comment: Bayes, Oracle Bayes and Empirical Bayes

Aad van der Vaart.

Source: Statistical Science, Volume 34, Number 2, 214--218.




cl

Comment: Bayes, Oracle Bayes, and Empirical Bayes

Nan Laird.

Source: Statistical Science, Volume 34, Number 2, 206--208.




cl

Comment: Bayes, Oracle Bayes, and Empirical Bayes

Thomas A. Louis.

Source: Statistical Science, Volume 34, Number 2, 202--205.




cl

Bayes, Oracle Bayes and Empirical Bayes

Bradley Efron.

Source: Statistical Science, Volume 34, Number 2, 177--201.

Abstract:
This article concerns the Bayes and frequentist aspects of empirical Bayes inference. Some of the ideas explored go back to Robbins in the 1950s, while others are current. Several examples are discussed, real and artificial, illustrating the two faces of empirical Bayes methodology: “oracle Bayes” shows empirical Bayes in its most frequentist mode, while “finite Bayes inference” is a fundamentally Bayesian application. In either case, modern theory and computation allow us to present a sharp finite-sample picture of what is at stake in an empirical Bayes analysis.