ex

Green food processing techniques : preservation, transformation and extraction

9780128153536




ex

Extra-coronal restorations : concepts and clinical application

9783319790930 (electronic bk.)




ex

DeJong's the neurologic examination

Campbell, William W., Jr. (William Wesley), author.
9781496386168 (hardcover)




ex

Comprehensive biochemistry for dentistry : textbook for dental students

Gupta, Anil, author.
9789811310355 (electronic bk.)




ex

Complexity and approximation : in memory of Ker-I Ko

9783030416720 (electronic bk.)




ex

Atlas of sexually transmitted diseases : clinical aspects and differential diagnosis

9783319574707 (electronic bk.)







ex

Almost sure uniqueness of a global minimum without convexity

Gregory Cox.

Source: The Annals of Statistics, Volume 48, Number 1, 584--606.

Abstract:
This paper establishes the argmin of a random objective function to be unique almost surely. This paper first formulates a general result that proves almost sure uniqueness without convexity of the objective function. The general result is then applied to a variety of applications in statistics. Four applications are discussed, including uniqueness of M-estimators, both classical likelihood and penalized likelihood estimators, and two applications of the argmin theorem, threshold regression and weak identification.




ex

Concentration and consistency results for canonical and curved exponential-family models of random graphs

Michael Schweinberger, Jonathan Stewart.

Source: The Annals of Statistics, Volume 48, Number 1, 374--396.

Abstract:
Statistical inference for exponential-family models of random graphs with dependent edges is challenging. We stress the importance of additional structure and show that additional structure facilitates statistical inference. A simple example of a random graph with additional structure is a random graph with neighborhoods and local dependence within neighborhoods. We develop the first concentration and consistency results for maximum likelihood and $M$-estimators of a wide range of canonical and curved exponential-family models of random graphs with local dependence. All results are nonasymptotic and applicable to random graphs with finite populations of nodes, although asymptotic consistency results can be obtained as well. In addition, we show that additional structure can facilitate subgraph-to-graph estimation, and present concentration results for subgraph-to-graph estimators. As an application, we consider popular curved exponential-family models of random graphs, with local dependence induced by transitivity and parameter vectors whose dimensions depend on the number of nodes.




ex

Sparse high-dimensional regression: Exact scalable algorithms and phase transitions

Dimitris Bertsimas, Bart Van Parys.

Source: The Annals of Statistics, Volume 48, Number 1, 300--323.

Abstract:
We present a novel binary convex reformulation of the sparse regression problem that constitutes a new duality perspective. We devise a new cutting plane method and provide evidence that it can solve to provable optimality the sparse regression problem for sample sizes $n$ and number of regressors $p$ in the 100,000s, that is, two orders of magnitude better than the current state of the art, in seconds. The ability to solve the problem for very high dimensions allows us to observe new phase transition phenomena. Contrary to traditional complexity theory which suggests that the difficulty of a problem increases with problem size, the sparse regression problem has the property that as the number of samples $n$ increases the problem becomes easier in that the solution recovers 100% of the true signal, and our approach solves the problem extremely fast (in fact faster than Lasso), while for small number of samples $n$, our approach takes a larger amount of time to solve the problem, but importantly the optimal solution provides a statistically more relevant regressor. We argue that our exact sparse regression approach presents a superior alternative over heuristic methods available at present.




ex

Rerandomization in $2^{K}$ factorial experiments

Xinran Li, Peng Ding, Donald B. Rubin.

Source: The Annals of Statistics, Volume 48, Number 1, 43--63.

Abstract:
With many pretreatment covariates and treatment factors, the classical factorial experiment often fails to balance covariates across multiple factorial effects simultaneously. Therefore, it is intuitive to restrict the randomization of the treatment factors to satisfy certain covariate balance criteria, possibly conforming to the tiers of factorial effects and covariates based on their relative importances. This is rerandomization in factorial experiments. We study the asymptotic properties of this experimental design under the randomization inference framework without imposing any distributional or modeling assumptions of the covariates and outcomes. We derive the joint asymptotic sampling distribution of the usual estimators of the factorial effects, and show that it is symmetric, unimodal and more “concentrated” at the true factorial effects under rerandomization than under the classical factorial experiment. We quantify this advantage of rerandomization using the notions of “central convex unimodality” and “peakedness” of the joint asymptotic sampling distribution. We also construct conservative large-sample confidence sets for the factorial effects.




ex

The phase transition for the existence of the maximum likelihood estimate in high-dimensional logistic regression

Emmanuel J. Candès, Pragya Sur.

Source: The Annals of Statistics, Volume 48, Number 1, 27--42.

Abstract:
This paper rigorously establishes that the existence of the maximum likelihood estimate (MLE) in high-dimensional logistic regression models with Gaussian covariates undergoes a sharp “phase transition.” We introduce an explicit boundary curve $h_{\mathrm{MLE}}$, parameterized by two scalars measuring the overall magnitude of the unknown sequence of regression coefficients, with the following property: in the limit of large sample sizes $n$ and number of features $p$ proportioned in such a way that $p/n\rightarrow \kappa$, we show that if the problem is sufficiently high dimensional in the sense that $\kappa >h_{\mathrm{MLE}}$, then the MLE does not exist with probability one. Conversely, if $\kappa <h_{\mathrm{MLE}}$, the MLE asymptotically exists with probability one.




ex

Detecting relevant changes in the mean of nonstationary processes—A mass excess approach

Holger Dette, Weichi Wu.

Source: The Annals of Statistics, Volume 47, Number 6, 3578--3608.

Abstract:
This paper considers the problem of testing if a sequence of means $(\mu_{t})_{t=1,\ldots ,n}$ of a nonstationary time series $(X_{t})_{t=1,\ldots ,n}$ is stable in the sense that the difference of the means $\mu_{1}$ and $\mu_{t}$ between the initial time $t=1$ and any other time is smaller than a given threshold, that is $|\mu_{1}-\mu_{t}|\leq c$ for all $t=1,\ldots ,n$. A test for hypotheses of this type is developed using a bias corrected monotone rearranged local linear estimator and asymptotic normality of the corresponding test statistic is established. As the asymptotic variance depends on the location of the roots of the equation $|\mu_{1}-\mu_{t}|=c$ a new bootstrap procedure is proposed to obtain critical values and its consistency is established. As a consequence we are able to quantitatively describe relevant deviations of a nonstationary sequence from its initial value. The results are illustrated by means of a simulation study and by analyzing data examples.




ex

Sampling and estimation for (sparse) exchangeable graphs

Victor Veitch, Daniel M. Roy.

Source: The Annals of Statistics, Volume 47, Number 6, 3274--3299.

Abstract:
Sparse exchangeable graphs on $\mathbb{R}_{+}$, and the associated graphex framework for sparse graphs, generalize exchangeable graphs on $\mathbb{N}$, and the associated graphon framework for dense graphs. We develop the graphex framework as a tool for statistical network analysis by identifying the sampling scheme that is naturally associated with the models of the framework, formalizing two natural notions of consistent estimation of the parameter (the graphex) underlying these models, and identifying general consistent estimators in each case. The sampling scheme is a modification of independent vertex sampling that throws away vertices that are isolated in the sampled subgraph. The estimators are variants of the empirical graphon estimator, which is known to be a consistent estimator for the distribution of dense exchangeable graphs; both can be understood as graph analogues to the empirical distribution in the i.i.d. sequence setting. Our results may be viewed as a generalization of consistent estimation via the empirical graphon from the dense graph regime to also include sparse graphs.
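The sampling scheme described in this abstract (independent vertex sampling followed by discarding vertices that end up isolated) can be illustrated with a minimal sketch. The function name p_sample, its signature, and the edge-list representation are assumptions made for illustration only and are not taken from the paper.

```python
import random


def p_sample(vertices, edges, p, rng=None):
    """Keep each vertex independently with probability p, take the induced
    subgraph, then drop vertices that are isolated in that subgraph.

    Hypothetical helper for illustration; `edges` is a list of (u, v) pairs.
    """
    rng = rng or random.Random()
    # Independent vertex sampling: one coin flip per vertex.
    kept = {v for v in vertices if rng.random() < p}
    # Induced subgraph: an edge survives only if both endpoints were kept.
    sub_edges = [(u, v) for u, v in edges if u in kept and v in kept]
    # Vertices without a surviving edge are isolated and are thrown away.
    non_isolated = {u for u, _ in sub_edges} | {v for _, v in sub_edges}
    return non_isolated, sub_edges
```

With p = 1.0 every vertex is kept, so the result is simply the input graph with its isolated vertices removed. Note that in this sketch a self-loop (u, u) counts as an edge, so a vertex with only a self-loop is not treated as isolated.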




ex

Exact lower bounds for the agnostic probably-approximately-correct (PAC) machine learning model

Aryeh Kontorovich, Iosif Pinelis.

Source: The Annals of Statistics, Volume 47, Number 5, 2822--2854.

Abstract:
We provide an exact nonasymptotic lower bound on the minimax expected excess risk (EER) in the agnostic probably-approximately-correct (PAC) machine learning classification model and identify minimax learning algorithms as certain maximally symmetric and minimally randomized “voting” procedures. Based on this result, an exact asymptotic lower bound on the minimax EER is provided. This bound is of the simple form $c_{\infty}/\sqrt{\nu}$ as $\nu\to\infty$, where $c_{\infty}=0.16997\dots$ is a universal constant, $\nu=m/d$, $m$ is the size of the training sample and $d$ is the Vapnik–Chervonenkis dimension of the hypothesis class. It is shown that the differences between these asymptotic and nonasymptotic bounds, as well as the differences between these two bounds and the maximum EER of any learning algorithms that minimize the empirical risk, are asymptotically negligible, and all these differences are due to ties in the mentioned “voting” procedures. A few easy to compute nonasymptotic lower bounds on the minimax EER are also obtained, which are shown to be close to the exact asymptotic lower bound $c_{\infty}/\sqrt{\nu}$ even for rather small values of the ratio $\nu=m/d$. As an application of these results, we substantially improve existing lower bounds on the tail probability of the excess risk. Among the tools used are Bayes estimation and apparently new identities and inequalities for binomial distributions.




ex

Convergence complexity analysis of Albert and Chib’s algorithm for Bayesian probit regression

Qian Qin, James P. Hobert.

Source: The Annals of Statistics, Volume 47, Number 4, 2320--2347.

Abstract:
The use of MCMC algorithms in high dimensional Bayesian problems has become routine. This has spurred so-called convergence complexity analysis, the goal of which is to ascertain how the convergence rate of a Monte Carlo Markov chain scales with sample size, $n$, and/or number of covariates, $p$. This article provides a thorough convergence complexity analysis of Albert and Chib’s [J. Amer. Statist. Assoc. 88 (1993) 669–679] data augmentation algorithm for the Bayesian probit regression model. The main tools used in this analysis are drift and minorization conditions. The usual pitfalls associated with this type of analysis are avoided by utilizing centered drift functions, which are minimized in high posterior probability regions, and by using a new technique to suppress high-dimensionality in the construction of minorization conditions. The main result is that the geometric convergence rate of the underlying Markov chain is bounded below 1 both as $n\rightarrow\infty$ (with $p$ fixed), and as $p\rightarrow\infty$ (with $n$ fixed). Furthermore, the first computable bounds on the total variation distance to stationarity are byproducts of the asymptotic analysis.




ex

Empirical Bayes analysis of RNA sequencing experiments with auxiliary information

Kun Liang.

Source: The Annals of Applied Statistics, Volume 13, Number 4, 2452--2482.

Abstract:
Finding differentially expressed genes is a common task in high-throughput transcriptome studies. While traditional statistical methods rank the genes by their test statistics alone, we analyze an RNA sequencing dataset using the auxiliary information of gene length and the test statistics from a related microarray study. Given the auxiliary information, we propose a novel nonparametric empirical Bayes procedure to estimate the posterior probability of differential expression for each gene. We demonstrate the advantage of our procedure in extensive simulation studies and a psoriasis RNA sequencing study. The companion R package calm is available at Bioconductor.




ex

Distributional regression forests for probabilistic precipitation forecasting in complex terrain

Lisa Schlosser, Torsten Hothorn, Reto Stauffer, Achim Zeileis.

Source: The Annals of Applied Statistics, Volume 13, Number 3, 1564--1589.

Abstract:
To obtain a probabilistic model for a dependent variable based on some set of explanatory variables, a distributional approach is often adopted where the parameters of the distribution are linked to regressors. In many classical models this only captures the location of the distribution but over the last decade there has been increasing interest in distributional regression approaches modeling all parameters including location, scale and shape. Notably, so-called nonhomogeneous Gaussian regression (NGR) models both mean and variance of a Gaussian response and is particularly popular in weather forecasting. Moreover, generalized additive models for location, scale and shape (GAMLSS) provide a framework where each distribution parameter is modeled separately capturing smooth linear or nonlinear effects. However, when variable selection is required and/or there are nonsmooth dependencies or interactions (especially unknown or of high-order), it is challenging to establish a good GAMLSS. A natural alternative in these situations would be the application of regression trees or random forests but, so far, no general distributional framework is available for these. Therefore, a framework for distributional regression trees and forests is proposed that blends regression trees and random forests with classical distributions from the GAMLSS framework as well as their censored or truncated counterparts. To illustrate these novel approaches in practice, they are employed to obtain probabilistic precipitation forecasts at numerous sites in a mountainous region (Tyrol, Austria) based on a large number of numerical weather prediction quantities. It is shown that the novel distributional regression forests automatically select variables and interactions, performing on par or often even better than GAMLSS specified either through prior meteorological knowledge or a computationally more demanding boosting approach.




ex

The classification permutation test: A flexible approach to testing for covariate imbalance in observational studies

Johann Gagnon-Bartsch, Yotam Shem-Tov.

Source: The Annals of Applied Statistics, Volume 13, Number 3, 1464--1483.

Abstract:
The gold standard for identifying causal relationships is a randomized controlled experiment. In many applications in the social sciences and medicine, the researcher does not control the assignment mechanism and instead may rely upon natural experiments or matching methods as a substitute to experimental randomization. The standard testable implication of random assignment is covariate balance between the treated and control units. Covariate balance is commonly used to validate the claim of as good as random assignment. We propose a new nonparametric test of covariate balance. Our Classification Permutation Test (CPT) is based on a combination of classification methods (e.g., random forests) with Fisherian permutation inference. We revisit four real data examples and present Monte Carlo power simulations to demonstrate the applicability of the CPT relative to other nonparametric tests of equality of multivariate distributions.
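The idea of combining a classifier with permutation inference can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' CPT implementation: a simple nearest-centroid classifier is swapped in for the random forests mentioned in the abstract, in-sample accuracy is used as the test statistic, and the function names and defaults (n_perm, seed) are hypothetical.

```python
import random


def centroid_accuracy(X, t):
    """In-sample accuracy of a nearest-centroid classifier predicting the
    binary treatment label t (0 or 1) from covariate rows X."""
    dims = len(X[0])
    groups = {0: [], 1: []}
    for row, label in zip(X, t):
        groups[label].append(row)
    # Per-group covariate means (both groups are assumed non-empty).
    cents = {
        g: [sum(r[j] for r in rows) / len(rows) for j in range(dims)]
        for g, rows in groups.items()
    }

    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    correct = 0
    for row, label in zip(X, t):
        pred = 0 if dist2(row, cents[0]) <= dist2(row, cents[1]) else 1
        correct += pred == label
    return correct / len(X)


def classification_permutation_test(X, t, n_perm=999, seed=0):
    """Permutation p-value for 'covariates do not predict treatment'."""
    rng = random.Random(seed)
    observed = centroid_accuracy(X, t)
    labels = list(t)
    hits = 0
    for _ in range(n_perm):
        # Shuffling labels mimics re-assigning treatment at random.
        rng.shuffle(labels)
        if centroid_accuracy(X, labels) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (1 + hits) / (1 + n_perm)
```

If the covariates are balanced, the classifier should predict treatment no better under the observed labels than under shuffled ones, so the p-value is large; a small p-value suggests imbalance. A practical version would likely use a stronger classifier and out-of-sample accuracy.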




ex

On Sobolev tests of uniformity on the circle with an extension to the sphere

Sreenivasa Rao Jammalamadaka, Simos Meintanis, Thomas Verdebout.

Source: Bernoulli, Volume 26, Number 3, 2226--2252.

Abstract:
Circular and spherical data arise in many applications, especially in biology, Earth sciences and astronomy. In dealing with such data, one of the preliminary steps before any further inference, is to test if such data is isotropic, that is, uniformly distributed around the circle or the sphere. In view of its importance, there is a considerable literature on the topic. In the present work, we provide new tests of uniformity on the circle based on original asymptotic results. Our tests are motivated by the shape of locally and asymptotically maximin tests of uniformity against generalized von Mises distributions. We show that they are uniformly consistent. Empirical power comparisons with several competing procedures are presented via simulations. The new tests detect particularly well multimodal alternatives such as mixtures of von Mises distributions. A practically-oriented combination of the new tests with already existing Sobolev tests is proposed. An extension to testing uniformity on the sphere, along with some simulations, is included. The procedures are illustrated on a real dataset.




ex

Exponential integrability and exit times of diffusions on sub-Riemannian and metric measure spaces

Anton Thalmaier, James Thompson.

Source: Bernoulli, Volume 26, Number 3, 2202--2225.

Abstract:
In this article, we derive moment estimates, exponential integrability, concentration inequalities and exit times estimates for canonical diffusions firstly on sub-Riemannian limits of Riemannian foliations and secondly in the nonsmooth setting of $\operatorname{RCD}^{*}(K,N)$ spaces. In each case, the necessary ingredients are Itô’s formula and a comparison theorem for the Laplacian, for which we refer to the recent literature. As an application, we derive pointwise Carmona-type estimates on eigenfunctions of Schrödinger operators.




ex

Efficient estimation in single index models through smoothing splines

Arun K. Kuchibhotla, Rohit K. Patra.

Source: Bernoulli, Volume 26, Number 2, 1587--1618.

Abstract:
We consider estimation and inference in a single index regression model with an unknown but smooth link function. In contrast to the standard approach of using kernels or regression splines, we use smoothing splines to estimate the smooth link function. We develop a method to compute the penalized least squares estimators (PLSEs) of the parametric and the nonparametric components given independent and identically distributed (i.i.d.) data. We prove the consistency and find the rates of convergence of the estimators. We establish asymptotic normality under mild assumptions and prove asymptotic efficiency of the parametric component under homoscedastic errors. A finite sample simulation corroborates our asymptotic theory. We also analyze a car mileage data set and an ozone concentration data set. The identifiability and existence of the PLSEs are also investigated.




ex

Stratonovich stochastic differential equation with irregular coefficients: Girsanov’s example revisited

Ilya Pavlyukevich, Georgiy Shevchenko.

Source: Bernoulli, Volume 26, Number 2, 1381--1409.

Abstract:
In this paper, we study the Stratonovich stochastic differential equation $\mathrm{d}X=|X|^{\alpha}\circ \mathrm{d}B$, $\alpha \in (-1,1)$, which has been introduced by Cherstvy et al. (New J. Phys. 15 (2013) 083039) in the context of analysis of anomalous diffusions in heterogeneous media. We determine its weak and strong solutions, which are homogeneous strong Markov processes spending zero time at $0$: for $\alpha \in (0,1)$, these solutions have the form \begin{equation*}X_{t}^{\theta}=((1-\alpha)B_{t}^{\theta})^{1/(1-\alpha)},\end{equation*} where $B^{\theta}$ is the $\theta$-skew Brownian motion driven by $B$ and starting at $\frac{1}{1-\alpha}(X_{0})^{1-\alpha}$, $\theta \in [-1,1]$, and $(x)^{\gamma}=|x|^{\gamma}\operatorname{sign}x$; for $\alpha \in (-1,0]$, only the case $\theta =0$ is possible. The central part of the paper consists in the proof of the existence of a quadratic covariation $[f(B^{\theta}),B]$ for a locally square integrable function $f$ and is based on the time-reversion technique for Markovian diffusions.




ex

Consistent structure estimation of exponential-family random graph models with block structure

Michael Schweinberger.

Source: Bernoulli, Volume 26, Number 2, 1205--1233.

Abstract:
We consider the challenging problem of statistical inference for exponential-family random graph models based on a single observation of a random graph with complex dependence. To facilitate statistical inference, we consider random graphs with additional structure in the form of block structure. We have shown elsewhere that when the block structure is known, it facilitates consistency results for $M$-estimators of canonical and curved exponential-family random graph models with complex dependence, such as transitivity. In practice, the block structure is known in some applications (e.g., multilevel networks), but is unknown in others. When the block structure is unknown, the first and foremost question is whether it can be recovered with high probability based on a single observation of a random graph with complex dependence. The main consistency results of the paper show that it is possible to do so under weak dependence and smoothness conditions. These results confirm that exponential-family random graph models with block structure constitute a promising direction of statistical network analysis.




ex

Tail expectile process and risk assessment

Abdelaati Daouia, Stéphane Girard, Gilles Stupfler.

Source: Bernoulli, Volume 26, Number 1, 531--556.

Abstract:
Expectiles define a least squares analogue of quantiles. They are determined by tail expectations rather than tail probabilities. For this reason and many other theoretical and practical merits, expectiles have recently received a lot of attention, especially in actuarial and financial risk management. Their estimation, however, typically requires considering non-explicit asymmetric least squares estimates rather than the traditional order statistics used for quantile estimation. This makes the study of the tail expectile process a lot harder than that of the standard tail quantile process. Under the challenging model of heavy-tailed distributions, we derive joint weighted Gaussian approximations of the tail empirical expectile and quantile processes. We then use this powerful result to introduce and study new estimators of extreme expectiles and the standard quantile-based expected shortfall, as well as a novel expectile-based form of expected shortfall. Our estimators are built on general weighted combinations of both top order statistics and asymmetric least squares estimates. Some numerical simulations and applications to actuarial and financial data are provided.
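To make "asymmetric least squares" concrete, here is a minimal sketch of an ordinary sample expectile computed by iteratively reweighted means. It is not the extreme-value estimator studied in the paper; the function name expectile and the tolerance and iteration defaults are assumptions for illustration.

```python
def expectile(xs, tau, tol=1e-12, max_iter=1000):
    """Sample tau-expectile: the minimiser over e of
    sum_i |tau - 1{x_i <= e}| * (x_i - e)**2, with 0 < tau < 1."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    e = sum(xs) / len(xs)  # the mean is the 0.5-expectile, a natural start
    for _ in range(max_iter):
        # Points above the current guess get weight tau, the rest 1 - tau.
        w = [tau if x > e else 1.0 - tau for x in xs]
        # The first-order condition sum_i w_i (x_i - e) = 0 gives a
        # weighted mean, which becomes the next guess.
        new_e = sum(wi * x for wi, x in zip(w, xs)) / sum(w)
        if abs(new_e - e) <= tol:
            return new_e
        e = new_e
    return e
```

For example, on the sample [1, 2, 3, 10] the 0.5-expectile is the mean, 4, while the 0.9-expectile is pulled toward the large observation and equals 8 up to floating-point rounding, since 0.9 * (10 - 8) balances 0.1 * ((8 - 1) + (8 - 2) + (8 - 3)).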




ex

SPDEs with fractional noise in space: Continuity in law with respect to the Hurst index

Luca M. Giordano, Maria Jolis, Lluís Quer-Sardanyons.

Source: Bernoulli, Volume 26, Number 1, 352--386.

Abstract:
In this article, we consider the quasi-linear stochastic wave and heat equations on the real line and with an additive Gaussian noise which is white in time and behaves in space like a fractional Brownian motion with Hurst index $H\in (0,1)$. The drift term is assumed to be globally Lipschitz. We prove that the solution of each of the above equations is continuous in terms of the index $H$, with respect to the convergence in law in the space of continuous functions.




ex

Weak convergence of quantile and expectile processes under general assumptions

Tobias Zwingmann, Hajo Holzmann.

Source: Bernoulli, Volume 26, Number 1, 323--351.

Abstract:
We show weak convergence of quantile and expectile processes to Gaussian limit processes in the space of bounded functions endowed with an appropriate semimetric which is based on the concepts of epi- and hypo-convergence as introduced in A. Bücher, J. Segers and S. Volgushev (2014), ‘When Uniform Weak Convergence Fails: Empirical Processes for Dependence Functions and Residuals via Epi- and Hypographs’, Annals of Statistics 42. We impose assumptions for which it is known that weak convergence with respect to the supremum norm generally fails to hold. For quantiles, we consider stationary observations, where the marginal distribution function is assumed to be strictly increasing and continuous except for finitely many points and to admit strictly positive – possibly infinite – left- and right-sided derivatives. For expectiles, we focus on independent and identically distributed (i.i.d.) observations. Only a finite second moment and continuity at the boundary points but no further smoothness properties of the distribution function are required. We also show consistency of the bootstrap for this mode of convergence in the i.i.d. case for quantiles and expectiles.




ex

The story of Thomas & Ann Stone family : including Helping Hobart's Orphans, the King's Orphan School for Boys 1831-1836 / Alexander E.H. Stone.

King's Orphan Schools (New Town, Tas.)




ex

Economists Expect Huge Future Earnings Loss for Students Missing School Due to COVID-19

Members of the future American workforce could see losses of earnings that add up to trillions of dollars, depending on how long coronavirus-related school closures persist.

The post Economists Expect Huge Future Earnings Loss for Students Missing School Due to COVID-19 appeared first on Market Brief.




ex

'We Cannot Police Our Way Out of a Pandemic.' Experts, Police Union Say NYPD Should Not Be Enforcing Social Distance Rules Amid COVID-19

The New York City police department (NYPD) is conducting an internal investigation into a May 2 incident involving the violent arrests of multiple people, allegedly members of a group who were not social distancing





ex

Cruz gets his hair cut at salon whose owner was jailed for defying Texas coronavirus restrictions

After his haircut, Sen. Ted Cruz said, "It was ridiculous to see somebody sentenced to seven days in jail for cutting hair."





ex

Meet the Ohio health expert who has a fan club — and Republicans trying to stop her

Some Buckeyes are not comfortable being told by a "woman in power" to quarantine, one expert said.





ex

The McMichaels can't be charged with a hate crime by the state in the shooting death of Ahmaud Arbery because the law doesn't exist in Georgia

Georgia is one of four states that doesn't have a hate crime law. Arbery's killing has reignited calls for legislation.





ex

Nearly one-third of Americans believe a coronavirus vaccine exists and is being withheld, survey finds

The Democracy Fund + UCLA Nationscape Project found some misinformation about the coronavirus is more widespread than you might think.





ex

Joint Modeling of Longitudinal Relational Data and Exogenous Variables

Rajarshi Guhaniyogi, Abel Rodriguez.

Source: Bayesian Analysis, Volume 15, Number 2, 477--503.

Abstract:
This article proposes a framework based on shared, time-varying stochastic latent factor models for modeling relational data in which network and node attributes co-evolve over time. Our proposed framework is flexible enough to handle both categorical and continuous attributes, allows us to estimate the dimension of the latent social space, and automatically yields Bayesian hypothesis tests for the association between network structure and nodal attributes. Additionally, the model is easy to compute and readily yields inference and prediction for missing links between nodes. We employ our model framework to study the co-evolution of international relations between 22 countries and the country-specific indicators over a period of 11 years.




ex

Bayesian Design of Experiments for Intractable Likelihood Models Using Coupled Auxiliary Models and Multivariate Emulation

Antony Overstall, James McGree.

Source: Bayesian Analysis, Volume 15, Number 1, 103--131.

Abstract:
A Bayesian design is given by maximising an expected utility over a design space. The utility is chosen to represent the aim of the experiment and its expectation is taken with respect to all unknowns: responses, parameters and/or models. Although straightforward in principle, there are several challenges to finding Bayesian designs in practice. Firstly, the utility and expected utility are rarely available in closed form and require approximation. Secondly, the design space can be of high-dimensionality. In the case of intractable likelihood models, these problems are compounded by the fact that the likelihood function, whose evaluation is required to approximate the expected utility, is not available in closed form. A strategy is proposed to find Bayesian designs for intractable likelihood models. It relies on the development of an automatic, auxiliary modelling approach, using multivariate Gaussian process emulators, to approximate the likelihood function. This is then combined with a copula-based approach to approximate the marginal likelihood (a quantity commonly required to evaluate many utility functions). These approximations are demonstrated on examples of stochastic process models involving experimental aims of both parameter estimation and model comparison.




ex

Extrinsic Gaussian Processes for Regression and Classification on Manifolds

Lizhen Lin, Niu Mu, Pokman Cheung, David Dunson.

Source: Bayesian Analysis, Volume 14, Number 3, 907--926.

Abstract:
Gaussian processes (GPs) are very widely used for modeling of unknown functions or surfaces in applications ranging from regression to classification to spatial processes. Although there is an increasingly vast literature on applications, methods, theory and algorithms related to GPs, the overwhelming majority of this literature focuses on the case in which the input domain corresponds to a Euclidean space. However, particularly in recent years with the increasing collection of complex data, it is commonly the case that the input domain does not have such a simple form. For example, it is common for the inputs to be restricted to a non-Euclidean manifold, a case which forms the motivation for this article. In particular, we propose a general extrinsic framework for GP modeling on manifolds, which relies on embedding of the manifold into a Euclidean space and then constructing extrinsic kernels for GPs on their images. These extrinsic Gaussian processes (eGPs) are used as prior distributions for unknown functions in Bayesian inferences. Our approach is simple and general, and we show that the eGPs inherit fine theoretical properties from GP models in Euclidean spaces. We consider applications of our models to regression and classification problems with predictors lying in a large class of manifolds, including spheres, planar shape spaces, a space of positive definite matrices, and Grassmannians. Our models can be readily used by practitioners in biological sciences for various regression and classification problems, such as disease diagnosis or detection. Our work is also likely to have impact in spatial statistics when spatial locations are on the sphere or other geometric spaces.




ex

Fast Model-Fitting of Bayesian Variable Selection Regression Using the Iterative Complex Factorization Algorithm

Quan Zhou, Yongtao Guan.

Source: Bayesian Analysis, Volume 14, Number 2, 573--594.

Abstract:
Bayesian variable selection regression (BVSR) is able to jointly analyze genome-wide genetic datasets, but the slow computation via Markov chain Monte Carlo (MCMC) has hampered its widespread usage. Here we present a novel iterative method to solve a special class of linear systems, which can increase the speed of the BVSR model-fitting tenfold. The iterative method hinges on the complex factorization of the sum of two matrices and the solution path resides in the complex domain (instead of the real domain). Compared to the Gauss-Seidel method, the complex factorization converges almost instantaneously and its error is several orders of magnitude smaller than that of the Gauss-Seidel method. More importantly, the error is always within the pre-specified precision, while that of the Gauss-Seidel method is not. For large problems with thousands of covariates, the complex factorization is 10–100 times faster than either the Gauss-Seidel method or the direct method via the Cholesky decomposition. In BVSR, one needs to repetitively solve large penalized regression systems whose design matrices only change slightly between adjacent MCMC steps. This slight change in design matrix enables the adaptation of the iterative complex factorization method. The computational innovation will facilitate the widespread use of BVSR in reanalyzing genome-wide association datasets.




ex

Constrained Bayesian Optimization with Noisy Experiments

Benjamin Letham, Brian Karrer, Guilherme Ottoni, Eytan Bakshy.

Source: Bayesian Analysis, Volume 14, Number 2, 495--519.

Abstract:
Randomized experiments are the gold standard for evaluating the effects of changes to real-world systems. Data in these tests may be difficult to collect and outcomes may have high variance, resulting in potentially large measurement error. Bayesian optimization is a promising technique for efficiently optimizing multiple continuous parameters, but existing approaches degrade in performance when the noise level is high, limiting their applicability to many randomized experiments. We derive an expression for expected improvement under greedy batch optimization with noisy observations and noisy constraints, and develop a quasi-Monte Carlo approximation that allows it to be efficiently optimized. Simulations with synthetic functions show that the method outperforms existing methods on noisy, constrained problems. We further demonstrate the effectiveness of the method with two real-world experiments conducted at Facebook: optimizing a ranking system, and optimizing server compiler flags.
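
As background for the expected improvement criterion mentioned in this abstract, the standard noiseless, unconstrained closed form can be sketched as follows. This is not the noisy, constrained expression the authors derive. It assumes a Gaussian posterior with mean `mu` and standard deviation `sigma` at a candidate point, a maximization problem, and a known best observed value `best`; all names are illustrative.

```python
import math

def expected_improvement(mu, sigma, best):
    """Closed-form EI for maximization under a Gaussian posterior N(mu, sigma^2).

    EI = (mu - best) * Phi(z) + sigma * phi(z), with z = (mu - best) / sigma.
    Assumes noiseless observations, so `best` is known exactly.
    """
    if sigma <= 0.0:
        # Degenerate posterior: the improvement is deterministic.
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mu - best) * cdf + sigma * pdf

# A candidate whose posterior mean equals the incumbent still has positive EI
# because of posterior uncertainty.
ei = expected_improvement(mu=1.0, sigma=1.0, best=1.0)
```

When observations are noisy, the incumbent `best` is itself uncertain, so this closed form no longer applies directly.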




ex

Statistical Methodology in Single-Molecule Experiments

Chao Du, S. C. Kou.

Source: Statistical Science, Volume 35, Number 1, 75--91.

Abstract:
Toward the last quarter of the 20th century, the emergence of single-molecule experiments enabled scientists to track and study individual molecules’ dynamic properties in real time. Unlike macroscopic systems’ dynamics, those of single molecules can only be properly described by stochastic models even in the absence of external noise. Consequently, statistical methods have played a key role in extracting hidden information about molecular dynamics from data obtained through single-molecule experiments. In this article, we survey the major statistical methodologies used to analyze single-molecule experimental data. Our discussion is organized according to the types of stochastic models used to describe single-molecule systems as well as major experimental data collection techniques. We also highlight challenges and future directions in the application of statistical methodologies to single-molecule experiments.




ex

Model-Based Approach to the Joint Analysis of Single-Cell Data on Chromatin Accessibility and Gene Expression

Zhixiang Lin, Mahdi Zamanighomi, Timothy Daley, Shining Ma, Wing Hung Wong.

Source: Statistical Science, Volume 35, Number 1, 2--13.

Abstract:
Unsupervised methods, including clustering methods, are essential to the analysis of single-cell genomic data. Model-based clustering methods are under-explored in the area of single-cell genomics, and have the advantage of quantifying the uncertainty of the clustering result. Here we develop a model-based approach for the integrative analysis of single-cell chromatin accessibility and gene expression data. We show that by combining these two types of data, we can achieve a better separation of the underlying cell types. An efficient Markov chain Monte Carlo algorithm is also developed.




ex

An Overview of Semiparametric Extensions of Finite Mixture Models

Sijia Xiang, Weixin Yao, Guangren Yang.

Source: Statistical Science, Volume 34, Number 3, 391--404.

Abstract:
Finite mixture models have offered a very important tool for exploring complex data structures in many scientific areas, such as economics, epidemiology and finance. Semiparametric mixture models, which were introduced into traditional finite mixture models in the past decade, have brought forth exciting developments in their methodologies, theories, and applications. In this article, we not only provide a selective overview of the newly-developed semiparametric mixture models, but also discuss their estimation methodologies, theoretical properties if applicable, and some open questions. Recent developments are also discussed.




ex

Comment: Empirical Bayes, Compound Decisions and Exchangeability

Eitan Greenshtein, Ya’acov Ritov.

Source: Statistical Science, Volume 34, Number 2, 224--228.

Abstract:
We present some personal reflections on empirical Bayes/compound decision (EB/CD) theory following Efron (2019). In particular, we consider the role of exchangeability in the EB/CD theory and how it can be achieved when there are covariates. We also discuss the interpretation of EB/CD confidence intervals, the theoretical efficiency of the CD procedure, and the impact of sparsity assumptions.




ex

If you must smoke don't exhale / design : Biman Mullick.

London (33 Stillness Rd, London, SE23 1NG) : Cleanair, Campaign for a Smoke-free Environment, [198-?]




ex

If you must smoke don't exhale / Biman Mullick.

London : Cleanair, [1988?]




ex

Amazon Just Launched an Exclusive Clothing Collection Full of Warm and Comfy Basics Under $45

The womenswear line is new, and there’s already a variety of items to shop.




ex

Allometric Analysis Detects Brain Size-Independent Effects of Sex and Sex Chromosome Complement on Human Cerebellar Organization

Catherine Mankiw
May 24, 2017; 37:5221-5231
Development Plasticity Repair




ex

The Next 50 Years of Neuroscience

Cara M. Altimus
Jan 2, 2020; 40:101-106
Viewpoints