
Ethnoveterinary medicine : present and future concepts

9783030322700 (electronic bk.)





Endocrine surgery in children

9783662542569 (electronic book)





Drying atlas : drying kinetics and quality of agricultural products

Mühlbauer, Werner, author
9780128181638 (electronic bk.)





Daily routine in cosmetic dermatology

9783319202501





DNA beyond genes : from data storage and computing to nanobots, nanomedicine, and nanoelectronics

Demidov, Vadim V., author
9783030364342 (electronic bk.)





DICTIONARY OF CONSTRUCTION, SURVEYING, AND CIVIL ENGINEERING

9780192568632 (electronic bk.)





Current developments in biotechnology and bioengineering : resource recovery from wastes

0444643222





Cell biology and translational medicine.

9783030378455 (electronic bk.)





Biology and ecology of venomous marine cnidarians

Santhanam, Ramasamy, 1946- author
9789811516030 (electronic bk.)





Notice of Construction - Woodbine Ave.






Optimal prediction in the linearly transformed spiked model

Edgar Dobriban, William Leeb, Amit Singer.

Source: The Annals of Statistics, Volume 48, Number 1, 491--513.

Abstract:
We consider the linearly transformed spiked model, where the observations $Y_{i}$ are noisy linear transforms of unobserved signals of interest $X_{i}$: \begin{equation*}Y_{i}=A_{i}X_{i}+\varepsilon_{i},\end{equation*} for $i=1,\ldots,n$. The transform matrices $A_{i}$ are also observed. We model the unobserved signals (or regression coefficients) $X_{i}$ as vectors lying on an unknown low-dimensional space. Given only $Y_{i}$ and $A_{i}$, how should we predict or recover their values? The naive approach of performing regression for each observation separately is inaccurate due to the large noise level. Instead, we develop optimal methods for predicting $X_{i}$ by “borrowing strength” across the different samples. Our linear empirical Bayes methods scale to large datasets and rely on weak moment assumptions. We show that this model has wide-ranging applications in signal processing, deconvolution, cryo-electron microscopy, and missing data with noise. For missing data, we show in simulations that our methods are more robust to noise and to unequal sampling than well-known matrix completion methods.





Efficient estimation of linear functionals of principal components

Vladimir Koltchinskii, Matthias Löffler, Richard Nickl.

Source: The Annals of Statistics, Volume 48, Number 1, 464--490.

Abstract:
We study principal component analysis (PCA) for mean zero i.i.d. Gaussian observations $X_{1},\dots,X_{n}$ in a separable Hilbert space $\mathbb{H}$ with unknown covariance operator $\Sigma$. The complexity of the problem is characterized by its effective rank $\mathbf{r}(\Sigma):=\frac{\operatorname{tr}(\Sigma)}{\|\Sigma\|}$, where $\mathrm{tr}(\Sigma)$ denotes the trace of $\Sigma$ and $\|\Sigma\|$ denotes its operator norm. We develop a method of bias reduction in the problem of estimation of linear functionals of eigenvectors of $\Sigma$. Under the assumption that $\mathbf{r}(\Sigma)=o(n)$, we establish the asymptotic normality and asymptotic properties of the risk of the resulting estimators and prove matching minimax lower bounds, showing their semiparametric optimality.





Hypothesis testing on linear structures of high-dimensional covariance matrix

Shurong Zheng, Zhao Chen, Hengjian Cui, Runze Li.

Source: The Annals of Statistics, Volume 47, Number 6, 3300--3334.

Abstract:
This paper is concerned with test of significance on high-dimensional covariance structures, and aims to develop a unified framework for testing commonly used linear covariance structures. We first construct a consistent estimator for parameters involved in the linear covariance structure, and then develop two tests for the linear covariance structures based on entropy loss and quadratic loss used for covariance matrix estimation. To study the asymptotic properties of the proposed tests, we study related high-dimensional random matrix theory, and establish several highly useful asymptotic results. With the aid of these asymptotic results, we derive the limiting distributions of these two tests under the null and alternative hypotheses. We further show that the quadratic loss based test is asymptotically unbiased. We conduct Monte Carlo simulation study to examine the finite sample performance of the two tests. Our simulation results show that the limiting null distributions approximate their null distributions quite well, and the corresponding asymptotic critical values keep Type I error rate very well. Our numerical comparison implies that the proposed tests outperform existing ones in terms of controlling Type I error rate and power. Our simulation indicates that the test based on quadratic loss seems to have better power than the test based on entropy loss.





Projected spline estimation of the nonparametric function in high-dimensional partially linear models for massive data

Heng Lian, Kaifeng Zhao, Shaogao Lv.

Source: The Annals of Statistics, Volume 47, Number 5, 2922--2949.

Abstract:
In this paper, we consider the local asymptotics of the nonparametric function in a partially linear model, within the framework of the divide-and-conquer estimation. Unlike the fixed-dimensional setting in which the parametric part does not affect the nonparametric part, the high-dimensional setting makes the issue more complicated. In particular, when a sparsity-inducing penalty such as lasso is used to make the estimation of the linear part feasible, the bias introduced will propagate to the nonparametric part. We propose a novel approach for estimation of the nonparametric function and establish the local asymptotics of the estimator. The result is useful for massive data with possibly different linear coefficients in each subpopulation but common nonparametric function. Some numerical illustrations are also presented.





Exact lower bounds for the agnostic probably-approximately-correct (PAC) machine learning model

Aryeh Kontorovich, Iosif Pinelis.

Source: The Annals of Statistics, Volume 47, Number 5, 2822--2854.

Abstract:
We provide an exact nonasymptotic lower bound on the minimax expected excess risk (EER) in the agnostic probably-approximately-correct (PAC) machine learning classification model and identify minimax learning algorithms as certain maximally symmetric and minimally randomized “voting” procedures. Based on this result, an exact asymptotic lower bound on the minimax EER is provided. This bound is of the simple form $c_{\infty}/\sqrt{\nu}$ as $\nu\to\infty$, where $c_{\infty}=0.16997\dots$ is a universal constant, $\nu=m/d$, $m$ is the size of the training sample and $d$ is the Vapnik–Chervonenkis dimension of the hypothesis class. It is shown that the differences between these asymptotic and nonasymptotic bounds, as well as the differences between these two bounds and the maximum EER of any learning algorithms that minimize the empirical risk, are asymptotically negligible, and all these differences are due to ties in the mentioned “voting” procedures. A few easy to compute nonasymptotic lower bounds on the minimax EER are also obtained, which are shown to be close to the exact asymptotic lower bound $c_{\infty}/\sqrt{\nu}$ even for rather small values of the ratio $\nu=m/d$. As an application of these results, we substantially improve existing lower bounds on the tail probability of the excess risk. Among the tools used are Bayes estimation and apparently new identities and inequalities for binomial distributions.





Linear hypothesis testing for high dimensional generalized linear models

Chengchun Shi, Rui Song, Zhao Chen, Runze Li.

Source: The Annals of Statistics, Volume 47, Number 5, 2671--2703.

Abstract:
This paper is concerned with testing linear hypotheses in high dimensional generalized linear models. To deal with linear hypotheses, we first propose the constrained partial regularization method and study its statistical properties. We further introduce an algorithm for solving regularization problems with folded-concave penalty functions and linear constraints. To test linear hypotheses, we propose a partial penalized likelihood ratio test, a partial penalized score test and a partial penalized Wald test. We show that the limiting null distributions of these three test statistics are $\chi^{2}$ distribution with the same degrees of freedom, and under local alternatives, they asymptotically follow noncentral $\chi^{2}$ distributions with the same degrees of freedom and noncentral parameter, provided the number of parameters involved in the test hypothesis grows to $\infty$ at a certain rate. Simulation studies are conducted to examine the finite sample performance of the proposed tests. Empirical analysis of a real data example is used to illustrate the proposed testing procedures.





Modeling wildfire ignition origins in southern California using linear network point processes

Medha Uppala, Mark S. Handcock.

Source: The Annals of Applied Statistics, Volume 14, Number 1, 339--356.

Abstract:
This paper focuses on spatial and temporal modeling of point processes on linear networks. Point processes on linear networks can simply be defined as point events occurring on or near line segment network structures embedded in a certain space. A separable modeling framework is introduced that posits separate formation and dissolution models of point processes on linear networks over time. While the model was inspired by spider web building activity in brick mortar lines, the focus is on modeling wildfire ignition origins near road networks over a span of 14 years. As most wildfires in California have human-related origins, modeling the origin locations with respect to the road network provides insight into how human, vehicular and structural densities affect ignition occurrence. Model results show that roads that traverse different types of regions such as residential, interface and wildland regions have higher ignition intensities compared to roads that only exist in each of the mentioned region types.





Optimal asset allocation with multivariate Bayesian dynamic linear models

Jared D. Fisher, Davide Pettenuzzo, Carlos M. Carvalho.

Source: The Annals of Applied Statistics, Volume 14, Number 1, 299--338.

Abstract:
We introduce a fast, closed-form, simulation-free method to model and forecast multiple asset returns and employ it to investigate the optimal ensemble of features to include when jointly predicting monthly stock and bond excess returns. Our approach builds on the Bayesian dynamic linear models of West and Harrison (Bayesian Forecasting and Dynamic Models (1997) Springer), and it can objectively determine, through a fully automated procedure, both the optimal set of regressors to include in the predictive system and the degree to which the model coefficients, volatilities and covariances should vary over time. When applied to a portfolio of five stock and bond returns, we find that our method leads to large forecast gains, both in statistical and economic terms. In particular, we find that relative to a standard no-predictability benchmark, the optimal combination of predictors, stochastic volatility and time-varying covariances increases the annualized certainty equivalent returns of a leverage-constrained power utility investor by more than 500 basis points.





Outline analyses of the called strike zone in Major League Baseball

Dale L. Zimmerman, Jun Tang, Rui Huang.

Source: The Annals of Applied Statistics, Volume 13, Number 4, 2416--2451.

Abstract:
We extend statistical shape analytic methods known as outline analysis for application to the strike zone, a central feature of the game of baseball. Although the strike zone is rigorously defined by Major League Baseball’s official rules, umpires make mistakes in calling pitches as strikes (and balls) and may even adhere to a strike zone somewhat different than that prescribed by the rule book. Our methods yield inference on geometric attributes (centroid, dimensions, orientation and shape) of this “called strike zone” (CSZ) and on the effects that years, umpires, player attributes, game situation factors and their interactions have on those attributes. The methodology consists of first using kernel discriminant analysis to determine a noisy outline representing the CSZ corresponding to each factor combination, then fitting existing elliptic Fourier and new generalized superelliptic models for closed curves to that outline and finally analyzing the fitted model coefficients using standard methods of regression analysis, factorial analysis of variance and variance component estimation. We apply these methods to PITCHf/x data comprising more than three million called pitches from the 2008–2016 Major League Baseball seasons to address numerous questions about the CSZ. We find that all geometric attributes of the CSZ, except its size, became significantly more like those of the rule-book strike zone from 2008–2016 and that several player attribute/game situation factors had statistically and practically significant effects on many of them. We also establish that the variation in the horizontal center, width and area of an individual umpire’s CSZ from pitch to pitch is smaller than their variation among CSZs from different umpires.





Joint model of accelerated failure time and mechanistic nonlinear model for censored covariates, with application in HIV/AIDS

Hongbin Zhang, Lang Wu.

Source: The Annals of Applied Statistics, Volume 13, Number 4, 2140--2157.

Abstract:
For a time-to-event outcome with censored time-varying covariates, a joint Cox model with a linear mixed effects model is the standard modeling approach. In some applications such as AIDS studies, mechanistic nonlinear models are available for some covariate process such as viral load during anti-HIV treatments, derived from the underlying data-generation mechanisms and disease progression. Such a mechanistic nonlinear covariate model may provide better-predicted values when the covariates are left censored or mismeasured. When the focus is on the impact of the time-varying covariate process on the survival outcome, an accelerated failure time (AFT) model provides an excellent alternative to the Cox proportional hazard model since an AFT model is formulated to allow the influence of the outcome by the entire covariate process. In this article, we consider a nonlinear mixed effects model for the censored covariates in an AFT model, implemented using a Monte Carlo EM algorithm, under the framework of a joint model for simultaneous inference. We apply the joint model to an HIV/AIDS data to gain insights for assessing the association between viral load and immunological restoration during antiretroviral therapy. Simulation is conducted to compare model performance when the covariate model and the survival model are misspecified.





Statistical inference for partially observed branching processes with application to cell lineage tracking of in vivo hematopoiesis

Jason Xu, Samson Koelle, Peter Guttorp, Chuanfeng Wu, Cynthia Dunbar, Janis L. Abkowitz, Vladimir N. Minin.

Source: The Annals of Applied Statistics, Volume 13, Number 4, 2091--2119.

Abstract:
Single-cell lineage tracking strategies enabled by recent experimental technologies have produced significant insights into cell fate decisions, but lack the quantitative framework necessary for rigorous statistical analysis of mechanistic models describing cell division and differentiation. In this paper, we develop such a framework with corresponding moment-based parameter estimation techniques for continuous-time, multi-type branching processes. Such processes provide a probabilistic model of how cells divide and differentiate, and we apply our method to study hematopoiesis, the mechanism of blood cell production. We derive closed-form expressions for higher moments in a general class of such models. These analytical results allow us to efficiently estimate parameters of much richer statistical models of hematopoiesis than those used in previous statistical studies. To our knowledge, the method provides the first rate inference procedure for fitting such models to time series data generated from cellular barcoding experiments. After validating the methodology in simulation studies, we apply our estimator to hematopoietic lineage tracking data from rhesus macaques. Our analysis provides a more complete understanding of cell fate decisions during hematopoiesis in nonhuman primates, which may be more relevant to human biology and clinical strategies than previous findings from murine studies. For example, in addition to previously estimated hematopoietic stem cell self-renewal rate, we are able to estimate fate decision probabilities and to compare structurally distinct models of hematopoiesis using cross validation. These estimates of fate decision probabilities and our model selection results should help biologists compare competing hypotheses about how progenitor cells differentiate. The methodology is transferrable to a large class of stochastic compartmental and multi-type branching models, commonly used in studies of cancer progression, epidemiology and many other fields.





Imputation and post-selection inference in models with missing data: An application to colorectal cancer surveillance guidelines

Lin Liu, Yuqi Qiu, Loki Natarajan, Karen Messer.

Source: The Annals of Applied Statistics, Volume 13, Number 3, 1370--1396.

Abstract:
It is common to encounter missing data among the potential predictor variables in the setting of model selection. For example, in a recent study we attempted to improve the US guidelines for risk stratification after screening colonoscopy (Cancer Causes Control 27 (2016) 1175–1185), with the aim to help reduce both overuse and underuse of follow-on surveillance colonoscopy. The goal was to incorporate selected additional informative variables into a neoplasia risk-prediction model, going beyond the three currently established risk factors, using a large dataset pooled from seven different prospective studies in North America. Unfortunately, not all candidate variables were collected in all studies, so that one or more important potential predictors were missing on over half of the subjects. Thus, while variable selection was a main focus of the study, it was necessary to address the substantial amount of missing data. Multiple imputation can effectively address missing data, and there are also good approaches to incorporate the variable selection process into model-based confidence intervals. However, there is not consensus on appropriate methods of inference which address both issues simultaneously. Our goal here is to study the properties of model-based confidence intervals in the setting of imputation for missing data followed by variable selection. We use both simulation and theory to compare three approaches to such post-imputation-selection inference: a multiple-imputation approach based on Rubin’s Rules for variance estimation (Comput. Statist. Data Anal. 71 (2014) 758–770); a single imputation-selection followed by bootstrap percentile confidence intervals; and a new bootstrap model-averaging approach presented here, following Efron (J. Amer. Statist. Assoc. 109 (2014) 991–1007). We investigate relative strengths and weaknesses of each method. The “Rubin’s Rules” multiple imputation estimator can have severe undercoverage, and is not recommended. The imputation-selection estimator with bootstrap percentile confidence intervals works well. The bootstrap-model-averaged estimator, with the “Efron’s Rules” estimated variance, may be preferred if the true effect sizes are moderate. We apply these results to the colorectal neoplasia risk-prediction problem which motivated the present work.





Bayesian linear regression for multivariate responses under group sparsity

Bo Ning, Seonghyun Jeong, Subhashis Ghosal.

Source: Bernoulli, Volume 26, Number 3, 2353--2382.

Abstract:
We study frequentist properties of a Bayesian high-dimensional multivariate linear regression model with correlated responses. The predictors are separated into many groups and the group structure is pre-determined. Two features of the model are unique: (i) group sparsity is imposed on the predictors; (ii) the covariance matrix is unknown and its dimensions can also be high. We choose a product of independent spike-and-slab priors on the regression coefficients and a new prior on the covariance matrix based on its eigendecomposition. Each spike-and-slab prior is a mixture of a point mass at zero and a multivariate density involving the $\ell_{2,1}$-norm. We first obtain the posterior contraction rate, the bounds on the effective dimension of the model with high posterior probabilities. We then show that the multivariate regression coefficients can be recovered under certain compatibility conditions. Finally, we quantify the uncertainty for the regression coefficients with frequentist validity through a Bernstein–von Mises type theorem. The result leads to selection consistency for the Bayesian method. We derive the posterior contraction rate using the general theory by constructing a suitable test from the first principle using moment bounds for certain likelihood ratios. This leads to posterior concentration around the truth with respect to the average Rényi divergence of order $1/2$. This technique of obtaining the required tests for posterior contraction rate could be useful in many other problems.





A refined Cramér-type moderate deviation for sums of local statistics

Xiao Fang, Li Luo, Qi-Man Shao.

Source: Bernoulli, Volume 26, Number 3, 2319--2352.

Abstract:
We prove a refined Cramér-type moderate deviation result by taking into account of the skewness in normal approximation for sums of local statistics of independent random variables. We apply the main result to $k$-runs, U-statistics and subgraph counts in the Erdős–Rényi random graph. To prove our main result, we develop exponential concentration inequalities and higher-order tail probability expansions via Stein’s method.





Weighted Lépingle inequality

Pavel Zorin-Kranich.

Source: Bernoulli, Volume 26, Number 3, 2311--2318.

Abstract:
We prove an estimate for weighted $p$th moments of the pathwise $r$-variation of a martingale in terms of the $A_{p}$ characteristic of the weight. The novelty of the proof is that we avoid real interpolation techniques.





First-order covariance inequalities via Stein’s method

Marie Ernst, Gesine Reinert, Yvik Swan.

Source: Bernoulli, Volume 26, Number 3, 2051--2081.

Abstract:
We propose probabilistic representations for inverse Stein operators (i.e., solutions to Stein equations) under general conditions; in particular, we deduce new simple expressions for the Stein kernel. These representations allow to deduce uniform and nonuniform Stein factors (i.e., bounds on solutions to Stein equations) and lead to new covariance identities expressing the covariance between arbitrary functionals of an arbitrary univariate target in terms of a weighted covariance of the derivatives of the functionals. Our weights are explicit, easily computable in most cases and expressed in terms of objects familiar within the context of Stein’s method. Applications of the Cauchy–Schwarz inequality to these weighted covariance identities lead to sharp upper and lower covariance bounds and, in particular, weighted Poincaré inequalities. Many examples are given and, in particular, classical variance bounds due to Klaassen, Brascamp and Lieb or Otto and Menz are corollaries. Connections with more recent literature are also detailed.





On sampling from a log-concave density using kinetic Langevin diffusions

Arnak S. Dalalyan, Lionel Riou-Durand.

Source: Bernoulli, Volume 26, Number 3, 1956--1988.

Abstract:
Langevin diffusion processes and their discretizations are often used for sampling from a target density. The most convenient framework for assessing the quality of such a sampling scheme corresponds to smooth and strongly log-concave densities defined on $\mathbb{R}^{p}$. The present work focuses on this framework and studies the behavior of the Monte Carlo algorithm based on discretizations of the kinetic Langevin diffusion. We first prove the geometric mixing property of the kinetic Langevin diffusion with a mixing rate that is optimal in terms of its dependence on the condition number. We then use this result for obtaining improved guarantees of sampling using the kinetic Langevin Monte Carlo method, when the quality of sampling is measured by the Wasserstein distance. We also consider the situation where the Hessian of the log-density of the target distribution is Lipschitz-continuous. In this case, we introduce a new discretization of the kinetic Langevin diffusion and prove that this leads to a substantial improvement of the upper bound on the sampling error measured in Wasserstein distance.





On the best constant in the martingale version of Fefferman’s inequality

Adam Osękowski.

Source: Bernoulli, Volume 26, Number 3, 1912--1926.

Abstract:
Let $X=(X_{t})_{t\geq 0}\in H^{1}$ and $Y=(Y_{t})_{t\geq 0}\in\mathrm{BMO}$ be arbitrary continuous-path martingales. The paper contains the proof of the inequality \begin{equation*}\mathbb{E}\int_{0}^{\infty}\bigl\lvert d\langle X,Y\rangle_{t}\bigr\rvert\leq\sqrt{2}\Vert X\Vert_{H^{1}}\Vert Y\Vert_{\mathrm{BMO}_{2}},\end{equation*} and the constant $\sqrt{2}$ is shown to be the best possible. The proof rests on the construction of a certain special function, enjoying appropriate size and concavity conditions.





Logarithmic Sobolev inequalities for finite spin systems and applications

Holger Sambale, Arthur Sinulis.

Source: Bernoulli, Volume 26, Number 3, 1863--1890.

Abstract:
We derive sufficient conditions for a probability measure on a finite product space (a spin system) to satisfy a (modified) logarithmic Sobolev inequality. We establish these conditions for various examples, such as the (vertex-weighted) exponential random graph model, the random coloring and the hard-core model with fugacity. This leads to two separate branches of applications. The first branch is given by mixing time estimates of the Glauber dynamics. The proofs do not rely on coupling arguments, but instead use functional inequalities. As a byproduct, this also yields exponential decay of the relative entropy along the Glauber semigroup. Secondly, we investigate the concentration of measure phenomenon (particularly of higher order) for these spin systems. We show the effect of better concentration properties by centering not around the mean, but around a stochastic term in the exponential random graph model. From there, one can deduce a central limit theorem for the number of triangles from the CLT of the edge count. In the Erdős–Rényi model the first-order approximation leads to a quantification and a proof of a central limit theorem for subgraph counts.





Influence of the seed in affine preferential attachment trees

David Corlin Marchand, Ioan Manolescu.

Source: Bernoulli, Volume 26, Number 3, 1665--1705.

Abstract:
We study randomly growing trees governed by the affine preferential attachment rule. Starting with a seed tree $S$, vertices are attached one by one, each linked by an edge to a random vertex of the current tree, chosen with a probability proportional to an affine function of its degree. This yields a one-parameter family of preferential attachment trees $(T_{n}^{S})_{n\geq |S|}$, of which the linear model is a particular case. Depending on the choice of the parameter, the power-laws governing the degrees in $T_{n}^{S}$ have different exponents. We study the problem of the asymptotic influence of the seed $S$ on the law of $T_{n}^{S}$. We show that, for any two distinct seeds $S$ and $S'$, the laws of $T_{n}^{S}$ and $T_{n}^{S'}$ remain at uniformly positive total-variation distance as $n$ increases. This is a continuation of Curien et al. (J. Éc. Polytech. Math. 2 (2015) 1–34), which in turn was inspired by a conjecture of Bubeck et al. (IEEE Trans. Netw. Sci. Eng. 2 (2015) 30–39). The technique developed here is more robust than previous ones and is likely to help in the study of more general attachment mechanisms.





Efficient estimation in single index models through smoothing splines

Arun K. Kuchibhotla, Rohit K. Patra.

Source: Bernoulli, Volume 26, Number 2, 1587--1618.

Abstract:
We consider estimation and inference in a single index regression model with an unknown but smooth link function. In contrast to the standard approach of using kernels or regression splines, we use smoothing splines to estimate the smooth link function. We develop a method to compute the penalized least squares estimators (PLSEs) of the parametric and the nonparametric components given independent and identically distributed (i.i.d.) data. We prove the consistency and find the rates of convergence of the estimators. We establish asymptotic normality under mild assumption and prove asymptotic efficiency of the parametric component under homoscedastic errors. A finite sample simulation corroborates our asymptotic theory. We also analyze a car mileage data set and a Ozone concentration data set. The identifiability and existence of the PLSEs are also investigated.





Around the entropic Talagrand inequality

Giovanni Conforti, Luigia Ripani.

Source: Bernoulli, Volume 26, Number 2, 1431--1452.

Abstract:
In this article, we study generalization of the classical Talagrand transport-entropy inequality in which the Wasserstein distance is replaced by the entropic transportation cost. This class of inequalities has been introduced in the recent work (Probab. Theory Related Fields 174 (2019) 1–47), in connection with the study of Schrödinger bridges. We provide several equivalent characterizations in terms of reverse hypercontractivity for the heat semigroup, contractivity of the Hamilton–Jacobi–Bellman semigroup and dimension-free concentration of measure. Properties such as tensorization and relations to other functional inequalities are also investigated. In particular, we show that the inequalities studied in this article are implied by a Logarithmic Sobolev inequality and imply Talagrand inequality.





Rates of convergence in de Finetti’s representation theorem, and Hausdorff moment problem

Emanuele Dolera, Stefano Favaro.

Source: Bernoulli, Volume 26, Number 2, 1294--1322.

Abstract:
Given a sequence $\{X_{n}\}_{n\geq 1}$ of exchangeable Bernoulli random variables, the celebrated de Finetti representation theorem states that $\frac{1}{n}\sum_{i=1}^{n}X_{i}\stackrel{a.s.}{\longrightarrow}Y$ for a suitable random variable $Y:\Omega\rightarrow[0,1]$ satisfying $\mathsf{P}[X_{1}=x_{1},\dots,X_{n}=x_{n}|Y]=Y^{\sum_{i=1}^{n}x_{i}}(1-Y)^{n-\sum_{i=1}^{n}x_{i}}$. In this paper, we study the rate of convergence in law of $\frac{1}{n}\sum_{i=1}^{n}X_{i}$ to $Y$ under the Kolmogorov distance. After showing that a rate of the type of $1/n^{\alpha}$ can be obtained for any index $\alpha\in(0,1]$, we find a sufficient condition on the distribution of $Y$ for the achievement of the optimal rate of convergence, that is $1/n$. Besides extending and strengthening recent results under the weaker Wasserstein distance, our main result weakens the regularity hypotheses on $Y$ in the context of the Hausdorff moment problem.





Dynamic linear discriminant analysis in high dimensional space

Binyan Jiang, Ziqi Chen, Chenlei Leng.

Source: Bernoulli, Volume 26, Number 2, 1234--1268.

Abstract:
High-dimensional data that evolve dynamically feature predominantly in the modern data era. As a partial response to this, recent years have seen increasing emphasis to address the dimensionality challenge. However, the non-static nature of these datasets is largely ignored. This paper addresses both challenges by proposing a novel yet simple dynamic linear programming discriminant (DLPD) rule for binary classification. Different from the usual static linear discriminant analysis, the new method is able to capture the changing distributions of the underlying populations by modeling their means and covariances as smooth functions of covariates of interest. Under an approximate sparse condition, we show that the conditional misclassification rate of the DLPD rule converges to the Bayes risk in probability uniformly over the range of the variables used for modeling the dynamics, when the dimensionality is allowed to grow exponentially with the sample size. The minimax lower bound of the estimation of the Bayes risk is also established, implying that the misclassification rate of our proposed rule is minimax-rate optimal. The promising performance of the DLPD rule is illustrated via extensive simulation studies and the analysis of a breast cancer dataset.




ine

Estimation of the linear fractional stable motion

Stepan Mazur, Dmitry Otryakhin, Mark Podolskij.

Source: Bernoulli, Volume 26, Number 1, 226--252.

Abstract:
In this paper, we investigate the parametric inference for the linear fractional stable motion in high and low frequency setting. The symmetric linear fractional stable motion is a three-parameter family, which constitutes a natural non-Gaussian analogue of the scaled fractional Brownian motion. It is fully characterised by the scaling parameter $\sigma>0$, the self-similarity parameter $H\in(0,1)$ and the stability index $\alpha\in(0,2)$ of the driving stable motion. The parametric estimation of the model is inspired by the limit theory for stationary increments Lévy moving average processes that has been recently studied in (Ann. Probab. 45 (2017) 4477–4528). More specifically, we combine (negative) power variation statistics and empirical characteristic functions to obtain consistent estimates of $(\sigma,\alpha,H)$. We present the law of large numbers and some fully feasible weak limit theorems.
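
To give a feel for how an empirical characteristic function can carry information about a stability index, here is a minimal sketch. It is not the estimator of the paper: it uses i.i.d. standard Cauchy draws (a symmetric stable law with $\alpha=1$) rather than increments of a linear fractional stable motion, and all function names and the evaluation points `t1`, `t2` are assumptions of this note. For a symmetric $\alpha$-stable law with scale $\sigma$, the characteristic function is $\exp(-\sigma^{\alpha}|t|^{\alpha})$, so the ratio of its logarithms at two points equals $(t_{2}/t_{1})^{\alpha}$.

```python
import math
import random


def sample_cauchy(n, seed=0):
    """Draw n standard Cauchy variates by inverting the CDF (alpha = 1)."""
    rng = random.Random(seed)
    return [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]


def empirical_cf_real(samples, t):
    """Real part of the empirical characteristic function at t.

    For a symmetric law the true characteristic function is real,
    so the imaginary part is dropped.
    """
    return sum(math.cos(t * x) for x in samples) / len(samples)


def estimate_alpha(samples, t1=0.5, t2=1.0):
    """Estimate alpha from log phi(t2) / log phi(t1) = (t2 / t1) ** alpha.

    Assumes both empirical values lie strictly between 0 and 1, which
    holds with high probability for large samples at these points.
    """
    phi1 = empirical_cf_real(samples, t1)
    phi2 = empirical_cf_real(samples, t2)
    return math.log(math.log(phi2) / math.log(phi1)) / math.log(t2 / t1)


if __name__ == "__main__":
    draws = sample_cauchy(20000)
    print(estimate_alpha(draws))  # should be close to 1 for Cauchy data
```

The ratio form removes the scale $\sigma$ from the estimate of $\alpha$, which is the reason two evaluation points are used in this sketch.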




ine

Volume 24 Item 04: William Thomas Manners and customs of Aborigines - Miscellaneous scraps, ca. 1858




ine

Smart research for HSC students: Better searching with online resources

In this online session, we simplify searching for you so that the skills you need in one resource will work wherever you are.




ine

Art Around the Library - Zine to Artist's Book

Find out how easy it is to make a ‘zine’ and you’re well on your way to producing your own mini books.




ine

Where do I start? Discover Your State Library Online

Whether you're looking for a new book to read, a binge-worthy podcast, inspiring stories, or a fun activity to do at home – you can get all of this and more online at your State Library.




ine

Federal watchdog finds 'reasonable grounds to believe' vaccine doctor's ouster was retaliation, lawyers say

The Office of Special Counsel is recommending that ousted vaccine official Dr. Rick Bright be reinstated while it investigates his case, his lawyers announced Friday. Bright, who was leading coronavirus vaccine development, was recently removed from his position as the director of the Department of Health and Human Services' Biomedical Advanced Research and Development Authority, and he alleges it was because he insisted congressional funding not go toward "drugs, vaccines, and other technologies that lack scientific merit" and limited the "broad use" of hydroxychloroquine after it was touted by President Trump. In a whistleblower complaint, he alleged "cronyism" at HHS. He has also alleged he was "pressured to ignore or dismiss expert scientific recommendations and instead to award lucrative contracts based on political connections." On Friday, Bright's lawyers said that the Office of Special Counsel has determined there are "reasonable grounds to believe" his firing was retaliation, The New York Times reports. The federal watchdog also recommended he be reinstated for 45 days to give the office "sufficient time to complete its investigation of Bright's allegations," CNN reports. The decision on whether to do so falls on Secretary of Health and Human Services Alex Azar, and Office of Special Counsel recommendations are "not binding," the Times notes.





ine

Nearly one-third of Americans believe a coronavirus vaccine exists and is being withheld, survey finds

The Democracy Fund + UCLA Nationscape Project found some misinformation about the coronavirus is more widespread than you might think.





ine

Coronavirus: Chinese official admits health system weaknesses

China says it will improve public health systems after criticism of its early response to the virus.





ine

A Loss-Based Prior for Variable Selection in Linear Regression Methods

Cristiano Villa, Jeong Eun Lee.

Source: Bayesian Analysis, Volume 15, Number 2, 533--558.

Abstract:
In this work we propose a novel model prior for variable selection in linear regression. The idea is to determine the prior mass by considering the worth of each of the regression models, given the number of possible covariates under consideration. The worth of a model consists of the information loss and the loss due to model complexity. While the information loss is determined objectively, the loss expression due to model complexity is flexible, and the penalty on model size can even be customized to include some prior knowledge. Some versions of the loss-based prior are proposed and compared empirically. Through simulation studies and real data analyses, we compare the proposed prior to the Scott and Berger prior, for noninformative scenarios, and with the Beta-Binomial prior, for informative scenarios.




ine

A New Bayesian Approach to Robustness Against Outliers in Linear Regression

Philippe Gagnon, Alain Desgagné, Mylène Bédard.

Source: Bayesian Analysis, Volume 15, Number 2, 389--414.

Abstract:
Linear regression is ubiquitous in statistical analysis. It is well understood that conflicting sources of information may contaminate the inference when the classical normality of errors is assumed. The contamination caused by the light normal tails follows from an undesirable effect: the posterior concentrates in an area in between the different sources with a large enough scaling to incorporate them all. The theory of conflict resolution in Bayesian statistics (O’Hagan and Pericchi (2012)) recommends addressing this problem by limiting the impact of outliers to obtain conclusions consistent with the bulk of the data. In this paper, we propose a model with super heavy-tailed errors to achieve this. We prove that it is wholly robust, meaning that the impact of outliers gradually vanishes as they move further and further away from the general trend. The super heavy-tailed density is similar to the normal outside of the tails, which gives rise to an efficient estimation procedure. In addition, estimates are easily computed. This is highlighted via a detailed user guide, where all steps are explained through a simulated case study. The performance is shown using simulation. All required code is given.




ine

Dynamic Quantile Linear Models: A Bayesian Approach

Kelly C. M. Gonçalves, Hélio S. Migon, Leonardo S. Bastos.

Source: Bayesian Analysis, Volume 15, Number 2, 335--362.

Abstract:
The paper introduces a new class of models, named dynamic quantile linear models, which combines dynamic linear models with distribution-free quantile regression producing a robust statistical method. Bayesian estimation for the dynamic quantile linear model is performed using an efficient Markov chain Monte Carlo algorithm. The paper also proposes a fast sequential procedure suited for high-dimensional predictive modeling with massive data, where the generating process is changing over time. The proposed model is evaluated using synthetic and well-known time series data. The model is also applied to predict annual incidence of tuberculosis in the state of Rio de Janeiro and compared with global targets set by the World Health Organization.




ine

Adaptive Bayesian Nonparametric Regression Using a Kernel Mixture of Polynomials with Application to Partial Linear Models

Fangzheng Xie, Yanxun Xu.

Source: Bayesian Analysis, Volume 15, Number 1, 159--186.

Abstract:
We propose a kernel mixture of polynomials prior for Bayesian nonparametric regression. The regression function is modeled by local averages of polynomials with kernel mixture weights. We obtain the minimax-optimal contraction rate of the full posterior distribution up to a logarithmic factor by estimating metric entropies of certain function classes. Under the assumption that the degree of the polynomials is larger than the unknown smoothness level of the true function, the posterior contraction behavior can adapt to this smoothness level provided an upper bound is known. We also provide a frequentist sieve maximum likelihood estimator with a near-optimal convergence rate. We further investigate the application of the kernel mixture of polynomials to partial linear models and obtain both the near-optimal rate of contraction for the nonparametric component and the Bernstein-von Mises limit (i.e., asymptotic normality) of the parametric component. The proposed method is illustrated with numerical examples and shows superior performance in terms of computational efficiency, accuracy, and uncertainty quantification compared to the local polynomial regression, DiceKriging, and the robust Gaussian stochastic process.




ine

Probability Based Independence Sampler for Bayesian Quantitative Learning in Graphical Log-Linear Marginal Models

Ioannis Ntzoufras, Claudia Tarantola, Monia Lupparelli.

Source: Bayesian Analysis, Volume 14, Number 3, 797--823.

Abstract:
We introduce a novel Bayesian approach for quantitative learning for graphical log-linear marginal models. These models belong to curved exponential families that are difficult to handle from a Bayesian perspective. The likelihood cannot be analytically expressed as a function of the marginal log-linear interactions, but only in terms of cell counts or probabilities. Posterior distributions cannot be directly obtained, and Markov Chain Monte Carlo (MCMC) methods are needed. Finally, a well-defined model requires parameter values that lead to compatible marginal probabilities. Hence, any MCMC should account for this important restriction. We construct a fully automatic and efficient MCMC strategy for quantitative learning for such models that handles these problems. While the prior is expressed in terms of the marginal log-linear interactions, we build an MCMC algorithm that employs a proposal on the probability parameter space. The corresponding proposal on the marginal log-linear interactions is obtained via parameter transformation. We exploit a conditional conjugate setup to build an efficient proposal on probability parameters. The proposed methodology is illustrated by a simulation study and a real dataset.




ine

Constrained Bayesian Optimization with Noisy Experiments

Benjamin Letham, Brian Karrer, Guilherme Ottoni, Eytan Bakshy.

Source: Bayesian Analysis, Volume 14, Number 2, 495--519.

Abstract:
Randomized experiments are the gold standard for evaluating the effects of changes to real-world systems. Data in these tests may be difficult to collect and outcomes may have high variance, resulting in potentially large measurement error. Bayesian optimization is a promising technique for efficiently optimizing multiple continuous parameters, but existing approaches degrade in performance when the noise level is high, limiting its applicability to many randomized experiments. We derive an expression for expected improvement under greedy batch optimization with noisy observations and noisy constraints, and develop a quasi-Monte Carlo approximation that allows it to be efficiently optimized. Simulations with synthetic functions show that optimization performance on noisy, constrained problems outperforms existing methods. We further demonstrate the effectiveness of the method with two real-world experiments conducted at Facebook: optimizing a ranking system, and optimizing server compiler flags.
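
For readers unfamiliar with the expected improvement criterion named in the abstract above, the following sketch shows the standard closed form for the simplest setting only: a single candidate point, noiseless observations, no constraints, and a Gaussian posterior. It is not the noisy, constrained, batch expression derived in the paper, and the function name and the minimization convention are assumptions of this note.

```python
import math


def expected_improvement(mu: float, sigma: float, best: float) -> float:
    """Noiseless expected improvement for minimization.

    Assumes the objective at the candidate point has posterior
    N(mu, sigma^2) and that `best` is the lowest value observed so far.
    EI = E[max(best - f, 0)] = (best - mu) * Phi(z) + sigma * phi(z),
    with z = (best - mu) / sigma.
    """
    if sigma <= 0.0:
        # Degenerate posterior: the improvement is known exactly.
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # phi(z)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # Phi(z)
    return (best - mu) * cdf + sigma * pdf


if __name__ == "__main__":
    # A candidate whose posterior mean equals the incumbent still has
    # positive expected improvement because of posterior uncertainty.
    print(expected_improvement(mu=0.0, sigma=1.0, best=0.0))
```

With noisy observations the incumbent `best` is itself uncertain, so this closed form no longer applies directly; that is the gap the abstract describes.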