ad

Instruments for health surveys in children and adolescents

9783319988573 (electronic bk.)




ad

Head and neck surgery.

9781451173703 hardcover




ad

Feed additives : aromatic plants and herbs in animal nutrition and health

9780128147016 (electronic bk.)




ad

Diabetes & obesity in women : adolescence, pregnancy, and menopause

Diabetes in women.
9781496390547 (paperback)




ad

Crafting qualitative research : beyond positivist traditions

Prasad, Pushkala, author.
9781315715070 (e-book)




ad

Consequences of microbial interactions with hydrocarbons, oils, and lipids : biodegradation and bioremediation

9783319445359 (electronic bk.)




ad

Breakfast cereals and how they are made : raw materials, processing, and production

9780128120446 (electronic bk.)




ad

Bioremediation and biotechnology : sustainable approaches to pollution degradation

9783030356910 (electronic bk.)




ad

Advances in virus research.

9780123850348 (electronic bk.)




ad

Advances in protein chemistry and structural biology.

9780123819635 (electronic bk.)




ad

Advances in protein chemistry and structural biology.

9780123864840 (electronic bk.)




ad

Advances in parasitology.

9780123742292 (electronic bk.)




ad

Advances in cyanobacterial biology

9780128193129 (electronic bk.)




ad

Advances in applied microbiology.

1282169459




ad

Advances in applied microbiology.

1282169416




ad

Advanced age geriatric care : a comprehensive guide

9783319969985 (electronic bk.)





ad

New Partnerships Emerge for COVID-19 Relief: Dade County Farm Bureau...

Harvested produce crops feed Florida Department of Corrections’ (FDC) more than 87,000 inmates; action saves food costs while reducing COVID-19 related supply chain impacts.

(PRWeb April 20, 2020)

Read the full story at https://www.prweb.com/releases/new_partnerships_emerge_for_covid_19_relief_dade_county_farm_bureau_teams_with_state_leaders_to_launch_farm_to_inmate_program/prweb17052045.htm





ad

Statistical inference for model parameters in stochastic gradient descent

Xi Chen, Jason D. Lee, Xin T. Tong, Yichen Zhang.

Source: The Annals of Statistics, Volume 48, Number 1, 251--273.

Abstract:
The stochastic gradient descent (SGD) algorithm has been widely used in statistical estimation for large-scale data due to its computational and memory efficiency. While most existing works focus on the convergence of the objective function or the error of the obtained solution, we investigate the problem of statistical inference of true model parameters based on SGD when the population loss function is strongly convex and satisfies certain smoothness conditions. Our main contributions are twofold. First, in the fixed dimension setup, we propose two consistent estimators of the asymptotic covariance of the average iterate from SGD: (1) a plug-in estimator, and (2) a batch-means estimator, which is computationally more efficient and only uses the iterates from SGD. Both proposed estimators allow us to construct asymptotically exact confidence intervals and hypothesis tests. Second, for high-dimensional linear regression, using a variant of the SGD algorithm, we construct a debiased estimator of each regression coefficient that is asymptotically normal. This gives a one-pass algorithm for computing both the sparse regression coefficients and confidence intervals, which is computationally attractive and applicable to online data.
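
The batch-means idea in the abstract above can be illustrated on a toy problem. The sketch below is only a simplified illustration, not the estimator from the paper: it assumes a one-dimensional mean-estimation loss, equal-size batches, and a normal quantile of 1.96, and the names `averaged_sgd_mean` and `batch_means_interval` are hypothetical.

```python
import math
import random


def averaged_sgd_mean(samples, eta0=0.5, alpha=0.6):
    """Run SGD on the loss 0.5 * (theta - x)^2 and return every iterate."""
    theta = 0.0
    iterates = []
    for t, x in enumerate(samples, start=1):
        step = eta0 * t ** (-alpha)      # decaying step size (assumed schedule)
        theta -= step * (theta - x)      # stochastic gradient of the loss
        iterates.append(theta)
    return iterates


def batch_means_interval(iterates, n_batches=20, z=1.96):
    """Equal-size batch-means interval for the average of the iterates."""
    batch_size = len(iterates) // n_batches
    used = iterates[: batch_size * n_batches]
    means = [
        sum(used[i * batch_size:(i + 1) * batch_size]) / batch_size
        for i in range(n_batches)
    ]
    center = sum(means) / n_batches
    # The sample variance of the batch means estimates the variance of one
    # batch mean; dividing by n_batches gives the variance of their average.
    var_of_means = sum((m - center) ** 2 for m in means) / (n_batches - 1)
    half_width = z * math.sqrt(var_of_means / n_batches)
    return center, half_width


rng = random.Random(0)
data = [rng.gauss(2.0, 1.0) for _ in range(20000)]   # true mean is 2.0
center, half_width = batch_means_interval(averaged_sgd_mean(data))
print(f"estimate {center:.3f} +/- {half_width:.3f}")
```

Only the SGD iterates enter the interval, which is the property the abstract highlights for the batch-means estimator; the choice of batch sizes in the paper itself may differ from the equal-size split assumed here.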




ad

Adaptive risk bounds in univariate total variation denoising and trend filtering

Adityanand Guntuboyina, Donovan Lieu, Sabyasachi Chatterjee, Bodhisattva Sen.

Source: The Annals of Statistics, Volume 48, Number 1, 205--229.

Abstract:
We study trend filtering, a relatively recent method for univariate nonparametric regression. For a given integer $r\geq1$, the $r$th order trend filtering estimator is defined as the minimizer of the sum of squared errors when we constrain (or penalize) the sum of the absolute $r$th order discrete derivatives of the fitted function at the design points. For $r=1$, the estimator reduces to total variation regularization which has received much attention in the statistics and image processing literature. In this paper, we study the performance of the trend filtering estimator for every $r\geq1$, both in the constrained and penalized forms. Our main results show that in the strong sparsity setting when the underlying function is a (discrete) spline with few “knots,” the risk (under the global squared error loss) of the trend filtering estimator (with an appropriate choice of the tuning parameter) achieves the parametric $n^{-1}$-rate, up to a logarithmic (multiplicative) factor. Our results therefore provide support for the use of trend filtering, for every $r\geq1$, in the strong sparsity setting.
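
The penalized form described in this abstract can be made concrete with a small sketch. It only evaluates the objective for a candidate fit and does not minimize it. It assumes unit-spaced design points and omits any normalizing factor the paper may use; the function names and the tuning parameter `lam` are hypothetical.

```python
def discrete_derivative(values, order):
    """Apply first differences `order` times (unit-spaced design assumed)."""
    out = list(values)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def trend_filter_objective(y, fitted, order, lam):
    """Half the squared error plus lam times the sum of absolute
    order-th discrete derivatives of the fitted values."""
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    penalty = sum(abs(d) for d in discrete_derivative(fitted, order))
    return 0.5 * sse + lam * penalty


# A piecewise linear fit with a single knot: only one second difference
# is nonzero, so the order-2 penalty is small.
fitted = [0, 1, 2, 3, 2, 1, 0]
second_diffs = discrete_derivative(fitted, 2)
objective = trend_filter_objective(fitted, fitted, order=2, lam=1.0)
print(second_diffs, objective)
```

With `order=1` the penalty is the total variation of the fitted values, which matches the reduction to total variation regularization mentioned in the abstract.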




ad

Sparse SIR: Optimal rates and adaptive estimation

Kai Tan, Lei Shi, Zhou Yu.

Source: The Annals of Statistics, Volume 48, Number 1, 64--85.

Abstract:
Sliced inverse regression (SIR) is an innovative and effective method for sufficient dimension reduction and data visualization. Recently, an impressive range of penalized SIR methods has been proposed to estimate the central subspace in a sparse fashion. Nonetheless, few of them considered the sparse sufficient dimension reduction from a decision-theoretic point of view. To address this issue, we in this paper establish the minimax rates of convergence for estimating the sparse SIR directions under various commonly used loss functions in the literature of sufficient dimension reduction. We also discover the possible trade-off between statistical guarantee and computational performance for sparse SIR. We finally propose an adaptive estimation scheme for sparse SIR which is computationally tractable and rate optimal. Numerical studies are carried out to confirm the theoretical properties of our proposed methods.




ad

Adaptive estimation of the rank of the coefficient matrix in high-dimensional multivariate response regression models

Xin Bing, Marten H. Wegkamp.

Source: The Annals of Statistics, Volume 47, Number 6, 3157--3184.

Abstract:
We consider the multivariate response regression problem with a regression coefficient matrix of low, unknown rank. In this setting, we analyze a new criterion for selecting the optimal reduced rank. This criterion differs notably from the one proposed in Bunea, She and Wegkamp (Ann. Statist. 39 (2011) 1282–1309) in that it does not require estimation of the unknown variance of the noise, nor does it depend on a delicate choice of a tuning parameter. We develop an iterative, fully data-driven procedure that adapts to the optimal signal-to-noise ratio. This procedure finds the true rank in a few steps with overwhelming probability. At each step, our estimate increases, while at the same time it does not exceed the true rank. Our finite sample results hold for any sample size and any dimension, even when the number of responses and of covariates grow much faster than the number of observations. We perform an extensive simulation study that confirms our theoretical findings. The new method performs better and is more stable than the procedure of Bunea, She and Wegkamp (Ann. Statist. 39 (2011) 1282–1309) in both low- and high-dimensional settings.




ad

Additive models with trend filtering

Veeranjaneyulu Sadhanala, Ryan J. Tibshirani.

Source: The Annals of Statistics, Volume 47, Number 6, 3032--3068.

Abstract:
We study additive models built with trend filtering, that is, additive models whose components are each regularized by the (discrete) total variation of their $k$th (discrete) derivative, for a chosen integer $k\geq0$. This results in $k$th degree piecewise polynomial components (e.g., $k=0$ gives piecewise constant components, $k=1$ gives piecewise linear, $k=2$ gives piecewise quadratic, etc.). Analogous to its advantages in the univariate case, additive trend filtering has favorable theoretical and computational properties, thanks in large part to the localized nature of the (discrete) total variation regularizer that it uses. On the theory side, we derive fast error rates for additive trend filtering estimates, and show these rates are minimax optimal when the underlying function is additive and has component functions whose derivatives are of bounded variation. We also show that these rates are unattainable by additive smoothing splines (and by additive models built from linear smoothers, in general). On the computational side, we use backfitting to leverage fast univariate trend filtering solvers; we also describe a new backfitting algorithm whose iterations can be run in parallel, which (as far as we can tell) is the first of its kind. Lastly, we present a number of experiments to examine the empirical performance of trend filtering.




ad

Phase transition in the spiked random tensor with Rademacher prior

Wei-Kuo Chen.

Source: The Annals of Statistics, Volume 47, Number 5, 2734--2756.

Abstract:
We consider the problem of detecting a deformation from a symmetric Gaussian random $p$-tensor $(p\geq3)$ with a rank-one spike sampled from the Rademacher prior. Recently, in Lesieur et al. (Barbier, Krzakala, Macris, Miolane and Zdeborová (2017)), it was proved that there exists a critical threshold $\beta_{p}$ so that when the signal-to-noise ratio exceeds $\beta_{p}$, one can distinguish the spiked and unspiked tensors and weakly recover the prior via the minimal mean-square-error method. On the other side, Perry, Wein and Bandeira (Perry, Wein and Bandeira (2017)) proved that there exists a $\beta_{p}'<\beta_{p}$ such that any statistical hypothesis test cannot distinguish these two tensors, in the sense that their total variation distance asymptotically vanishes, when the signal-to-noise ratio is less than $\beta_{p}'$. In this work, we show that $\beta_{p}$ is indeed the critical threshold that strictly separates the distinguishability and indistinguishability between the two tensors under the total variation distance. Our approach is based on a subtle analysis of the high temperature behavior of the pure $p$-spin model with Ising spin, arising initially from the field of spin glasses. In particular, we identify the signal-to-noise criticality $\beta_{p}$ as the critical temperature, distinguishing the high and low temperature behavior, of the Ising pure $p$-spin mean-field spin glass model.




ad

Doubly penalized estimation in additive regression with high-dimensional data

Zhiqiang Tan, Cun-Hui Zhang.

Source: The Annals of Statistics, Volume 47, Number 5, 2567--2600.

Abstract:
Additive regression provides an extension of linear regression by modeling the signal of a response as a sum of functions of covariates of relatively low complexity. We study penalized estimation in high-dimensional nonparametric additive regression where functional semi-norms are used to induce smoothness of component functions and the empirical $L_{2}$ norm is used to induce sparsity. The functional semi-norms can be of Sobolev or bounded variation types and are allowed to be different amongst individual component functions. We establish oracle inequalities for the predictive performance of such methods under three simple technical conditions: a sub-Gaussian condition on the noise, a compatibility condition on the design and the functional classes under consideration and an entropy condition on the functional classes. For random designs, the sample compatibility condition can be replaced by its population version under an additional condition to ensure suitable convergence of empirical norms. In homogeneous settings where the complexities of the component functions are of the same order, our results provide a spectrum of minimax convergence rates, from the so-called slow rate without requiring the compatibility condition to the fast rate under the hard sparsity or certain $L_{q}$ sparsity to allow many small components in the true regression function. These results significantly broaden and sharpen existing ones in the literature.




ad

Bayes and empirical-Bayes multiplicity adjustment in the variable-selection problem

James G. Scott, James O. Berger

Source: Ann. Statist., Volume 38, Number 5, 2587--2619.

Abstract:
This paper studies the multiplicity-correction effect of standard Bayesian variable-selection priors in linear regression. Our first goal is to clarify when, and how, multiplicity correction happens automatically in Bayesian analysis, and to distinguish this correction from the Bayesian Ockham’s-razor effect. Our second goal is to contrast empirical-Bayes and fully Bayesian approaches to variable selection through examples, theoretical results and simulations. Considerable differences between the two approaches are found. In particular, we prove a theorem that characterizes a surprising asymptotic discrepancy between fully Bayes and empirical Bayes. This discrepancy arises from a different source than the failure to account for hyperparameter uncertainty in the empirical-Bayes estimate. Indeed, even at the extreme, when the empirical-Bayes estimate converges asymptotically to the true variable-inclusion probability, the potential for a serious difference remains.




ad

metadata

Data about data. In common usage as a generic term, metadata stores data about the structure, context and meaning of raw data, and computers use it to help organize and interpret data, turning it into meaningful information. The World Wide Web has driven usage of metadata to new levels, as the tags used in HTML and XML are a form of metadata, although the meaning they convey is often limited because the metadata means different things to different people.




ad

Modifying the Chi-square and the CMH test for population genetic inference: Adapting to overdispersion

Kerstin Spitzer, Marta Pelizzola, Andreas Futschik.

Source: The Annals of Applied Statistics, Volume 14, Number 1, 202--220.

Abstract:
Evolve and resequence studies provide a popular approach to simulate evolution in the lab and explore its genetic basis. In this context, Pearson’s chi-square test, Fisher’s exact test as well as the Cochran–Mantel–Haenszel test are commonly used to infer genomic positions affected by selection from temporal changes in allele frequency. However, the null model associated with these tests does not match the null hypothesis of actual interest. Indeed, due to genetic drift and possibly other additional noise components such as pool sequencing, the null variance in the data can be substantially larger than accounted for by these common test statistics. This leads to $p$-values that are systematically too small and, therefore, a huge number of false positive results. Even if the ranking rather than the actual $p$-values is of interest, a naive application of the mentioned tests will give misleading results, as the amount of overdispersion varies from locus to locus. We therefore propose adjusted statistics that take the overdispersion into account while keeping the formulas simple. This is particularly useful in genome-wide applications, where millions of SNPs can be handled with little computational effort. We then apply the adapted test statistics to real data from Drosophila and investigate how information from intermediate generations can be included when available. We also discuss further applications such as genome-wide association studies based on pool sequencing data and tests for local adaptation.




ad

Spatial modeling of trends in crime over time in Philadelphia

Cecilia Balocchi, Shane T. Jensen.

Source: The Annals of Applied Statistics, Volume 13, Number 4, 2235--2259.

Abstract:
Understanding the relationship between change in crime over time and the geography of urban areas is an important problem for urban planning. Accurate estimation of changing crime rates throughout a city would aid law enforcement as well as enable studies of the association between crime and the built environment. Bayesian modeling is a promising direction since areal data require principled sharing of information to address spatial autocorrelation between proximal neighborhoods. We develop several Bayesian approaches to spatial sharing of information between neighborhoods while modeling trends in crime counts over time. We apply our methodology to estimate changes in crime throughout Philadelphia over the 2006-15 period while also incorporating spatially-varying economic and demographic predictors. We find that the local shrinkage imposed by a conditional autoregressive model has substantial benefits in terms of out-of-sample predictive accuracy of crime. We also explore the possibility of spatial discontinuities between neighborhoods that could represent natural barriers or aspects of the built environment.




ad

Estimating the rate constant from biosensor data via an adaptive variational Bayesian approach

Ye Zhang, Zhigang Yao, Patrik Forssén, Torgny Fornstedt.

Source: The Annals of Applied Statistics, Volume 13, Number 4, 2011--2042.

Abstract:
The means to obtain the rate constants of a chemical reaction is a fundamental open problem in both science and the industry. Traditional techniques for finding rate constants require either chemical modifications of the reactants or indirect measurements. The rate constant map method is a modern technique to study binding equilibrium and kinetics in chemical reactions. Finding a rate constant map from biosensor data is an ill-posed inverse problem that is usually solved by regularization. In this work, rather than finding a deterministic regularized rate constant map that does not provide uncertainty quantification of the solution, we develop an adaptive variational Bayesian approach to estimate the distribution of the rate constant map, from which some intrinsic properties of a chemical reaction can be explored, including information about rate constants. Our new approach is more realistic than the existing approaches used for biosensors and allows us to estimate the dynamics of the interactions, which are usually hidden in a deterministic approximate solution. We verify the performance of the new proposed method by numerical simulations, and compare it with the Markov chain Monte Carlo algorithm. The results illustrate that the variational method can reliably capture the posterior distribution in a computationally efficient way. Finally, the developed method is also tested on the real biosensor data (parathyroid hormone), where we provide two novel analysis tools—the thresholding contour map and the high order moment map—to estimate the number of interactions as well as their rate constants.




ad

A semiparametric modeling approach using Bayesian Additive Regression Trees with an application to evaluate heterogeneous treatment effects

Bret Zeldow, Vincent Lo Re III, Jason Roy.

Source: The Annals of Applied Statistics, Volume 13, Number 3, 1989--2010.

Abstract:
Bayesian Additive Regression Trees (BART) is a flexible machine learning algorithm capable of capturing nonlinearities between an outcome and covariates and interactions among covariates. We extend BART to a semiparametric regression framework in which the conditional expectation of an outcome is a function of treatment, its effect modifiers, and confounders. The confounders are allowed to have unspecified functional form, while treatment and effect modifiers that are directly related to the research question are given a linear form. The result is a Bayesian semiparametric linear regression model where the posterior distribution of the parameters of the linear part can be interpreted as in parametric Bayesian regression. This is useful in situations where a subset of the variables are of substantive interest and the others are nuisance variables that we would like to control for. An example of this occurs in causal modeling with the structural mean model (SMM). Under certain causal assumptions, our method can be used as a Bayesian SMM. Our methods are demonstrated with simulation studies and an application to a dataset involving adults with HIV/Hepatitis C coinfection who newly initiate antiretroviral therapy. The methods are available in an R package called semibart.




ad

Radio-iBAG: Radiomics-based integrative Bayesian analysis of multiplatform genomic data

Youyi Zhang, Jeffrey S. Morris, Shivali Narang Aerry, Arvind U. K. Rao, Veerabhadran Baladandayuthapani.

Source: The Annals of Applied Statistics, Volume 13, Number 3, 1957--1988.

Abstract:
Technological innovations have produced large multi-modal datasets that include imaging and multi-platform genomics data. Integrative analyses of such data have the potential to reveal important biological and clinical insights into complex diseases like cancer. In this paper, we present Bayesian approaches for integrative analysis of radiological imaging and multi-platform genomic data, wherein our goals are to simultaneously identify genomic and radiomic, that is, radiology-based imaging markers, along with the latent associations between these two modalities, and to detect the overall prognostic relevance of the combined markers. For this task, we propose Radio-iBAG: Radiomics-based Integrative Bayesian Analysis of Multiplatform Genomic Data, a multi-scale Bayesian hierarchical model that involves several innovative strategies: it incorporates integrative analysis of multi-platform genomic data sets to capture fundamental biological relationships; explores the associations between radiomic markers accompanying genomic information with clinical outcomes; and detects genomic and radiomic markers associated with clinical prognosis. We also introduce the use of sparse Principal Component Analysis (sPCA) to extract a sparse set of approximately orthogonal meta-features each containing information from a set of related individual radiomic features, reducing dimensionality and combining like features. Our methods are motivated by and applied to The Cancer Genome Atlas glioblastoma multiforme data set, wherein we integrate magnetic resonance imaging-based biomarkers along with genomic, epigenomic and transcriptomic data. Our model identifies important magnetic resonance imaging features and the associated genomic platforms that are related with patient survival times.




ad

Wavelet spectral testing: Application to nonstationary circadian rhythms

Jessica K. Hargreaves, Marina I. Knight, Jon W. Pitchford, Rachael J. Oakenfull, Sangeeta Chawla, Jack Munns, Seth J. Davis.

Source: The Annals of Applied Statistics, Volume 13, Number 3, 1817--1846.

Abstract:
Rhythmic data are ubiquitous in the life sciences. Biologists need reliable statistical tests to identify whether a particular experimental treatment has caused a significant change in a rhythmic signal. When these signals display nonstationary behaviour, as is common in many biological systems, the established methodologies may be misleading. Therefore, there is a real need for new methodology that enables the formal comparison of nonstationary processes. As circadian behaviour is best understood in the spectral domain, here we develop novel hypothesis testing procedures in the (wavelet) spectral domain, embedding replicate information when available. The data are modelled as realisations of locally stationary wavelet processes, allowing us to define and rigorously estimate their evolutionary wavelet spectra. Motivated by three complementary applications in circadian biology, our new methodology allows the identification of three specific types of spectral difference. We demonstrate the advantages of our methodology over alternative approaches, by means of a comprehensive simulation study and real data applications, using both published and newly generated circadian datasets. In contrast to the current standard methodologies, our method successfully identifies differences within the motivating circadian datasets, and facilitates wider ranging analyses of rhythmic biological data in general.




ad

Local differential privacy: Elbow effect in optimal density estimation and adaptation over Besov ellipsoids

Cristina Butucea, Amandine Dubois, Martin Kroll, Adrien Saumard.

Source: Bernoulli, Volume 26, Number 3, 1727--1764.

Abstract:
We address the problem of non-parametric density estimation under the additional constraint that only privatised data are allowed to be published and available for inference. For this purpose, we adopt a recent generalisation of classical minimax theory to the framework of local $\alpha$-differential privacy and provide a lower bound on the rate of convergence over Besov spaces $\mathcal{B}^{s}_{pq}$ under mean integrated $\mathbb{L}^{r}$-risk. This lower bound is deteriorated compared to the standard setup without privacy, and reveals a twofold elbow effect. In order to fulfill the privacy requirement, we suggest adding suitably scaled Laplace noise to empirical wavelet coefficients. Upper bounds within (at most) a logarithmic factor are derived under the assumption that $\alpha$ stays bounded as $n$ increases: A linear but non-adaptive wavelet estimator is shown to attain the lower bound whenever $p\geq r$ but provides a slower rate of convergence otherwise. An adaptive non-linear wavelet estimator with appropriately chosen smoothing parameters and thresholding is shown to attain the lower bound within a logarithmic factor for all cases.
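
The privatisation step mentioned in this abstract, adding scaled Laplace noise to wavelet coefficients, can be sketched as follows. This is an illustration under stated assumptions rather than the paper's construction: it uses Haar scaling functions on [0, 1] at a single resolution level, calibrates the noise with the standard Laplace mechanism (scale equal to L1 sensitivity divided by the privacy level), and all function names are hypothetical.

```python
import random


def privatize_haar_coefficients(x, level, alpha, rng):
    """Release the noisy vector (phi_{level,k}(x))_k for one individual.

    phi_{level,k} is the Haar scaling function: height 2^(level/2) on the
    k-th dyadic bin of [0, 1] and zero elsewhere.
    """
    n_bins = 2 ** level
    height = 2 ** (level / 2)
    k_hit = min(int(x * n_bins), n_bins - 1)   # bin containing x
    # Two inputs change at most two entries, each by `height`, so the
    # L1 sensitivity of the released vector is 2 * height.
    scale = 2 * height / alpha
    released = []
    for k in range(n_bins):
        value = height if k == k_hit else 0.0
        # A difference of two i.i.d. exponentials with mean `scale`
        # is Laplace-distributed with that scale.
        noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
        released.append(value + noise)
    return released


def private_coefficient_estimates(sample, level, alpha, seed=0):
    """Average the privatised vectors to estimate the coefficients."""
    rng = random.Random(seed)
    totals = [0.0] * (2 ** level)
    for x in sample:
        noisy = privatize_haar_coefficients(x, level, alpha, rng)
        for k, z in enumerate(noisy):
            totals[k] += z
    return [t / len(sample) for t in totals]


rng_data = random.Random(1)
sample = [rng_data.random() for _ in range(20000)]   # uniform on [0, 1)
estimates = private_coefficient_estimates(sample, level=2, alpha=1.0)
print([round(e, 2) for e in estimates])
```

For uniform data every true coefficient at level 2 equals 0.5, so the printed estimates should scatter around that value. The noise scale grows with the resolution level, which gives some intuition for why privacy slows the attainable convergence rate.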




ad

From the coalfields of Somerset to the Adelaide Hills and beyond : the story of the Hewish Family : three centuries of one family's journey through time / Maureen Brown.

Hewish Henry -- Family.




ad

The Yangya Hicks : tales from the Hicks family of Yangya near Gladstone, South Australia, written from the 12th of May 1998 / by Joyce Coralie Hale (nee Hicks) (28.12.1923-17.12.2003).

Hicks (Family)




ad

Gordon of Huntly : heraldic heritage : cadets to South Australia / Robin Gregory Gordon.

South Australia -- Genealogy.




ad

Our Lady of Grace family page of history : a bookweek bicentennial project / edited by Janeen Brian.

Our Lady of Grace School (Glengowrie, S.A.)




ad

From alms house to first nation : a story of my ancestors in South Australia : a Sherwell family story / by Pamela Coad (nee Sherwell).

Sherwell (Family)




ad

Item 05: William Hilton Saunders WWI 1916-1919 address book with poetry




ad

Item 01: Ellis Ashmead-Bartlett diary, 1915-1917




ad

Boeing says it's about to start building the 737 Max plane again in the middle of the coronavirus pandemic, even though it already has more planes than it can deliver

Boeing CEO Dave Calhoun said the company was aiming to resume production this month, despite the ongoing grounding and coronavirus pandemic.





ad

Chaffetz: I don't understand why Adam Schiff continues to have a security clearance

Fox News contributor Jason Chaffetz and Andy McCarthy react to House Intelligence transcripts on Russia probe.





ad

As Trump returns to the road, some Democrats want to bust Biden out of his basement

While President Donald Trump traveled to the battleground state of Arizona this week, his Democratic opponent for the White House, Joe Biden, campaigned from his basement as he has done throughout the coronavirus pandemic. The freeze on in-person campaigning during the outbreak has had an upside for Biden, giving the former vice president more time to court donors and shielding him from on-the-trail gaffes. "I personally would like to see him out more because he's in his element when he's meeting people," said Tom Sacks-Wilner, a fundraiser for Biden who is on the campaign's finance committee.





ad

‘Selfish, tribal and divided’: Barack Obama warns of changes to American way of life in leaked audio slamming Trump administration

Barack Obama said the “rule of law is at risk” following the justice department’s decision to drop charges against former Trump advisor Mike Flynn, as he issued a stark warning about the long-term impact on the American way of life by his successor.





ad

Coronavirus: Chinese official admits health system weaknesses

China says it will improve public health systems after criticism of its early response to the virus.





ad

Additive Multivariate Gaussian Processes for Joint Species Distribution Modeling with Heterogeneous Data

Jarno Vanhatalo, Marcelo Hartmann, Lari Veneranta.

Source: Bayesian Analysis, Volume 15, Number 2, 415--447.

Abstract:
Species distribution models (SDM) are a key tool in ecology, conservation and management of natural resources. Two key components of the state-of-the-art SDMs are the description for species distribution response along environmental covariates and the spatial random effect that captures deviations from the distribution patterns explained by environmental covariates. Joint species distribution models (JSDMs) additionally include interspecific correlations which have been shown to improve their descriptive and predictive performance compared to single species models. However, current JSDMs are restricted to the hierarchical generalized linear modeling framework. Their limitation is that parametric models have trouble in explaining changes in abundance due, for example, to highly non-linear physical tolerance limits, which is particularly important when predicting species distribution in new areas or under scenarios of environmental change. On the other hand, semi-parametric response functions have been shown to improve the predictive performance of SDMs in these tasks in single species models. Here, we propose JSDMs where the responses to environmental covariates are modeled with additive multivariate Gaussian processes coded as linear models of coregionalization. These allow inference for a wide range of functional forms and interspecific correlations between the responses. We also propose an efficient approach for inference with Laplace approximation and parameterization of the interspecific covariance matrices on the Euclidean space. We demonstrate the benefits of our model with two small scale examples and one real world case study. We use cross-validation to compare the proposed model to analogous semi-parametric single species models and parametric single and joint species models in interpolation and extrapolation tasks. The proposed model outperforms the alternative models in all cases. We also show that the proposed model can be seen as an extension of the current state-of-the-art JSDMs to semi-parametric models.




ad

Adaptive Bayesian Nonparametric Regression Using a Kernel Mixture of Polynomials with Application to Partial Linear Models

Fangzheng Xie, Yanxun Xu.

Source: Bayesian Analysis, Volume 15, Number 1, 159--186.

Abstract:
We propose a kernel mixture of polynomials prior for Bayesian nonparametric regression. The regression function is modeled by local averages of polynomials with kernel mixture weights. We obtain the minimax-optimal contraction rate of the full posterior distribution up to a logarithmic factor by estimating metric entropies of certain function classes. Under the assumption that the degree of the polynomials is larger than the unknown smoothness level of the true function, the posterior contraction behavior can adapt to this smoothness level provided an upper bound is known. We also provide a frequentist sieve maximum likelihood estimator with a near-optimal convergence rate. We further investigate the application of the kernel mixture of polynomials to partial linear models and obtain both the near-optimal rate of contraction for the nonparametric component and the Bernstein-von Mises limit (i.e., asymptotic normality) of the parametric component. The proposed method is illustrated with numerical examples and shows superior performance in terms of computational efficiency, accuracy, and uncertainty quantification compared to the local polynomial regression, DiceKriging, and the robust Gaussian stochastic process.