as

Top three Satou Sabally moments: Sharpshooter's 33-point game in Pullman was unforgettable

Since the day she stepped on campus, Satou Sabally's game has turned heads — and for good reason. She's had many memorable moments in a Duck uniform, including a standout performance against the USA Women in Nov. 2019, a monster game against Cal in Jan. 2020 and a career performance in Pullman in Jan. 2019.




as

Kobe, Duncan, Garnett headline Basketball Hall of Fame class

Kobe Bryant was already immortal. Bryant and fellow NBA greats Tim Duncan and Kevin Garnett headlined a nine-person group announced Saturday as this year’s class of enshrinees into the Naismith Memorial Basketball Hall of Fame. Two-time NBA champion coach Rudy Tomjanovich finally got his call, as did longtime Baylor women’s coach Kim Mulkey, 1,000-game winner Barbara Stevens of Bentley and three-time Final Four coach Eddie Sutton.




as

The Class of 2020: A look at basketball's new Hall of Famers

A look at the newest members of the Naismith Memorial Basketball Hall of Fame, announced on Saturday:




as

Texas hires Schaefer from Mississippi State

Texas moved quickly to hire a new women's basketball coach, luring Vic Schaefer away from powerhouse Mississippi State on Sunday. Texas athletic director Chris Del Conte announced the move by tweeting a picture of himself with Schaefer and his family holding up the “Hook'em Horns” hand signal. The move comes just two days after Texas dismissed eight-year coach Karen Aston, who had only one losing season in her tenure and had led the Longhorns to the Sweet 16 or farther four times.




as

New women's coach Schaefer answering a 'calling' to Texas

For Vic Schaefer, the decision to take over the Texas women's basketball program was profoundly personal. “It was a calling,” Schaefer said Monday, noting the old Austin hospital building where he was born is just across the street from where the Longhorns play at the Frank Erwin Center. Texas quickly snatched up Schaefer on Sunday, just two days after athletic director Chris Del Conte announced coach Karen Aston would not be retained after eight seasons.




as

Gamecocks’ Boston wins Leslie Award as nation’s best center

COLUMBIA, S.C. (AP) -- South Carolina freshman Aliyah Boston has won the Lisa Leslie Award given to the top center in women’s college basketball.




as

Sabrina Ionescu, Ruthy Hebard, Satou Sabally on staying connected, WNBA Draft, Oregon's historic season

Pac-12 Networks' Ashley Adamson catches up with Oregon's "Big 3" of Sabrina Ionescu, Ruthy Hebard and Satou Sabally to hear how they're adjusting to the new world without sports while still preparing for the WNBA Draft on April 17. They also share how they're staying hungry for basketball during the hiatus.




as

WNBA Draft Profile: Transcendent guard Sabrina Ionescu projects as top pick

After sweeping every national player of the year award, Sabrina Ionescu is off to the WNBA level where her skills will make an instant impact — not just to her new team but the league as a whole. She averaged 17.5 points, 8.6 rebounds and 9.1 assists for the Ducks in 2019-20, rewriting her own NCAA career triple-double record and becoming the first in college basketball history with at least 2,000 points, 1,000 rebounds and 1,000 assists.




as

WNBA Draft Profile: Productive forward Ruthy Hebard has uncanny handling, scoring, rebounding ability

Ruthy Hebard, who ranks 2nd in Oregon history in points (2,368) and 3rd in rebounds (1,299), prepares to play in the WNBA following four years in Eugene. Hebard is the Oregon and Pac-12 all-time leader in career field-goal percentage (65.1) and averaged 17.3 points per game and a career-high 9.6 rebounds per game as a senior.




as

WNBA Draft Profile: Do-it-all OSU talent Mikayla Pivec has her sights set on a pro breakout

Oregon State guard Mikayla Pivec is the epitome of a versatile player. Her 1,030 career rebounds were the most in school history, and she finished just one assist shy of becoming the first in OSU history to tally 1,500 points, 1,000 rebounds and 500 assists. She'll head to the WNBA looking to showcase her talents at the next level following the 2020 WNBA Draft.




as

Inside Sabrina Ionescu and Ruthy Hebard's lasting bond on quick look of 'Our Stories'

Learn how Oregon stars Sabrina Ionescu and Ruthy Hebard developed a lasting bond as college freshmen and carried that through storied four-year careers for the Ducks. Watch "Our Stories Unfinished Business: Sabrina Ionescu and Ruthy Hebard" debuting Wednesday, April 15 at 7 p.m. PT/ 8 p.m. MT on Pac-12 Network.




as

Mississippi State hires Nikki McCray-Penson as women's coach

Mississippi State hired former Old Dominion women’s basketball coach Nikki McCray-Penson to replace Vic Schaefer as the Bulldogs’ head coach. Athletic director John Cohen called McCray-Penson “a proven winner who will lead one of the best programs in the nation” on the department’s website. McCray-Penson, a former Tennessee star and Women’s Basketball Hall of Famer, said it’s been a dream to coach in the Southeastern Conference and she’s “grateful and blessed for this incredible honor and opportunity.”




as

Ruthy Hebard, Sabrina Ionescu 'represent everything that is great about basketball'

Ruthy Hebard and Sabrina Ionescu have had a remarkable four years together in Eugene, rewriting the history books and pushing the Ducks into the national spotlight. Catch the debut of "Our Stories Unfinished Business: Sabrina Ionescu and Ruthy Hebard" at Wednesday, April 15 at 7 p.m. PT/ 8 p.m. MT on Pac-12 Network.




as

Kentucky women add guards Massengill, Benton as transfers

LEXINGTON, Ky. (AP) -- Sophomore guards Jazmine Massengill and Robyn Benton transferred to Kentucky from Southeastern Conference rivals Wednesday.




as

Dr. Michelle Tom shares journey from ASU women's hoops to treating COVID-19 patients

Pac-12 Networks' Ashley Adamson speaks with former Arizona State women's basketball player Michelle Tom, who is now a doctor treating COVID-19 patients Winslow Indian Health Care Center and Little Colorado Medical Center in Eastern Arizona.




as

Chicago State women's basketball coach Misty Opat resigns

CHICAGO (AP) -- Chicago State women’s coach Misty Opat resigned Thursday after two seasons and a 3-55 record.




as

Ivey introduced as new Notre Dame coach, succeeding McGraw

Niele Ivey is coming home.




as

Detroit Mercy hires Gilbert as women's basketball coach

DETROIT (AP) -- Detroit Mercy hired AnnMarie Gilbert as women’s basketball coach.




as

Natalie Chou on why she took a stand against anti-Asian racism in wake of coronavirus

During Wednesday's "Pac-12 Perspective" podcast, Natalie Chou shared why she is using her platform to speak out against racism she sees in her community related to the novel coronavirus.




as

UCLA's Natalie Chou on her role models, inspiring Asian-American girls in basketball

Pac-12 Networks' Mike Yam has a conversation with UCLA's Natalie Chou during Wednesday's "Pac-12 Perspective" podcast. Chou reflects on her role models, passion for basketball and how her mom has made a big impact on her hoops career.




as

Oregon State's Aleah Goodman, Maddie Washington reflect on earning 2020 Pac-12 Sportsmanship Award

The Pac-12 Student-Athlete Advisory Committee voted to award the Oregon State women’s basketball team with the Pac-12 Sportsmanship Award for the 2019-20 season, honoring their character and sportsmanship before a rivalry game against Oregon in Jan. 2020 -- the day Kobe Bryant, his daughter, Gigi, and seven others passed away in a helicopter crash in Southern California. In the above video, Aleah Goodman and Madison Washington share how the teams came together as one in a circle of prayer before the game.




as

Oregon State women's basketball receives Pac-12 Sportsmanship Award for supporting rival Oregon in tragedy

On the day Kobe Bryant suddenly passed away, the Beavers embraced their rivals at midcourt in a moment of strength to support the Ducks, many of whom had personal connections to Bryant and his daughter, Gigi. For this, Oregon State is the 2020 recipient of the Pac-12 Sportsmanship Award.




as

Natalie Chou breaks through stereotypes, inspires young Asian American girls on 'Our Stories' quick look

Watch the debut of "Our Stories - Natalie Chou" on Sunday, May 10 at 12:30 p.m. PT/ 1:30 p.m. MT on Pac-12 Network.




as

Stanford's Tara VanDerveer on Haley Jones' versatile freshman year: 'It was really incredible'

During Friday's "Pac-12 Perspective," Stanford head coach Tara VanDerveer spoke about Haley Jones' positionless game and how the Cardinal used the dynamic freshman in 2019-20. Download and listen wherever you get your podcasts.




as

Pac-12 women's basketball student-athletes reflect on the influence of their moms ahead of Mother's Day

Pac-12 student-athletes give shout-outs to their moms ahead of Mother's Day on May 10th, 2020 including UCLA's Michaela Onyenwere, Oregon's Sabrina Ionescu and Satou Sabally, Arizona's Aari McDonald, Cate Reese, and Lacie Hull, Stanford's Kiana Williams, USC's Endyia Rogers, and Aliyah Jeune, and Utah's Brynna Maxwell.




as

On the Letac-Massam conjecture and existence of high dimensional Bayes estimators for graphical models

Emanuel Ben-David, Bala Rajaratnam.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 580--604.

Abstract:
The Wishart distribution defined on the open cone of positive-definite matrices plays a central role in multivariate analysis and multivariate distribution theory. Its domain of parameters is often referred to as the Gindikin set. In recent years, varieties of useful extensions of the Wishart distribution have been proposed in the literature for the purposes of studying Markov random fields and graphical models. In particular, generalizations of the Wishart distribution, referred to as Type I and Type II (graphical) Wishart distributions introduced by Letac and Massam in Annals of Statistics (2007) play important roles in both frequentist and Bayesian inference for Gaussian graphical models. These distributions have been especially useful in high-dimensional settings due to the flexibility offered by their multiple-shape parameters. Concerning Type I and Type II Wishart distributions, a conjecture of Letac and Massam concerns the domain of multiple-shape parameters of these distributions. The conjecture also has implications for the existence of Bayes estimators corresponding to these high dimensional priors. The conjecture, which was first posed in the Annals of Statistics, has now been an open problem for about 10 years. In this paper, we give a necessary condition for the Letac and Massam conjecture to hold. More precisely, we prove that if the Letac and Massam conjecture holds on a decomposable graph, then no two separators of the graph can be nested within each other. For this, we analyze Type I and Type II Wishart distributions on appropriate Markov equivalent perfect DAG models and succeed in deriving the aforementioned necessary condition. This condition in particular identifies a class of counterexamples to the conjecture.




as

Drift estimation for stochastic reaction-diffusion systems

Gregor Pasemann, Wilhelm Stannat.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 547--579.

Abstract:
A parameter estimation problem for a class of semilinear stochastic evolution equations is considered. Conditions for consistency and asymptotic normality are given in terms of growth and continuity properties of the nonlinear part. Emphasis is put on the case of stochastic reaction-diffusion systems. Robustness results for statistical inference under model uncertainty are provided.




as

Parseval inequalities and lower bounds for variance-based sensitivity indices

Olivier Roustant, Fabrice Gamboa, Bertrand Iooss.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 386--412.

Abstract:
The so-called polynomial chaos expansion is widely used in computer experiments. For example, it is a powerful tool to estimate Sobol’ sensitivity indices. In this paper, we consider generalized chaos expansions built on general tensor Hilbert basis. In this frame, we revisit the computation of the Sobol’ indices with Parseval equalities and give general lower bounds for these indices obtained by truncation. The case of the eigenfunctions system associated with a Poincaré differential operator leads to lower bounds involving the derivatives of the analyzed function and provides an efficient tool for variable screening. These lower bounds are put in action both on toy and real life models demonstrating their accuracy.




as

Asymptotic properties of the maximum likelihood and cross validation estimators for transformed Gaussian processes

François Bachoc, José Betancourt, Reinhard Furrer, Thierry Klein.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 1962--2008.

Abstract:
The asymptotic analysis of covariance parameter estimation of Gaussian processes has been subject to intensive investigation. However, this asymptotic analysis is very scarce for non-Gaussian processes. In this paper, we study a class of non-Gaussian processes obtained by regular non-linear transformations of Gaussian processes. We provide the increasing-domain asymptotic properties of the (Gaussian) maximum likelihood and cross validation estimators of the covariance parameters of a non-Gaussian process of this class. We show that these estimators are consistent and asymptotically normal, although they are defined as if the process was Gaussian. They do not need to model or estimate the non-linear transformation. Our results can thus be interpreted as a robustness of (Gaussian) maximum likelihood and cross validation towards non-Gaussianity. Our proofs rely on two technical results that are of independent interest for the increasing-domain asymptotic literature of spatial processes. First, we show that, under mild assumptions, coefficients of inverses of large covariance matrices decay at an inverse polynomial rate as a function of the corresponding observation location distances. Second, we provide a general central limit theorem for quadratic forms obtained from transformed Gaussian processes. Finally, our asymptotic results are illustrated by numerical simulations.




as

Asymptotics and optimal bandwidth for nonparametric estimation of density level sets

Wanli Qiao.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 302--344.

Abstract:
Bandwidth selection is crucial in the kernel estimation of density level sets. A risk based on the symmetric difference between the estimated and true level sets is usually used to measure their proximity. In this paper we provide an asymptotic $L^{p}$ approximation to this risk, where $p$ is characterized by the weight function in the risk. In particular the excess risk corresponds to an $L^{2}$ type of risk, and is adopted to derive an optimal bandwidth for nonparametric level set estimation of $d$-dimensional density functions ($dgeq 1$). A direct plug-in bandwidth selector is developed for kernel density level set estimation and its efficacy is verified in numerical studies.




as

Assessing prediction error at interpolation and extrapolation points

Assaf Rabinowicz, Saharon Rosset.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 272--301.

Abstract:
Common model selection criteria, such as $AIC$ and its variants, are based on in-sample prediction error estimators. However, in many applications involving predicting at interpolation and extrapolation points, in-sample error does not represent the relevant prediction error. In this paper new prediction error estimators, $tAI$ and $Loss(w_{t})$ are introduced. These estimators generalize previous error estimators, however are also applicable for assessing prediction error in cases involving interpolation and extrapolation. Based on these prediction error estimators, two model selection criteria with the same spirit as $AIC$ and Mallow’s $C_{p}$ are suggested. The advantages of our suggested methods are demonstrated in a simulation and a real data analysis of studies involving interpolation and extrapolation in linear mixed model and Gaussian process regression.




as

Model-based clustering with envelopes

Wenjing Wang, Xin Zhang, Qing Mai.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 82--109.

Abstract:
Clustering analysis is an important unsupervised learning technique in multivariate statistics and machine learning. In this paper, we propose a set of new mixture models called CLEMM (in short for Clustering with Envelope Mixture Models) that is based on the widely used Gaussian mixture model assumptions and the nascent research area of envelope methodology. Formulated mostly for regression models, envelope methodology aims for simultaneous dimension reduction and efficient parameter estimation, and includes a very recent formulation of envelope discriminant subspace for classification and discriminant analysis. Motivated by the envelope discriminant subspace pursuit in classification, we consider parsimonious probabilistic mixture models where the cluster analysis can be improved by projecting the data onto a latent lower-dimensional subspace. The proposed CLEMM framework and the associated envelope-EM algorithms thus provide foundations for envelope methods in unsupervised and semi-supervised learning problems. Numerical studies on simulated data and two benchmark data sets show significant improvement of our propose methods over the classical methods such as Gaussian mixture models, K-means and hierarchical clustering algorithms. An R package is available at https://github.com/kusakehan/CLEMM.




as

Bias correction in conditional multivariate extremes

Mikael Escobar-Bach, Yuri Goegebeur, Armelle Guillou.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 1773--1795.

Abstract:
We consider bias-corrected estimation of the stable tail dependence function in the regression context. To this aim, we first estimate the bias of a smoothed estimator of the stable tail dependence function, and then we subtract it from the estimator. The weak convergence, as a stochastic process, of the resulting asymptotically unbiased estimator of the conditional stable tail dependence function, correctly normalized, is established under mild assumptions, the covariate argument being fixed. The finite sample behaviour of our asymptotically unbiased estimator is then illustrated on a simulation study and compared to two alternatives, which are not bias corrected. Finally, our methodology is applied to a dataset of air pollution measurements.




as

Non-parametric adaptive estimation of order 1 Sobol indices in stochastic models, with an application to Epidemiology

Gwenaëlle Castellan, Anthony Cousien, Viet Chi Tran.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 50--81.

Abstract:
Global sensitivity analysis is a set of methods aiming at quantifying the contribution of an uncertain input parameter of the model (or combination of parameters) on the variability of the response. We consider here the estimation of the Sobol indices of order 1 which are commonly-used indicators based on a decomposition of the output’s variance. In a deterministic framework, when the same inputs always give the same outputs, these indices are usually estimated by replicated simulations of the model. In a stochastic framework, when the response given a set of input parameters is not unique due to randomness in the model, metamodels are often used to approximate the mean and dispersion of the response by deterministic functions. We propose a new non-parametric estimator without the need of defining a metamodel to estimate the Sobol indices of order 1. The estimator is based on warped wavelets and is adaptive in the regularity of the model. The convergence of the mean square error to zero, when the number of simulations of the model tend to infinity, is computed and an elbow effect is shown, depending on the regularity of the model. Applications in Epidemiology are carried to illustrate the use of non-parametric estimators.




as

Monotone least squares and isotonic quantiles

Alexandre Mösching, Lutz Dümbgen.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 24--49.

Abstract:
We consider bivariate observations $(X_{1},Y_{1}),ldots,(X_{n},Y_{n})$ such that, conditional on the $X_{i}$, the $Y_{i}$ are independent random variables. Precisely, the conditional distribution function of $Y_{i}$ equals $F_{X_{i}}$, where $(F_{x})_{x}$ is an unknown family of distribution functions. Under the sole assumption that $xmapsto F_{x}$ is isotonic with respect to stochastic order, one can estimate $(F_{x})_{x}$ in two ways: (i) For any fixed $y$ one estimates the antitonic function $xmapsto F_{x}(y)$ via nonparametric monotone least squares, replacing the responses $Y_{i}$ with the indicators $1_{[Y_{i}le y]}$. (ii) For any fixed $eta in (0,1)$ one estimates the isotonic quantile function $xmapsto F_{x}^{-1}(eta)$ via a nonparametric version of regression quantiles. We show that these two approaches are closely related, with (i) being more flexible than (ii). Then, under mild regularity conditions, we establish rates of convergence for the resulting estimators $hat{F}_{x}(y)$ and $hat{F}_{x}^{-1}(eta)$, uniformly over $(x,y)$ and $(x,eta)$ in certain rectangles as well as uniformly in $y$ or $eta$ for a fixed $x$.




as

A fast MCMC algorithm for the uniform sampling of binary matrices with fixed margins

Guanyang Wang.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 1690--1706.

Abstract:
Uniform sampling of binary matrix with fixed margins is an important and difficult problem in statistics, computer science, ecology and so on. The well-known swap algorithm would be inefficient when the size of the matrix becomes large or when the matrix is too sparse/dense. Here we propose the Rectangle Loop algorithm, a Markov chain Monte Carlo algorithm to sample binary matrices with fixed margins uniformly. Theoretically the Rectangle Loop algorithm is better than the swap algorithm in Peskun’s order. Empirically studies also demonstrates the Rectangle Loop algorithm is remarkablely more efficient than the swap algorithm.




as

Asymptotic seed bias in respondent-driven sampling

Yuling Yan, Bret Hanlon, Sebastien Roch, Karl Rohe.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 1577--1610.

Abstract:
Respondent-driven sampling (RDS) collects a sample of individuals in a networked population by incentivizing the sampled individuals to refer their contacts into the sample. This iterative process is initialized from some seed node(s). Sometimes, this selection creates a large amount of seed bias. Other times, the seed bias is small. This paper gains a deeper understanding of this bias by characterizing its effect on the limiting distribution of various RDS estimators. Using classical tools and results from multi-type branching processes [12], we show that the seed bias is negligible for the Generalized Least Squares (GLS) estimator and non-negligible for both the inverse probability weighted and Volz-Heckathorn (VH) estimators. In particular, we show that (i) above a critical threshold, VH converge to a non-trivial mixture distribution, where the mixture component depends on the seed node, and the mixture distribution is possibly multi-modal. Moreover, (ii) GLS converges to a Gaussian distribution independent of the seed node, under a certain condition on the Markov process. Numerical experiments with both simulated data and empirical social networks suggest that these results appear to hold beyond the Markov conditions of the theorems.




as

A Bayesian approach to disease clustering using restricted Chinese restaurant processes

Claudia Wehrhahn, Samuel Leonard, Abel Rodriguez, Tatiana Xifara.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 1449--1478.

Abstract:
Identifying disease clusters (areas with an unusually high incidence of a particular disease) is a common problem in epidemiology and public health. We describe a Bayesian nonparametric mixture model for disease clustering that constrains clusters to be made of adjacent areal units. This is achieved by modifying the exchangeable partition probability function associated with the Ewen’s sampling distribution. We call the resulting prior the Restricted Chinese Restaurant Process, as the associated full conditional distributions resemble those associated with the standard Chinese Restaurant Process. The model is illustrated using synthetic data sets and in an application to oral cancer mortality in Germany.




as

A fast and consistent variable selection method for high-dimensional multivariate linear regression with a large number of explanatory variables

Ryoya Oda, Hirokazu Yanagihara.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 1386--1412.

Abstract:
We put forward a variable selection method for selecting explanatory variables in a normality-assumed multivariate linear regression. It is cumbersome to calculate variable selection criteria for all subsets of explanatory variables when the number of explanatory variables is large. Therefore, we propose a fast and consistent variable selection method based on a generalized $C_{p}$ criterion. The consistency of the method is provided by a high-dimensional asymptotic framework such that the sample size and the sum of the dimensions of response vectors and explanatory vectors divided by the sample size tend to infinity and some positive constant which are less than one, respectively. Through numerical simulations, it is shown that the proposed method has a high probability of selecting the true subset of explanatory variables and is fast under a moderate sample size even when the number of dimensions is large.




as

Rate optimal Chernoff bound and application to community detection in the stochastic block models

Zhixin Zhou, Ping Li.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 1302--1347.

Abstract:
The Chernoff coefficient is known to be an upper bound of Bayes error probability in classification problem. In this paper, we will develop a rate optimal Chernoff bound on the Bayes error probability. The new bound is not only an upper bound but also a lower bound of Bayes error probability up to a constant factor. Moreover, we will apply this result to community detection in the stochastic block models. As a clustering problem, the optimal misclassification rate of community detection problem can be characterized by our rate optimal Chernoff bound. This can be formalized by deriving a minimax error rate over certain parameter space of stochastic block models, then achieving such an error rate by a feasible algorithm employing multiple steps of EM type updates.




as

Consistency and asymptotic normality of Latent Block Model estimators

Vincent Brault, Christine Keribin, Mahendra Mariadassou.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 1234--1268.

Abstract:
The Latent Block Model (LBM) is a model-based method to cluster simultaneously the $d$ columns and $n$ rows of a data matrix. Parameter estimation in LBM is a difficult and multifaceted problem. Although various estimation strategies have been proposed and are now well understood empirically, theoretical guarantees about their asymptotic behavior is rather sparse and most results are limited to the binary setting. We prove here theoretical guarantees in the valued settings. We show that under some mild conditions on the parameter space, and in an asymptotic regime where $log (d)/n$ and $log (n)/d$ tend to $0$ when $n$ and $d$ tend to infinity, (1) the maximum-likelihood estimate of the complete model (with known labels) is consistent and (2) the log-likelihood ratios are equivalent under the complete and observed (with unknown labels) models. This equivalence allows us to transfer the asymptotic consistency, and under mild conditions, asymptotic normality, to the maximum likelihood estimate under the observed model. Moreover, the variational estimator is also consistent and, under the same conditions, asymptotically normal.




as

A general drift estimation procedure for stochastic differential equations with additive fractional noise

Fabien Panloup, Samy Tindel, Maylis Varvenne.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 1075--1136.

Abstract:
In this paper we consider the drift estimation problem for a general differential equation driven by an additive multidimensional fractional Brownian motion, under ergodic assumptions on the drift coefficient. Our estimation procedure is based on the identification of the invariant measure, and we provide consistency results as well as some information about the convergence rate. We also give some examples of coefficients for which the identifiability assumption for the invariant measure is satisfied.




as

Conditional density estimation with covariate measurement error

Xianzheng Huang, Haiming Zhou.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 970--1023.

Abstract:
We consider estimating the density of a response conditioning on an error-prone covariate. Motivated by two existing kernel density estimators in the absence of covariate measurement error, we propose a method to correct the existing estimators for measurement error. Asymptotic properties of the resultant estimators under different types of measurement error distributions are derived. Moreover, we adjust bandwidths readily available from existing bandwidth selection methods developed for error-free data to obtain bandwidths for the new estimators. Extensive simulation studies are carried out to compare the proposed estimators with naive estimators that ignore measurement error, which also provide empirical evidence for the effectiveness of the proposed bandwidth selection methods. A real-life data example is used to illustrate implementation of these methods under practical scenarios. An R package, lpme, is developed for implementing all considered methods, which we demonstrate via an R code example in Appendix B.2.




as

On the distribution, model selection properties and uniqueness of the Lasso estimator in low and high dimensions

Karl Ewald, Ulrike Schneider.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 944--969.

Abstract:
We derive expressions for the finite-sample distribution of the Lasso estimator in the context of a linear regression model in low as well as in high dimensions by exploiting the structure of the optimization problem defining the estimator. In low dimensions, we assume full rank of the regressor matrix and present expressions for the cumulative distribution function as well as the densities of the absolutely continuous parts of the estimator. Our results are presented for the case of normally distributed errors, but do not hinge on this assumption and can easily be generalized. Additionally, we establish an explicit formula for the correspondence between the Lasso and the least-squares estimator. We derive analogous results for the distribution in less explicit form in high dimensions where we make no assumptions on the regressor matrix at all. In this setting, we also investigate the model selection properties of the Lasso and show that possibly only a subset of models might be selected by the estimator, completely independently of the observed response vector. Finally, we present a condition for uniqueness of the estimator that is necessary as well as sufficient.




as

On a Metropolis–Hastings importance sampling estimator

Daniel Rudolf, Björn Sprungk.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 857--889.

Abstract:
A classical approach for approximating expectations of functions w.r.t. partially known distributions is to compute the average of function values along a trajectory of a Metropolis–Hastings (MH) Markov chain. A key part in the MH algorithm is a suitable acceptance/rejection of a proposed state, which ensures the correct stationary distribution of the resulting Markov chain. However, the rejection of proposals causes highly correlated samples. In particular, when a state is rejected it is not taken any further into account. In contrast to that we consider a MH importance sampling estimator which explicitly incorporates all proposed states generated by the MH algorithm. The estimator satisfies a strong law of large numbers as well as a central limit theorem, and, in addition to that, we provide an explicit mean squared error bound. Remarkably, the asymptotic variance of the MH importance sampling estimator does not involve any correlation term in contrast to its classical counterpart. Moreover, although the analyzed estimator uses the same amount of information as the classical MH estimator, it can outperform the latter in scenarios of moderate dimensions as indicated by numerical experiments.




as

Modal clustering asymptotics with applications to bandwidth selection

Alessandro Casa, José E. Chacón, Giovanna Menardi.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 835--856.

Abstract:
Density-based clustering relies on the idea of linking groups to some specific features of the probability distribution underlying the data. The reference to a true, yet unknown, population structure allows framing the clustering problem in a standard inferential setting, where the concept of ideal population clustering is defined as the partition induced by the true density function. The nonparametric formulation of this approach, known as modal clustering, draws a correspondence between the groups and the domains of attraction of the density modes. Operationally, a nonparametric density estimate is required and a proper selection of the amount of smoothing, governing the shape of the density and hence possibly the modal structure, is crucial to identify the final partition. In this work, we address the issue of density estimation for modal clustering from an asymptotic perspective. A natural and easy to interpret metric to measure the distance between density-based partitions is discussed, its asymptotic approximation explored, and employed to study the problem of bandwidth selection for nonparametric modal clustering.




as

The bias of isotonic regression

Ran Dai, Hyebin Song, Rina Foygel Barber, Garvesh Raskutti.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 801--834.

Abstract:
We study the bias of the isotonic regression estimator. While there is extensive work characterizing the mean squared error of the isotonic regression estimator, relatively little is known about the bias. In this paper, we provide a sharp characterization, proving that the bias scales as $O(n^{-eta /3})$ up to log factors, where $1leq eta leq 2$ is the exponent corresponding to Hölder smoothness of the underlying mean. Importantly, this result only requires a strictly monotone mean and that the noise distribution has subexponential tails, without relying on symmetric noise or other restrictive assumptions.




as

Estimation of a semiparametric transformation model: A novel approach based on least squares minimization

Benjamin Colling, Ingrid Van Keilegom.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 769--800.

Abstract:
Consider the following semiparametric transformation model $Lambda_{ heta }(Y)=m(X)+varepsilon $, where $X$ is a $d$-dimensional covariate, $Y$ is a univariate response variable and $varepsilon $ is an error term with zero mean and independent of $X$. We assume that $m$ is an unknown regression function and that ${Lambda _{ heta }: heta inTheta }$ is a parametric family of strictly increasing functions. Our goal is to develop two new estimators of the transformation parameter $ heta $. The main idea of these two estimators is to minimize, with respect to $ heta $, the $L_{2}$-distance between the transformation $Lambda _{ heta }$ and one of its fully nonparametric estimators. We consider in particular the nonparametric estimator based on the least-absolute deviation loss constructed in Colling and Van Keilegom (2019). We establish the consistency and the asymptotic normality of the two proposed estimators of $ heta $. We also carry out a simulation study to illustrate and compare the performance of our new parametric estimators to that of the profile likelihood estimator constructed in Linton et al. (2008).




as

The bias and skewness of M -estimators in regression

Christopher Withers, Saralees Nadarajah

Source: Electron. J. Statist., Volume 4, 1--14.

Abstract:
We consider M estimation of a regression model with a nuisance parameter and a vector of other parameters. The unknown distribution of the residuals is not assumed to be normal or symmetric. Simple and easily estimated formulas are given for the dominant terms of the bias and skewness of the parameter estimates. For the linear model these are proportional to the skewness of the ‘independent’ variables. For a nonlinear model, its linear component plays the role of these independent variables, and a second term must be added proportional to the covariance of its linear and quadratic components. For the least squares estimate with normal errors this term was derived by Box [1]. We also consider the effect of a large number of parameters, and the case of random independent variables.




as

Path-Based Spectral Clustering: Guarantees, Robustness to Outliers, and Fast Algorithms

We consider the problem of clustering with the longest-leg path distance (LLPD) metric, which is informative for elongated and irregularly shaped clusters. We prove finite-sample guarantees on the performance of clustering with respect to this metric when random samples are drawn from multiple intrinsically low-dimensional clusters in high-dimensional space, in the presence of a large number of high-dimensional outliers. By combining these results with spectral clustering with respect to LLPD, we provide conditions under which the Laplacian eigengap statistic correctly determines the number of clusters for a large class of data sets, and prove guarantees on the labeling accuracy of the proposed algorithm. Our methods are quite general and provide performance guarantees for spectral clustering with any ultrametric. We also introduce an efficient, easy to implement approximation algorithm for the LLPD based on a multiscale analysis of adjacency graphs, which allows for the runtime of LLPD spectral clustering to be quasilinear in the number of data points.