si Cellular internet of things : from massive deployments to critical 5G applications By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: Liberg, Olof, 1943- author.Callnumber: OnlineISBN: 9780081029039 (electronic bk.) Full Article
si Carotenoids : properties, processing and applications By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: OnlineISBN: 9780128173145 (electronic bk.) Full Article
si Calcium signaling By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: OnlineISBN: 9783030124571 (electronic bk.) Full Article
si Breakfast cereals and how they are made : raw materials, processing, and production By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: OnlineISBN: 9780128120446 (electronic bk.) Full Article
si Brassica improvement : molecular, genetics and genomic perspectives By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: OnlineISBN: 9783030346942 (electronic bk.) Full Article
si Biomedical product development : bench to bedside By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: OnlineISBN: 9783030356262 (electronic bk.) Full Article
si Biology and physiology of freshwater neotropical fishes By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: OnlineISBN: 9780128158739 (electronic bk.) Full Article
si Biological invasions in South Africa By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: OnlineISBN: 9783030323943 (electronic bk.) Full Article
si Biodiversity of the Himalaya : Jammu and Kashmir State By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: OnlineISBN: 9789813291744 (electronic bk.) Full Article
si Beyond our genes : pathophysiology of gene and environment interaction and epigenetic inheritance By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: OnlineISBN: 9783030352134 (electronic bk.) Full Article
si Basic Electrocardiography By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: Petty, Brent G., author. Callnumber: OnlineISBN: 9783030328863 978-3-030-32886-3 Full Article
si Atlas of ulcers in systemic sclerosis : diagnosis and management By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: OnlineISBN: 9783319984773 (electronic bk.) Full Article
si Atlas of sexually transmitted diseases : clinical aspects and differential diagnosis By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: OnlineISBN: 9783319574707 (electronic bk.) Full Article
si Aquatic biopolymers : understanding their industrial significance and environmental implications By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: Olatunji, Ololade.Callnumber: OnlineISBN: 9783030347093 (electronic bk.) Full Article
si Anomalies of the Developing Dentition : a Clinical Guide to Diagnosis and Management By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: Soxman, Jane A., author.Callnumber: OnlineISBN: 9783030031640 (electronic bk.) Full Article
si Anatomical chart company atlas of pathophysiology By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: Atlas of pathophysiology.Callnumber: OnlineISBN: 9781496370921 Full Article
si Advances in parasitology. By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: OnlineISBN: 9780123742292 (electronic bk.) Full Article
si Advanced age geriatric care : a comprehensive guide By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: OnlineISBN: 9783319969985 (electronic bk.) Full Article
si Utah Signs SB 29 Drug Disposal Program Into Law, A Huge Step Forward... By www.prweb.com Published On :: Utah State Bill SB 29 requires environmentally friendly disposal of a lawfully possessed controlled substance. NarcX worked closely with Utah lawmakers to provide crucial guidance for the bill.(PRWeb April 08, 2020)Read the full story at https://www.prweb.com/releases/utah_signs_sb_29_drug_disposal_program_into_law_a_huge_step_forward_for_narcx/prweb17030392.htm Full Article
si Hays County Joins the Texas Purchasing Group by BidNet Direct By www.prweb.com Published On :: Hays County announced it has joined the Texas Purchasing Group and will be publishing and distributing upcoming bid opportunities on the system along with their current platform in these unprecedented...(PRWeb April 09, 2020)Read the full story at https://www.prweb.com/releases/hays_county_joins_the_texas_purchasing_group_by_bidnet_direct/prweb17021429.htm Full Article
si In Battle to Fight Coronavirus Pandemic, LeadingAge Nursing Home... By www.prweb.com Published On :: Aging Services Providers Dedicated to Fulfilling Their Critical Role in Public Health System(PRWeb April 18, 2020)Read the full story at https://www.prweb.com/releases/in_battle_to_fight_coronavirus_pandemic_leadingage_nursing_home_members_support_texas_action_to_gather_and_leverage_data/prweb17055806.htm Full Article
si Suntuity AirWorks Offering FREE Assistance in Drone Acquisition... By www.prweb.com Published On :: The drones and programs will be fully paid for by the DOJ as part of the $850 million funding that has been allocated to help public safety departments fight the spread of COVID-19. This includes...(PRWeb April 30, 2020)Read the full story at https://www.prweb.com/releases/suntuity_airworks_offering_free_assistance_in_drone_acquisition_through_850mm_federal_grant_assistance_program_for_public_safety_agencies/prweb17090555.htm Full Article
si Health Worker Data Alliance: Monitoring Emotional, Physical and... By www.prweb.com Published On :: Surveys provide secure, anonymous feedback from staff at all levels of healthcare organizations(PRWeb May 06, 2020)Read the full story at https://www.prweb.com/releases/health_worker_data_alliance_monitoring_emotional_physical_and_occupational_health_of_healthcare_workers_during_covid_19/prweb17101008.htm Full Article
si Consistent selection of the number of change-points via sample-splitting By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Changliang Zou, Guanghui Wang, Runze Li. Source: The Annals of Statistics, Volume 48, Number 1, 413--439.Abstract: In multiple change-point analysis, one of the major challenges is to estimate the number of change-points. Most existing approaches attempt to minimize a Schwarz information criterion which balances a term quantifying model fit with a penalization term accounting for model complexity that increases with the number of change-points and limits overfitting. However, different penalization terms are required to adapt to different contexts of multiple change-point problems and the optimal penalization magnitude usually varies from the model and error distribution. We propose a data-driven selection criterion that is applicable to most kinds of popular change-point detection methods, including binary segmentation and optimal partitioning algorithms. The key idea is to select the number of change-points that minimizes the squared prediction error, which measures the fit of a specified model for a new sample. We develop a cross-validation estimation scheme based on an order-preserved sample-splitting strategy, and establish its asymptotic selection consistency under some mild conditions. Effectiveness of the proposed selection criterion is demonstrated on a variety of numerical experiments and real-data examples. Full Article
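The selection idea in this abstract can be sketched for the simplest case, a piecewise-constant mean. The sketch below is an illustration under stated assumptions and not the authors' implementation: it assumes that "order-preserved sample-splitting" can be read as fitting on the 1st, 3rd, 5th, ... observations and validating on the 2nd, 4th, 6th, ..., and it uses a plain dynamic-programming optimal partitioning as the detection method. The function names `fit_change_points` and `select_num_change_points` are hypothetical.

```python
def segment_cost(prefix, prefix_sq, i, j):
    """Sum of squared errors of y[i:j] around its own mean, from prefix sums."""
    n = j - i
    s = prefix[j] - prefix[i]
    return (prefix_sq[j] - prefix_sq[i]) - s * s / n


def fit_change_points(y, k):
    """Optimal partitioning: k change-points minimising within-segment SSE.

    Returns the sorted start indices of segments 2..k+1. Requires len(y) > k.
    """
    n = len(y)
    prefix, prefix_sq = [0.0], [0.0]
    for v in y:
        prefix.append(prefix[-1] + v)
        prefix_sq.append(prefix_sq[-1] + v * v)
    inf = float("inf")
    # best[s][j]: minimal cost of splitting y[:j] into s + 1 segments
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    arg = [[0] * (n + 1) for _ in range(k + 1)]
    for j in range(1, n + 1):
        best[0][j] = segment_cost(prefix, prefix_sq, 0, j)
    for s in range(1, k + 1):
        for j in range(s + 1, n + 1):
            for i in range(s, j):
                c = best[s - 1][i] + segment_cost(prefix, prefix_sq, i, j)
                if c < best[s][j]:
                    best[s][j], arg[s][j] = c, i
    # Backtrack the last change-point of each prefix solution.
    cps, j = [], n
    for s in range(k, 0, -1):
        j = arg[s][j]
        cps.append(j)
    return sorted(cps)


def select_num_change_points(y, k_max):
    """Pick the k in 0..k_max with the smallest squared prediction error."""
    train, test = y[0::2], y[1::2]  # assumed odd/even order-preserving split
    best_k, best_err = 0, float("inf")
    for k in range(k_max + 1):
        bounds = [0] + fit_change_points(train, k) + [len(train)]
        err = 0.0
        for a, b in zip(bounds, bounds[1:]):
            mean = sum(train[a:b]) / (b - a)
            err += sum((v - mean) ** 2 for v in test[a:b])
        if err < best_err:  # strict: ties keep the smaller model
            best_k, best_err = k, err
    return best_k
```

For a noise-free series such as `[0.0] * 20 + [10.0] * 20`, `select_num_change_points(series, 3)` picks one change-point, because adding more cannot lower the held-out error any further.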
si Concentration and consistency results for canonical and curved exponential-family models of random graphs By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Michael Schweinberger, Jonathan Stewart. Source: The Annals of Statistics, Volume 48, Number 1, 374--396.Abstract: Statistical inference for exponential-family models of random graphs with dependent edges is challenging. We stress the importance of additional structure and show that additional structure facilitates statistical inference. A simple example of a random graph with additional structure is a random graph with neighborhoods and local dependence within neighborhoods. We develop the first concentration and consistency results for maximum likelihood and $M$-estimators of a wide range of canonical and curved exponential-family models of random graphs with local dependence. All results are nonasymptotic and applicable to random graphs with finite populations of nodes, although asymptotic consistency results can be obtained as well. In addition, we show that additional structure can facilitate subgraph-to-graph estimation, and present concentration results for subgraph-to-graph estimators. As an application, we consider popular curved exponential-family models of random graphs, with local dependence induced by transitivity and parameter vectors whose dimensions depend on the number of nodes. Full Article
si Sparse high-dimensional regression: Exact scalable algorithms and phase transitions By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Dimitris Bertsimas, Bart Van Parys. Source: The Annals of Statistics, Volume 48, Number 1, 300--323.Abstract: We present a novel binary convex reformulation of the sparse regression problem that constitutes a new duality perspective. We devise a new cutting plane method and provide evidence that it can solve to provable optimality the sparse regression problem for sample sizes $n$ and number of regressors $p$ in the 100,000s, that is, two orders of magnitude better than the current state of the art, in seconds. The ability to solve the problem for very high dimensions allows us to observe new phase transition phenomena. Contrary to traditional complexity theory which suggests that the difficulty of a problem increases with problem size, the sparse regression problem has the property that as the number of samples $n$ increases the problem becomes easier in that the solution recovers 100% of the true signal, and our approach solves the problem extremely fast (in fact faster than Lasso), while for small number of samples $n$, our approach takes a larger amount of time to solve the problem, but importantly the optimal solution provides a statistically more relevant regressor. We argue that our exact sparse regression approach presents a superior alternative over heuristic methods available at present. Full Article
si Spectral and matrix factorization methods for consistent community detection in multi-layer networks By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Subhadeep Paul, Yuguo Chen. Source: The Annals of Statistics, Volume 48, Number 1, 230--250.Abstract: We consider the problem of estimating a consensus community structure by combining information from multiple layers of a multi-layer network using methods based on the spectral clustering or a low-rank matrix factorization. As a general theme, these “intermediate fusion” methods involve obtaining a low column rank matrix by optimizing an objective function and then using the columns of the matrix for clustering. However, the theoretical properties of these methods remain largely unexplored. In the absence of statistical guarantees on the objective functions, it is difficult to determine if the algorithms optimizing the objectives will return good community structures. We investigate the consistency properties of the global optimizer of some of these objective functions under the multi-layer stochastic blockmodel. For this purpose, we derive several new asymptotic results showing consistency of the intermediate fusion techniques along with the spectral clustering of mean adjacency matrix under a high dimensional setup, where the number of nodes, the number of layers and the number of communities of the multi-layer graph grow. Our numerical study shows that the intermediate fusion techniques outperform late fusion methods, namely spectral clustering on aggregate spectral kernel and module allegiance matrix in sparse networks, while they outperform the spectral clustering of mean adjacency matrix in multi-layer networks that contain layers with both homophilic and heterophilic communities. Full Article
si Adaptive risk bounds in univariate total variation denoising and trend filtering By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Adityanand Guntuboyina, Donovan Lieu, Sabyasachi Chatterjee, Bodhisattva Sen. Source: The Annals of Statistics, Volume 48, Number 1, 205--229.Abstract: We study trend filtering, a relatively recent method for univariate nonparametric regression. For a given integer $r\geq1$, the $r$th order trend filtering estimator is defined as the minimizer of the sum of squared errors when we constrain (or penalize) the sum of the absolute $r$th order discrete derivatives of the fitted function at the design points. For $r=1$, the estimator reduces to total variation regularization which has received much attention in the statistics and image processing literature. In this paper, we study the performance of the trend filtering estimator for every $r\geq1$, both in the constrained and penalized forms. Our main results show that in the strong sparsity setting when the underlying function is a (discrete) spline with few “knots,” the risk (under the global squared error loss) of the trend filtering estimator (with an appropriate choice of the tuning parameter) achieves the parametric $n^{-1}$-rate, up to a logarithmic (multiplicative) factor. Our results therefore provide support for the use of trend filtering, for every $r\geq1$, in the strong sparsity setting. Full Article
si Model assisted variable clustering: Minimax-optimal recovery and algorithms By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Florentina Bunea, Christophe Giraud, Xi Luo, Martin Royer, Nicolas Verzelen. Source: The Annals of Statistics, Volume 48, Number 1, 111--137.Abstract: The problem of variable clustering is that of estimating groups of similar components of a $p$-dimensional vector $X=(X_{1},\ldots ,X_{p})$ from $n$ independent copies of $X$. There exists a large number of algorithms that return data-dependent groups of variables, but their interpretation is limited to the algorithm that produced them. An alternative is model-based clustering, in which one begins by defining population level clusters relative to a model that embeds notions of similarity. Algorithms tailored to such models yield estimated clusters with a clear statistical interpretation. We take this view here and introduce the class of $G$-block covariance models as a background model for variable clustering. In such models, two variables in a cluster are deemed similar if they have similar associations with all other variables. This can arise, for instance, when groups of variables are noise corrupted versions of the same latent factor. We quantify the difficulty of clustering data generated from a $G$-block covariance model in terms of cluster proximity, measured with respect to two related, but different, cluster separation metrics. We derive minimax cluster separation thresholds, which are the metric values below which no algorithm can recover the model-defined clusters exactly, and show that they are different for the two metrics. We therefore develop two algorithms, COD and PECOK, tailored to $G$-block covariance models, and study their minimax-optimality with respect to each metric.
Of independent interest is the fact that the analysis of the PECOK algorithm, which is based on a corrected convex relaxation of the popular $K$-means algorithm, provides the first statistical analysis of such algorithms for variable clustering. Additionally, we compare our methods with another popular clustering method, spectral clustering. Extensive simulation studies, as well as our data analyses, confirm the applicability of our approach. Full Article
si Sparse SIR: Optimal rates and adaptive estimation By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Kai Tan, Lei Shi, Zhou Yu. Source: The Annals of Statistics, Volume 48, Number 1, 64--85.Abstract: Sliced inverse regression (SIR) is an innovative and effective method for sufficient dimension reduction and data visualization. Recently, an impressive range of penalized SIR methods has been proposed to estimate the central subspace in a sparse fashion. Nonetheless, few of them considered the sparse sufficient dimension reduction from a decision-theoretic point of view. To address this issue, we in this paper establish the minimax rates of convergence for estimating the sparse SIR directions under various commonly used loss functions in the literature of sufficient dimension reduction. We also discover the possible trade-off between statistical guarantee and computational performance for sparse SIR. We finally propose an adaptive estimation scheme for sparse SIR which is computationally tractable and rate optimal. Numerical studies are carried out to confirm the theoretical properties of our proposed methods. Full Article
si The phase transition for the existence of the maximum likelihood estimate in high-dimensional logistic regression By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Emmanuel J. Candès, Pragya Sur. Source: The Annals of Statistics, Volume 48, Number 1, 27--42.Abstract: This paper rigorously establishes that the existence of the maximum likelihood estimate (MLE) in high-dimensional logistic regression models with Gaussian covariates undergoes a sharp “phase transition.” We introduce an explicit boundary curve $h_{\mathrm{MLE}}$, parameterized by two scalars measuring the overall magnitude of the unknown sequence of regression coefficients, with the following property: in the limit of large sample sizes $n$ and number of features $p$ proportioned in such a way that $p/n\rightarrow \kappa$, we show that if the problem is sufficiently high dimensional in the sense that $\kappa >h_{\mathrm{MLE}}$, then the MLE does not exist with probability one. Conversely, if $\kappa <h_{\mathrm{MLE}}$, the MLE asymptotically exists with probability one. Full Article
si Intrinsic Riemannian functional data analysis By projecteuclid.org Published On :: Wed, 30 Oct 2019 22:03 EDT Zhenhua Lin, Fang Yao. Source: The Annals of Statistics, Volume 47, Number 6, 3533--3577.Abstract: In this work we develop a novel and foundational framework for analyzing general Riemannian functional data, in particular a new development of tensor Hilbert spaces along curves on a manifold. Such spaces enable us to derive Karhunen–Loève expansion for Riemannian random processes. This framework also features an approach to compare objects from different tensor Hilbert spaces, which paves the way for asymptotic analysis in Riemannian functional data analysis. Built upon intrinsic geometric concepts such as vector field, Levi-Civita connection and parallel transport on Riemannian manifolds, the developed framework applies to not only Euclidean submanifolds but also manifolds without a natural ambient space. As applications of this framework, we develop intrinsic Riemannian functional principal component analysis (iRFPCA) and intrinsic Riemannian functional linear regression (iRFLR) that are distinct from their traditional and ambient counterparts. We also provide estimation procedures for iRFPCA and iRFLR, and investigate their asymptotic properties within the intrinsic geometry. Numerical performance is illustrated by simulated and real examples. Full Article
si Bootstrapping and sample splitting for high-dimensional, assumption-lean inference By projecteuclid.org Published On :: Wed, 30 Oct 2019 22:03 EDT Alessandro Rinaldo, Larry Wasserman, Max G’Sell. Source: The Annals of Statistics, Volume 47, Number 6, 3438--3469.Abstract: Several new methods have been recently proposed for performing valid inference after model selection. An older method is sample splitting: use part of the data for model selection and the rest for inference. In this paper, we revisit sample splitting combined with the bootstrap (or the Normal approximation). We show that this leads to a simple, assumption-lean approach to inference and we establish results on the accuracy of the method. In fact, we find new bounds on the accuracy of the bootstrap and the Normal approximation for general nonlinear parameters with increasing dimension which we then use to assess the accuracy of regression inference. We define new parameters that measure variable importance and that can be inferred with greater accuracy than the usual regression coefficients. Finally, we elucidate an inference-prediction trade-off: splitting increases the accuracy and robustness of inference but can decrease the accuracy of the predictions. Full Article
si Minimax posterior convergence rates and model selection consistency in high-dimensional DAG models based on sparse Cholesky factors By projecteuclid.org Published On :: Wed, 30 Oct 2019 22:03 EDT Kyoungjae Lee, Jaeyong Lee, Lizhen Lin. Source: The Annals of Statistics, Volume 47, Number 6, 3413--3437.Abstract: In this paper we study the high-dimensional sparse directed acyclic graph (DAG) models under the empirical sparse Cholesky prior. Among our results, strong model selection consistency or graph selection consistency is obtained under more general conditions than those in the existing literature. Compared to Cao, Khare and Ghosh [ Ann. Statist. (2019) 47 319–348], the required conditions are weakened in terms of the dimensionality, sparsity and lower bound of the nonzero elements in the Cholesky factor. Furthermore, our result does not require the irrepresentable condition, which is necessary for Lasso-type methods. We also derive the posterior convergence rates for precision matrices and Cholesky factors with respect to various matrix norms. The obtained posterior convergence rates are the fastest among those of the existing Bayesian approaches. In particular, we prove that our posterior convergence rates for Cholesky factors are the minimax or at least nearly minimax depending on the relative size of true sparseness for the entire dimension. The simulation study confirms that the proposed method outperforms the competing methods. Full Article
si On testing for high-dimensional white noise By projecteuclid.org Published On :: Wed, 30 Oct 2019 22:03 EDT Zeng Li, Clifford Lam, Jianfeng Yao, Qiwei Yao. Source: The Annals of Statistics, Volume 47, Number 6, 3382--3412.Abstract: Testing for white noise is a classical yet important problem in statistics, especially for diagnostic checks in time series modeling and linear regression. For high-dimensional time series in the sense that the dimension $p$ is large in relation to the sample size $T$, the popular omnibus tests including the multivariate Hosking and Li–McLeod tests are extremely conservative, leading to substantial power loss. To develop more relevant tests for high-dimensional cases, we propose a portmanteau-type test statistic which is the sum of squared singular values of the first $q$ lagged sample autocovariance matrices. It, therefore, encapsulates all the serial correlations (up to the time lag $q$) within and across all component series. Using the tools from random matrix theory and assuming both $p$ and $T$ diverge to infinity, we derive the asymptotic normality of the test statistic under both the null and a specific VMA(1) alternative hypothesis. As the actual implementation of the test requires the knowledge of three characteristic constants of the population cross-sectional covariance matrix and the value of the fourth moment of the standardized innovations, nontrivial estimations are proposed for these parameters and their integration leads to a practically usable test. Extensive simulation confirms the excellent finite-sample performance of the new test with accurate size and satisfactory power for a large range of finite $(p,T)$ combinations, therefore, ensuring wide applicability in practice. In particular, the new tests are consistently superior to the traditional Hosking and Li–McLeod tests. Full Article
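The raw quantity behind the proposed test can be written down directly, because the sum of squared singular values of a matrix equals the sum of its squared entries (the squared Frobenius norm). The sketch below computes only that unstandardised sum. It assumes a mean-zero series and a $1/T$ normalisation of the lagged sample autocovariance, both of which are assumptions here, and it omits the centring and scaling constants that the abstract says are needed for a usable test. The function names are hypothetical.

```python
def lagged_autocov(x, lag):
    """Lag-`lag` sample autocovariance of a series x.

    x is a list of T observations, each a list of p floats, assumed mean-zero.
    Entry (i, j) is (1/T) * sum over t of x[t][i] * x[t - lag][j].
    """
    T, p = len(x), len(x[0])
    cov = [[0.0] * p for _ in range(p)]
    for t in range(lag, T):
        for i in range(p):
            for j in range(p):
                cov[i][j] += x[t][i] * x[t - lag][j]
    return [[c / T for c in row] for row in cov]


def raw_portmanteau(x, q):
    """Sum over lags 1..q of the squared Frobenius norm of the autocovariance.

    The squared Frobenius norm equals the sum of squared singular values,
    so no singular value decomposition is needed.
    """
    total = 0.0
    for lag in range(1, q + 1):
        cov = lagged_autocov(x, lag)
        total += sum(c * c for row in cov for c in row)
    return total
```

Serial dependence at any lag up to `q`, within or across component series, produces nonzero autocovariance entries and therefore a larger value.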
si A smeary central limit theorem for manifolds with application to high-dimensional spheres By projecteuclid.org Published On :: Wed, 30 Oct 2019 22:03 EDT Benjamin Eltzner, Stephan F. Huckemann. Source: The Annals of Statistics, Volume 47, Number 6, 3360--3381.Abstract: The central limit theorems (CLTs) for generalized Fréchet means (data descriptors assuming values in manifolds, such as intrinsic means, geodesics, etc.) on manifolds from the literature are only valid if a certain empirical process of Hessians of the Fréchet function converges suitably, as in the proof of the prototypical BP-CLT [ Ann. Statist. 33 (2005) 1225–1259]. This is not valid in many realistic scenarios and we provide a new, very general CLT. In particular, this includes scenarios where, in a suitable chart, the sample mean fluctuates asymptotically at a scale $n^{\alpha }$ with exponents $\alpha <1/2$ with a nonnormal distribution. As the BP-CLT yields only fluctuations that are, rescaled with $n^{1/2}$, asymptotically normal, just as the classical CLT for random vectors, these lower rates, somewhat loosely called smeariness, had to date been observed only on the circle. We make the concept of smeariness on manifolds precise, give an example for two-smeariness on spheres of arbitrary dimension, and show that smeariness, although “almost never” occurring, may have serious statistical implications on a continuum of sample scenarios nearby. In fact, this effect increases with dimension, striking in particular in high dimension low sample size scenarios. Full Article
si On optimal designs for nonregular models By projecteuclid.org Published On :: Wed, 30 Oct 2019 22:03 EDT Yi Lin, Ryan Martin, Min Yang. Source: The Annals of Statistics, Volume 47, Number 6, 3335--3359.Abstract: Classically, Fisher information is the relevant object in defining optimal experimental designs. However, for models that lack certain regularity, the Fisher information does not exist, and hence, there is no notion of design optimality available in the literature. This article seeks to fill the gap by proposing a so-called Hellinger information, which generalizes Fisher information in the sense that the two measures agree in regular problems, but the former also exists for certain types of nonregular problems. We derive a Hellinger information inequality, showing that Hellinger information defines a lower bound on the local minimax risk of estimators. This provides a connection between features of the underlying model—in particular, the design—and the performance of estimators, motivating the use of this new Hellinger information for nonregular optimal design problems. Hellinger optimal designs are derived for several nonregular regression problems, with numerical results empirically demonstrating the efficiency of these designs compared to alternatives. Full Article
si Hypothesis testing on linear structures of high-dimensional covariance matrix By projecteuclid.org Published On :: Wed, 30 Oct 2019 22:03 EDT Shurong Zheng, Zhao Chen, Hengjian Cui, Runze Li. Source: The Annals of Statistics, Volume 47, Number 6, 3300--3334.Abstract: This paper is concerned with test of significance on high-dimensional covariance structures, and aims to develop a unified framework for testing commonly used linear covariance structures. We first construct a consistent estimator for parameters involved in the linear covariance structure, and then develop two tests for the linear covariance structures based on entropy loss and quadratic loss used for covariance matrix estimation. To study the asymptotic properties of the proposed tests, we study related high-dimensional random matrix theory, and establish several highly useful asymptotic results. With the aid of these asymptotic results, we derive the limiting distributions of these two tests under the null and alternative hypotheses. We further show that the quadratic loss based test is asymptotically unbiased. We conduct Monte Carlo simulation study to examine the finite sample performance of the two tests. Our simulation results show that the limiting null distributions approximate their null distributions quite well, and the corresponding asymptotic critical values keep Type I error rate very well. Our numerical comparison implies that the proposed tests outperform existing ones in terms of controlling Type I error rate and power. Our simulation indicates that the test based on quadratic loss seems to have better power than the test based on entropy loss. Full Article
si Quantile regression under memory constraint By projecteuclid.org Published On :: Wed, 30 Oct 2019 22:03 EDT Xi Chen, Weidong Liu, Yichen Zhang. Source: The Annals of Statistics, Volume 47, Number 6, 3244--3273.Abstract: This paper studies the inference problem in quantile regression (QR) for a large sample size $n$ but under a limited memory constraint, where the memory can only store a small batch of data of size $m$. A natural method is the naive divide-and-conquer approach, which splits data into batches of size $m$, computes the local QR estimator for each batch and then aggregates the estimators via averaging. However, this method only works when $n=o(m^{2})$ and is computationally expensive. This paper proposes a computationally efficient method, which only requires an initial QR estimator on a small batch of data and then successively refines the estimator via multiple rounds of aggregations. Theoretically, as long as $n$ grows polynomially in $m$, we establish the asymptotic normality for the obtained estimator and show that our estimator with only a few rounds of aggregations achieves the same efficiency as the QR estimator computed on all the data. Moreover, our result allows the case that the dimensionality $p$ goes to infinity. The proposed method can also be applied to address the QR problem under distributed computing environment (e.g., in a large-scale sensor network) or for real-time streaming data. Full Article
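The naive divide-and-conquer baseline described in this abstract can be illustrated in its simplest special case, an intercept-only model, where the local quantile regression estimator reduces to the sample quantile of each batch. This is a sketch of that baseline only, not of the paper's refined multi-round method. The function names and the particular order-statistic convention for the sample quantile are assumptions made for illustration.

```python
import math


def local_quantile(batch, tau):
    """Sample tau-quantile of one batch (intercept-only quantile regression)."""
    ordered = sorted(batch)
    # Smallest order statistic whose empirical CDF value is at least tau.
    k = max(0, math.ceil(tau * len(ordered)) - 1)
    return ordered[k]


def divide_and_conquer_quantile(data, tau, m):
    """Split data into batches of size m, estimate locally, then average.

    Only one batch of at most m values is processed at a time, mirroring the
    memory constraint; the final batch may be shorter than m.
    """
    estimates = [
        local_quantile(data[start:start + m], tau)
        for start in range(0, len(data), m)
    ]
    return sum(estimates) / len(estimates)
```

With `data = list(range(1, 101))`, `tau = 0.5` and `m = 10`, the ten local medians are 5, 15, ..., 95 and their average is 50.0. Each local estimate carries a bias that averaging does not remove, which is consistent with the abstract's remark that the naive approach only works when $n=o(m^{2})$.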
si On partial-sum processes of ARMAX residuals By projecteuclid.org Published On :: Wed, 30 Oct 2019 22:03 EDT Steffen Grønneberg, Benjamin Holcblat. Source: The Annals of Statistics, Volume 47, Number 6, 3216--3243.Abstract: We establish general and versatile results regarding the limit behavior of the partial-sum process of ARMAX residuals. Illustrations include ARMA with seasonal dummies, misspecified ARMAX models with autocorrelated errors, nonlinear ARMAX models, ARMA with a structural break, a wide range of ARMAX models with infinite-variance errors, weak GARCH models and the consistency of kernel estimation of the density of ARMAX errors. Our results identify the limit distributions, and provide a general algorithm to obtain pivot statistics for CUSUM tests. Full Article
si Statistical inference for autoregressive models under heteroscedasticity of unknown form By projecteuclid.org Published On :: Wed, 30 Oct 2019 22:03 EDT Ke Zhu. Source: The Annals of Statistics, Volume 47, Number 6, 3185--3215.Abstract: This paper provides an entire inference procedure for the autoregressive model under (conditional) heteroscedasticity of unknown form with a finite variance. We first establish the asymptotic normality of the weighted least absolute deviations estimator (LADE) for the model. Second, we develop the random weighting (RW) method to estimate its asymptotic covariance matrix, leading to the implementation of the Wald test. Third, we construct a portmanteau test for model checking, and use the RW method to obtain its critical values. As a special weighted LADE, the feasible adaptive LADE (ALADE) is proposed and proved to have the same efficiency as its infeasible counterpart. The importance of our entire methodology based on the feasible ALADE is illustrated by simulation results and the real data analysis on three U.S. economic data sets. Full Article
si Adaptive estimation of the rank of the coefficient matrix in high-dimensional multivariate response regression models By projecteuclid.org Published On :: Wed, 30 Oct 2019 22:03 EDT Xin Bing, Marten H. Wegkamp. Source: The Annals of Statistics, Volume 47, Number 6, 3157--3184.Abstract: We consider the multivariate response regression problem with a regression coefficient matrix of low, unknown rank. In this setting, we analyze a new criterion for selecting the optimal reduced rank. This criterion differs notably from the one proposed in Bunea, She and Wegkamp ( Ann. Statist. 39 (2011) 1282–1309) in that it does not require estimation of the unknown variance of the noise, nor does it depend on a delicate choice of a tuning parameter. We develop an iterative, fully data-driven procedure, that adapts to the optimal signal-to-noise ratio. This procedure finds the true rank in a few steps with overwhelming probability. At each step, our estimate increases, while at the same time it does not exceed the true rank. Our finite sample results hold for any sample size and any dimension, even when the number of responses and of covariates grow much faster than the number of observations. We perform an extensive simulation study that confirms our theoretical findings. The new method performs better and is more stable than the procedure of Bunea, She and Wegkamp ( Ann. Statist. 39 (2011) 1282–1309) in both low- and high-dimensional settings. Full Article
si Randomized incomplete $U$-statistics in high dimensions By projecteuclid.org Published On :: Wed, 30 Oct 2019 22:03 EDT Xiaohui Chen, Kengo Kato. Source: The Annals of Statistics, Volume 47, Number 6, 3127--3156.Abstract: This paper studies inference for the mean vector of a high-dimensional $U$-statistic. In the era of big data, the dimension $d$ of the $U$-statistic and the sample size $n$ of the observations tend to be both large, and the computation of the $U$-statistic is prohibitively demanding. Data-dependent inferential procedures such as the empirical bootstrap for $U$-statistics are even more computationally expensive. To overcome such a computational bottleneck, incomplete $U$-statistics obtained by sampling fewer terms of the $U$-statistic are attractive alternatives. In this paper, we introduce randomized incomplete $U$-statistics with sparse weights whose computational cost can be made independent of the order of the $U$-statistic. We derive nonasymptotic Gaussian approximation error bounds for the randomized incomplete $U$-statistics in high dimensions, namely in cases where the dimension $d$ is possibly much larger than the sample size $n$, for both nondegenerate and degenerate kernels. In addition, we propose generic bootstrap methods for the incomplete $U$-statistics that are computationally much less demanding than existing bootstrap methods, and establish finite sample validity of the proposed bootstrap methods. Our methods are illustrated on the application to nonparametric testing for the pairwise independence of a high-dimensional random vector under weaker assumptions than those appearing in the literature. Full Article
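The idea of an incomplete $U$-statistic, averaging the kernel over a random subset of index tuples instead of all of them, can be illustrated with a small scalar sketch. Everything below is an assumption made for illustration: the order-2 kernel $h(x,y)=(x-y)^{2}/2$ (whose complete $U$-statistic is the sample variance), sampling pairs uniformly with replacement, and the function names. The paper itself treats high-dimensional kernels and its own sampling schemes, which this sketch does not reproduce.

```python
import random


def kernel(x, y):
    """Symmetric order-2 kernel; its complete U-statistic is the sample variance."""
    return 0.5 * (x - y) ** 2


def complete_u_statistic(data):
    """Average the kernel over all n*(n-1)/2 unordered pairs."""
    n = len(data)
    total = sum(kernel(data[i], data[j])
                for i in range(n) for j in range(i + 1, n))
    return total / (n * (n - 1) / 2)


def incomplete_u_statistic(data, num_terms, rng):
    """Average the kernel over num_terms pairs drawn uniformly with replacement.

    Cost is O(num_terms) kernel evaluations instead of O(n^2).
    """
    total = 0.0
    for _ in range(num_terms):
        i, j = rng.sample(range(len(data)), 2)  # two distinct indices
        total += kernel(data[i], data[j])
    return total / num_terms


rng = random.Random(1)
observations = [rng.gauss(0.0, 1.0) for _ in range(200)]
u_full = complete_u_statistic(observations)            # 19,900 kernel terms
u_sub = incomplete_u_statistic(observations, 2000, rng)  # 2,000 kernel terms
```

The number of sampled terms (`num_terms`) is the computational budget: it trades extra Monte Carlo variance for fewer kernel evaluations.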
si Sorted concave penalized regression By projecteuclid.org Published On :: Wed, 30 Oct 2019 22:03 EDT Long Feng, Cun-Hui Zhang. Source: The Annals of Statistics, Volume 47, Number 6, 3069--3098.Abstract: The Lasso is biased. Concave penalized least squares estimation (PLSE) takes advantage of signal strength to reduce this bias, leading to sharper error bounds in prediction, coefficient estimation and variable selection. For prediction and estimation, the bias of the Lasso can also be reduced by taking a smaller penalty level than what selection consistency requires, but such a smaller penalty level depends on the sparsity of the true coefficient vector. The sorted $\ell_{1}$ penalized estimation (Slope) was proposed for adaptation to such smaller penalty levels. However, the advantages of concave PLSE and Slope do not subsume each other. We propose sorted concave penalized estimation to combine the advantages of concave and sorted penalizations. We prove that sorted concave penalties adaptively choose the smaller penalty level and at the same time benefit from signal strength, especially when a significant proportion of signals are stronger than the corresponding adaptively selected penalty levels. A local convex approximation for sorted concave penalties, which extends the local linear and quadratic approximations for separable concave penalties, is developed to facilitate the computation of sorted concave PLSE and proven to possess desired prediction and estimation error bounds. Our analysis of prediction and estimation errors requires the restricted eigenvalue condition on the design, not beyond, and provides selection consistency under a required minimum signal strength condition in addition. Thus, our results also sharpen existing results on concave PLSE by removing the upper sparse eigenvalue component of the sparse Riesz condition. Full Article
si Testing for independence of large dimensional vectors By projecteuclid.org Published On :: Fri, 02 Aug 2019 22:04 EDT Taras Bodnar, Holger Dette, Nestor Parolya. Source: The Annals of Statistics, Volume 47, Number 5, 2977--3008.Abstract: In this paper, new tests for the independence of two high-dimensional vectors are investigated. We consider the case where the dimension of the vectors increases with the sample size and propose multivariate analysis of variance-type statistics for the hypothesis of a block diagonal covariance matrix. The asymptotic properties of the new test statistics are investigated under the null hypothesis and the alternative hypothesis using random matrix theory. For this purpose, we study the weak convergence of linear spectral statistics of central and (conditionally) noncentral Fisher matrices. In particular, a central limit theorem for linear spectral statistics of large dimensional (conditionally) noncentral Fisher matrices is derived which is then used to analyse the power of the tests under the alternative. The theoretical results are illustrated by means of a simulation study where we also compare the new tests with several alternatives, in particular with the commonly used corrected likelihood ratio test. It is demonstrated that the latter test does not keep its nominal level if the dimension of one sub-vector is relatively small compared to the dimension of the other sub-vector. On the other hand, the tests proposed in this paper provide a reasonable approximation of the nominal level in such situations. Moreover, we observe that one of the proposed tests is most powerful under a variety of correlation scenarios. Full Article
si Inference for the mode of a log-concave density By projecteuclid.org Published On :: Fri, 02 Aug 2019 22:04 EDT Charles R. Doss, Jon A. Wellner. Source: The Annals of Statistics, Volume 47, Number 5, 2950--2976.Abstract: We study a likelihood ratio test for the location of the mode of a log-concave density. Our test is based on comparison of the log-likelihoods corresponding to the unconstrained maximum likelihood estimator of a log-concave density and the constrained maximum likelihood estimator where the constraint is that the mode of the density is fixed, say at $m$. The constrained estimation problem is studied in detail in Doss and Wellner (2018). Here, the results of that paper are used to show that, under the null hypothesis (and strict curvature of $-\log f$ at the mode), the likelihood ratio statistic is asymptotically pivotal: that is, it converges in distribution to a limiting distribution which is free of nuisance parameters, thus playing the role of the $\chi_{1}^{2}$ distribution in classical parametric statistical problems. By inverting this family of tests, we obtain new (likelihood ratio based) confidence intervals for the mode of a log-concave density $f$. These new intervals do not depend on any smoothing parameters. We study the new confidence intervals via Monte Carlo methods and illustrate them with two real data sets. The new intervals seem to have several advantages over existing procedures. Software implementing the test and confidence intervals is available in the R package logcondens.mode. Full Article
si Projected spline estimation of the nonparametric function in high-dimensional partially linear models for massive data By projecteuclid.org Published On :: Fri, 02 Aug 2019 22:04 EDT Heng Lian, Kaifeng Zhao, Shaogao Lv. Source: The Annals of Statistics, Volume 47, Number 5, 2922--2949.Abstract: In this paper, we consider the local asymptotics of the nonparametric function in a partially linear model, within the framework of the divide-and-conquer estimation. Unlike the fixed-dimensional setting in which the parametric part does not affect the nonparametric part, the high-dimensional setting makes the issue more complicated. In particular, when a sparsity-inducing penalty such as lasso is used to make the estimation of the linear part feasible, the bias introduced will propagate to the nonparametric part. We propose a novel approach for estimation of the nonparametric function and establish the local asymptotics of the estimator. The result is useful for massive data with possibly different linear coefficients in each subpopulation but common nonparametric function. Some numerical illustrations are also presented. Full Article
si Test for high-dimensional correlation matrices By projecteuclid.org Published On :: Fri, 02 Aug 2019 22:04 EDT Shurong Zheng, Guanghui Cheng, Jianhua Guo, Hongtu Zhu. Source: The Annals of Statistics, Volume 47, Number 5, 2887--2921.Abstract: Testing correlation structures has attracted extensive attention in the literature due to both its importance in real applications and several major theoretical challenges. The aim of this paper is to develop a general framework of testing correlation structures for the one-, two- and multiple-sample testing problems under a high-dimensional setting when both the sample size and data dimension go to infinity. Our test statistics are designed to deal with both the dense and sparse alternatives. We systematically investigate the asymptotic null distribution, power function and unbiasedness of each test statistic. Theoretically, we make great efforts to deal with the nonindependency of all random matrices of the sample correlation matrices. We use simulation studies and real data analysis to illustrate the versatility and practicability of our test statistics. Full Article
si Eigenvalue distributions of variance components estimators in high-dimensional random effects models By projecteuclid.org Published On :: Fri, 02 Aug 2019 22:04 EDT Zhou Fan, Iain M. Johnstone. Source: The Annals of Statistics, Volume 47, Number 5, 2855--2886.Abstract: We study the spectra of MANOVA estimators for variance component covariance matrices in multivariate random effects models. When the dimensionality of the observations is large and comparable to the number of realizations of each random effect, we show that the empirical spectra of such estimators are well approximated by deterministic laws. The Stieltjes transforms of these laws are characterized by systems of fixed-point equations, which are numerically solvable by a simple iterative procedure. Our proof uses operator-valued free probability theory, and we establish a general asymptotic freeness result for families of rectangular orthogonally invariant random matrices, which is of independent interest. Our work is motivated in part by the estimation of components of covariance between multiple phenotypic traits in quantitative genetics, and we specialize our results to common experimental designs that arise in this application. Full Article
si A unified treatment of multiple testing with prior knowledge using the p-filter By projecteuclid.org Published On :: Fri, 02 Aug 2019 22:04 EDT Aaditya K. Ramdas, Rina F. Barber, Martin J. Wainwright, Michael I. Jordan. Source: The Annals of Statistics, Volume 47, Number 5, 2790--2821.Abstract: There is a significant literature on methods for incorporating knowledge into multiple testing procedures so as to improve their power and precision. Some common forms of prior knowledge include (a) beliefs about which hypotheses are null, modeled by nonuniform prior weights; (b) differing importances of hypotheses, modeled by differing penalties for false discoveries; (c) multiple arbitrary partitions of the hypotheses into (possibly overlapping) groups and (d) knowledge of independence, positive or arbitrary dependence between hypotheses or groups, suggesting the use of more aggressive or conservative procedures. We present a unified algorithmic framework called p-filter for global null testing and false discovery rate (FDR) control that allows the scientist to incorporate all four types of prior knowledge (a)–(d) simultaneously, recovering a variety of known algorithms as special cases. Full Article
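For readers unfamiliar with FDR control, the classical Benjamini-Hochberg step-up procedure is a useful reference point. The sketch below implements that standard procedure only, with no weights, penalties, groups or dependence adjustments; treating it as one of the "known algorithms" recovered by the p-filter is an assumption here, since the abstract does not name them. The function name and example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha):
    """Return the sorted indices of hypotheses rejected by the BH step-up rule.

    Find the largest rank k such that the k-th smallest p-value is at
    most alpha * k / m, then reject the k hypotheses with the smallest
    p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            k = rank  # keep the largest rank that passes its threshold
    return sorted(order[:k])


# Illustrative p-values for eight hypotheses, target FDR level 0.05.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
rejected = benjamini_hochberg(example_p, alpha=0.05)
```

Because the rule is step-up, a p-value that fails its own threshold can still be rejected when a larger p-value passes a later threshold; this is why the loop keeps the largest passing rank rather than stopping at the first failure.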