us Bayesian variance estimation in the Gaussian sequence model with partial information on the means By projecteuclid.org Published On :: Mon, 27 Apr 2020 22:02 EDT Gianluca Finocchio, Johannes Schmidt-Hieber. Source: Electronic Journal of Statistics, Volume 14, Number 1, 239--271.Abstract: Consider the Gaussian sequence model under the additional assumption that a fixed fraction of the means is known. We study the problem of variance estimation from a frequentist Bayesian perspective. The maximum likelihood estimator (MLE) for $sigma^{2}$ is biased and inconsistent. This raises the question whether the posterior is able to correct the MLE in this case. By developing a new proving strategy that uses refined properties of the posterior distribution, we find that the marginal posterior is inconsistent for any i.i.d. prior on the mean parameters. In particular, no assumption on the decay of the prior needs to be imposed. Surprisingly, we also find that consistency can be retained for a hierarchical prior based on Gaussian mixtures. In this case we also establish a limiting shape result and determine the limit distribution. In contrast to the classical Bernstein-von Mises theorem, the limit is non-Gaussian. We show that the Bayesian analysis leads to new statistical estimators outperforming the correctly calibrated MLE in a numerical simulation study. Full Article
us Efficient estimation in expectile regression using envelope models By projecteuclid.org Published On :: Thu, 23 Apr 2020 22:01 EDT Tuo Chen, Zhihua Su, Yi Yang, Shanshan Ding. Source: Electronic Journal of Statistics, Volume 14, Number 1, 143--173.Abstract: As a generalization of the classical linear regression, expectile regression (ER) explores the relationship between the conditional expectile of a response variable and a set of predictor variables. ER with respect to different expectile levels can provide a comprehensive picture of the conditional distribution of the response variable given the predictors. We adopt an efficient estimation method called the envelope model ([8]) in ER, and construct a novel envelope expectile regression (EER) model. Estimation of the EER parameters can be performed using the generalized method of moments (GMM). We establish the consistency and derive the asymptotic distribution of the EER estimators. In addition, we show that the EER estimators are asymptotically more efficient than the ER estimators. Numerical experiments and real data examples are provided to demonstrate the efficiency gains attained by EER compared to ER, and the efficiency gains can further lead to improvements in prediction. Full Article
us Nonparametric false discovery rate control for identifying simultaneous signals By projecteuclid.org Published On :: Thu, 23 Apr 2020 22:01 EDT Sihai Dave Zhao, Yet Tien Nguyen. Source: Electronic Journal of Statistics, Volume 14, Number 1, 110--142.Abstract: It is frequently of interest to identify simultaneous signals, defined as features that exhibit statistical significance across each of several independent experiments. For example, genes that are consistently differentially expressed across experiments in different animal species can reveal evolutionarily conserved biological mechanisms. However, in some problems the test statistics corresponding to these features can have complicated or unknown null distributions. This paper proposes a novel nonparametric false discovery rate control procedure that can identify simultaneous signals even without knowing these null distributions. The method is shown, theoretically and in simulations, to asymptotically control the false discovery rate. It was also used to identify genes that were both differentially expressed and proximal to differentially accessible chromatin in the brains of mice exposed to a conspecific intruder. The proposed method is available in the R package github.com/sdzhao/ssa. Full Article
us Model-based clustering with envelopes By projecteuclid.org Published On :: Thu, 23 Apr 2020 22:01 EDT Wenjing Wang, Xin Zhang, Qing Mai. Source: Electronic Journal of Statistics, Volume 14, Number 1, 82--109.Abstract: Clustering analysis is an important unsupervised learning technique in multivariate statistics and machine learning. In this paper, we propose a set of new mixture models called CLEMM (in short for Clustering with Envelope Mixture Models) that is based on the widely used Gaussian mixture model assumptions and the nascent research area of envelope methodology. Formulated mostly for regression models, envelope methodology aims for simultaneous dimension reduction and efficient parameter estimation, and includes a very recent formulation of envelope discriminant subspace for classification and discriminant analysis. Motivated by the envelope discriminant subspace pursuit in classification, we consider parsimonious probabilistic mixture models where the cluster analysis can be improved by projecting the data onto a latent lower-dimensional subspace. The proposed CLEMM framework and the associated envelope-EM algorithms thus provide foundations for envelope methods in unsupervised and semi-supervised learning problems. Numerical studies on simulated data and two benchmark data sets show significant improvement of our propose methods over the classical methods such as Gaussian mixture models, K-means and hierarchical clustering algorithms. An R package is available at https://github.com/kusakehan/CLEMM. Full Article
us Simultaneous transformation and rounding (STAR) models for integer-valued data By projecteuclid.org Published On :: Wed, 15 Apr 2020 04:02 EDT Daniel R. Kowal, Antonio Canale. Source: Electronic Journal of Statistics, Volume 14, Number 1, 1744--1772.Abstract: We propose a simple yet powerful framework for modeling integer-valued data, such as counts, scores, and rounded data. The data-generating process is defined by Simultaneously Transforming and Rounding (STAR) a continuous-valued process, which produces a flexible family of integer-valued distributions capable of modeling zero-inflation, bounded or censored data, and over- or underdispersion. The transformation is modeled as unknown for greater distributional flexibility, while the rounding operation ensures a coherent integer-valued data-generating process. An efficient MCMC algorithm is developed for posterior inference and provides a mechanism for adaptation of successful Bayesian models and algorithms for continuous data to the integer-valued data setting. Using the STAR framework, we design a new Bayesian Additive Regression Tree model for integer-valued data, which demonstrates impressive predictive distribution accuracy for both synthetic data and a large healthcare utilization dataset. For interpretable regression-based inference, we develop a STAR additive model, which offers greater flexibility and scalability than existing integer-valued models. The STAR additive model is applied to study the recent decline in Amazon river dolphins. Full Article
us A Bayesian approach to disease clustering using restricted Chinese restaurant processes By projecteuclid.org Published On :: Wed, 08 Apr 2020 22:01 EDT Claudia Wehrhahn, Samuel Leonard, Abel Rodriguez, Tatiana Xifara. Source: Electronic Journal of Statistics, Volume 14, Number 1, 1449--1478.Abstract: Identifying disease clusters (areas with an unusually high incidence of a particular disease) is a common problem in epidemiology and public health. We describe a Bayesian nonparametric mixture model for disease clustering that constrains clusters to be made of adjacent areal units. This is achieved by modifying the exchangeable partition probability function associated with the Ewen’s sampling distribution. We call the resulting prior the Restricted Chinese Restaurant Process, as the associated full conditional distributions resemble those associated with the standard Chinese Restaurant Process. The model is illustrated using synthetic data sets and in an application to oral cancer mortality in Germany. Full Article
us Computing the degrees of freedom of rank-regularized estimators and cousins By projecteuclid.org Published On :: Thu, 26 Mar 2020 22:03 EDT Rahul Mazumder, Haolei Weng. Source: Electronic Journal of Statistics, Volume 14, Number 1, 1348--1385.Abstract: Estimating a low rank matrix from its linear measurements is a problem of central importance in contemporary statistical analysis. The choice of tuning parameters for estimators remains an important challenge from a theoretical and practical perspective. To this end, Stein’s Unbiased Risk Estimate (SURE) framework provides a well-grounded statistical framework for degrees of freedom estimation. In this paper, we use the SURE framework to obtain degrees of freedom estimates for a general class of spectral regularized matrix estimators—our results generalize beyond the class of estimators that have been studied thus far. To this end, we use a result due to Shapiro (2002) pertaining to the differentiability of symmetric matrix valued functions, developed in the context of semidefinite optimization algorithms. We rigorously verify the applicability of Stein’s Lemma towards the derivation of degrees of freedom estimates; and also present new techniques based on Gaussian convolution to estimate the degrees of freedom of a class of spectral estimators, for which Stein’s Lemma does not directly apply. Full Article
us Differential network inference via the fused D-trace loss with cross variables By projecteuclid.org Published On :: Tue, 24 Mar 2020 22:01 EDT Yichong Wu, Tiejun Li, Xiaoping Liu, Luonan Chen. Source: Electronic Journal of Statistics, Volume 14, Number 1, 1269--1301.Abstract: Detecting the change of biological interaction networks is of great importance in biological and medical research. We proposed a simple loss function, named as CrossFDTL, to identify the network change or differential network by estimating the difference between two precision matrices under Gaussian assumption. The CrossFDTL is a natural fusion of the D-trace loss for the considered two networks by imposing the $ell _{1}$ penalty to the differential matrix to ensure sparsity. The key point of our method is to utilize the cross variables, which correspond to the sum and difference of two precision matrices instead of using their original forms. Moreover, we developed an efficient minimization algorithm for the proposed loss function and further rigorously proved its convergence. Numerical results showed that our method outperforms the existing methods in both accuracy and convergence speed for the simulated and real data. Full Article
us $k$-means clustering of extremes By projecteuclid.org Published On :: Mon, 02 Mar 2020 22:02 EST Anja Janßen, Phyllis Wan. Source: Electronic Journal of Statistics, Volume 14, Number 1, 1211--1233.Abstract: The $k$-means clustering algorithm and its variant, the spherical $k$-means clustering, are among the most important and popular methods in unsupervised learning and pattern detection. In this paper, we explore how the spherical $k$-means algorithm can be applied in the analysis of only the extremal observations from a data set. By making use of multivariate extreme value analysis we show how it can be adopted to find “prototypes” of extremal dependence and derive a consistency result for our suggested estimator. In the special case of max-linear models we show furthermore that our procedure provides an alternative way of statistical inference for this class of models. Finally, we provide data examples which show that our method is able to find relevant patterns in extremal observations and allows us to classify extremal events. Full Article
us Modal clustering asymptotics with applications to bandwidth selection By projecteuclid.org Published On :: Fri, 07 Feb 2020 22:03 EST Alessandro Casa, José E. Chacón, Giovanna Menardi. Source: Electronic Journal of Statistics, Volume 14, Number 1, 835--856.Abstract: Density-based clustering relies on the idea of linking groups to some specific features of the probability distribution underlying the data. The reference to a true, yet unknown, population structure allows framing the clustering problem in a standard inferential setting, where the concept of ideal population clustering is defined as the partition induced by the true density function. The nonparametric formulation of this approach, known as modal clustering, draws a correspondence between the groups and the domains of attraction of the density modes. Operationally, a nonparametric density estimate is required and a proper selection of the amount of smoothing, governing the shape of the density and hence possibly the modal structure, is crucial to identify the final partition. In this work, we address the issue of density estimation for modal clustering from an asymptotic perspective. A natural and easy to interpret metric to measure the distance between density-based partitions is discussed, its asymptotic approximation explored, and employed to study the problem of bandwidth selection for nonparametric modal clustering. Full Article
us Profile likelihood biclustering By projecteuclid.org Published On :: Fri, 31 Jan 2020 04:01 EST Cheryl Flynn, Patrick Perry. Source: Electronic Journal of Statistics, Volume 14, Number 1, 731--768.Abstract: Biclustering, the process of simultaneously clustering the rows and columns of a data matrix, is a popular and effective tool for finding structure in a high-dimensional dataset. Many biclustering procedures appear to work well in practice, but most do not have associated consistency guarantees. To address this shortcoming, we propose a new biclustering procedure based on profile likelihood. The procedure applies to a broad range of data modalities, including binary, count, and continuous observations. We prove that the procedure recovers the true row and column classes when the dimensions of the data matrix tend to infinity, even if the functional form of the data distribution is misspecified. The procedure requires computing a combinatorial search, which can be expensive in practice. Rather than performing this search directly, we propose a new heuristic optimization procedure based on the Kernighan-Lin heuristic, which has nice computational properties and performs well in simulations. We demonstrate our procedure with applications to congressional voting records, and microarray analysis. Full Article
us Path-Based Spectral Clustering: Guarantees, Robustness to Outliers, and Fast Algorithms By Published On :: 2020 We consider the problem of clustering with the longest-leg path distance (LLPD) metric, which is informative for elongated and irregularly shaped clusters. We prove finite-sample guarantees on the performance of clustering with respect to this metric when random samples are drawn from multiple intrinsically low-dimensional clusters in high-dimensional space, in the presence of a large number of high-dimensional outliers. By combining these results with spectral clustering with respect to LLPD, we provide conditions under which the Laplacian eigengap statistic correctly determines the number of clusters for a large class of data sets, and prove guarantees on the labeling accuracy of the proposed algorithm. Our methods are quite general and provide performance guarantees for spectral clustering with any ultrametric. We also introduce an efficient, easy to implement approximation algorithm for the LLPD based on a multiscale analysis of adjacency graphs, which allows for the runtime of LLPD spectral clustering to be quasilinear in the number of data points. Full Article
us Weighted Message Passing and Minimum Energy Flow for Heterogeneous Stochastic Block Models with Side Information By Published On :: 2020 We study the misclassification error for community detection in general heterogeneous stochastic block models (SBM) with noisy or partial label information. We establish a connection between the misclassification rate and the notion of minimum energy on the local neighborhood of the SBM. We develop an optimally weighted message passing algorithm to reconstruct labels for SBM based on the minimum energy flow and the eigenvectors of a certain Markov transition matrix. The general SBM considered in this paper allows for unequal-size communities, degree heterogeneity, and different connection probabilities among blocks. We focus on how to optimally weigh the message passing to improve misclassification. Full Article
us Perturbation Bounds for Procrustes, Classical Scaling, and Trilateration, with Applications to Manifold Learning By Published On :: 2020 One of the common tasks in unsupervised learning is dimensionality reduction, where the goal is to find meaningful low-dimensional structures hidden in high-dimensional data. Sometimes referred to as manifold learning, this problem is closely related to the problem of localization, which aims at embedding a weighted graph into a low-dimensional Euclidean space. Several methods have been proposed for localization, and also manifold learning. Nonetheless, the robustness property of most of them is little understood. In this paper, we obtain perturbation bounds for classical scaling and trilateration, which are then applied to derive performance bounds for Isomap, Landmark Isomap, and Maximum Variance Unfolding. A new perturbation bound for procrustes analysis plays a key role. Full Article
us Connecting Spectral Clustering to Maximum Margins and Level Sets By Published On :: 2020 We study the connections between spectral clustering and the problems of maximum margin clustering, and estimation of the components of level sets of a density function. Specifically, we obtain bounds on the eigenvectors of graph Laplacian matrices in terms of the between cluster separation, and within cluster connectivity. These bounds ensure that the spectral clustering solution converges to the maximum margin clustering solution as the scaling parameter is reduced towards zero. The sensitivity of maximum margin clustering solutions to outlying points is well known, but can be mitigated by first removing such outliers, and applying maximum margin clustering to the remaining points. If outliers are identified using an estimate of the underlying probability density, then the remaining points may be seen as an estimate of a level set of this density function. We show that such an approach can be used to consistently estimate the components of the level sets of a density function under very mild assumptions. Full Article
us Targeted Fused Ridge Estimation of Inverse Covariance Matrices from Multiple High-Dimensional Data Classes By Published On :: 2020 We consider the problem of jointly estimating multiple inverse covariance matrices from high-dimensional data consisting of distinct classes. An $ell_2$-penalized maximum likelihood approach is employed. The suggested approach is flexible and generic, incorporating several other $ell_2$-penalized estimators as special cases. In addition, the approach allows specification of target matrices through which prior knowledge may be incorporated and which can stabilize the estimation procedure in high-dimensional settings. The result is a targeted fused ridge estimator that is of use when the precision matrices of the constituent classes are believed to chiefly share the same structure while potentially differing in a number of locations of interest. It has many applications in (multi)factorial study designs. We focus on the graphical interpretation of precision matrices with the proposed estimator then serving as a basis for integrative or meta-analytic Gaussian graphical modeling. Situations are considered in which the classes are defined by data sets and subtypes of diseases. The performance of the proposed estimator in the graphical modeling setting is assessed through extensive simulation experiments. Its practical usability is illustrated by the differential network modeling of 12 large-scale gene expression data sets of diffuse large B-cell lymphoma subtypes. The estimator and its related procedures are incorporated into the R-package rags2ridges. Full Article
us Provably robust estimation of modulo 1 samples of a smooth function with applications to phase unwrapping By Published On :: 2020 Consider an unknown smooth function $f: [0,1]^d ightarrow mathbb{R}$, and assume we are given $n$ noisy mod 1 samples of $f$, i.e., $y_i = (f(x_i) + eta_i) mod 1$, for $x_i in [0,1]^d$, where $eta_i$ denotes the noise. Given the samples $(x_i,y_i)_{i=1}^{n}$, our goal is to recover smooth, robust estimates of the clean samples $f(x_i) mod 1$. We formulate a natural approach for solving this problem, which works with angular embeddings of the noisy mod 1 samples over the unit circle, inspired by the angular synchronization framework. This amounts to solving a smoothness regularized least-squares problem -- a quadratically constrained quadratic program (QCQP) -- where the variables are constrained to lie on the unit circle. Our proposed approach is based on solving its relaxation, which is a trust-region sub-problem and hence solvable efficiently. We provide theoretical guarantees demonstrating its robustness to noise for adversarial, as well as random Gaussian and Bernoulli noise models. To the best of our knowledge, these are the first such theoretical results for this problem. We demonstrate the robustness and efficiency of our proposed approach via extensive numerical simulations on synthetic data, along with a simple least-squares based solution for the unwrapping stage, that recovers the original samples of $f$ (up to a global shift). It is shown to perform well at high levels of noise, when taking as input the denoised modulo $1$ samples. Finally, we also consider two other approaches for denoising the modulo 1 samples that leverage tools from Riemannian optimization on manifolds, including a Burer-Monteiro approach for a semidefinite programming relaxation of our formulation. For the two-dimensional version of the problem, which has applications in synthetic aperture radar interferometry (InSAR), we are able to solve instances of real-world data with a million sample points in under 10 seconds, on a personal laptop. Full Article
us Causal Discovery Toolbox: Uncovering causal relationships in Python By Published On :: 2020 This paper presents a new open source Python framework for causal discovery from observational data and domain background knowledge, aimed at causal graph and causal mechanism modeling. The cdt package implements an end-to-end approach, recovering the direct dependencies (the skeleton of the causal graph) and the causal relationships between variables. It includes algorithms from the `Bnlearn' and `Pcalg' packages, together with algorithms for pairwise causal discovery such as ANM. Full Article
us Latent Simplex Position Model: High Dimensional Multi-view Clustering with Uncertainty Quantification By Published On :: 2020 High dimensional data often contain multiple facets, and several clustering patterns can co-exist under different variable subspaces, also known as the views. While multi-view clustering algorithms were proposed, the uncertainty quantification remains difficult --- a particular challenge is in the high complexity of estimating the cluster assignment probability under each view, and sharing information among views. In this article, we propose an approximate Bayes approach --- treating the similarity matrices generated over the views as rough first-stage estimates for the co-assignment probabilities; in its Kullback-Leibler neighborhood, we obtain a refined low-rank matrix, formed by the pairwise product of simplex coordinates. Interestingly, each simplex coordinate directly encodes the cluster assignment uncertainty. For multi-view clustering, we let each view draw a parameterization from a few candidates, leading to dimension reduction. With high model flexibility, the estimation can be efficiently carried out as a continuous optimization problem, hence enjoys gradient-based computation. The theory establishes the connection of this model to a random partition distribution under multiple views. Compared to single-view clustering approaches, substantially more interpretable results are obtained when clustering brains from a human traumatic brain injury study, using high-dimensional gene expression data. Full Article
us Learning Linear Non-Gaussian Causal Models in the Presence of Latent Variables By Published On :: 2020 We consider the problem of learning causal models from observational data generated by linear non-Gaussian acyclic causal models with latent variables. Without considering the effect of latent variables, the inferred causal relationships among the observed variables are often wrong. Under faithfulness assumption, we propose a method to check whether there exists a causal path between any two observed variables. From this information, we can obtain the causal order among the observed variables. The next question is whether the causal effects can be uniquely identified as well. We show that causal effects among observed variables cannot be identified uniquely under mere assumptions of faithfulness and non-Gaussianity of exogenous noises. However, we are able to propose an efficient method that identifies the set of all possible causal effects that are compatible with the observational data. We present additional structural conditions on the causal graph under which causal effects among observed variables can be determined uniquely. Furthermore, we provide necessary and sufficient graphical conditions for unique identification of the number of variables in the system. Experiments on synthetic data and real-world data show the effectiveness of our proposed algorithm for learning causal models. Full Article
us Optimal Bipartite Network Clustering By Published On :: 2020 We study bipartite community detection in networks, or more generally the network biclustering problem. We present a fast two-stage procedure based on spectral initialization followed by the application of a pseudo-likelihood classifier twice. Under mild regularity conditions, we establish the weak consistency of the procedure (i.e., the convergence of the misclassification rate to zero) under a general bipartite stochastic block model. We show that the procedure is optimal in the sense that it achieves the optimal convergence rate that is achievable by a biclustering oracle, adaptively over the whole class, up to constants. This is further formalized by deriving a minimax lower bound over a class of biclustering problems. The optimal rate we obtain sharpens some of the existing results and generalizes others to a wide regime of average degree growth, from sparse networks with average degrees growing arbitrarily slowly to fairly dense networks with average degrees of order $sqrt{n}$. As a special case, we recover the known exact recovery threshold in the $log n$ regime of sparsity. To obtain the consistency result, as part of the provable version of the algorithm, we introduce a sub-block partitioning scheme that is also computationally attractive, allowing for distributed implementation of the algorithm without sacrificing optimality. The provable algorithm is derived from a general class of pseudo-likelihood biclustering algorithms that employ simple EM type updates. We show the effectiveness of this general class by numerical simulations. Full Article
us Switching Regression Models and Causal Inference in the Presence of Discrete Latent Variables By Published On :: 2020 Given a response $Y$ and a vector $X = (X^1, dots, X^d)$ of $d$ predictors, we investigate the problem of inferring direct causes of $Y$ among the vector $X$. Models for $Y$ that use all of its causal covariates as predictors enjoy the property of being invariant across different environments or interventional settings. Given data from such environments, this property has been exploited for causal discovery. Here, we extend this inference principle to situations in which some (discrete-valued) direct causes of $ Y $ are unobserved. Such cases naturally give rise to switching regression models. We provide sufficient conditions for the existence, consistency and asymptotic normality of the MLE in linear switching regression models with Gaussian noise, and construct a test for the equality of such models. These results allow us to prove that the proposed causal discovery method obtains asymptotic false discovery control under mild conditions. We provide an algorithm, make available code, and test our method on simulated data. It is robust against model violations and outperforms state-of-the-art approaches. We further apply our method to a real data set, where we show that it does not only output causal predictors, but also a process-based clustering of data points, which could be of additional interest to practitioners. Full Article
us Learning Causal Networks via Additive Faithfulness By Published On :: 2020 In this paper we introduce a statistical model, called additively faithful directed acyclic graph (AFDAG), for causal learning from observational data. Our approach is based on additive conditional independence (ACI), a recently proposed three-way statistical relation that shares many similarities with conditional independence but without resorting to multi-dimensional kernels. This distinct feature strikes a balance between a parametric model and a fully nonparametric model, which makes the proposed model attractive for handling large networks. We develop an estimator for AFDAG based on a linear operator that characterizes ACI, and establish the consistency and convergence rates of this estimator, as well as the uniform consistency of the estimated DAG. Moreover, we introduce a modified PC-algorithm to implement the estimating procedure efficiently, so that its complexity is determined by the level of sparseness rather than the dimension of the network. Through simulation studies we show that our method outperforms existing methods when commonly assumed conditions such as Gaussian or Gaussian copula distributions do not hold. Finally, the usefulness of AFDAG formulation is demonstrated through an application to a proteomics data set. Full Article
us High-Dimensional Inference for Cluster-Based Graphical Models By Published On :: 2020 Motivated by modern applications in which one constructs graphical models based on a very large number of features, this paper introduces a new class of cluster-based graphical models, in which variable clustering is applied as an initial step for reducing the dimension of the feature space. We employ model assisted clustering, in which the clusters contain features that are similar to the same unobserved latent variable. Two different cluster-based Gaussian graphical models are considered: the latent variable graph, corresponding to the graphical model associated with the unobserved latent variables, and the cluster-average graph, corresponding to the vector of features averaged over clusters. Our study reveals that likelihood based inference for the latent graph, not analyzed previously, is analytically intractable. Our main contribution is the development and analysis of alternative estimation and inference strategies, for the precision matrix of an unobservable latent vector Z. We replace the likelihood of the data by an appropriate class of empirical risk functions, that can be specialized to the latent graphical model and to the simpler, but under-analyzed, cluster-average graphical model. The estimators thus derived can be used for inference on the graph structure, for instance on edge strength or pattern recovery. Inference is based on the asymptotic limits of the entry-wise estimates of the precision matrices associated with the conditional independence graphs under consideration. While taking the uncertainty induced by the clustering step into account, we establish Berry-Esseen central limit theorems for the proposed estimators. It is noteworthy that, although the clusters are estimated adaptively from the data, the central limit theorems regarding the entries of the estimated graphs are proved under the same conditions one would use if the clusters were known in advance. As an illustration of the usage of these newly developed inferential tools, we show that they can be reliably used for recovery of the sparsity pattern of the graphs we study, under FDR control, which is verified via simulation studies and an fMRI data analysis. These experimental results confirm the theoretically established difference between the two graph structures. Furthermore, the data analysis suggests that the latent variable graph, corresponding to the unobserved cluster centers, can help provide more insight into the understanding of the brain connectivity networks relative to the simpler, average-based, graph. Full Article
us Robust Asynchronous Stochastic Gradient-Push: Asymptotically Optimal and Network-Independent Performance for Strongly Convex Functions By Published On :: 2020 We consider the standard model of distributed optimization of a sum of functions $F(mathbf z) = sum_{i=1}^n f_i(mathbf z)$, where node $i$ in a network holds the function $f_i(mathbf z)$. We allow for a harsh network model characterized by asynchronous updates, message delays, unpredictable message losses, and directed communication among nodes. In this setting, we analyze a modification of the Gradient-Push method for distributed optimization, assuming that (i) node $i$ is capable of generating gradients of its function $f_i(mathbf z)$ corrupted by zero-mean bounded-support additive noise at each step, (ii) $F(mathbf z)$ is strongly convex, and (iii) each $f_i(mathbf z)$ has Lipschitz gradients. We show that our proposed method asymptotically performs as well as the best bounds on centralized gradient descent that takes steps in the direction of the sum of the noisy gradients of all the functions $f_1(mathbf z), ldots, f_n(mathbf z)$ at each step. Full Article
us Exact Guarantees on the Absence of Spurious Local Minima for Non-negative Rank-1 Robust Principal Component Analysis By Published On :: 2020 This work is concerned with the non-negative rank-1 robust principal component analysis (RPCA), where the goal is to recover the dominant non-negative principal components of a data matrix precisely, where a number of measurements could be grossly corrupted with sparse and arbitrary large noise. Most of the known techniques for solving the RPCA rely on convex relaxation methods by lifting the problem to a higher dimension, which significantly increase the number of variables. As an alternative, the well-known Burer-Monteiro approach can be used to cast the RPCA as a non-convex and non-smooth $ell_1$ optimization problem with a significantly smaller number of variables. In this work, we show that the low-dimensional formulation of the symmetric and asymmetric positive rank-1 RPCA based on the Burer-Monteiro approach has benign landscape, i.e., 1) it does not have any spurious local solution, 2) has a unique global solution, and 3) its unique global solution coincides with the true components. An implication of this result is that simple local search algorithms are guaranteed to achieve a zero global optimality gap when directly applied to the low-dimensional formulation. Furthermore, we provide strong deterministic and probabilistic guarantees for the exact recovery of the true principal components. In particular, it is shown that a constant fraction of the measurements could be grossly corrupted and yet they would not create any spurious local solution. Full Article
us Generalized Optimal Matching Methods for Causal Inference By Published On :: 2020 We develop an encompassing framework for matching, covariate balancing, and doubly-robust methods for causal inference from observational data called generalized optimal matching (GOM). The framework is given by generalizing a new functional-analytical formulation of optimal matching, giving rise to the class of GOM methods, for which we provide a single unified theory to analyze tractability and consistency. Many commonly used existing methods are included in GOM and, using their GOM interpretation, can be extended to optimally and automatically trade off balance for variance and outperform their standard counterparts. As a subclass, GOM gives rise to kernel optimal matching (KOM), which, as supported by new theoretical and empirical results, is notable for combining many of the positive properties of other methods in one. KOM, which is solved as a linearly-constrained convex-quadratic optimization problem, inherits both the interpretability and model-free consistency of matching but can also achieve the $sqrt{n}$-consistency of well-specified regression and the bias reduction and robustness of doubly robust methods. In settings of limited overlap, KOM enables a very transparent method for interval estimation for partial identification and robust coverage. We demonstrate this in examples with both synthetic and real data. Full Article
us Smoothed Nonparametric Derivative Estimation using Weighted Difference Quotients By Published On :: 2020 Derivatives play an important role in bandwidth selection methods (e.g., plug-ins), data analysis and bias-corrected confidence intervals. Therefore, obtaining accurate derivative information is crucial. Although many derivative estimation methods exist, the majority require a fixed design assumption. In this paper, we propose an effective and fully data-driven framework to estimate the first and second order derivative in random design. We establish the asymptotic properties of the proposed derivative estimator, and also propose a fast selection method for the tuning parameters. The performance and flexibility of the method is illustrated via an extensive simulation study. Full Article
us Union of Low-Rank Tensor Spaces: Clustering and Completion By Published On :: 2020 We consider the problem of clustering and completing a set of tensors with missing data that are drawn from a union of low-rank tensor spaces. In the clustering problem, given a partially sampled tensor data that is composed of a number of subtensors, each chosen from one of a certain number of unknown tensor spaces, we need to group the subtensors that belong to the same tensor space. We provide a geometrical analysis on the sampling pattern and subsequently derive the sampling rate that guarantees the correct clustering under some assumptions with high probability. Moreover, we investigate the fundamental conditions for finite/unique completability for the union of tensor spaces completion problem. Both deterministic and probabilistic conditions on the sampling pattern to ensure finite/unique completability are obtained. For both the clustering and completion problems, our tensor analysis provides significantly better bound than the bound given by the matrix analysis applied to any unfolding of the tensor data. Full Article
us High-dimensional Gaussian graphical models on network-linked data By Published On :: 2020 Graphical models are commonly used to represent conditional dependence relationships between variables. There are multiple methods available for exploring them from high-dimensional data, but almost all of them rely on the assumption that the observations are independent and identically distributed. At the same time, observations connected by a network are becoming increasingly common, and tend to violate these assumptions. Here we develop a Gaussian graphical model for observations connected by a network with potentially different mean vectors, varying smoothly over the network. We propose an efficient estimation algorithm and demonstrate its effectiveness on both simulated and real data, obtaining meaningful and interpretable results on a statistics coauthorship network. We also prove that our method estimates both the inverse covariance matrix and the corresponding graph structure correctly under the assumption of network “cohesion”, which refers to the empirically observed phenomenon of network neighbors sharing similar traits. Full Article
us Identifiability of Additive Noise Models Using Conditional Variances By Published On :: 2020 This paper considers a new identifiability condition for additive noise models (ANMs) in which each variable is determined by an arbitrary Borel measurable function of its parents plus an independent error. It has been shown that ANMs are fully recoverable under some identifiability conditions, such as when all error variances are equal. However, this identifiable condition could be restrictive, and hence, this paper focuses on a relaxed identifiability condition that involves not only error variances, but also the influence of parents. This new class of identifiable ANMs does not put any constraints on the form of dependencies, or distributions of errors, and allows different error variances. It further provides a statistically consistent and computationally feasible structure learning algorithm for the identifiable ANMs based on the new identifiability condition. The proposed algorithm assumes that all relevant variables are observed, while it does not assume faithfulness or a sparse graph. Demonstrated through extensive simulated and real multivariate data is that the proposed algorithm successfully recovers directed acyclic graphs. Full Article
us TIGER: using artificial intelligence to discover our collections By feedproxy.google.com Published On :: Tue, 10 Mar 2020 22:01:20 +0000 The State Library of NSW has almost 4 million digital files in its collection. Full Article
us Researching the Pacific: The Pacific Manuscripts Bureau By feedproxy.google.com Published On :: Mon, 27 Apr 2020 05:25:40 +0000 The State Library holds a superb collection of original documents, illustrations, photographs and books about the Pacifi Full Article
us Access thousands of newspapers and magazines with PressReader By feedproxy.google.com Published On :: Mon, 04 May 2020 03:40:42 +0000 Want to access thousands of newspapers and magazines wherever you are? Full Article
us Q&A with Adam Ferguson By feedproxy.google.com Published On :: Tue, 05 May 2020 05:43:41 +0000 Each year the Library hosts the popular World Press Photo exhibition, bringing together award-winning photographs from t Full Article
us Share your fall and winter photos with us! By www.eastgwillimbury.ca Published On :: Sun, 03 May 2020 16:08:06 GMT Full Article
us A Bayesian sparse finite mixture model for clustering data from a heterogeneous population By projecteuclid.org Published On :: Mon, 04 May 2020 04:00 EDT Erlandson F. Saraiva, Adriano K. Suzuki, Luís A. Milan. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 2, 323--344.Abstract: In this paper, we introduce a Bayesian approach for clustering data using a sparse finite mixture model (SFMM). The SFMM is a finite mixture model with a large number of components $k$ previously fixed where many components can be empty. In this model, the number of components $k$ can be interpreted as the maximum number of distinct mixture components. Then, we explore the use of a prior distribution for the weights of the mixture model that take into account the possibility that the number of clusters $k_{mathbf{c}}$ (e.g., nonempty components) can be random and smaller than the number of components $k$ of the finite mixture model. In order to determine clusters we develop a MCMC algorithm denominated Split-Merge allocation sampler. In this algorithm, the split-merge strategy is data-driven and was inserted within the algorithm in order to increase the mixing of the Markov chain in relation to the number of clusters. The performance of the method is verified using simulated datasets and three real datasets. The first real data set is the benchmark galaxy data, while second and third are the publicly available data set on Enzyme and Acidity, respectively. Full Article
us Agnostic tests can control the type I and type II errors simultaneously By projecteuclid.org Published On :: Mon, 04 May 2020 04:00 EDT Victor Coscrato, Rafael Izbicki, Rafael B. Stern. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 2, 230--250.Abstract: Despite its common practice, statistical hypothesis testing presents challenges in interpretation. For instance, in the standard frequentist framework there is no control of the type II error. As a result, the non-rejection of the null hypothesis $(H_{0})$ cannot reasonably be interpreted as its acceptance. We propose that this dilemma can be overcome by using agnostic hypothesis tests, since they can control the type I and II errors simultaneously. In order to make this idea operational, we show how to obtain agnostic hypothesis in typical models. For instance, we show how to build (unbiased) uniformly most powerful agnostic tests and how to obtain agnostic tests from standard p-values. Also, we present conditions such that the above tests can be made logically coherent. Finally, we present examples of consistent agnostic hypothesis tests. Full Article
us A note on the “L-logistic regression models: Prior sensitivity analysis, robustness to outliers and applications” By projecteuclid.org Published On :: Mon, 03 Feb 2020 04:00 EST Saralees Nadarajah, Yuancheng Si. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 1, 183--187.Abstract: Da Paz, Balakrishnan and Bazan [Braz. J. Probab. Stat. 33 (2019), 455–479] introduced the L-logistic distribution, studied its properties including estimation issues and illustrated a data application. This note derives a closed form expression for moment properties of the distribution. Some computational issues are discussed. Full Article
us Robust Bayesian model selection for heavy-tailed linear regression using finite mixtures By projecteuclid.org Published On :: Mon, 03 Feb 2020 04:00 EST Flávio B. Gonçalves, Marcos O. Prates, Victor Hugo Lachos. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 1, 51--70.Abstract: In this paper, we present a novel methodology to perform Bayesian model selection in linear models with heavy-tailed distributions. We consider a finite mixture of distributions to model a latent variable where each component of the mixture corresponds to one possible model within the symmetrical class of normal independent distributions. Naturally, the Gaussian model is one of the possibilities. This allows for a simultaneous analysis based on the posterior probability of each model. Inference is performed via Markov chain Monte Carlo—a Gibbs sampler with Metropolis–Hastings steps for a class of parameters. Simulated examples highlight the advantages of this approach compared to a segregated analysis based on arbitrarily chosen model selection criteria. Examples with real data are presented and an extension to censored linear regression is introduced and discussed. Full Article
us Subjective Bayesian testing using calibrated prior probabilities By projecteuclid.org Published On :: Mon, 26 Aug 2019 04:00 EDT Dan J. Spitzner. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 4, 861--893.Abstract: This article proposes a calibration scheme for Bayesian testing that coordinates analytically-derived statistical performance considerations with expert opinion. In other words, the scheme is effective and meaningful for incorporating objective elements into subjective Bayesian inference. It explores a novel role for default priors as anchors for calibration rather than substitutes for prior knowledge. Ideas are developed for use with multiplicity adjustments in multiple-model contexts, and to address the issue of prior sensitivity of Bayes factors. Along the way, the performance properties of an existing multiplicity adjustment related to the Poisson distribution are clarified theoretically. Connections of the overall calibration scheme to the Schwarz criterion are also explored. The proposed framework is examined and illustrated on a number of existing data sets related to problems in clinical trials, forensic pattern matching, and log-linear models methodology. Full Article
us Bayesian modelling of the abilities in dichotomous IRT models via regression with missing values in the covariates By projecteuclid.org Published On :: Mon, 26 Aug 2019 04:00 EDT Flávio B. Gonçalves, Bárbara C. C. Dias. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 4, 782--800.Abstract: Educational assessment usually considers a contextual questionnaire to extract relevant information from the applicants. This may include items related to socio-economical profile as well as items to extract other characteristics potentially related to applicant’s performance in the test. A careful analysis of the questionnaires jointly with the test’s results may evidence important relations between profiles and test performance. The most coherent way to perform this task in a statistical context is to use the information from the questionnaire to help explain the variability of the abilities in a joint model-based approach. Nevertheless, the responses to the questionnaire typically present missing values which, in some cases, may be missing not at random. This paper proposes a statistical methodology to model the abilities in dichotomous IRT models using the information of the contextual questionnaires via linear regression. The proposed methodology models the missing data jointly with the all the observed data, which allows for the estimation of the former. The missing data modelling is flexible enough to allow the specification of missing not at random structures. Furthermore, even if those structures are not assumed a priori, they can be estimated from the posterior results when assuming missing (completely) at random structures a priori. Statistical inference is performed under the Bayesian paradigm via an efficient MCMC algorithm. Simulated and real examples are presented to investigate the efficiency and applicability of the proposed methodology. Full Article
us L-Logistic regression models: Prior sensitivity analysis, robustness to outliers and applications By projecteuclid.org Published On :: Mon, 10 Jun 2019 04:04 EDT Rosineide F. da Paz, Narayanaswamy Balakrishnan, Jorge Luis Bazán. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 3, 455--479.Abstract: Tadikamalla and Johnson [ Biometrika 69 (1982) 461–465] developed the $L_{B}$ distribution to variables with bounded support by considering a transformation of the standard Logistic distribution. In this manuscript, a convenient parametrization of this distribution is proposed in order to develop regression models. This distribution, referred to here as L-Logistic distribution, provides great flexibility and includes the uniform distribution as a particular case. Several properties of this distribution are studied, and a Bayesian approach is adopted for the parameter estimation. Simulation studies, considering prior sensitivity analysis, recovery of parameters and comparison of algorithms, and robustness to outliers are all discussed showing that the results are insensitive to the choice of priors, efficiency of the algorithm MCMC adopted, and robustness of the model when compared with the beta distribution. Applications to estimate the vulnerability to poverty and to explain the anxiety are performed. The results to applications show that the L-Logistic regression models provide a better fit than the corresponding beta regression models. Full Article
us Failure rate of Birnbaum–Saunders distributions: Shape, change-point, estimation and robustness By projecteuclid.org Published On :: Mon, 04 Mar 2019 04:00 EST Emilia Athayde, Assis Azevedo, Michelli Barros, Víctor Leiva. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 2, 301--328.Abstract: The Birnbaum–Saunders (BS) distribution has been largely studied and applied. A random variable with BS distribution is a transformation of another random variable with standard normal distribution. Generalized BS distributions are obtained when the normally distributed random variable is replaced by another symmetrically distributed random variable. This allows us to obtain a wide class of positively skewed models with lighter and heavier tails than the BS model. Its failure rate admits several shapes, including the unimodal case, with its change-point being able to be used for different purposes. For example, to establish the reduction in a dose, and then in the cost of the medical treatment. We analyze the failure rates of generalized BS distributions obtained by the logistic, normal and Student-t distributions, considering their shape and change-point, estimating them, evaluating their robustness, assessing their performance by simulations, and applying the results to real data from different areas. Full Article
us Bayesian robustness to outliers in linear regression and ratio estimation By projecteuclid.org Published On :: Mon, 04 Mar 2019 04:00 EST Alain Desgagné, Philippe Gagnon. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 2, 205--221.Abstract: Whole robustness is a nice property to have for statistical models. It implies that the impact of outliers gradually vanishes as they approach plus or minus infinity. So far, the Bayesian literature provides results that ensure whole robustness for the location-scale model. In this paper, we make two contributions. First, we generalise the results to attain whole robustness in simple linear regression through the origin, which is a necessary step towards results for general linear regression models. We allow the variance of the error term to depend on the explanatory variable. This flexibility leads to the second contribution: we provide a simple Bayesian approach to robustly estimate finite population means and ratios. The strategy to attain whole robustness is simple since it lies in replacing the traditional normal assumption on the error term by a super heavy-tailed distribution assumption. As a result, users can estimate the parameters as usual, using the posterior distribution. Full Article
us Simple tail index estimation for dependent and heterogeneous data with missing values By projecteuclid.org Published On :: Mon, 14 Jan 2019 04:01 EST Ivana Ilić, Vladica M. Veličković. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 1, 192--203.Abstract: Financial returns are known to be nonnormal and tend to have fat-tailed distribution. Also, the dependence of large values in a stochastic process is an important topic in risk, insurance and finance. In the presence of missing values, we deal with the asymptotic properties of a simple “median” estimator of the tail index based on random variables with the heavy-tailed distribution function and certain dependence among the extremes. Weak consistency and asymptotic normality of the proposed estimator are established. The estimator is a special case of a well-known estimator defined in Bacro and Brito [ Statistics & Decisions 3 (1993) 133–143]. The advantage of the estimator is its robustness against deviations and compared to Hill’s, it is less affected by the fluctuations related to the maximum of the sample or by the presence of outliers. Several examples are analyzed in order to support the proofs. Full Article
us The equivalence of dynamic and static asset allocations under the uncertainty caused by Poisson processes By projecteuclid.org Published On :: Mon, 14 Jan 2019 04:01 EST Yong-Chao Zhang, Na Zhang. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 1, 184--191.Abstract: We investigate the equivalence of dynamic and static asset allocations in the case where the price process of a risky asset is driven by a Poisson process. Under some mild conditions, we obtain a necessary and sufficient condition for the equivalence of dynamic and static asset allocations. In addition, we provide a simple sufficient condition for the equivalence. Full Article
us Unlikeness is us : fourteen from the Exeter book By dal.novanet.ca Published On :: Fri, 1 May 2020 19:34:09 -0300 Author: Exeter book. Selections. EnglishCallnumber: PS 8631 A8489 E94 2018ISBN: 9781554471751 (softcover) Full Article
us Odysseus asleep : uncollected sequences, 1994-2019 By dal.novanet.ca Published On :: Fri, 1 May 2020 19:34:09 -0300 Author: Sanger, Peter, 1943- author.Callnumber: PS 8587 A372 O44 2019ISBN: 9781554472048 Full Article
us Nights below Foord Street : literature and popular culture in postindustrial Nova Scotia By dal.novanet.ca Published On :: Fri, 1 May 2020 19:34:09 -0300 Author: Thompson, Peter, 1981- author.Callnumber: PS 8131 N6 T56 2019ISBN: 0773559345 Full Article