ap On the consistency of graph-based Bayesian semi-supervised learning and the scalability of sampling algorithms By Published On :: 2020 This paper considers a Bayesian approach to graph-based semi-supervised learning. We show that if the graph parameters are suitably scaled, the graph-posteriors converge to a continuum limit as the size of the unlabeled data set grows. This consistency result has profound algorithmic implications: we prove that when consistency holds, carefully designed Markov chain Monte Carlo algorithms have a uniform spectral gap, independent of the number of unlabeled inputs. Numerical experiments illustrate and complement the theory. Full Article
ap Provably robust estimation of modulo 1 samples of a smooth function with applications to phase unwrapping By Published On :: 2020 Consider an unknown smooth function $f: [0,1]^d ightarrow mathbb{R}$, and assume we are given $n$ noisy mod 1 samples of $f$, i.e., $y_i = (f(x_i) + eta_i) mod 1$, for $x_i in [0,1]^d$, where $eta_i$ denotes the noise. Given the samples $(x_i,y_i)_{i=1}^{n}$, our goal is to recover smooth, robust estimates of the clean samples $f(x_i) mod 1$. We formulate a natural approach for solving this problem, which works with angular embeddings of the noisy mod 1 samples over the unit circle, inspired by the angular synchronization framework. This amounts to solving a smoothness regularized least-squares problem -- a quadratically constrained quadratic program (QCQP) -- where the variables are constrained to lie on the unit circle. Our proposed approach is based on solving its relaxation, which is a trust-region sub-problem and hence solvable efficiently. We provide theoretical guarantees demonstrating its robustness to noise for adversarial, as well as random Gaussian and Bernoulli noise models. To the best of our knowledge, these are the first such theoretical results for this problem. We demonstrate the robustness and efficiency of our proposed approach via extensive numerical simulations on synthetic data, along with a simple least-squares based solution for the unwrapping stage, that recovers the original samples of $f$ (up to a global shift). It is shown to perform well at high levels of noise, when taking as input the denoised modulo $1$ samples. Finally, we also consider two other approaches for denoising the modulo 1 samples that leverage tools from Riemannian optimization on manifolds, including a Burer-Monteiro approach for a semidefinite programming relaxation of our formulation. For the two-dimensional version of the problem, which has applications in synthetic aperture radar interferometry (InSAR), we are able to solve instances of real-world data with a million sample points in under 10 seconds, on a personal laptop. Full Article
ap Graph-Dependent Implicit Regularisation for Distributed Stochastic Subgradient Descent By Published On :: 2020 We propose graph-dependent implicit regularisation strategies for synchronised distributed stochastic subgradient descent (Distributed SGD) for convex problems in multi-agent learning. Under the standard assumptions of convexity, Lipschitz continuity, and smoothness, we establish statistical learning rates that retain, up to logarithmic terms, single-machine serial statistical guarantees through implicit regularisation (step size tuning and early stopping) with appropriate dependence on the graph topology. Our approach avoids the need for explicit regularisation in decentralised learning problems, such as adding constraints to the empirical risk minimisation rule. Particularly for distributed methods, the use of implicit regularisation allows the algorithm to remain simple, without projections or dual methods. To prove our results, we establish graph-independent generalisation bounds for Distributed SGD that match the single-machine serial SGD setting (using algorithmic stability), and we establish graph-dependent optimisation bounds that are of independent interest. We present numerical experiments to show that the qualitative nature of the upper bounds we derive can be representative of real behaviours. Full Article
ap Skill Rating for Multiplayer Games. Introducing Hypernode Graphs and their Spectral Theory By Published On :: 2020 We consider the skill rating problem for multiplayer games, that is how to infer player skills from game outcomes in multiplayer games. We formulate the problem as a minimization problem $arg min_{s} s^T Delta s$ where $Delta$ is a positive semidefinite matrix and $s$ a real-valued function, of which some entries are the skill values to be inferred and other entries are constrained by the game outcomes. We leverage graph-based semi-supervised learning (SSL) algorithms for this problem. We apply our algorithms on several data sets of multiplayer games and obtain very promising results compared to Elo Duelling (see Elo, 1978) and TrueSkill (see Herbrich et al., 2006).. As we leverage graph-based SSL algorithms and because games can be seen as relations between sets of players, we then generalize the approach. For this aim, we introduce a new finite model, called hypernode graph, defined to be a set of weighted binary relations between sets of nodes. We define Laplacians of hypernode graphs. Then, we show that the skill rating problem for multiplayer games can be formulated as $arg min_{s} s^T Delta s$ where $Delta$ is the Laplacian of a hypernode graph constructed from a set of games. From a fundamental perspective, we show that hypernode graph Laplacians are symmetric positive semidefinite matrices with constant functions in their null space. We show that problems on hypernode graphs can not be solved with graph constructions and graph kernels. We relate hypernode graphs to signed graphs showing that positive relations between groups can lead to negative relations between individuals. Full Article
ap High-Dimensional Inference for Cluster-Based Graphical Models By Published On :: 2020 Motivated by modern applications in which one constructs graphical models based on a very large number of features, this paper introduces a new class of cluster-based graphical models, in which variable clustering is applied as an initial step for reducing the dimension of the feature space. We employ model assisted clustering, in which the clusters contain features that are similar to the same unobserved latent variable. Two different cluster-based Gaussian graphical models are considered: the latent variable graph, corresponding to the graphical model associated with the unobserved latent variables, and the cluster-average graph, corresponding to the vector of features averaged over clusters. Our study reveals that likelihood based inference for the latent graph, not analyzed previously, is analytically intractable. Our main contribution is the development and analysis of alternative estimation and inference strategies, for the precision matrix of an unobservable latent vector Z. We replace the likelihood of the data by an appropriate class of empirical risk functions, that can be specialized to the latent graphical model and to the simpler, but under-analyzed, cluster-average graphical model. The estimators thus derived can be used for inference on the graph structure, for instance on edge strength or pattern recovery. Inference is based on the asymptotic limits of the entry-wise estimates of the precision matrices associated with the conditional independence graphs under consideration. While taking the uncertainty induced by the clustering step into account, we establish Berry-Esseen central limit theorems for the proposed estimators. It is noteworthy that, although the clusters are estimated adaptively from the data, the central limit theorems regarding the entries of the estimated graphs are proved under the same conditions one would use if the clusters were known in advance. As an illustration of the usage of these newly developed inferential tools, we show that they can be reliably used for recovery of the sparsity pattern of the graphs we study, under FDR control, which is verified via simulation studies and an fMRI data analysis. These experimental results confirm the theoretically established difference between the two graph structures. Furthermore, the data analysis suggests that the latent variable graph, corresponding to the unobserved cluster centers, can help provide more insight into the understanding of the brain connectivity networks relative to the simpler, average-based, graph. Full Article
ap GraKeL: A Graph Kernel Library in Python By Published On :: 2020 The problem of accurately measuring the similarity between graphs is at the core of many applications in a variety of disciplines. Graph kernels have recently emerged as a promising approach to this problem. There are now many kernels, each focusing on different structural aspects of graphs. Here, we present GraKeL, a library that unifies several graph kernels into a common framework. The library is written in Python and adheres to the scikit-learn interface. It is simple to use and can be naturally combined with scikit-learn's modules to build a complete machine learning pipeline for tasks such as graph classification and clustering. The code is BSD licensed and is available at: https://github.com/ysig/GraKeL. Full Article
ap Multiparameter Persistence Landscapes By Published On :: 2020 An important problem in the field of Topological Data Analysis is defining topological summaries which can be combined with traditional data analytic tools. In recent work Bubenik introduced the persistence landscape, a stable representation of persistence diagrams amenable to statistical analysis and machine learning tools. In this paper we generalise the persistence landscape to multiparameter persistence modules providing a stable representation of the rank invariant. We show that multiparameter landscapes are stable with respect to the interleaving distance and persistence weighted Wasserstein distance, and that the collection of multiparameter landscapes faithfully represents the rank invariant. Finally we provide example calculations and statistical tests to demonstrate a range of potential applications and how one can interpret the landscapes associated to a multiparameter module. Full Article
ap Community-Based Group Graphical Lasso By Published On :: 2020 A new strategy for probabilistic graphical modeling is developed that draws parallels to community detection analysis. The method jointly estimates an undirected graph and homogeneous communities of nodes. The structure of the communities is taken into account when estimating the graph and at the same time, the structure of the graph is accounted for when estimating communities of nodes. The procedure uses a joint group graphical lasso approach with community detection-based grouping, such that some groups of edges co-occur in the estimated graph. The grouping structure is unknown and is estimated based on community detection algorithms. Theoretical derivations regarding graph convergence and sparsistency, as well as accuracy of community recovery are included, while the method's empirical performance is illustrated in an fMRI context, as well as with simulated examples. Full Article
ap Representation Learning for Dynamic Graphs: A Survey By Published On :: 2020 Graphs arise naturally in many real-world applications including social networks, recommender systems, ontologies, biology, and computational finance. Traditionally, machine learning models for graphs have been mostly designed for static graphs. However, many applications involve evolving graphs. This introduces important challenges for learning and inference since nodes, attributes, and edges change over time. In this survey, we review the recent advances in representation learning for dynamic graphs, including dynamic knowledge graphs. We describe existing models from an encoder-decoder perspective, categorize these encoders and decoders based on the techniques they employ, and analyze the approaches in each category. We also review several prominent applications and widely used datasets and highlight directions for future research. Full Article
ap Scalable Approximate MCMC Algorithms for the Horseshoe Prior By Published On :: 2020 The horseshoe prior is frequently employed in Bayesian analysis of high-dimensional models, and has been shown to achieve minimax optimal risk properties when the truth is sparse. While optimization-based algorithms for the extremely popular Lasso and elastic net procedures can scale to dimension in the hundreds of thousands, algorithms for the horseshoe that use Markov chain Monte Carlo (MCMC) for computation are limited to problems an order of magnitude smaller. This is due to high computational cost per step and growth of the variance of time-averaging estimators as a function of dimension. We propose two new MCMC algorithms for computation in these models that have significantly improved performance compared to existing alternatives. One of the algorithms also approximates an expensive matrix product to give orders of magnitude speedup in high-dimensional applications. We prove guarantees for the accuracy of the approximate algorithm, and show that gradually decreasing the approximation error as the chain extends results in an exact algorithm. The scalability of the algorithm is illustrated in simulations with problem size as large as $N=5,000$ observations and $p=50,000$ predictors, and an application to a genome-wide association study with $N=2,267$ and $p=98,385$. The empirical results also show that the new algorithm yields estimates with lower mean squared error, intervals with better coverage, and elucidates features of the posterior that were often missed by previous algorithms in high dimensions, including bimodality of posterior marginals indicating uncertainty about which covariates belong in the model. Full Article
ap High-dimensional Gaussian graphical models on network-linked data By Published On :: 2020 Graphical models are commonly used to represent conditional dependence relationships between variables. There are multiple methods available for exploring them from high-dimensional data, but almost all of them rely on the assumption that the observations are independent and identically distributed. At the same time, observations connected by a network are becoming increasingly common, and tend to violate these assumptions. Here we develop a Gaussian graphical model for observations connected by a network with potentially different mean vectors, varying smoothly over the network. We propose an efficient estimation algorithm and demonstrate its effectiveness on both simulated and real data, obtaining meaningful and interpretable results on a statistics coauthorship network. We also prove that our method estimates both the inverse covariance matrix and the corresponding graph structure correctly under the assumption of network “cohesion”, which refers to the empirically observed phenomenon of network neighbors sharing similar traits. Full Article
ap Access thousands of newspapers and magazines with PressReader By feedproxy.google.com Published On :: Mon, 04 May 2020 03:40:42 +0000 Want to access thousands of newspapers and magazines wherever you are? Full Article
ap Bayesian modeling and prior sensitivity analysis for zero–one augmented beta regression models with an application to psychometric data By projecteuclid.org Published On :: Mon, 04 May 2020 04:00 EDT Danilo Covaes Nogarotto, Caio Lucidius Naberezny Azevedo, Jorge Luis Bazán. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 2, 304--322.Abstract: The interest on the analysis of the zero–one augmented beta regression (ZOABR) model has been increasing over the last few years. In this work, we developed a Bayesian inference for the ZOABR model, providing some contributions, namely: we explored the use of Jeffreys-rule and independence Jeffreys prior for some of the parameters, performing a sensitivity study of prior choice, comparing the Bayesian estimates with the maximum likelihood ones and measuring the accuracy of the estimates under several scenarios of interest. The results indicate, in a general way, that: the Bayesian approach, under the Jeffreys-rule prior, was as accurate as the ML one. Also, different from other approaches, we use the predictive distribution of the response to implement Bayesian residuals. To further illustrate the advantages of our approach, we conduct an analysis of a real psychometric data set including a Bayesian residual analysis, where it is shown that misleading inference can be obtained when the data is transformed. That is, when the zeros and ones are transformed to suitable values and the usual beta regression model is considered, instead of the ZOABR model. Finally, future developments are discussed. Full Article
ap Adaptive two-treatment three-period crossover design for normal responses By projecteuclid.org Published On :: Mon, 04 May 2020 04:00 EDT Uttam Bandyopadhyay, Shirsendu Mukherjee, Atanu Biswas. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 2, 291--303.Abstract: In adaptive crossover design, our goal is to allocate more patients to a promising treatment sequence. The present work contains a very simple three period crossover design for two competing treatments where the allocation in period 3 is done on the basis of the data obtained from the first two periods. Assuming normality of response variables we use a reliability functional for the choice between two treatments. We calculate the allocation proportions and their standard errors corresponding to the possible treatment combinations. We also derive some asymptotic results and provide solutions on related inferential problems. Moreover, the proposed procedure is compared with a possible competitor. Finally, we use a data set to illustrate the applicability of the proposed design. Full Article
ap A note on the “L-logistic regression models: Prior sensitivity analysis, robustness to outliers and applications” By projecteuclid.org Published On :: Mon, 03 Feb 2020 04:00 EST Saralees Nadarajah, Yuancheng Si. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 1, 183--187.Abstract: Da Paz, Balakrishnan and Bazan [Braz. J. Probab. Stat. 33 (2019), 455–479] introduced the L-logistic distribution, studied its properties including estimation issues and illustrated a data application. This note derives a closed form expression for moment properties of the distribution. Some computational issues are discussed. Full Article
ap Application of weighted and unordered majorization orders in comparisons of parallel systems with exponentiated generalized gamma components By projecteuclid.org Published On :: Mon, 03 Feb 2020 04:00 EST Abedin Haidari, Amir T. Payandeh Najafabadi, Narayanaswamy Balakrishnan. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 1, 150--166.Abstract: Consider two parallel systems, say $A$ and $B$, with respective lifetimes $T_{1}$ and $T_{2}$ wherein independent component lifetimes of each system follow exponentiated generalized gamma distribution with possibly different exponential shape and scale parameters. We show here that $T_{2}$ is smaller than $T_{1}$ with respect to the usual stochastic order (reversed hazard rate order) if the vector of logarithm (the main vector) of scale parameters of System $B$ is weakly weighted majorized by that of System $A$, and if the vector of exponential shape parameters of System $A$ is unordered mojorized by that of System $B$. By means of some examples, we show that the above results can not be extended to the hazard rate and likelihood ratio orders. However, when the scale parameters of each system divide into two homogeneous groups, we verify that the usual stochastic and reversed hazard rate orders can be extended, respectively, to the hazard rate and likelihood ratio orders. The established results complete and strengthen some of the known results in the literature. Full Article
ap Multivariate normal approximation of the maximum likelihood estimator via the delta method By projecteuclid.org Published On :: Mon, 03 Feb 2020 04:00 EST Andreas Anastasiou, Robert E. Gaunt. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 1, 136--149.Abstract: We use the delta method and Stein’s method to derive, under regularity conditions, explicit upper bounds for the distributional distance between the distribution of the maximum likelihood estimator (MLE) of a $d$-dimensional parameter and its asymptotic multivariate normal distribution. Our bounds apply in situations in which the MLE can be written as a function of a sum of i.i.d. $t$-dimensional random vectors. We apply our general bound to establish a bound for the multivariate normal approximation of the MLE of the normal distribution with unknown mean and variance. Full Article
ap Effects of gene–environment and gene–gene interactions in case-control studies: A novel Bayesian semiparametric approach By projecteuclid.org Published On :: Mon, 03 Feb 2020 04:00 EST Durba Bhattacharya, Sourabh Bhattacharya. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 1, 71--89.Abstract: Present day bio-medical research is pointing towards the fact that cognizance of gene–environment interactions along with genetic interactions may help prevent or detain the onset of many complex diseases like cardiovascular disease, cancer, type2 diabetes, autism or asthma by adjustments to lifestyle. In this regard, we propose a Bayesian semiparametric model to detect not only the roles of genes and their interactions, but also the possible influence of environmental variables on the genes in case-control studies. Our model also accounts for the unknown number of genetic sub-populations via finite mixtures composed of Dirichlet processes. An effective parallel computing methodology, developed by us harnesses the power of parallel processing technology to increase the efficiencies of our conditionally independent Gibbs sampling and Transformation based MCMC (TMCMC) methods. Applications of our model and methods to simulation studies with biologically realistic genotype datasets and a real, case-control based genotype dataset on early onset of myocardial infarction (MI) have yielded quite interesting results beside providing some insights into the differential effect of gender on MI. Full Article
ap A joint mean-correlation modeling approach for longitudinal zero-inflated count data By projecteuclid.org Published On :: Mon, 03 Feb 2020 04:00 EST Weiping Zhang, Jiangli Wang, Fang Qian, Yu Chen. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 1, 35--50.Abstract: Longitudinal zero-inflated count data are widely encountered in many fields, while modeling the correlation between measurements for the same subject is more challenge due to the lack of suitable multivariate joint distributions. This paper studies a novel mean-correlation modeling approach for longitudinal zero-inflated regression model, solving both problems of specifying joint distribution and parsimoniously modeling correlations with no constraint. The joint distribution of zero-inflated discrete longitudinal responses is modeled by a copula model whose correlation parameters are innovatively represented in hyper-spherical coordinates. To overcome the computational intractability in maximizing the full likelihood function of the model, we further propose a computationally efficient pairwise likelihood approach. We then propose separated mean and correlation regression models to model these key quantities, such modeling approach can also handle irregularly and possibly subject-specific times points. The resulting estimators are shown to be consistent and asymptotically normal. Data example and simulations support the effectiveness of the proposed approach. Full Article
ap Bootstrap-based testing inference in beta regressions By projecteuclid.org Published On :: Mon, 03 Feb 2020 04:00 EST Fábio P. Lima, Francisco Cribari-Neto. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 1, 18--34.Abstract: We address the issue of performing testing inference in small samples in the class of beta regression models. We consider the likelihood ratio test and its standard bootstrap version. We also consider two alternative resampling-based tests. One of them uses the bootstrap test statistic replicates to numerically estimate a Bartlett correction factor that can be applied to the likelihood ratio test statistic. By doing so, we avoid estimation of quantities located in the tail of the likelihood ratio test statistic null distribution. The second alternative resampling-based test uses a fast double bootstrap scheme in which a single second level bootstrapping resample is performed for each first level bootstrap replication. It delivers accurate testing inferences at a computational cost that is considerably smaller than that of a standard double bootstrapping scheme. The Monte Carlo results we provide show that the standard likelihood ratio test tends to be quite liberal in small samples. They also show that the bootstrap tests deliver accurate testing inferences even when the sample size is quite small. An empirical application is also presented and discussed. Full Article
ap Bayesian approach for the zero-modified Poisson–Lindley regression model By projecteuclid.org Published On :: Mon, 26 Aug 2019 04:00 EDT Wesley Bertoli, Katiane S. Conceição, Marinho G. Andrade, Francisco Louzada. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 4, 826--860.Abstract: The primary goal of this paper is to introduce the zero-modified Poisson–Lindley regression model as an alternative to model overdispersed count data exhibiting inflation or deflation of zeros in the presence of covariates. The zero-modification is incorporated by considering that a zero-truncated process produces positive observations and consequently, the proposed model can be fitted without any previous information about the zero-modification present in a given dataset. A fully Bayesian approach based on the g-prior method has been considered for inference concerns. An intensive Monte Carlo simulation study has been conducted to evaluate the performance of the developed methodology and the maximum likelihood estimators. The proposed model was considered for the analysis of a real dataset on the number of bids received by $126$ U.S. firms between 1978–1985, and the impact of choosing different prior distributions for the regression coefficients has been studied. A sensitivity analysis to detect influential points has been performed based on the Kullback–Leibler divergence. A general comparison with some well-known regression models for discrete data has been presented. Full Article
ap Option pricing with bivariate risk-neutral density via copula and heteroscedastic model: A Bayesian approach By projecteuclid.org Published On :: Mon, 26 Aug 2019 04:00 EDT Lucas Pereira Lopes, Vicente Garibay Cancho, Francisco Louzada. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 4, 801--825.Abstract: Multivariate options are adequate tools for multi-asset risk management. The pricing models derived from the pioneer Black and Scholes method under the multivariate case consider that the asset-object prices follow a Brownian geometric motion. However, the construction of such methods imposes some unrealistic constraints on the process of fair option calculation, such as constant volatility over the maturity time and linear correlation between the assets. Therefore, this paper aims to price and analyze the fair price behavior of the call-on-max (bivariate) option considering marginal heteroscedastic models with dependence structure modeled via copulas. Concerning inference, we adopt a Bayesian perspective and computationally intensive methods based on Monte Carlo simulations via Markov Chain (MCMC). A simulation study examines the bias, and the root mean squared errors of the posterior means for the parameters. Real stocks prices of Brazilian banks illustrate the approach. For the proposed method is verified the effects of strike and dependence structure on the fair price of the option. The results show that the prices obtained by our heteroscedastic model approach and copulas differ substantially from the prices obtained by the model derived from Black and Scholes. Empirical results are presented to argue the advantages of our strategy. Full Article
ap Unions of random walk and percolation on infinite graphs By projecteuclid.org Published On :: Mon, 10 Jun 2019 04:04 EDT Kazuki Okamura. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 3, 586--637.Abstract: We consider a random object that is associated with both random walks and random media, specifically, the superposition of a configuration of subcritical Bernoulli percolation on an infinite connected graph and the trace of the simple random walk on the same graph. We investigate asymptotics for the number of vertices of the enlargement of the trace of the walk until a fixed time, when the time tends to infinity. This process is more highly self-interacting than the range of random walk, which yields difficulties. We show a law of large numbers on vertex-transitive transient graphs. We compare the process on a vertex-transitive graph with the process on a finitely modified graph of the original vertex-transitive graph and show their behaviors are similar. We show that the process fluctuates almost surely on a certain non-vertex-transitive graph. On the two-dimensional integer lattice, by investigating the size of the boundary of the trace, we give an estimate for variances of the process implying a law of large numbers. We give an example of a graph with unbounded degrees on which the process behaves in a singular manner. As by-products, some results for the range and the boundary, which will be of independent interest, are obtained. Full Article
ap Fake uniformity in a shape inversion formula By projecteuclid.org Published On :: Mon, 10 Jun 2019 04:04 EDT Christian Rau. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 3, 549--557.Abstract: We revisit a shape inversion formula derived by Panaretos in the context of a particle density estimation problem with unknown rotation of the particle. A distribution is presented which imitates, or “fakes”, the uniformity or Haar distribution that is part of that formula. Full Article
ap Spatially adaptive Bayesian image reconstruction through locally-modulated Markov random field models By projecteuclid.org Published On :: Mon, 10 Jun 2019 04:04 EDT Salem M. Al-Gezeri, Robert G. Aykroyd. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 3, 498--519.Abstract: The use of Markov random field (MRF) models has proven to be a fruitful approach in a wide range of image processing applications. It allows local texture information to be incorporated in a systematic and unified way and allows statistical inference theory to be applied giving rise to novel output summaries and enhanced image interpretation. A great advantage of such low-level approaches is that they lead to flexible models, which can be applied to a wide range of imaging problems without the need for significant modification. This paper proposes and explores the use of conditional MRF models for situations where multiple images are to be processed simultaneously, or where only a single image is to be reconstructed and a sequential approach is taken. Although the coupling of image intensity values is a special case of our approach, the main extension over previous proposals is to allow the direct coupling of other properties, such as smoothness or texture. This is achieved using a local modulating function which adjusts the influence of global smoothing without the need for a fully inhomogeneous prior model. Several modulating functions are considered and a detailed simulation study, motivated by remote sensing applications in archaeological geophysics, of conditional reconstruction is presented. The results demonstrate that a substantial improvement in the quality of the image reconstruction, in terms of errors and residuals, can be achieved using this approach, especially at locations with rapid changes in the underlying intensity. Full Article
ap L-Logistic regression models: Prior sensitivity analysis, robustness to outliers and applications By projecteuclid.org Published On :: Mon, 10 Jun 2019 04:04 EDT Rosineide F. da Paz, Narayanaswamy Balakrishnan, Jorge Luis Bazán. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 3, 455--479.Abstract: Tadikamalla and Johnson [ Biometrika 69 (1982) 461–465] developed the $L_{B}$ distribution to variables with bounded support by considering a transformation of the standard Logistic distribution. In this manuscript, a convenient parametrization of this distribution is proposed in order to develop regression models. This distribution, referred to here as L-Logistic distribution, provides great flexibility and includes the uniform distribution as a particular case. Several properties of this distribution are studied, and a Bayesian approach is adopted for the parameter estimation. Simulation studies, considering prior sensitivity analysis, recovery of parameters and comparison of algorithms, and robustness to outliers are all discussed showing that the results are insensitive to the choice of priors, efficiency of the algorithm MCMC adopted, and robustness of the model when compared with the beta distribution. Applications to estimate the vulnerability to poverty and to explain the anxiety are performed. The results to applications show that the L-Logistic regression models provide a better fit than the corresponding beta regression models. Full Article
ap Hierarchical modelling of power law processes for the analysis of repairable systems with different truncation times: An empirical Bayes approach By projecteuclid.org Published On :: Mon, 04 Mar 2019 04:00 EST Rodrigo Citton P. dos Reis, Enrico A. Colosimo, Gustavo L. Gilardoni. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 2, 374--396.Abstract: In the data analysis from multiple repairable systems, it is usual to observe both different truncation times and heterogeneity among the systems. Among other reasons, the latter is caused by different manufacturing lines and maintenance teams of the systems. In this paper, a hierarchical model is proposed for the statistical analysis of multiple repairable systems under different truncation times. A reparameterization of the power law process is proposed in order to obtain a quasi-conjugate bayesian analysis. An empirical Bayes approach is used to estimate model hyperparameters. The uncertainty in the estimate of these quantities are corrected by using a parametric bootstrap approach. The results are illustrated in a real data set of failure times of power transformers from an electric company in Brazil. Full Article
ap A new log-linear bimodal Birnbaum–Saunders regression model with application to survival data By projecteuclid.org Published On :: Mon, 04 Mar 2019 04:00 EST Francisco Cribari-Neto, Rodney V. Fonseca. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 2, 329--355.Abstract: The log-linear Birnbaum–Saunders model has been widely used in empirical applications. We introduce an extension of this model based on a recently proposed version of the Birnbaum–Saunders distribution which is more flexible than the standard Birnbaum–Saunders law since its density may assume both unimodal and bimodal shapes. We show how to perform point estimation, interval estimation and hypothesis testing inferences on the parameters that index the regression model we propose. We also present a number of diagnostic tools, such as residual analysis, local influence, generalized leverage, generalized Cook’s distance and model misspecification tests. We investigate the usefulness of model selection criteria and the accuracy of prediction intervals for the proposed model. Results of Monte Carlo simulations are presented. Finally, we also present and discuss an empirical application. Full Article
ap Failure rate of Birnbaum–Saunders distributions: Shape, change-point, estimation and robustness By projecteuclid.org Published On :: Mon, 04 Mar 2019 04:00 EST Emilia Athayde, Assis Azevedo, Michelli Barros, Víctor Leiva. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 2, 301--328.Abstract: The Birnbaum–Saunders (BS) distribution has been largely studied and applied. A random variable with BS distribution is a transformation of another random variable with standard normal distribution. Generalized BS distributions are obtained when the normally distributed random variable is replaced by another symmetrically distributed random variable. This allows us to obtain a wide class of positively skewed models with lighter and heavier tails than the BS model. Its failure rate admits several shapes, including the unimodal case, with its change-point being able to be used for different purposes. For example, to establish the reduction in a dose, and then in the cost of the medical treatment. We analyze the failure rates of generalized BS distributions obtained by the logistic, normal and Student-t distributions, considering their shape and change-point, estimating them, evaluating their robustness, assessing their performance by simulations, and applying the results to real data from different areas. Full Article
ap A brief review of optimal scaling of the main MCMC approaches and optimal scaling of additive TMCMC under non-regular cases By projecteuclid.org Published On :: Mon, 04 Mar 2019 04:00 EST Kushal K. Dey, Sourabh Bhattacharya. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 2, 222--266.Abstract: Transformation based Markov Chain Monte Carlo (TMCMC) was proposed by Dutta and Bhattacharya ( Statistical Methodology 16 (2014) 100–116) as an efficient alternative to the Metropolis–Hastings algorithm, especially in high dimensions. The main advantage of this algorithm is that it simultaneously updates all components of a high dimensional parameter using appropriate move types defined by deterministic transformation of a single random variable. This results in reduction in time complexity at each step of the chain and enhances the acceptance rate. In this paper, we first provide a brief review of the optimal scaling theory for various existing MCMC approaches, comparing and contrasting them with the corresponding TMCMC approaches.The optimal scaling of the simplest form of TMCMC, namely additive TMCMC , has been studied extensively for the Gaussian proposal density in Dey and Bhattacharya (2017a). Here, we discuss diffusion-based optimal scaling behavior of additive TMCMC for non-Gaussian proposal densities—in particular, uniform, Student’s $t$ and Cauchy proposals. Although we could not formally prove our diffusion result for the Cauchy proposal, simulation based results lead us to conjecture that at least the recipe for obtaining general optimal scaling and optimal acceptance rate holds for the Cauchy case as well. We also consider diffusion based optimal scaling of TMCMC when the target density is discontinuous. Such non-regular situations have been studied in the case of Random Walk Metropolis Hastings (RWMH) algorithm by Neal and Roberts ( Methodology and Computing in Applied Probability 13 (2011) 583–601) using expected squared jumping distance (ESJD), but the diffusion theory based scaling has not been considered. We compare our diffusion based optimally scaled TMCMC approach with the ESJD based optimally scaled RWM with simulation studies involving several target distributions and proposal distributions including the challenging Cauchy proposal case, showing that additive TMCMC outperforms RWMH in almost all cases considered. Full Article
ap Globalizing capital : a history of the international monetary system By dal.novanet.ca Published On :: Fri, 1 May 2020 19:34:09 -0300 Author: Eichengreen, Barry J., author.Callnumber: HG 3881 E347 2019ISBN: 9780691193908 (paperback) Full Article
ap Flexible, boundary adapted, nonparametric methods for the estimation of univariate piecewise-smooth functions By projecteuclid.org Published On :: Tue, 04 Feb 2020 04:00 EST Umberto Amato, Anestis Antoniadis, Italia De Feis. Source: Statistics Surveys, Volume 14, 32--70.Abstract: We present and compare some nonparametric estimation methods (wavelet and/or spline-based) designed to recover a one-dimensional piecewise-smooth regression function in both a fixed equidistant or not equidistant design regression model and a random design model. Wavelet methods are known to be very competitive in terms of denoising and compression, due to the simultaneous localization property of a function in time and frequency. However, boundary assumptions, such as periodicity or symmetry, generate bias and artificial wiggles which degrade overall accuracy. Simple methods have been proposed in the literature for reducing the bias at the boundaries. We introduce new ones based on adaptive combinations of two estimators. The underlying idea is to combine a highly accurate method for non-regular functions, e.g., wavelets, with one well behaved at boundaries, e.g., Splines or Local Polynomial. We provide some asymptotic optimal results supporting our approach. All the methods can handle data with a random design. We also sketch some generalization to the multidimensional setting. To study the performance of the proposed approaches we have conducted an extensive set of simulations on synthetic data. An interesting regression analysis of two real data applications using these procedures unambiguously demonstrates their effectiveness. Full Article
ap Scalar-on-function regression for predicting distal outcomes from intensively gathered longitudinal data: Interpretability for applied scientists By projecteuclid.org Published On :: Tue, 05 Nov 2019 22:03 EST John J. Dziak, Donna L. Coffman, Matthew Reimherr, Justin Petrovich, Runze Li, Saul Shiffman, Mariya P. Shiyko. Source: Statistics Surveys, Volume 13, 150--180.Abstract: Researchers are sometimes interested in predicting a distal or external outcome (such as smoking cessation at follow-up) from the trajectory of an intensively recorded longitudinal variable (such as urge to smoke). This can be done in a semiparametric way via scalar-on-function regression. However, the resulting fitted coefficient regression function requires special care for correct interpretation, as it represents the joint relationship of time points to the outcome, rather than a marginal or cross-sectional relationship. We provide practical guidelines, based on experience with scientific applications, for helping practitioners interpret their results and illustrate these ideas using data from a smoking cessation study. Full Article
ap An approximate likelihood perspective on ABC methods By projecteuclid.org Published On :: Fri, 08 Jun 2018 22:03 EDT George Karabatsos, Fabrizio Leisen. Source: Statistics Surveys, Volume 12, 66--104.Abstract: We are living in the big data era, as current technologies and networks allow for the easy and routine collection of data sets in different disciplines. Bayesian Statistics offers a flexible modeling approach which is attractive for describing the complexity of these datasets. These models often exhibit a likelihood function which is intractable due to the large sample size, high number of parameters, or functional complexity. Approximate Bayesian Computational (ABC) methods provides likelihood-free methods for performing statistical inferences with Bayesian models defined by intractable likelihood functions. The vastity of the literature on ABC methods created a need to review and relate all ABC approaches so that scientists can more readily understand and apply them for their own work. This article provides a unifying review, general representation, and classification of all ABC methods from the view of approximate likelihood theory. This clarifies how ABC methods can be characterized, related, combined, improved, and applied for future research. Possible future research in ABC is then outlined. Full Article
ap A design-sensitive approach to fitting regression models with complex survey data By projecteuclid.org Published On :: Wed, 17 Jan 2018 04:00 EST Phillip S. Kott. Source: Statistics Surveys, Volume 12, 1--17.Abstract: Fitting complex survey data to regression equations is explored under a design-sensitive model-based framework. A robust version of the standard model assumes that the expected value of the difference between the dependent variable and its model-based prediction is zero no matter what the values of the explanatory variables. The extended model assumes only that the difference is uncorrelated with the covariates. Little is assumed about the error structure of this difference under either model other than independence across primary sampling units. The standard model often fails in practice, but the extended model very rarely does. Under this framework some of the methods developed in the conventional design-based, pseudo-maximum-likelihood framework, such as fitting weighted estimating equations and sandwich mean-squared-error estimation, are retained but their interpretations change. Few of the ideas here are new to the refereed literature. The goal instead is to collect those ideas and put them into a unified conceptual framework. Full Article
ap A survey of bootstrap methods in finite population sampling By projecteuclid.org Published On :: Tue, 15 Mar 2016 09:17 EDT Zeinab Mashreghi, David Haziza, Christian Léger. Source: Statistics Surveys, Volume 10, 1--52.Abstract: We review bootstrap methods in the context of survey data where the effect of the sampling design on the variability of estimators has to be taken into account. We present the methods in a unified way by classifying them in three classes: pseudo-population, direct, and survey weights methods. We cover variance estimation and the construction of confidence intervals for stratified simple random sampling as well as some unequal probability sampling designs. We also address the problem of variance estimation in presence of imputation to compensate for item non-response. Full Article
ap A unified treatment for non-asymptotic and asymptotic approaches to minimax signal detection By projecteuclid.org Published On :: Tue, 19 Jan 2016 09:04 EST Clément Marteau, Theofanis Sapatinas. Source: Statistics Surveys, Volume 9, 253--297.Abstract: We are concerned with minimax signal detection. In this setting, we discuss non-asymptotic and asymptotic approaches through a unified treatment. In particular, we consider a Gaussian sequence model that contains classical models as special cases, such as, direct, well-posed inverse and ill-posed inverse problems. Working with certain ellipsoids in the space of squared-summable sequences of real numbers, with a ball of positive radius removed, we compare the construction of lower and upper bounds for the minimax separation radius (non-asymptotic approach) and the minimax separation rate (asymptotic approach) that have been proposed in the literature. Some additional contributions, bringing to light links between non-asymptotic and asymptotic approaches to minimax signal, are also presented. An example of a mildly ill-posed inverse problem is used for illustrative purposes. In particular, it is shown that tools used to derive ‘asymptotic’ results can be exploited to draw ‘non-asymptotic’ conclusions, and vice-versa. In order to enhance our understanding of these two minimax signal detection paradigms, we bring into light hitherto unknown similarities and links between non-asymptotic and asymptotic approaches. Full Article
ap Adaptive clinical trial designs for phase I cancer studies By projecteuclid.org Published On :: Thu, 29 May 2014 09:11 EDT Oleksandr Sverdlov, Weng Kee Wong, Yevgen Ryeznik. Source: Statistics Surveys, Volume 8, 2--44.Abstract: Adaptive clinical trials are becoming increasingly popular research designs for clinical investigation. Adaptive designs are particularly useful in phase I cancer studies where clinical data are scant and the goals are to assess the drug dose-toxicity profile and to determine the maximum tolerated dose while minimizing the number of study patients treated at suboptimal dose levels. In the current work we give an overview of adaptive design methods for phase I cancer trials. We find that modern statistical literature is replete with novel adaptive designs that have clearly defined objectives and established statistical properties, and are shown to outperform conventional dose finding methods such as the 3+3 design, both in terms of statistical efficiency and in terms of minimizing the number of patients treated at highly toxic or nonefficacious doses. We discuss statistical, logistical, and regulatory aspects of these designs and present some links to non-commercial statistical software for implementing these methods in practice. Full Article
ap The theory and application of penalized methods or Reproducing Kernel Hilbert Spaces made easy By projecteuclid.org Published On :: Tue, 16 Oct 2012 09:36 EDT Nancy HeckmanSource: Statist. Surv., Volume 6, 113--141.Abstract: The popular cubic smoothing spline estimate of a regression function arises as the minimizer of the penalized sum of squares $sum_{j}(Y_{j}-mu(t_{j}))^{2}+lambda int_{a}^{b}[mu''(t)]^{2},dt$, where the data are $t_{j},Y_{j}$, $j=1,ldots,n$. The minimization is taken over an infinite-dimensional function space, the space of all functions with square integrable second derivatives. But the calculations can be carried out in a finite-dimensional space. The reduction from minimizing over an infinite dimensional space to minimizing over a finite dimensional space occurs for more general objective functions: the data may be related to the function $mu$ in another way, the sum of squares may be replaced by a more suitable expression, or the penalty, $int_{a}^{b}[mu''(t)]^{2},dt$, might take a different form. This paper reviews the Reproducing Kernel Hilbert Space structure that provides a finite-dimensional solution for a general minimization problem. Particular attention is paid to the construction and study of the Reproducing Kernel Hilbert Space corresponding to a penalty based on a linear differential operator. In this case, one can often calculate the minimizer explicitly, using Green’s functions. Full Article
ap Holtermann and the A&A Photographic Company By feedproxy.google.com Published On :: Thu, 10 Sep 2015 02:50:04 +0000 We recently received a comment about authorship of the Holtermann Collection. Although it may seem a purely historica Full Article
ap A bimodal gamma distribution: Properties, regression model and applications. (arXiv:2004.12491v2 [stat.ME] UPDATED) By arxiv.org Published On :: In this paper we propose a bimodal gamma distribution using a quadratic transformation based on the alpha-skew-normal model. We discuss several properties of this distribution such as mean, variance, moments, hazard rate and entropy measures. Further, we propose a new regression model with censored data based on the bimodal gamma distribution. This regression model can be very useful to the analysis of real data and could give more realistic fits than other special regression models. Monte Carlo simulations were performed to check the bias in the maximum likelihood estimation. The proposed models are applied to two real data sets found in literature. Full Article
ap Excess registered deaths in England and Wales during the COVID-19 pandemic, March 2020 and April 2020. (arXiv:2004.11355v4 [stat.AP] UPDATED) By arxiv.org Published On :: Official counts of COVID-19 deaths have been criticized for potentially including people who did not die of COVID-19 but merely died with COVID-19. I address that critique by fitting a generalized additive model to weekly counts of all registered deaths in England and Wales during the 2010s. The model produces baseline rates of death registrations expected in the absence of the COVID-19 pandemic, and comparing those baselines to recent counts of registered deaths exposes the emergence of excess deaths late in March 2020. Among adults aged 45+, about 38,700 excess deaths were registered in the 5 weeks comprising 21 March through 24 April (612 $pm$ 416 from 21$-$27 March, 5675 $pm$ 439 from 28 March through 3 April, then 9183 $pm$ 468, 12,712 $pm$ 589, and 10,511 $pm$ 567 in April's next 3 weeks). Both the Office for National Statistics's respective count of 26,891 death certificates which mention COVID-19, and the Department of Health and Social Care's hospital-focused count of 21,222 deaths, are appreciably less, implying that their counting methods have underestimated rather than overestimated the pandemic's true death toll. If underreporting rates have held steady, about 45,900 direct and indirect COVID-19 deaths might have been registered by April's end but not yet publicly reported in full. Full Article
ap A Critical Overview of Privacy-Preserving Approaches for Collaborative Forecasting. (arXiv:2004.09612v3 [cs.LG] UPDATED) By arxiv.org Published On :: Cooperation between different data owners may lead to an improvement in forecast quality - for instance by benefiting from spatial-temporal dependencies in geographically distributed time series. Due to business competitive factors and personal data protection questions, said data owners might be unwilling to share their data, which increases the interest in collaborative privacy-preserving forecasting. This paper analyses the state-of-the-art and unveils several shortcomings of existing methods in guaranteeing data privacy when employing Vector Autoregressive (VAR) models. The paper also provides mathematical proofs and numerical analysis to evaluate existing privacy-preserving methods, dividing them into three groups: data transformation, secure multi-party computations, and decomposition methods. The analysis shows that state-of-the-art techniques have limitations in preserving data privacy, such as a trade-off between privacy and forecasting accuracy, while the original data in iterative model fitting processes, in which intermediate results are shared, can be inferred after some iterations. Full Article
ap Capturing and Explaining Trajectory Singularities using Composite Signal Neural Networks. (arXiv:2003.10810v2 [cs.LG] UPDATED) By arxiv.org Published On :: Spatial trajectories are ubiquitous and complex signals. Their analysis is crucial in many research fields, from urban planning to neuroscience. Several approaches have been proposed to cluster trajectories. They rely on hand-crafted features, which struggle to capture the spatio-temporal complexity of the signal, or on Artificial Neural Networks (ANNs) which can be more efficient but less interpretable. In this paper we present a novel ANN architecture designed to capture the spatio-temporal patterns characteristic of a set of trajectories, while taking into account the demographics of the navigators. Hence, our model extracts markers linked to both behaviour and demographics. We propose a composite signal analyser (CompSNN) combining three simple ANN modules. Each of these modules uses different signal representations of the trajectory while remaining interpretable. Our CompSNN performs significantly better than its modules taken in isolation and allows to visualise which parts of the signal were most useful to discriminate the trajectories. Full Article
ap Risk-Aware Energy Scheduling for Edge Computing with Microgrid: A Multi-Agent Deep Reinforcement Learning Approach. (arXiv:2003.02157v2 [physics.soc-ph] UPDATED) By arxiv.org Published On :: In recent years, multi-access edge computing (MEC) is a key enabler for handling the massive expansion of Internet of Things (IoT) applications and services. However, energy consumption of a MEC network depends on volatile tasks that induces risk for energy demand estimations. As an energy supplier, a microgrid can facilitate seamless energy supply. However, the risk associated with energy supply is also increased due to unpredictable energy generation from renewable and non-renewable sources. Especially, the risk of energy shortfall is involved with uncertainties in both energy consumption and generation. In this paper, we study a risk-aware energy scheduling problem for a microgrid-powered MEC network. First, we formulate an optimization problem considering the conditional value-at-risk (CVaR) measurement for both energy consumption and generation, where the objective is to minimize the loss of energy shortfall of the MEC networks and we show this problem is an NP-hard problem. Second, we analyze our formulated problem using a multi-agent stochastic game that ensures the joint policy Nash equilibrium, and show the convergence of the proposed model. Third, we derive the solution by applying a multi-agent deep reinforcement learning (MADRL)-based asynchronous advantage actor-critic (A3C) algorithm with shared neural networks. This method mitigates the curse of dimensionality of the state space and chooses the best policy among the agents for the proposed problem. Finally, the experimental results establish a significant performance gain by considering CVaR for high accuracy energy scheduling of the proposed model than both the single and random agent models. Full Article
ap Covariance Matrix Adaptation for the Rapid Illumination of Behavior Space. (arXiv:1912.02400v2 [cs.LG] UPDATED) By arxiv.org Published On :: We focus on the challenge of finding a diverse collection of quality solutions on complex continuous domains. While quality diver-sity (QD) algorithms like Novelty Search with Local Competition (NSLC) and MAP-Elites are designed to generate a diverse range of solutions, these algorithms require a large number of evaluations for exploration of continuous spaces. Meanwhile, variants of the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) are among the best-performing derivative-free optimizers in single-objective continuous domains. This paper proposes a new QD algorithm called Covariance Matrix Adaptation MAP-Elites (CMA-ME). Our new algorithm combines the self-adaptation techniques of CMA-ES with archiving and mapping techniques for maintaining diversity in QD. Results from experiments based on standard continuous optimization benchmarks show that CMA-ME finds better-quality solutions than MAP-Elites; similarly, results on the strategic game Hearthstone show that CMA-ME finds both a higher overall quality and broader diversity of strategies than both CMA-ES and MAP-Elites. Overall, CMA-ME more than doubles the performance of MAP-Elites using standard QD performance metrics. These results suggest that QD algorithms augmented by operators from state-of-the-art optimization algorithms can yield high-performing methods for simultaneously exploring and optimizing continuous search spaces, with significant applications to design, testing, and reinforcement learning among other domains. Full Article
ap Sampling random graph homomorphisms and applications to network data analysis. (arXiv:1910.09483v2 [math.PR] UPDATED) By arxiv.org Published On :: A graph homomorphism is a map between two graphs that preserves adjacency relations. We consider the problem of sampling a random graph homomorphism from a graph $F$ into a large network $mathcal{G}$. We propose two complementary MCMC algorithms for sampling a random graph homomorphisms and establish bounds on their mixing times and concentration of their time averages. Based on our sampling algorithms, we propose a novel framework for network data analysis that circumvents some of the drawbacks in methods based on independent and neigborhood sampling. Various time averages of the MCMC trajectory give us various computable observables, including well-known ones such as homomorphism density and average clustering coefficient and their generalizations. Furthermore, we show that these network observables are stable with respect to a suitably renormalized cut distance between networks. We provide various examples and simulations demonstrating our framework through synthetic networks. We also apply our framework for network clustering and classification problems using the Facebook100 dataset and Word Adjacency Networks of a set of classic novels. Full Article
ap Bayesian factor models for multivariate categorical data obtained from questionnaires. (arXiv:1910.04283v2 [stat.AP] UPDATED) By arxiv.org Published On :: Factor analysis is a flexible technique for assessment of multivariate dependence and codependence. Besides being an exploratory tool used to reduce the dimensionality of multivariate data, it allows estimation of common factors that often have an interesting theoretical interpretation in real problems. However, standard factor analysis is only applicable when the variables are scaled, which is often inappropriate, for example, in data obtained from questionnaires in the field of psychology,where the variables are often categorical. In this framework, we propose a factor model for the analysis of multivariate ordered and non-ordered polychotomous data. The inference procedure is done under the Bayesian approach via Markov chain Monte Carlo methods. Two Monte-Carlo simulation studies are presented to investigate the performance of this approach in terms of estimation bias, precision and assessment of the number of factors. We also illustrate the proposed method to analyze participants' responses to the Motivational State Questionnaire dataset, developed to study emotions in laboratory and field settings. Full Article
ap Convergence rates for optimised adaptive importance samplers. (arXiv:1903.12044v4 [stat.CO] UPDATED) By arxiv.org Published On :: Adaptive importance samplers are adaptive Monte Carlo algorithms to estimate expectations with respect to some target distribution which extit{adapt} themselves to obtain better estimators over a sequence of iterations. Although it is straightforward to show that they have the same $mathcal{O}(1/sqrt{N})$ convergence rate as standard importance samplers, where $N$ is the number of Monte Carlo samples, the behaviour of adaptive importance samplers over the number of iterations has been left relatively unexplored. In this work, we investigate an adaptation strategy based on convex optimisation which leads to a class of adaptive importance samplers termed extit{optimised adaptive importance samplers} (OAIS). These samplers rely on the iterative minimisation of the $chi^2$-divergence between an exponential-family proposal and the target. The analysed algorithms are closely related to the class of adaptive importance samplers which minimise the variance of the weight function. We first prove non-asymptotic error bounds for the mean squared errors (MSEs) of these algorithms, which explicitly depend on the number of iterations and the number of samples together. The non-asymptotic bounds derived in this paper imply that when the target belongs to the exponential family, the $L_2$ errors of the optimised samplers converge to the optimal rate of $mathcal{O}(1/sqrt{N})$ and the rate of convergence in the number of iterations are explicitly provided. When the target does not belong to the exponential family, the rate of convergence is the same but the asymptotic $L_2$ error increases by a factor $sqrt{ ho^star} > 1$, where $ ho^star - 1$ is the minimum $chi^2$-divergence between the target and an exponential-family proposal. Full Article
ap Nonparametric Estimation of the Fisher Information and Its Applications. (arXiv:2005.03622v1 [cs.IT]) By arxiv.org Published On :: This paper considers the problem of estimation of the Fisher information for location from a random sample of size $n$. First, an estimator proposed by Bhattacharya is revisited and improved convergence rates are derived. Second, a new estimator, termed a clipped estimator, is proposed. Superior upper bounds on the rates of convergence can be shown for the new estimator compared to the Bhattacharya estimator, albeit with different regularity conditions. Third, both of the estimators are evaluated for the practically relevant case of a random variable contaminated by Gaussian noise. Moreover, using Brown's identity, which relates the Fisher information and the minimum mean squared error (MMSE) in Gaussian noise, two corresponding consistent estimators for the MMSE are proposed. Simulation examples for the Bhattacharya estimator and the clipped estimator as well as the MMSE estimators are presented. The examples demonstrate that the clipped estimator can significantly reduce the required sample size to guarantee a specific confidence interval compared to the Bhattacharya estimator. Full Article