ia

Spatial modeling of trends in crime over time in Philadelphia

Cecilia Balocchi, Shane T. Jensen.

Source: The Annals of Applied Statistics, Volume 13, Number 4, 2235--2259.

Abstract:
Understanding the relationship between change in crime over time and the geography of urban areas is an important problem for urban planning. Accurate estimation of changing crime rates throughout a city would aid law enforcement as well as enable studies of the association between crime and the built environment. Bayesian modeling is a promising direction since areal data require principled sharing of information to address spatial autocorrelation between proximal neighborhoods. We develop several Bayesian approaches to spatial sharing of information between neighborhoods while modeling trends in crime counts over time. We apply our methodology to estimate changes in crime throughout Philadelphia over the 2006-15 period while also incorporating spatially-varying economic and demographic predictors. We find that the local shrinkage imposed by a conditional autoregressive model has substantial benefits in terms of out-of-sample predictive accuracy of crime. We also explore the possibility of spatial discontinuities between neighborhoods that could represent natural barriers or aspects of the built environment.




ia

Microsimulation model calibration using incremental mixture approximate Bayesian computation

Carolyn M. Rutter, Jonathan Ozik, Maria DeYoreo, Nicholson Collier.

Source: The Annals of Applied Statistics, Volume 13, Number 4, 2189--2212.

Abstract:
Microsimulation models (MSMs) are used to inform policy by predicting population-level outcomes under different scenarios. MSMs simulate individual-level event histories that mark the disease process (such as the development of cancer) and the effect of policy actions (such as screening) on these events. MSMs often have many unknown parameters; calibration is the process of searching the parameter space to select parameters that result in accurate MSM prediction of a wide range of targets. We develop Incremental Mixture Approximate Bayesian Computation (IMABC) for MSM calibration which results in a simulated sample from the posterior distribution of model parameters given calibration targets. IMABC begins with a rejection-based ABC step, drawing a sample of points from the prior distribution of model parameters and accepting points that result in simulated targets that are near observed targets. Next, the sample is iteratively updated by drawing additional points from a mixture of multivariate normal distributions and accepting points that result in accurate predictions. Posterior estimates are obtained by weighting the final set of accepted points to account for the adaptive sampling scheme. We demonstrate IMABC by calibrating CRC-SPIN 2.0, an updated version of a MSM for colorectal cancer (CRC) that has been used to inform national CRC screening guidelines.




ia

Joint model of accelerated failure time and mechanistic nonlinear model for censored covariates, with application in HIV/AIDS

Hongbin Zhang, Lang Wu.

Source: The Annals of Applied Statistics, Volume 13, Number 4, 2140--2157.

Abstract:
For a time-to-event outcome with censored time-varying covariates, a joint Cox model with a linear mixed effects model is the standard modeling approach. In some applications such as AIDS studies, mechanistic nonlinear models are available for some covariate process such as viral load during anti-HIV treatments, derived from the underlying data-generation mechanisms and disease progression. Such a mechanistic nonlinear covariate model may provide better-predicted values when the covariates are left censored or mismeasured. When the focus is on the impact of the time-varying covariate process on the survival outcome, an accelerated failure time (AFT) model provides an excellent alternative to the Cox proportional hazard model since an AFT model is formulated to allow the influence of the outcome by the entire covariate process. In this article, we consider a nonlinear mixed effects model for the censored covariates in an AFT model, implemented using a Monte Carlo EM algorithm, under the framework of a joint model for simultaneous inference. We apply the joint model to an HIV/AIDS data to gain insights for assessing the association between viral load and immunological restoration during antiretroviral therapy. Simulation is conducted to compare model performance when the covariate model and the survival model are misspecified.




ia

Statistical inference for partially observed branching processes with application to cell lineage tracking of in vivo hematopoiesis

Jason Xu, Samson Koelle, Peter Guttorp, Chuanfeng Wu, Cynthia Dunbar, Janis L. Abkowitz, Vladimir N. Minin.

Source: The Annals of Applied Statistics, Volume 13, Number 4, 2091--2119.

Abstract:
Single-cell lineage tracking strategies enabled by recent experimental technologies have produced significant insights into cell fate decisions, but lack the quantitative framework necessary for rigorous statistical analysis of mechanistic models describing cell division and differentiation. In this paper, we develop such a framework with corresponding moment-based parameter estimation techniques for continuous-time, multi-type branching processes. Such processes provide a probabilistic model of how cells divide and differentiate, and we apply our method to study hematopoiesis , the mechanism of blood cell production. We derive closed-form expressions for higher moments in a general class of such models. These analytical results allow us to efficiently estimate parameters of much richer statistical models of hematopoiesis than those used in previous statistical studies. To our knowledge, the method provides the first rate inference procedure for fitting such models to time series data generated from cellular barcoding experiments. After validating the methodology in simulation studies, we apply our estimator to hematopoietic lineage tracking data from rhesus macaques. Our analysis provides a more complete understanding of cell fate decisions during hematopoiesis in nonhuman primates, which may be more relevant to human biology and clinical strategies than previous findings from murine studies. For example, in addition to previously estimated hematopoietic stem cell self-renewal rate, we are able to estimate fate decision probabilities and to compare structurally distinct models of hematopoiesis using cross validation. These estimates of fate decision probabilities and our model selection results should help biologists compare competing hypotheses about how progenitor cells differentiate. The methodology is transferrable to a large class of stochastic compartmental and multi-type branching models, commonly used in studies of cancer progression, epidemiology and many other fields.




ia

Robust elastic net estimators for variable selection and identification of proteomic biomarkers

Gabriela V. Cohen Freue, David Kepplinger, Matías Salibián-Barrera, Ezequiel Smucler.

Source: The Annals of Applied Statistics, Volume 13, Number 4, 2065--2090.

Abstract:
In large-scale quantitative proteomic studies, scientists measure the abundance of thousands of proteins from the human proteome in search of novel biomarkers for a given disease. Penalized regression estimators can be used to identify potential biomarkers among a large set of molecular features measured. Yet, the performance and statistical properties of these estimators depend on the loss and penalty functions used to define them. Motivated by a real plasma proteomic biomarkers study, we propose a new class of penalized robust estimators based on the elastic net penalty, which can be tuned to keep groups of correlated variables together in the selected model and maintain robustness against possible outliers. We also propose an efficient algorithm to compute our robust penalized estimators and derive a data-driven method to select the penalty term. Our robust penalized estimators have very good robustness properties and are also consistent under certain regularity conditions. Numerical results show that our robust estimators compare favorably to other robust penalized estimators. Using our proposed methodology for the analysis of the proteomics data, we identify new potentially relevant biomarkers of cardiac allograft vasculopathy that are not found with nonrobust alternatives. The selected model is validated in a new set of 52 test samples and achieves an area under the receiver operating characteristic (AUC) of 0.85.




ia

Estimating abundance from multiple sampling capture-recapture data via a multi-state multi-period stopover model

Hannah Worthington, Rachel McCrea, Ruth King, Richard Griffiths.

Source: The Annals of Applied Statistics, Volume 13, Number 4, 2043--2064.

Abstract:
Capture-recapture studies often involve collecting data on numerous capture occasions over a relatively short period of time. For many study species this process is repeated, for example, annually, resulting in capture information spanning multiple sampling periods. To account for the different temporal scales, the robust design class of models have traditionally been applied providing a framework in which to analyse all of the available capture data in a single likelihood expression. However, these models typically require strong constraints, either the assumption of closure within a sampling period (the closed robust design) or conditioning on the number of individuals captured within a sampling period (the open robust design). For real datasets these assumptions may not be appropriate. We develop a general modelling structure that requires neither assumption by explicitly modelling the movement of individuals into the population both within and between the sampling periods, which in turn permits the estimation of abundance within a single consistent framework. The flexibility of the novel model structure is further demonstrated by including the computationally challenging case of multi-state data where there is individual time-varying discrete covariate information. We derive an efficient likelihood expression for the new multi-state multi-period stopover model using the hidden Markov model framework. We demonstrate the significant improvement in parameter estimation using our new modelling approach in terms of both the multi-period and multi-state components through both a simulation study and a real dataset relating to the protected species of great crested newts, Triturus cristatus .




ia

Estimating the rate constant from biosensor data via an adaptive variational Bayesian approach

Ye Zhang, Zhigang Yao, Patrik Forssén, Torgny Fornstedt.

Source: The Annals of Applied Statistics, Volume 13, Number 4, 2011--2042.

Abstract:
The means to obtain the rate constants of a chemical reaction is a fundamental open problem in both science and the industry. Traditional techniques for finding rate constants require either chemical modifications of the reactants or indirect measurements. The rate constant map method is a modern technique to study binding equilibrium and kinetics in chemical reactions. Finding a rate constant map from biosensor data is an ill-posed inverse problem that is usually solved by regularization. In this work, rather than finding a deterministic regularized rate constant map that does not provide uncertainty quantification of the solution, we develop an adaptive variational Bayesian approach to estimate the distribution of the rate constant map, from which some intrinsic properties of a chemical reaction can be explored, including information about rate constants. Our new approach is more realistic than the existing approaches used for biosensors and allows us to estimate the dynamics of the interactions, which are usually hidden in a deterministic approximate solution. We verify the performance of the new proposed method by numerical simulations, and compare it with the Markov chain Monte Carlo algorithm. The results illustrate that the variational method can reliably capture the posterior distribution in a computationally efficient way. Finally, the developed method is also tested on the real biosensor data (parathyroid hormone), where we provide two novel analysis tools—the thresholding contour map and the high order moment map—to estimate the number of interactions as well as their rate constants.




ia

A semiparametric modeling approach using Bayesian Additive Regression Trees with an application to evaluate heterogeneous treatment effects

Bret Zeldow, Vincent Lo Re III, Jason Roy.

Source: The Annals of Applied Statistics, Volume 13, Number 3, 1989--2010.

Abstract:
Bayesian Additive Regression Trees (BART) is a flexible machine learning algorithm capable of capturing nonlinearities between an outcome and covariates and interactions among covariates. We extend BART to a semiparametric regression framework in which the conditional expectation of an outcome is a function of treatment, its effect modifiers, and confounders. The confounders are allowed to have unspecified functional form, while treatment and effect modifiers that are directly related to the research question are given a linear form. The result is a Bayesian semiparametric linear regression model where the posterior distribution of the parameters of the linear part can be interpreted as in parametric Bayesian regression. This is useful in situations where a subset of the variables are of substantive interest and the others are nuisance variables that we would like to control for. An example of this occurs in causal modeling with the structural mean model (SMM). Under certain causal assumptions, our method can be used as a Bayesian SMM. Our methods are demonstrated with simulation studies and an application to dataset involving adults with HIV/Hepatitis C coinfection who newly initiate antiretroviral therapy. The methods are available in an R package called semibart.




ia

Radio-iBAG: Radiomics-based integrative Bayesian analysis of multiplatform genomic data

Youyi Zhang, Jeffrey S. Morris, Shivali Narang Aerry, Arvind U. K. Rao, Veerabhadran Baladandayuthapani.

Source: The Annals of Applied Statistics, Volume 13, Number 3, 1957--1988.

Abstract:
Technological innovations have produced large multi-modal datasets that include imaging and multi-platform genomics data. Integrative analyses of such data have the potential to reveal important biological and clinical insights into complex diseases like cancer. In this paper, we present Bayesian approaches for integrative analysis of radiological imaging and multi-platform genomic data, where-in our goals are to simultaneously identify genomic and radiomic, that is, radiology-based imaging markers, along with the latent associations between these two modalities, and to detect the overall prognostic relevance of the combined markers. For this task, we propose Radio-iBAG: Radiomics-based Integrative Bayesian Analysis of Multiplatform Genomic Data , a multi-scale Bayesian hierarchical model that involves several innovative strategies: it incorporates integrative analysis of multi-platform genomic data sets to capture fundamental biological relationships; explores the associations between radiomic markers accompanying genomic information with clinical outcomes; and detects genomic and radiomic markers associated with clinical prognosis. We also introduce the use of sparse Principal Component Analysis (sPCA) to extract a sparse set of approximately orthogonal meta-features each containing information from a set of related individual radiomic features, reducing dimensionality and combining like features. Our methods are motivated by and applied to The Cancer Genome Atlas glioblastoma multiforme data set, where-in we integrate magnetic resonance imaging-based biomarkers along with genomic, epigenomic and transcriptomic data. Our model identifies important magnetic resonance imaging features and the associated genomic platforms that are related with patient survival times.




ia

Bayesian methods for multiple mediators: Relating principal stratification and causal mediation in the analysis of power plant emission controls

Chanmin Kim, Michael J. Daniels, Joseph W. Hogan, Christine Choirat, Corwin M. Zigler.

Source: The Annals of Applied Statistics, Volume 13, Number 3, 1927--1956.

Abstract:
Emission control technologies installed on power plants are a key feature of many air pollution regulations in the US. While such regulations are predicated on the presumed relationships between emissions, ambient air pollution and human health, many of these relationships have never been empirically verified. The goal of this paper is to develop new statistical methods to quantify these relationships. We frame this problem as one of mediation analysis to evaluate the extent to which the effect of a particular control technology on ambient pollution is mediated through causal effects on power plant emissions. Since power plants emit various compounds that contribute to ambient pollution, we develop new methods for multiple intermediate variables that are measured contemporaneously, may interact with one another, and may exhibit joint mediating effects. Specifically, we propose new methods leveraging two related frameworks for causal inference in the presence of mediating variables: principal stratification and causal mediation analysis. We define principal effects based on multiple mediators, and also introduce a new decomposition of the total effect of an intervention on ambient pollution into the natural direct effect and natural indirect effects for all combinations of mediators. Both approaches are anchored to the same observed-data models, which we specify with Bayesian nonparametric techniques. We provide assumptions for estimating principal causal effects, then augment these with an additional assumption required for causal mediation analysis. The two analyses, interpreted in tandem, provide the first empirical investigation of the presumed causal pathways that motivate important air quality regulatory policies.




ia

Wavelet spectral testing: Application to nonstationary circadian rhythms

Jessica K. Hargreaves, Marina I. Knight, Jon W. Pitchford, Rachael J. Oakenfull, Sangeeta Chawla, Jack Munns, Seth J. Davis.

Source: The Annals of Applied Statistics, Volume 13, Number 3, 1817--1846.

Abstract:
Rhythmic data are ubiquitous in the life sciences. Biologists need reliable statistical tests to identify whether a particular experimental treatment has caused a significant change in a rhythmic signal. When these signals display nonstationary behaviour, as is common in many biological systems, the established methodologies may be misleading. Therefore, there is a real need for new methodology that enables the formal comparison of nonstationary processes. As circadian behaviour is best understood in the spectral domain, here we develop novel hypothesis testing procedures in the (wavelet) spectral domain, embedding replicate information when available. The data are modelled as realisations of locally stationary wavelet processes, allowing us to define and rigorously estimate their evolutionary wavelet spectra. Motivated by three complementary applications in circadian biology, our new methodology allows the identification of three specific types of spectral difference. We demonstrate the advantages of our methodology over alternative approaches, by means of a comprehensive simulation study and real data applications, using both published and newly generated circadian datasets. In contrast to the current standard methodologies, our method successfully identifies differences within the motivating circadian datasets, and facilitates wider ranging analyses of rhythmic biological data in general.




ia

Bayesian modeling of the structural connectome for studying Alzheimer’s disease

Arkaprava Roy, Subhashis Ghosal, Jeffrey Prescott, Kingshuk Roy Choudhury.

Source: The Annals of Applied Statistics, Volume 13, Number 3, 1791--1816.

Abstract:
We study possible relations between Alzheimer’s disease progression and the structure of the connectome which is white matter connecting different regions of the brain. Regression models in covariates including age, gender and disease status for the extent of white matter connecting each pair of regions of the brain are proposed. Subject inhomogeneity is also incorporated in the model through random effects with an unknown distribution. As there is a large number of pairs of regions, we also adopt a dimension reduction technique through graphon ( J. Combin. Theory Ser. B 96 (2006) 933–957) functions which reduces the functions of pairs of regions to functions of regions. The connecting graphon functions are considered unknown but the assumed smoothness allows putting priors of low complexity on these functions. We pursue a nonparametric Bayesian approach by assigning a Dirichlet process scale mixture of zero to mean normal prior on the distributions of the random effects and finite random series of tensor products of B-splines priors on the underlying graphon functions. We develop efficient Markov chain Monte Carlo techniques for drawing samples for the posterior distributions using Hamiltonian Monte Carlo (HMC). The proposed Bayesian method overwhelmingly outperforms a competing method based on ANCOVA models in the simulation setup. The proposed Bayesian approach is applied on a dataset of 100 subjects and 83 brain regions and key regions implicated in the changing connectome are identified.




ia

A hierarchical Bayesian model for single-cell clustering using RNA-sequencing data

Yiyi Liu, Joshua L. Warren, Hongyu Zhao.

Source: The Annals of Applied Statistics, Volume 13, Number 3, 1733--1752.

Abstract:
Understanding the heterogeneity of cells is an important biological question. The development of single-cell RNA-sequencing (scRNA-seq) technology provides high resolution data for such inquiry. A key challenge in scRNA-seq analysis is the high variability of measured RNA expression levels and frequent dropouts (missing values) due to limited input RNA compared to bulk RNA-seq measurement. Existing clustering methods do not perform well for these noisy and zero-inflated scRNA-seq data. In this manuscript we propose a Bayesian hierarchical model, called BasClu, to appropriately characterize important features of scRNA-seq data in order to more accurately cluster cells. We demonstrate the effectiveness of our method with extensive simulation studies and applications to three real scRNA-seq datasets.




ia

A Bayesian mark interaction model for analysis of tumor pathology images

Qiwei Li, Xinlei Wang, Faming Liang, Guanghua Xiao.

Source: The Annals of Applied Statistics, Volume 13, Number 3, 1708--1732.

Abstract:
With the advance of imaging technology, digital pathology imaging of tumor tissue slides is becoming a routine clinical procedure for cancer diagnosis. This process produces massive imaging data that capture histological details in high resolution. Recent developments in deep-learning methods have enabled us to identify and classify individual cells from digital pathology images at large scale. Reliable statistical approaches to model the spatial pattern of cells can provide new insight into tumor progression and shed light on the biological mechanisms of cancer. We consider the problem of modeling spatial correlations among three commonly seen cells observed in tumor pathology images. A novel geostatistical marking model with interpretable underlying parameters is proposed in a Bayesian framework. We use auxiliary variable MCMC algorithms to sample from the posterior distribution with an intractable normalizing constant. We demonstrate how this model-based analysis can lead to sharper inferences than ordinary exploratory analyses, by means of application to three benchmark datasets and a case study on the pathology images of $188$ lung cancer patients. The case study shows that the spatial correlation between tumor and stromal cells predicts patient prognosis. This statistical methodology not only presents a new model for characterizing spatial correlations in a multitype spatial point pattern conditioning on the locations of the points, but also provides a new perspective for understanding the role of cell–cell interactions in cancer progression.




ia

Sequential decision model for inference and prediction on nonuniform hypergraphs with application to knot matching from computational forestry

Seong-Hwan Jun, Samuel W. K. Wong, James V. Zidek, Alexandre Bouchard-Côté.

Source: The Annals of Applied Statistics, Volume 13, Number 3, 1678--1707.

Abstract:
In this paper, we consider the knot-matching problem arising in computational forestry. The knot-matching problem is an important problem that needs to be solved to advance the state of the art in automatic strength prediction of lumber. We show that this problem can be formulated as a quadripartite matching problem and develop a sequential decision model that admits efficient parameter estimation along with a sequential Monte Carlo sampler on graph matching that can be utilized for rapid sampling of graph matching. We demonstrate the effectiveness of our methods on 30 manually annotated boards and present findings from various simulation studies to provide further evidence supporting the efficacy of our methods.




ia

Modeling seasonality and serial dependence of electricity price curves with warping functional autoregressive dynamics

Ying Chen, J. S. Marron, Jiejie Zhang.

Source: The Annals of Applied Statistics, Volume 13, Number 3, 1590--1616.

Abstract:
Electricity prices are high dimensional, serially dependent and have seasonal variations. We propose a Warping Functional AutoRegressive (WFAR) model that simultaneously accounts for the cross time-dependence and seasonal variations of the large dimensional data. In particular, electricity price curves are obtained by smoothing over the $24$ discrete hourly prices on each day. In the functional domain, seasonal phase variations are separated from level amplitude changes in a warping process with the Fisher–Rao distance metric, and the aligned (season-adjusted) electricity price curves are modeled in the functional autoregression framework. In a real application, the WFAR model provides superior out-of-sample forecast accuracy in both a normal functioning market, Nord Pool, and an extreme situation, the California market. The forecast performance as well as the relative accuracy improvement are stable for different markets and different time periods.




ia

The classification permutation test: A flexible approach to testing for covariate imbalance in observational studies

Johann Gagnon-Bartsch, Yotam Shem-Tov.

Source: The Annals of Applied Statistics, Volume 13, Number 3, 1464--1483.

Abstract:
The gold standard for identifying causal relationships is a randomized controlled experiment. In many applications in the social sciences and medicine, the researcher does not control the assignment mechanism and instead may rely upon natural experiments or matching methods as a substitute to experimental randomization. The standard testable implication of random assignment is covariate balance between the treated and control units. Covariate balance is commonly used to validate the claim of as good as random assignment. We propose a new nonparametric test of covariate balance. Our Classification Permutation Test (CPT) is based on a combination of classification methods (e.g., random forests) with Fisherian permutation inference. We revisit four real data examples and present Monte Carlo power simulations to demonstrate the applicability of the CPT relative to other nonparametric tests of equality of multivariate distributions.




ia

Stratonovich type integration with respect to fractional Brownian motion with Hurst parameter less than $1/2$

Jorge A. León.

Source: Bernoulli, Volume 26, Number 3, 2436--2462.

Abstract:
Let $B^{H}$ be a fractional Brownian motion with Hurst parameter $Hin (0,1/2)$ and $p:mathbb{R} ightarrow mathbb{R}$ a polynomial function. The main purpose of this paper is to introduce a Stratonovich type stochastic integral with respect to $B^{H}$, whose domain includes the process $p(B^{H})$. That is, an integral that allows us to integrate $p(B^{H})$ with respect to $B^{H}$, which does not happen with the symmetric integral given by Russo and Vallois ( Probab. Theory Related Fields 97 (1993) 403–421) in general. Towards this end, we combine the approaches utilized by León and Nualart ( Stochastic Process. Appl. 115 (2005) 481–492), and Russo and Vallois ( Probab. Theory Related Fields 97 (1993) 403–421), whose aims are to extend the domain of the divergence operator for Gaussian processes and to define some stochastic integrals, respectively. Then, we study the relation between this Stratonovich integral and the extension of the divergence operator (see León and Nualart ( Stochastic Process. Appl. 115 (2005) 481–492)), an Itô formula and the existence of a unique solution of some Stratonovich stochastic differential equations. These last results have been analyzed by Alòs, León and Nualart ( Taiwanese J. Math. 5 (2001) 609–632), where the Hurst paramert $H$ belongs to the interval $(1/4,1/2)$.




ia

Frequency domain theory for functional time series: Variance decomposition and an invariance principle

Piotr Kokoszka, Neda Mohammadi Jouzdani.

Source: Bernoulli, Volume 26, Number 3, 2383--2399.

Abstract:
This paper is concerned with frequency domain theory for functional time series, which are temporally dependent sequences of functions in a Hilbert space. We consider a variance decomposition, which is more suitable for such a data structure than the variance decomposition based on the Karhunen–Loéve expansion. The decomposition we study uses eigenvalues of spectral density operators, which are functional analogs of the spectral density of a stationary scalar time series. We propose estimators of the variance components and derive convergence rates for their mean square error as well as their asymptotic normality. The latter is derived from a frequency domain invariance principle for the estimators of the spectral density operators. This principle is established for a broad class of linear time series models. It is a main contribution of the paper.




ia

Bayesian linear regression for multivariate responses under group sparsity

Bo Ning, Seonghyun Jeong, Subhashis Ghosal.

Source: Bernoulli, Volume 26, Number 3, 2353--2382.

Abstract:
We study frequentist properties of a Bayesian high-dimensional multivariate linear regression model with correlated responses. The predictors are separated into many groups and the group structure is pre-determined. Two features of the model are unique: (i) group sparsity is imposed on the predictors; (ii) the covariance matrix is unknown and its dimensions can also be high. We choose a product of independent spike-and-slab priors on the regression coefficients and a new prior on the covariance matrix based on its eigendecomposition. Each spike-and-slab prior is a mixture of a point mass at zero and a multivariate density involving the $ell_{2,1}$-norm. We first obtain the posterior contraction rate, the bounds on the effective dimension of the model with high posterior probabilities. We then show that the multivariate regression coefficients can be recovered under certain compatibility conditions. Finally, we quantify the uncertainty for the regression coefficients with frequentist validity through a Bernstein–von Mises type theorem. The result leads to selection consistency for the Bayesian method. We derive the posterior contraction rate using the general theory by constructing a suitable test from the first principle using moment bounds for certain likelihood ratios. This leads to posterior concentration around the truth with respect to the average Rényi divergence of order $1/2$. This technique of obtaining the required tests for posterior contraction rate could be useful in many other problems.




ia

A refined Cramér-type moderate deviation for sums of local statistics

Xiao Fang, Li Luo, Qi-Man Shao.

Source: Bernoulli, Volume 26, Number 3, 2319--2352.

Abstract:
We prove a refined Cramér-type moderate deviation result by taking into account of the skewness in normal approximation for sums of local statistics of independent random variables. We apply the main result to $k$-runs, U-statistics and subgraph counts in the Erdős–Rényi random graph. To prove our main result, we develop exponential concentration inequalities and higher-order tail probability expansions via Stein’s method.




ia

Convergence of persistence diagrams for topological crackle

Takashi Owada, Omer Bobrowski.

Source: Bernoulli, Volume 26, Number 3, 2275--2310.

Abstract:
In this paper, we study the persistent homology associated with topological crackle generated by distributions with an unbounded support. Persistent homology is a topological and algebraic structure that tracks the creation and destruction of topological cycles (generalizations of loops or holes) in different dimensions. Topological crackle is a term that refers to topological cycles generated by random points far away from the bulk of other points, when the support is unbounded. We establish weak convergence results for persistence diagrams – a point process representation for persistent homology, where each topological cycle is represented by its $({mathit{birth},mathit{death}})$ coordinates. In this work, we treat persistence diagrams as random closed sets, so that the resulting weak convergence is defined in terms of the Fell topology. Using this framework, we show that the limiting persistence diagrams can be divided into two parts. The first part is a deterministic limit containing a densely-growing number of persistence pairs with a shorter lifespan. The second part is a two-dimensional Poisson process, representing persistence pairs with a longer lifespan.




ia

Exponential integrability and exit times of diffusions on sub-Riemannian and metric measure spaces

Anton Thalmaier, James Thompson.

Source: Bernoulli, Volume 26, Number 3, 2202--2225.

Abstract:
In this article, we derive moment estimates, exponential integrability, concentration inequalities and exit times estimates for canonical diffusions firstly on sub-Riemannian limits of Riemannian foliations and secondly in the nonsmooth setting of $operatorname{RCD}^{*}(K,N)$ spaces. In each case, the necessary ingredients are Itô’s formula and a comparison theorem for the Laplacian, for which we refer to the recent literature. As an application, we derive pointwise Carmona-type estimates on eigenfunctions of Schrödinger operators.




ia

Directional differentiability for supremum-type functionals: Statistical applications

Javier Cárcamo, Antonio Cuevas, Luis-Alberto Rodríguez.

Source: Bernoulli, Volume 26, Number 3, 2143--2175.

Abstract:
We show that various functionals related to the supremum of a real function defined on an arbitrary set or a measure space are Hadamard directionally differentiable. We specifically consider the supremum norm, the supremum, the infimum, and the amplitude of a function. The (usually non-linear) derivatives of these maps adopt simple expressions under suitable assumptions on the underlying space. As an application, we improve and extend to the multidimensional case the results in Raghavachari ( Ann. Statist. 1 (1973) 67–73) regarding the limiting distributions of Kolmogorov–Smirnov type statistics under the alternative hypothesis. Similar results are obtained for analogous statistics associated with copulas. We additionally solve an open problem about the Berk–Jones statistic proposed by Jager and Wellner (In A Festschrift for Herman Rubin (2004) 319–331 IMS). Finally, the asymptotic distribution of maximum mean discrepancies over Donsker classes of functions is derived.




ia

Perfect sampling for Gibbs point processes using partial rejection sampling

Sarat B. Moka, Dirk P. Kroese.

Source: Bernoulli, Volume 26, Number 3, 2082--2104.

Abstract:
We present a perfect sampling algorithm for Gibbs point processes, based on the partial rejection sampling of Guo, Jerrum and Liu (In STOC’17 – Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing (2017) 342–355 ACM). Our particular focus is on pairwise interaction processes, penetrable spheres mixture models and area-interaction processes, with a finite interaction range. For an interaction range $2r$ of the target process, the proposed algorithm can generate a perfect sample with $O(log(1/r))$ expected running time complexity, provided that the intensity of the points is not too high and $Theta(1/r^{d})$ parallel processor units are available.




ia

First-order covariance inequalities via Stein’s method

Marie Ernst, Gesine Reinert, Yvik Swan.

Source: Bernoulli, Volume 26, Number 3, 2051--2081.

Abstract:
We propose probabilistic representations for inverse Stein operators (i.e., solutions to Stein equations) under general conditions; in particular, we deduce new simple expressions for the Stein kernel. These representations allow to deduce uniform and nonuniform Stein factors (i.e., bounds on solutions to Stein equations) and lead to new covariance identities expressing the covariance between arbitrary functionals of an arbitrary univariate target in terms of a weighted covariance of the derivatives of the functionals. Our weights are explicit, easily computable in most cases and expressed in terms of objects familiar within the context of Stein’s method. Applications of the Cauchy–Schwarz inequality to these weighted covariance identities lead to sharp upper and lower covariance bounds and, in particular, weighted Poincaré inequalities. Many examples are given and, in particular, classical variance bounds due to Klaassen, Brascamp and Lieb or Otto and Menz are corollaries. Connections with more recent literature are also detailed.




ia

Local differential privacy: Elbow effect in optimal density estimation and adaptation over Besov ellipsoids

Cristina Butucea, Amandine Dubois, Martin Kroll, Adrien Saumard.

Source: Bernoulli, Volume 26, Number 3, 1727--1764.

Abstract:
We address the problem of non-parametric density estimation under the additional constraint that only privatised data are allowed to be published and available for inference. For this purpose, we adopt a recent generalisation of classical minimax theory to the framework of local $alpha$-differential privacy and provide a lower bound on the rate of convergence over Besov spaces $mathcal{B}^{s}_{pq}$ under mean integrated $mathbb{L}^{r}$-risk. This lower bound is deteriorated compared to the standard setup without privacy, and reveals a twofold elbow effect. In order to fulfill the privacy requirement, we suggest adding suitably scaled Laplace noise to empirical wavelet coefficients. Upper bounds within (at most) a logarithmic factor are derived under the assumption that $alpha$ stays bounded as $n$ increases: A linear but non-adaptive wavelet estimator is shown to attain the lower bound whenever $pgeq r$ but provides a slower rate of convergence otherwise. An adaptive non-linear wavelet estimator with appropriately chosen smoothing parameters and thresholding is shown to attain the lower bound within a logarithmic factor for all cases.




ia

On the eigenproblem for Gaussian bridges

Pavel Chigansky, Marina Kleptsyna, Dmytro Marushkevych.

Source: Bernoulli, Volume 26, Number 3, 1706--1726.

Abstract:
Spectral decomposition of the covariance operator is one of the main building blocks in the theory and applications of Gaussian processes. Unfortunately, it is notoriously hard to derive in a closed form. In this paper, we consider the eigenproblem for Gaussian bridges. Given a base process, its bridge is obtained by conditioning the trajectories to start and terminate at the given points. What can be said about the spectrum of a bridge, given the spectrum of its base process? We show how this question can be answered asymptotically for a family of processes, including the fractional Brownian motion.




ia

Influence of the seed in affine preferential attachment trees

David Corlin Marchand, Ioan Manolescu.

Source: Bernoulli, Volume 26, Number 3, 1665--1705.

Abstract:
We study randomly growing trees governed by the affine preferential attachment rule. Starting with a seed tree $S$, vertices are attached one by one, each linked by an edge to a random vertex of the current tree, chosen with a probability proportional to an affine function of its degree. This yields a one-parameter family of preferential attachment trees $(T_{n}^{S})_{ngeq |S|}$, of which the linear model is a particular case. Depending on the choice of the parameter, the power-laws governing the degrees in $T_{n}^{S}$ have different exponents. We study the problem of the asymptotic influence of the seed $S$ on the law of $T_{n}^{S}$. We show that, for any two distinct seeds $S$ and $S'$, the laws of $T_{n}^{S}$ and $T_{n}^{S'}$ remain at uniformly positive total-variation distance as $n$ increases. This is a continuation of Curien et al. ( J. Éc. Polytech. Math. 2 (2015) 1–34), which in turn was inspired by a conjecture of Bubeck et al. ( IEEE Trans. Netw. Sci. Eng. 2 (2015) 30–39). The technique developed here is more robust than previous ones and is likely to help in the study of more general attachment mechanisms.




ia

Estimating the number of connected components in a graph via subgraph sampling

Jason M. Klusowski, Yihong Wu.

Source: Bernoulli, Volume 26, Number 3, 1635--1664.

Abstract:
Learning properties of large graphs from samples has been an important problem in statistical network analysis since the early work of Goodman ( Ann. Math. Stat. 20 (1949) 572–579) and Frank ( Scand. J. Stat. 5 (1978) 177–188). We revisit a problem formulated by Frank ( Scand. J. Stat. 5 (1978) 177–188) of estimating the number of connected components in a large graph based on the subgraph sampling model, in which we randomly sample a subset of the vertices and observe the induced subgraph. The key question is whether accurate estimation is achievable in the sublinear regime where only a vanishing fraction of the vertices are sampled. We show that it is impossible if the parent graph is allowed to contain high-degree vertices or long induced cycles. For the class of chordal graphs, where induced cycles of length four or above are forbidden, we characterize the optimal sample complexity within constant factors and construct linear-time estimators that provably achieve these bounds. This significantly expands the scope of previous results which have focused on unbiased estimators and special classes of graphs such as forests or cliques. Both the construction and the analysis of the proposed methodology rely on combinatorial properties of chordal graphs and identities of induced subgraph counts. They, in turn, also play a key role in proving minimax lower bounds based on construction of random instances of graphs with matching structures of small subgraphs.




ia

Sojourn time dimensions of fractional Brownian motion

Ivan Nourdin, Giovanni Peccati, Stéphane Seuret.

Source: Bernoulli, Volume 26, Number 3, 1619--1634.

Abstract:
We describe the size of the sets of sojourn times $E_{gamma }={tgeq 0:|B_{t}|leq t^{gamma }}$ associated with a fractional Brownian motion $B$ in terms of various large scale dimensions.




ia

Reliable clustering of Bernoulli mixture models

Amir Najafi, Seyed Abolfazl Motahari, Hamid R. Rabiee.

Source: Bernoulli, Volume 26, Number 2, 1535--1559.

Abstract:
A Bernoulli Mixture Model (BMM) is a finite mixture of random binary vectors with independent dimensions. The problem of clustering BMM data arises in a variety of real-world applications, ranging from population genetics to activity analysis in social networks. In this paper, we analyze the clusterability of BMMs from a theoretical perspective, when the number of clusters is unknown. In particular, we stipulate a set of conditions on the sample complexity and dimension of the model in order to guarantee the Probably Approximately Correct (PAC)-clusterability of a dataset. To the best of our knowledge, these findings are the first non-asymptotic bounds on the sample complexity of learning or clustering BMMs.




ia

On the probability distribution of the local times of diagonally operator-self-similar Gaussian fields with stationary increments

Kamran Kalbasi, Thomas Mountford.

Source: Bernoulli, Volume 26, Number 2, 1504--1534.

Abstract:
In this paper, we study the local times of vector-valued Gaussian fields that are ‘diagonally operator-self-similar’ and whose increments are stationary. Denoting the local time of such a Gaussian field around the spatial origin and over the temporal unit hypercube by $Z$, we show that there exists $lambdain(0,1)$ such that under some quite weak conditions, $lim_{n ightarrow+infty}frac{sqrt[n]{mathbb{E}(Z^{n})}}{n^{lambda}}$ and $lim_{x ightarrow+infty}frac{-logmathbb{P}(Z>x)}{x^{frac{1}{lambda}}}$ both exist and are strictly positive (possibly $+infty$). Moreover, we show that if the underlying Gaussian field is ‘strongly locally nondeterministic’, the above limits will be finite as well. These results are then applied to establish similar statements for the intersection local times of diagonally operator-self-similar Gaussian fields with stationary increments.




ia

The moduli of non-differentiability for Gaussian random fields with stationary increments

Wensheng Wang, Zhonggen Su, Yimin Xiao.

Source: Bernoulli, Volume 26, Number 2, 1410--1430.

Abstract:
We establish the exact moduli of non-differentiability of Gaussian random fields with stationary increments. As an application of the result, we prove that the uniform Hölder condition for the maximum local times of Gaussian random fields with stationary increments obtained in Xiao (1997) is optimal. These results are applicable to fractional Riesz–Bessel processes and stationary Gaussian random fields in the Matérn and Cauchy classes.




ia

Stratonovich stochastic differential equation with irregular coefficients: Girsanov’s example revisited

Ilya Pavlyukevich, Georgiy Shevchenko.

Source: Bernoulli, Volume 26, Number 2, 1381--1409.

Abstract:
In this paper, we study the Stratonovich stochastic differential equation $mathrm{d}X=|X|^{alpha }circ mathrm{d}B$, $alpha in (-1,1)$, which has been introduced by Cherstvy et al. ( New J. Phys. 15 (2013) 083039) in the context of analysis of anomalous diffusions in heterogeneous media. We determine its weak and strong solutions, which are homogeneous strong Markov processes spending zero time at $0$: for $alpha in (0,1)$, these solutions have the form egin{equation*}X_{t}^{ heta }=((1-alpha)B_{t}^{ heta })^{1/(1-alpha )},end{equation*} where $B^{ heta }$ is the $ heta $-skew Brownian motion driven by $B$ and starting at $frac{1}{1-alpha }(X_{0})^{1-alpha }$, $ heta in [-1,1]$, and $(x)^{gamma }=|x|^{gamma }operatorname{sign}x$; for $alpha in (-1,0]$, only the case $ heta =0$ is possible. The central part of the paper consists in the proof of the existence of a quadratic covariation $[f(B^{ heta }),B]$ for a locally square integrable function $f$ and is based on the time-reversion technique for Markovian diffusions.




ia

On stability of traveling wave solutions for integro-differential equations related to branching Markov processes

Pasha Tkachov.

Source: Bernoulli, Volume 26, Number 2, 1354--1380.

Abstract:
The aim of this paper is to prove stability of traveling waves for integro-differential equations connected with branching Markov processes. In other words, the limiting law of the left-most particle of a (time-continuous) branching Markov process with a Lévy non-branching part is demonstrated. The key idea is to approximate the branching Markov process by a branching random walk and apply the result of Aïdékon [ Ann. Probab. 41 (2013) 1362–1426] on the limiting law of the latter one.




ia

Consistent structure estimation of exponential-family random graph models with block structure

Michael Schweinberger.

Source: Bernoulli, Volume 26, Number 2, 1205--1233.

Abstract:
We consider the challenging problem of statistical inference for exponential-family random graph models based on a single observation of a random graph with complex dependence. To facilitate statistical inference, we consider random graphs with additional structure in the form of block structure. We have shown elsewhere that when the block structure is known, it facilitates consistency results for $M$-estimators of canonical and curved exponential-family random graph models with complex dependence, such as transitivity. In practice, the block structure is known in some applications (e.g., multilevel networks), but is unknown in others. When the block structure is unknown, the first and foremost question is whether it can be recovered with high probability based on a single observation of a random graph with complex dependence. The main consistency results of the paper show that it is possible to do so under weak dependence and smoothness conditions. These results confirm that exponential-family random graph models with block structure constitute a promising direction of statistical network analysis.




ia

Robust regression via mutivariate regression depth

Chao Gao.

Source: Bernoulli, Volume 26, Number 2, 1139--1170.

Abstract:
This paper studies robust regression in the settings of Huber’s $epsilon$-contamination models. We consider estimators that are maximizers of multivariate regression depth functions. These estimators are shown to achieve minimax rates in the settings of $epsilon$-contamination models for various regression problems including nonparametric regression, sparse linear regression, reduced rank regression, etc. We also discuss a general notion of depth function for linear operators that has potential applications in robust functional linear regression.




ia

A Bayesian nonparametric approach to log-concave density estimation

Ester Mariucci, Kolyan Ray, Botond Szabó.

Source: Bernoulli, Volume 26, Number 2, 1070--1097.

Abstract:
The estimation of a log-concave density on $mathbb{R}$ is a canonical problem in the area of shape-constrained nonparametric inference. We present a Bayesian nonparametric approach to this problem based on an exponentiated Dirichlet process mixture prior and show that the posterior distribution converges to the log-concave truth at the (near-) minimax rate in Hellinger distance. Our proof proceeds by establishing a general contraction result based on the log-concave maximum likelihood estimator that prevents the need for further metric entropy calculations. We further present computationally more feasible approximations and both an empirical and hierarchical Bayes approach. All priors are illustrated numerically via simulations.




ia

Distances and large deviations in the spatial preferential attachment model

Christian Hirsch, Christian Mönch.

Source: Bernoulli, Volume 26, Number 2, 927--947.

Abstract:
This paper considers two asymptotic properties of a spatial preferential-attachment model introduced by E. Jacob and P. Mörters (In Algorithms and Models for the Web Graph (2013) 14–25 Springer). First, in a regime of strong linear reinforcement, we show that typical distances are at most of doubly-logarithmic order. Second, we derive a large deviation principle for the empirical neighbourhood structure and express the rate function as solution to an entropy minimisation problem in the space of stationary marked point processes.




ia

Recurrence of multidimensional persistent random walks. Fourier and series criteria

Peggy Cénac, Basile de Loynes, Yoann Offret, Arnaud Rousselle.

Source: Bernoulli, Volume 26, Number 2, 858--892.

Abstract:
The recurrence and transience of persistent random walks built from variable length Markov chains are investigated. It turns out that these stochastic processes can be seen as Lévy walks for which the persistence times depend on some internal Markov chain: they admit Markov random walk skeletons. A recurrence versus transience dichotomy is highlighted. Assuming the positive recurrence of the driving chain, a sufficient Fourier criterion for the recurrence, close to the usual Chung–Fuchs one, is given and a series criterion is derived. The key tool is the Nagaev–Guivarc’h method. Finally, we focus on particular two-dimensional persistent random walks, including directionally reinforced random walks, for which necessary and sufficient Fourier and series criteria are obtained. Inspired by ( Adv. Math. 208 (2007) 680–698), we produce a genuine counterexample to the conjecture of ( Adv. Math. 117 (1996) 239–252). As for the one-dimensional case studied in ( J. Theoret. Probab. 31 (2018) 232–243), it is easier for a persistent random walk than its skeleton to be recurrent. However, such examples are much more difficult to exhibit in the higher dimensional context. These results are based on a surprisingly novel – to our knowledge – upper bound for the Lévy concentration function associated with symmetric distributions.




ia

Stochastic differential equations with a fractionally filtered delay: A semimartingale model for long-range dependent processes

Richard A. Davis, Mikkel Slot Nielsen, Victor Rohde.

Source: Bernoulli, Volume 26, Number 2, 799--827.

Abstract:
In this paper, we introduce a model, the stochastic fractional delay differential equation (SFDDE), which is based on the linear stochastic delay differential equation and produces stationary processes with hyperbolically decaying autocovariance functions. The model departs from the usual way of incorporating this type of long-range dependence into a short-memory model as it is obtained by applying a fractional filter to the drift term rather than to the noise term. The advantages of this approach are that the corresponding long-range dependent solutions are semimartingales and the local behavior of the sample paths is unaffected by the degree of long memory. We prove existence and uniqueness of solutions to the SFDDEs and study their spectral densities and autocovariance functions. Moreover, we define a subclass of SFDDEs which we study in detail and relate to the well-known fractionally integrated CARMA processes. Finally, we consider the task of simulating from the defining SFDDEs.




ia

A Feynman–Kac result via Markov BSDEs with generalised drivers

Elena Issoglio, Francesco Russo.

Source: Bernoulli, Volume 26, Number 1, 728--766.

Abstract:
In this paper, we investigate BSDEs where the driver contains a distributional term (in the sense of generalised functions) and derive general Feynman–Kac formulae related to these BSDEs. We introduce an integral operator to give sense to the equation and then we show the existence of a strong solution employing results on a related PDE. Due to the irregularity of the driver, the $Y$-component of a couple $(Y,Z)$ solving the BSDE is not necessarily a semimartingale but a weak Dirichlet process.




ia

Robust modifications of U-statistics and applications to covariance estimation problems

Stanislav Minsker, Xiaohan Wei.

Source: Bernoulli, Volume 26, Number 1, 694--727.

Abstract:
Let $Y$ be a $d$-dimensional random vector with unknown mean $mu $ and covariance matrix $Sigma $. This paper is motivated by the problem of designing an estimator of $Sigma $ that admits exponential deviation bounds in the operator norm under minimal assumptions on the underlying distribution, such as existence of only 4th moments of the coordinates of $Y$. To address this problem, we propose robust modifications of the operator-valued U-statistics, obtain non-asymptotic guarantees for their performance, and demonstrate the implications of these results to the covariance estimation problem under various structural assumptions.




ia

On frequentist coverage errors of Bayesian credible sets in moderately high dimensions

Keisuke Yano, Kengo Kato.

Source: Bernoulli, Volume 26, Number 1, 616--641.

Abstract:
In this paper, we study frequentist coverage errors of Bayesian credible sets for an approximately linear regression model with (moderately) high dimensional regressors, where the dimension of the regressors may increase with but is smaller than the sample size. Specifically, we consider quasi-Bayesian inference on the slope vector under the quasi-likelihood with Gaussian error distribution. Under this setup, we derive finite sample bounds on frequentist coverage errors of Bayesian credible rectangles. Derivation of those bounds builds on a novel Berry–Esseen type bound on quasi-posterior distributions and recent results on high-dimensional CLT on hyperrectangles. We use this general result to quantify coverage errors of Castillo–Nickl and $L^{infty}$-credible bands for Gaussian white noise models, linear inverse problems, and (possibly non-Gaussian) nonparametric regression models. In particular, we show that Bayesian credible bands for those nonparametric models have coverage errors decaying polynomially fast in the sample size, implying advantages of Bayesian credible bands over confidence bands based on extreme value theory.




ia

Operator-scaling Gaussian random fields via aggregation

Yi Shen, Yizao Wang.

Source: Bernoulli, Volume 26, Number 1, 500--530.

Abstract:
We propose an aggregated random-field model, and investigate the scaling limits of the aggregated partial-sum random fields. In this model, each copy in the aggregation is a $pm 1$-valued random field built from two correlated one-dimensional random walks, the law of each determined by a random persistence parameter. A flexible joint distribution of the two parameters is introduced, and given the parameters the two correlated random walks are conditionally independent. For the aggregated random field, when the persistence parameters are independent, the scaling limit is a fractional Brownian sheet. When the persistence parameters are tail-dependent, characterized in the framework of multivariate regular variation, the scaling limit is more delicate, and in particular depends on the growth rates of the underlying rectangular region along two directions: at different rates different operator-scaling Gaussian random fields appear as the region area tends to infinity. In particular, at the so-called critical speed, a large family of Gaussian random fields with long-range dependence arise in the limit. We also identify four different regimes at non-critical speed where fractional Brownian sheets arise in the limit.




ia

Multivariate count autoregression

Konstantinos Fokianos, Bård Støve, Dag Tjøstheim, Paul Doukhan.

Source: Bernoulli, Volume 26, Number 1, 471--499.

Abstract:
We are studying linear and log-linear models for multivariate count time series data with Poisson marginals. For studying the properties of such processes we develop a novel conceptual framework which is based on copulas. Earlier contributions impose the copula on the joint distribution of the vector of counts by employing a continuous extension methodology. Instead we introduce a copula function on a vector of associated continuous random variables. This construction avoids conceptual difficulties related to the joint distribution of counts yet it keeps the properties of the Poisson process marginally. Furthermore, this construction can be employed for modeling multivariate count time series with other marginal count distributions. We employ Markov chain theory and the notion of weak dependence to study ergodicity and stationarity of the models we consider. Suitable estimating equations are suggested for estimating unknown model parameters. The large sample properties of the resulting estimators are studied in detail. The work concludes with some simulations and a real data example.




ia

A new method for obtaining sharp compound Poisson approximation error estimates for sums of locally dependent random variables

Michael V. Boutsikas, Eutichia Vaggelatou

Source: Bernoulli, Volume 16, Number 2, 301--330.

Abstract:
Let X 1 , X 2 , …, X n be a sequence of independent or locally dependent random variables taking values in ℤ + . In this paper, we derive sharp bounds, via a new probabilistic method, for the total variation distance between the distribution of the sum ∑ i =1 n X i and an appropriate Poisson or compound Poisson distribution. These bounds include a factor which depends on the smoothness of the approximating Poisson or compound Poisson distribution. This “smoothness factor” is of order O( σ −2 ), according to a heuristic argument, where σ 2 denotes the variance of the approximating distribution. In this way, we offer sharp error estimates for a large range of values of the parameters. Finally, specific examples concerning appearances of rare runs in sequences of Bernoulli trials are presented by way of illustration.




ia

English given names : popularity, spelling variants, diminutives and abbreviations / by Carol Baxter.

Names, Personal -- England.




ia

Fuhlbohm family history : a collection of memorabilia of our ancestors and families in Germany, USA, and Australia / by Oscar Fuhlbohm.

Fuhlbohm (Family)