f Bayesian linear regression for multivariate responses under group sparsity By projecteuclid.org Published On :: Mon, 27 Apr 2020 04:02 EDT Bo Ning, Seonghyun Jeong, Subhashis Ghosal. Source: Bernoulli, Volume 26, Number 3, 2353--2382.Abstract: We study frequentist properties of a Bayesian high-dimensional multivariate linear regression model with correlated responses. The predictors are separated into many groups and the group structure is pre-determined. Two features of the model are unique: (i) group sparsity is imposed on the predictors; (ii) the covariance matrix is unknown and its dimensions can also be high. We choose a product of independent spike-and-slab priors on the regression coefficients and a new prior on the covariance matrix based on its eigendecomposition. Each spike-and-slab prior is a mixture of a point mass at zero and a multivariate density involving the $\ell_{2,1}$-norm. We first obtain the posterior contraction rate and bounds on the effective dimension of the model that hold with high posterior probability. We then show that the multivariate regression coefficients can be recovered under certain compatibility conditions. Finally, we quantify the uncertainty for the regression coefficients with frequentist validity through a Bernstein–von Mises type theorem. The result leads to selection consistency for the Bayesian method. We derive the posterior contraction rate using the general theory by constructing a suitable test from first principles using moment bounds for certain likelihood ratios. This leads to posterior concentration around the truth with respect to the average Rényi divergence of order $1/2$. This technique of obtaining the required tests for posterior contraction rate could be useful in many other problems. Full Article
f A refined Cramér-type moderate deviation for sums of local statistics By projecteuclid.org Published On :: Mon, 27 Apr 2020 04:02 EDT Xiao Fang, Li Luo, Qi-Man Shao. Source: Bernoulli, Volume 26, Number 3, 2319--2352.Abstract: We prove a refined Cramér-type moderate deviation result that takes into account the skewness in normal approximation for sums of local statistics of independent random variables. We apply the main result to $k$-runs, U-statistics and subgraph counts in the Erdős–Rényi random graph. To prove our main result, we develop exponential concentration inequalities and higher-order tail probability expansions via Stein’s method. Full Article
f Convergence of persistence diagrams for topological crackle By projecteuclid.org Published On :: Mon, 27 Apr 2020 04:02 EDT Takashi Owada, Omer Bobrowski. Source: Bernoulli, Volume 26, Number 3, 2275--2310.Abstract: In this paper, we study the persistent homology associated with topological crackle generated by distributions with an unbounded support. Persistent homology is a topological and algebraic structure that tracks the creation and destruction of topological cycles (generalizations of loops or holes) in different dimensions. Topological crackle is a term that refers to topological cycles generated by random points far away from the bulk of other points, when the support is unbounded. We establish weak convergence results for persistence diagrams – a point process representation for persistent homology, where each topological cycle is represented by its $(\mathit{birth},\mathit{death})$ coordinates. In this work, we treat persistence diagrams as random closed sets, so that the resulting weak convergence is defined in terms of the Fell topology. Using this framework, we show that the limiting persistence diagrams can be divided into two parts. The first part is a deterministic limit containing a densely-growing number of persistence pairs with a shorter lifespan. The second part is a two-dimensional Poisson process, representing persistence pairs with a longer lifespan. Full Article
f Concentration of the spectral norm of Erdős–Rényi random graphs By projecteuclid.org Published On :: Mon, 27 Apr 2020 04:02 EDT Gábor Lugosi, Shahar Mendelson, Nikita Zhivotovskiy. Source: Bernoulli, Volume 26, Number 3, 2253--2274.Abstract: We present results on the concentration properties of the spectral norm $\|A_{p}\|$ of the adjacency matrix $A_{p}$ of an Erdős–Rényi random graph $G(n,p)$. First, we consider the Erdős–Rényi random graph process and prove that $\|A_{p}\|$ is uniformly concentrated over the range $p\in[C\log n/n,1]$. The analysis is based on delocalization arguments, uniform laws of large numbers, together with the entropy method to prove concentration inequalities. As an application of our techniques, we prove sharp sub-Gaussian moment inequalities for $\|A_{p}\|$ for all $p\in[c\log^{3}n/n,1]$ that improve the general bounds of Alon, Krivelevich, and Vu (Israel J. Math. 131 (2002) 259–267) and some of the more recent results of Erdős et al. (Ann. Probab. 41 (2013) 2279–2375). Both results are consistent with the asymptotic result of Füredi and Komlós (Combinatorica 1 (1981) 233–241) that holds for fixed $p$ as $n\to\infty$. Full Article
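To make the quantity in the preceding abstract concrete, here is a minimal sketch that draws one $G(n,p)$ adjacency matrix and approximates its spectral norm by power iteration. The function names, the choice of power iteration, and the parameter values are illustrative assumptions; none of them come from the paper.

```python
import random


def erdos_renyi_adjacency(n, p, rng):
    """Symmetric 0/1 adjacency matrix of G(n, p) without self-loops."""
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i][j] = adj[j][i] = 1.0
    return adj


def spectral_norm(adj, iterations=50):
    """Approximate the largest absolute eigenvalue by power iteration.

    For a symmetric matrix this equals the spectral norm. The start vector
    is all ones, which overlaps well with the leading eigenvector of a
    dense random graph (an assumption that suits this illustration only).
    """
    n = len(adj)
    v = [1.0] * n
    norm = 0.0
    for _ in range(iterations):
        w = [sum(row[j] * v[j] for j in range(n)) for row in adj]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
    return norm


if __name__ == "__main__":
    rng = random.Random(0)
    a = erdos_renyi_adjacency(120, 0.5, rng)
    print(spectral_norm(a))  # expected to be close to n * p = 60
```

For fixed $p$ the value should sit near $np$, in line with the Füredi and Komlós asymptotics cited in the abstract; the sketch says nothing about the uniform concentration results themselves.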
f On Sobolev tests of uniformity on the circle with an extension to the sphere By projecteuclid.org Published On :: Mon, 27 Apr 2020 04:02 EDT Sreenivasa Rao Jammalamadaka, Simos Meintanis, Thomas Verdebout. Source: Bernoulli, Volume 26, Number 3, 2226--2252.Abstract: Circular and spherical data arise in many applications, especially in biology, Earth sciences and astronomy. In dealing with such data, one of the preliminary steps before any further inference, is to test if such data is isotropic, that is, uniformly distributed around the circle or the sphere. In view of its importance, there is a considerable literature on the topic. In the present work, we provide new tests of uniformity on the circle based on original asymptotic results. Our tests are motivated by the shape of locally and asymptotically maximin tests of uniformity against generalized von Mises distributions. We show that they are uniformly consistent. Empirical power comparisons with several competing procedures are presented via simulations. The new tests detect particularly well multimodal alternatives such as mixtures of von Mises distributions. A practically-oriented combination of the new tests with already existing Sobolev tests is proposed. An extension to testing uniformity on the sphere, along with some simulations, is included. The procedures are illustrated on a real dataset. Full Article
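As a point of reference for the entry above, the sketch below computes the classical Rayleigh statistic, which is usually described as the simplest member of the Sobolev family of uniformity tests on the circle. It is not one of the new tests proposed in the paper, and the function name is an assumption made for illustration.

```python
import math


def rayleigh_statistic(angles):
    """Rayleigh statistic 2 * n * Rbar^2 for angles given in radians.

    Rbar is the length of the mean resultant vector. Under uniformity the
    statistic is asymptotically chi-square with 2 degrees of freedom, so
    large values speak against uniformity.
    """
    n = len(angles)
    c = sum(math.cos(a) for a in angles)
    s = sum(math.sin(a) for a in angles)
    return 2.0 * (c * c + s * s) / n


if __name__ == "__main__":
    evenly_spread = [2.0 * math.pi * k / 8 for k in range(8)]
    concentrated = [0.3] * 10
    print(rayleigh_statistic(evenly_spread))
    print(rayleigh_statistic(concentrated))
```

The Rayleigh statistic only uses the first trigonometric moment, which is why it has little power against multimodal alternatives such as the von Mises mixtures mentioned in the abstract.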
f Exponential integrability and exit times of diffusions on sub-Riemannian and metric measure spaces By projecteuclid.org Published On :: Mon, 27 Apr 2020 04:02 EDT Anton Thalmaier, James Thompson. Source: Bernoulli, Volume 26, Number 3, 2202--2225.Abstract: In this article, we derive moment estimates, exponential integrability, concentration inequalities and exit times estimates for canonical diffusions firstly on sub-Riemannian limits of Riemannian foliations and secondly in the nonsmooth setting of $\operatorname{RCD}^{*}(K,N)$ spaces. In each case, the necessary ingredients are Itô’s formula and a comparison theorem for the Laplacian, for which we refer to the recent literature. As an application, we derive pointwise Carmona-type estimates on eigenfunctions of Schrödinger operators. Full Article
f Scaling limits for super-replication with transient price impact By projecteuclid.org Published On :: Mon, 27 Apr 2020 04:02 EDT Peter Bank, Yan Dolinsky. Source: Bernoulli, Volume 26, Number 3, 2176--2201.Abstract: We prove a scaling limit theorem for the super-replication cost of options in a Cox–Ross–Rubinstein binomial model with transient price impact. The correct scaling turns out to keep the market depth parameter constant while resilience over fixed periods of time grows in inverse proportion with the duration between trading times. For vanilla options, the scaling limit is found to coincide with the one obtained by PDE-methods in ( Math. Finance 22 (2012) 250–276) for models with purely temporary price impact. These models are a special case of our framework and so our probabilistic scaling limit argument allows one to expand the scope of the scaling limit result to path-dependent options. Full Article
f Directional differentiability for supremum-type functionals: Statistical applications By projecteuclid.org Published On :: Mon, 27 Apr 2020 04:02 EDT Javier Cárcamo, Antonio Cuevas, Luis-Alberto Rodríguez. Source: Bernoulli, Volume 26, Number 3, 2143--2175.Abstract: We show that various functionals related to the supremum of a real function defined on an arbitrary set or a measure space are Hadamard directionally differentiable. We specifically consider the supremum norm, the supremum, the infimum, and the amplitude of a function. The (usually non-linear) derivatives of these maps adopt simple expressions under suitable assumptions on the underlying space. As an application, we improve and extend to the multidimensional case the results in Raghavachari ( Ann. Statist. 1 (1973) 67–73) regarding the limiting distributions of Kolmogorov–Smirnov type statistics under the alternative hypothesis. Similar results are obtained for analogous statistics associated with copulas. We additionally solve an open problem about the Berk–Jones statistic proposed by Jager and Wellner (In A Festschrift for Herman Rubin (2004) 319–331 IMS). Finally, the asymptotic distribution of maximum mean discrepancies over Donsker classes of functions is derived. Full Article
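The Kolmogorov–Smirnov statistic mentioned in the entry above is a supremum-type functional of the empirical distribution function. A minimal one-sample sketch, with a hypothetical function name and a uniform reference distribution chosen only for illustration:

```python
def ks_statistic(sample, cdf):
    """One-sample Kolmogorov-Smirnov statistic sup_x |F_n(x) - F(x)|.

    The supremum is attained at a sample point, either just before or
    just after the jump of the empirical distribution function F_n.
    """
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1]."""
    return min(max(x, 0.0), 1.0)


if __name__ == "__main__":
    print(ks_statistic([0.25, 0.75], uniform_cdf))
```

This only evaluates the statistic; the limiting distributions under the alternative, which are the subject of the paper, are not reproduced here.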
f Perfect sampling for Gibbs point processes using partial rejection sampling By projecteuclid.org Published On :: Mon, 27 Apr 2020 04:02 EDT Sarat B. Moka, Dirk P. Kroese. Source: Bernoulli, Volume 26, Number 3, 2082--2104.Abstract: We present a perfect sampling algorithm for Gibbs point processes, based on the partial rejection sampling of Guo, Jerrum and Liu (In STOC’17 – Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing (2017) 342–355 ACM). Our particular focus is on pairwise interaction processes, penetrable spheres mixture models and area-interaction processes, with a finite interaction range. For an interaction range $2r$ of the target process, the proposed algorithm can generate a perfect sample with $O(\log(1/r))$ expected running time complexity, provided that the intensity of the points is not too high and $\Theta(1/r^{d})$ parallel processor units are available. Full Article
f First-order covariance inequalities via Stein’s method By projecteuclid.org Published On :: Mon, 27 Apr 2020 04:02 EDT Marie Ernst, Gesine Reinert, Yvik Swan. Source: Bernoulli, Volume 26, Number 3, 2051--2081.Abstract: We propose probabilistic representations for inverse Stein operators (i.e., solutions to Stein equations) under general conditions; in particular, we deduce new simple expressions for the Stein kernel. These representations allow us to deduce uniform and nonuniform Stein factors (i.e., bounds on solutions to Stein equations) and lead to new covariance identities expressing the covariance between arbitrary functionals of an arbitrary univariate target in terms of a weighted covariance of the derivatives of the functionals. Our weights are explicit, easily computable in most cases and expressed in terms of objects familiar within the context of Stein’s method. Applications of the Cauchy–Schwarz inequality to these weighted covariance identities lead to sharp upper and lower covariance bounds and, in particular, weighted Poincaré inequalities. Many examples are given and, in particular, classical variance bounds due to Klaassen, Brascamp and Lieb or Otto and Menz are corollaries. Connections with more recent literature are also detailed. Full Article
f On estimation of nonsmooth functionals of sparse normal means By projecteuclid.org Published On :: Mon, 27 Apr 2020 04:02 EDT O. Collier, L. Comminges, A.B. Tsybakov. Source: Bernoulli, Volume 26, Number 3, 1989--2020.Abstract: We study the problem of estimation of $N_{\gamma}(\theta)=\sum_{i=1}^{d}|\theta_{i}|^{\gamma}$ for $\gamma>0$ and of the $\ell_{\gamma}$-norm of $\theta$ for $\gamma\ge 1$ based on the observations $y_{i}=\theta_{i}+\varepsilon\xi_{i}$, $i=1,\ldots,d$, where $\theta=(\theta_{1},\dots,\theta_{d})$ are unknown parameters, $\varepsilon>0$ is known, and $\xi_{i}$ are i.i.d. standard normal random variables. We find the non-asymptotic minimax rate for estimation of these functionals on the class of $s$-sparse vectors $\theta$ and we propose estimators achieving this rate. Full Article
f On sampling from a log-concave density using kinetic Langevin diffusions By projecteuclid.org Published On :: Mon, 27 Apr 2020 04:02 EDT Arnak S. Dalalyan, Lionel Riou-Durand. Source: Bernoulli, Volume 26, Number 3, 1956--1988.Abstract: Langevin diffusion processes and their discretizations are often used for sampling from a target density. The most convenient framework for assessing the quality of such a sampling scheme corresponds to smooth and strongly log-concave densities defined on $\mathbb{R}^{p}$. The present work focuses on this framework and studies the behavior of the Monte Carlo algorithm based on discretizations of the kinetic Langevin diffusion. We first prove the geometric mixing property of the kinetic Langevin diffusion with a mixing rate that is optimal in terms of its dependence on the condition number. We then use this result for obtaining improved guarantees of sampling using the kinetic Langevin Monte Carlo method, when the quality of sampling is measured by the Wasserstein distance. We also consider the situation where the Hessian of the log-density of the target distribution is Lipschitz-continuous. In this case, we introduce a new discretization of the kinetic Langevin diffusion and prove that this leads to a substantial improvement of the upper bound on the sampling error measured in Wasserstein distance. Full Article
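The idea of sampling by discretizing a kinetic Langevin diffusion can be sketched in one dimension as follows. This uses a plain Euler–Maruyama step for the system $dL_t = V_t\,dt$, $dV_t = -(\gamma V_t + \nabla f(L_t))\,dt + \sqrt{2\gamma}\,dW_t$ with target density proportional to $e^{-f}$. It is not the discretization analysed or introduced in the paper, and the function name, step size and friction value are assumptions chosen for illustration.

```python
import random


def kinetic_langevin_samples(grad_f, n_steps, step=0.05, friction=2.0, seed=0):
    """Euler-Maruyama discretization of a 1-D kinetic Langevin diffusion.

    grad_f is the gradient of the potential f, where the target density is
    proportional to exp(-f). Returns the position after every step.
    """
    rng = random.Random(seed)
    position, velocity = 0.0, 0.0
    noise_scale = (2.0 * friction * step) ** 0.5
    samples = []
    for _ in range(n_steps):
        # Both updates use the state from the start of the step.
        new_position = position + step * velocity
        drift = -(friction * velocity + grad_f(position))
        velocity += step * drift + noise_scale * rng.gauss(0.0, 1.0)
        position = new_position
        samples.append(position)
    return samples


if __name__ == "__main__":
    # Standard normal target: f(x) = x^2 / 2, so grad f(x) = x.
    draws = kinetic_langevin_samples(lambda x: x, 51000)[1000:]
    mean = sum(draws) / len(draws)
    var = sum((d - mean) ** 2 for d in draws) / len(draws)
    print(mean, var)  # roughly 0 and 1, up to discretization bias
```

The fixed step size leaves a small bias in the stationary variance, which is the kind of sampling error the paper quantifies in Wasserstein distance.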
f Busemann functions and semi-infinite O’Connell–Yor polymers By projecteuclid.org Published On :: Mon, 27 Apr 2020 04:02 EDT Tom Alberts, Firas Rassoul-Agha, Mackenzie Simper. Source: Bernoulli, Volume 26, Number 3, 1927--1955.Abstract: We prove that given any fixed asymptotic velocity, the finite length O’Connell–Yor polymer has an infinite length limit satisfying the law of large numbers with this velocity. By a Markovian property of the quenched polymer this reduces to showing the existence of Busemann functions: almost sure limits of ratios of random point-to-point partition functions. The key ingredients are the Burke property of the O’Connell–Yor polymer and a comparison lemma for the ratios of partition functions. We also show the existence of infinite length limits in the Brownian last passage percolation model. Full Article
f On the best constant in the martingale version of Fefferman’s inequality By projecteuclid.org Published On :: Mon, 27 Apr 2020 04:02 EDT Adam Osękowski. Source: Bernoulli, Volume 26, Number 3, 1912--1926.Abstract: Let $X=(X_{t})_{t\geq 0}\in H^{1}$ and $Y=(Y_{t})_{t\geq 0}\in\mathrm{BMO}$ be arbitrary continuous-path martingales. The paper contains the proof of the inequality \begin{equation*}\mathbb{E}\int_{0}^{\infty}\bigl\lvert d\langle X,Y\rangle_{t}\bigr\rvert\leq\sqrt{2}\Vert X\Vert_{H^{1}}\Vert Y\Vert_{\mathrm{BMO}_{2}},\end{equation*} and the constant $\sqrt{2}$ is shown to be the best possible. The proof rests on the construction of a certain special function, enjoying appropriate size and concavity conditions. Full Article
f Functional weak limit theorem for a local empirical process of non-stationary time series and its application By projecteuclid.org Published On :: Mon, 27 Apr 2020 04:02 EDT Ulrike Mayer, Henryk Zähle, Zhou Zhou. Source: Bernoulli, Volume 26, Number 3, 1891--1911.Abstract: We derive a functional weak limit theorem for a local empirical process of a wide class of piece-wise locally stationary (PLS) time series. The latter result is applied to derive the asymptotics of weighted empirical quantiles and weighted V-statistics of non-stationary time series. The class of admissible underlying time series is illustrated by means of PLS linear processes and PLS ARCH processes. Full Article
f Logarithmic Sobolev inequalities for finite spin systems and applications By projecteuclid.org Published On :: Mon, 27 Apr 2020 04:02 EDT Holger Sambale, Arthur Sinulis. Source: Bernoulli, Volume 26, Number 3, 1863--1890.Abstract: We derive sufficient conditions for a probability measure on a finite product space (a spin system) to satisfy a (modified) logarithmic Sobolev inequality. We establish these conditions for various examples, such as the (vertex-weighted) exponential random graph model, the random coloring and the hard-core model with fugacity. This leads to two separate branches of applications. The first branch is given by mixing time estimates of the Glauber dynamics. The proofs do not rely on coupling arguments, but instead use functional inequalities. As a byproduct, this also yields exponential decay of the relative entropy along the Glauber semigroup. Secondly, we investigate the concentration of measure phenomenon (particularly of higher order) for these spin systems. We show the effect of better concentration properties by centering not around the mean, but around a stochastic term in the exponential random graph model. From there, one can deduce a central limit theorem for the number of triangles from the CLT of the edge count. In the Erdős–Rényi model the first-order approximation leads to a quantification and a proof of a central limit theorem for subgraph counts. Full Article
f Kernel and wavelet density estimators on manifolds and more general metric spaces By projecteuclid.org Published On :: Mon, 27 Apr 2020 04:02 EDT Galatia Cleanthous, Athanasios G. Georgiadis, Gerard Kerkyacharian, Pencho Petrushev, Dominique Picard. Source: Bernoulli, Volume 26, Number 3, 1832--1862.Abstract: We consider the problem of estimating the density of observations taking values in classical or nonclassical spaces such as manifolds and more general metric spaces. Our setting is quite general but also sufficiently rich in allowing the development of smooth functional calculus with well localized spectral kernels, Besov regularity spaces, and wavelet type systems. Kernel and both linear and nonlinear wavelet density estimators are introduced and studied. Convergence rates for these estimators are established and discussed. Full Article
f Optimal functional supervised classification with separation condition By projecteuclid.org Published On :: Mon, 27 Apr 2020 04:02 EDT Sébastien Gadat, Sébastien Gerchinovitz, Clément Marteau. Source: Bernoulli, Volume 26, Number 3, 1797--1831.Abstract: We consider the binary supervised classification problem with the Gaussian functional model introduced in ( Math. Methods Statist. 22 (2013) 213–225). Taking advantage of the Gaussian structure, we design a natural plug-in classifier and derive a family of upper bounds on its worst-case excess risk over Sobolev spaces. These bounds are parametrized by a separation distance quantifying the difficulty of the problem, and are proved to be optimal (up to logarithmic factors) through matching minimax lower bounds. Using the recent works of (In Advances in Neural Information Processing Systems (2014) 3437–3445 Curran Associates) and ( Ann. Statist. 44 (2016) 982–1009), we also derive a logarithmic lower bound showing that the popular $k$-nearest neighbors classifier is far from optimality in this specific functional setting. Full Article
f A fast algorithm with minimax optimal guarantees for topic models with an unknown number of topics By projecteuclid.org Published On :: Mon, 27 Apr 2020 04:02 EDT Xin Bing, Florentina Bunea, Marten Wegkamp. Source: Bernoulli, Volume 26, Number 3, 1765--1796.Abstract: Topic models have become popular for the analysis of data that consists of a collection of $n$ independent multinomial observations, with parameters $N_{i}\in\mathbb{N}$ and $\Pi_{i}\in[0,1]^{p}$ for $i=1,\ldots,n$. The model links all cell probabilities, collected in a $p\times n$ matrix $\Pi$, via the assumption that $\Pi$ can be factorized as the product of two nonnegative matrices $A\in[0,1]^{p\times K}$ and $W\in[0,1]^{K\times n}$. Topic models were originally developed in text mining, when one browses through $n$ documents, based on a dictionary of $p$ words, and covering $K$ topics. In this terminology, the matrix $A$ is called the word-topic matrix, and is the main target of estimation. It can be viewed as a matrix of conditional probabilities, and it is uniquely defined, under appropriate separability assumptions, discussed in detail in this work. Notably, the unique $A$ is required to satisfy what is commonly known as the anchor word assumption, under which $A$ has an unknown number of rows respectively proportional to the canonical basis vectors in $\mathbb{R}^{K}$. The indices of such rows are referred to as anchor words. Recent computationally feasible algorithms, with theoretical guarantees, utilize constructively this assumption by linking the estimation of the set of anchor words with that of estimating the $K$ vertices of a simplex. This crucial step in the estimation of $A$ requires $K$ to be known, and cannot be easily extended to the more realistic set-up when $K$ is unknown. This work takes a different view on anchor word estimation, and on the estimation of $A$.
We propose a new method of estimation in topic models, that is not a variation on the existing simplex finding algorithms, and that estimates $K$ from the observed data. We derive new finite sample minimax lower bounds for the estimation of $A$, as well as new upper bounds for our proposed estimator. We describe the scenarios where our estimator is minimax adaptive. Our finite sample analysis is valid for any $n,N_{i},p$ and $K$, and both $p$ and $K$ are allowed to increase with $n$, a situation not handled well by previous analyses. We complement our theoretical results with a detailed simulation study. We illustrate that the new algorithm is faster and more accurate than the current ones, although we start out with a computational and theoretical disadvantage of not knowing the correct number of topics $K$, while we provide the competing methods with the correct value in our simulations. Full Article
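The factorization and the anchor word assumption described in this abstract can be illustrated with a toy example. The matrices, dimensions and helper names below are made up for illustration and have nothing to do with the paper's estimator; the sketch only shows what the model structure looks like.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def column_sums(m):
    """Sum of each column of a matrix stored as a list of rows."""
    return [sum(row[j] for row in m) for j in range(len(m[0]))]


def anchor_rows(a):
    """Indices of rows with exactly one nonzero entry.

    Such a row is proportional to a canonical basis vector, which is the
    anchor word condition stated in the abstract.
    """
    return [i for i, row in enumerate(a) if sum(1 for x in row if x != 0.0) == 1]


# Hypothetical word-topic matrix A: p = 3 words, K = 2 topics.
A = [[0.5, 0.0],
     [0.5, 0.2],
     [0.0, 0.8]]
# Hypothetical topic-document matrix W: K = 2 topics, n = 2 documents.
W = [[1.0, 0.3],
     [0.0, 0.7]]
# Each column of Pi holds the word probabilities of one document.
Pi = matmul(A, W)

if __name__ == "__main__":
    print(Pi)
    print(column_sums(Pi))
    print(anchor_rows(A))
```

Because the columns of both factors sum to one, each column of `Pi` is again a probability vector over the dictionary.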
f Local differential privacy: Elbow effect in optimal density estimation and adaptation over Besov ellipsoids By projecteuclid.org Published On :: Mon, 27 Apr 2020 04:02 EDT Cristina Butucea, Amandine Dubois, Martin Kroll, Adrien Saumard. Source: Bernoulli, Volume 26, Number 3, 1727--1764.Abstract: We address the problem of non-parametric density estimation under the additional constraint that only privatised data are allowed to be published and available for inference. For this purpose, we adopt a recent generalisation of classical minimax theory to the framework of local $\alpha$-differential privacy and provide a lower bound on the rate of convergence over Besov spaces $\mathcal{B}^{s}_{pq}$ under mean integrated $\mathbb{L}^{r}$-risk. This lower bound is deteriorated compared to the standard setup without privacy, and reveals a twofold elbow effect. In order to fulfill the privacy requirement, we suggest adding suitably scaled Laplace noise to empirical wavelet coefficients. Upper bounds within (at most) a logarithmic factor are derived under the assumption that $\alpha$ stays bounded as $n$ increases: A linear but non-adaptive wavelet estimator is shown to attain the lower bound whenever $p\geq r$ but provides a slower rate of convergence otherwise. An adaptive non-linear wavelet estimator with appropriately chosen smoothing parameters and thresholding is shown to attain the lower bound within a logarithmic factor for all cases. Full Article
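The privatisation step described above, adding suitably scaled Laplace noise, can be sketched with the standard Laplace mechanism applied to a single bounded quantity per individual. The wavelet machinery and the paper's exact noise scaling are not reproduced; the function names and the unit-range assumption are illustrative.

```python
import random


def laplace_noise(scale, rng):
    """Draw from the centred Laplace distribution with the given scale.

    The difference of two independent exponentials with mean `scale`
    has exactly this distribution.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize(values, alpha, rng, sensitivity=1.0):
    """Release each value plus Laplace noise of scale sensitivity / alpha.

    If every value lies in an interval of length `sensitivity`, each
    released value satisfies local alpha-differential privacy.
    """
    scale = sensitivity / alpha
    return [v + laplace_noise(scale, rng) for v in values]


if __name__ == "__main__":
    rng = random.Random(0)
    raw = [0.25] * 20000
    released = privatize(raw, alpha=1.0, rng=rng)
    print(sum(released) / len(released))  # close to 0.25 on average
```

The noise is centred, so averages over many individuals remain unbiased, while the added variance grows like $1/\alpha^{2}$; this is one way to see why the attainable rates are slower than in the setting without privacy.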
f On the eigenproblem for Gaussian bridges By projecteuclid.org Published On :: Mon, 27 Apr 2020 04:02 EDT Pavel Chigansky, Marina Kleptsyna, Dmytro Marushkevych. Source: Bernoulli, Volume 26, Number 3, 1706--1726.Abstract: Spectral decomposition of the covariance operator is one of the main building blocks in the theory and applications of Gaussian processes. Unfortunately, it is notoriously hard to derive in a closed form. In this paper, we consider the eigenproblem for Gaussian bridges. Given a base process, its bridge is obtained by conditioning the trajectories to start and terminate at the given points. What can be said about the spectrum of a bridge, given the spectrum of its base process? We show how this question can be answered asymptotically for a family of processes, including the fractional Brownian motion. Full Article
f Influence of the seed in affine preferential attachment trees By projecteuclid.org Published On :: Mon, 27 Apr 2020 04:02 EDT David Corlin Marchand, Ioan Manolescu. Source: Bernoulli, Volume 26, Number 3, 1665--1705.Abstract: We study randomly growing trees governed by the affine preferential attachment rule. Starting with a seed tree $S$, vertices are attached one by one, each linked by an edge to a random vertex of the current tree, chosen with a probability proportional to an affine function of its degree. This yields a one-parameter family of preferential attachment trees $(T_{n}^{S})_{n\geq |S|}$, of which the linear model is a particular case. Depending on the choice of the parameter, the power-laws governing the degrees in $T_{n}^{S}$ have different exponents. We study the problem of the asymptotic influence of the seed $S$ on the law of $T_{n}^{S}$. We show that, for any two distinct seeds $S$ and $S'$, the laws of $T_{n}^{S}$ and $T_{n}^{S'}$ remain at uniformly positive total-variation distance as $n$ increases. This is a continuation of Curien et al. (J. Éc. Polytech. Math. 2 (2015) 1–34), which in turn was inspired by a conjecture of Bubeck et al. (IEEE Trans. Netw. Sci. Eng. 2 (2015) 30–39). The technique developed here is more robust than previous ones and is likely to help in the study of more general attachment mechanisms. Full Article
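The growth rule described above can be simulated directly. In this sketch the attachment weight of a vertex is taken to be `degree + delta`, which is one possible affine parametrization and may differ from the one used in the paper; the function name and the integer vertex labels are also assumptions.

```python
import random


def grow_tree(seed_edges, n_vertices, delta, rng):
    """Grow a tree from a seed by affine preferential attachment.

    Each new vertex attaches to an existing vertex u chosen with
    probability proportional to degree(u) + delta (delta > -1 keeps all
    weights positive). Vertex labels are assumed to be integers.
    """
    degree = {}
    for u, v in seed_edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    edges = list(seed_edges)
    next_label = max(degree) + 1
    while len(degree) < n_vertices:
        vertices = list(degree)
        weights = [degree[u] + delta for u in vertices]
        target = rng.choices(vertices, weights=weights, k=1)[0]
        edges.append((target, next_label))
        degree[target] += 1
        degree[next_label] = 1
        next_label += 1
    return edges, degree


if __name__ == "__main__":
    rng = random.Random(0)
    path_seed = [(0, 1), (1, 2)]  # a path on three vertices
    edges, degree = grow_tree(path_seed, 200, delta=0.5, rng=rng)
    print(len(edges), max(degree.values()))
```

Running the simulation from two different seeds and comparing summary statistics such as the maximum degree gives an informal feel for the question studied in the paper, namely whether the seed remains detectable as the tree grows.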
f Estimating the number of connected components in a graph via subgraph sampling By projecteuclid.org Published On :: Mon, 27 Apr 2020 04:02 EDT Jason M. Klusowski, Yihong Wu. Source: Bernoulli, Volume 26, Number 3, 1635--1664.Abstract: Learning properties of large graphs from samples has been an important problem in statistical network analysis since the early work of Goodman ( Ann. Math. Stat. 20 (1949) 572–579) and Frank ( Scand. J. Stat. 5 (1978) 177–188). We revisit a problem formulated by Frank ( Scand. J. Stat. 5 (1978) 177–188) of estimating the number of connected components in a large graph based on the subgraph sampling model, in which we randomly sample a subset of the vertices and observe the induced subgraph. The key question is whether accurate estimation is achievable in the sublinear regime where only a vanishing fraction of the vertices are sampled. We show that it is impossible if the parent graph is allowed to contain high-degree vertices or long induced cycles. For the class of chordal graphs, where induced cycles of length four or above are forbidden, we characterize the optimal sample complexity within constant factors and construct linear-time estimators that provably achieve these bounds. This significantly expands the scope of previous results which have focused on unbiased estimators and special classes of graphs such as forests or cliques. Both the construction and the analysis of the proposed methodology rely on combinatorial properties of chordal graphs and identities of induced subgraph counts. They, in turn, also play a key role in proving minimax lower bounds based on construction of random instances of graphs with matching structures of small subgraphs. Full Article
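The subgraph sampling model in this entry can be sketched as follows: keep each vertex independently with some probability, observe the induced subgraph, and count its connected components. Independent (Bernoulli) vertex sampling and all names below are assumptions made for illustration, and no estimator from the paper is implemented.

```python
import random


def sample_induced_subgraph(vertices, edges, q, rng):
    """Keep each vertex with probability q and return the induced subgraph."""
    kept = {v for v in vertices if rng.random() < q}
    kept_edges = [(u, v) for u, v in edges if u in kept and v in kept]
    return kept, kept_edges


def count_components(vertices, edges):
    """Number of connected components, computed with union-find."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in parent})


if __name__ == "__main__":
    rng = random.Random(0)
    vertices = range(6)
    edges = [(0, 1), (1, 2), (3, 4)]  # components {0,1,2}, {3,4}, {5}
    kept, kept_edges = sample_induced_subgraph(vertices, edges, 0.5, rng)
    print(sorted(kept), count_components(kept, kept_edges))
```

The component count of the sampled subgraph is in general a biased summary of the parent graph: a sampled path can split into several pieces when an interior vertex is missed, which is part of what makes the estimation problem in the abstract nontrivial.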
f Sojourn time dimensions of fractional Brownian motion By projecteuclid.org Published On :: Mon, 27 Apr 2020 04:02 EDT Ivan Nourdin, Giovanni Peccati, Stéphane Seuret. Source: Bernoulli, Volume 26, Number 3, 1619--1634.Abstract: We describe the size of the sets of sojourn times $E_{\gamma}=\{t\geq 0:|B_{t}|\leq t^{\gamma}\}$ associated with a fractional Brownian motion $B$ in terms of various large scale dimensions. Full Article
f Efficient estimation in single index models through smoothing splines By projecteuclid.org Published On :: Fri, 31 Jan 2020 04:06 EST Arun K. Kuchibhotla, Rohit K. Patra. Source: Bernoulli, Volume 26, Number 2, 1587--1618.Abstract: We consider estimation and inference in a single index regression model with an unknown but smooth link function. In contrast to the standard approach of using kernels or regression splines, we use smoothing splines to estimate the smooth link function. We develop a method to compute the penalized least squares estimators (PLSEs) of the parametric and the nonparametric components given independent and identically distributed (i.i.d.) data. We prove the consistency and find the rates of convergence of the estimators. We establish asymptotic normality under mild assumptions and prove asymptotic efficiency of the parametric component under homoscedastic errors. A finite sample simulation corroborates our asymptotic theory. We also analyze a car mileage data set and an ozone concentration data set. The identifiability and existence of the PLSEs are also investigated. Full Article
f Random orthogonal matrices and the Cayley transform By projecteuclid.org Published On :: Fri, 31 Jan 2020 04:06 EST Michael Jauch, Peter D. Hoff, David B. Dunson. Source: Bernoulli, Volume 26, Number 2, 1560--1586.Abstract: Random orthogonal matrices play an important role in probability and statistics, arising in multivariate analysis, directional statistics, and models of physical systems, among other areas. Calculations involving random orthogonal matrices are complicated by their constrained support. Accordingly, we parametrize the Stiefel and Grassmann manifolds, represented as subsets of orthogonal matrices, in terms of Euclidean parameters using the Cayley transform. We derive the necessary Jacobian terms for change of variables formulas. Given a density defined on the Stiefel or Grassmann manifold, these allow us to specify the corresponding density for the Euclidean parameters, and vice versa. As an application, we present a Markov chain Monte Carlo approach to simulating from distributions on the Stiefel and Grassmann manifolds. Finally, we establish that the Euclidean parameters corresponding to a uniform orthogonal matrix can be approximated asymptotically by independent normals. This result contributes to the growing literature on normal approximations to the entries of random orthogonal matrices or transformations thereof. Full Article
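For the smallest nontrivial case, the Cayley transform in the entry above can be written in closed form. The sketch uses the convention $Q=(I-X)(I+X)^{-1}$ for a skew-symmetric $X$, which is one common choice and may differ from the paper's by a sign or ordering; the function names are hypothetical.

```python
def cayley_2x2(a):
    """Cayley transform Q = (I - X)(I + X)^{-1} of X = [[0, -a], [a, 0]].

    Multiplying out the 2x2 matrices gives a rotation matrix, so the
    single Euclidean parameter a parametrizes orthogonal matrices with
    determinant one (except the rotation by pi, reached only as a limit).
    """
    denom = 1.0 + a * a
    return [[(1.0 - a * a) / denom, 2.0 * a / denom],
            [-2.0 * a / denom, (1.0 - a * a) / denom]]


def times_transpose(q):
    """Compute Q Q^T for a square matrix stored as a list of rows."""
    n = len(q)
    return [[sum(q[i][k] * q[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    q = cayley_2x2(0.7)
    print(q)
    print(times_transpose(q))  # numerically the identity matrix
```

In higher dimensions the same map needs a matrix inverse, and densities transfer between the two parametrizations through the Jacobian terms that the paper derives.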
f Reliable clustering of Bernoulli mixture models By projecteuclid.org Published On :: Fri, 31 Jan 2020 04:06 EST Amir Najafi, Seyed Abolfazl Motahari, Hamid R. Rabiee. Source: Bernoulli, Volume 26, Number 2, 1535--1559.Abstract: A Bernoulli Mixture Model (BMM) is a finite mixture of random binary vectors with independent dimensions. The problem of clustering BMM data arises in a variety of real-world applications, ranging from population genetics to activity analysis in social networks. In this paper, we analyze the clusterability of BMMs from a theoretical perspective, when the number of clusters is unknown. In particular, we stipulate a set of conditions on the sample complexity and dimension of the model in order to guarantee the Probably Approximately Correct (PAC)-clusterability of a dataset. To the best of our knowledge, these findings are the first non-asymptotic bounds on the sample complexity of learning or clustering BMMs. Full Article
f On the probability distribution of the local times of diagonally operator-self-similar Gaussian fields with stationary increments By projecteuclid.org Published On :: Fri, 31 Jan 2020 04:06 EST Kamran Kalbasi, Thomas Mountford. Source: Bernoulli, Volume 26, Number 2, 1504--1534.Abstract: In this paper, we study the local times of vector-valued Gaussian fields that are ‘diagonally operator-self-similar’ and whose increments are stationary. Denoting the local time of such a Gaussian field around the spatial origin and over the temporal unit hypercube by $Z$, we show that there exists $\lambda\in(0,1)$ such that under some quite weak conditions, $\lim_{n\rightarrow+\infty}\frac{\sqrt[n]{\mathbb{E}(Z^{n})}}{n^{\lambda}}$ and $\lim_{x\rightarrow+\infty}\frac{-\log\mathbb{P}(Z>x)}{x^{\frac{1}{\lambda}}}$ both exist and are strictly positive (possibly $+\infty$). Moreover, we show that if the underlying Gaussian field is ‘strongly locally nondeterministic’, the above limits will be finite as well. These results are then applied to establish similar statements for the intersection local times of diagonally operator-self-similar Gaussian fields with stationary increments. Full Article
f Limit theorems for long-memory flows on Wiener chaos By projecteuclid.org Published On :: Fri, 31 Jan 2020 04:06 EST Shuyang Bai, Murad S. Taqqu. Source: Bernoulli, Volume 26, Number 2, 1473--1503.Abstract: We consider a long-memory stationary process, defined not through a moving average type structure, but by a flow generated by a measure-preserving transform and by a multiple Wiener–Itô integral. The flow is described using a notion of mixing for infinite-measure spaces introduced by Krickeberg (In Proc. Fifth Berkeley Sympos. Math. Statist. and Probability (Berkeley, Calif., 1965/66), Vol. II: Contributions to Probability Theory, Part 2 (1967) 431–446 Univ. California Press). Depending on the interplay between the spreading rate of the flow and the order of the multiple integral, one can recover known central or non-central limit theorems, and also obtain joint convergence of multiple integrals of different orders. Full Article
f A characterization of the finiteness of perpetual integrals of Lévy processes By projecteuclid.org Published On :: Fri, 31 Jan 2020 04:06 EST Martin Kolb, Mladen Savov. Source: Bernoulli, Volume 26, Number 2, 1453--1472.Abstract: We derive a criterion for the almost sure finiteness of perpetual integrals of Lévy processes for a class of real functions including all continuous functions and for general one-dimensional Lévy processes that drift to plus infinity. This generalizes previous work of Döring and Kyprianou, who considered Lévy processes having a local time, leaving the general case as an open problem. It turns out that the criterion for the general situation simplifies significantly when the process has a local time, but we also demonstrate that in general our criterion cannot be reduced. This answers an open problem posed in (J. Theoret. Probab. 29 (2016) 1192–1198). Full Article
f The moduli of non-differentiability for Gaussian random fields with stationary increments By projecteuclid.org Published On :: Fri, 31 Jan 2020 04:06 EST Wensheng Wang, Zhonggen Su, Yimin Xiao. Source: Bernoulli, Volume 26, Number 2, 1410--1430.Abstract: We establish the exact moduli of non-differentiability of Gaussian random fields with stationary increments. As an application of the result, we prove that the uniform Hölder condition for the maximum local times of Gaussian random fields with stationary increments obtained in Xiao (1997) is optimal. These results are applicable to fractional Riesz–Bessel processes and stationary Gaussian random fields in the Matérn and Cauchy classes. Full Article
f Stratonovich stochastic differential equation with irregular coefficients: Girsanov’s example revisited By projecteuclid.org Published On :: Fri, 31 Jan 2020 04:06 EST Ilya Pavlyukevich, Georgiy Shevchenko. Source: Bernoulli, Volume 26, Number 2, 1381--1409.Abstract: In this paper, we study the Stratonovich stochastic differential equation $\mathrm{d}X=|X|^{\alpha}\circ\mathrm{d}B$, $\alpha\in(-1,1)$, which has been introduced by Cherstvy et al. (New J. Phys. 15 (2013) 083039) in the context of analysis of anomalous diffusions in heterogeneous media. We determine its weak and strong solutions, which are homogeneous strong Markov processes spending zero time at $0$: for $\alpha\in(0,1)$, these solutions have the form \begin{equation*}X_{t}^{\theta}=((1-\alpha)B_{t}^{\theta})^{1/(1-\alpha)},\end{equation*} where $B^{\theta}$ is the $\theta$-skew Brownian motion driven by $B$ and starting at $\frac{1}{1-\alpha}(X_{0})^{1-\alpha}$, $\theta\in[-1,1]$, and $(x)^{\gamma}=|x|^{\gamma}\operatorname{sign}x$; for $\alpha\in(-1,0]$, only the case $\theta=0$ is possible. The central part of the paper consists in the proof of the existence of a quadratic covariation $[f(B^{\theta}),B]$ for a locally square integrable function $f$ and is based on the time-reversion technique for Markovian diffusions. Full Article
f On stability of traveling wave solutions for integro-differential equations related to branching Markov processes By projecteuclid.org Published On :: Fri, 31 Jan 2020 04:06 EST Pasha Tkachov. Source: Bernoulli, Volume 26, Number 2, 1354--1380.Abstract: The aim of this paper is to prove stability of traveling waves for integro-differential equations connected with branching Markov processes. In other words, the limiting law of the left-most particle of a (time-continuous) branching Markov process with a Lévy non-branching part is demonstrated. The key idea is to approximate the branching Markov process by a branching random walk and apply the result of Aïdékon [Ann. Probab. 41 (2013) 1362–1426] on the limiting law of the latter one. Full Article
f A new McKean–Vlasov stochastic interpretation of the parabolic–parabolic Keller–Segel model: The one-dimensional case By projecteuclid.org Published On :: Fri, 31 Jan 2020 04:06 EST Denis Talay, Milica Tomašević. Source: Bernoulli, Volume 26, Number 2, 1323--1353.Abstract: In this paper, we analyze a stochastic interpretation of the one-dimensional parabolic–parabolic Keller–Segel system without cut-off. It involves an original type of McKean–Vlasov interaction kernel. At the particle level, each particle interacts with all the past of each other particle by means of a time integrated functional involving a singular kernel. At the mean-field level studied here, the McKean–Vlasov limit process interacts with all the past time marginals of its probability distribution in a similarly singular way. We prove that the parabolic–parabolic Keller–Segel system in the whole Euclidean space and the corresponding McKean–Vlasov stochastic differential equation are well-posed for any values of the parameters of the model. Full Article
f Rates of convergence in de Finetti’s representation theorem, and Hausdorff moment problem By projecteuclid.org Published On :: Fri, 31 Jan 2020 04:06 EST Emanuele Dolera, Stefano Favaro. Source: Bernoulli, Volume 26, Number 2, 1294--1322.Abstract: Given a sequence $\{X_{n}\}_{n\geq 1}$ of exchangeable Bernoulli random variables, the celebrated de Finetti representation theorem states that $\frac{1}{n}\sum_{i=1}^{n}X_{i}\stackrel{a.s.}{\longrightarrow}Y$ for a suitable random variable $Y:\Omega\rightarrow[0,1]$ satisfying $\mathsf{P}[X_{1}=x_{1},\dots,X_{n}=x_{n}|Y]=Y^{\sum_{i=1}^{n}x_{i}}(1-Y)^{n-\sum_{i=1}^{n}x_{i}}$. In this paper, we study the rate of convergence in law of $\frac{1}{n}\sum_{i=1}^{n}X_{i}$ to $Y$ under the Kolmogorov distance. After showing that a rate of the type of $1/n^{\alpha}$ can be obtained for any index $\alpha\in(0,1]$, we find a sufficient condition on the distribution of $Y$ for the achievement of the optimal rate of convergence, that is $1/n$. Besides extending and strengthening recent results under the weaker Wasserstein distance, our main result weakens the regularity hypotheses on $Y$ in the context of the Hausdorff moment problem. Full Article
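As a small illustration of the $1/n$ rate mentioned in the abstract above, the following sketch (not taken from the paper) computes the exact Kolmogorov distance between the law of $\frac{1}{n}\sum_{i=1}^{n}X_{i}$ and the law of $Y$ in one special case. It assumes $Y\sim\mathrm{Beta}(a,b)$ with positive integer $a,b$, so that $\sum_{i=1}^{n}X_{i}$ is Beta-binomial; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def beta_fn(x, y):
    """Beta function B(x, y) for positive integers, as an exact fraction."""
    return Fraction(factorial(x - 1) * factorial(y - 1), factorial(x + y - 1))


def beta_binomial_pmf(n, a, b):
    """Law of S_n = X_1 + ... + X_n when Y ~ Beta(a, b) (illustrative assumption)."""
    return [comb(n, k) * beta_fn(k + a, n - k + b) / beta_fn(a, b)
            for k in range(n + 1)]


def beta_cdf(x, a, b):
    """CDF of Beta(a, b) for integer a, b, via the binomial-sum identity."""
    m = a + b - 1
    return sum(comb(m, j) * x**j * (1 - x)**(m - j) for j in range(a, m + 1))


def kolmogorov_distance(n, a=1, b=1):
    """sup_x |P(S_n/n <= x) - P(Y <= x)|, computed exactly.

    The law of S_n/n is a step function with jumps at k/n and the Beta CDF is
    continuous and increasing, so the supremum is attained at a jump point,
    either at the value just before the jump or at the value after it.
    """
    pmf = beta_binomial_pmf(n, a, b)
    dist = Fraction(0)
    cum = Fraction(0)
    for k in range(n + 1):
        g = beta_cdf(Fraction(k, n), a, b)
        dist = max(dist, abs(cum - g))  # left limit at k/n
        cum += pmf[k]
        dist = max(dist, abs(cum - g))  # value at k/n
    return dist


if __name__ == "__main__":
    for n in (5, 10, 100):
        print(n, kolmogorov_distance(n))
```

For uniform $Y$ (that is, $a=b=1$) the sum is uniform on $\{0,\dots,n\}$ and the distance equals $1/(n+1)$ exactly, which matches the order $1/n$ named in the abstract. Other integer choices of $a,b$ can be explored with the same function.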
f Strictly weak consensus in the uniform compass model on $\mathbb{Z}$ By projecteuclid.org Published On :: Fri, 31 Jan 2020 04:06 EST Nina Gantert, Markus Heydenreich, Timo Hirscher. Source: Bernoulli, Volume 26, Number 2, 1269--1293.Abstract: We investigate a model for opinion dynamics, where individuals (modeled by vertices of a graph) hold certain abstract opinions. As time progresses, neighboring individuals interact with each other, and this interaction results in a realignment of opinions closer towards each other. This mechanism triggers formation of consensus among the individuals. Our main focus is on strong consensus (i.e., global agreement of all individuals) versus weak consensus (i.e., local agreement among neighbors). By extending a known model to a more general opinion space, which lacks a “central” opinion acting as a contraction point, we provide an example of an opinion formation process on the one-dimensional lattice $\mathbb{Z}$ with weak consensus but no strong consensus. Full Article
f Consistent structure estimation of exponential-family random graph models with block structure By projecteuclid.org Published On :: Fri, 31 Jan 2020 04:06 EST Michael Schweinberger. Source: Bernoulli, Volume 26, Number 2, 1205--1233.Abstract: We consider the challenging problem of statistical inference for exponential-family random graph models based on a single observation of a random graph with complex dependence. To facilitate statistical inference, we consider random graphs with additional structure in the form of block structure. We have shown elsewhere that when the block structure is known, it facilitates consistency results for $M$-estimators of canonical and curved exponential-family random graph models with complex dependence, such as transitivity. In practice, the block structure is known in some applications (e.g., multilevel networks), but is unknown in others. When the block structure is unknown, the first and foremost question is whether it can be recovered with high probability based on a single observation of a random graph with complex dependence. The main consistency results of the paper show that it is possible to do so under weak dependence and smoothness conditions. These results confirm that exponential-family random graph models with block structure constitute a promising direction of statistical network analysis. Full Article
f Characterization of probability distribution convergence in Wasserstein distance by $L^{p}$-quantization error function By projecteuclid.org Published On :: Fri, 31 Jan 2020 04:06 EST Yating Liu, Gilles Pagès. Source: Bernoulli, Volume 26, Number 2, 1171--1204.Abstract: We establish conditions to characterize probability measures by their $L^{p}$-quantization error functions in both $\mathbb{R}^{d}$ and Hilbert settings. This characterization is two-fold: static (identity of two distributions) and dynamic (convergence for the $L^{p}$-Wasserstein distance). We first propose a criterion on the quantization level $N$, valid for any norm on $\mathbb{R}^{d}$ and any order $p$ based on a geometrical approach involving the Voronoï diagram. Then, we prove that in the $L^{2}$-case on a (separable) Hilbert space, the condition on the level $N$ can be reduced to $N=2$, which is optimal. More quantization based characterization cases in dimension 1 and a discussion of the completeness of a distance defined by the quantization error function can be found at the end of this paper. Full Article
f Interacting reinforced stochastic processes: Statistical inference based on the weighted empirical means By projecteuclid.org Published On :: Fri, 31 Jan 2020 04:06 EST Giacomo Aletti, Irene Crimaldi, Andrea Ghiglietti. Source: Bernoulli, Volume 26, Number 2, 1098--1138.Abstract: This work deals with a system of interacting reinforced stochastic processes, where each process $X^{j}=(X_{n,j})_{n}$ is located at a vertex $j$ of a finite weighted directed graph, and it can be interpreted as the sequence of “actions” adopted by an agent $j$ of the network. The interaction among the dynamics of these processes depends on the weighted adjacency matrix $W$ associated to the underlying graph: indeed, the probability that an agent $j$ chooses a certain action depends on its personal “inclination” $Z_{n,j}$ and on the inclinations $Z_{n,h}$, with $h\neq j$, of the other agents according to the entries of $W$. The best known example of reinforced stochastic process is the Pólya urn. The present paper focuses on the weighted empirical means $N_{n,j}=\sum_{k=1}^{n}q_{n,k}X_{k,j}$, since, for example, the current experience is more important than the past one in reinforced learning. Their almost sure synchronization and some central limit theorems in the sense of stable convergence are proven. The new approach with weighted means highlights the key points in proving some recent results for the personal inclinations $Z^{j}=(Z_{n,j})_{n}$ and for the empirical means $\overline{X}^{j}=(\sum_{k=1}^{n}X_{k,j}/n)_{n}$ given in recent papers (e.g. Aletti, Crimaldi and Ghiglietti (2019), Ann. Appl. Probab. 27 (2017) 3787–3844, Crimaldi et al. Stochastic Process. Appl. 129 (2019) 70–101). In fact, with a more sophisticated decomposition of the considered processes, we can understand how the different convergence rates of the involved stochastic processes combine.
From an application point of view, we provide confidence intervals for the common limit inclination of the agents and a test statistic to make inference on the matrix $W$, based on the weighted empirical means. In particular, we answer a research question posed in Aletti, Crimaldi and Ghiglietti (2019). Full Article
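The abstract above names the Pólya urn as the best known reinforced stochastic process and works with weighted empirical means $N_{n,j}=\sum_{k=1}^{n}q_{n,k}X_{k,j}$. The sketch below is only an illustration for a single agent with no interaction: it simulates a two-colour Pólya urn and computes a weighted mean of the draws. The weight choice $q_{n,k}\propto k^{\gamma}$ is a hypothetical assumption made here for illustration and is not taken from the paper, as are all function names.

```python
import random


def polya_urn(n_steps, red=1, black=1, seed=0):
    """Simulate a two-colour Polya urn.

    At each step a ball is drawn with probability proportional to the current
    counts and returned with one extra ball of the same colour. Returns the
    draws X_k (1 for red, 0 for black), the inclination Z_k before each draw,
    and the final inclination.
    """
    rng = random.Random(seed)
    draws, inclinations = [], []
    for _ in range(n_steps):
        z = red / (red + black)          # current inclination towards red
        x = 1 if rng.random() < z else 0
        draws.append(x)
        inclinations.append(z)
        red += x                         # reinforce the drawn colour
        black += 1 - x
    return draws, inclinations, red / (red + black)


def weighted_empirical_mean(draws, gamma=0.0):
    """Weighted mean sum_k q_{n,k} X_k with q_{n,k} proportional to k**gamma.

    gamma = 0 gives the plain empirical mean; gamma > 0 puts more weight on
    recent draws (a hypothetical weighting chosen for this sketch).
    """
    raw = [k ** gamma for k in range(1, len(draws) + 1)]
    total = sum(raw)
    return sum(w * x for w, x in zip(raw, draws)) / total


if __name__ == "__main__":
    draws, _, z_final = polya_urn(10_000, seed=1)
    print("final inclination:", z_final)
    print("empirical mean:   ", weighted_empirical_mean(draws))
    print("weighted mean:    ", weighted_empirical_mean(draws, gamma=1.0))
```

In a single urn the inclination and the empirical mean of the draws share the same almost sure limit, so on a long run the three printed numbers should be close to one another.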
f A unified principled framework for resampling based on pseudo-populations: Asymptotic theory By projecteuclid.org Published On :: Fri, 31 Jan 2020 04:06 EST Pier Luigi Conti, Daniela Marella, Fulvia Mecatti, Federico Andreis. Source: Bernoulli, Volume 26, Number 2, 1044--1069.Abstract: In this paper, a class of resampling techniques for finite populations under $\pi$ps sampling design is introduced. The basic idea on which they rest is a two-step procedure consisting in: (i) constructing a “pseudo-population” on the basis of sample data; (ii) drawing a sample from the predicted population according to an appropriate resampling design. From a logical point of view, this approach is essentially based on the plug-in principle by Efron, at the “sampling design level”. Theoretical justifications based on large sample theory are provided. New approaches to construct pseudo populations based on various forms of calibrations are proposed. Finally, a simulation study is performed. Full Article
f Degeneracy in sparse ERGMs with functions of degrees as sufficient statistics By projecteuclid.org Published On :: Fri, 31 Jan 2020 04:06 EST Sumit Mukherjee. Source: Bernoulli, Volume 26, Number 2, 1016--1043.Abstract: A sufficient criterion for “non-degeneracy” is given for Exponential Random Graph Models on sparse graphs with sufficient statistics which are functions of the degree sequence. This criterion explains why statistics such as alternating $k$-star are non-degenerate, whereas subgraph counts are degenerate. It is further shown that this criterion is “almost” tight. Existence of consistent estimates is then proved for non-degenerate Exponential Random Graph Models. Full Article
f Stable processes conditioned to hit an interval continuously from the outside By projecteuclid.org Published On :: Fri, 31 Jan 2020 04:06 EST Leif Döring, Philip Weissmann. Source: Bernoulli, Volume 26, Number 2, 980--1015.Abstract: Conditioning stable Lévy processes on zero probability events recently became a tractable subject since several explicit formulas emerged from a deep analysis using the Lamperti transformations for self-similar Markov processes. In this article, we derive new harmonic functions and use them to explain how to condition stable processes to hit continuously a compact interval from the outside. Full Article
f Distances and large deviations in the spatial preferential attachment model By projecteuclid.org Published On :: Fri, 31 Jan 2020 04:06 EST Christian Hirsch, Christian Mönch. Source: Bernoulli, Volume 26, Number 2, 927--947.Abstract: This paper considers two asymptotic properties of a spatial preferential-attachment model introduced by E. Jacob and P. Mörters (In Algorithms and Models for the Web Graph (2013) 14–25 Springer). First, in a regime of strong linear reinforcement, we show that typical distances are at most of doubly-logarithmic order. Second, we derive a large deviation principle for the empirical neighbourhood structure and express the rate function as solution to an entropy minimisation problem in the space of stationary marked point processes. Full Article
f Convergence of the age structure of general schemes of population processes By projecteuclid.org Published On :: Fri, 31 Jan 2020 04:06 EST Jie Yen Fan, Kais Hamza, Peter Jagers, Fima Klebaner. Source: Bernoulli, Volume 26, Number 2, 893--926.Abstract: We consider a family of general branching processes with reproduction parameters depending on the age of the individual as well as the population age structure and a parameter $K$, which may represent the carrying capacity. These processes are Markovian in the age structure. In a previous paper (Proc. Steklov Inst. Math. 282 (2013) 90–105), the Law of Large Numbers as $K\to\infty$ was derived. Here we prove the central limit theorem, namely the weak convergence of the fluctuation processes in an appropriate Skorokhod space. We also show that the limit is driven by a stochastic partial differential equation. Full Article
f Recurrence of multidimensional persistent random walks. Fourier and series criteria By projecteuclid.org Published On :: Fri, 31 Jan 2020 04:06 EST Peggy Cénac, Basile de Loynes, Yoann Offret, Arnaud Rousselle. Source: Bernoulli, Volume 26, Number 2, 858--892.Abstract: The recurrence and transience of persistent random walks built from variable length Markov chains are investigated. It turns out that these stochastic processes can be seen as Lévy walks for which the persistence times depend on some internal Markov chain: they admit Markov random walk skeletons. A recurrence versus transience dichotomy is highlighted. Assuming the positive recurrence of the driving chain, a sufficient Fourier criterion for the recurrence, close to the usual Chung–Fuchs one, is given and a series criterion is derived. The key tool is the Nagaev–Guivarc’h method. Finally, we focus on particular two-dimensional persistent random walks, including directionally reinforced random walks, for which necessary and sufficient Fourier and series criteria are obtained. Inspired by (Adv. Math. 208 (2007) 680–698), we produce a genuine counterexample to the conjecture of (Adv. Math. 117 (1996) 239–252). As for the one-dimensional case studied in (J. Theoret. Probab. 31 (2018) 232–243), it is easier for a persistent random walk than its skeleton to be recurrent. However, such examples are much more difficult to exhibit in the higher dimensional context. These results are based on a surprisingly novel – to our knowledge – upper bound for the Lévy concentration function associated with symmetric distributions. Full Article
f Robust estimation of mixing measures in finite mixture models By projecteuclid.org Published On :: Fri, 31 Jan 2020 04:06 EST Nhat Ho, XuanLong Nguyen, Ya’acov Ritov. Source: Bernoulli, Volume 26, Number 2, 828--857.Abstract: In finite mixture models, apart from the underlying mixing measure, the true kernel density function of each subpopulation in the data is, in many scenarios, unknown. Perhaps the most popular approach is to choose some kernel functions that we empirically believe our data are generated from and use these kernels to fit our models. Nevertheless, as long as the chosen kernel and the true kernel are different, statistical inference of the mixing measure under this setting will be highly unstable. To overcome this challenge, we propose flexible and efficient robust estimators of the mixing measure in these models, which are inspired by the idea of the minimum Hellinger distance estimator, model selection criteria, and the superefficiency phenomenon. We demonstrate that our estimators consistently recover the true number of components and achieve the optimal convergence rates of parameter estimation under both the well- and misspecified kernel settings for any fixed bandwidth. These desirable asymptotic properties are illustrated via careful simulation studies with both synthetic and real data. Full Article
f Stochastic differential equations with a fractionally filtered delay: A semimartingale model for long-range dependent processes By projecteuclid.org Published On :: Fri, 31 Jan 2020 04:06 EST Richard A. Davis, Mikkel Slot Nielsen, Victor Rohde. Source: Bernoulli, Volume 26, Number 2, 799--827.Abstract: In this paper, we introduce a model, the stochastic fractional delay differential equation (SFDDE), which is based on the linear stochastic delay differential equation and produces stationary processes with hyperbolically decaying autocovariance functions. The model departs from the usual way of incorporating this type of long-range dependence into a short-memory model as it is obtained by applying a fractional filter to the drift term rather than to the noise term. The advantages of this approach are that the corresponding long-range dependent solutions are semimartingales and the local behavior of the sample paths is unaffected by the degree of long memory. We prove existence and uniqueness of solutions to the SFDDEs and study their spectral densities and autocovariance functions. Moreover, we define a subclass of SFDDEs which we study in detail and relate to the well-known fractionally integrated CARMA processes. Finally, we consider the task of simulating from the defining SFDDEs. Full Article
f Convergence and concentration of empirical measures under Wasserstein distance in unbounded functional spaces By projecteuclid.org Published On :: Tue, 26 Nov 2019 04:00 EST Jing Lei. Source: Bernoulli, Volume 26, Number 1, 767--798.Abstract: We provide upper bounds of the expected Wasserstein distance between a probability measure and its empirical version, generalizing recent results for finite dimensional Euclidean spaces and bounded functional spaces. Such a generalization can cover Euclidean spaces with large dimensionality, with the optimal dependence on the dimensionality. Our method also covers the important case of Gaussian processes in separable Hilbert spaces, with rate-optimal upper bounds for functional data distributions whose coordinates decay geometrically or polynomially. Moreover, our bounds of the expected value can be combined with mean-concentration results to yield improved exponential tail probability bounds for the Wasserstein error of empirical measures under Bernstein-type or log Sobolev-type conditions. Full Article
f A Feynman–Kac result via Markov BSDEs with generalised drivers By projecteuclid.org Published On :: Tue, 26 Nov 2019 04:00 EST Elena Issoglio, Francesco Russo. Source: Bernoulli, Volume 26, Number 1, 728--766.Abstract: In this paper, we investigate BSDEs where the driver contains a distributional term (in the sense of generalised functions) and derive general Feynman–Kac formulae related to these BSDEs. We introduce an integral operator to give sense to the equation and then we show the existence of a strong solution employing results on a related PDE. Due to the irregularity of the driver, the $Y$-component of a couple $(Y,Z)$ solving the BSDE is not necessarily a semimartingale but a weak Dirichlet process. Full Article
f Robust modifications of U-statistics and applications to covariance estimation problems By projecteuclid.org Published On :: Tue, 26 Nov 2019 04:00 EST Stanislav Minsker, Xiaohan Wei. Source: Bernoulli, Volume 26, Number 1, 694--727.Abstract: Let $Y$ be a $d$-dimensional random vector with unknown mean $\mu$ and covariance matrix $\Sigma$. This paper is motivated by the problem of designing an estimator of $\Sigma$ that admits exponential deviation bounds in the operator norm under minimal assumptions on the underlying distribution, such as existence of only 4th moments of the coordinates of $Y$. To address this problem, we propose robust modifications of the operator-valued U-statistics, obtain non-asymptotic guarantees for their performance, and demonstrate the implications of these results to the covariance estimation problem under various structural assumptions. Full Article