do

Tracy–Widom limit for Kendall’s tau

Zhigang Bao.

Source: The Annals of Statistics, Volume 47, Number 6, 3504--3532.

Abstract:
In this paper, we study a high-dimensional random matrix model from nonparametric statistics called the Kendall rank correlation matrix, which is a natural multivariate extension of the Kendall rank correlation coefficient. We establish the Tracy–Widom law for its largest eigenvalue. It is the first Tracy–Widom law for a nonparametric random matrix model, and also the first Tracy–Widom law for a high-dimensional U-statistic.
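
As a rough illustration of the object studied here, the Kendall rank correlation matrix can be computed entrywise from pairwise sign agreements. The sketch below is not taken from the paper; the function names and the toy data are hypothetical, and the observations are assumed to contain no ties.

```python
from itertools import combinations


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def kendall_tau_matrix(samples):
    """Kendall rank correlation matrix of n observations in p dimensions.

    samples: list of n rows, each a list of p numbers (assumed tie-free).
    Entry (k, l) averages sign(x_ik - x_jk) * sign(x_il - x_jl) over all pairs i < j.
    """
    n = len(samples)
    p = len(samples[0])
    n_pairs = n * (n - 1) // 2
    tau = [[0.0] * p for _ in range(p)]
    for xi, xj in combinations(samples, 2):
        signs = [sign(a - b) for a, b in zip(xi, xj)]
        for k in range(p):
            for l in range(p):
                tau[k][l] += signs[k] * signs[l]
    return [[v / n_pairs for v in row] for row in tau]


# Toy data (hypothetical): column 1 moves with column 0, column 2 moves against it.
data = [[1, 10, 5], [2, 20, 4], [3, 30, 3], [4, 40, 2]]
K = kendall_tau_matrix(data)
```

Each entry is a U-statistic of order two, which is why the matrix counts as a high-dimensional U-statistic; the largest eigenvalue of `K` is the quantity whose fluctuations the abstract describes.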




do

Randomized incomplete $U$-statistics in high dimensions

Xiaohui Chen, Kengo Kato.

Source: The Annals of Statistics, Volume 47, Number 6, 3127--3156.

Abstract:
This paper studies inference for the mean vector of a high-dimensional $U$-statistic. In the era of big data, the dimension $d$ of the $U$-statistic and the sample size $n$ of the observations both tend to be large, and the computation of the $U$-statistic is prohibitively demanding. Data-dependent inferential procedures such as the empirical bootstrap for $U$-statistics are even more computationally expensive. To overcome such a computational bottleneck, incomplete $U$-statistics obtained by sampling fewer terms of the $U$-statistic are attractive alternatives. In this paper, we introduce randomized incomplete $U$-statistics with sparse weights whose computational cost can be made independent of the order of the $U$-statistic. We derive nonasymptotic Gaussian approximation error bounds for the randomized incomplete $U$-statistics in high dimensions, namely in cases where the dimension $d$ is possibly much larger than the sample size $n$, for both nondegenerate and degenerate kernels. In addition, we propose generic bootstrap methods for the incomplete $U$-statistics that are computationally much less demanding than existing bootstrap methods, and establish finite sample validity of the proposed bootstrap methods. Our methods are illustrated on the application to nonparametric testing for the pairwise independence of a high-dimensional random vector under weaker assumptions than those appearing in the literature.
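
The idea of sampling fewer terms can be sketched for a scalar kernel of order two with Bernoulli sampling of pairs. This is a minimal illustration under assumptions, not the paper's procedure: the function names, the `budget` parameter, and the choice to average over the pairs actually kept are all hypothetical, and the loop still enumerates every pair for simplicity (only the kernel evaluations are saved).

```python
import random
from itertools import combinations


def complete_u_statistic(data, kernel):
    """Average a symmetric order-2 kernel over all pairs i < j."""
    values = [kernel(x, y) for x, y in combinations(data, 2)]
    return sum(values) / len(values)


def incomplete_u_statistic(data, kernel, budget, rng):
    """Bernoulli-sampled incomplete U-statistic of order 2.

    Each pair is kept independently with probability budget / (number of pairs),
    so roughly `budget` kernel evaluations are made. The average is taken over
    the pairs actually kept (a normalization assumed here for illustration).
    """
    n = len(data)
    n_pairs = n * (n - 1) // 2
    keep_prob = min(1.0, budget / n_pairs)
    total, kept = 0.0, 0
    for x, y in combinations(data, 2):
        if rng.random() < keep_prob:
            total += kernel(x, y)
            kept += 1
    if kept == 0:
        raise ValueError("no pairs were sampled; increase the budget")
    return total / kept


def variance_kernel(x, y):
    """Kernel whose complete U-statistic is the unbiased sample variance."""
    return 0.5 * (x - y) ** 2


rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(200)]
full = complete_u_statistic(data, variance_kernel)
approx = incomplete_u_statistic(data, variance_kernel, budget=2000, rng=rng)
```

Here `approx` uses about 2000 of the 19900 available pairs. When the budget equals the number of pairs, every pair is kept and the two statistics coincide.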




do

Active ranking from pairwise comparisons and when parametric assumptions do not help

Reinhard Heckel, Nihar B. Shah, Kannan Ramchandran, Martin J. Wainwright.

Source: The Annals of Statistics, Volume 47, Number 6, 3099--3126.

Abstract:
We consider sequential or active ranking of a set of $n$ items based on noisy pairwise comparisons. Items are ranked according to the probability that a given item beats a randomly chosen item, and ranking refers to partitioning the items into sets of prespecified sizes according to their scores. This notion of ranking includes as special cases the identification of the top-$k$ items and the total ordering of the items. We first analyze a sequential ranking algorithm that counts the number of comparisons won, and uses these counts to decide whether to stop, or to compare another pair of items, chosen based on confidence intervals specified by the data collected up to that point. We prove that this algorithm succeeds in recovering the ranking using a number of comparisons that is optimal up to logarithmic factors. This guarantee does not depend on whether or not the underlying pairwise probability matrix satisfies a particular structural property, unlike a significant body of past work on pairwise ranking based on parametric models such as the Thurstone or Bradley–Terry–Luce models. It has been a long-standing open question as to whether or not imposing these parametric assumptions allows for improved ranking algorithms. For stochastic comparison models, in which the pairwise probabilities are bounded away from zero, our second contribution is to resolve this issue by proving a lower bound for parametric models. This shows, perhaps surprisingly, that these popular parametric modeling choices offer at most logarithmic gains for stochastic comparisons.
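
The scoring and partitioning notion described above can be sketched for a known pairwise probability matrix. This only illustrates the ranking target; the active algorithm in the paper estimates the scores adaptively from noisy comparisons, which is not attempted here. The matrix `M` and the function names are hypothetical.

```python
def borda_scores(M):
    """Score of item i: probability that i beats an item drawn uniformly from the others.

    M[i][j] is the probability that item i beats item j (diagonal entries are ignored).
    """
    n = len(M)
    return [sum(M[i][j] for j in range(n) if j != i) / (n - 1) for i in range(n)]


def partition_by_score(scores, sizes):
    """Split item indices into consecutive groups of the given sizes, best scores first."""
    if sum(sizes) != len(scores):
        raise ValueError("group sizes must add up to the number of items")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    groups, start = [], 0
    for size in sizes:
        groups.append(set(order[start:start + size]))
        start += size
    return groups


# Hypothetical comparison probabilities for four items, with M[i][j] + M[j][i] = 1.
M = [
    [0.5, 0.6, 0.9, 0.8],
    [0.4, 0.5, 0.7, 0.6],
    [0.1, 0.3, 0.5, 0.4],
    [0.2, 0.4, 0.6, 0.5],
]
scores = borda_scores(M)
top_two, rest = partition_by_score(scores, [2, 2])
```

Passing sizes `[k, n - k]` gives top-$k$ identification, and passing `n` groups of size one gives the total ordering, the two special cases mentioned in the abstract.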




do

Eigenvalue distributions of variance components estimators in high-dimensional random effects models

Zhou Fan, Iain M. Johnstone.

Source: The Annals of Statistics, Volume 47, Number 5, 2855--2886.

Abstract:
We study the spectra of MANOVA estimators for variance component covariance matrices in multivariate random effects models. When the dimensionality of the observations is large and comparable to the number of realizations of each random effect, we show that the empirical spectra of such estimators are well approximated by deterministic laws. The Stieltjes transforms of these laws are characterized by systems of fixed-point equations, which are numerically solvable by a simple iterative procedure. Our proof uses operator-valued free probability theory, and we establish a general asymptotic freeness result for families of rectangular orthogonally invariant random matrices, which is of independent interest. Our work is motivated in part by the estimation of components of covariance between multiple phenotypic traits in quantitative genetics, and we specialize our results to common experimental designs that arise in this application.




do

Distance multivariance: New dependence measures for random vectors

Björn Böttcher, Martin Keller-Ressel, René L. Schilling.

Source: The Annals of Statistics, Volume 47, Number 5, 2757--2789.

Abstract:
We introduce two new measures for the dependence of $n\ge 2$ random variables: distance multivariance and total distance multivariance. Both measures are based on the weighted $L^{2}$-distance of quantities related to the characteristic functions of the underlying random variables. These extend distance covariance (introduced by Székely, Rizzo and Bakirov) from pairs of random variables to $n$-tuples of random variables. We show that total distance multivariance can be used to detect the independence of $n$ random variables and has a simple finite-sample representation in terms of distance matrices of the sample points, where distance is measured by a continuous negative definite function. Under some mild moment conditions, this leads to a test for independence of multiple random vectors which is consistent against all alternatives.




do

Phase transition in the spiked random tensor with Rademacher prior

Wei-Kuo Chen.

Source: The Annals of Statistics, Volume 47, Number 5, 2734--2756.

Abstract:
We consider the problem of detecting a deformation from a symmetric Gaussian random $p$-tensor $(p\geq 3)$ with a rank-one spike sampled from the Rademacher prior. Recently, in Lesieur et al. (Barbier, Krzakala, Macris, Miolane and Zdeborová (2017)), it was proved that there exists a critical threshold $\beta_{p}$ so that when the signal-to-noise ratio exceeds $\beta_{p}$, one can distinguish the spiked and unspiked tensors and weakly recover the prior via the minimal mean-square-error method. On the other hand, Perry, Wein and Bandeira (Perry, Wein and Bandeira (2017)) proved that there exists a $\beta_{p}'<\beta_{p}$ such that no statistical hypothesis test can distinguish these two tensors, in the sense that their total variation distance asymptotically vanishes, when the signal-to-noise ratio is less than $\beta_{p}'$. In this work, we show that $\beta_{p}$ is indeed the critical threshold that strictly separates the distinguishability and indistinguishability between the two tensors under the total variation distance. Our approach is based on a subtle analysis of the high temperature behavior of the pure $p$-spin model with Ising spin, arising initially from the field of spin glasses. In particular, we identify the signal-to-noise criticality $\beta_{p}$ as the critical temperature, distinguishing the high and low temperature behavior, of the Ising pure $p$-spin mean-field spin glass model.




do

Doubly penalized estimation in additive regression with high-dimensional data

Zhiqiang Tan, Cun-Hui Zhang.

Source: The Annals of Statistics, Volume 47, Number 5, 2567--2600.

Abstract:
Additive regression provides an extension of linear regression by modeling the signal of a response as a sum of functions of covariates of relatively low complexity. We study penalized estimation in high-dimensional nonparametric additive regression where functional semi-norms are used to induce smoothness of component functions and the empirical $L_{2}$ norm is used to induce sparsity. The functional semi-norms can be of Sobolev or bounded variation types and are allowed to be different amongst individual component functions. We establish oracle inequalities for the predictive performance of such methods under three simple technical conditions: a sub-Gaussian condition on the noise, a compatibility condition on the design and the functional classes under consideration and an entropy condition on the functional classes. For random designs, the sample compatibility condition can be replaced by its population version under an additional condition to ensure suitable convergence of empirical norms. In homogeneous settings where the complexities of the component functions are of the same order, our results provide a spectrum of minimax convergence rates, from the so-called slow rate without requiring the compatibility condition to the fast rate under the hard sparsity or certain $L_{q}$ sparsity to allow many small components in the true regression function. These results significantly broaden and sharpen existing ones in the literature.




do

A latent discrete Markov random field approach to identifying and classifying historical forest communities based on spatial multivariate tree species counts

Stephen Berg, Jun Zhu, Murray K. Clayton, Monika E. Shea, David J. Mladenoff.

Source: The Annals of Applied Statistics, Volume 13, Number 4, 2312--2340.

Abstract:
The Wisconsin Public Land Survey database describes historical forest composition at high spatial resolution and is of interest in ecological studies of forest composition in Wisconsin just prior to significant Euro-American settlement. For such studies it is useful to identify recurring subpopulations of tree species known as communities, but standard clustering approaches for subpopulation identification do not account for dependence between spatially nearby observations. Here, we develop and fit a latent discrete Markov random field model for the purpose of identifying and classifying historical forest communities based on spatially referenced multivariate tree species counts across Wisconsin. We show empirically for the actual dataset and through simulation that our latent Markov random field modeling approach improves prediction and parameter estimation performance. For model fitting we introduce a new stochastic approximation algorithm which enables computationally efficient estimation and classification of large amounts of spatial multivariate count data.




do

Oblique random survival forests

Byron C. Jaeger, D. Leann Long, Dustin M. Long, Mario Sims, Jeff M. Szychowski, Yuan-I Min, Leslie A. McClure, George Howard, Noah Simon.

Source: The Annals of Applied Statistics, Volume 13, Number 3, 1847--1883.

Abstract:
We introduce and evaluate the oblique random survival forest (ORSF). The ORSF is an ensemble method for right-censored survival data that uses linear combinations of input variables to recursively partition a set of training data. Regularized Cox proportional hazard models are used to identify linear combinations of input variables in each recursive partitioning step. Benchmark results using simulated and real data indicate that the ORSF’s predicted risk function has high prognostic value in comparison to random survival forests, conditional inference forests, regression and boosting. In an application to data from the Jackson Heart Study, we demonstrate variable and partial dependence using the ORSF and highlight characteristics of its ten-year predicted risk function for atherosclerotic cardiovascular disease events (ASCVD; stroke, coronary heart disease). We present visualizations comparing variable and partial effect estimation according to the ORSF, the conditional inference forest, and the Pooled Cohort Risk equations. The obliqueRSF R package, which provides functions to fit the ORSF and create variable and partial dependence plots, is available on the Comprehensive R Archive Network (CRAN).




do

Incorporating conditional dependence in latent class models for probabilistic record linkage: Does it matter?

Huiping Xu, Xiaochun Li, Changyu Shen, Siu L. Hui, Shaun Grannis.

Source: The Annals of Applied Statistics, Volume 13, Number 3, 1753--1790.

Abstract:
The conditional independence assumption of the Fellegi and Sunter (FS) model in probabilistic record linkage is often violated when matching real-world data. Ignoring conditional dependence has been shown to seriously bias parameter estimates. However, in record linkage, the ultimate goal is to inform the match status of record pairs and therefore, record linkage algorithms should be evaluated in terms of matching accuracy. In the literature, more flexible models have been proposed to relax the conditional independence assumption, but few studies have assessed whether such accommodations improve matching accuracy. In this paper, we show that incorporating the conditional dependence appropriately yields comparable or improved matching accuracy relative to the FS model using three real-world data linkage examples. Through a simulation study, we further investigate when conditional dependence models provide improved matching accuracy. Our study shows that the FS model is generally robust to violations of the conditional independence assumption and provides matching accuracy comparable to that of the more complex conditional dependence models. However, when the match prevalence approaches 0% or 100% and conditional dependence exists in the dominating class, it is necessary to address conditional dependence as the FS model produces suboptimal matching accuracy. The need to address conditional dependence becomes less important when highly discriminating fields are used. Our simulation study also shows that conditional dependence models with misspecified dependence structure could produce less accurate record matching than the FS model and therefore we caution against the blind use of conditional dependence models.
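
For orientation, the conditional independence assumption is what lets the FS model score a record pair by adding per-field log-likelihood ratios. The sketch below shows that standard weight calculation only; the field names and the m- and u-probabilities are hypothetical values chosen for illustration, not estimates from the paper.

```python
import math


def fs_match_weight(agreements, m_probs, u_probs):
    """Log2 likelihood-ratio weight for one record pair under conditional independence.

    agreements[i]: True if field i agrees for the pair.
    m_probs[i]:    P(field i agrees | pair is a match).
    u_probs[i]:    P(field i agrees | pair is a non-match).
    """
    weight = 0.0
    for agree, m, u in zip(agreements, m_probs, u_probs):
        if agree:
            weight += math.log2(m / u)
        else:
            weight += math.log2((1.0 - m) / (1.0 - u))
    return weight


# Hypothetical parameters for three fields (say name, birth date, postcode).
m_probs = [0.95, 0.90, 0.85]
u_probs = [0.01, 0.05, 0.10]
w_all_agree = fs_match_weight([True, True, True], m_probs, u_probs)
w_all_differ = fs_match_weight([False, False, False], m_probs, u_probs)
```

Because the weight is a plain sum over fields, any dependence between fields within the match or non-match class is ignored, which is the modelling choice the abstract evaluates.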




do

RCRnorm: An integrated system of random-coefficient hierarchical regression models for normalizing NanoString nCounter data

Gaoxiang Jia, Xinlei Wang, Qiwei Li, Wei Lu, Ximing Tang, Ignacio Wistuba, Yang Xie.

Source: The Annals of Applied Statistics, Volume 13, Number 3, 1617--1647.

Abstract:
Formalin-fixed paraffin-embedded (FFPE) samples have great potential for biomarker discovery, retrospective studies and diagnosis or prognosis of diseases. Their application, however, is hindered by the unsatisfactory performance of traditional gene expression profiling techniques on damaged RNAs. The NanoString nCounter platform is well suited for profiling of FFPE samples and measures gene expression with high sensitivity, which may greatly facilitate realization of the scientific and clinical value of FFPE samples. However, methodological development for normalization, a critical step when analyzing this type of data, is far behind. Existing methods designed for the platform use information from different types of internal controls separately and rely on an overly simplified assumption that expression of housekeeping genes is constant across samples for global scaling. Thus, these methods are not optimized for the nCounter system, not to mention that they were not developed for FFPE samples. We construct an integrated system of random-coefficient hierarchical regression models to capture main patterns and characteristics observed from NanoString data of FFPE samples and develop a Bayesian approach to estimate parameters and normalize gene expression across samples. Our method, labeled RCRnorm, incorporates information from all aspects of the experimental design and simultaneously removes biases from various sources. It eliminates the unrealistic assumption on housekeeping genes and offers great interpretability. Furthermore, it is applicable to freshly frozen or similar samples that can be generally viewed as a reduced case of FFPE samples. Simulation and applications showed the superior performance of RCRnorm.




do

Network modelling of topological domains using Hi-C data

Y. X. Rachel Wang, Purnamrita Sarkar, Oana Ursu, Anshul Kundaje, Peter J. Bickel.

Source: The Annals of Applied Statistics, Volume 13, Number 3, 1511--1536.

Abstract:
Chromosome conformation capture experiments such as Hi-C are used to map the three-dimensional spatial organization of genomes. One specific feature of the 3D organization is known as topologically associating domains (TADs), which are densely interacting, contiguous chromatin regions playing important roles in regulating gene expression. A few algorithms have been proposed to detect TADs. In particular, the structure of Hi-C data naturally inspires application of community detection methods. However, one of the drawbacks of community detection is that most methods take exchangeability of the nodes in the network for granted; whereas the nodes in this case, that is, the positions on the chromosomes, are not exchangeable. We propose a network model for detecting TADs using Hi-C data that takes into account this nonexchangeability. In addition, our model explicitly makes use of cell-type specific CTCF binding sites as biological covariates and can be used to identify conserved TADs across multiple cell types. The model leads to a likelihood objective that can be efficiently optimized via relaxation. We also prove that when suitably initialized, this model finds the underlying TAD structure with high probability. Using simulated data, we show the advantages of our method and the caveats of popular community detection methods, such as spectral clustering, in this application. Applying our method to real Hi-C data, we demonstrate the domains identified have desirable epigenetic features and compare them across different cell types.




do

Local law and Tracy–Widom limit for sparse stochastic block models

Jong Yun Hwang, Ji Oon Lee, Wooseok Yang.

Source: Bernoulli, Volume 26, Number 3, 2400--2435.

Abstract:
We consider the spectral properties of sparse stochastic block models, where $N$ vertices are partitioned into $K$ balanced communities. Under an assumption that the intra-community probability and inter-community probability are of similar order, we prove a local semicircle law up to the spectral edges, with an explicit formula on the deterministic shift of the spectral edge. We also prove that the fluctuation of the extremal eigenvalues is given by the GOE Tracy–Widom law after rescaling and centering the entries of sparse stochastic block models. Applying the result to sparse stochastic block models, we rigorously prove that there is a large gap between the outliers and the spectral edge without centering.




do

Frequency domain theory for functional time series: Variance decomposition and an invariance principle

Piotr Kokoszka, Neda Mohammadi Jouzdani.

Source: Bernoulli, Volume 26, Number 3, 2383--2399.

Abstract:
This paper is concerned with frequency domain theory for functional time series, which are temporally dependent sequences of functions in a Hilbert space. We consider a variance decomposition, which is more suitable for such a data structure than the variance decomposition based on the Karhunen–Loève expansion. The decomposition we study uses eigenvalues of spectral density operators, which are functional analogs of the spectral density of a stationary scalar time series. We propose estimators of the variance components and derive convergence rates for their mean square error as well as their asymptotic normality. The latter is derived from a frequency domain invariance principle for the estimators of the spectral density operators. This principle is established for a broad class of linear time series models. It is a main contribution of the paper.




do

Concentration of the spectral norm of Erdős–Rényi random graphs

Gábor Lugosi, Shahar Mendelson, Nikita Zhivotovskiy.

Source: Bernoulli, Volume 26, Number 3, 2253--2274.

Abstract:
We present results on the concentration properties of the spectral norm $\|A_{p}\|$ of the adjacency matrix $A_{p}$ of an Erdős–Rényi random graph $G(n,p)$. First, we consider the Erdős–Rényi random graph process and prove that $\|A_{p}\|$ is uniformly concentrated over the range $p\in[C\log n/n,1]$. The analysis is based on delocalization arguments, uniform laws of large numbers, together with the entropy method to prove concentration inequalities. As an application of our techniques, we prove sharp sub-Gaussian moment inequalities for $\|A_{p}\|$ for all $p\in[c\log^{3}n/n,1]$ that improve the general bounds of Alon, Krivelevich, and Vu (Israel J. Math. 131 (2002) 259–267) and some of the more recent results of Erdős et al. (Ann. Probab. 41 (2013) 2279–2375). Both results are consistent with the asymptotic result of Füredi and Komlós (Combinatorica 1 (1981) 233–241) that holds for fixed $p$ as $n\to\infty$.
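
To make the quantity concrete, the spectral norm of a sampled adjacency matrix can be estimated numerically. The sketch below draws one $G(n,p)$ graph and runs power iteration; it is a numerical illustration only and has no connection to the proof techniques in the paper. The function names, the seed, and the values of `n` and `p` are assumptions made for the example.

```python
import math
import random


def erdos_renyi_adjacency(n, p, rng):
    """Symmetric 0/1 adjacency matrix of G(n, p) with no self-loops."""
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                A[i][j] = A[j][i] = 1
    return A


def spectral_norm(A, iterations=100):
    """Estimate the largest absolute eigenvalue of a symmetric matrix by power iteration."""
    n = len(A)
    v = [1.0 / math.sqrt(n)] * n
    norm = 0.0
    for _ in range(iterations):
        w = [sum(a * x for a, x in zip(row, v)) for row in A]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0  # the graph has no edges
        v = [x / norm for x in w]
    return norm


rng = random.Random(1)
n, p = 100, 0.3
A = erdos_renyi_adjacency(n, p, rng)
estimate = spectral_norm(A)
reference = n * p  # leading-order size of the norm for fixed p
```

Repeating the draw with different seeds and comparing `estimate` with `reference` gives an informal picture of the concentration that the abstract quantifies.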




do

Random orthogonal matrices and the Cayley transform

Michael Jauch, Peter D. Hoff, David B. Dunson.

Source: Bernoulli, Volume 26, Number 2, 1560--1586.

Abstract:
Random orthogonal matrices play an important role in probability and statistics, arising in multivariate analysis, directional statistics, and models of physical systems, among other areas. Calculations involving random orthogonal matrices are complicated by their constrained support. Accordingly, we parametrize the Stiefel and Grassmann manifolds, represented as subsets of orthogonal matrices, in terms of Euclidean parameters using the Cayley transform. We derive the necessary Jacobian terms for change of variables formulas. Given a density defined on the Stiefel or Grassmann manifold, these allow us to specify the corresponding density for the Euclidean parameters, and vice versa. As an application, we present a Markov chain Monte Carlo approach to simulating from distributions on the Stiefel and Grassmann manifolds. Finally, we establish that the Euclidean parameters corresponding to a uniform orthogonal matrix can be approximated asymptotically by independent normals. This result contributes to the growing literature on normal approximations to the entries of random orthogonal matrices or transformations thereof.
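
The basic mechanism can be sketched in the square case, where the Cayley transform sends a skew-symmetric matrix of unconstrained entries to an orthogonal matrix. The sign convention $(I+X)(I-X)^{-1}$ used below is an assumption (the opposite convention also appears in the literature), and the helper functions are hypothetical pure-Python stand-ins for a linear algebra library.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def mat_inv(A):
    """Inverse by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def cayley(X):
    """Map a skew-symmetric matrix X to the orthogonal matrix (I + X)(I - X)^{-1}."""
    n = len(X)
    plus = [[(i == j) + X[i][j] for j in range(n)] for i in range(n)]
    minus = [[(i == j) - X[i][j] for j in range(n)] for i in range(n)]
    return mat_mul(plus, mat_inv(minus))


# Three free Euclidean parameters (-0.5, 0.2, -1.0) fill the upper triangle of X.
X = [[0.0, -0.5, 0.2],
     [0.5, 0.0, -1.0],
     [-0.2, 1.0, 0.0]]
Q = cayley(X)
QtQ = mat_mul([list(col) for col in zip(*Q)], Q)  # should be close to the identity
```

The matrix $I-X$ is always invertible for real skew-symmetric $X$, so the map is defined for every choice of the free parameters; this is what makes the parametrization convenient for change-of-variables arguments.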




do

The moduli of non-differentiability for Gaussian random fields with stationary increments

Wensheng Wang, Zhonggen Su, Yimin Xiao.

Source: Bernoulli, Volume 26, Number 2, 1410--1430.

Abstract:
We establish the exact moduli of non-differentiability of Gaussian random fields with stationary increments. As an application of the result, we prove that the uniform Hölder condition for the maximum local times of Gaussian random fields with stationary increments obtained in Xiao (1997) is optimal. These results are applicable to fractional Riesz–Bessel processes and stationary Gaussian random fields in the Matérn and Cauchy classes.




do

Rates of convergence in de Finetti’s representation theorem, and Hausdorff moment problem

Emanuele Dolera, Stefano Favaro.

Source: Bernoulli, Volume 26, Number 2, 1294--1322.

Abstract:
Given a sequence $\{X_{n}\}_{n\geq 1}$ of exchangeable Bernoulli random variables, the celebrated de Finetti representation theorem states that $\frac{1}{n}\sum_{i=1}^{n}X_{i}\stackrel{a.s.}{\longrightarrow}Y$ for a suitable random variable $Y:\Omega\rightarrow[0,1]$ satisfying $\mathsf{P}[X_{1}=x_{1},\dots,X_{n}=x_{n}|Y]=Y^{\sum_{i=1}^{n}x_{i}}(1-Y)^{n-\sum_{i=1}^{n}x_{i}}$. In this paper, we study the rate of convergence in law of $\frac{1}{n}\sum_{i=1}^{n}X_{i}$ to $Y$ under the Kolmogorov distance. After showing that a rate of the type of $1/n^{\alpha}$ can be obtained for any index $\alpha\in(0,1]$, we find a sufficient condition on the distribution of $Y$ for the achievement of the optimal rate of convergence, that is $1/n$. Besides extending and strengthening recent results under the weaker Wasserstein distance, our main result weakens the regularity hypotheses on $Y$ in the context of the Hausdorff moment problem.
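
A simple special case shows what the optimal $1/n$ rate looks like. Assume, purely for illustration, that the mixing variable $Y$ is uniform on $[0,1]$; then the number of successes among the first $n$ variables is uniform on $\{0,\dots,n\}$ and the Kolmogorov distance can be computed exactly. The function names below are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def mixed_binomial_pmf_uniform(n):
    """Exact law of S_n = X_1 + ... + X_n when Y is uniform on [0, 1].

    P(S_n = k) = C(n, k) * Integral y^k (1 - y)^(n - k) dy
               = C(n, k) * k! (n - k)! / (n + 1)!.
    """
    return [Fraction(comb(n, k) * factorial(k) * factorial(n - k), factorial(n + 1))
            for k in range(n + 1)]


def kolmogorov_distance_to_uniform(n):
    """sup_x |P(S_n / n <= x) - x| for x in [0, 1], computed exactly.

    The supremum is approached at a jump point k / n, either at the jump or just before it.
    """
    pmf = mixed_binomial_pmf_uniform(n)
    worst = Fraction(0)
    cdf_before = Fraction(0)
    for k in range(n + 1):
        x = Fraction(k, n)
        cdf_at = cdf_before + pmf[k]
        worst = max(worst, abs(cdf_at - x), abs(cdf_before - x))
        cdf_before = cdf_at
    return worst


distances = {n: kolmogorov_distance_to_uniform(n) for n in (1, 10, 100)}
```

In this uniform case the distance works out to exactly $1/(n+1)$, so it decays at the rate $1/n$ that the abstract calls optimal.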




do

Consistent structure estimation of exponential-family random graph models with block structure

Michael Schweinberger.

Source: Bernoulli, Volume 26, Number 2, 1205--1233.

Abstract:
We consider the challenging problem of statistical inference for exponential-family random graph models based on a single observation of a random graph with complex dependence. To facilitate statistical inference, we consider random graphs with additional structure in the form of block structure. We have shown elsewhere that when the block structure is known, it facilitates consistency results for $M$-estimators of canonical and curved exponential-family random graph models with complex dependence, such as transitivity. In practice, the block structure is known in some applications (e.g., multilevel networks), but is unknown in others. When the block structure is unknown, the first and foremost question is whether it can be recovered with high probability based on a single observation of a random graph with complex dependence. The main consistency results of the paper show that it is possible to do so under weak dependence and smoothness conditions. These results confirm that exponential-family random graph models with block structure constitute a promising direction of statistical network analysis.




do

A unified principled framework for resampling based on pseudo-populations: Asymptotic theory

Pier Luigi Conti, Daniela Marella, Fulvia Mecatti, Federico Andreis.

Source: Bernoulli, Volume 26, Number 2, 1044--1069.

Abstract:
In this paper, a class of resampling techniques for finite populations under $\pi$ps sampling design is introduced. The basic idea on which they rest is a two-step procedure consisting of: (i) constructing a “pseudo-population” on the basis of sample data; (ii) drawing a sample from the predicted population according to an appropriate resampling design. From a logical point of view, this approach is essentially based on the plug-in principle by Efron, at the “sampling design level”. Theoretical justifications based on large sample theory are provided. New approaches to construct pseudo-populations based on various forms of calibrations are proposed. Finally, a simulation study is performed.




do

Recurrence of multidimensional persistent random walks. Fourier and series criteria

Peggy Cénac, Basile de Loynes, Yoann Offret, Arnaud Rousselle.

Source: Bernoulli, Volume 26, Number 2, 858--892.

Abstract:
The recurrence and transience of persistent random walks built from variable length Markov chains are investigated. It turns out that these stochastic processes can be seen as Lévy walks for which the persistence times depend on some internal Markov chain: they admit Markov random walk skeletons. A recurrence versus transience dichotomy is highlighted. Assuming the positive recurrence of the driving chain, a sufficient Fourier criterion for the recurrence, close to the usual Chung–Fuchs one, is given and a series criterion is derived. The key tool is the Nagaev–Guivarc’h method. Finally, we focus on particular two-dimensional persistent random walks, including directionally reinforced random walks, for which necessary and sufficient Fourier and series criteria are obtained. Inspired by (Adv. Math. 208 (2007) 680–698), we produce a genuine counterexample to the conjecture of (Adv. Math. 117 (1996) 239–252). As for the one-dimensional case studied in (J. Theoret. Probab. 31 (2018) 232–243), it is easier for a persistent random walk than its skeleton to be recurrent. However, such examples are much more difficult to exhibit in the higher dimensional context. These results are based on a surprisingly novel – to our knowledge – upper bound for the Lévy concentration function associated with symmetric distributions.




do

Normal approximation for sums of weighted $U$-statistics – application to Kolmogorov bounds in random subgraph counting

Nicolas Privault, Grzegorz Serafin.

Source: Bernoulli, Volume 26, Number 1, 587--615.

Abstract:
We derive normal approximation bounds in the Kolmogorov distance for sums of discrete multiple integrals and weighted $U$-statistics made of independent Bernoulli random variables. Such bounds are applied to normal approximation for the renormalized subgraph counts in the Erdős–Rényi random graph. This approach completely solves a long-standing conjecture in the general setting of arbitrary graph counting, while recovering recent results obtained for triangles and improving other bounds in the Wasserstein distance.




do

Operator-scaling Gaussian random fields via aggregation

Yi Shen, Yizao Wang.

Source: Bernoulli, Volume 26, Number 1, 500--530.

Abstract:
We propose an aggregated random-field model, and investigate the scaling limits of the aggregated partial-sum random fields. In this model, each copy in the aggregation is a $\pm 1$-valued random field built from two correlated one-dimensional random walks, the law of each determined by a random persistence parameter. A flexible joint distribution of the two parameters is introduced, and given the parameters the two correlated random walks are conditionally independent. For the aggregated random field, when the persistence parameters are independent, the scaling limit is a fractional Brownian sheet. When the persistence parameters are tail-dependent, characterized in the framework of multivariate regular variation, the scaling limit is more delicate, and in particular depends on the growth rates of the underlying rectangular region along two directions: at different rates different operator-scaling Gaussian random fields appear as the region area tends to infinity. In particular, at the so-called critical speed, a large family of Gaussian random fields with long-range dependence arises in the limit. We also identify four different regimes at non-critical speed where fractional Brownian sheets arise in the limit.




do

Cliques in rank-1 random graphs: The role of inhomogeneity

Kay Bogerd, Rui M. Castro, Remco van der Hofstad.

Source: Bernoulli, Volume 26, Number 1, 253--285.

Abstract:
We study the asymptotic behavior of the clique number in rank-1 inhomogeneous random graphs, where edge probabilities between vertices are roughly proportional to the product of their vertex weights. We show that the clique number is concentrated on at most two consecutive integers, for which we provide an expression. Interestingly, the order of the clique number is primarily determined by the overall edge density, with the inhomogeneity only affecting multiplicative constants or adding at most a $\log\log(n)$ multiplicative factor. For sparse enough graphs the clique number is always bounded and the effect of inhomogeneity completely vanishes.




do

A new method for obtaining sharp compound Poisson approximation error estimates for sums of locally dependent random variables

Michael V. Boutsikas, Eutichia Vaggelatou

Source: Bernoulli, Volume 16, Number 2, 301--330.

Abstract:
Let $X_{1}, X_{2}, \ldots, X_{n}$ be a sequence of independent or locally dependent random variables taking values in $\mathbb{Z}_{+}$. In this paper, we derive sharp bounds, via a new probabilistic method, for the total variation distance between the distribution of the sum $\sum_{i=1}^{n}X_{i}$ and an appropriate Poisson or compound Poisson distribution. These bounds include a factor which depends on the smoothness of the approximating Poisson or compound Poisson distribution. This “smoothness factor” is of order $O(\sigma^{-2})$, according to a heuristic argument, where $\sigma^{2}$ denotes the variance of the approximating distribution. In this way, we offer sharp error estimates for a large range of values of the parameters. Finally, specific examples concerning appearances of rare runs in sequences of Bernoulli trials are presented by way of illustration.
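
The simplest instance of the setting above is a sum of independent, identically distributed Bernoulli variables approximated by a Poisson law with the same mean. The sketch below computes that total variation distance exactly and sets it beside the classical Le Cam bound (the sum of squared success probabilities). It does not implement the bounds derived in the paper; the function name and the parameter values are assumptions made for illustration.

```python
from math import comb, exp


def tv_binomial_poisson(n, p):
    """Exact total variation distance between Binomial(n, p) and Poisson(n * p)."""
    lam = n * p
    poisson_k = exp(-lam)          # Poisson pmf at k = 0
    poisson_mass = 0.0             # Poisson mass on {0, ..., n}
    abs_diff = 0.0
    for k in range(n + 1):
        binom_k = comb(n, k) * p ** k * (1.0 - p) ** (n - k)
        abs_diff += abs(binom_k - poisson_k)
        poisson_mass += poisson_k
        poisson_k *= lam / (k + 1)  # recurrence: pmf(k + 1) = pmf(k) * lam / (k + 1)
    # Beyond n the binomial has no mass, so the remaining Poisson tail counts in full.
    return 0.5 * (abs_diff + max(0.0, 1.0 - poisson_mass))


n, p = 100, 0.02
tv = tv_binomial_poisson(n, p)
le_cam_bound = n * p ** 2  # classical bound: sum of p_i^2
```

Comparing `tv` with `le_cam_bound` for several values of `n` and `p` shows how much room a sharper, variance-dependent factor of the kind described in the abstract has to improve on the classical bound.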




do

Gordon of Huntly : heraldic heritage : cadets to South Australia / Robin Gregory Gordon.

South Australia -- Genealogy.




do

Traegers in Australia. 3, Ernst's story : the story of Ernst Wilhelm Traeger and Johanne Dorothea nee Lissmann, and their descendants, 1856-2018.

Traeger, Ernst Wilhelm, 1805-1874.




do

From Wends we came : the story of Johann and Maria Huppatz & their descendants / compiled by Frank Huppatz and Rone McDonnell.

Huppatz (Family).




do

McGraw-Hill and Cengage Abandon Merger Plans

The two major companies cited what they considered onerous divestiture requirements from the U.S. Department of Justice as the reason behind their joint decision.

The post McGraw-Hill and Cengage Abandon Merger Plans appeared first on Market Brief.



  • Marketplace K-12
  • Business Strategy
  • COVID-19
  • Curriculum / Digital Curriculum
  • Mergers and Acquisitions
  • Online / Virtual Learning

do

Box 3: Children's book illustrations by various artists, Peg Maltby and Dorothy Wall, ca. 1932-1975




do

Box 4: Children's book illustrations by various artists, Dorothy Wall, ca. 1932




do

Box 6: Children's book illustrations by various artists, Dorothy Wall and Noela Young, ca. 1932-1964




do

Where do I start? Discover Your State Library Online

Whether you're looking for a new book to read, a binge-worthy podcast, inspiring stories, or a fun activity to do at home – you can get all of this and more online at your State Library.




do

Federal watchdog finds 'reasonable grounds to believe' vaccine doctor's ouster was retaliation, lawyers say

The Office of Special Counsel is recommending that ousted vaccine official Dr. Rick Bright be reinstated while it investigates his case, his lawyers announced Friday. Bright, who had been leading coronavirus vaccine development, was recently removed from his position as the director of the Department of Health and Human Services' Biomedical Advanced Research and Development Authority, and he alleges it was because he insisted congressional funding not go toward "drugs, vaccines, and other technologies that lack scientific merit" and limited the "broad use" of hydroxychloroquine after it was touted by President Trump. In a whistleblower complaint, he alleged "cronyism" at HHS. He has also alleged he was "pressured to ignore or dismiss expert scientific recommendations and instead to award lucrative contracts based on political connections." On Friday, Bright's lawyers said that the Office of Special Counsel has determined there are "reasonable grounds to believe" his firing was retaliation, The New York Times reports. The federal watchdog also recommended he be reinstated for 45 days to give the office "sufficient time to complete its investigation of Bright's allegations," CNN reports. The decision on whether to do so falls on Secretary of Health and Human Services Alex Azar, and Office of Special Counsel recommendations are "not binding," the Times notes.





do

Chaffetz: I don't understand why Adam Schiff continues to have a security clearance

Fox News contributor Jason Chaffetz and Andy McCarthy react to House Intelligence transcripts on Russia probe.





do

The McMichaels can't be charged with a hate crime by the state in the shooting death of Ahmaud Arbery because the law doesn't exist in Georgia

Georgia is one of four states that don't have a hate crime law. Arbery's killing has reignited calls for legislation.





do

Hierarchical Normalized Completely Random Measures for Robust Graphical Modeling

Andrea Cremaschi, Raffaele Argiento, Katherine Shoemaker, Christine Peterson, Marina Vannucci.

Source: Bayesian Analysis, Volume 14, Number 4, 1271--1301.

Abstract:
Gaussian graphical models are useful tools for exploring network structures in multivariate normal data. In this paper we are interested in situations where data show departures from Gaussianity, therefore requiring alternative modeling distributions. The multivariate $t$-distribution, obtained by dividing each component of the data vector by a gamma random variable, is a straightforward generalization to accommodate deviations from normality such as heavy tails. Since different groups of variables may be contaminated to a different extent, Finegold and Drton (2014) introduced the Dirichlet $t$-distribution, where the divisors are clustered using a Dirichlet process. In this work, we consider a more general class of nonparametric distributions as the prior on the divisor terms, namely the class of normalized completely random measures (NormCRMs). To improve the effectiveness of the clustering, we propose modeling the dependence among the divisors through a nonparametric hierarchical structure, which allows for the sharing of parameters across the samples in the data set. This desirable feature enables us to cluster together different components of multivariate data in a parsimonious way. We demonstrate through simulations that this approach provides accurate graphical model inference, and apply it to a case study examining the dependence structure in radiomics data derived from The Cancer Imaging Atlas.




do

Statistical Methodology in Single-Molecule Experiments

Chao Du, S. C. Kou.

Source: Statistical Science, Volume 35, Number 1, 75--91.

Abstract:
Toward the last quarter of the 20th century, the emergence of single-molecule experiments enabled scientists to track and study individual molecules’ dynamic properties in real time. Unlike macroscopic systems’ dynamics, those of single molecules can only be properly described by stochastic models even in the absence of external noise. Consequently, statistical methods have played a key role in extracting hidden information about molecular dynamics from data obtained through single-molecule experiments. In this article, we survey the major statistical methodologies used to analyze single-molecule experimental data. Our discussion is organized according to the types of stochastic models used to describe single-molecule systems as well as major experimental data collection techniques. We also highlight challenges and future directions in the application of statistical methodologies to single-molecule experiments.




do

Comment on “Automated Versus Do-It-Yourself Methods for Causal Inference: Lessons Learned from a Data Analysis Competition”

Susan Gruber, Mark J. van der Laan.

Source: Statistical Science, Volume 34, Number 1, 82--85.

Abstract:
Dorie and co-authors (DHSSC) are to be congratulated for initiating the ACIC Data Challenge. Their project engaged the community and accelerated research by providing a level playing field for comparing the performance of a priori specified algorithms. DHSSC identified themes concerning characteristics of the DGP, properties of the estimators, and inference. We discuss these themes in the context of targeted learning.




do

Smart women don't smoke / Biman Mullick.

London (33 Stillness Road, London SE23 1NG) : Cleanair, Campaign for a Smoke-free Environment, [1989?]




do

If you must smoke don't exhale / design : Biman Mullick.

London (33 Stillness Rd, London, SE23 1NG) : Cleanair, Campaign for a Smoke-free Environment, [198-?]




do

Tapadh leibh airson nach do smoc sibh / design : Biman Mullick.

London (33 Stillness Rd, SE23 1NG) : Cleanair, Campaign for a Smoke-free Environment, [198-?]




do

If you must smoke don't exhale / Biman Mullick.

London : Cleanair, [1988?]




do

Tapadh leibh airson nach do smoc sibh / design: Biman Mullick.

London (33 Stillness Road London SE23 1NG) : Cleanair Campaign for a Smoke-free Environment, [198-?]




do

The Joyful Reduction of Uncertainty: Music Perception as a Window to Predictive Neuronal Processing




do

Dopamine D1 and D2 Receptor Family Contributions to Modafinil-Induced Wakefulness

Jared W. Young
Mar 4, 2009; 29:2663-2665
Journal Club




do

Daily Marijuana Use Is Not Associated with Brain Morphometric Measures in Adolescents or Adults

Barbara J. Weiland
Jan 28, 2015; 35:1505-1512
Neurobiology of Disease




do

Significant Neuroanatomical Variation Among Domestic Dog Breeds

Erin E. Hecht
Sep 25, 2019; 39:7748-7758
Behavioral/Systems/Cognitive




do

Where Is the Anterior Temporal Lobe and What Does It Do?

Michael F. Bonner
Mar 6, 2013; 33:4213-4215
Journal Club