ndo

The disinfection of scarlet fever and other infectious diseases by antiseptic inunction / by J. Brendon Curgenven.

London : H.K. Lewis, 1891.




ndo

Dissertationes medicæ in Universitate Vindobonensi habitæ ad morbos chronicos pertinentes / Max. Stollii ; edidit et præfatus est Josephus Eyerel.

Viennæ : Typis Christiani Friderici Wappler, 1788-92.




ndo

The distribution and duration of visceral new growths : being the Bradshawe Lecture delivered before the Royal College of Physicians of London on August 19, 1889 / by Norman Moore.

Edinburgh : Young J. Pentland, 1889.




ndo

Domestic sanitary drainage and plumbing : lectures on practical sanitation delivered to plumbers, engineers, and others in the Central Technical Institution, South Kensington, London ... / by William R. Maguire.

London : Kegan Paul, Trench, Trubner, 1890.




ndo

Dr. Webster's remarks on the health of London : read before the Westminster Medical Society, April 13, 1850.

[London?] : [publisher not identified], [1850]




ndo

Electrical and anatomical demonstrations : delivered at the School of Massage and Electricity, in connection with the West-End Hospital for Diseases of the Nervous System, Paralysis and Epilepsy, Welbeck Street, London. A handbook for trained nurses and m

London : J. & A. Churchill, 1887.




ndo

Endoskopie und endoskopische Therapie der Krankheiten der Harnrohre und Blase / von Emil Burckhardt.

Tubingen : H. Laupp, 1889.




ndo

Paul brings herbs to refresh Virginie after she has performed a long walk barefoot. Stipple engraving by J.P. Simon after C.P. Landon.

A Paris (rue St Denis No. 214) : chez Bance aîné, [1810?]




ndo

Orlando rescuing Olympia from a sea-monster. Etching by F. Bartolozzi, 1763, after Lodovico Carracci.

London (in Cheapside) : Published according to Act of Parliament, by J Boydell engraver, 1763.




ndo

The Raden Temenggung and regent of Lebak, Java, Indonesia. Coloured lithograph by P. Lauters after C.W.M. van der Velde, ca. 1843.

Amsterdam : Uitgegeven by Frans Buffa en Zonen, [between 1843 and 1845]




ndo

The Radja Djajanagara and regent of Serang, Java, Indonesia. Coloured lithograph by P. Lauters after C.W.M. van der Velde, ca. 1843.

Amsterdam : Uitgegeven by Frans Buffa en Zonen, [between 1843 and 1845]




ndo

The dynastic marriage of William of Orange and Mary Stuart: above, they are brought together before a bust of Hercules; below, their wedding in London on 4 November 1677. Etching by R. de Hooghe, 1678.

[The Netherlands] : [Romeyn de Hooghe?], [1678?]




ndo

PHT remembers video games: Hockey on the Nintendo 64

Only so money hockey options on that odd beast of a machine.




ndo

Biman Mullick and Roy Castle at the Royal College of Physicians, London 5 January 1993.

[London?], [1993?]




ndo

Administrative scheme for the County of London made by the London County Council on 18th December, 1934, for discharging the functions transferred to the Council by Part I of the Local Government Act, 1929, and orders made bu the Minister of Health under

England : London County Council, Public Assistance Department, 1935.




ndo

Random distributions via Sequential Quantile Array

Annalisa Fabretti, Samantha Leorato.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 1611--1647.

Abstract:
We propose a method to generate random distributions with known quantile distribution, or, more generally, with known distribution for some form of generalized quantile. The method takes inspiration from the random Sequential Barycenter Array distributions (SBA) proposed by Hill and Monticino (1998) which generates a Random Probability Measure (RPM) with known expected value. We define the Sequential Quantile Array (SQA) and show how to generate a random SQA from which we can derive RPMs. The distribution of the generated SQA-RPM can have full support and the RPMs can be both discrete, continuous and differentiable. We face also the problem of the efficient implementation of the procedure that ensures that the approximation of the SQA-RPM by a finite number of steps stays close to the SQA-RPM obtained theoretically by the procedure. Finally, we compare SQA-RPMs with similar approaches as Polya Tree.




ndo

Lower Bounds for Parallel and Randomized Convex Optimization

We study the question of whether parallelization in the exploration of the feasible set can be used to speed up convex optimization, in the local oracle model of computation and in the high-dimensional regime. We show that the answer is negative for both deterministic and randomized algorithms applied to essentially any of the interesting geometries and nonsmooth, weakly-smooth, or smooth objective functions. In particular, we show that it is not possible to obtain a polylogarithmic (in the sequential complexity of the problem) number of parallel rounds with a polynomial (in the dimension) number of queries per round. In the majority of these settings and when the dimension of the space is polynomial in the inverse target accuracy, our lower bounds match the oracle complexity of sequential convex optimization, up to at most a logarithmic factor in the dimension, which makes them (nearly) tight. Another conceptual contribution of our work is in providing a general and streamlined framework for proving lower bounds in the setting of parallel convex optimization. Prior to our work, lower bounds for parallel convex optimization algorithms were only known in a small fraction of the settings considered in this paper, mainly applying to Euclidean ($ell_2$) and $ell_infty$ spaces.




ndo

Convergences of Regularized Algorithms and Stochastic Gradient Methods with Random Projections

We study the least-squares regression problem over a Hilbert space, covering nonparametric regression over a reproducing kernel Hilbert space as a special case. We first investigate regularized algorithms adapted to a projection operator on a closed subspace of the Hilbert space. We prove convergence results with respect to variants of norms, under a capacity assumption on the hypothesis space and a regularity condition on the target function. As a result, we obtain optimal rates for regularized algorithms with randomized sketches, provided that the sketch dimension is proportional to the effective dimension up to a logarithmic factor. As a byproduct, we obtain similar results for Nystr"{o}m regularized algorithms. Our results provide optimal, distribution-dependent rates that do not have any saturation effect for sketched/Nystr"{o}m regularized algorithms, considering both the attainable and non-attainable cases, in the well-conditioned regimes. We then study stochastic gradient methods with projection over the subspace, allowing multi-pass over the data and minibatches, and we derive similar optimal statistical convergence results.




ndo

On the Complexity Analysis of the Primal Solutions for the Accelerated Randomized Dual Coordinate Ascent

Dual first-order methods are essential techniques for large-scale constrained convex optimization. However, when recovering the primal solutions, we need $T(epsilon^{-2})$ iterations to achieve an $epsilon$-optimal primal solution when we apply an algorithm to the non-strongly convex dual problem with $T(epsilon^{-1})$ iterations to achieve an $epsilon$-optimal dual solution, where $T(x)$ can be $x$ or $sqrt{x}$. In this paper, we prove that the iteration complexity of the primal solutions and dual solutions have the same $Oleft(frac{1}{sqrt{epsilon}} ight)$ order of magnitude for the accelerated randomized dual coordinate ascent. When the dual function further satisfies the quadratic functional growth condition, by restarting the algorithm at any period, we establish the linear iteration complexity for both the primal solutions and dual solutions even if the condition number is unknown. When applied to the regularized empirical risk minimization problem, we prove the iteration complexity of $Oleft(nlog n+sqrt{frac{n}{epsilon}} ight)$ in both primal space and dual space, where $n$ is the number of samples. Our result takes out the $left(log frac{1}{epsilon} ight)$ factor compared with the methods based on smoothing/regularization or Catalyst reduction. As far as we know, this is the first time that the optimal $Oleft(sqrt{frac{n}{epsilon}} ight)$ iteration complexity in the primal space is established for the dual coordinate ascent based stochastic algorithms. We also establish the accelerated linear complexity for some problems with nonsmooth loss, e.g., the least absolute deviation and SVM.




ndo

Branching random walks with uncountably many extinction probability vectors

Daniela Bertacchi, Fabio Zucca.

Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 2, 426--438.

Abstract:
Given a branching random walk on a set $X$, we study its extinction probability vectors $mathbf{q}(cdot,A)$. Their components are the probability that the process goes extinct in a fixed $Asubseteq X$, when starting from a vertex $xin X$. The set of extinction probability vectors (obtained letting $A$ vary among all subsets of $X$) is a subset of the set of the fixed points of the generating function of the branching random walk. In particular here we are interested in the cardinality of the set of extinction probability vectors. We prove results which allow to understand whether the probability of extinction in a set $A$ is different from the one of extinction in another set $B$. In many cases there are only two possible extinction probability vectors and so far, in more complicated examples, only a finite number of distinct extinction probability vectors had been explicitly found. Whether a branching random walk could have an infinite number of distinct extinction probability vectors was not known. We apply our results to construct examples of branching random walks with uncountably many distinct extinction probability vectors.




ndo

Stein characterizations for linear combinations of gamma random variables

Benjamin Arras, Ehsan Azmoodeh, Guillaume Poly, Yvik Swan.

Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 2, 394--413.

Abstract:
In this paper we propose a new, simple and explicit mechanism allowing to derive Stein operators for random variables whose characteristic function satisfies a simple ODE. We apply this to study random variables which can be represented as linear combinations of (not necessarily independent) gamma distributed random variables. The connection with Malliavin calculus for random variables in the second Wiener chaos is detailed. An application to McKay Type I random variables is also outlined.




ndo

Random environment binomial thinning integer-valued autoregressive process with Poisson or geometric marginal

Zhengwei Liu, Qi Li, Fukang Zhu.

Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 2, 251--272.

Abstract:
To predict time series of counts with small values and remarkable fluctuations, an available model is the $r$ states random environment process based on the negative binomial thinning operator and the geometric marginal. However, we argue that the aforementioned model may suffer from the following two drawbacks. First, under the condition of no prior information, the overdispersed property of the geometric distribution may cause the predictions fluctuate greatly. Second, because of the constraints on the model parameters, some estimated parameters are close to zero in real-data examples, which may not objectively reveal the correlation relationship. For the first drawback, an $r$ states random environment process based on the binomial thinning operator and the Poisson marginal is introduced. For the second drawback, we propose a generalized $r$ states random environment integer-valued autoregressive model based on the binomial thinning operator to model fluctuations of data. Yule–Walker and conditional maximum likelihood estimates are considered and their performances are assessed via simulation studies. Two real-data sets are conducted to illustrate the better performances of the proposed models compared with some existing models.




ndo

Unions of random walk and percolation on infinite graphs

Kazuki Okamura.

Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 3, 586--637.

Abstract:
We consider a random object that is associated with both random walks and random media, specifically, the superposition of a configuration of subcritical Bernoulli percolation on an infinite connected graph and the trace of the simple random walk on the same graph. We investigate asymptotics for the number of vertices of the enlargement of the trace of the walk until a fixed time, when the time tends to infinity. This process is more highly self-interacting than the range of random walk, which yields difficulties. We show a law of large numbers on vertex-transitive transient graphs. We compare the process on a vertex-transitive graph with the process on a finitely modified graph of the original vertex-transitive graph and show their behaviors are similar. We show that the process fluctuates almost surely on a certain non-vertex-transitive graph. On the two-dimensional integer lattice, by investigating the size of the boundary of the trace, we give an estimate for variances of the process implying a law of large numbers. We give an example of a graph with unbounded degrees on which the process behaves in a singular manner. As by-products, some results for the range and the boundary, which will be of independent interest, are obtained.




ndo

Spatially adaptive Bayesian image reconstruction through locally-modulated Markov random field models

Salem M. Al-Gezeri, Robert G. Aykroyd.

Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 3, 498--519.

Abstract:
The use of Markov random field (MRF) models has proven to be a fruitful approach in a wide range of image processing applications. It allows local texture information to be incorporated in a systematic and unified way and allows statistical inference theory to be applied giving rise to novel output summaries and enhanced image interpretation. A great advantage of such low-level approaches is that they lead to flexible models, which can be applied to a wide range of imaging problems without the need for significant modification. This paper proposes and explores the use of conditional MRF models for situations where multiple images are to be processed simultaneously, or where only a single image is to be reconstructed and a sequential approach is taken. Although the coupling of image intensity values is a special case of our approach, the main extension over previous proposals is to allow the direct coupling of other properties, such as smoothness or texture. This is achieved using a local modulating function which adjusts the influence of global smoothing without the need for a fully inhomogeneous prior model. Several modulating functions are considered and a detailed simulation study, motivated by remote sensing applications in archaeological geophysics, of conditional reconstruction is presented. The results demonstrate that a substantial improvement in the quality of the image reconstruction, in terms of errors and residuals, can be achieved using this approach, especially at locations with rapid changes in the underlying intensity.




ndo

Necessary and sufficient conditions for the convergence of the consistent maximal displacement of the branching random walk

Bastien Mallein.

Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 2, 356--373.

Abstract:
Consider a supercritical branching random walk on the real line. The consistent maximal displacement is the smallest of the distances between the trajectories followed by individuals at the $n$th generation and the boundary of the process. Fang and Zeitouni, and Faraud, Hu and Shi proved that under some integrability conditions, the consistent maximal displacement grows almost surely at rate $lambda^{*}n^{1/3}$ for some explicit constant $lambda^{*}$. We obtain here a necessary and sufficient condition for this asymptotic behaviour to hold.




ndo

Can $p$-values be meaningfully interpreted without random sampling?

Norbert Hirschauer, Sven Grüner, Oliver Mußhoff, Claudia Becker, Antje Jantsch.

Source: Statistics Surveys, Volume 14, 71--91.

Abstract:
Besides the inferential errors that abound in the interpretation of $p$-values, the probabilistic pre-conditions (i.e. random sampling or equivalent) for using them at all are not often met by observational studies in the social sciences. This paper systematizes different sampling designs and discusses the restrictive requirements of data collection that are the indispensable prerequisite for using $p$-values.




ndo

Statistical errors in Monte Carlo-based inference for random elements. (arXiv:2005.02532v2 [math.ST] UPDATED)

Monte Carlo simulation is useful to compute or estimate expected functionals of random elements if those random samples are possible to be generated from the true distribution. However, when the distribution has some unknown parameters, the samples must be generated from an estimated distribution with the parameters replaced by some estimators, which causes a statistical error in Monte Carlo estimation. This paper considers such a statistical error and investigates the asymptotic distributions of Monte Carlo-based estimators when the random elements are not only the real valued, but also functional valued random variables. We also investigate expected functionals for semimartingales in details. The consideration indicates that the Monte Carlo estimation can get worse when a semimartingale has a jump part with unremovable unknown parameters.




ndo

Sampling random graph homomorphisms and applications to network data analysis. (arXiv:1910.09483v2 [math.PR] UPDATED)

A graph homomorphism is a map between two graphs that preserves adjacency relations. We consider the problem of sampling a random graph homomorphism from a graph $F$ into a large network $mathcal{G}$. We propose two complementary MCMC algorithms for sampling a random graph homomorphisms and establish bounds on their mixing times and concentration of their time averages. Based on our sampling algorithms, we propose a novel framework for network data analysis that circumvents some of the drawbacks in methods based on independent and neigborhood sampling. Various time averages of the MCMC trajectory give us various computable observables, including well-known ones such as homomorphism density and average clustering coefficient and their generalizations. Furthermore, we show that these network observables are stable with respect to a suitably renormalized cut distance between networks. We provide various examples and simulations demonstrating our framework through synthetic networks. We also apply our framework for network clustering and classification problems using the Facebook100 dataset and Word Adjacency Networks of a set of classic novels.




ndo

Robust location estimators in regression models with covariates and responses missing at random. (arXiv:2005.03511v1 [stat.ME])

This paper deals with robust marginal estimation under a general regression model when missing data occur in the response and also in some of covariates. The target is a marginal location parameter which is given through an $M-$functional. To obtain robust Fisher--consistent estimators, properly defined marginal distribution function estimators are considered. These estimators avoid the bias due to missing values by assuming a missing at random condition. Three methods are considered to estimate the marginal distribution function which allows to obtain the $M-$location of interest: the well-known inverse probability weighting, a convolution--based method that makes use of the regression model and an augmented inverse probability weighting procedure that prevents against misspecification. The robust proposed estimators and the classical ones are compared through a numerical study under different missing models including clean and contaminated samples. We illustrate the estimators behaviour under a nonlinear model. A real data set is also analysed.




ndo

Convergence and inference for mixed Poisson random sums. (arXiv:2005.03187v1 [math.PR])

In this paper we obtain the limit distribution for partial sums with a random number of terms following a class of mixed Poisson distributions. The resulting weak limit is a mixing between a normal distribution and an exponential family, which we call by normal exponential family (NEF) laws. A new stability concept is introduced and a relationship between {alpha}-stable distributions and NEF laws is established. We propose estimation of the parameters of the NEF models through the method of moments and also by the maximum likelihood method, which is performed via an Expectation-Maximization algorithm. Monte Carlo simulation studies are addressed to check the performance of the proposed estimators and an empirical illustration on financial market is presented.




ndo

On the Optimality of Randomization in Experimental Design: How to Randomize for Minimax Variance and Design-Based Inference. (arXiv:2005.03151v1 [stat.ME])

I study the minimax-optimal design for a two-arm controlled experiment where conditional mean outcomes may vary in a given set. When this set is permutation symmetric, the optimal design is complete randomization, and using a single partition (i.e., the design that only randomizes the treatment labels for each side of the partition) has minimax risk larger by a factor of $n-1$. More generally, the optimal design is shown to be the mixed-strategy optimal design (MSOD) of Kallus (2018). Notably, even when the set of conditional mean outcomes has structure (i.e., is not permutation symmetric), being minimax-optimal for variance still requires randomization beyond a single partition. Nonetheless, since this targets precision, it may still not ensure sufficient uniformity in randomization to enable randomization (i.e., design-based) inference by Fisher's exact test to appropriately detect violations of null. I therefore propose the inference-constrained MSOD, which is minimax-optimal among all designs subject to such uniformity constraints. On the way, I discuss Johansson et al. (2020) who recently compared rerandomization of Morgan and Rubin (2012) and the pure-strategy optimal design (PSOD) of Kallus (2018). I point out some errors therein and set straight that randomization is minimax-optimal and that the "no free lunch" theorem and example in Kallus (2018) are correct.




ndo

Bayesian Random-Effects Meta-Analysis Using the bayesmeta R Package

The random-effects or normal-normal hierarchical model is commonly utilized in a wide range of meta-analysis applications. A Bayesian approach to inference is very attractive in this context, especially when a meta-analysis is based only on few studies. The bayesmeta R package provides readily accessible tools to perform Bayesian meta-analyses and generate plots and summaries, without having to worry about computational details. It allows for flexible prior specification and instant access to the resulting posterior distributions, including prediction and shrinkage estimation, and facilitating for example quick sensitivity checks. The present paper introduces the underlying theory and showcases its usage.




ndo

Microbial endophytes : functional biology and applications

9780128196540 (print)




ndo

Microbial endophytes : prospects for sustainable agriculture

0128187255




ndo

Management of fractured endodontic instruments : a clinical guide

9783319606514 (electronic bk.)




ndo

Endocrine surgery in children

9783662542569 (electronic book)




ndo

Clinical approaches in endodontic regeneration : current and emerging therapeutic perspectives

9783319968483 (electronic bk.)




ndo

Apical periodontitis in root-filled teeth : endodontic retreatment and alternative approaches

9783319572505 (electronic bk.)




ndo

Concentration and consistency results for canonical and curved exponential-family models of random graphs

Michael Schweinberger, Jonathan Stewart.

Source: The Annals of Statistics, Volume 48, Number 1, 374--396.

Abstract:
Statistical inference for exponential-family models of random graphs with dependent edges is challenging. We stress the importance of additional structure and show that additional structure facilitates statistical inference. A simple example of a random graph with additional structure is a random graph with neighborhoods and local dependence within neighborhoods. We develop the first concentration and consistency results for maximum likelihood and $M$-estimators of a wide range of canonical and curved exponential-family models of random graphs with local dependence. All results are nonasymptotic and applicable to random graphs with finite populations of nodes, although asymptotic consistency results can be obtained as well. In addition, we show that additional structure can facilitate subgraph-to-graph estimation, and present concentration results for subgraph-to-graph estimators. As an application, we consider popular curved exponential-family models of random graphs, with local dependence induced by transitivity and parameter vectors whose dimensions depend on the number of nodes.




ndo

Rerandomization in $2^{K}$ factorial experiments

Xinran Li, Peng Ding, Donald B. Rubin.

Source: The Annals of Statistics, Volume 48, Number 1, 43--63.

Abstract:
With many pretreatment covariates and treatment factors, the classical factorial experiment often fails to balance covariates across multiple factorial effects simultaneously. Therefore, it is intuitive to restrict the randomization of the treatment factors to satisfy certain covariate balance criteria, possibly conforming to the tiers of factorial effects and covariates based on their relative importances. This is rerandomization in factorial experiments. We study the asymptotic properties of this experimental design under the randomization inference framework without imposing any distributional or modeling assumptions of the covariates and outcomes. We derive the joint asymptotic sampling distribution of the usual estimators of the factorial effects, and show that it is symmetric, unimodal and more “concentrated” at the true factorial effects under rerandomization than under the classical factorial experiment. We quantify this advantage of rerandomization using the notions of “central convex unimodality” and “peakedness” of the joint asymptotic sampling distribution. We also construct conservative large-sample confidence sets for the factorial effects.




ndo

Randomized incomplete $U$-statistics in high dimensions

Xiaohui Chen, Kengo Kato.

Source: The Annals of Statistics, Volume 47, Number 6, 3127--3156.

Abstract:
This paper studies inference for the mean vector of a high-dimensional $U$-statistic. In the era of big data, the dimension $d$ of the $U$-statistic and the sample size $n$ of the observations tend to be both large, and the computation of the $U$-statistic is prohibitively demanding. Data-dependent inferential procedures such as the empirical bootstrap for $U$-statistics is even more computationally expensive. To overcome such a computational bottleneck, incomplete $U$-statistics obtained by sampling fewer terms of the $U$-statistic are attractive alternatives. In this paper, we introduce randomized incomplete $U$-statistics with sparse weights whose computational cost can be made independent of the order of the $U$-statistic. We derive nonasymptotic Gaussian approximation error bounds for the randomized incomplete $U$-statistics in high dimensions, namely in cases where the dimension $d$ is possibly much larger than the sample size $n$, for both nondegenerate and degenerate kernels. In addition, we propose generic bootstrap methods for the incomplete $U$-statistics that are computationally much less demanding than existing bootstrap methods, and establish finite sample validity of the proposed bootstrap methods. Our methods are illustrated on the application to nonparametric testing for the pairwise independence of a high-dimensional random vector under weaker assumptions than those appearing in the literature.




ndo

Eigenvalue distributions of variance components estimators in high-dimensional random effects models

Zhou Fan, Iain M. Johnstone.

Source: The Annals of Statistics, Volume 47, Number 5, 2855--2886.

Abstract:
We study the spectra of MANOVA estimators for variance component covariance matrices in multivariate random effects models. When the dimensionality of the observations is large and comparable to the number of realizations of each random effect, we show that the empirical spectra of such estimators are well approximated by deterministic laws. The Stieltjes transforms of these laws are characterized by systems of fixed-point equations, which are numerically solvable by a simple iterative procedure. Our proof uses operator-valued free probability theory, and we establish a general asymptotic freeness result for families of rectangular orthogonally invariant random matrices, which is of independent interest. Our work is motivated in part by the estimation of components of covariance between multiple phenotypic traits in quantitative genetics, and we specialize our results to common experimental designs that arise in this application.




ndo

Distance multivariance: New dependence measures for random vectors

Björn Böttcher, Martin Keller-Ressel, René L. Schilling.

Source: The Annals of Statistics, Volume 47, Number 5, 2757--2789.

Abstract:
We introduce two new measures for the dependence of $nge2$ random variables: distance multivariance and total distance multivariance . Both measures are based on the weighted $L^{2}$-distance of quantities related to the characteristic functions of the underlying random variables. These extend distance covariance (introduced by Székely, Rizzo and Bakirov) from pairs of random variables to $n$-tuplets of random variables. We show that total distance multivariance can be used to detect the independence of $n$ random variables and has a simple finite-sample representation in terms of distance matrices of the sample points, where distance is measured by a continuous negative definite function. Under some mild moment conditions, this leads to a test for independence of multiple random vectors which is consistent against all alternatives.




ndo

Phase transition in the spiked random tensor with Rademacher prior

Wei-Kuo Chen.

Source: The Annals of Statistics, Volume 47, Number 5, 2734--2756.

Abstract:
We consider the problem of detecting a deformation from a symmetric Gaussian random $p$-tensor $(pgeq3)$ with a rank-one spike sampled from the Rademacher prior. Recently, in Lesieur et al. (Barbier, Krzakala, Macris, Miolane and Zdeborová (2017)), it was proved that there exists a critical threshold $eta_{p}$ so that when the signal-to-noise ratio exceeds $eta_{p}$, one can distinguish the spiked and unspiked tensors and weakly recover the prior via the minimal mean-square-error method. On the other side, Perry, Wein and Bandeira (Perry, Wein and Bandeira (2017)) proved that there exists a $eta_{p}'<eta_{p}$ such that any statistical hypothesis test cannot distinguish these two tensors, in the sense that their total variation distance asymptotically vanishes, when the signa-to-noise ratio is less than $eta_{p}'$. In this work, we show that $eta_{p}$ is indeed the critical threshold that strictly separates the distinguishability and indistinguishability between the two tensors under the total variation distance. Our approach is based on a subtle analysis of the high temperature behavior of the pure $p$-spin model with Ising spin, arising initially from the field of spin glasses. In particular, we identify the signal-to-noise criticality $eta_{p}$ as the critical temperature, distinguishing the high and low temperature behavior, of the Ising pure $p$-spin mean-field spin glass model.




ndo

A latent discrete Markov random field approach to identifying and classifying historical forest communities based on spatial multivariate tree species counts

Stephen Berg, Jun Zhu, Murray K. Clayton, Monika E. Shea, David J. Mladenoff.

Source: The Annals of Applied Statistics, Volume 13, Number 4, 2312--2340.

Abstract:
The Wisconsin Public Land Survey database describes historical forest composition at high spatial resolution and is of interest in ecological studies of forest composition in Wisconsin just prior to significant Euro-American settlement. For such studies it is useful to identify recurring subpopulations of tree species known as communities, but standard clustering approaches for subpopulation identification do not account for dependence between spatially nearby observations. Here, we develop and fit a latent discrete Markov random field model for the purpose of identifying and classifying historical forest communities based on spatially referenced multivariate tree species counts across Wisconsin. We show empirically for the actual dataset and through simulation that our latent Markov random field modeling approach improves prediction and parameter estimation performance. For model fitting we introduce a new stochastic approximation algorithm which enables computationally efficient estimation and classification of large amounts of spatial multivariate count data.




ndo

Oblique random survival forests

Byron C. Jaeger, D. Leann Long, Dustin M. Long, Mario Sims, Jeff M. Szychowski, Yuan-I Min, Leslie A. Mcclure, George Howard, Noah Simon.

Source: The Annals of Applied Statistics, Volume 13, Number 3, 1847--1883.

Abstract:
We introduce and evaluate the oblique random survival forest (ORSF). The ORSF is an ensemble method for right-censored survival data that uses linear combinations of input variables to recursively partition a set of training data. Regularized Cox proportional hazard models are used to identify linear combinations of input variables in each recursive partitioning step. Benchmark results using simulated and real data indicate that the ORSF’s predicted risk function has high prognostic value in comparison to random survival forests, conditional inference forests, regression and boosting. In an application to data from the Jackson Heart Study, we demonstrate variable and partial dependence using the ORSF and highlight characteristics of its ten-year predicted risk function for atherosclerotic cardiovascular disease events (ASCVD; stroke, coronary heart disease). We present visualizations comparing variable and partial effect estimation according to the ORSF, the conditional inference forest, and the Pooled Cohort Risk equations. The obliqueRSF R package, which provides functions to fit the ORSF and create variable and partial dependence plots, is available on the comprehensive R archive network (CRAN).




ndo

RCRnorm: An integrated system of random-coefficient hierarchical regression models for normalizing NanoString nCounter data

Gaoxiang Jia, Xinlei Wang, Qiwei Li, Wei Lu, Ximing Tang, Ignacio Wistuba, Yang Xie.

Source: The Annals of Applied Statistics, Volume 13, Number 3, 1617--1647.

Abstract:
Formalin-fixed paraffin-embedded (FFPE) samples have great potential for biomarker discovery, retrospective studies and diagnosis or prognosis of diseases. Their application, however, is hindered by the unsatisfactory performance of traditional gene expression profiling techniques on damaged RNAs. NanoString nCounter platform is well suited for profiling of FFPE samples and measures gene expression with high sensitivity which may greatly facilitate realization of scientific and clinical values of FFPE samples. However, methodological development for normalization, a critical step when analyzing this type of data, is far behind. Existing methods designed for the platform use information from different types of internal controls separately and rely on an overly-simplified assumption that expression of housekeeping genes is constant across samples for global scaling. Thus, these methods are not optimized for the nCounter system, not mentioning that they were not developed for FFPE samples. We construct an integrated system of random-coefficient hierarchical regression models to capture main patterns and characteristics observed from NanoString data of FFPE samples and develop a Bayesian approach to estimate parameters and normalize gene expression across samples. Our method, labeled RCRnorm, incorporates information from all aspects of the experimental design and simultaneously removes biases from various sources. It eliminates the unrealistic assumption on housekeeping genes and offers great interpretability. Furthermore, it is applicable to freshly frozen or like samples that can be generally viewed as a reduced case of FFPE samples. Simulation and applications showed the superior performance of RCRnorm.




ndo

Concentration of the spectral norm of Erdős–Rényi random graphs

Gábor Lugosi, Shahar Mendelson, Nikita Zhivotovskiy.

Source: Bernoulli, Volume 26, Number 3, 2253--2274.

Abstract:
We present results on the concentration properties of the spectral norm $|A_{p}|$ of the adjacency matrix $A_{p}$ of an Erdős–Rényi random graph $G(n,p)$. First, we consider the Erdős–Rényi random graph process and prove that $|A_{p}|$ is uniformly concentrated over the range $pin[Clog n/n,1]$. The analysis is based on delocalization arguments, uniform laws of large numbers, together with the entropy method to prove concentration inequalities. As an application of our techniques, we prove sharp sub-Gaussian moment inequalities for $|A_{p}|$ for all $pin[clog^{3}n/n,1]$ that improve the general bounds of Alon, Krivelevich, and Vu ( Israel J. Math. 131 (2002) 259–267) and some of the more recent results of Erdős et al. ( Ann. Probab. 41 (2013) 2279–2375). Both results are consistent with the asymptotic result of Füredi and Komlós ( Combinatorica 1 (1981) 233–241) that holds for fixed $p$ as $n oinfty$.




ndo

Random orthogonal matrices and the Cayley transform

Michael Jauch, Peter D. Hoff, David B. Dunson.

Source: Bernoulli, Volume 26, Number 2, 1560--1586.

Abstract:
Random orthogonal matrices play an important role in probability and statistics, arising in multivariate analysis, directional statistics, and models of physical systems, among other areas. Calculations involving random orthogonal matrices are complicated by their constrained support. Accordingly, we parametrize the Stiefel and Grassmann manifolds, represented as subsets of orthogonal matrices, in terms of Euclidean parameters using the Cayley transform. We derive the necessary Jacobian terms for change of variables formulas. Given a density defined on the Stiefel or Grassmann manifold, these allow us to specify the corresponding density for the Euclidean parameters, and vice versa. As an application, we present a Markov chain Monte Carlo approach to simulating from distributions on the Stiefel and Grassmann manifolds. Finally, we establish that the Euclidean parameters corresponding to a uniform orthogonal matrix can be approximated asymptotically by independent normals. This result contributes to the growing literature on normal approximations to the entries of random orthogonal matrices or transformations thereof.




ndo

The moduli of non-differentiability for Gaussian random fields with stationary increments

Wensheng Wang, Zhonggen Su, Yimin Xiao.

Source: Bernoulli, Volume 26, Number 2, 1410--1430.

Abstract:
We establish the exact moduli of non-differentiability of Gaussian random fields with stationary increments. As an application of the result, we prove that the uniform Hölder condition for the maximum local times of Gaussian random fields with stationary increments obtained in Xiao (1997) is optimal. These results are applicable to fractional Riesz–Bessel processes and stationary Gaussian random fields in the Matérn and Cauchy classes.