
Random environment binomial thinning integer-valued autoregressive process with Poisson or geometric marginal

Zhengwei Liu, Qi Li, Fukang Zhu.

Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 2, 251--272.

Abstract:
To predict time series of counts with small values and remarkable fluctuations, an available model is the $r$ states random environment process based on the negative binomial thinning operator and the geometric marginal. However, we argue that the aforementioned model may suffer from the following two drawbacks. First, under the condition of no prior information, the overdispersed property of the geometric distribution may cause the predictions to fluctuate greatly. Second, because of the constraints on the model parameters, some estimated parameters are close to zero in real-data examples, which may not objectively reveal the correlation relationship. For the first drawback, an $r$ states random environment process based on the binomial thinning operator and the Poisson marginal is introduced. For the second drawback, we propose a generalized $r$ states random environment integer-valued autoregressive model based on the binomial thinning operator to model fluctuations of data. Yule–Walker and conditional maximum likelihood estimates are considered and their performances are assessed via simulation studies. Two real-data sets are analyzed to illustrate the better performances of the proposed models compared with some existing models.
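As background for the binomial thinning operator and the Poisson marginal named in this abstract, the following is a minimal sketch of the classical single-environment INAR(1) recursion $X_t = \alpha \circ X_{t-1} + \varepsilon_t$. It is not the paper's $r$ states random environment model: the function names, the parameter names `alpha` and `lam`, and the choice of Poisson innovations with mean `lam * (1 - alpha)` (which keeps the marginal Poisson with mean `lam`) are assumptions made for illustration.

```python
import math
import random


def binomial_thinning(alpha, x, rng):
    """Return alpha o x: the number of successes in x Bernoulli(alpha) trials."""
    return sum(1 for _ in range(x) if rng.random() < alpha)


def poisson_draw(mean, rng):
    """Draw one Poisson(mean) variate with Knuth's multiplication method."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_poisson_inar1(n, alpha, lam, seed=0):
    """Simulate n steps of an INAR(1) process with Poisson(lam) marginal.

    Hypothetical helper: one environment only, so the random environment
    switching described in the abstract is deliberately left out.
    """
    rng = random.Random(seed)
    x = poisson_draw(lam, rng)  # start from the stationary marginal
    path = []
    for _ in range(n):
        survivors = binomial_thinning(alpha, x, rng)      # alpha o X_{t-1}
        arrivals = poisson_draw(lam * (1 - alpha), rng)   # innovation term
        x = survivors + arrivals
        path.append(x)
    return path
```

Calling `simulate_poisson_inar1(1000, 0.5, 2.0)` returns a list of non-negative integer counts whose long-run sample mean should sit near `lam`.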





A primer on the characterization of the exchangeable Marshall–Olkin copula via monotone sequences

Natalia Shenkman.

Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 1, 127--135.

Abstract:
While derivations of the characterization of the $d$-variate exchangeable Marshall–Olkin copula via $d$-monotone sequences relying on basic knowledge in probability theory exist in the literature, they contain a myriad of unnecessary, relatively complicated computations. We revisit this issue and provide proofs where all undesired artefacts are removed, thereby exposing the simplicity of the characterization. In particular, we give an insightful analytical derivation of the monotonicity conditions based on the monotonicity properties of the survival probabilities.





Spatiotemporal point processes: regression, model specifications and future directions

Dani Gamerman.

Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 4, 686--705.

Abstract:
Point processes are one of the most commonly encountered observation processes in Spatial Statistics. Model-based inference for them depends on the likelihood function. In the most standard setting of Poisson processes, the likelihood depends on the intensity function, and cannot be computed analytically. A number of approximating techniques have been proposed to handle this difficulty. In this paper, we review recent work on exact solutions that solve this problem without resorting to approximations. The presentation concentrates more heavily on discrete time but also considers continuous time. The solutions are based on model specifications that impose smoothness constraints on the intensity function. We also review approaches to include a regression component and different ways to accommodate it while accounting for additional heterogeneity. Applications are provided to illustrate the results. Finally, we discuss possible extensions to account for discontinuities and/or jumps in the intensity function.





Hierarchical modelling of power law processes for the analysis of repairable systems with different truncation times: An empirical Bayes approach

Rodrigo Citton P. dos Reis, Enrico A. Colosimo, Gustavo L. Gilardoni.

Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 2, 374--396.

Abstract:
In the analysis of data from multiple repairable systems, it is usual to observe both different truncation times and heterogeneity among the systems. Among other reasons, the latter is caused by different manufacturing lines and maintenance teams of the systems. In this paper, a hierarchical model is proposed for the statistical analysis of multiple repairable systems under different truncation times. A reparameterization of the power law process is proposed in order to obtain a quasi-conjugate Bayesian analysis. An empirical Bayes approach is used to estimate model hyperparameters. The uncertainty in the estimates of these quantities is corrected by using a parametric bootstrap approach. The results are illustrated with a real data set of failure times of power transformers from an electric company in Brazil.
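For orientation, here is a minimal sketch of the textbook maximum likelihood fit of a power law process for a single system observed up to a truncation time. It assumes the common parametrization with intensity $(\beta/\theta)(t/\theta)^{\beta-1}$, which the abstract does not state, and it does not reproduce the paper's reparameterization, hierarchical model, or empirical Bayes step. The function names are hypothetical.

```python
import math


def plp_mle(failure_times, truncation_time):
    """Closed-form MLE (beta, theta) for one time-truncated system.

    Assumes intensity (beta/theta) * (t/theta)**(beta - 1) and failure
    times strictly between 0 and truncation_time.
    """
    n = len(failure_times)
    log_sum = sum(math.log(truncation_time / t) for t in failure_times)
    beta_hat = n / log_sum
    theta_hat = truncation_time / n ** (1.0 / beta_hat)
    return beta_hat, theta_hat


def plp_expected_failures(t, beta, theta):
    """Mean number of failures in (0, t] under the same parametrization."""
    return (t / theta) ** beta
```

A useful sanity check on the fit: `plp_expected_failures(truncation_time, beta_hat, theta_hat)` equals the observed number of failures, because the fitted mean function is forced through the observed count at the truncation time.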





Necessary and sufficient conditions for the convergence of the consistent maximal displacement of the branching random walk

Bastien Mallein.

Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 2, 356--373.

Abstract:
Consider a supercritical branching random walk on the real line. The consistent maximal displacement is the smallest of the distances between the trajectories followed by individuals at the $n$th generation and the boundary of the process. Fang and Zeitouni, and Faraud, Hu and Shi proved that under some integrability conditions, the consistent maximal displacement grows almost surely at rate $\lambda^{*}n^{1/3}$ for some explicit constant $\lambda^{*}$. We obtain here a necessary and sufficient condition for this asymptotic behaviour to hold.





The equivalence of dynamic and static asset allocations under the uncertainty caused by Poisson processes

Yong-Chao Zhang, Na Zhang.

Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 1, 184--191.

Abstract:
We investigate the equivalence of dynamic and static asset allocations in the case where the price process of a risky asset is driven by a Poisson process. Under some mild conditions, we obtain a necessary and sufficient condition for the equivalence of dynamic and static asset allocations. In addition, we provide a simple sufficient condition for the equivalence.





Odysseus asleep : uncollected sequences, 1994-2019

Sanger, Peter, 1943- author.
9781554472048





Fully grown : why a stagnant economy is a sign of success

Vollrath, Dietrich, author.
9780226666006 hardcover





The theory and application of penalized methods or Reproducing Kernel Hilbert Spaces made easy

Nancy Heckman

Source: Statist. Surv., Volume 6, 113--141.

Abstract:
The popular cubic smoothing spline estimate of a regression function arises as the minimizer of the penalized sum of squares $\sum_{j}(Y_{j}-\mu(t_{j}))^{2}+\lambda \int_{a}^{b}[\mu''(t)]^{2}\,dt$, where the data are $t_{j},Y_{j}$, $j=1,\ldots,n$. The minimization is taken over an infinite-dimensional function space, the space of all functions with square integrable second derivatives. But the calculations can be carried out in a finite-dimensional space. The reduction from minimizing over an infinite dimensional space to minimizing over a finite dimensional space occurs for more general objective functions: the data may be related to the function $\mu$ in another way, the sum of squares may be replaced by a more suitable expression, or the penalty, $\int_{a}^{b}[\mu''(t)]^{2}\,dt$, might take a different form. This paper reviews the Reproducing Kernel Hilbert Space structure that provides a finite-dimensional solution for a general minimization problem. Particular attention is paid to the construction and study of the Reproducing Kernel Hilbert Space corresponding to a penalty based on a linear differential operator. In this case, one can often calculate the minimizer explicitly, using Green’s functions.





Identifying the consequences of dynamic treatment strategies: A decision-theoretic overview

A. Philip Dawid, Vanessa Didelez

Source: Statist. Surv., Volume 4, 184--231.

Abstract:
We consider the problem of learning about and comparing the consequences of dynamic treatment strategies on the basis of observational data. We formulate this within a probabilistic decision-theoretic framework. Our approach is compared with related work by Robins and others: in particular, we show how Robins’s ‘G-computation’ algorithm arises naturally from this decision-theoretic perspective. Careful attention is paid to the mathematical and substantive conditions required to justify the use of this formula. These conditions revolve around a property we term stability, which relates the probabilistic behaviours of observational and interventional regimes. We show how an assumption of ‘sequential randomization’ (or ‘no unmeasured confounders’), or an alternative assumption of ‘sequential irrelevance’, can be used to infer stability. Probabilistic influence diagrams are used to simplify manipulations, and their power and limitations are discussed. We compare our approach with alternative formulations based on causal DAGs or potential response models. We aim to show that formulating the problem of assessing dynamic treatment strategies as a problem of decision analysis brings clarity, simplicity and generality.





Was one of your ancestors a whaler?

Whaling – along with wool production – was one of the first primary industries after the establishment of New South Wa





Was your ancestor a doctor?

A register of medical practitioners was first required to be kept in 1838 in New South Wales and was published in the G





Excess registered deaths in England and Wales during the COVID-19 pandemic, March 2020 and April 2020. (arXiv:2004.11355v4 [stat.AP] UPDATED)

Official counts of COVID-19 deaths have been criticized for potentially including people who did not die of COVID-19 but merely died with COVID-19. I address that critique by fitting a generalized additive model to weekly counts of all registered deaths in England and Wales during the 2010s. The model produces baseline rates of death registrations expected in the absence of the COVID-19 pandemic, and comparing those baselines to recent counts of registered deaths exposes the emergence of excess deaths late in March 2020. Among adults aged 45+, about 38,700 excess deaths were registered in the 5 weeks comprising 21 March through 24 April (612 $\pm$ 416 from 21$-$27 March, 5675 $\pm$ 439 from 28 March through 3 April, then 9183 $\pm$ 468, 12,712 $\pm$ 589, and 10,511 $\pm$ 567 in April's next 3 weeks). Both the Office for National Statistics's respective count of 26,891 death certificates which mention COVID-19, and the Department of Health and Social Care's hospital-focused count of 21,222 deaths, are appreciably less, implying that their counting methods have underestimated rather than overestimated the pandemic's true death toll. If underreporting rates have held steady, about 45,900 direct and indirect COVID-19 deaths might have been registered by April's end but not yet publicly reported in full.
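The headline figure in this abstract can be checked from the five weekly point estimates it quotes. The short sketch below only adds up those numbers and compares the total with the two official counts; the variable names are illustrative, and the uncertainty intervals are left out because the abstract does not say how they combine.

```python
# Weekly excess-death point estimates quoted in the abstract (adults aged 45+),
# in order from the week of 21-27 March through the week ending 24 April.
weekly_excess_deaths = [612, 5675, 9183, 12712, 10511]

total_excess = sum(weekly_excess_deaths)
rounded_total = round(total_excess, -2)  # nearest hundred, as in "about 38,700"

# Official counts quoted in the abstract.
ons_certificate_count = 26891
dhsc_hospital_count = 21222

print("total excess deaths:", total_excess)
print("rounded:", rounded_total)
print("excess over ONS count:", total_excess - ons_certificate_count)
print("excess over DHSC count:", total_excess - dhsc_hospital_count)
```

The weekly figures sum to 38,693, which rounds to the quoted 38,700 and exceeds both official counts.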





SmartExchange: Trading Higher-cost Memory Storage/Access for Lower-cost Computation. (arXiv:2005.03403v1 [cs.LG])

We present SmartExchange, an algorithm-hardware co-design framework to trade higher-cost memory storage/access for lower-cost computation, for energy-efficient inference of deep neural networks (DNNs). We develop a novel algorithm to enforce a specially favorable DNN weight structure, where each layerwise weight matrix can be stored as the product of a small basis matrix and a large sparse coefficient matrix whose non-zero elements are all power-of-2. To the best of our knowledge, this algorithm is the first formulation that integrates three mainstream model compression ideas: sparsification or pruning, decomposition, and quantization, into one unified framework. The resulting sparse and readily-quantized DNN thus enjoys greatly reduced energy consumption in data movement as well as weight storage. On top of that, we further design a dedicated accelerator to fully utilize the SmartExchange-enforced weights to improve both energy efficiency and latency performance. Extensive experiments show that 1) on the algorithm level, SmartExchange outperforms state-of-the-art compression techniques, including merely sparsification or pruning, decomposition, and quantization, in various ablation studies based on nine DNN models and four datasets; and 2) on the hardware level, the proposed SmartExchange based accelerator can improve the energy efficiency by up to 6.7$\times$ and the speedup by up to 19.2$\times$ over four state-of-the-art DNN accelerators, when benchmarked on seven DNN models (including four standard DNNs, two compact DNN models, and one segmentation model) and three datasets.





COVID-19 in-language resources





The public policy primer : managing the policy process

Wu, Xun, author.
9781315624754 (electronic bk.)





The Best and Worst Places to be a Woman in Canada 2019 : The Gender Gap in Canada’s 26 Biggest Cities

9781771254434 (print)





Sustainable agriculture : advances in plant metabolome and microbiome

Parray, Javid Ahmad, author
9780128173749 (electronic bk.)





Rediscovery of genetic and genomic resources for future food security

9811501564





Mixed plantations of eucalyptus and leguminous trees : soil, microbiology and ecosystem services

9783030323653 (electronic bk.)





Latin American dendroecology : combining tree-ring sciences and ecology in a megadiverse territory

9783030369309 (electronic bk.)





Health consequences of microbial interactions with hydrocarbons, oils, and lipids

9783319724737 (electronic bk.)





Green food processing techniques : preservation, transformation and extraction

9780128153536





Governance of offshore freshwater resources

Martin-Nagle, Renee, author.
9004421041 (electronic book)





Consequences of microbial interactions with hydrocarbons, oils, and lipids : biodegradation and bioremediation

9783319445359 (electronic bk.)





Computational processing of the Portuguese language : 14th International Conference, PROPOR 2020, Evora, Portugal, March 2-4, 2020, Proceedings

PROPOR (Conference) (14th : 2020 : Evora, Portugal)
9783030415051 (electronic bk.)





Carotenoids : properties, processing and applications

9780128173145 (electronic bk.)





Breakfast cereals and how they are made : raw materials, processing, and production

9780128120446 (electronic bk.)





Botulinum toxins, fillers and related substances

9783319168029 (electronic bk.)





Biscuit, cookie and cracker process and recipes

Sykes, Glyn, author
9780128206133 (electronic bk.)





Advances in virus research.

9780123850348 (electronic bk.)





Advances in protein chemistry and structural biology.

9780123819635 (electronic bk.)





Advances in protein chemistry and structural biology.

9780123864840 (electronic bk.)





Advances in parasitology.

9780123742292 (electronic bk.)





Advances in cyanobacterial biology

9780128193129 (electronic bk.)





Advances in applied microbiology.

1282169459





Advances in applied microbiology.

1282169416





General Notices







Detecting relevant changes in the mean of nonstationary processes—A mass excess approach

Holger Dette, Weichi Wu.

Source: The Annals of Statistics, Volume 47, Number 6, 3578--3608.

Abstract:
This paper considers the problem of testing if a sequence of means $(\mu_{t})_{t=1,\ldots ,n}$ of a nonstationary time series $(X_{t})_{t=1,\ldots ,n}$ is stable in the sense that the difference of the means $\mu_{1}$ and $\mu_{t}$ between the initial time $t=1$ and any other time is smaller than a given threshold, that is $|\mu_{1}-\mu_{t}|\leq c$ for all $t=1,\ldots ,n$. A test for hypotheses of this type is developed using a bias corrected monotone rearranged local linear estimator and asymptotic normality of the corresponding test statistic is established. As the asymptotic variance depends on the location of the roots of the equation $|\mu_{1}-\mu_{t}|=c$ a new bootstrap procedure is proposed to obtain critical values and its consistency is established. As a consequence we are able to quantitatively describe relevant deviations of a nonstationary sequence from its initial value. The results are illustrated by means of a simulation study and by analyzing data examples.





Joint convergence of sample autocovariance matrices when $p/n\to 0$ with application

Monika Bhattacharjee, Arup Bose.

Source: The Annals of Statistics, Volume 47, Number 6, 3470--3503.

Abstract:
Consider a high-dimensional linear time series model where the dimension $p$ and the sample size $n$ grow in such a way that $p/n\to 0$. Let $\hat{\Gamma}_{u}$ be the $u$th order sample autocovariance matrix. We first show that the limiting spectral distribution (LSD) of any symmetric polynomial in $\{\hat{\Gamma}_{u},\hat{\Gamma}_{u}^{*},u\geq 0\}$ exists under independence and moment assumptions on the driving sequence together with weak assumptions on the coefficient matrices. This LSD result, with some additional effort, implies the asymptotic normality of the trace of any polynomial in $\{\hat{\Gamma}_{u},\hat{\Gamma}_{u}^{*},u\geq 0\}$. We also study similar results for several independent MA processes. We show applications of the above results to statistical inference problems such as in estimation of the unknown order of a high-dimensional MA process and in graphical and significance tests for hypotheses on coefficient matrices of one or several such independent processes.





On partial-sum processes of ARMAX residuals

Steffen Grønneberg, Benjamin Holcblat.

Source: The Annals of Statistics, Volume 47, Number 6, 3216--3243.

Abstract:
We establish general and versatile results regarding the limit behavior of the partial-sum process of ARMAX residuals. Illustrations include ARMA with seasonal dummies, misspecified ARMAX models with autocorrelated errors, nonlinear ARMAX models, ARMA with a structural break, a wide range of ARMAX models with infinite-variance errors, weak GARCH models and the consistency of kernel estimation of the density of ARMAX errors. Our results identify the limit distributions, and provide a general algorithm to obtain pivot statistics for CUSUM tests.





Distributed estimation of principal eigenspaces

Jianqing Fan, Dong Wang, Kaizheng Wang, Ziwei Zhu.

Source: The Annals of Statistics, Volume 47, Number 6, 3009--3031.

Abstract:
Principal component analysis (PCA) is fundamental to statistical machine learning. It extracts latent principal factors that contribute to the most variation of the data. When data are stored across multiple machines, however, communication cost can prohibit the computation of PCA in a central location and distributed algorithms for PCA are thus needed. This paper proposes and studies a distributed PCA algorithm: each node machine computes the top $K$ eigenvectors and transmits them to the central server; the central server then aggregates the information from all the node machines and conducts a PCA based on the aggregated information. We investigate the bias and variance for the resulting distributed estimator of the top $K$ eigenvectors. In particular, we show that for distributions with symmetric innovation, the empirical top eigenspaces are unbiased, and hence the distributed PCA is “unbiased.” We derive the rate of convergence for distributed PCA estimators, which depends explicitly on the effective rank of covariance, eigengap, and the number of machines. We show that when the number of machines is not unreasonably large, the distributed PCA performs as well as the whole sample PCA, even without full access of whole data. The theoretical results are verified by an extensive simulation study. We also extend our analysis to the heterogeneous case where the population covariance matrices are different across local machines but share similar top eigenstructures.
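The two-stage scheme described in this abstract (local eigenvectors on each node, then a PCA of the aggregated information on the server) can be sketched as follows. Several choices here are assumptions rather than details taken from the abstract: the server aggregates by averaging the local projection matrices $v v^{\top}$, only the top eigenvector ($K=1$) is computed, and plain power iteration stands in for a full eigendecomposition. All function names are hypothetical.

```python
import math


def top_eigenvector(matrix, iterations=500):
    """Leading eigenvector of a symmetric PSD matrix by power iteration.

    Starts from the uniform unit vector, so it assumes that vector is not
    orthogonal to the leading eigenvector.
    """
    d = len(matrix)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def sample_covariance(rows):
    """Sample covariance matrix (divisor n) of a list of equal-length rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / n
             for j in range(d)] for i in range(d)]


def distributed_top_eigenvector(node_datasets):
    """Combine local top eigenvectors into one global estimate (K = 1)."""
    d = len(node_datasets[0][0])
    m = len(node_datasets)
    aggregate = [[0.0] * d for _ in range(d)]
    for rows in node_datasets:
        # Step 1 (on the node): local PCA, only v is sent to the server.
        v = top_eigenvector(sample_covariance(rows))
        # Step 2 (on the server): average the projection matrices v v^T,
        # which removes the arbitrary sign of each local eigenvector.
        for i in range(d):
            for j in range(d):
                aggregate[i][j] += v[i] * v[j] / m
    # Step 3 (on the server): PCA of the aggregated matrix.
    return top_eigenvector(aggregate)
```

Averaging $v v^{\top}$ rather than the vectors themselves is a deliberate choice in this sketch: eigenvectors are only defined up to sign, so averaging raw vectors from different nodes could cancel them out.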





Test for high-dimensional correlation matrices

Shurong Zheng, Guanghui Cheng, Jianhua Guo, Hongtu Zhu.

Source: The Annals of Statistics, Volume 47, Number 5, 2887--2921.

Abstract:
Testing correlation structures has attracted extensive attention in the literature due to both its importance in real applications and several major theoretical challenges. The aim of this paper is to develop a general framework of testing correlation structures for the one, two and multiple sample testing problems under a high-dimensional setting when both the sample size and data dimension go to infinity. Our test statistics are designed to deal with both the dense and sparse alternatives. We systematically investigate the asymptotic null distribution, power function and unbiasedness of each test statistic. Theoretically, we make great efforts to deal with the nonindependency of all random matrices of the sample correlation matrices. We use simulation studies and real data analysis to illustrate the versatility and practicability of our test statistics.





The middle-scale asymptotics of Wishart matrices

Didier Chételat, Martin T. Wells.

Source: The Annals of Statistics, Volume 47, Number 5, 2639--2670.

Abstract:
We study the behavior of a real $p$-dimensional Wishart random matrix with $n$ degrees of freedom when $n,p\rightarrow\infty$ but $p/n\rightarrow 0$. We establish the existence of phase transitions when $p$ grows at the order $n^{(K+1)/(K+3)}$ for every $K\in\mathbb{N}$, and derive expressions for approximating densities between every two phase transitions. To do this, we make use of a novel tool we call the $\mathcal{F}$-conjugate of an absolutely continuous distribution, which is obtained from the Fourier transform of the square root of its density. In the case of the normalized Wishart distribution, this represents an extension of the $t$-distribution to the space of real symmetric matrices.
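
As a quick worked illustration of the growth rates named above, the exponent $(K+1)/(K+3)$ can be evaluated for the first few values of $K$. The abstract does not say whether $\mathbb{N}$ starts at $0$ or $1$; the list below assumes it starts at $0$, and under the other convention the first entry simply drops out.

$$
\frac{K+1}{K+3} = 1 - \frac{2}{K+3}: \qquad K=0 \mapsto \tfrac{1}{3}, \quad K=1 \mapsto \tfrac{1}{2}, \quad K=2 \mapsto \tfrac{3}{5}, \quad K=3 \mapsto \tfrac{2}{3}, \quad \ldots
$$

The exponents increase toward $1$ without reaching it, which is consistent with the standing condition $p/n\rightarrow 0$.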





Cross validation for locally stationary processes

Stefan Richter, Rainer Dahlhaus.

Source: The Annals of Statistics, Volume 47, Number 4, 2145--2173.

Abstract:
We propose an adaptive bandwidth selector via cross validation for local M-estimators in locally stationary processes. We prove asymptotic optimality of the procedure under mild conditions on the underlying parameter curves. The results are applicable to a wide range of locally stationary processes such as linear and nonlinear processes. A simulation study shows that the method also works fairly well in misspecified situations.





A hierarchical dependent Dirichlet process prior for modelling bird migration patterns in the UK

Alex Diana, Eleni Matechou, Jim Griffin, Alison Johnston.

Source: The Annals of Applied Statistics, Volume 14, Number 1, 473--493.

Abstract:
Environmental changes in recent years have been linked to phenological shifts which in turn are linked to the survival of species. The work in this paper is motivated by capture-recapture data on blackcaps collected by the British Trust for Ornithology as part of the Constant Effort Sites monitoring scheme. Blackcaps overwinter abroad and migrate to the UK annually for breeding purposes. We propose a novel Bayesian nonparametric approach for expressing the bivariate density of individual arrival and departure times at different sites across a number of years as a mixture model. The new model combines the ideas of the hierarchical and the dependent Dirichlet process, allowing the estimation of site-specific weights and year-specific mixture locations, which are modelled as functions of environmental covariates using a multivariate extension of the Gaussian process. The proposed modelling framework is extremely general and can be used in any context where multivariate density estimation is performed jointly across different groups and in the presence of a continuous covariate.





Measuring human activity spaces from GPS data with density ranking and summary curves

Yen-Chi Chen, Adrian Dobra.

Source: The Annals of Applied Statistics, Volume 14, Number 1, 409--432.

Abstract:
Activity spaces are fundamental to the assessment of individuals’ dynamic exposure to social and environmental risk factors associated with multiple spatial contexts that are visited during activities of daily living. In this paper we survey existing approaches for measuring the geometry, size and structure of activity spaces, based on GPS data, and explain their limitations. We propose addressing these shortcomings through a nonparametric approach called density ranking and also through three summary curves: the mass-volume curve, the Betti number curve and the persistence curve. We introduce a novel mixture model for human activity spaces and study its asymptotic properties. We prove that the kernel density estimator, which is currently one of the most widespread methods for measuring activity spaces, is not a stable estimator of their structure. We illustrate the practical value of our methods with a simulation study and with a recently collected GPS dataset that comprises the locations visited by 10 individuals over a six-month period.





Modeling wildfire ignition origins in southern California using linear network point processes

Medha Uppala, Mark S. Handcock.

Source: The Annals of Applied Statistics, Volume 14, Number 1, 339--356.

Abstract:
This paper focuses on spatial and temporal modeling of point processes on linear networks. Point processes on linear networks can simply be defined as point events occurring on or near line segment network structures embedded in a certain space. A separable modeling framework is introduced that posits separate formation and dissolution models of point processes on linear networks over time. While the model was inspired by spider web building activity in brick mortar lines, the focus is on modeling wildfire ignition origins near road networks over a span of 14 years. As most wildfires in California have human-related origins, modeling the origin locations with respect to the road network provides insight into how human, vehicular and structural densities affect ignition occurrence. Model results show that roads that traverse different types of regions such as residential, interface and wildland regions have higher ignition intensities compared to roads that only exist in each of the mentioned region types.