pro

Bayesian approach for the zero-modified Poisson–Lindley regression model

Wesley Bertoli, Katiane S. Conceição, Marinho G. Andrade, Francisco Louzada.

Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 4, 826--860.

Abstract:
The primary goal of this paper is to introduce the zero-modified Poisson–Lindley regression model as an alternative to model overdispersed count data exhibiting inflation or deflation of zeros in the presence of covariates. The zero-modification is incorporated by considering that a zero-truncated process produces positive observations and consequently, the proposed model can be fitted without any previous information about the zero-modification present in a given dataset. A fully Bayesian approach based on the g-prior method has been considered for inference concerns. An intensive Monte Carlo simulation study has been conducted to evaluate the performance of the developed methodology and the maximum likelihood estimators. The proposed model was considered for the analysis of a real dataset on the number of bids received by $126$ U.S. firms between 1978–1985, and the impact of choosing different prior distributions for the regression coefficients has been studied. A sensitivity analysis to detect influential points has been performed based on the Kullback–Leibler divergence. A general comparison with some well-known regression models for discrete data has been presented.




pro

Option pricing with bivariate risk-neutral density via copula and heteroscedastic model: A Bayesian approach

Lucas Pereira Lopes, Vicente Garibay Cancho, Francisco Louzada.

Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 4, 801--825.

Abstract:
Multivariate options are adequate tools for multi-asset risk management. The pricing models derived from the pioneer Black and Scholes method under the multivariate case consider that the asset-object prices follow a Brownian geometric motion. However, the construction of such methods imposes some unrealistic constraints on the process of fair option calculation, such as constant volatility over the maturity time and linear correlation between the assets. Therefore, this paper aims to price and analyze the fair price behavior of the call-on-max (bivariate) option considering marginal heteroscedastic models with dependence structure modeled via copulas. Concerning inference, we adopt a Bayesian perspective and computationally intensive methods based on Monte Carlo simulations via Markov Chain (MCMC). A simulation study examines the bias, and the root mean squared errors of the posterior means for the parameters. Real stocks prices of Brazilian banks illustrate the approach. For the proposed method is verified the effects of strike and dependence structure on the fair price of the option. The results show that the prices obtained by our heteroscedastic model approach and copulas differ substantially from the prices obtained by the model derived from Black and Scholes. Empirical results are presented to argue the advantages of our strategy.




pro

Spatiotemporal point processes: regression, model specifications and future directions

Dani Gamerman.

Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 4, 686--705.

Abstract:
Point processes are one of the most commonly encountered observation processes in Spatial Statistics. Model-based inference for them depends on the likelihood function. In the most standard setting of Poisson processes, the likelihood depends on the intensity function, and can not be computed analytically. A number of approximating techniques have been proposed to handle this difficulty. In this paper, we review recent work on exact solutions that solve this problem without resorting to approximations. The presentation concentrates more heavily on discrete time but also considers continuous time. The solutions are based on model specifications that impose smoothness constraints on the intensity function. We also review approaches to include a regression component and different ways to accommodate it while accounting for additional heterogeneity. Applications are provided to illustrate the results. Finally, we discuss possible extensions to account for discontinuities and/or jumps in the intensity function.




pro

Hierarchical modelling of power law processes for the analysis of repairable systems with different truncation times: An empirical Bayes approach

Rodrigo Citton P. dos Reis, Enrico A. Colosimo, Gustavo L. Gilardoni.

Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 2, 374--396.

Abstract:
In the data analysis from multiple repairable systems, it is usual to observe both different truncation times and heterogeneity among the systems. Among other reasons, the latter is caused by different manufacturing lines and maintenance teams of the systems. In this paper, a hierarchical model is proposed for the statistical analysis of multiple repairable systems under different truncation times. A reparameterization of the power law process is proposed in order to obtain a quasi-conjugate bayesian analysis. An empirical Bayes approach is used to estimate model hyperparameters. The uncertainty in the estimate of these quantities are corrected by using a parametric bootstrap approach. The results are illustrated in a real data set of failure times of power transformers from an electric company in Brazil.




pro

A brief review of optimal scaling of the main MCMC approaches and optimal scaling of additive TMCMC under non-regular cases

Kushal K. Dey, Sourabh Bhattacharya.

Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 2, 222--266.

Abstract:
Transformation based Markov Chain Monte Carlo (TMCMC) was proposed by Dutta and Bhattacharya ( Statistical Methodology 16 (2014) 100–116) as an efficient alternative to the Metropolis–Hastings algorithm, especially in high dimensions. The main advantage of this algorithm is that it simultaneously updates all components of a high dimensional parameter using appropriate move types defined by deterministic transformation of a single random variable. This results in reduction in time complexity at each step of the chain and enhances the acceptance rate. In this paper, we first provide a brief review of the optimal scaling theory for various existing MCMC approaches, comparing and contrasting them with the corresponding TMCMC approaches.The optimal scaling of the simplest form of TMCMC, namely additive TMCMC , has been studied extensively for the Gaussian proposal density in Dey and Bhattacharya (2017a). Here, we discuss diffusion-based optimal scaling behavior of additive TMCMC for non-Gaussian proposal densities—in particular, uniform, Student’s $t$ and Cauchy proposals. Although we could not formally prove our diffusion result for the Cauchy proposal, simulation based results lead us to conjecture that at least the recipe for obtaining general optimal scaling and optimal acceptance rate holds for the Cauchy case as well. We also consider diffusion based optimal scaling of TMCMC when the target density is discontinuous. Such non-regular situations have been studied in the case of Random Walk Metropolis Hastings (RWMH) algorithm by Neal and Roberts ( Methodology and Computing in Applied Probability 13 (2011) 583–601) using expected squared jumping distance (ESJD), but the diffusion theory based scaling has not been considered. We compare our diffusion based optimally scaled TMCMC approach with the ESJD based optimally scaled RWM with simulation studies involving several target distributions and proposal distributions including the challenging Cauchy proposal case, showing that additive TMCMC outperforms RWMH in almost all cases considered.




pro

The equivalence of dynamic and static asset allocations under the uncertainty caused by Poisson processes

Yong-Chao Zhang, Na Zhang.

Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 1, 184--191.

Abstract:
We investigate the equivalence of dynamic and static asset allocations in the case where the price process of a risky asset is driven by a Poisson process. Under some mild conditions, we obtain a necessary and sufficient condition for the equivalence of dynamic and static asset allocations. In addition, we provide a simple sufficient condition for the equivalence.




pro

An approximate likelihood perspective on ABC methods

George Karabatsos, Fabrizio Leisen.

Source: Statistics Surveys, Volume 12, 66--104.

Abstract:
We are living in the big data era, as current technologies and networks allow for the easy and routine collection of data sets in different disciplines. Bayesian Statistics offers a flexible modeling approach which is attractive for describing the complexity of these datasets. These models often exhibit a likelihood function which is intractable due to the large sample size, high number of parameters, or functional complexity. Approximate Bayesian Computational (ABC) methods provides likelihood-free methods for performing statistical inferences with Bayesian models defined by intractable likelihood functions. The vastity of the literature on ABC methods created a need to review and relate all ABC approaches so that scientists can more readily understand and apply them for their own work. This article provides a unifying review, general representation, and classification of all ABC methods from the view of approximate likelihood theory. This clarifies how ABC methods can be characterized, related, combined, improved, and applied for future research. Possible future research in ABC is then outlined.




pro

A design-sensitive approach to fitting regression models with complex survey data

Phillip S. Kott.

Source: Statistics Surveys, Volume 12, 1--17.

Abstract:
Fitting complex survey data to regression equations is explored under a design-sensitive model-based framework. A robust version of the standard model assumes that the expected value of the difference between the dependent variable and its model-based prediction is zero no matter what the values of the explanatory variables. The extended model assumes only that the difference is uncorrelated with the covariates. Little is assumed about the error structure of this difference under either model other than independence across primary sampling units. The standard model often fails in practice, but the extended model very rarely does. Under this framework some of the methods developed in the conventional design-based, pseudo-maximum-likelihood framework, such as fitting weighted estimating equations and sandwich mean-squared-error estimation, are retained but their interpretations change. Few of the ideas here are new to the refereed literature. The goal instead is to collect those ideas and put them into a unified conceptual framework.




pro

A unified treatment for non-asymptotic and asymptotic approaches to minimax signal detection

Clément Marteau, Theofanis Sapatinas.

Source: Statistics Surveys, Volume 9, 253--297.

Abstract:
We are concerned with minimax signal detection. In this setting, we discuss non-asymptotic and asymptotic approaches through a unified treatment. In particular, we consider a Gaussian sequence model that contains classical models as special cases, such as, direct, well-posed inverse and ill-posed inverse problems. Working with certain ellipsoids in the space of squared-summable sequences of real numbers, with a ball of positive radius removed, we compare the construction of lower and upper bounds for the minimax separation radius (non-asymptotic approach) and the minimax separation rate (asymptotic approach) that have been proposed in the literature. Some additional contributions, bringing to light links between non-asymptotic and asymptotic approaches to minimax signal, are also presented. An example of a mildly ill-posed inverse problem is used for illustrative purposes. In particular, it is shown that tools used to derive ‘asymptotic’ results can be exploited to draw ‘non-asymptotic’ conclusions, and vice-versa. In order to enhance our understanding of these two minimax signal detection paradigms, we bring into light hitherto unknown similarities and links between non-asymptotic and asymptotic approaches.




pro

The theory and application of penalized methods or Reproducing Kernel Hilbert Spaces made easy

Nancy Heckman

Source: Statist. Surv., Volume 6, 113--141.

Abstract:
The popular cubic smoothing spline estimate of a regression function arises as the minimizer of the penalized sum of squares $sum_{j}(Y_{j}-mu(t_{j}))^{2}+lambda int_{a}^{b}[mu''(t)]^{2},dt$, where the data are $t_{j},Y_{j}$, $j=1,ldots,n$. The minimization is taken over an infinite-dimensional function space, the space of all functions with square integrable second derivatives. But the calculations can be carried out in a finite-dimensional space. The reduction from minimizing over an infinite dimensional space to minimizing over a finite dimensional space occurs for more general objective functions: the data may be related to the function $mu$ in another way, the sum of squares may be replaced by a more suitable expression, or the penalty, $int_{a}^{b}[mu''(t)]^{2},dt$, might take a different form. This paper reviews the Reproducing Kernel Hilbert Space structure that provides a finite-dimensional solution for a general minimization problem. Particular attention is paid to the construction and study of the Reproducing Kernel Hilbert Space corresponding to a penalty based on a linear differential operator. In this case, one can often calculate the minimizer explicitly, using Green’s functions.




pro

A survey of cross-validation procedures for model selection

Sylvain Arlot, Alain Celisse

Source: Statist. Surv., Volume 4, 40--79.

Abstract:
Used to estimate the risk of an estimator or to perform model selection, cross-validation is a widespread strategy because of its simplicity and its (apparent) universality. Many results exist on model selection performances of cross-validation procedures. This survey intends to relate these results to the most recent advances of model selection theory, with a particular emphasis on distinguishing empirical statements from rigorous theoretical results. As a conclusion, guidelines are provided for choosing the best cross-validation procedure according to the particular features of the problem in hand.




pro

Unsupervised Pre-trained Models from Healthy ADLs Improve Parkinson's Disease Classification of Gait Patterns. (arXiv:2005.02589v2 [cs.LG] UPDATED)

Application and use of deep learning algorithms for different healthcare applications is gaining interest at a steady pace. However, use of such algorithms can prove to be challenging as they require large amounts of training data that capture different possible variations. This makes it difficult to use them in a clinical setting since in most health applications researchers often have to work with limited data. Less data can cause the deep learning model to over-fit. In this paper, we ask how can we use data from a different environment, different use-case, with widely differing data distributions. We exemplify this use case by using single-sensor accelerometer data from healthy subjects performing activities of daily living - ADLs (source dataset), to extract features relevant to multi-sensor accelerometer gait data (target dataset) for Parkinson's disease classification. We train the pre-trained model using the source dataset and use it as a feature extractor. We show that the features extracted for the target dataset can be used to train an effective classification model. Our pre-trained source model consists of a convolutional autoencoder, and the target classification model is a simple multi-layer perceptron model. We explore two different pre-trained source models, trained using different activity groups, and analyze the influence the choice of pre-trained model has over the task of Parkinson's disease classification.




pro

A bimodal gamma distribution: Properties, regression model and applications. (arXiv:2004.12491v2 [stat.ME] UPDATED)

In this paper we propose a bimodal gamma distribution using a quadratic transformation based on the alpha-skew-normal model. We discuss several properties of this distribution such as mean, variance, moments, hazard rate and entropy measures. Further, we propose a new regression model with censored data based on the bimodal gamma distribution. This regression model can be very useful to the analysis of real data and could give more realistic fits than other special regression models. Monte Carlo simulations were performed to check the bias in the maximum likelihood estimation. The proposed models are applied to two real data sets found in literature.




pro

A Critical Overview of Privacy-Preserving Approaches for Collaborative Forecasting. (arXiv:2004.09612v3 [cs.LG] UPDATED)

Cooperation between different data owners may lead to an improvement in forecast quality - for instance by benefiting from spatial-temporal dependencies in geographically distributed time series. Due to business competitive factors and personal data protection questions, said data owners might be unwilling to share their data, which increases the interest in collaborative privacy-preserving forecasting. This paper analyses the state-of-the-art and unveils several shortcomings of existing methods in guaranteeing data privacy when employing Vector Autoregressive (VAR) models. The paper also provides mathematical proofs and numerical analysis to evaluate existing privacy-preserving methods, dividing them into three groups: data transformation, secure multi-party computations, and decomposition methods. The analysis shows that state-of-the-art techniques have limitations in preserving data privacy, such as a trade-off between privacy and forecasting accuracy, while the original data in iterative model fitting processes, in which intermediate results are shared, can be inferred after some iterations.




pro

Deep transfer learning for improving single-EEG arousal detection. (arXiv:2004.05111v2 [cs.CV] UPDATED)

Datasets in sleep science present challenges for machine learning algorithms due to differences in recording setups across clinics. We investigate two deep transfer learning strategies for overcoming the channel mismatch problem for cases where two datasets do not contain exactly the same setup leading to degraded performance in single-EEG models. Specifically, we train a baseline model on multivariate polysomnography data and subsequently replace the first two layers to prepare the architecture for single-channel electroencephalography data. Using a fine-tuning strategy, our model yields similar performance to the baseline model (F1=0.682 and F1=0.694, respectively), and was significantly better than a comparable single-channel model. Our results are promising for researchers working with small databases who wish to use deep learning models pre-trained on larger databases.




pro

Risk-Aware Energy Scheduling for Edge Computing with Microgrid: A Multi-Agent Deep Reinforcement Learning Approach. (arXiv:2003.02157v2 [physics.soc-ph] UPDATED)

In recent years, multi-access edge computing (MEC) is a key enabler for handling the massive expansion of Internet of Things (IoT) applications and services. However, energy consumption of a MEC network depends on volatile tasks that induces risk for energy demand estimations. As an energy supplier, a microgrid can facilitate seamless energy supply. However, the risk associated with energy supply is also increased due to unpredictable energy generation from renewable and non-renewable sources. Especially, the risk of energy shortfall is involved with uncertainties in both energy consumption and generation. In this paper, we study a risk-aware energy scheduling problem for a microgrid-powered MEC network. First, we formulate an optimization problem considering the conditional value-at-risk (CVaR) measurement for both energy consumption and generation, where the objective is to minimize the loss of energy shortfall of the MEC networks and we show this problem is an NP-hard problem. Second, we analyze our formulated problem using a multi-agent stochastic game that ensures the joint policy Nash equilibrium, and show the convergence of the proposed model. Third, we derive the solution by applying a multi-agent deep reinforcement learning (MADRL)-based asynchronous advantage actor-critic (A3C) algorithm with shared neural networks. This method mitigates the curse of dimensionality of the state space and chooses the best policy among the agents for the proposed problem. Finally, the experimental results establish a significant performance gain by considering CVaR for high accuracy energy scheduling of the proposed model than both the single and random agent models.




pro

Predictive Modeling of ICU Healthcare-Associated Infections from Imbalanced Data. Using Ensembles and a Clustering-Based Undersampling Approach. (arXiv:2005.03582v1 [cs.LG])

Early detection of patients vulnerable to infections acquired in the hospital environment is a challenge in current health systems given the impact that such infections have on patient mortality and healthcare costs. This work is focused on both the identification of risk factors and the prediction of healthcare-associated infections in intensive-care units by means of machine-learning methods. The aim is to support decision making addressed at reducing the incidence rate of infections. In this field, it is necessary to deal with the problem of building reliable classifiers from imbalanced datasets. We propose a clustering-based undersampling strategy to be used in combination with ensemble classifiers. A comparative study with data from 4616 patients was conducted in order to validate our proposal. We applied several single and ensemble classifiers both to the original dataset and to data preprocessed by means of different resampling methods. The results were analyzed by means of classic and recent metrics specifically designed for imbalanced data classification. They revealed that the proposal is more efficient in comparison with other approaches.




pro

Sequential Aggregation of Probabilistic Forecasts -- Applicaton to Wind Speed Ensemble Forecasts. (arXiv:2005.03540v1 [stat.AP])

In the field of numerical weather prediction (NWP), the probabilistic distribution of the future state of the atmosphere is sampled with Monte-Carlo-like simulations, called ensembles. These ensembles have deficiencies (such as conditional biases) that can be corrected thanks to statistical post-processing methods. Several ensembles exist and may be corrected with different statistiscal methods. A further step is to combine these raw or post-processed ensembles. The theory of prediction with expert advice allows us to build combination algorithms with theoretical guarantees on the forecast performance. This article adapts this theory to the case of probabilistic forecasts issued as step-wise cumulative distribution functions (CDF). The theory is applied to wind speed forecasting, by combining several raw or post-processed ensembles, considered as CDFs. The second goal of this study is to explore the use of two forecast performance criteria: the Continous ranked probability score (CRPS) and the Jolliffe-Primo test. Comparing the results obtained with both criteria leads to reconsidering the usual way to build skillful probabilistic forecasts, based on the minimization of the CRPS. Minimizing the CRPS does not necessarily produce reliable forecasts according to the Jolliffe-Primo test. The Jolliffe-Primo test generally selects reliable forecasts, but could lead to issuing suboptimal forecasts in terms of CRPS. It is proposed to use both criterion to achieve reliable and skillful probabilistic forecasts.




pro

Adaptive Invariance for Molecule Property Prediction. (arXiv:2005.03004v1 [q-bio.QM])

Effective property prediction methods can help accelerate the search for COVID-19 antivirals either through accurate in-silico screens or by effectively guiding on-going at-scale experimental efforts. However, existing prediction tools have limited ability to accommodate scarce or fragmented training data currently available. In this paper, we introduce a novel approach to learn predictors that can generalize or extrapolate beyond the heterogeneous data. Our method builds on and extends recently proposed invariant risk minimization, adaptively forcing the predictor to avoid nuisance variation. We achieve this by continually exercising and manipulating latent representations of molecules to highlight undesirable variation to the predictor. To test the method we use a combination of three data sources: SARS-CoV-2 antiviral screening data, molecular fragments that bind to SARS-CoV-2 main protease and large screening data for SARS-CoV-1. Our predictor outperforms state-of-the-art transfer learning methods by significant margin. We also report the top 20 predictions of our model on Broad drug repurposing hub.




pro

Vertebrate and invertebrate respiratory proteins, lipoproteins and other body fluid proteins

9783030417697 (electronic bk.)




pro

Tissue engineering : principles, protocols, and practical exercises

9783030396985




pro

Theranostics approaches to gastric and colon cancer

9789811520174 (electronic bk.)




pro

The public policy primer : managing the policy process

Wu, Xun, author.
9781315624754 (electronic bk.)




pro

The complexity of bird behaviour : a facet theory approach

Hackett, Paul, 1960- author
9783030121921 (electronic bk.)




pro

Temporomandibular disorders : a translational approach from basic science to clinical applicability

9783319572475 (electronic bk.)




pro

Systems approaches to making change : a practical guide

9781447174721 (electronic bk.)




pro

Sustainable digital communities : 15th International Conference, iConference 2020, Boras, Sweden, March 23–26, 2020, Proceedings

iConference (Conference) (15th : 2020 : Boras, Sweden)
9783030436872




pro

Salt, fat and sugar reduction : sensory approaches for nutritional reformulation of foods and beverages

O'Sullivan, Maurice G., author
9780128226124 (electronic bk.)




pro

Requirements engineering : 26th International Working Conference, REFSQ 2020, Pisa, Italy, March 24-27, 2020, Proceedings

REFSQ (Conference) (26th : 2020 : Pisa, Italy)
9783030444297




pro

Radiomics and radiogenomics in neuro-oncology : First International Workshop, RNO-AI 2019, held in conjunction with MICCAI 2019, Shenzhen, China, October 13, proceedings

Radiomics and Radiogenomics in Neuro-oncology using AI Workshop (1st : 2019 : Shenzhen Shi, China)
9783030401245




pro

Progress in botany.

9783030363277 (electronic bk.)




pro

Pediatric pelvic and proximal femoral osteotomies

9783319780337 978-3-319-78033-7




pro

Passive and active measurement : 21st International Conference, PAM 2020, Eugene, Oregon, USA, March 30-31, 2020, Proceedings

PAM (Conference) (21st : 2020 : Eugene, Oregon)
9783030440817




pro

Oral rehabilitation for compromised and elderly patients

3319761293 (electronic book)




pro

Natural materials and products from insects : chemistry and applications

9783030366100 (electronic bk.)




pro

Milk proteins : from expression to food

9780128152522 (electronic bk.)




pro

Microbial endophytes : prospects for sustainable agriculture

0128187255




pro

Microalgae biotechnology for food, health and high value products

9789811501692 (electronic bk.)




pro

LGBTQ cultures : what health care professionals need to know about sexual and gender diversity

Eliason, Michele J., author.
9781496394606 paperback




pro

Invertebrate embryology and reproduction

El-Bawab, Fatma, author.
9780128141151 (electronic bk.)




pro

Information retrieval technology : 15th Asia Information Retrieval Societies Conference, AIRS 2019, Hong Kong, China, November 7-9, 2019, proceedings

Asia Information Retrieval Societies Conference (15th : 2019 : Hong Kong, China)
9783030428358




pro

Handbook of geotechnical testing : basic theory, procedures and comparison of standards

Li, Yanrong (Writer on geology), author.
0429323743 electronic book




pro

Green food processing techniques : preservation, transformation and extraction

9780128153536




pro

Geriatric Medicine : a Problem-Based Approach

9789811032530




pro

Genetic and metabolic engineering for improved biofuel production from lignocellulosic biomass

9780128179543 (electronic bk.)




pro

Functional and preservative properties of phytochemicals

9780128196861 (electronic bk.)




pro

Drying atlas : drying kinetics and quality of agricultural products

Mühlbauer, Werner, author
9780128181638 (electronic bk.)




pro

Development of biopharmaceutical drug-device products

9783030314156 (electronic bk.)




pro

Cullin-RING ligases and protein neddylation : biology and therapeutics

9789811510250 (electronic bk.)




pro

Cotton production and uses : agronomy, crop protection, and postharvest technologies

9789811514722