regression A Matter of Progresssion—or Regression By www.ancientfaith.com Published On :: 2021-11-01T12:40:20+00:00 Preaching on the Parable of the Rich Man and Lazarus (Luke 16:19-31), Fr. Pat urges us to always maintain a proper perspective. Full Article
regression Dao Day 2024 – a regression in the making By clagnut.com Published On :: Sun, 07 Apr 2024 13:49:09 PST It’s twenty-four years to the day since A List Apart published John Allsopp’s seminal treatise A Dao of Web Design. It must be one of the most vital and most-cited articles ever written about web design. In it John quoted the Tao Te Ching as a way of persuading us web designers to be like The Sage and “accept the ebb and flow of things”. John compared the nature of print with the web: The fact we can control a paper page is really a limitation of that medium. You can think – we can fix the size of text – or you can think – the size of text is unalterable. You can think – the dimensions of a page can be controlled – or – the dimensions of a page can’t be altered. These are simply facts of the medium. And they aren’t necessarily good facts, especially for the reader. We should embrace the fact that the web doesn’t have the same constraints, and design for this flexibility. Those demands for flexibility led – 10 years later – to responsive web design as a best practice, and on to the present concept of fluid design. However, we’re currently battling against another regression. As John himself wrote recently, “having escaped the gravity well of web pages being ‘print, only onscreen’, they became ‘apps, only in the browser’”. The better way of doing things will win out. Why? Because more people benefit from the accessible outcomes of fluid design, and it comes with lower design and technical debt, even if the initial effort is higher. Meanwhile plus ça change, plus c’est la même chose, or as Lao Tse wrote 2,500 years ago, “Well established hierarchies are not easily uprooted. So ritual enthrals generation after generation.” Read or add comments Full Article Web standards CSS techniques
regression A New Typology Design of Performance Metrics to Measure Errors in Machine Learning Regression Algorithms By Published On :: 2019-01-24 Aim/Purpose: The aim of this study was to analyze various performance metrics and approaches to their classification. The main goal of the study was to develop a new typology that will help to advance knowledge of metrics and facilitate their use in machine learning regression algorithms. Background: Performance metrics (error measures) are vital components of the evaluation frameworks in various fields. A performance metric can be defined as a logical and mathematical construct designed to measure how close the actual results are to what was expected or predicted. A vast variety of performance metrics have been described in the academic literature. The most commonly mentioned metrics in research studies are Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), etc. Knowledge about the properties of metrics needs to be systematized to simplify the design and use of the metrics. Methodology: A qualitative study was conducted to achieve the objectives, drawing on related peer-reviewed research studies and literature reviews, critical thinking, and inductive reasoning. Contribution: The main contribution of this paper is in ordering knowledge of performance metrics and enhancing understanding of their structure and properties by proposing a new typology, a generic mathematical formula for primary metrics, and a visualization chart. Findings: Based on the analysis of the structure of numerous performance metrics, we proposed a framework of metrics which includes four (4) categories: primary metrics, extended metrics, composite metrics, and hybrid sets of metrics. The paper identified three (3) key components (dimensions) that determine the structure and properties of primary metrics: the method of determining point distance, the method of normalization, and the method of aggregating point distances over a data set. For each component, implementation options have been identified. The suggested new typology has been shown to cover a total of over 40 commonly used primary metrics. Recommendations for Practitioners: The presented findings can be used to facilitate the teaching of performance metrics to university students and to expedite metric selection and implementation processes for practitioners. Recommendation for Researchers: By using the proposed typology, researchers can streamline the development of new metrics with predetermined properties. Impact on Society: The outcomes of this study could be used to improve evaluation results in machine learning regression, forecasting, and prognostics, with direct or indirect positive impacts on innovation and productivity in a societal sense. Future Research: Future research is needed to examine the properties of the extended metrics, composite metrics, and hybrid sets of metrics. An empirical study of the metrics is needed, using RStudio or Azure Machine Learning Studio, to find associations between the properties of primary metrics and their “numerical” behavior across a wide spectrum of data characteristics and business or research requirements. Full Article
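As a rough illustration of the two primary metrics named above, here is a minimal sketch in Python (NumPy assumed; the data values are hypothetical and not taken from the study):

```python
import numpy as np

def mean_absolute_error(y_true, y_pred):
    """MAE: average absolute point distance between actual and predicted values."""
    return np.mean(np.abs(y_true - y_pred))

def root_mean_squared_error(y_true, y_pred):
    """RMSE: square root of the average squared point distance."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

# Hypothetical actual and predicted values, for illustration only.
y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 4.0, 8.0])

print(mean_absolute_error(y_true, y_pred))      # 0.75
print(root_mean_squared_error(y_true, y_pred))  # ~0.935
```

Both metrics aggregate point distances over the data set and differ only in the distance and aggregation choices, which is exactly the kind of structural component the proposed typology isolates.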
regression Data Quality in Linear Regression Models: Effect of Errors in Test Data and Errors in Training Data on Predictive Accuracy By Published On :: Full Article
regression Which book should you read first, Active Statistics or Regression and Other Stories? By statmodeling.stat.columbia.edu Published On :: Wed, 23 Oct 2024 13:25:40 +0000 Kiran Gauthier writes: I was checking the web pages for Active Statistics and Regression and Other Stories and although I saw that Active Statistics is meant to accompany Regression and Other Stories, I was wondering how you would recommend reading … Continue reading → Full Article Bayesian Statistics Literature Miscellaneous Statistics Teaching
regression A Regression to Politics? Recent Court Decisions Could Give Partisanship Even More Influence at the NLRB By www.littler.com Published On :: Mon, 19 Aug 2024 21:14:27 +0000 Alex MacDonald discusses recent court decisions that criticized the way the NLRB operates and that could transform American labor law. Washington Legal Foundation View Full Article
regression Using Simple Linear Regression For Instrument Calibration? By www.qualitymag.com Published On :: Sat, 08 Jan 2022 00:00:00 -0500 Measurement devices must be calibrated regularly to ensure they perform their jobs properly. While calibration covers a wide range of applications and scenarios, the goal is simple: ensure your device is measuring to your standards. Full Article
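The approach described reduces to fitting a straight line between reference standards and instrument readings and then inverting it to correct new readings; a minimal sketch under that assumption, with hypothetical values rather than the article's data:

```python
import numpy as np

# Hypothetical calibration data: known reference standards vs. instrument readings.
reference = np.array([0.0, 10.0, 20.0, 30.0, 40.0])   # certified standard values
reading   = np.array([0.3, 10.1, 20.4, 29.8, 40.5])   # what the device reports

# Ordinary least-squares fit: reading = intercept + slope * reference.
slope, intercept = np.polyfit(reference, reading, deg=1)

# Correct a new raw reading by inverting the fitted calibration line.
raw = 25.2
corrected = (raw - intercept) / slope
print(f"slope={slope:.4f}, intercept={intercept:.4f}, corrected={corrected:.2f}")
```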
regression Estimating Premorbid Intelligence: Regression Equations - AssessmentPsychology.com By www.assessmentpsychology.com Published On :: Tue, 30 Sep 2008 04:00:00 UTC Regression equations for estimating premorbid intelligence. Full Article
regression Teenage regression By thebirminghampress.com Published On :: Tue, 21 Nov 2023 12:06:57 +0000 Joe Costello is at Birmingham Town Hall watching Teenage Fanclub. Full Article Music Reviews Birmingham Town Hall Teenage Fanclub
regression Getting Started with Python Integration to SAS Viya for Predictive Modeling - Comparing Logistic Regression and Decision Tree By blogs.sas.com Published On :: Mon, 08 Apr 2024 15:41:47 +0000 Comparing Logistic Regression and Decision Tree - Which of our models is better at predicting our outcome? Learn how to compare models using misclassification, area under the curve (ROC) charts, and lift charts with validation data. In part 6 and part 7 of this series we fit a logistic regression [...] Getting Started with Python Integration to SAS Viya for Predictive Modeling - Comparing Logistic Regression and Decision Tree was published on SAS Users. Full Article Tech CAS CAS Actions data management Developers Models Programming Tips Python SAS Viya
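The post itself works through SAS Viya actions from Python; as a hedged stand-in, the same comparison can be sketched with scikit-learn on synthetic data, using the misclassification rate and ROC AUC mentioned above (lift charts omitted for brevity):

```python
# Illustrative comparison with scikit-learn rather than SAS Viya actions.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, roc_auc_score

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(max_depth=5, random_state=0),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_valid)          # hard labels for misclassification
    prob = model.predict_proba(X_valid)[:, 1]  # scores for the ROC curve
    print(name,
          "misclassification:", round(1 - accuracy_score(y_valid, pred), 3),
          "AUC:", round(roc_auc_score(y_valid, prob), 3))
```

As in the post, the comparison is made on held-out validation data so that neither model is judged on the observations it was trained on.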
regression Navigating predictions at nanoscale: a comprehensive study of regression models in magnetic nanoparticle synthesis By pubs.rsc.org Published On :: J. Mater. Chem. B, 2024, Advance Article. DOI: 10.1039/D4TB02052A, Paper Open Access. This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. Lukas Glänzer, Lennart Göpfert, Thomas Schmitz-Rode, Ioana Slabu. The transformative power of support vector regression in optimizing magnetic nanoparticle synthesis: it captures the intricate relationships between process parameters and particle size, enabling the production of particles with tailored properties. To cite this article before page numbers are assigned, use the DOI form of citation above. The content of this RSS Feed (c) The Royal Society of Chemistry Full Article
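As a loose illustration of the regression task described, and not the authors' pipeline, a support vector regression fit relating hypothetical process parameters to particle size might look like this (scikit-learn assumed, synthetic data):

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(5)
# Hypothetical process parameters: temperature, flow rate, precursor concentration.
X = rng.uniform([60, 0.5, 0.01], [90, 2.0, 0.10], size=(200, 3))
# Synthetic particle sizes (nm) with noise; the true relationship here is made up.
size_nm = 5 + 0.2 * X[:, 0] - 3.0 * X[:, 1] + 80 * X[:, 2] + rng.normal(scale=1.0, size=200)

# Scale inputs, then fit an RBF-kernel support vector regressor.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.5))
model.fit(X, size_nm)
print("predicted size (nm):", model.predict([[75, 1.2, 0.05]]))
```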
regression Structural Interpretation of Vector Autoregressions with Incomplete Information: Revisiting the Role of Oil Supply and Demand Shocks: Comment [electronic journal]. By encore.st-andrews.ac.uk Published On :: Full Article
regression Structural Interpretation of Vector Autoregressions with Incomplete Identification: Revisiting the Role of Oil Supply and Demand Shocks [electronic journal]. By encore.st-andrews.ac.uk Published On :: National Bureau of Economic Research Full Article
regression The Response to Dynamic Incentives in Insurance Contracts with a Deductible: Evidence from a Differences-in-Regression-Discontinuities Design [electronic journal]. By encore.st-andrews.ac.uk Published On :: Full Article
regression Linear IV Regression Estimators for Structural Dynamic Discrete Choice Models [electronic journal]. By encore.st-andrews.ac.uk Published On :: National Bureau of Economic Research Full Article
regression Inference in Structural Vector Autoregressions When the Identifying Assumptions are Not Fully Believed: Re-evaluating the Role of Monetary Policy in Economic Fluctuations [electronic journal]. By encore.st-andrews.ac.uk Published On :: National Bureau of Economic Research Full Article
regression Historical Econometrics: Instrumental Variables and Regression Discontinuity Designs [electronic journal]. By encore.st-andrews.ac.uk Published On :: Full Article
regression Gaussian rank correlation and regression [electronic journal]. By encore.st-andrews.ac.uk Published On :: Full Article
regression Drawing Conclusions from Structural Vector Autoregressions Identified on the Basis of Sign Restrictions [electronic journal]. By encore.st-andrews.ac.uk Published On :: National Bureau of Economic Research Full Article
regression Direct Standard Errors for Regressions with Spatially Autocorrelated Residuals [electronic journal]. By encore.st-andrews.ac.uk Published On :: Full Article
regression Advances in Structural Vector Autoregressions with Imperfect Identifying Information [electronic journal]. By encore.st-andrews.ac.uk Published On :: National Bureau of Economic Research Full Article
regression Understanding regression analysis [electronic resource] / Michael Patrick Allen By darius.uleth.ca Published On :: New York : Plenum Press, 1997 Full Article
regression We need better default plots for regression. By statmodeling.stat.columbia.edu Published On :: Thu, 07 May 2020 13:41:51 +0000 Robin Lee writes: To check for linearity and homoscedasticity, we are taught in many statistics classes to plot residuals against fitted values. However, plotting residuals against fitted values has always been a confusing practice that I know I should use but can’t quite explain why. It is not until this week I […] Full Article Statistical computing Statistical graphics
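For readers who have not drawn the plot in question, a minimal sketch of that default diagnostic with synthetic data (NumPy and Matplotlib assumed):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y = 1.5 + 2.0 * x + rng.normal(scale=1.0, size=200)   # synthetic linear data

# Ordinary least-squares fit.
slope, intercept = np.polyfit(x, y, 1)
fitted = intercept + slope * x
residuals = y - fitted

# The default diagnostic under discussion: residuals against fitted values.
plt.scatter(fitted, residuals, s=10)
plt.axhline(0, color="grey", linewidth=1)
plt.xlabel("fitted values")
plt.ylabel("residuals")
plt.title("Residuals vs. fitted values")
plt.show()
```

Curvature in this cloud suggests nonlinearity, and a funnel shape suggests heteroscedasticity, which is why the plot is taught even when its rationale is rarely spelled out.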
regression Lifted Regression/Reconstruction Networks. (arXiv:2005.03452v1 [cs.LG]) By arxiv.org Published On :: In this work we propose lifted regression/reconstruction networks (LRRNs), which combine lifted neural networks with a guaranteed Lipschitz continuity property for the output layer. Lifted neural networks explicitly optimize an energy model to infer the unit activations and therefore, in contrast to standard feed-forward neural networks, allow bidirectional feedback between layers. So far lifted neural networks have been modelled around standard feed-forward architectures. We propose to take further advantage of the feedback property by letting the layers simultaneously perform regression and reconstruction. The resulting lifted network architecture allows control of the desired amount of Lipschitz continuity, which is an important feature to obtain adversarially robust regression and classification methods. We analyse and numerically demonstrate applications for unsupervised and supervised learning. Full Article
regression Regression Forest-Based Atlas Localization and Direction Specific Atlas Generation for Pancreas Segmentation. (arXiv:2005.03345v1 [cs.CV]) By arxiv.org Published On :: This paper proposes a fully automated atlas-based pancreas segmentation method from CT volumes utilizing atlas localization by regression forest and atlas generation using blood vessel information. Previous probabilistic atlas-based pancreas segmentation methods cannot deal with spatial variations that are commonly found in the pancreas well. Also, shape variations are not represented by an averaged atlas. We propose a fully automated pancreas segmentation method that deals with two types of variations mentioned above. The position and size of the pancreas is estimated using a regression forest technique. After localization, a patient-specific probabilistic atlas is generated based on a new image similarity that reflects the blood vessel position and direction information around the pancreas. We segment it using the EM algorithm with the atlas as prior followed by the graph-cut. In evaluation results using 147 CT volumes, the Jaccard index and the Dice overlap of the proposed method were 62.1% and 75.1%, respectively. Although we automated all of the segmentation processes, segmentation results were superior to the other state-of-the-art methods in the Dice overlap. Full Article
regression Orthogonal regression method for observations from a mixture By www.ams.org Published On :: Mon, 02 Mar 2020 06:58 EST R. E. Maĭboroda, G. V. Navara and O. V. Sugakova Theor. Probability and Math. Statist. 99 (2020), 169-188. Abstract, references and article information Full Article
regression Minimax estimators of parameters of a regression model By www.ams.org Published On :: Mon, 02 Mar 2020 06:58 EST A. V. Ivanov and I. K. Matsak Theor. Probability and Math. Statist. 99 (2020), 91-99. Abstract, references and article information Full Article
regression The limiting behavior of isotonic and convex regression estimators when the model is misspecified By projecteuclid.org Published On :: Tue, 05 May 2020 22:00 EDT Eunji Lim. Source: Electronic Journal of Statistics, Volume 14, Number 1, 2053--2097. Abstract: We study the asymptotic behavior of the least squares estimators when the model is possibly misspecified. We consider the setting where we wish to estimate an unknown function $f_{*}:(0,1)^{d}\rightarrow \mathbb{R}$ from observations $(X,Y),(X_{1},Y_{1}),\cdots,(X_{n},Y_{n})$; our estimator $\hat{g}_{n}$ is the minimizer of $\sum_{i=1}^{n}(Y_{i}-g(X_{i}))^{2}/n$ over $g\in \mathcal{G}$ for some set of functions $\mathcal{G}$. We provide sufficient conditions on the metric entropy of $\mathcal{G}$, under which $\hat{g}_{n}$ converges to $g_{*}$ as $n\rightarrow \infty$, where $g_{*}$ is the minimizer of $|g-f_{*}| \triangleq \mathbb{E}(g(X)-f_{*}(X))^{2}$ over $g\in \mathcal{G}$. As corollaries of our theorem, we establish $|\hat{g}_{n}-g_{*}| \rightarrow 0$ as $n\rightarrow \infty$ when $\mathcal{G}$ is the set of monotone functions or the set of convex functions. We also make a connection between the convergence rate of $|\hat{g}_{n}-g_{*}|$ and the metric entropy of $\mathcal{G}$. As special cases of our finding, we compute the convergence rate of $|\hat{g}_{n}-g_{*}|^{2}$ when $\mathcal{G}$ is the set of bounded monotone functions or the set of bounded convex functions. Full Article
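In the monotone corollary, the least-squares estimator over $\mathcal{G}$ is ordinary isotonic regression; a one-dimensional sketch with synthetic data (scikit-learn assumed), purely to make the estimator concrete:

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(1)
X = np.sort(rng.uniform(0, 1, 100))
Y = np.sqrt(X) + rng.normal(scale=0.1, size=100)   # monotone signal plus noise

# ghat_n: the least-squares fit over the class of monotone (isotonic) functions.
iso = IsotonicRegression(increasing=True)
g_hat = iso.fit_transform(X, Y)

# Distance from the fit to the true monotone function at the design points.
mse = np.mean((g_hat - np.sqrt(X)) ** 2)
print(round(mse, 4))
```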
regression Estimation of linear projections of non-sparse coefficients in high-dimensional regression By projecteuclid.org Published On :: Mon, 27 Apr 2020 22:02 EDT David Azriel, Armin Schwartzman. Source: Electronic Journal of Statistics, Volume 14, Number 1, 174--206. Abstract: In this work we study estimation of signals when the number of parameters is much larger than the number of observations. A large body of literature assumes for these kinds of problems a sparse structure where most of the parameters are zero or close to zero. When this assumption does not hold, one can focus on low-dimensional functions of the parameter vector. In this work we study one-dimensional linear projections. Specifically, in the context of high-dimensional linear regression, the parameter of interest is ${\boldsymbol{\beta}}$ and we study estimation of $\mathbf{a}^{T}{\boldsymbol{\beta}}$. We show that $\mathbf{a}^{T}\hat{\boldsymbol{\beta}}$, where $\hat{\boldsymbol{\beta}}$ is the least squares estimator, using the pseudo-inverse when $p>n$, is minimax and admissible. Thus, for linear projections no regularization or shrinkage is needed. This estimator is easy to analyze and confidence intervals can be constructed. We study a high-dimensional dataset from brain imaging where it is shown that the signal is weak, non-sparse and significantly different from zero. Full Article
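A minimal sketch of the estimator described in the abstract, the minimum-norm least-squares fit via the Moore-Penrose pseudo-inverse in a synthetic $p>n$ setting (NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 50, 200                      # p > n: high-dimensional regime
X = rng.normal(size=(n, p))
beta = rng.normal(size=p)           # non-sparse coefficient vector
y = X @ beta + rng.normal(size=n)

# Least-squares estimate via the pseudo-inverse (minimum-norm solution when p > n).
beta_hat = np.linalg.pinv(X) @ y

a = rng.normal(size=p)              # direction defining the one-dimensional projection
print("estimate:", a @ beta_hat, " target:", a @ beta)
```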
regression Adaptive estimation in the supremum norm for semiparametric mixtures of regressions By projecteuclid.org Published On :: Thu, 23 Apr 2020 22:01 EDT Heiko Werner, Hajo Holzmann, Pierre Vandekerkhove. Source: Electronic Journal of Statistics, Volume 14, Number 1, 1816--1871.Abstract: We investigate a flexible two-component semiparametric mixture of regressions model, in which one of the conditional component distributions of the response given the covariate is unknown but assumed symmetric about a location parameter, while the other is specified up to a scale parameter. The location and scale parameters together with the proportion are allowed to depend nonparametrically on covariates. After settling identifiability, we provide local M-estimators for these parameters which converge in the sup-norm at the optimal rates over Hölder-smoothness classes. We also introduce an adaptive version of the estimators based on the Lepski-method. Sup-norm bounds show that the local M-estimator properly estimates the functions globally, and are the first step in the construction of useful inferential tools such as confidence bands. In our analysis we develop general results about rates of convergence in the sup-norm as well as adaptive estimation of local M-estimators which might be of some independent interest, and which can also be applied in various other settings. We investigate the finite-sample behaviour of our method in a simulation study, and give an illustration to a real data set from bioinformatics. Full Article
regression Efficient estimation in expectile regression using envelope models By projecteuclid.org Published On :: Thu, 23 Apr 2020 22:01 EDT Tuo Chen, Zhihua Su, Yi Yang, Shanshan Ding. Source: Electronic Journal of Statistics, Volume 14, Number 1, 143--173.Abstract: As a generalization of the classical linear regression, expectile regression (ER) explores the relationship between the conditional expectile of a response variable and a set of predictor variables. ER with respect to different expectile levels can provide a comprehensive picture of the conditional distribution of the response variable given the predictors. We adopt an efficient estimation method called the envelope model ([8]) in ER, and construct a novel envelope expectile regression (EER) model. Estimation of the EER parameters can be performed using the generalized method of moments (GMM). We establish the consistency and derive the asymptotic distribution of the EER estimators. In addition, we show that the EER estimators are asymptotically more efficient than the ER estimators. Numerical experiments and real data examples are provided to demonstrate the efficiency gains attained by EER compared to ER, and the efficiency gains can further lead to improvements in prediction. Full Article
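For readers unfamiliar with expectiles, a scalar $\tau$-expectile can be computed by iteratively reweighted means; a minimal sketch (NumPy assumed), separate from the envelope and GMM machinery of the paper:

```python
import numpy as np

def expectile(y, tau=0.5, tol=1e-10, max_iter=100):
    """Scalar tau-expectile: argmin_m sum_i |tau - 1{y_i < m}| * (y_i - m)^2,
    solved by iteratively reweighted means."""
    m = y.mean()
    for _ in range(max_iter):
        w = np.where(y > m, tau, 1.0 - tau)   # asymmetric squared-loss weights
        m_new = np.sum(w * y) / np.sum(w)
        if abs(m_new - m) < tol:
            break
        m = m_new
    return m

y = np.random.default_rng(3).normal(size=1000)
print(expectile(y, tau=0.5))   # coincides with the mean
print(expectile(y, tau=0.9))   # upper-tail expectile
```

At $\tau=0.5$ the expectile is the ordinary mean, which is why ER generalizes classical linear regression in the way the abstract describes.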
regression Posterior contraction and credible sets for filaments of regression functions By projecteuclid.org Published On :: Tue, 14 Apr 2020 22:01 EDT Wei Li, Subhashis Ghosal. Source: Electronic Journal of Statistics, Volume 14, Number 1, 1707--1743. Abstract: A filament consists of local maximizers of a smooth function $f$ when moving in a certain direction. A filamentary structure is an important feature of the shape of an object and is also considered as an important lower dimensional characterization of multivariate data. There have been some recent theoretical studies of filaments in the nonparametric kernel density estimation context. This paper supplements the current literature in two ways. First, we provide a Bayesian approach to the filament estimation in regression context and study the posterior contraction rates using a finite random series of B-splines basis. Compared with the kernel-estimation method, this has a theoretical advantage as the bias can be better controlled when the function is smoother, which allows obtaining better rates. Assuming that $f:\mathbb{R}^{2}\mapsto \mathbb{R}$ belongs to an isotropic Hölder class of order $\alpha \geq 4$, with the optimal choice of smoothing parameters, the posterior contraction rates for the filament points on some appropriately defined integral curves and for the Hausdorff distance of the filament are both $(n/\log n)^{(2-\alpha)/(2(1+\alpha))}$. Secondly, we provide a way to construct a credible set with sufficient frequentist coverage for the filaments. We demonstrate the success of our proposed method in simulations and one application to earthquake data. Full Article
regression Nonconcave penalized estimation in sparse vector autoregression model By projecteuclid.org Published On :: Wed, 01 Apr 2020 04:00 EDT Xuening Zhu. Source: Electronic Journal of Statistics, Volume 14, Number 1, 1413--1448. Abstract: High-dimensional time series have received considerable attention recently; their temporal and cross-sectional dependency can be captured by the vector autoregression (VAR) model. To tackle the high dimensionality, penalization methods are widely employed. However, the existing theoretical studies of penalization methods mainly focus on $i.i.d.$ data and therefore cannot quantify the effect of the dependence level on the convergence rate. In this work, we use the spectral properties of the time series to quantify the dependence and derive a nonasymptotic upper bound for the estimation errors. By focusing on nonconcave penalization methods, we establish the oracle properties of the penalized VAR model estimation while accounting for the effects of temporal and cross-sectional dependence. Extensive numerical studies are conducted to compare the finite-sample performance of different penalization functions. Lastly, an air pollution data set from mainland China is analyzed for illustration purposes. Full Article
regression A fast and consistent variable selection method for high-dimensional multivariate linear regression with a large number of explanatory variables By projecteuclid.org Published On :: Fri, 27 Mar 2020 22:00 EDT Ryoya Oda, Hirokazu Yanagihara. Source: Electronic Journal of Statistics, Volume 14, Number 1, 1386--1412.Abstract: We put forward a variable selection method for selecting explanatory variables in a normality-assumed multivariate linear regression. It is cumbersome to calculate variable selection criteria for all subsets of explanatory variables when the number of explanatory variables is large. Therefore, we propose a fast and consistent variable selection method based on a generalized $C_{p}$ criterion. The consistency of the method is provided by a high-dimensional asymptotic framework such that the sample size and the sum of the dimensions of response vectors and explanatory vectors divided by the sample size tend to infinity and some positive constant which are less than one, respectively. Through numerical simulations, it is shown that the proposed method has a high probability of selecting the true subset of explanatory variables and is fast under a moderate sample size even when the number of dimensions is large. Full Article
regression The bias of isotonic regression By projecteuclid.org Published On :: Tue, 04 Feb 2020 22:03 EST Ran Dai, Hyebin Song, Rina Foygel Barber, Garvesh Raskutti. Source: Electronic Journal of Statistics, Volume 14, Number 1, 801--834. Abstract: We study the bias of the isotonic regression estimator. While there is extensive work characterizing the mean squared error of the isotonic regression estimator, relatively little is known about the bias. In this paper, we provide a sharp characterization, proving that the bias scales as $O(n^{-\beta/3})$ up to log factors, where $1\leq \beta \leq 2$ is the exponent corresponding to Hölder smoothness of the underlying mean. Importantly, this result only requires a strictly monotone mean and that the noise distribution has subexponential tails, without relying on symmetric noise or other restrictive assumptions. Full Article
regression The bias and skewness of M-estimators in regression By projecteuclid.org Published On :: Thu, 05 Aug 2010 15:41 EDT Christopher Withers, Saralees Nadarajah. Source: Electron. J. Statist., Volume 4, 1--14. Abstract: We consider M-estimation of a regression model with a nuisance parameter and a vector of other parameters. The unknown distribution of the residuals is not assumed to be normal or symmetric. Simple and easily estimated formulas are given for the dominant terms of the bias and skewness of the parameter estimates. For the linear model these are proportional to the skewness of the ‘independent’ variables. For a nonlinear model, its linear component plays the role of these independent variables, and a second term must be added proportional to the covariance of its linear and quadratic components. For the least squares estimate with normal errors this term was derived by Box [1]. We also consider the effect of a large number of parameters, and the case of random independent variables. Full Article
regression A Statistical Learning Approach to Modal Regression By Published On :: 2020 This paper studies the nonparametric modal regression problem systematically from a statistical learning viewpoint. Originally motivated by pursuing a theoretical understanding of the maximum correntropy criterion based regression (MCCR), our study reveals that MCCR with a tending-to-zero scale parameter is essentially modal regression. We show that the nonparametric modal regression problem can be approached via the classical empirical risk minimization. Some efforts are then made to develop a framework for analyzing and implementing modal regression. For instance, the modal regression function is described, the modal regression risk is defined explicitly and its Bayes rule is characterized; for the sake of computational tractability, the surrogate modal regression risk, which is termed as the generalization risk in our study, is introduced. On the theoretical side, the excess modal regression risk, the excess generalization risk, the function estimation error, and the relations among the above three quantities are studied rigorously. It turns out that under mild conditions, function estimation consistency and convergence may be pursued in modal regression as in vanilla regression protocols such as mean regression, median regression, and quantile regression. On the practical side, the implementation issues of modal regression including the computational algorithm and the selection of the tuning parameters are discussed. Numerical validations on modal regression are also conducted to verify our findings. Full Article
regression Online Sufficient Dimension Reduction Through Sliced Inverse Regression By Published On :: 2020 Sliced inverse regression is an effective paradigm that achieves the goal of dimension reduction through replacing high dimensional covariates with a small number of linear combinations. It does not impose parametric assumptions on the dependence structure. More importantly, such a reduction of dimension is sufficient in that it does not cause loss of information. In this paper, we adapt the stationary sliced inverse regression to cope with the rapidly changing environments. We propose to implement sliced inverse regression in an online fashion. This online learner consists of two steps. In the first step we construct an online estimate for the kernel matrix; in the second step we propose two online algorithms, one is motivated by the perturbation method and the other is originated from the gradient descent optimization, to perform online singular value decomposition. The theoretical properties of this online learner are established. We demonstrate the numerical performance of this online learner through simulations and real world applications. All numerical studies confirm that this online learner performs as well as the batch learner. Full Article
regression Switching Regression Models and Causal Inference in the Presence of Discrete Latent Variables By Published On :: 2020 Given a response $Y$ and a vector $X = (X^1, \dots, X^d)$ of $d$ predictors, we investigate the problem of inferring direct causes of $Y$ among the vector $X$. Models for $Y$ that use all of its causal covariates as predictors enjoy the property of being invariant across different environments or interventional settings. Given data from such environments, this property has been exploited for causal discovery. Here, we extend this inference principle to situations in which some (discrete-valued) direct causes of $Y$ are unobserved. Such cases naturally give rise to switching regression models. We provide sufficient conditions for the existence, consistency and asymptotic normality of the MLE in linear switching regression models with Gaussian noise, and construct a test for the equality of such models. These results allow us to prove that the proposed causal discovery method obtains asymptotic false discovery control under mild conditions. We provide an algorithm, make available code, and test our method on simulated data. It is robust against model violations and outperforms state-of-the-art approaches. We further apply our method to a real data set, where we show that it not only outputs causal predictors, but also a process-based clustering of data points, which could be of additional interest to practitioners. Full Article
regression WONDER: Weighted One-shot Distributed Ridge Regression in High Dimensions By Published On :: 2020 In many areas, practitioners need to analyze large data sets that challenge conventional single-machine computing. To scale up data analysis, distributed and parallel computing approaches are increasingly needed. Here we study a fundamental and highly important problem in this area: How to do ridge regression in a distributed computing environment? Ridge regression is an extremely popular method for supervised learning, and has several optimality properties, thus it is important to study. We study one-shot methods that construct weighted combinations of ridge regression estimators computed on each machine. By analyzing the mean squared error in a high-dimensional random-effects model where each predictor has a small effect, we discover several new phenomena. Infinite-worker limit: The distributed estimator works well for very large numbers of machines, a phenomenon we call 'infinite-worker limit'. Optimal weights: The optimal weights for combining local estimators sum to more than unity, due to the downward bias of ridge. Thus, all averaging methods are suboptimal. We also propose a new Weighted ONe-shot DistributEd Ridge regression algorithm (WONDER). We test WONDER in simulation studies and using the Million Song Dataset as an example. There it can save at least 100x in computation time, while nearly preserving test accuracy. Full Article
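A minimal sketch of the one-shot scheme with plain averaging used for illustration; the paper's WONDER weights differ (they sum to more than one to offset ridge's downward bias), so this is not the authors' estimator:

```python
import numpy as np

def ridge(X, y, lam):
    """Ridge regression estimate computed locally on one machine."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

rng = np.random.default_rng(4)
n, p, k, lam = 10_000, 50, 10, 1.0
X = rng.normal(size=(n, p))
beta = rng.normal(size=p) / np.sqrt(p)        # small random effects, as in the paper's model
y = X @ beta + rng.normal(size=n)

# One-shot scheme: fit ridge locally on each of k machines, then combine.
# Plain (equal-weight) averaging is used here; optimally weighted combinations
# are what the abstract argues for.
local = [ridge(X[i::k], y[i::k], lam) for i in range(k)]
beta_avg = np.mean(local, axis=0)
print("estimation error:", np.linalg.norm(beta_avg - beta))
```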
regression Bayesian modeling and prior sensitivity analysis for zero–one augmented beta regression models with an application to psychometric data By projecteuclid.org Published On :: Mon, 04 May 2020 04:00 EDT Danilo Covaes Nogarotto, Caio Lucidius Naberezny Azevedo, Jorge Luis Bazán. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 2, 304--322.Abstract: The interest on the analysis of the zero–one augmented beta regression (ZOABR) model has been increasing over the last few years. In this work, we developed a Bayesian inference for the ZOABR model, providing some contributions, namely: we explored the use of Jeffreys-rule and independence Jeffreys prior for some of the parameters, performing a sensitivity study of prior choice, comparing the Bayesian estimates with the maximum likelihood ones and measuring the accuracy of the estimates under several scenarios of interest. The results indicate, in a general way, that: the Bayesian approach, under the Jeffreys-rule prior, was as accurate as the ML one. Also, different from other approaches, we use the predictive distribution of the response to implement Bayesian residuals. To further illustrate the advantages of our approach, we conduct an analysis of a real psychometric data set including a Bayesian residual analysis, where it is shown that misleading inference can be obtained when the data is transformed. That is, when the zeros and ones are transformed to suitable values and the usual beta regression model is considered, instead of the ZOABR model. Finally, future developments are discussed. Full Article
regression A note on the “L-logistic regression models: Prior sensitivity analysis, robustness to outliers and applications” By projecteuclid.org Published On :: Mon, 03 Feb 2020 04:00 EST Saralees Nadarajah, Yuancheng Si. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 1, 183--187.Abstract: Da Paz, Balakrishnan and Bazan [Braz. J. Probab. Stat. 33 (2019), 455–479] introduced the L-logistic distribution, studied its properties including estimation issues and illustrated a data application. This note derives a closed form expression for moment properties of the distribution. Some computational issues are discussed. Full Article
regression Robust Bayesian model selection for heavy-tailed linear regression using finite mixtures By projecteuclid.org Published On :: Mon, 03 Feb 2020 04:00 EST Flávio B. Gonçalves, Marcos O. Prates, Victor Hugo Lachos. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 1, 51--70.Abstract: In this paper, we present a novel methodology to perform Bayesian model selection in linear models with heavy-tailed distributions. We consider a finite mixture of distributions to model a latent variable where each component of the mixture corresponds to one possible model within the symmetrical class of normal independent distributions. Naturally, the Gaussian model is one of the possibilities. This allows for a simultaneous analysis based on the posterior probability of each model. Inference is performed via Markov chain Monte Carlo—a Gibbs sampler with Metropolis–Hastings steps for a class of parameters. Simulated examples highlight the advantages of this approach compared to a segregated analysis based on arbitrarily chosen model selection criteria. Examples with real data are presented and an extension to censored linear regression is introduced and discussed. Full Article
regression Bootstrap-based testing inference in beta regressions By projecteuclid.org Published On :: Mon, 03 Feb 2020 04:00 EST Fábio P. Lima, Francisco Cribari-Neto. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 1, 18--34.Abstract: We address the issue of performing testing inference in small samples in the class of beta regression models. We consider the likelihood ratio test and its standard bootstrap version. We also consider two alternative resampling-based tests. One of them uses the bootstrap test statistic replicates to numerically estimate a Bartlett correction factor that can be applied to the likelihood ratio test statistic. By doing so, we avoid estimation of quantities located in the tail of the likelihood ratio test statistic null distribution. The second alternative resampling-based test uses a fast double bootstrap scheme in which a single second level bootstrapping resample is performed for each first level bootstrap replication. It delivers accurate testing inferences at a computational cost that is considerably smaller than that of a standard double bootstrapping scheme. The Monte Carlo results we provide show that the standard likelihood ratio test tends to be quite liberal in small samples. They also show that the bootstrap tests deliver accurate testing inferences even when the sample size is quite small. An empirical application is also presented and discussed. Full Article
regression Bayesian approach for the zero-modified Poisson–Lindley regression model By projecteuclid.org Published On :: Mon, 26 Aug 2019 04:00 EDT Wesley Bertoli, Katiane S. Conceição, Marinho G. Andrade, Francisco Louzada. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 4, 826--860.Abstract: The primary goal of this paper is to introduce the zero-modified Poisson–Lindley regression model as an alternative to model overdispersed count data exhibiting inflation or deflation of zeros in the presence of covariates. The zero-modification is incorporated by considering that a zero-truncated process produces positive observations and consequently, the proposed model can be fitted without any previous information about the zero-modification present in a given dataset. A fully Bayesian approach based on the g-prior method has been considered for inference concerns. An intensive Monte Carlo simulation study has been conducted to evaluate the performance of the developed methodology and the maximum likelihood estimators. The proposed model was considered for the analysis of a real dataset on the number of bids received by $126$ U.S. firms between 1978–1985, and the impact of choosing different prior distributions for the regression coefficients has been studied. A sensitivity analysis to detect influential points has been performed based on the Kullback–Leibler divergence. A general comparison with some well-known regression models for discrete data has been presented. Full Article
regression Bayesian modelling of the abilities in dichotomous IRT models via regression with missing values in the covariates By projecteuclid.org Published On :: Mon, 26 Aug 2019 04:00 EDT Flávio B. Gonçalves, Bárbara C. C. Dias. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 4, 782--800.Abstract: Educational assessment usually considers a contextual questionnaire to extract relevant information from the applicants. This may include items related to socio-economical profile as well as items to extract other characteristics potentially related to applicant’s performance in the test. A careful analysis of the questionnaires jointly with the test’s results may evidence important relations between profiles and test performance. The most coherent way to perform this task in a statistical context is to use the information from the questionnaire to help explain the variability of the abilities in a joint model-based approach. Nevertheless, the responses to the questionnaire typically present missing values which, in some cases, may be missing not at random. This paper proposes a statistical methodology to model the abilities in dichotomous IRT models using the information of the contextual questionnaires via linear regression. The proposed methodology models the missing data jointly with the all the observed data, which allows for the estimation of the former. The missing data modelling is flexible enough to allow the specification of missing not at random structures. Furthermore, even if those structures are not assumed a priori, they can be estimated from the posterior results when assuming missing (completely) at random structures a priori. Statistical inference is performed under the Bayesian paradigm via an efficient MCMC algorithm. Simulated and real examples are presented to investigate the efficiency and applicability of the proposed methodology. Full Article
regression Spatiotemporal point processes: regression, model specifications and future directions By projecteuclid.org Published On :: Mon, 26 Aug 2019 04:00 EDT Dani Gamerman. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 4, 686--705.Abstract: Point processes are one of the most commonly encountered observation processes in Spatial Statistics. Model-based inference for them depends on the likelihood function. In the most standard setting of Poisson processes, the likelihood depends on the intensity function, and can not be computed analytically. A number of approximating techniques have been proposed to handle this difficulty. In this paper, we review recent work on exact solutions that solve this problem without resorting to approximations. The presentation concentrates more heavily on discrete time but also considers continuous time. The solutions are based on model specifications that impose smoothness constraints on the intensity function. We also review approaches to include a regression component and different ways to accommodate it while accounting for additional heterogeneity. Applications are provided to illustrate the results. Finally, we discuss possible extensions to account for discontinuities and/or jumps in the intensity function. Full Article
regression L-Logistic regression models: Prior sensitivity analysis, robustness to outliers and applications By projecteuclid.org Published On :: Mon, 10 Jun 2019 04:04 EDT Rosineide F. da Paz, Narayanaswamy Balakrishnan, Jorge Luis Bazán. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 3, 455--479.Abstract: Tadikamalla and Johnson [ Biometrika 69 (1982) 461–465] developed the $L_{B}$ distribution to variables with bounded support by considering a transformation of the standard Logistic distribution. In this manuscript, a convenient parametrization of this distribution is proposed in order to develop regression models. This distribution, referred to here as L-Logistic distribution, provides great flexibility and includes the uniform distribution as a particular case. Several properties of this distribution are studied, and a Bayesian approach is adopted for the parameter estimation. Simulation studies, considering prior sensitivity analysis, recovery of parameters and comparison of algorithms, and robustness to outliers are all discussed showing that the results are insensitive to the choice of priors, efficiency of the algorithm MCMC adopted, and robustness of the model when compared with the beta distribution. Applications to estimate the vulnerability to poverty and to explain the anxiety are performed. The results to applications show that the L-Logistic regression models provide a better fit than the corresponding beta regression models. Full Article
regression Influence measures for the Waring regression model By projecteuclid.org Published On :: Mon, 04 Mar 2019 04:00 EST Luisa Rivas, Manuel Galea. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 2, 402--424.Abstract: In this paper, we present a regression model where the response variable is a count data that follows a Waring distribution. The Waring regression model allows for analysis of phenomena where the Geometric regression model is inadequate, because the probability of success on each trial, $p$, is different for each individual and $p$ has an associated distribution. Estimation is performed by maximum likelihood, through the maximization of the $Q$-function using EM algorithm. Diagnostic measures are calculated for this model. To illustrate the results, an application to real data is presented. Some specific details are given in the Appendix of the paper. Full Article
regression A new log-linear bimodal Birnbaum–Saunders regression model with application to survival data By projecteuclid.org Published On :: Mon, 04 Mar 2019 04:00 EST Francisco Cribari-Neto, Rodney V. Fonseca. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 2, 329--355.Abstract: The log-linear Birnbaum–Saunders model has been widely used in empirical applications. We introduce an extension of this model based on a recently proposed version of the Birnbaum–Saunders distribution which is more flexible than the standard Birnbaum–Saunders law since its density may assume both unimodal and bimodal shapes. We show how to perform point estimation, interval estimation and hypothesis testing inferences on the parameters that index the regression model we propose. We also present a number of diagnostic tools, such as residual analysis, local influence, generalized leverage, generalized Cook’s distance and model misspecification tests. We investigate the usefulness of model selection criteria and the accuracy of prediction intervals for the proposed model. Results of Monte Carlo simulations are presented. Finally, we also present and discuss an empirical application. Full Article