base MPI Estimates No More than 167,000 Non-Citizens Could Be Ineligible for Green Cards Based on Current Public Benefits Use By www.migrationpolicy.org Published On :: Tue, 03 Mar 2020 18:12:47 -0500 WASHINGTON – While the new Trump administration public charge rule is likely to vastly reshape future legal immigration based on its test to assess if a person might ever use public benefits in the future, the universe of non-citizens who could be denied a green card based on current benefits use is quite small. Full Article
base Federer charges to fifth Basel title By www.abc.net.au Published On :: Mon, 07 Nov 2011 09:09:00 +1100 Roger Federer has returned to winning mode 10 months after his last title, as the home tennis hero schooled Japanese wild card Kei Nishikori 6-1, 6-3 to win a fifth Swiss Indoors title in Basel. Full Article
base Association Between the Use of Antidepressants and the Risk of Type 2 Diabetes: A Large, Population-Based Cohort Study in Japan By care.diabetesjournals.org Published On :: 2020-03-20T11:50:34-07:00 OBJECTIVE This study aimed to reveal the associations between the risk of new-onset type 2 diabetes and the duration of antidepressant use and the antidepressant dose, and between antidepressant use after diabetes onset and clinical outcomes. RESEARCH DESIGN AND METHODS In this large-scale retrospective cohort study in Japan, new users of antidepressants (exposure group) and nonusers (nonexposure group), aged 20–79 years, were included between 1 April 2006 and 31 May 2015. Patients with a history of diabetes or receipt of antidiabetes treatment were excluded. Covariates were adjusted by using propensity score matching; the associations were analyzed between risk of new-onset type 2 diabetes and the duration of antidepressant use/dose of antidepressant in the exposure and nonexposure groups by using Cox proportional hazards models. Changes in glycated hemoglobin (HbA1c) level were examined in groups with continuous use, discontinuation, or a reduction in the dose of antidepressants. RESULTS Of 90,530 subjects, 45,265 were in both the exposure and the nonexposure group after propensity score matching; 5,225 patients (5.8%) developed diabetes. Antidepressant use was associated with the risk of diabetes onset in a time- and dose-dependent manner. The adjusted hazard ratio was 1.27 (95% CI 1.16–1.39) for short-term low-dose and 3.95 (95% CI 3.31–4.72) for long-term high-dose antidepressant use. HbA1c levels were lower in patients who discontinued or reduced the dose of antidepressants (F[2,49] = 8.17; P < 0.001). CONCLUSIONS Long-term antidepressant use increased the risk of type 2 diabetes onset in a time- and dose-dependent manner. Glucose tolerance improved when antidepressants were discontinued or the dose was reduced after diabetes onset. Full Article
base Green Cards and Public Charge: Who Could Be Denied Based on Benefits Use? By www.migrationpolicy.org Published On :: Thu, 27 Feb 2020 11:06:26 -0500 On this webinar MPI experts discuss their estimates of the populations that could be deemed ineligible for a green card based on existing benefits use. They also discuss the broader consequences of the public-charge rule implemented in February 2020, through its "chilling effects" and imposition of a wealth test aimed at assessing whether green-card applicants ever would be likely to use a public benefit in the future. Full Article
base The Public-Charge Rule: Broad Impacts, But Few Will Be Denied Green Cards Based on Actual Benefits Use By www.migrationpolicy.org Published On :: Tue, 03 Mar 2020 17:57:09 -0500 While the Trump administration public-charge rule is likely to vastly reshape legal immigration based on its test to assess if a person might ever use public benefits in the future, the universe of noncitizens who could be denied a green card based on current benefits use is quite small. That's because very few benefit programs are open to noncitizens who do not hold a green card. This commentary offers estimates of who might be affected. Full Article
base Green Cards and Public Charge: Who Could Be Denied Based on Benefits Use? By www.migrationpolicy.org Published On :: Thu, 12 Mar 2020 18:21:12 -0400 On this webinar, MPI experts discussed the public-charge rule and released estimates of the populations that could be deemed ineligible for a green card based on existing benefits use. They examined the far larger consequences of the rule, through its "chilling effects" and imposition of a test aimed at assessing whether green-card applicants are likely to ever use a public benefit in the future. And they discussed how the latter holds the potential to reshape legal immigration to the United States. Full Article
base Baseball and Linguistic Uncertainty By decisions-and-info-gaps.blogspot.com Published On :: Sun, 21 Aug 2011 14:00:00 +0000 In my youth I played an inordinate amount of baseball, collected baseball cards, and idolized baseball players. I've outgrown all that but when I'm in the States during baseball season I do enjoy watching a few innings on the TV. So I was watching a baseball game recently and the commentator was talking about the art of pitching. Throwing a baseball, he said, is like shooting a shotgun. You get a spray. As a pitcher, you have to know your spray. You learn to control it, but you know that it is there. The ball won't always go where you want it. And furthermore, where you want the ball depends on the batter's style and strategy, which vary from pitch to pitch for every batter. That's baseball talk, but it stuck in my mind. Baseball pitchers must manage uncertainty! And it is not enough to reduce it and hope for the best. Suppose you want to throw a strike. It's not a good strategy to aim directly at, say, the lower outside corner of the strike zone, because of the spray of the ball's path and because the batter's stance can shift. Especially if the spray is skewed down and out, you'll want to move up and in a bit. This is all very similar to the ambiguity of human speech when we pitch words at each other. Words don't have precise meanings; meanings spread out like the pitcher's spray. If we want to communicate precisely we need to be aware of this uncertainty, and manage it, taking account of the listener's propensities. Take the word "liberal" as it is used in political discussion. For many decades, "liberals" have tended to support high taxes to provide generous welfare, public medical insurance, and low-cost housing.
They advocate liberal (meaning magnanimous or abundant) government involvement for the citizens' benefit. A "liberal" might also be someone who is open-minded and tolerant, who is not strict in applying rules to other people, or even to him or herself. Such a person might be called "liberal" (meaning advocating individual rights) for opposing extensive government involvement in private decisions. For instance, liberals (in this second sense) might oppose high taxes since they reduce individuals' ability to make independent choices. As another example, John Stuart Mill opposed laws which restricted the rights of women to work (at night, for instance), even though these laws were intended to promote the welfare of women. Women, insisted Mill, are intelligent adults and can judge for themselves what is good for them. Returning to the first meaning of "liberal" mentioned above, people of that strain may support restrictions of trade to countries which ignore the health and safety of workers. The other type of "liberal" might tend to support unrestricted trade. Sending out words and pitching baseballs are both like shooting a shotgun: meanings (and baseballs) spray out. You must know what meaning you wish to convey, and what other meanings the word can have. The choice of the word, and the crafting of its context, must manage the uncertainty of where the word will land in the listener's mind. Let's go back to baseball again. If there were no uncertainty in the pitcher's pitch and the batter's swing, then baseball would be a dreadfully boring game. If the batter knows exactly where and when the ball will arrive, and can completely control the bat, then every swing will be a homer. Or conversely, if the pitcher always knows exactly how the batter will swing, and if each throw is perfectly controlled, then every batter will strike out. But which is it? Whose certainty dominates? The batter's or the pitcher's? It can't be both. There is some deep philosophical problem here.
Clearly there cannot be complete certainty in a world which has some element of free will, or surprise, or discovery. This is not just a tautology, a necessary result of what we mean by "uncertainty" and "surprise". It is an implication of limited human knowledge. Uncertainty - which makes baseball and life interesting - is inevitable in the human world. How does this carry over to human speech? It is said of the Wright brothers that they thought so synergistically that one brother could finish an idea or sentence begun by the other. If there is no uncertainty in what I am going to say, then you will be bored with my conversation, or at least, you won't learn anything from me. It is because you don't know what I mean by, for instance, "robustness", that my speech on this topic is enlightening (and maybe interesting). And it is because you disagree with me about what robustness means (and you tell me so), that I can perhaps extend my own understanding. So, uncertainty is inevitable in a world that is rich enough to have surprise or free will. Furthermore, this uncertainty leads to a process - through speech - of discovery and new understanding. Uncertainty, and the use of language, leads to discovery. Isn't baseball an interesting game? Full Article
base Rhode Island PARCC Scores Lower on Computer-Based Exams By feedproxy.google.com Published On :: Tue, 09 Feb 2016 00:00:00 +0000 A state-by-state breakdown shows that Colorado, Rhode Island and Illinois found some evidence that students' familiarity with technology impacted scores on 2014-15 PARCC exams. An analysis in Maryland is pending. Full Article Rhode_Island
base Adult Safeguarding - A rights based approach in responding to Elder Abuse - Elicia White_SLIDES. By www.catalog.slsa.sa.gov.au Published On :: Full Article
base Die Cladoceren der Umgebung von Basel / vorgelegt von Theodor Stingelin. By feedproxy.google.com Published On :: Genf : Rey & Malavallon, 1895. Full Article
base An encyclopedia of the practice of medicine, based on bacteriology / by J. Buchanan. By feedproxy.google.com Published On :: New York : R.R. Russell, 1891. Full Article
base Essai sur la méningite en plaque ou scléreuse limitée a la base de l’encéphale / par Emile Labarriere. By feedproxy.google.com Published On :: Paris : V.A. Delahaye, 1878. Full Article
base Evaluation of drug abuse treatments : based on first year followup : national followup study of admissions to drug abuse treatments in the DARP during 1969-1972. By search.wellcomelibrary.org Published On :: Rockville, Maryland : National Institute on Drug Abuse, 1978. Full Article
base Parseval inequalities and lower bounds for variance-based sensitivity indices By projecteuclid.org Published On :: Tue, 05 May 2020 22:00 EDT Olivier Roustant, Fabrice Gamboa, Bertrand Iooss. Source: Electronic Journal of Statistics, Volume 14, Number 1, 386--412. Abstract: The so-called polynomial chaos expansion is widely used in computer experiments. For example, it is a powerful tool to estimate Sobol’ sensitivity indices. In this paper, we consider generalized chaos expansions built on general tensor Hilbert basis. In this frame, we revisit the computation of the Sobol’ indices with Parseval equalities and give general lower bounds for these indices obtained by truncation. The case of the eigenfunctions system associated with a Poincaré differential operator leads to lower bounds involving the derivatives of the analyzed function and provides an efficient tool for variable screening. These lower bounds are put in action both on toy and real life models demonstrating their accuracy. Full Article
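To make the Parseval idea in the entry above concrete, here is a minimal sketch, not the authors' code. It assumes the function has already been expanded in an orthonormal tensor basis, so that the variance is the sum of squared non-constant coefficients and each Sobol' index is the share of that sum carried by basis terms with a given support. The function name `sobol_from_coefficients`, the dictionary layout, and the toy coefficients are all hypothetical.

```python
from collections import defaultdict


def sobol_from_coefficients(coeffs, total_variance=None):
    """Sobol' indices from coefficients in an orthonormal tensor basis.

    coeffs maps a multi-index (one degree per input) to its coefficient.
    If total_variance is given, the returned values are lower bounds
    whenever coeffs is a truncation of the full expansion.
    """
    partial = defaultdict(float)
    for alpha, c in coeffs.items():
        # The support is the set of inputs the basis term depends on.
        support = tuple(i for i, degree in enumerate(alpha) if degree > 0)
        if support:  # the constant term carries no variance
            partial[support] += c * c
    variance = sum(partial.values()) if total_variance is None else total_variance
    return {support: v / variance for support, v in partial.items()}


# Toy expansion (hypothetical numbers): the variance is 4 + 1 + 1 = 6.
full = {(0, 0): 5.0, (1, 0): 2.0, (2, 0): 1.0, (0, 1): 1.0}
exact = sobol_from_coefficients(full)

# Keeping only the degree-1 term for input 0, with the variance known
# separately, gives a lower bound on that input's index.
lower = sobol_from_coefficients({(1, 0): 2.0}, total_variance=6.0)
```

In this toy case the exact first-order index of input 0 is 5/6, while the truncated expansion gives 4/6. The truncation drops only non-negative squared terms, which is why it can only underestimate an index.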
base Model-based clustering with envelopes By projecteuclid.org Published On :: Thu, 23 Apr 2020 22:01 EDT Wenjing Wang, Xin Zhang, Qing Mai. Source: Electronic Journal of Statistics, Volume 14, Number 1, 82--109. Abstract: Clustering analysis is an important unsupervised learning technique in multivariate statistics and machine learning. In this paper, we propose a set of new mixture models called CLEMM (short for Clustering with Envelope Mixture Models) that is based on the widely used Gaussian mixture model assumptions and the nascent research area of envelope methodology. Formulated mostly for regression models, envelope methodology aims for simultaneous dimension reduction and efficient parameter estimation, and includes a very recent formulation of envelope discriminant subspace for classification and discriminant analysis. Motivated by the envelope discriminant subspace pursuit in classification, we consider parsimonious probabilistic mixture models where the cluster analysis can be improved by projecting the data onto a latent lower-dimensional subspace. The proposed CLEMM framework and the associated envelope-EM algorithms thus provide foundations for envelope methods in unsupervised and semi-supervised learning problems. Numerical studies on simulated data and two benchmark data sets show significant improvement of our proposed methods over the classical methods such as Gaussian mixture models, K-means and hierarchical clustering algorithms. An R package is available at https://github.com/kusakehan/CLEMM. Full Article
base Estimation of a semiparametric transformation model: A novel approach based on least squares minimization By projecteuclid.org Published On :: Tue, 04 Feb 2020 22:03 EST Benjamin Colling, Ingrid Van Keilegom. Source: Electronic Journal of Statistics, Volume 14, Number 1, 769--800. Abstract: Consider the following semiparametric transformation model $\Lambda_{\theta}(Y)=m(X)+\varepsilon$, where $X$ is a $d$-dimensional covariate, $Y$ is a univariate response variable and $\varepsilon$ is an error term with zero mean and independent of $X$. We assume that $m$ is an unknown regression function and that $\{\Lambda_{\theta}:\theta\in\Theta\}$ is a parametric family of strictly increasing functions. Our goal is to develop two new estimators of the transformation parameter $\theta$. The main idea of these two estimators is to minimize, with respect to $\theta$, the $L_{2}$-distance between the transformation $\Lambda_{\theta}$ and one of its fully nonparametric estimators. We consider in particular the nonparametric estimator based on the least-absolute deviation loss constructed in Colling and Van Keilegom (2019). We establish the consistency and the asymptotic normality of the two proposed estimators of $\theta$. We also carry out a simulation study to illustrate and compare the performance of our new parametric estimators to that of the profile likelihood estimator constructed in Linton et al. (2008). Full Article
base Path-Based Spectral Clustering: Guarantees, Robustness to Outliers, and Fast Algorithms By Published On :: 2020 We consider the problem of clustering with the longest-leg path distance (LLPD) metric, which is informative for elongated and irregularly shaped clusters. We prove finite-sample guarantees on the performance of clustering with respect to this metric when random samples are drawn from multiple intrinsically low-dimensional clusters in high-dimensional space, in the presence of a large number of high-dimensional outliers. By combining these results with spectral clustering with respect to LLPD, we provide conditions under which the Laplacian eigengap statistic correctly determines the number of clusters for a large class of data sets, and prove guarantees on the labeling accuracy of the proposed algorithm. Our methods are quite general and provide performance guarantees for spectral clustering with any ultrametric. We also introduce an efficient, easy to implement approximation algorithm for the LLPD based on a multiscale analysis of adjacency graphs, which allows for the runtime of LLPD spectral clustering to be quasilinear in the number of data points. Full Article
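As a small illustration of the metric named in the entry above, the sketch below computes exact longest-leg path distances by brute force. It assumes the usual definition: the LLPD between two points is the smallest achievable "longest edge" over all paths joining them in the complete Euclidean graph. This is a quadratic-memory, Kruskal-style computation for intuition only; it is not the quasilinear multiscale approximation the abstract describes, and the name `llpd_matrix` is hypothetical.

```python
import math
from itertools import combinations


def llpd_matrix(points):
    """Exact longest-leg path distances between all pairs of points."""
    n = len(points)
    # All pairwise Euclidean edges, shortest first.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    label = list(range(n))               # component label of each point
    members = {i: [i] for i in range(n)}  # points in each component
    d = [[0.0] * n for _ in range(n)]
    for w, i, j in edges:
        ci, cj = label[i], label[j]
        if ci == cj:
            continue
        # The edge that first joins two components is the longest leg
        # on the best path between any point of one and any of the other.
        for a in members[ci]:
            for b in members[cj]:
                d[a][b] = d[b][a] = w
        # Merge the smaller component into the larger one.
        if len(members[ci]) < len(members[cj]):
            ci, cj = cj, ci
        for b in members[cj]:
            label[b] = ci
        members[ci].extend(members.pop(cj))
    return d


# Three points in a chain plus one far-away point.
d = llpd_matrix([(0, 0), (1, 0), (2, 0), (10, 0)])
```

Here the first and third points are at Euclidean distance 2 but LLPD 1, because the path through the middle point never needs a leg longer than 1. That is the sense in which the metric favours elongated clusters. The far-away point is at LLPD 8 from all three.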
base On the consistency of graph-based Bayesian semi-supervised learning and the scalability of sampling algorithms By Published On :: 2020 This paper considers a Bayesian approach to graph-based semi-supervised learning. We show that if the graph parameters are suitably scaled, the graph-posteriors converge to a continuum limit as the size of the unlabeled data set grows. This consistency result has profound algorithmic implications: we prove that when consistency holds, carefully designed Markov chain Monte Carlo algorithms have a uniform spectral gap, independent of the number of unlabeled inputs. Numerical experiments illustrate and complement the theory. Full Article
base High-Dimensional Inference for Cluster-Based Graphical Models By Published On :: 2020 Motivated by modern applications in which one constructs graphical models based on a very large number of features, this paper introduces a new class of cluster-based graphical models, in which variable clustering is applied as an initial step for reducing the dimension of the feature space. We employ model assisted clustering, in which the clusters contain features that are similar to the same unobserved latent variable. Two different cluster-based Gaussian graphical models are considered: the latent variable graph, corresponding to the graphical model associated with the unobserved latent variables, and the cluster-average graph, corresponding to the vector of features averaged over clusters. Our study reveals that likelihood based inference for the latent graph, not analyzed previously, is analytically intractable. Our main contribution is the development and analysis of alternative estimation and inference strategies, for the precision matrix of an unobservable latent vector Z. We replace the likelihood of the data by an appropriate class of empirical risk functions, that can be specialized to the latent graphical model and to the simpler, but under-analyzed, cluster-average graphical model. The estimators thus derived can be used for inference on the graph structure, for instance on edge strength or pattern recovery. Inference is based on the asymptotic limits of the entry-wise estimates of the precision matrices associated with the conditional independence graphs under consideration. While taking the uncertainty induced by the clustering step into account, we establish Berry-Esseen central limit theorems for the proposed estimators. 
It is noteworthy that, although the clusters are estimated adaptively from the data, the central limit theorems regarding the entries of the estimated graphs are proved under the same conditions one would use if the clusters were known in advance. As an illustration of the usage of these newly developed inferential tools, we show that they can be reliably used for recovery of the sparsity pattern of the graphs we study, under FDR control, which is verified via simulation studies and an fMRI data analysis. These experimental results confirm the theoretically established difference between the two graph structures. Furthermore, the data analysis suggests that the latent variable graph, corresponding to the unobserved cluster centers, can help provide more insight into the understanding of the brain connectivity networks relative to the simpler, average-based, graph. Full Article
base Community-Based Group Graphical Lasso By Published On :: 2020 A new strategy for probabilistic graphical modeling is developed that draws parallels to community detection analysis. The method jointly estimates an undirected graph and homogeneous communities of nodes. The structure of the communities is taken into account when estimating the graph and at the same time, the structure of the graph is accounted for when estimating communities of nodes. The procedure uses a joint group graphical lasso approach with community detection-based grouping, such that some groups of edges co-occur in the estimated graph. The grouping structure is unknown and is estimated based on community detection algorithms. Theoretical derivations regarding graph convergence and sparsistency, as well as accuracy of community recovery are included, while the method's empirical performance is illustrated in an fMRI context, as well as with simulated examples. Full Article
base Estimation of a Low-rank Topic-Based Model for Information Cascades By Published On :: 2020 We consider the problem of estimating the latent structure of a social network based on the observed information diffusion events, or cascades, where the observations for a given cascade consist of only the timestamps of infection for infected nodes but not the source of the infection. Most of the existing work on this problem has focused on estimating a diffusion matrix without any structural assumptions on it. In this paper, we propose a novel model based on the intuition that a piece of information is more likely to propagate between two nodes if they are interested in similar topics which are also prominent in the information content. In particular, our model endows each node with an influence vector (which measures how authoritative the node is on each topic) and a receptivity vector (which measures how susceptible the node is for each topic). We show how this node-topic structure can be estimated from the observed cascades, and prove the consistency of the estimator. Experiments on synthetic and real data demonstrate the improved performance and better interpretability of our model compared to existing state-of-the-art methods. Full Article
base Bootstrap-based testing inference in beta regressions By projecteuclid.org Published On :: Mon, 03 Feb 2020 04:00 EST Fábio P. Lima, Francisco Cribari-Neto. Source: Brazilian Journal of Probability and Statistics, Volume 34, Number 1, 18--34. Abstract: We address the issue of performing testing inference in small samples in the class of beta regression models. We consider the likelihood ratio test and its standard bootstrap version. We also consider two alternative resampling-based tests. One of them uses the bootstrap test statistic replicates to numerically estimate a Bartlett correction factor that can be applied to the likelihood ratio test statistic. By doing so, we avoid estimation of quantities located in the tail of the likelihood ratio test statistic null distribution. The second alternative resampling-based test uses a fast double bootstrap scheme in which a single second level bootstrapping resample is performed for each first level bootstrap replication. It delivers accurate testing inferences at a computational cost that is considerably smaller than that of a standard double bootstrapping scheme. The Monte Carlo results we provide show that the standard likelihood ratio test tends to be quite liberal in small samples. They also show that the bootstrap tests deliver accurate testing inferences even when the sample size is quite small. An empirical application is also presented and discussed. Full Article
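The bootstrap Bartlett correction mentioned in the entry above can be sketched in a few lines. The abstract gives no formula, so this uses the textbook form as an assumption: rescale the likelihood ratio statistic so that its bootstrap-estimated null mean matches the chi-squared mean, which equals the number of restrictions tested. The function names and the numbers are hypothetical, and fitting the beta regressions that would produce the replicates is outside this sketch.

```python
def bartlett_corrected_lr(lr, boot_lrs, n_restrictions):
    """Rescale lr by (number of restrictions) / (mean bootstrap statistic).

    boot_lrs are likelihood ratio statistics computed on samples
    generated under the null hypothesis.
    """
    mean_boot = sum(boot_lrs) / len(boot_lrs)
    return lr * n_restrictions / mean_boot


def bootstrap_p_value(lr, boot_lrs):
    """Standard bootstrap p-value: share of replicates at least as large."""
    exceed = sum(1 for b in boot_lrs if b >= lr)
    return (exceed + 1) / (len(boot_lrs) + 1)


# Hypothetical numbers: one restriction, bootstrap mean of 2.0.
corrected = bartlett_corrected_lr(6.0, [1.5, 2.5, 2.0], n_restrictions=1)
p_boot = bootstrap_p_value(6.0, [1.5, 2.5, 2.0])
```

The corrected statistic is then compared with the usual chi-squared critical value. Only the mean of the replicates is used, which matches the abstract's point that the correction avoids estimating tail quantities. The plain bootstrap p-value, by contrast, depends on how many replicates fall in the tail.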
base Bayesian inference on power Lindley distribution based on different loss functions By projecteuclid.org Published On :: Mon, 26 Aug 2019 04:00 EDT Abbas Pak, M. E. Ghitany, Mohammad Reza Mahmoudi. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 4, 894--914. Abstract: This paper focuses on Bayesian estimation of the parameters and reliability function of the power Lindley distribution by using various symmetric and asymmetric loss functions. Assuming suitable priors on the parameters, Bayes estimates are derived by using squared error, linear exponential (linex) and general entropy loss functions. Since, under these loss functions, Bayes estimates of the parameters do not have closed forms, we use Lindley’s approximation technique to calculate the Bayes estimates. Moreover, we obtain the Bayes estimates of the parameters using a Markov Chain Monte Carlo (MCMC) method. Simulation studies are conducted in order to evaluate the performances of the proposed estimators under the considered loss functions. Finally, analysis of a real data set is presented for illustrative purposes. Full Article
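For the three loss functions named in the entry above, the Bayes estimate of a positive parameter can be approximated directly from posterior draws. The sketch below assumes the standard textbook forms: the posterior mean for squared error, -(1/a) log E[exp(-a θ)] for linex loss, and (E[θ^(-q)])^(-1/q) for general entropy loss. The abstract does not state these formulas, and the function name, the loss parameters `a` and `q`, and the toy draws are hypothetical. No power Lindley sampler is included.

```python
import math


def bayes_estimates(samples, a=1.0, q=1.0):
    """Monte Carlo Bayes estimates from positive posterior draws.

    a is the linex shape parameter and q the general entropy shape
    parameter; both are assumed to be nonzero.
    """
    n = len(samples)
    squared_error = sum(samples) / n
    linex = -math.log(sum(math.exp(-a * t) for t in samples) / n) / a
    general_entropy = (sum(t ** (-q) for t in samples) / n) ** (-1.0 / q)
    return {
        "squared_error": squared_error,
        "linex": linex,
        "general_entropy": general_entropy,
    }


# Two toy "posterior draws"; real use would pass MCMC output.
est = bayes_estimates([1.0, 3.0], a=1.0, q=1.0)
```

With a > 0 the linex loss penalises overestimation more heavily, so its estimate falls below the posterior mean of 2.0. With q = 1 the general entropy estimate is the harmonic mean of the draws, 1.5.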
base A rank-based Cramér–von-Mises-type test for two samples By projecteuclid.org Published On :: Mon, 10 Jun 2019 04:04 EDT Jamye Curry, Xin Dang, Hailin Sang. Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 3, 425--454. Abstract: We study a rank based univariate two-sample distribution-free test. The test statistic is the difference between the average of between-group rank distances and the average of within-group rank distances. This test statistic is closely related to the two-sample Cramér–von Mises criterion. They are different empirical versions of the same quantity for testing the equality of two population distributions. Although they may be different for finite samples, they share the same expected value, variance and asymptotic properties. The advantage of the new rank based test over the classical one is the ease with which it generalizes to the multivariate case. Rather than using the empirical process approach, we provide a different, simpler proof, bringing in a different perspective and insight. In particular, we apply the Hájek projection and orthogonal decomposition technique in deriving the asymptotics of the proposed rank based statistic. A numerical study compares power performance of the rank formulation test with other commonly-used nonparametric tests and recommendations on those tests are provided. Lastly, we propose a multivariate extension of the test based on the spatial rank. Full Article
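The verbal description of the statistic in the entry above can be turned into a short sketch. The abstract does not give the exact normalisation, so what follows is an assumption: ranks are taken in the pooled sample, a "rank distance" is the absolute difference of two ranks, within-group averages run over all ordered pairs including a point with itself, and the two within-group averages are weighted equally. The function name is hypothetical and ties are assumed absent.

```python
def rank_distance_statistic(x, y):
    """Average between-group rank distance minus average within-group one."""
    pooled = sorted(list(x) + list(y))
    # Ranks 1..n in the pooled sample; assumes all values are distinct.
    rank = {value: r for r, value in enumerate(pooled, start=1)}
    rx = [rank[v] for v in x]
    ry = [rank[v] for v in y]

    def mean_abs_diff(a, b):
        return sum(abs(p - q) for p in a for q in b) / (len(a) * len(b))

    between = mean_abs_diff(rx, ry)
    within = 0.5 * (mean_abs_diff(rx, rx) + mean_abs_diff(ry, ry))
    return between - within


# Fully separated samples versus perfectly interleaved ones.
t_separated = rank_distance_statistic([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
t_interleaved = rank_distance_statistic([1.0, 3.0, 5.0], [2.0, 4.0, 6.0])
```

Under this normalisation the separated samples give 19/9 and the interleaved samples give 1/3, so larger values point to a difference between the two distributions. A permutation scheme would be one way to calibrate it. The paper itself derives asymptotics instead, which this sketch does not attempt.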
base Variable selection methods for model-based clustering By projecteuclid.org Published On :: Thu, 26 Apr 2018 04:00 EDT Michael Fop, Thomas Brendan Murphy. Source: Statistics Surveys, Volume 12, 18--65. Abstract: Model-based clustering is a popular approach for clustering multivariate data which has seen applications in numerous fields. Nowadays, high-dimensional data are more and more common and the model-based clustering approach has adapted to deal with the increasing dimensionality. In particular, the development of variable selection techniques has received a lot of attention and research effort in recent years. Even for small size problems, variable selection has been advocated to facilitate the interpretation of the clustering results. This review provides a summary of the methods developed for variable selection in model-based clustering. Existing R packages implementing the different methods are indicated and illustrated in application to two data analysis examples. Full Article
base Primal and dual model representations in kernel-based learning By projecteuclid.org Published On :: Wed, 25 Aug 2010 10:28 EDT Johan A.K. Suykens, Carlos Alzate, Kristiaan Pelckmans. Source: Statist. Surv., Volume 4, 148--183. Abstract: This paper discusses the role of primal and (Lagrange) dual model representations in problems of supervised and unsupervised learning. The specification of the estimation problem is conceived at the primal level as a constrained optimization problem. The constraints relate to the model which is expressed in terms of the feature map. From the conditions for optimality one jointly finds the optimal model representation and the model estimate. At the dual level the model is expressed in terms of a positive definite kernel function, which is characteristic for a support vector machine methodology. It is discussed how least squares support vector machines are playing a central role as core models across problems of regression, classification, principal component analysis, spectral clustering, canonical correlation analysis, dimensionality reduction and data visualization. Full Article
base Finite mixture models and model-based clustering By projecteuclid.org Published On :: Thu, 05 Aug 2010 15:41 EDT Volodymyr Melnykov, Ranjan Maitra. Source: Statist. Surv., Volume 4, 80--116. Abstract: Finite mixture models have a long history in statistics, having been used to model population heterogeneity, generalize distributional assumptions, and lately, for providing a convenient yet formal framework for clustering and classification. This paper provides a detailed review of mixture models and model-based clustering. Recent trends as well as open problems in the area are also discussed. Full Article
base Statistical errors in Monte Carlo-based inference for random elements. (arXiv:2005.02532v2 [math.ST] UPDATED) By arxiv.org Published On :: Monte Carlo simulation is useful for computing or estimating expected functionals of random elements when random samples can be generated from the true distribution. However, when the distribution has some unknown parameters, the samples must be generated from an estimated distribution with the parameters replaced by some estimators, which causes a statistical error in Monte Carlo estimation. This paper considers such a statistical error and investigates the asymptotic distributions of Monte Carlo-based estimators when the random elements are not only real-valued but also function-valued random variables. We also investigate expected functionals for semimartingales in detail. The analysis indicates that the Monte Carlo estimation can get worse when a semimartingale has a jump part with unremovable unknown parameters. Full Article
base Margin-Based Generalization Lower Bounds for Boosted Classifiers. (arXiv:1909.12518v4 [cs.LG] UPDATED) By arxiv.org Published On :: Boosting is one of the most successful ideas in machine learning. The most well-accepted explanations for the low generalization error of boosting algorithms such as AdaBoost stem from margin theory. The study of margins in the context of boosting algorithms was initiated by Schapire, Freund, Bartlett and Lee (1998) and has inspired numerous boosting algorithms and generalization bounds. To date, the strongest known generalization (upper bound) is the $k$th margin bound of Gao and Zhou (2013). Despite the numerous generalization upper bounds that have been proved over the last two decades, nothing is known about the tightness of these bounds. In this paper, we give the first margin-based lower bounds on the generalization error of boosted classifiers. Our lower bounds nearly match the $k$th margin bound and thus almost settle the generalization performance of boosted classifiers in terms of margins. Full Article
base Predictive Modeling of ICU Healthcare-Associated Infections from Imbalanced Data. Using Ensembles and a Clustering-Based Undersampling Approach. (arXiv:2005.03582v1 [cs.LG]) By arxiv.org Published On :: Early detection of patients vulnerable to infections acquired in the hospital environment is a challenge in current health systems given the impact that such infections have on patient mortality and healthcare costs. This work is focused on both the identification of risk factors and the prediction of healthcare-associated infections in intensive-care units by means of machine-learning methods. The aim is to support decision making addressed at reducing the incidence rate of infections. In this field, it is necessary to deal with the problem of building reliable classifiers from imbalanced datasets. We propose a clustering-based undersampling strategy to be used in combination with ensemble classifiers. A comparative study with data from 4616 patients was conducted in order to validate our proposal. We applied several single and ensemble classifiers both to the original dataset and to data preprocessed by means of different resampling methods. The results were analyzed by means of classic and recent metrics specifically designed for imbalanced data classification. They revealed that the proposal is more efficient in comparison with other approaches. Full Article
base Transfer Learning for sEMG-based Hand Gesture Classification using Deep Learning in a Master-Slave Architecture. (arXiv:2005.03460v1 [eess.SP]) By arxiv.org Published On :: Recent advancements in diagnostic learning and development of gesture-based human machine interfaces have driven surface electromyography (sEMG) towards significant importance. Analysis of hand gestures requires an accurate assessment of sEMG signals. The proposed work presents a novel sequential master-slave architecture consisting of deep neural networks (DNNs) for classification of signs from the Indian sign language using signals recorded from multiple sEMG channels. The performance of the master-slave network is augmented by leveraging additional synthetic feature data generated by long short term memory networks. Performance of the proposed network is compared to that of a conventional DNN prior to and after the addition of synthetic data. Up to 14% improvement is observed in the conventional DNN and up to 9% improvement in master-slave network on addition of synthetic data with an average accuracy value of 93.5% asserting the suitability of the proposed approach. Full Article
base Multi-Label Sampling based on Local Label Imbalance. (arXiv:2005.03240v1 [cs.LG]) By arxiv.org Published On :: Class imbalance is an inherent characteristic of multi-label data that hinders most multi-label learning methods. One efficient and flexible strategy to deal with this problem is to employ sampling techniques before training a multi-label learning model. Although existing multi-label sampling approaches alleviate the global imbalance of multi-label datasets, it is actually the imbalance level within the local neighbourhood of minority class examples that plays a key role in performance degradation. To address this issue, we propose a novel measure to assess the local label imbalance of multi-label datasets, as well as two multi-label sampling approaches based on the local label imbalance, namely MLSOL and MLUL. By considering all informative labels, MLSOL creates more diverse and better labeled synthetic instances for difficult examples, while MLUL eliminates instances that are harmful to their local region. Experimental results on 13 multi-label datasets demonstrate the effectiveness of the proposed measure and sampling approaches for a variety of evaluation metrics, particularly in the case of an ensemble of classifiers trained on repeated samples of the original data. Full Article
base On the Optimality of Randomization in Experimental Design: How to Randomize for Minimax Variance and Design-Based Inference. (arXiv:2005.03151v1 [stat.ME]) By arxiv.org Published On :: I study the minimax-optimal design for a two-arm controlled experiment where conditional mean outcomes may vary in a given set. When this set is permutation symmetric, the optimal design is complete randomization, and using a single partition (i.e., the design that only randomizes the treatment labels for each side of the partition) has minimax risk larger by a factor of $n-1$. More generally, the optimal design is shown to be the mixed-strategy optimal design (MSOD) of Kallus (2018). Notably, even when the set of conditional mean outcomes has structure (i.e., is not permutation symmetric), being minimax-optimal for variance still requires randomization beyond a single partition. Nonetheless, since this targets precision, it may still not ensure sufficient uniformity in randomization to enable randomization (i.e., design-based) inference by Fisher's exact test to appropriately detect violations of null. I therefore propose the inference-constrained MSOD, which is minimax-optimal among all designs subject to such uniformity constraints. On the way, I discuss Johansson et al. (2020) who recently compared rerandomization of Morgan and Rubin (2012) and the pure-strategy optimal design (PSOD) of Kallus (2018). I point out some errors therein and set straight that randomization is minimax-optimal and that the "no free lunch" theorem and example in Kallus (2018) are correct. Full Article
base Towards Frequency-Based Explanation for Robust CNN. (arXiv:2005.03141v1 [cs.LG]) By arxiv.org Published On :: Current explanation techniques towards a transparent Convolutional Neural Network (CNN) mainly focus on building connections between human-understandable input features and the model's prediction, overlooking an alternative representation of the input: its decomposition into frequency components. In this work, we present an analysis of the connection between the distribution of frequency components in the input dataset and the reasoning process the model learns from the data. We further provide a quantitative analysis of the contribution of different frequency components toward the model's prediction. We show that the vulnerability of the model against tiny distortions is a result of the model relying on the high-frequency features, the target features of the adversarial (black and white-box) attackers, to make the prediction. We further show that if the model develops a stronger association between the low-frequency component and the true labels, the model is more robust, which explains why adversarially trained models are more robust against tiny distortions. Full Article
base Pediatric allergy : a case-based collection with MCQs. By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: Online ISBN: 9783030182823 (electronic bk.) Full Article
base Intelligent wavelet based techniques for advanced multimedia applications By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: Singh, Rajiv, author Callnumber: Online ISBN: 9783030318734 (electronic bk.) Full Article
base Geriatric Medicine : a Problem-Based Approach By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: Online ISBN: 9789811032530 Full Article
base Database design and implementation By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Author: Sciore, Edward, author Callnumber: Online ISBN: 9783030338367 (electronic bk.) Full Article
base Common problems in the newborn nursery : an evidence and case-based guide By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: Online ISBN: 9783319956725 (electronic bk.) Full Article
base Children’s Palliative Care: An International Case-Based Manual By dal.novanet.ca Published On :: Fri, 1 May 2020 19:44:43 -0300 Callnumber: Online ISBN: 9783030273750 978-3-030-27375-0 Full Article
base Bootstrap confidence regions based on M-estimators under nonstandard conditions By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Stephen M. S. Lee, Puyudi Yang. Source: The Annals of Statistics, Volume 48, Number 1, 274--299. Abstract: Suppose that a confidence region is desired for a subvector $\theta$ of a multidimensional parameter $\xi=(\theta,\psi)$, based on an M-estimator $\hat{\xi}_{n}=(\hat{\theta}_{n},\hat{\psi}_{n})$ calculated from a random sample of size $n$. Under nonstandard conditions $\hat{\xi}_{n}$ often converges at a nonregular rate $r_{n}$, in which case consistent estimation of the distribution of $r_{n}(\hat{\theta}_{n}-\theta)$, a pivot commonly chosen for confidence region construction, is most conveniently effected by the $m$ out of $n$ bootstrap. The above choice of pivot has three drawbacks: (i) the shape of the region is either subjectively prescribed or controlled by a computationally intensive depth function; (ii) the region is not transformation equivariant; (iii) $\hat{\xi}_{n}$ may not be uniquely defined. To resolve the above difficulties, we propose a one-dimensional pivot derived from the criterion function, and prove that its distribution can be consistently estimated by the $m$ out of $n$ bootstrap, or by a modified version of the perturbation bootstrap. This leads to a new method for constructing confidence regions which are transformation equivariant and have shapes driven solely by the criterion function. A subsampling procedure is proposed for selecting $m$ in practice. Empirical performance of the new method is illustrated with examples drawn from different nonstandard M-estimation settings. Extension of our theory to row-wise independent triangular arrays is also explored. Full Article
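As a rough illustration of the $m$ out of $n$ bootstrap mentioned in the abstract above, the following Python sketch resamples $m < n$ observations with replacement and records the scaled deviation of the resampled estimate from the full-sample estimate. It shows only the conventional pivot $r_{n}(\hat{\theta}_{n}-\theta)$, not the criterion-function pivot the paper proposes. The function name, the choice of the sample maximum as estimator, the rate $r_{m}=m$ and the value $m=40$ are all assumptions made for the example.

```python
import random

def m_out_of_n_bootstrap(sample, estimator, rate, m, n_boot=1000, seed=0):
    """Return n_boot draws of rate(m) * (estimator(resample) - estimator(sample)).

    Each resample holds m observations drawn with replacement from `sample`.
    All names here are illustrative, not taken from the paper.
    """
    rng = random.Random(seed)
    theta_hat = estimator(sample)
    draws = []
    for _ in range(n_boot):
        resample = [rng.choice(sample) for _ in range(m)]  # m draws, with replacement
        draws.append(rate(m) * (estimator(resample) - theta_hat))
    return draws

# Assumed toy setting: the maximum of Uniform(0, 1) data converges at rate n.
data_rng = random.Random(1)
sample = [data_rng.uniform(0.0, 1.0) for _ in range(400)]
draws = m_out_of_n_bootstrap(sample, max, rate=lambda m: m, m=40, n_boot=500)
```

Quantiles of `draws` could then stand in for quantiles of the pivot when forming an interval. How to choose `m` is a separate question that the abstract addresses with a subsampling procedure.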
base Envelope-based sparse partial least squares By projecteuclid.org Published On :: Mon, 17 Feb 2020 04:02 EST Guangyu Zhu, Zhihua Su. Source: The Annals of Statistics, Volume 48, Number 1, 161--182. Abstract: Sparse partial least squares (SPLS) is widely used in applied sciences as a method that performs dimension reduction and variable selection simultaneously in linear regression. Several implementations of SPLS have been derived, among which the SPLS proposed in Chun and Keleş (J. R. Stat. Soc. Ser. B. Stat. Methodol. 72 (2010) 3–25) is very popular and highly cited. However, for all of these implementations, the theoretical properties of SPLS are largely unknown. In this paper, we propose a new version of SPLS, called the envelope-based SPLS, using a connection between envelope models and partial least squares (PLS). We establish the consistency, oracle property and asymptotic normality of the envelope-based SPLS estimator. The large-sample scenario and high-dimensional scenario are both considered. We also develop the envelope-based SPLS estimators under the context of generalized linear models, and discuss their theoretical properties including consistency, oracle property and asymptotic distribution. Numerical experiments and examples show that the envelope-based SPLS estimator has better variable selection and prediction performance than the SPLS estimator (J. R. Stat. Soc. Ser. B. Stat. Methodol. 72 (2010) 3–25). Full Article
base Minimax posterior convergence rates and model selection consistency in high-dimensional DAG models based on sparse Cholesky factors By projecteuclid.org Published On :: Wed, 30 Oct 2019 22:03 EDT Kyoungjae Lee, Jaeyong Lee, Lizhen Lin. Source: The Annals of Statistics, Volume 47, Number 6, 3413--3437. Abstract: In this paper we study the high-dimensional sparse directed acyclic graph (DAG) models under the empirical sparse Cholesky prior. Among our results, strong model selection consistency or graph selection consistency is obtained under more general conditions than those in the existing literature. Compared to Cao, Khare and Ghosh [Ann. Statist. (2019) 47 319–348], the required conditions are weakened in terms of the dimensionality, sparsity and lower bound of the nonzero elements in the Cholesky factor. Furthermore, our result does not require the irrepresentable condition, which is necessary for Lasso-type methods. We also derive the posterior convergence rates for precision matrices and Cholesky factors with respect to various matrix norms. The obtained posterior convergence rates are the fastest among those of the existing Bayesian approaches. In particular, we prove that our posterior convergence rates for Cholesky factors are the minimax or at least nearly minimax depending on the relative size of true sparseness for the entire dimension. The simulation study confirms that the proposed method outperforms the competing methods. Full Article
base A hierarchical curve-based approach to the analysis of manifold data By projecteuclid.org Published On :: Wed, 27 Nov 2019 22:01 EST Liberty Vittert, Adrian W. Bowman, Stanislav Katina. Source: The Annals of Applied Statistics, Volume 13, Number 4, 2539--2563.Abstract: One of the data structures generated by medical imaging technology is high resolution point clouds representing anatomical surfaces. Stereophotogrammetry and laser scanning are two widely available sources of this kind of data. A standardised surface representation is required to provide a meaningful correspondence across different images as a basis for statistical analysis. Point locations with anatomical definitions, referred to as landmarks, have been the traditional approach. Landmarks can also be taken as the starting point for more general surface representations, often using templates which are warped on to an observed surface by matching landmark positions and subsequent local adjustment of the surface. The aim of the present paper is to provide a new approach which places anatomical curves at the heart of the surface representation and its analysis. Curves provide intermediate structures which capture the principal features of the manifold (surface) of interest through its ridges and valleys. As landmarks are often available these are used as anchoring points, but surface curvature information is the principal guide in estimating the curve locations. The surface patches between these curves are relatively flat and can be represented in a standardised manner by appropriate surface transects to give a complete surface model. This new approach does not require the use of a template, reference sample or any external information to guide the method and, when compared with a surface based approach, the estimation of curves is shown to have improved performance. 
In addition, examples involving applications to mussel shells and human faces show that the analysis of curve information can deliver more targeted and effective insight than the use of full surface information. Full Article
base Outline analyses of the called strike zone in Major League Baseball By projecteuclid.org Published On :: Wed, 27 Nov 2019 22:01 EST Dale L. Zimmerman, Jun Tang, Rui Huang. Source: The Annals of Applied Statistics, Volume 13, Number 4, 2416--2451.Abstract: We extend statistical shape analytic methods known as outline analysis for application to the strike zone, a central feature of the game of baseball. Although the strike zone is rigorously defined by Major League Baseball’s official rules, umpires make mistakes in calling pitches as strikes (and balls) and may even adhere to a strike zone somewhat different than that prescribed by the rule book. Our methods yield inference on geometric attributes (centroid, dimensions, orientation and shape) of this “called strike zone” (CSZ) and on the effects that years, umpires, player attributes, game situation factors and their interactions have on those attributes. The methodology consists of first using kernel discriminant analysis to determine a noisy outline representing the CSZ corresponding to each factor combination, then fitting existing elliptic Fourier and new generalized superelliptic models for closed curves to that outline and finally analyzing the fitted model coefficients using standard methods of regression analysis, factorial analysis of variance and variance component estimation. We apply these methods to PITCHf/x data comprising more than three million called pitches from the 2008–2016 Major League Baseball seasons to address numerous questions about the CSZ. We find that all geometric attributes of the CSZ, except its size, became significantly more like those of the rule-book strike zone from 2008–2016 and that several player attribute/game situation factors had statistically and practically significant effects on many of them. 
We also establish that the variation in the horizontal center, width and area of an individual umpire’s CSZ from pitch to pitch is smaller than their variation among CSZs from different umpires. Full Article
base A latent discrete Markov random field approach to identifying and classifying historical forest communities based on spatial multivariate tree species counts By projecteuclid.org Published On :: Wed, 27 Nov 2019 22:01 EST Stephen Berg, Jun Zhu, Murray K. Clayton, Monika E. Shea, David J. Mladenoff. Source: The Annals of Applied Statistics, Volume 13, Number 4, 2312--2340.Abstract: The Wisconsin Public Land Survey database describes historical forest composition at high spatial resolution and is of interest in ecological studies of forest composition in Wisconsin just prior to significant Euro-American settlement. For such studies it is useful to identify recurring subpopulations of tree species known as communities, but standard clustering approaches for subpopulation identification do not account for dependence between spatially nearby observations. Here, we develop and fit a latent discrete Markov random field model for the purpose of identifying and classifying historical forest communities based on spatially referenced multivariate tree species counts across Wisconsin. We show empirically for the actual dataset and through simulation that our latent Markov random field modeling approach improves prediction and parameter estimation performance. For model fitting we introduce a new stochastic approximation algorithm which enables computationally efficient estimation and classification of large amounts of spatial multivariate count data. Full Article
base Fitting a deeply nested hierarchical model to a large book review dataset using a moment-based estimator By projecteuclid.org Published On :: Wed, 27 Nov 2019 22:01 EST Ningshan Zhang, Kyle Schmaus, Patrick O. Perry. Source: The Annals of Applied Statistics, Volume 13, Number 4, 2260--2288. Abstract: We consider a particular instance of a common problem in recommender systems, using a database of book reviews to inform user-targeted recommendations. In our dataset, books are categorized into genres and subgenres. To exploit this nested taxonomy, we use a hierarchical model that enables information pooling across similar items at many levels within the genre hierarchy. The main challenge in deploying this model is computational. The data sizes are large and fitting the model at scale using off-the-shelf maximum likelihood procedures is prohibitive. To get around this computational bottleneck, we extend a moment-based fitting procedure proposed for fitting single-level hierarchical models to the general case of arbitrarily deep hierarchies. This extension is an order of magnitude faster than standard maximum likelihood procedures. The fitting method can be deployed beyond recommender systems to general contexts with deeply nested hierarchical generalized linear mixed models. Full Article
base Radio-iBAG: Radiomics-based integrative Bayesian analysis of multiplatform genomic data By projecteuclid.org Published On :: Wed, 16 Oct 2019 22:03 EDT Youyi Zhang, Jeffrey S. Morris, Shivali Narang Aerry, Arvind U. K. Rao, Veerabhadran Baladandayuthapani. Source: The Annals of Applied Statistics, Volume 13, Number 3, 1957--1988. Abstract: Technological innovations have produced large multi-modal datasets that include imaging and multi-platform genomics data. Integrative analyses of such data have the potential to reveal important biological and clinical insights into complex diseases like cancer. In this paper, we present Bayesian approaches for integrative analysis of radiological imaging and multi-platform genomic data, wherein our goals are to simultaneously identify genomic and radiomic, that is, radiology-based imaging markers, along with the latent associations between these two modalities, and to detect the overall prognostic relevance of the combined markers. For this task, we propose Radio-iBAG: Radiomics-based Integrative Bayesian Analysis of Multiplatform Genomic Data, a multi-scale Bayesian hierarchical model that involves several innovative strategies: it incorporates integrative analysis of multi-platform genomic data sets to capture fundamental biological relationships; explores the associations between radiomic markers accompanying genomic information with clinical outcomes; and detects genomic and radiomic markers associated with clinical prognosis. We also introduce the use of sparse Principal Component Analysis (sPCA) to extract a sparse set of approximately orthogonal meta-features each containing information from a set of related individual radiomic features, reducing dimensionality and combining like features. Our methods are motivated by and applied to The Cancer Genome Atlas glioblastoma multiforme data set, wherein we integrate magnetic resonance imaging-based biomarkers along with genomic, epigenomic and transcriptomic data. 
Our model identifies important magnetic resonance imaging features and the associated genomic platforms that are related with patient survival times. Full Article
base Interacting reinforced stochastic processes: Statistical inference based on the weighted empirical means By projecteuclid.org Published On :: Fri, 31 Jan 2020 04:06 EST Giacomo Aletti, Irene Crimaldi, Andrea Ghiglietti. Source: Bernoulli, Volume 26, Number 2, 1098--1138. Abstract: This work deals with a system of interacting reinforced stochastic processes, where each process $X^{j}=(X_{n,j})_{n}$ is located at a vertex $j$ of a finite weighted directed graph, and it can be interpreted as the sequence of “actions” adopted by an agent $j$ of the network. The interaction among the dynamics of these processes depends on the weighted adjacency matrix $W$ associated to the underlying graph: indeed, the probability that an agent $j$ chooses a certain action depends on its personal “inclination” $Z_{n,j}$ and on the inclinations $Z_{n,h}$, with $h\neq j$, of the other agents according to the entries of $W$. The best known example of reinforced stochastic process is the Pólya urn. The present paper focuses on the weighted empirical means $N_{n,j}=\sum_{k=1}^{n}q_{n,k}X_{k,j}$, since, for example, the current experience is more important than the past one in reinforced learning. Their almost sure synchronization and some central limit theorems in the sense of stable convergence are proven. The new approach with weighted means highlights the key points in proving some recent results for the personal inclinations $Z^{j}=(Z_{n,j})_{n}$ and for the empirical means $\overline{X}^{j}=(\sum_{k=1}^{n}X_{k,j}/n)_{n}$ given in recent papers (e.g. Aletti, Crimaldi and Ghiglietti (2019), Ann. Appl. Probab. 27 (2017) 3787–3844, Crimaldi et al. Stochastic Process. Appl. 129 (2019) 70–101). In fact, with a more sophisticated decomposition of the considered processes, we can understand how the different convergence rates of the involved stochastic processes combine. 
From an application point of view, we provide confidence intervals for the common limit inclination of the agents and a test statistic to make inference on the matrix $W$, based on the weighted empirical means. In particular, we answer a research question posed in Aletti, Crimaldi and Ghiglietti (2019). Full Article
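To make the weighted empirical mean $N_{n,j}=\sum_{k=1}^{n}q_{n,k}X_{k,j}$ from the abstract above concrete, here is a small Python sketch for a single agent $j$. The abstract does not specify the weights $q_{n,k}$; the sketch assumes, purely for illustration, that they are raw weights normalised to sum to one and that the raw weights increase with $k$ so recent actions count more. The function name is hypothetical.

```python
def weighted_empirical_mean(actions, raw_weights):
    """Compute sum_k q_k * x_k with q_k = raw_weights[k] / sum(raw_weights).

    The normalisation is an assumption of this sketch, not taken from the paper.
    """
    if len(actions) != len(raw_weights):
        raise ValueError("actions and raw_weights must have the same length")
    total = sum(raw_weights)
    return sum(w * x for w, x in zip(raw_weights, actions)) / total

# Binary actions of one agent; later observations get larger (assumed) weights.
actions = [0, 0, 1, 1]
weighted = weighted_empirical_mean(actions, [1, 2, 3, 4])
plain = weighted_empirical_mean(actions, [1, 1, 1, 1])  # ordinary empirical mean
```

With these inputs the weighted mean is larger than the plain mean, because both actions equal to 1 occur at the end of the sequence.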
base A unified principled framework for resampling based on pseudo-populations: Asymptotic theory By projecteuclid.org Published On :: Fri, 31 Jan 2020 04:06 EST Pier Luigi Conti, Daniela Marella, Fulvia Mecatti, Federico Andreis. Source: Bernoulli, Volume 26, Number 2, 1044--1069. Abstract: In this paper, a class of resampling techniques for finite populations under $\pi$ps sampling design is introduced. The basic idea on which they rest is a two-step procedure consisting of: (i) constructing a “pseudo-population” on the basis of sample data; (ii) drawing a sample from the predicted population according to an appropriate resampling design. From a logical point of view, this approach is essentially based on the plug-in principle by Efron, at the “sampling design level”. Theoretical justifications based on large sample theory are provided. New approaches to construct pseudo-populations based on various forms of calibrations are proposed. Finally, a simulation study is performed. Full Article
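The two-step procedure described in the abstract above can be sketched roughly as follows in Python. This is a minimal illustration under assumptions, not the authors' method. Step (i) replicates each sampled unit `round(1 / pi_i)` times, which is one simple rule among many. Step (ii) draws units one at a time without replacement with probability proportional to the original inclusion probability, which only approximates a true $\pi$ps design. All function names and the toy data are hypothetical.

```python
import random

def build_pseudo_population(sample, incl_probs):
    """Step (i): replicate each sampled unit round(1 / pi_i) times (assumed rule)."""
    population = []
    for unit, pi in zip(sample, incl_probs):
        population.extend([(unit, pi)] * round(1.0 / pi))
    return population

def resample(population, n, rng):
    """Step (ii): sequentially draw n units without replacement,
    each draw proportional to the unit's original inclusion probability
    (a simplification of a pi-ps resampling design)."""
    pool = list(population)
    drawn = []
    for _ in range(n):
        idx = rng.choices(range(len(pool)), weights=[p for _, p in pool])[0]
        drawn.append(pool.pop(idx)[0])
    return drawn

# Toy sample of three units with assumed inclusion probabilities.
sample = [10.0, 20.0, 50.0]
incl_probs = [0.1, 0.2, 0.5]
pseudo_pop = build_pseudo_population(sample, incl_probs)
boot_sample = resample(pseudo_pop, n=len(sample), rng=random.Random(0))
```

Repeating step (ii) many times and recomputing the estimator on each `boot_sample` would give a resampling distribution in the spirit of the plug-in principle the abstract refers to.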