ap

A simulation study of disaggregation regression for spatial disease mapping. (arXiv:2005.03604v1 [stat.AP])

Disaggregation regression has become an important tool in spatial disease mapping for making fine-scale predictions of disease risk from aggregated response data. By including high resolution covariate information and modelling the data generating process on a fine scale, it is hoped that these models can accurately learn the relationships between covariates and response at a fine spatial scale. However, validating these high resolution predictions can be a challenge, as often there is no data observed at this spatial scale. In this study, disaggregation regression was performed on simulated data in various settings and the resulting fine-scale predictions were compared to the simulated ground truth. Performance was investigated with varying numbers of data points, sizes of aggregated areas and levels of model misspecification. The effectiveness of cross-validation on the aggregate level as a measure of fine-scale predictive performance was also investigated. Predictive performance improved as the number of observations increased and as the size of the aggregated areas decreased. When the model was well-specified, fine-scale predictions were accurate even with small numbers of observations and large aggregated areas. Under model misspecification, predictive performance was significantly worse for large aggregated areas but remained high when response data was aggregated over smaller regions. Cross-validation correlation on the aggregate level was a moderately good predictor of fine-scale predictive performance. While the simulations are unlikely to capture the nuances of real-life response data, this study gives insight into the effectiveness of disaggregation regression in different contexts.
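
The link between a fine-scale model and an aggregated response can be illustrated with a minimal sketch. It assumes a log-linear pixel-level rate and a population-weighted sum over the pixels in an area, which is a common formulation but is not stated in the abstract; the function names and the toy numbers are hypothetical.

```python
import math

def pixel_rate(covariates, intercept, slopes):
    """Fine-scale rate for one pixel under an assumed log-linear model."""
    eta = intercept + sum(b * x for b, x in zip(slopes, covariates))
    return math.exp(eta)

def predicted_area_cases(pixels, intercept, slopes):
    """Aggregate pixel-level predictions to one area.

    pixels: list of (population, covariates) tuples for the pixels in the area.
    The area total is the population-weighted sum of the pixel rates.
    """
    return sum(pop * pixel_rate(cov, intercept, slopes) for pop, cov in pixels)

# Toy area with two pixels and one covariate (illustrative values only).
area = [(10.0, [1.0]), (20.0, [2.0])]
total = predicted_area_cases(area, intercept=0.0, slopes=[0.0])
```

In a fitted model, the parameters would be chosen so that these area totals match the observed aggregated counts, and the pixel rates would then serve as the fine-scale predictions.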




ap

Domain Adaptation in Highly Imbalanced and Overlapping Datasets. (arXiv:2005.03585v1 [cs.LG])

In many Machine Learning domains, datasets are characterized by highly imbalanced and overlapping classes. Particularly in the medical domain, a specific list of symptoms can be labeled as one of various different conditions. Some of these conditions may be more prevalent than others by several orders of magnitude. Here we present a novel unsupervised Domain Adaptation scheme for such datasets. The scheme, based on a specific type of Quantification, is designed to work under both label and conditional shifts. It is demonstrated on datasets generated from Electronic Health Records and provides high quality results for both Quantification and Domain Adaptation in very challenging scenarios. Potential benefits of using this scheme in the current COVID-19 outbreak, for estimation of prevalence and probability of infection, are discussed.




ap

Predictive Modeling of ICU Healthcare-Associated Infections from Imbalanced Data. Using Ensembles and a Clustering-Based Undersampling Approach. (arXiv:2005.03582v1 [cs.LG])

Early detection of patients vulnerable to infections acquired in the hospital environment is a challenge in current health systems given the impact that such infections have on patient mortality and healthcare costs. This work is focused on both the identification of risk factors and the prediction of healthcare-associated infections in intensive-care units by means of machine-learning methods. The aim is to support decision making aimed at reducing the incidence rate of infections. In this field, it is necessary to deal with the problem of building reliable classifiers from imbalanced datasets. We propose a clustering-based undersampling strategy to be used in combination with ensemble classifiers. A comparative study with data from 4616 patients was conducted in order to validate our proposal. We applied several single and ensemble classifiers both to the original dataset and to data preprocessed by means of different resampling methods. The results were analyzed by means of classic and recent metrics specifically designed for imbalanced data classification. They revealed that the proposal is more efficient than other approaches.




ap

Sequential Aggregation of Probabilistic Forecasts -- Application to Wind Speed Ensemble Forecasts. (arXiv:2005.03540v1 [stat.AP])

In the field of numerical weather prediction (NWP), the probabilistic distribution of the future state of the atmosphere is sampled with Monte-Carlo-like simulations, called ensembles. These ensembles have deficiencies (such as conditional biases) that can be corrected thanks to statistical post-processing methods. Several ensembles exist and may be corrected with different statistical methods. A further step is to combine these raw or post-processed ensembles. The theory of prediction with expert advice allows us to build combination algorithms with theoretical guarantees on the forecast performance. This article adapts this theory to the case of probabilistic forecasts issued as step-wise cumulative distribution functions (CDF). The theory is applied to wind speed forecasting, by combining several raw or post-processed ensembles, considered as CDFs. The second goal of this study is to explore the use of two forecast performance criteria: the continuous ranked probability score (CRPS) and the Jolliffe-Primo test. Comparing the results obtained with both criteria leads to reconsidering the usual way to build skillful probabilistic forecasts, based on the minimization of the CRPS. Minimizing the CRPS does not necessarily produce reliable forecasts according to the Jolliffe-Primo test. The Jolliffe-Primo test generally selects reliable forecasts, but could lead to issuing suboptimal forecasts in terms of CRPS. It is proposed to use both criteria to achieve reliable and skillful probabilistic forecasts.
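
For a forecast issued as the step-wise empirical CDF of an ensemble, the CRPS has a well-known closed form: the mean absolute error of the members minus half the mean absolute difference between members. The sketch below is a minimal illustration of that formula only; the function name is hypothetical and it is not the article's aggregation algorithm.

```python
def crps_ensemble(members, obs):
    """CRPS of the step-wise empirical CDF of `members` against `obs`.

    Uses the identity CRPS = E|X - y| - 0.5 * E|X - X'|, where X and X'
    are drawn independently from the ensemble members.
    """
    m = len(members)
    mean_abs_error = sum(abs(x - obs) for x in members) / m
    mean_abs_spread = sum(abs(xi - xj) for xi in members for xj in members) / (m * m)
    return mean_abs_error - 0.5 * mean_abs_spread

# Two-member ensemble with the observation halfway between the members.
score = crps_ensemble([0.0, 2.0], 1.0)
```

Lower values are better, and for a single-member ensemble the score reduces to the absolute error.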




ap

A Locally Adaptive Interpretable Regression. (arXiv:2005.03350v1 [stat.ML])

Machine learning models with both good predictability and high interpretability are crucial for decision support systems. Linear regression is one of the most interpretable prediction models. However, the linearity in a simple linear regression worsens its predictability. In this work, we introduce a locally adaptive interpretable regression (LoAIR). In LoAIR, a metamodel parameterized by neural networks predicts the percentile of a Gaussian distribution for the regression coefficients for rapid adaptation. Our experimental results on public benchmark datasets show that our model not only achieves comparable or better predictive performance than the other state-of-the-art baselines but also discovers some interesting relationships between input and target variables such as a parabolic relationship between CO2 emissions and Gross National Product (GNP). Therefore, LoAIR is a step towards bridging the gap between econometrics, statistics, and machine learning by improving the predictive ability of linear regression without degrading its interpretability.




ap

Reducing Communication in Graph Neural Network Training. (arXiv:2005.03300v1 [cs.LG])

Graph Neural Networks (GNNs) are powerful and flexible neural networks that use the naturally sparse connectivity information of the data. GNNs represent this connectivity as sparse matrices, which have lower arithmetic intensity and thus higher communication costs compared to dense matrices, making GNNs harder to scale to high concurrencies than convolutional or fully-connected neural networks.

We present a family of parallel algorithms for training GNNs. These algorithms are based on their counterparts in dense and sparse linear algebra, but they had not been previously applied to GNN training. We show that they can asymptotically reduce communication compared to existing parallel GNN training methods. We implement a promising and practical version that is based on 2D sparse-dense matrix multiplication using torch.distributed. Our implementation parallelizes over GPU-equipped clusters. We train GNNs on up to a hundred GPUs on datasets that include a protein network with over a billion edges.




ap

Classification of pediatric pneumonia using chest X-rays by functional regression. (arXiv:2005.03243v1 [stat.AP])

An accurate and prompt diagnosis of pediatric pneumonia is imperative for successful treatment intervention. One approach to diagnosing pneumonia cases is to use radiographic data. In this article, we propose a novel parsimonious scalar-on-image classification model adopting the ideas of functional data analysis. Our main idea is to treat images as functional measurements and exploit underlying covariance structures to select basis functions; these bases are then used in approximating both the image profiles and the corresponding regression coefficient. We re-express the regression model as a standard generalized linear model where the functional principal component scores are treated as covariates. We apply the method to (1) classify pneumonia against healthy and viral against bacterial pneumonia patients, and (2) test the null hypothesis of no association between images and responses. Extensive simulation studies show excellent numerical performance in terms of classification, hypothesis testing, and efficient computation.




ap

Subdomain Adaptation with Manifolds Discrepancy Alignment. (arXiv:2005.03229v1 [cs.LG])

Reducing domain divergence is a key step in transfer learning problems. Existing works focus on the minimization of global domain divergence. However, two domains may consist of several shared subdomains, and differ from each other in each subdomain. In this paper, we take the local divergence of subdomains into account in transfer. Specifically, we propose to use a low-dimensional manifold to represent each subdomain, and align the local data distribution discrepancy in each manifold across domains. A Manifold Maximum Mean Discrepancy (M3D) is developed to measure the local distribution discrepancy in each manifold. We then propose a general framework, called Transfer with Manifolds Discrepancy Alignment (TMDA), to couple the discovery of data manifolds with the minimization of M3D. We instantiate TMDA in the subspace learning case considering both linear and nonlinear mappings. We also instantiate TMDA in the deep learning framework. Extensive experimental studies demonstrate that TMDA is a promising method for various transfer learning tasks.
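
As background for the discrepancy measure, here is a minimal sketch of the standard (global) Maximum Mean Discrepancy with a Gaussian kernel, using the biased estimator. It is not the paper's M3D, which is computed per manifold; the kernel choice, bandwidth, and function names are assumptions made for illustration.

```python
import math

def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points given as sequences of floats."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def mmd2_biased(xs, ys, bandwidth=1.0):
    """Biased estimate of squared MMD between samples xs and ys."""
    def mean_kernel(us, vs):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in us for v in vs)
        return total / (len(us) * len(vs))
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)

# Toy source and target samples in two dimensions (illustrative values only).
source = [[0.0, 0.0], [0.1, -0.1], [-0.1, 0.1]]
target = [[2.0, 2.0], [2.1, 1.9], [1.9, 2.1]]
same = mmd2_biased(source, source)
shifted = mmd2_biased(source, target)
```

The estimate is zero when both samples are identical and grows as the two samples move apart, which is what makes it usable as an alignment objective.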




ap

Adaptive Invariance for Molecule Property Prediction. (arXiv:2005.03004v1 [q-bio.QM])

Effective property prediction methods can help accelerate the search for COVID-19 antivirals either through accurate in-silico screens or by effectively guiding on-going at-scale experimental efforts. However, existing prediction tools have limited ability to accommodate the scarce or fragmented training data currently available. In this paper, we introduce a novel approach to learn predictors that can generalize or extrapolate beyond the heterogeneous data. Our method builds on and extends recently proposed invariant risk minimization, adaptively forcing the predictor to avoid nuisance variation. We achieve this by continually exercising and manipulating latent representations of molecules to highlight undesirable variation to the predictor. To test the method we use a combination of three data sources: SARS-CoV-2 antiviral screening data, molecular fragments that bind to SARS-CoV-2 main protease and large screening data for SARS-CoV-1. Our predictor outperforms state-of-the-art transfer learning methods by a significant margin. We also report the top 20 predictions of our model on the Broad Drug Repurposing Hub.




ap

Entries now open for the 2020 National Biography Award

Tuesday 10 December 2019

Entries are now open for the 2020 National Biography Award – Australia's richest prize for biography and memoir writing.




ap

mgm: Estimating Time-Varying Mixed Graphical Models in High-Dimensional Data

We present the R package mgm for the estimation of k-order mixed graphical models (MGMs) and mixed vector autoregressive (mVAR) models in high-dimensional data. These are useful extensions of graphical models for only one variable type, since data sets consisting of mixed types of variables (continuous, count, categorical) are ubiquitous. In addition, we allow the stationarity assumption of both models to be relaxed by introducing time-varying versions of MGMs and mVAR models based on a kernel weighting approach. Time-varying models offer a rich description of temporally evolving systems and make it possible to identify external influences on the model structure such as the impact of interventions. We provide the background of all implemented methods and provide fully reproducible examples that illustrate how to use the package.
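
The kernel weighting idea behind time-varying estimation can be sketched in a few lines. The example below is written in Python rather than R and does not use the mgm API; it assumes a Gaussian kernel and estimates a local mean instead of a full graphical model, and all function names are hypothetical.

```python
import math

def kernel_weights(times, t0, bandwidth):
    """Normalized Gaussian kernel weights centred on the estimation point t0."""
    raw = [math.exp(-((t - t0) ** 2) / (2.0 * bandwidth ** 2)) for t in times]
    total = sum(raw)
    return [w / total for w in raw]

def local_mean(values, times, t0, bandwidth):
    """Kernel-weighted mean: observations close to t0 in time count the most."""
    weights = kernel_weights(times, t0, bandwidth)
    return sum(w * v for w, v in zip(weights, values))

# Observation times scaled to [0, 1] and a series that drifts upwards.
times = [0.0, 0.25, 0.5, 0.75, 1.0]
values = [0.0, 1.0, 2.0, 3.0, 4.0]
weights_mid = kernel_weights(times, 0.5, 0.2)
early = local_mean(values, times, 0.0, 0.2)
late = local_mean(values, times, 1.0, 0.2)
```

Repeating the weighted estimate at a grid of estimation points gives a sequence of local models. A smaller bandwidth tracks change more closely at the cost of using fewer effective observations per estimate.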




ap

Wine science : principles and applications

Jackson, Ron S., author.
9780128161180




ap

Urban landscape entomology

Held, David W. (David Wayne), 1972- author
9780128130728 (electronic bk.)




ap

Tumor microenvironment : the main driver of metabolic adaptation

9783030340254 (electronic bk.)




ap

Theranostics approaches to gastric and colon cancer

9789811520174 (electronic bk.)




ap

The science of grapevines

Keller, Markus, (horticulturist) author
9780128167021 (electronic bk.)




ap

The complexity of bird behaviour : a facet theory approach

Hackett, Paul, 1960- author
9783030121921 (electronic bk.)




ap

The Best and Worst Places to be a Woman in Canada 2019 : The Gender Gap in Canada’s 26 Biggest Cities

9781771254434 (print)




ap

Temporomandibular disorders : a translational approach from basic science to clinical applicability

9783319572475 (electronic bk.)




ap

Systems approaches to making change : a practical guide

9781447174721 (electronic bk.)




ap

Structured object-oriented formal language and method : 9th International Workshop, SOFL+MSVL 2019, Shenzhen, China, November 5, 2019, Revised selected papers

SOFL+MSVL (Workshop) (9th : 2019 : Shenzhen, China)
9783030414184 (electronic bk.)




ap

Space information networks : 4th International Conference, SINC 2019, Wuzhen, China, September 19-20, 2019, Revised Selected Papers

SINC (Conference) (4th : 2019 : Wuzhen, China)
9789811534423 (electronic bk.)




ap

Sowing legume seeds, reaping cash : a renaissance within communities in Sub-Saharan Africa

Akpo, Essegbemon, author.
9789811508455 (electronic bk.)




ap

Semantic technology : 9th Joint International Conference, JIST 2019, Hangzhou, China, November 25-27, 2019, Revised selected papers

Joint International Semantic Technology Conference (9th : 2019 : Hangzhou, China)
9789811534126 (electronic bk.)




ap

Salt, fat and sugar reduction : sensory approaches for nutritional reformulation of foods and beverages

O'Sullivan, Maurice G., author
9780128226124 (electronic bk.)




ap

Regulation of cancer immune checkpoints : molecular and cellular mechanisms and therapy

9789811532665




ap

Rapid Recovery in Total Joint Arthroplasty

9783030412234




ap

Plant-fire interactions : applying ecophysiology to wildfire management

Resco de Dios, Víctor, author
9783030411923 (electronic book)




ap

Plant small RNA : biogenesis, regulation and application

9780128173367 (electronic bk.)




ap

Plant microRNAs : shaping development and environmental responses

9783030357726 (electronic bk.)




ap

Phytoremediation : in-situ applications

9783030000998 (electronic bk.)




ap

Personalized food intervention and therapy for autism spectrum disorder management

9783030304027 (electronic bk.)




ap

Ocular therapeutics handbook : a clinical manual

Onofrey, Bruce E., author.
197510904X




ap

Neonatal lung ultrasonography

9789402415490 (electronic bk.)




ap

Natural materials and products from insects : chemistry and applications

9783030366100 (electronic bk.)




ap

Nanoencapsulation of food ingredients by specialized equipment

9780128156728 (electronic bk.)




ap

Nanobiomaterial engineering : concepts and their applications in biomedicine and diagnostics

9789813298408 (electronic bk.)




ap

Models of tree and stand dynamics : theory, formulation and application

Mäkelä, Annikki, author
9783030357610




ap

Microbial endophytes : functional biology and applications

9780128196540 (print)




ap

Maxillofacial cone beam computed tomography : principles, techniques and clinical applications

9783319620619 (electronic bk.)




ap

Landscape modelling and decision support

9783030374211 (electronic bk.)




ap

Ketamine : from abused drug to rapid-acting antidepressant

9789811529023




ap

Intelligent wavelet based techniques for advanced multimedia applications

Singh, Rajiv, author
9783030318734 (electronic bk.)




ap

Geriatric Medicine : a Problem-Based Approach

9789811032530




ap

Gapenski's understanding healthcare financial management

Pink, George H., author.
9781640551145 (electronic bk.)




ap

Functional foods in cancer prevention and therapy

9780128165386 (electronic bk.)




ap

Extra-coronal restorations : concepts and clinical application

9783319790930 (electronic bk.)




ap

Enterprise information systems : 21st International Conference, ICEIS 2019, Heraklion, Crete, Greece, May 3-5, 2019, Revised Selected Papers

International Conference on Enterprise Information Systems (21st : 2019 : Ērakleion, Greece)
9783030407834 (electronic bk.)




ap

Deep learning in medical image analysis : challenges and applications

9783030331283 (electronic bk.)




ap

Current microbiological research in Africa : selected applications for sustainable environmental management

9783030352967 (electronic bk.)