prediction

Predictions 2020: Facebook Caves, Google Zags, Netflix Sells Out, and Data Policy Gets Sexy

A new year brings another run at my annual predictions: For 17 years now, I’ve taken a few hours to imagine what might happen over the course of the coming twelve months. And my goodness did I swing for the fences last year — and I pretty much whiffed. Batting .300 is great in the majors, but it …




prediction

Predictions and Policymaking: Complex Modelling Beyond COVID-19

1 April 2020

Yasmin Afina

Research Assistant, International Security Programme

Calum Inverarity

Research Analyst and Coordinator, International Security Programme

The COVID-19 pandemic has highlighted the potential of complex systems modelling for policymaking, but it is crucial to also understand its limitations.


A member of the media wearing a protective face mask works in Downing Street where Britain's Prime Minister Boris Johnson is self-isolating in central London, 27 March 2020. Photo by TOLGA AKMEN/AFP via Getty Images.

Complex systems models have played a significant role in informing and shaping the public health measures adopted by governments in the context of the COVID-19 pandemic. For instance, modelling carried out by a team at Imperial College London is widely reported to have shifted the UK's approach from a strategy of mitigation to one of suppression.

Complex systems modelling will increasingly feed into policymaking by predicting a range of potential correlations, results and outcomes based on a set of parameters, assumptions, data and pre-defined interactions. It is already instrumental in developing risk mitigation and resilience measures to address and prepare for existential crises such as pandemics, the prospect of nuclear war and climate change.

The human factor

In the end, model-driven approaches must stand up to the test of real-life data. Modelling for policymaking must take into account a number of caveats and limitations. Models are developed to help answer specific questions, and their predictions will depend on the hypotheses and definitions set by the modellers, which are subject to their individual and collective biases and assumptions. For instance, the models developed by Imperial College came with the caveated assumption that a policy of social distancing for people over 70 would have a 75 per cent compliance rate. This assumption is based on the modellers’ own perceptions of demographics and society, and may not reflect all the societal factors that could affect this compliance rate in real life, such as gender, age, ethnicity, genetic diversity, economic stability, and access to food, supplies and healthcare. This is why modelling benefits from a cognitively diverse team that brings a wide range of knowledge and understanding to the early creation of a model.

The potential of artificial intelligence

Machine learning, or artificial intelligence (AI), has the potential to advance the capacity and accuracy of modelling techniques by identifying new patterns and interactions, and by overcoming some of the limitations resulting from human assumptions and bias. Yet increasing reliance on these techniques raises the issue of explainability. Policymakers need to be fully aware of, and understand, the model, assumptions and input data behind any predictions, and must be able to communicate this aspect of modelling in order to uphold democratic accountability and transparency in public decision-making.

In addition, models using machine learning techniques require extensive amounts of data, which must also be of high quality and as free from bias as possible to ensure accuracy and address the issues at stake. Although technology may be used in the process (e.g. automated extraction and processing of information with big data), data is ultimately created, collected, aggregated and analysed by and for human users. Datasets will reflect the individual and collective biases and assumptions of those creating, collecting, processing and analysing the data. Algorithmic bias is therefore inevitable, and it is essential that policy- and decision-makers are fully aware of how reliable the systems are, as well as of their potential social implications.

The age of distrust

The increasing use of emerging technologies for data- and evidence-based policymaking is taking place, paradoxically, in an era of growing mistrust towards expertise and experts, as infamously summarised by Michael Gove. Policymakers and subject-matter experts have faced increased public scrutiny of their findings and of the policies those findings have been used to justify.

This distrust and scepticism within public discourse has only been fuelled by an ever-increasing availability of diffuse sources of information, not all of which are verifiable and robust. This has caused tension between experts, policymakers and the public, and has led to conflict and uncertainty over which data and predictions can be trusted, and to what degree. The dynamic is exacerbated by the fact that certain individuals may purposefully misappropriate, or simply misinterpret, data to support their arguments or policies. Politicians are presently considered the least trusted professionals by the UK public, highlighting the importance of better and more effective communication between the scientific community, policymakers and the populations affected by policy decisions.

Acknowledging limitations

While measures can and should be built in to improve the transparency and robustness of scientific models and so counteract these common criticisms, it is important to acknowledge that there are limits to the steps that can be taken. This is particularly the case when dealing with predictions of future events, which inherently involve degrees of uncertainty that cannot be fully accounted for by human or machine. As a result, if not carefully considered and communicated, the increased use of complex modelling in policymaking may undermine and obfuscate the policymaking process, contributing to significant mistakes, increased uncertainty, loss of trust in the models and in the political process, and further disaffection among citizens.

The potential contribution of complexity modelling to the work of policymakers is undeniable. However, it is imperative to appreciate the inner workings and limitations of these models, such as the biases that underpin their functioning and the uncertainties that they will never fully account for, in spite of their immense power. They must be tested against the data, again and again, as new information becomes available; otherwise there is a risk of scientific models becoming embroiled in partisan politics and being weaponized for political purposes. It is therefore important not to treat these models as oracles, but instead as one of many contributions to the process of policymaking.




prediction

Phosphotyrosine-based Phosphoproteomics for Target Identification and Drug Response Prediction in AML Cell Lines [Research]

Acute myeloid leukemia (AML) is a clonal disorder arising from hematopoietic myeloid progenitors. Aberrantly activated tyrosine kinases (TK) are involved in leukemogenesis and are associated with poor treatment outcome. Kinase inhibitor (KI) treatment has shown promise in improving patient outcome in AML. However, inhibitor selection for patients is suboptimal.

In a preclinical effort to address KI selection, we analyzed a panel of 16 AML cell lines using phosphotyrosine (pY) enrichment-based, label-free phosphoproteomics. The Integrative Inferred Kinase Activity (INKA) algorithm was used to identify hyperphosphorylated, active kinases as candidates for KI treatment, and efficacy of selected KIs was tested.

Heterogeneous signaling was observed, with between 241 and 2764 phosphopeptides detected per cell line. Of 4853 identified phosphopeptides with 4229 phosphosites, 4459 phosphopeptides (4430 pY) were linked to 3605 class I sites (3525 pY). INKA analysis in single cell lines successfully pinpointed driver kinases (PDGFRA, JAK2, KIT and FLT3) corresponding with activating mutations present in these cell lines. Furthermore, potential receptor tyrosine kinase (RTK) drivers, undetected by standard molecular analyses, were identified in four cell lines (FGFR1 in KG-1 and KG-1a, PDGFRA in Kasumi-3, and FLT3 in MM6). These cell lines proved highly sensitive to specific KIs. Six AML cell lines without a clear RTK driver showed evidence of MAPK1/3 activation, indicative of the presence of activating upstream RAS mutations. Importantly, FLT3 phosphorylation was demonstrated in two clinical AML samples with a FLT3 internal tandem duplication (ITD) mutation.

Our data show the potential of pY-phosphoproteomics and INKA analysis to provide insight into AML TK signaling and to identify hyperactive kinases as potential targets for treatment in AML cell lines. These results warrant future investigation of clinical samples to further our understanding of TK phosphorylation in relation to clinical response in the individual patient.




prediction

Ocean acidification prediction now possible years in advance

(University of Colorado at Boulder) CU Boulder researchers have developed a method that could enable scientists to accurately forecast ocean acidity up to five years in advance. This would enable fisheries and seafood-dependent communities that are negatively affected by ocean acidification to adapt to changing conditions in real time, improving economic and food security in the coming decades.




prediction

Artificial Intelligence Prediction and Counterterrorism

9 August 2019

The use of AI in counterterrorism is not inherently wrong, and this paper suggests some necessary conditions for its legitimate use as part of a predictive approach to counterterrorism by liberal democratic states.

Kathleen McKendrick

British Army Officer, Former Visiting Research Fellow at Chatham House


Surveillance cameras manufactured by Hangzhou Hikvision Digital Technology Co. at a testing station near the company’s headquarters in Hangzhou, China. Photo: Getty Images

Summary

  • The use of predictive artificial intelligence (AI) in countering terrorism is often assumed to have a deleterious effect on human rights, generating spectres of ‘pre-crime’ punishment and surveillance states. However, the well-regulated use of new capabilities may enhance states’ abilities to protect citizens’ right to life, while at the same time improving adherence to principles intended to protect other human rights, such as transparency, proportionality and freedom from unfair discrimination. The same regulatory framework could also contribute to safeguarding against broader misuse of related technologies.
  • Most states focus on preventing terrorist attacks, rather than reacting to them. As such, prediction is already central to effective counterterrorism. AI allows higher volumes of data to be analysed, and may perceive patterns in those data that would, for reasons of both volume and dimensionality, otherwise be beyond the capacity of human interpretation. The impact of this is that traditional methods of investigation that work outwards from known suspects may be supplemented by methods that analyse the activity of a broad section of an entire population to identify previously unknown threats.
  • Developments in AI have amplified the ability to conduct surveillance without being constrained by resources. Facial recognition technology, for instance, may enable the complete automation of surveillance using CCTV in public places in the near future.
  • The current way predictive AI capabilities are used presents a number of interrelated problems from both a human rights and a practical perspective. Where limitations and regulations do exist, they may have the effect of curtailing the utility of approaches that apply AI, while not necessarily safeguarding human rights to an adequate extent.
  • The infringement of privacy associated with the automated analysis of certain types of public data is not wrong in principle, but the analysis must be conducted within a robust legal and policy framework that places sensible limitations on interventions based on its results.
  • In future, broader access to less intrusive aspects of public data, direct regulation of how those data are used – including oversight of activities by private-sector actors – and the imposition of technical as well as regulatory safeguards may improve both operational performance and compliance with human rights legislation. It is important that any such measures proceed in a manner that is sensitive to the impact on other rights such as freedom of expression, and freedom of association and assembly.




prediction

Combined Visual and Semi-quantitative Evaluation Improves Outcome Prediction by Early Mid-treatment 18F-fluoro-deoxy-glucose Positron Emission Tomography in Diffuse Large B-cell Lymphoma.

The purpose of this study was to assess the predictive and prognostic value of interim FDG PET (iPET) in evaluating early response to immuno-chemotherapy after two cycles (PET-2) in diffuse large B-cell lymphoma (DLBCL) by applying two different methods of interpretation: the Deauville visual five-point scale (5-PS) and the change in standardised uptake value (ΔSUV) by semi-quantitative evaluation. Methods: 145 patients with newly diagnosed DLBCL underwent pre-treatment PET (PET-0) and PET-2 assessment. PET-2 was classified according to both the visual 5-PS and the percentage SUV change (ΔSUV). Receiver operating characteristic (ROC) analysis was performed to compare the accuracy of the two methods for predicting progression-free survival (PFS). Survival estimates, based on each method separately and combined, were calculated for iPET-positive (iPET+) and iPET-negative (iPET–) groups and compared. Results: With both the visual and the ΔSUV-based evaluations, significant differences were found between the PFS of the iPET– and iPET+ patient groups (p<0.001). Visually, the best negative (NPV) and positive predictive values (PPV) occurred when iPET was defined as positive for a Deauville score of 4-5 (89% and 59%, respectively). Using the previously reported 66% ΔSUV cut-off value, NPV and PPV were 80% and 76%, respectively. A ΔSUV cut-off of 48.9%, reported for the first time here, produced 100% specificity along with the highest sensitivity (24%). Visual and semi-quantitative (ΔSUV<48.9%) assessment of each PET-2 gave the same classification (positive or negative) in 70% (102/145) of all patients. This combined classification delivered NPV and PPV of 89% and 100% respectively, and all iPET+ patients failed to achieve or remain in remission. Conclusion: In this large, consistently treated and assessed series of DLBCL, iPET had good prognostic value whether interpreted visually or semi-quantitatively. The most effective ΔSUV cut-off was 48.9%, and when combined with visual 5-PS assessment, a positive PET-2 was highly predictive of treatment failure.
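As an illustration of this kind of ROC-based cut-off selection and predictive-value calculation, here is a small Python sketch on entirely synthetic data. Variable names are hypothetical, and the Youden index below is only a generic way of choosing an operating point; the study itself selected its 48.9% cut-off for maximal sensitivity at 100% specificity.

```python
# Illustrative only: synthetic data standing in for per-patient PET-2 measurements.
import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(1)
delta_suv = rng.uniform(0, 100, 145)                                 # % SUV reduction, PET-0 to PET-2
progressed = (delta_suv + rng.normal(0, 25, 145) < 45).astype(int)   # 1 = treatment failure (toy rule)

# ROC analysis: iPET+ means a reduction below the cut-off, so the score is -delta_suv
fpr, tpr, thresholds = roc_curve(progressed, -delta_suv)
cutoff = -thresholds[np.argmax(tpr - fpr)]                           # Youden index (generic choice)

ipet_pos = delta_suv <= cutoff
tp = np.sum(ipet_pos & (progressed == 1))
fp = np.sum(ipet_pos & (progressed == 0))
tn = np.sum(~ipet_pos & (progressed == 0))
fn = np.sum(~ipet_pos & (progressed == 1))
print(f"cut-off {cutoff:.1f}%, PPV {tp / (tp + fp):.2f}, NPV {tn / (tn + fn):.2f}")
```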




prediction

64Cu-DOTATATE PET/CT and prediction of overall and progression-free survival in patients with neuroendocrine neoplasms

Overexpression of somatostatin receptors in patients with neuroendocrine neoplasms (NEN) is utilized for both diagnosis and treatment. Receptor density may reflect tumor differentiation and thus be associated with prognosis. Non-invasive visualization and quantification of somatostatin receptor density is possible by somatostatin receptor imaging (SRI) using positron emission tomography (PET). Recently, we introduced 64Cu-DOTATATE for SRI, and we hypothesized that uptake of this tracer could be associated with overall survival (OS) and progression-free survival (PFS). Methods: We evaluated patients with NEN who had a 64Cu-DOTATATE PET/CT SRI performed in two prospective studies. Tracer uptake was determined as the maximal standardized uptake value (SUVmax) for each patient. Kaplan-Meier analysis with log-rank testing was used to determine the predictive value of 64Cu-DOTATATE SUVmax for OS and PFS. Specificity, sensitivity and accuracy were calculated for prediction of outcome at 24 months after 64Cu-DOTATATE PET/CT. Results: A total of 128 patients with NEN were included and followed for a median of 73 (range 1-112) months. During follow-up, 112 patients experienced disease progression and 69 patients died. The optimal cutoff for 64Cu-DOTATATE SUVmax was 43.3 for prediction of PFS, with a hazard ratio of 0.56 (95% CI: 0.38-0.84) for patients with SUVmax > 43.3. However, no significant cutoff was found for prediction of OS. In multiple Cox regression adjusted for age, sex, primary tumor site and tumor grade, the SUVmax cutoff hazard ratio was 0.50 (0.32-0.77) for PFS. The accuracy was moderate for predicting PFS (57%) at 24 months after 64Cu-DOTATATE PET/CT. Conclusion: In this first study to report the association of 64Cu-DOTATATE PET/CT with outcome in patients with NEN, tumor somatostatin receptor density visualized with 64Cu-DOTATATE PET/CT was prognostic for PFS but not OS. However, the accuracy of prediction of PFS at 24 months after 64Cu-DOTATATE PET/CT SRI was moderate, limiting its value on an individual patient basis.
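A rough sketch of the survival comparison described in the abstract, using the lifelines library on synthetic data. Only the cohort size, the approximate event count and the 43.3 SUVmax cut-off are taken from the abstract; everything else, including the simulated relationship between uptake and PFS, is assumed for illustration.

```python
import numpy as np
from lifelines import KaplanMeierFitter
from lifelines.statistics import logrank_test

rng = np.random.default_rng(0)
n = 128
suv_max = rng.uniform(5, 120, n)                      # hypothetical 64Cu-DOTATATE SUVmax values
pfs_months = rng.exponential(15 + 0.3 * suv_max)      # toy assumption: higher uptake, longer PFS
progressed = rng.uniform(size=n) < 0.85               # roughly 112/128 events, as in the cohort

high = suv_max > 43.3                                 # cut-off reported in the abstract

km_high, km_low = KaplanMeierFitter(), KaplanMeierFitter()
km_high.fit(pfs_months[high], event_observed=progressed[high], label="SUVmax > 43.3")
km_low.fit(pfs_months[~high], event_observed=progressed[~high], label="SUVmax <= 43.3")

result = logrank_test(pfs_months[high], pfs_months[~high],
                      event_observed_A=progressed[high],
                      event_observed_B=progressed[~high])
print(km_high.median_survival_time_, km_low.median_survival_time_, result.p_value)
```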




prediction

Mass Spectrometry Based Immunopeptidomics Leads to Robust Predictions of Phosphorylated HLA Class I Ligands [Technological Innovation and Resources]

The presentation of peptides on class I human leukocyte antigen (HLA-I) molecules plays a central role in immune recognition of infected or malignant cells. In cancer, non-self HLA-I ligands can arise from many different alterations, including non-synonymous mutations, gene fusion, cancer-specific alternative mRNA splicing or aberrant post-translational modifications. Identifying HLA-I ligands remains a challenging task that requires either heavy experimental work for in vivo identification or optimized bioinformatics tools for accurate predictions. To date, no HLA-I ligand predictor includes post-translational modifications. To fill this gap, we curated phosphorylated HLA-I ligands from several immunopeptidomics studies (including six newly measured samples) covering 72 HLA-I alleles and retrieved a total of 2,066 unique phosphorylated peptides. We then expanded our motif deconvolution tool to identify precise binding motifs of phosphorylated HLA-I ligands. Our results reveal a clear enrichment of phosphorylated peptides among HLA-C ligands and demonstrate a prevalent role of both HLA-I motifs and kinase motifs on the presentation of phosphorylated peptides. These data further enabled us to develop and validate the first predictor of interactions between HLA-I molecules and phosphorylated peptides.




prediction

Prediction models for diagnosis and prognosis of covid-19 infection: systematic review and critical appraisal




prediction

Use of electronic medical records in development and validation of risk prediction models of hospital readmission: systematic review




prediction

Plasma Lipidome and Prediction of Type 2 Diabetes in the Population-Based Malmö Diet and Cancer Cohort

OBJECTIVE

Type 2 diabetes mellitus (T2DM) is associated with dyslipidemia, but the detailed alterations in lipid species preceding the disease are largely unknown. We aimed to identify plasma lipids associated with development of T2DM and investigate their associations with lifestyle.

RESEARCH DESIGN AND METHODS

At baseline, 178 lipids were measured by mass spectrometry in 3,668 participants without diabetes from the Malmö Diet and Cancer Study. The population was randomly split into discovery (n = 1,868, including 257 incident cases) and replication (n = 1,800, including 249 incident cases) sets. We used orthogonal projections to latent structures discriminant analyses, extracted a predictive component for T2DM incidence (lipid-PCDM), and assessed its association with T2DM incidence using Cox regression and with lifestyle factors using general linear models.

RESULTS

A T2DM-predictive lipid-PCDM derived from the discovery set was independently associated with T2DM incidence in the replication set, with a hazard ratio (HR) among subjects in the fifth versus the first quintile of the lipid-PCDM of 3.7 (95% CI 2.2–6.5). In comparison, the HR of T2DM among obese versus normal weight subjects was 1.8 (95% CI 1.2–2.6). Clinical lipids did not improve T2DM risk prediction, but adding the lipid-PCDM to all conventional T2DM risk factors increased the area under the receiver operating characteristic curve by 3%. The lipid-PCDM was also associated with a dietary risk score for T2DM incidence and with a lower level of physical activity.

CONCLUSIONS

A lifestyle-related lipidomic profile strongly predicts T2DM development beyond current risk factors. Further studies are warranted to test if lifestyle interventions modifying this lipidomic profile can prevent T2DM.
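A minimal sketch of the type of analysis reported above, assuming synthetic data and hypothetical column names: a Cox proportional hazards model (via the lifelines library) comparing incident T2DM hazards in the top versus bottom quintile of a lipid-derived score. This is not the study's actual modelling pipeline.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(0)
n = 1800
df = pd.DataFrame({
    "lipid_score": rng.normal(size=n),          # stand-in for the lipid-PCDM
    "age": rng.uniform(45, 70, n),
    "bmi": rng.normal(27, 4, n),
})
# toy follow-up: a higher score shortens time to diabetes; administrative censoring at 20 years
time = rng.exponential(30 / np.exp(0.5 * df["lipid_score"].to_numpy()))
df["incident_t2dm"] = (time < 20).astype(int)
df["followup_years"] = np.minimum(time, 20)

quintile = pd.qcut(df["lipid_score"], 5, labels=False)      # 0 = bottom quintile, 4 = top
sub = df[quintile.isin([0, 4])].copy()
sub["top_quintile"] = (quintile[quintile.isin([0, 4])] == 4).astype(int)

cph = CoxPHFitter()
cph.fit(sub[["followup_years", "incident_t2dm", "top_quintile", "age", "bmi"]],
        duration_col="followup_years", event_col="incident_t2dm")
print(cph.hazard_ratios_["top_quintile"])                   # Q5 vs. Q1 hazard ratio
```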




prediction

[Accounts of medical and magical character, fortune tellings and predictions]

19th century.




prediction

Gaussian field on the symmetric group: Prediction and learning

François Bachoc, Baptiste Broto, Fabrice Gamboa, Jean-Michel Loubes.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 503--546.

Abstract:
In the framework of the supervised learning of a real function defined on an abstract space $\mathcal{X}$, Gaussian processes are widely used. The Euclidean case for $\mathcal{X}$ is well known and has been widely studied. In this paper, we explore the less classical case where $\mathcal{X}$ is the non-commutative finite group of permutations (namely the so-called symmetric group $S_{N}$). We provide an application to Gaussian process based optimization of Latin Hypercube Designs. We also extend our results to the case of partial rankings.
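To make the setting concrete, here is a hedged sketch of Gaussian process regression over permutations using the Kendall tau kernel, one known positive-definite kernel on the symmetric group; the paper's own covariance model and theoretical results are not reproduced here.

```python
import itertools
import numpy as np
from scipy.stats import kendalltau

def kendall_kernel(p, q):
    # Kendall's tau between two permutations; a known positive-definite kernel on S_N
    return kendalltau(p, q)[0]

# toy regression on S_4: noisy observations of f(sigma) = number of fixed points
perms = [np.array(p) for p in itertools.permutations(range(4))]
rng = np.random.default_rng(0)
train = [perms[i] for i in rng.choice(len(perms), size=12, replace=False)]
y = np.array([np.sum(p == np.arange(4)) for p in train], dtype=float)
y += rng.normal(0, 0.1, size=len(train))

K = np.array([[kendall_kernel(a, b) for b in train] for a in train])
alpha = np.linalg.solve(K + 0.1 * np.eye(len(train)), y)     # 0.1 = assumed noise variance

def posterior_mean(sigma):
    # GP posterior mean at a new permutation sigma
    k_star = np.array([kendall_kernel(sigma, b) for b in train])
    return k_star @ alpha

print(posterior_mean(np.array([0, 1, 2, 3])))   # prediction at the identity permutation
```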




prediction

Assessing prediction error at interpolation and extrapolation points

Assaf Rabinowicz, Saharon Rosset.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 272--301.

Abstract:
Common model selection criteria, such as $AIC$ and its variants, are based on in-sample prediction error estimators. However, in many applications involving prediction at interpolation and extrapolation points, in-sample error does not represent the relevant prediction error. In this paper, new prediction error estimators, $tAI$ and $Loss(w_{t})$, are introduced. These estimators generalize previous error estimators and are also applicable for assessing prediction error in cases involving interpolation and extrapolation. Based on these prediction error estimators, two model selection criteria in the same spirit as $AIC$ and Mallows' $C_{p}$ are suggested. The advantages of our suggested methods are demonstrated in a simulation and a real data analysis of studies involving interpolation and extrapolation in linear mixed model and Gaussian process regression.
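The paper's $tAI$ and $Loss(w_{t})$ estimators are not reproduced here, but the following toy sketch on synthetic data illustrates the underlying problem: in-sample prediction error can badly understate the error incurred at extrapolation points.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
f = lambda x: np.sin(3 * x)                                   # true (unknown) function

x_train = rng.uniform(0.0, 1.0, 40)[:, None]                  # training support: [0, 1]
y_train = f(x_train).ravel() + rng.normal(0, 0.1, 40)

gp = GaussianProcessRegressor(kernel=RBF(0.3) + WhiteKernel(0.01)).fit(x_train, y_train)

x_interp = rng.uniform(0.0, 1.0, 200)[:, None]                # interpolation points
x_extrap = rng.uniform(1.5, 2.5, 200)[:, None]                # extrapolation points

mse = lambda X: np.mean((gp.predict(X) - f(X).ravel()) ** 2)
print("in-sample MSE:     ", np.mean((gp.predict(x_train) - y_train) ** 2))
print("interpolation MSE: ", mse(x_interp))
print("extrapolation MSE: ", mse(x_extrap))                   # typically much larger
```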




prediction

Sparsely observed functional time series: estimation and prediction

Tomáš Rubín, Victor M. Panaretos.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 1137--1210.

Abstract:
Functional time series analysis, whether based on time or frequency domain methodology, has traditionally been carried out under the assumption of complete observation of the constituent series of curves, assumed stationary. Nevertheless, as is often the case with independent functional data, it may well happen that the data available to the analyst are not the actual sequence of curves, but relatively few and noisy measurements per curve, potentially at different locations in each curve’s domain. Under this sparse sampling regime, neither the established estimators of the time series’ dynamics nor their corresponding theoretical analysis will apply. The subject of this paper is to tackle the problem of estimating the dynamics and of recovering the latent process of smooth curves in the sparse regime. Assuming smoothness of the latent curves, we construct a consistent nonparametric estimator of the series’ spectral density operator and use it to develop a frequency-domain recovery approach, that predicts the latent curve at a given time by borrowing strength from the (estimated) dynamic correlations in the series across time. This new methodology is seen to comprehensively outperform a naive recovery approach that would ignore temporal dependence and use only methodology employed in the i.i.d. setting and hinging on the lag zero covariance. Further to predicting the latent curves from their noisy point samples, the method fills in gaps in the sequence (curves nowhere sampled), denoises the data, and serves as a basis for forecasting. Means of providing corresponding confidence bands are also investigated. A simulation study interestingly suggests that sparse observation for a longer time period may provide better performance than dense observation for a shorter period, in the presence of smoothness. The methodology is further illustrated by application to an environmental data set on fair-weather atmospheric electricity, which naturally leads to a sparse functional time series.




prediction

Prediction in several conventional contexts

Bertrand Clarke, Jennifer Clarke

Source: Statist. Surv., Volume 6, 1--73.

Abstract:
We review predictive techniques from several traditional branches of statistics. Starting with prediction based on the normal model and on the empirical distribution function, we proceed to techniques for various forms of regression and classification. Then, we turn to time series, longitudinal data, and survival analysis. Our focus throughout is on the mechanics of prediction more than on the properties of predictors.




prediction

Adaptive Invariance for Molecule Property Prediction. (arXiv:2005.03004v1 [q-bio.QM])

Effective property prediction methods can help accelerate the search for COVID-19 antivirals, either through accurate in-silico screens or by effectively guiding ongoing at-scale experimental efforts. However, existing prediction tools have limited ability to accommodate the scarce or fragmented training data currently available. In this paper, we introduce a novel approach to learn predictors that can generalize or extrapolate beyond the heterogeneous data. Our method builds on and extends recently proposed invariant risk minimization, adaptively forcing the predictor to avoid nuisance variation. We achieve this by continually exercising and manipulating latent representations of molecules to highlight undesirable variation to the predictor. To test the method we use a combination of three data sources: SARS-CoV-2 antiviral screening data, molecular fragments that bind to the SARS-CoV-2 main protease, and large screening data for SARS-CoV-1. Our predictor outperforms state-of-the-art transfer learning methods by a significant margin. We also report the top 20 predictions of our model on the Broad drug repurposing hub.
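The approach builds on invariant risk minimization (IRM). As a point of reference only, here is a minimal sketch of the standard IRMv1 penalty of Arjovsky et al., not the adaptive variant introduced in this paper; the environment structure (e.g. separate SARS-CoV-1 and SARS-CoV-2 screens) is assumed for illustration.

```python
import torch
import torch.nn.functional as F

def irm_penalty(logits, labels):
    # gradient of the per-environment risk w.r.t. a fixed scalar "dummy" classifier scale
    scale = torch.ones(1, requires_grad=True)
    loss = F.binary_cross_entropy_with_logits(logits * scale, labels)
    grad = torch.autograd.grad(loss, [scale], create_graph=True)[0]
    return (grad ** 2).sum()

def irm_objective(model, environments, penalty_weight=1.0):
    # environments: list of (x, y) batches, one per data source; y is a float 0/1 tensor
    risk, penalty = 0.0, 0.0
    for x, y in environments:
        logits = model(x).squeeze(-1)
        risk = risk + F.binary_cross_entropy_with_logits(logits, y)
        penalty = penalty + irm_penalty(logits, y)
    return risk + penalty_weight * penalty
```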




prediction

Optimal prediction in the linearly transformed spiked model

Edgar Dobriban, William Leeb, Amit Singer.

Source: The Annals of Statistics, Volume 48, Number 1, 491--513.

Abstract:
We consider the linearly transformed spiked model, where the observations $Y_{i}$ are noisy linear transforms of unobserved signals of interest $X_{i}$: $Y_{i}=A_{i}X_{i}+\varepsilon_{i}$, for $i=1,\ldots,n$. The transform matrices $A_{i}$ are also observed. We model the unobserved signals (or regression coefficients) $X_{i}$ as vectors lying on an unknown low-dimensional space. Given only $Y_{i}$ and $A_{i}$, how should we predict or recover their values? The naive approach of performing regression for each observation separately is inaccurate due to the large noise level. Instead, we develop optimal methods for predicting $X_{i}$ by “borrowing strength” across the different samples. Our linear empirical Bayes methods scale to large datasets and rely on weak moment assumptions. We show that this model has wide-ranging applications in signal processing, deconvolution, cryo-electron microscopy, and missing data with noise. For missing data, we show in simulations that our methods are more robust to noise and to unequal sampling than well-known matrix completion methods.
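For intuition, here is a generic numpy sketch of a best linear predictor in the model $Y_{i}=A_{i}X_{i}+\varepsilon_{i}$, assuming the low-rank signal covariance and noise level are known; the paper's empirical Bayes methods estimate these quantities from the data, which is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
p, q, sigma2 = 8, 5, 0.5                       # signal dim, observation dim, noise variance

# a low-rank ("spiked") signal covariance
U = np.linalg.qr(rng.normal(size=(p, 2)))[0]
spikes = np.array([4.0, 2.0])
Sigma_X = U @ np.diag(spikes) @ U.T

def blp(Y_i, A_i):
    # best linear predictor of X_i given Y_i: Sigma_X A' (A Sigma_X A' + sigma2 I)^{-1} Y_i
    S = A_i @ Sigma_X @ A_i.T + sigma2 * np.eye(q)
    return Sigma_X @ A_i.T @ np.linalg.solve(S, Y_i)

# one synthetic observation
A = rng.normal(size=(q, p))
X = U @ (np.sqrt(spikes) * rng.normal(size=2))               # draw from N(0, Sigma_X)
Y = A @ X + rng.normal(0, np.sqrt(sigma2), q)

print(np.linalg.norm(blp(Y, A) - X))                         # shrinkage predictor error
print(np.linalg.norm(np.linalg.pinv(A) @ Y - X))             # naive per-sample regression error
```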




prediction

Hierarchical infinite factor models for improving the prediction of surgical complications for geriatric patients

Elizabeth Lorenzi, Ricardo Henao, Katherine Heller.

Source: The Annals of Applied Statistics, Volume 13, Number 4, 2637--2661.

Abstract:
Nearly a third of all surgeries performed in the United States occur for patients over the age of 65; these older adults experience a higher rate of postoperative morbidity and mortality. To improve the care for these patients, we aim to identify and characterize high risk geriatric patients to send to a specialized perioperative clinic while leveraging the overall surgical population to improve learning. To this end, we develop a hierarchical infinite latent factor model (HIFM) to appropriately account for the covariance structure across subpopulations in data. We propose a novel Hierarchical Dirichlet Process shrinkage prior on the loadings matrix that flexibly captures the underlying structure of our data while sharing information across subpopulations to improve inference and prediction. The stick-breaking construction of the prior assumes an infinite number of factors and allows for each subpopulation to utilize different subsets of the factor space and select the number of factors needed to best explain the variation. We develop the model into a latent factor regression method that excels at prediction and inference of regression coefficients. Simulations validate this strong performance compared to baseline methods. We apply this work to the problem of predicting surgical complications using electronic health record data for geriatric patients and all surgical patients at Duke University Health System (DUHS). The motivating application demonstrates the improved predictive performance when using HIFM in both area under the ROC curve and area under the PR Curve while providing interpretable coefficients that may lead to actionable interventions.
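The prior relies on a stick-breaking construction. Below is a short, generic sketch of truncated stick-breaking weights for a Dirichlet process with concentration alpha, which conveys the mechanism but omits the paper's hierarchical, subpopulation-sharing structure.

```python
import numpy as np

def stick_breaking_weights(alpha, truncation, rng):
    # v_k ~ Beta(1, alpha); w_k = v_k * prod_{j<k} (1 - v_j)
    v = rng.beta(1.0, alpha, size=truncation)
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - v[:-1])])
    return v * remaining

rng = np.random.default_rng(0)
w = stick_breaking_weights(alpha=2.0, truncation=50, rng=rng)
print(w[:5], w.sum())   # weights decay; they sum to just under 1 at finite truncation
```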




prediction

On Bayesian new edge prediction and anomaly detection in computer networks

Silvia Metelli, Nicholas Heard.

Source: The Annals of Applied Statistics, Volume 13, Number 4, 2586--2610.

Abstract:
Monitoring computer network traffic for anomalous behaviour presents an important security challenge. Arrivals of new edges in a network graph represent connections between a client and server pair not previously observed, and in rare cases these might suggest the presence of intruders or malicious implants. We propose a Bayesian model and anomaly detection method for simultaneously characterising existing network structure and modelling likely new edge formation. The method is demonstrated on real computer network authentication data and successfully identifies some machines which are known to be compromised.




prediction

Predicting paleoclimate from compositional data using multivariate Gaussian process inverse prediction

John R. Tipton, Mevin B. Hooten, Connor Nolan, Robert K. Booth, Jason McLachlan.

Source: The Annals of Applied Statistics, Volume 13, Number 4, 2363--2388.

Abstract:
Multivariate compositional count data arise in many applications including ecology, microbiology, genetics and paleoclimate. A frequent question in the analysis of multivariate compositional count data is what underlying covariate values give rise to the observed composition. Learning the relationship between covariates and the compositional count allows for inverse prediction of unobserved covariates given compositional count observations. Gaussian processes provide a flexible framework for modeling functional responses with respect to a covariate without assuming a functional form. Many scientific disciplines use Gaussian process approximations to improve prediction and make inference on latent processes and parameters. When prediction is desired on unobserved covariates given realizations of the response variable, this is called inverse prediction. Because inverse prediction is often mathematically and computationally challenging, predicting unobserved covariates often requires fitting models that are different from the hypothesized generative model. We present a novel computational framework that allows for efficient inverse prediction using a Gaussian process approximation to generative models. Our framework enables scientific learning about how the latent processes co-vary with respect to covariates while simultaneously providing predictions of missing covariates. The proposed framework is capable of efficiently exploring the high dimensional, multi-modal latent spaces that arise in the inverse problem. To demonstrate flexibility, we apply our method in a generalized linear model framework to predict latent climate states given multivariate count data. Based on cross-validation, our model has predictive skill competitive with current methods while simultaneously providing formal, statistical inference on the underlying community dynamics of the biological system previously not available.
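A much-simplified sketch of inverse prediction with a Gaussian process, on synthetic univariate data: fit a GP from a covariate to a response, then score candidate covariate values by the GP predictive likelihood of a new observation. The paper's framework is multivariate, compositional and fully Bayesian; this toy only illustrates the inversion idea.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
x_train = rng.uniform(0, 10, 60)[:, None]                           # "climate" covariate
y_train = np.tanh(x_train.ravel() - 5) + rng.normal(0, 0.05, 60)    # summary of the "composition"

gp = GaussianProcessRegressor(kernel=RBF(2.0) + WhiteKernel(0.01)).fit(x_train, y_train)

def inverse_predict(y_obs, grid=np.linspace(0, 10, 500)):
    # score each candidate covariate value by the GP predictive likelihood of y_obs
    mu, sd = gp.predict(grid[:, None], return_std=True)
    loglik = norm.logpdf(y_obs, loc=mu, scale=sd)
    return grid[np.argmax(loglik)]

print(inverse_predict(0.5))   # recovers a covariate value near np.arctanh(0.5) + 5, about 5.55
```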




prediction

Prediction of small area quantiles for the conservation effects assessment project using a mixed effects quantile regression model

Emily Berg, Danhyang Lee.

Source: The Annals of Applied Statistics, Volume 13, Number 4, 2158--2188.

Abstract:
Quantiles of the distributions of several measures of erosion are important parameters in the Conservation Effects Assessment Project (CEAP), a survey intended to quantify soil and nutrient loss on crop fields. Because sample sizes for domains of interest are too small to support reliable direct estimators, model-based methods are needed. Quantile regression is appealing for CEAP because finding a single family of parametric models that adequately describes the distributions of all variables is difficult, and small area quantiles are parameters of interest. We construct empirical Bayes predictors and bootstrap mean squared error estimators based on the linearly interpolated generalized Pareto distribution (LIGPD). We apply the procedures to predict county-level quantiles for four types of erosion in Wisconsin and validate the procedures through simulation.




prediction

Sequential decision model for inference and prediction on nonuniform hypergraphs with application to knot matching from computational forestry

Seong-Hwan Jun, Samuel W. K. Wong, James V. Zidek, Alexandre Bouchard-Côté.

Source: The Annals of Applied Statistics, Volume 13, Number 3, 1678--1707.

Abstract:
In this paper, we consider the knot-matching problem arising in computational forestry. The knot-matching problem is an important problem that must be solved to advance the state of the art in automatic strength prediction of lumber. We show that this problem can be formulated as a quadripartite matching problem and develop a sequential decision model that admits efficient parameter estimation, along with a sequential Monte Carlo sampler that can be utilized for rapid sampling of graph matchings. We demonstrate the effectiveness of our methods on 30 manually annotated boards and present findings from various simulation studies to provide further evidence supporting the efficacy of our methods.




prediction

Prediction and estimation consistency of sparse multi-class penalized optimal scoring

Irina Gaynanova.

Source: Bernoulli, Volume 26, Number 1, 286--322.

Abstract:
Sparse linear discriminant analysis via penalized optimal scoring is a successful tool for classification in high-dimensional settings. While the variable selection consistency of sparse optimal scoring has been established, the corresponding prediction and estimation consistency results have been lacking. We bridge this gap by providing probabilistic bounds on out-of-sample prediction error and estimation error of multi-class penalized optimal scoring allowing for diverging number of classes.




prediction

Neural Evidence for the Prediction of Animacy Features during Language Comprehension: Evidence from MEG and EEG Representational Similarity Analysis

It has been proposed that people can generate probabilistic predictions at multiple levels of representation during language comprehension. We used magnetoencephalography (MEG) and electroencephalography (EEG), in combination with representational similarity analysis, to seek neural evidence for the prediction of animacy features. In two studies, MEG and EEG activity was measured as human participants (both sexes) read three-sentence scenarios. Verbs in the final sentences constrained for either animate or inanimate semantic features of upcoming nouns, and the broader discourse context constrained for either a specific noun or for multiple nouns belonging to the same animacy category. We quantified the similarity between spatial patterns of brain activity following the verbs until just before the presentation of the nouns. The MEG and EEG datasets revealed converging evidence that the similarity between spatial patterns of neural activity following animate-constraining verbs was greater than following inanimate-constraining verbs. This effect could not be explained by lexical-semantic processing of the verbs themselves. We therefore suggest that it reflected the inherent difference in the semantic similarity structure of the predicted animate and inanimate nouns. Moreover, the effect was present regardless of whether a specific word could be predicted, providing strong evidence for the prediction of coarse-grained semantic features that goes beyond the prediction of individual words.

SIGNIFICANCE STATEMENT Language inputs unfold very quickly during real-time communication. By predicting ahead, we can give our brains a "head start," so that language comprehension is faster and more efficient. Although most contexts do not constrain strongly for a specific word, they do allow us to predict some upcoming information. For example, following the context of "they cautioned the...," we can predict that the next word will be animate rather than inanimate (we can caution a person, but not an object). Here, we used EEG and MEG techniques to show that the brain is able to use these contextual constraints to predict the animacy of upcoming words during sentence comprehension, and that these predictions are associated with specific spatial patterns of neural activity.
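A toy numpy sketch of the spatial-similarity logic behind the representational similarity analysis described above, with hypothetical trial-by-sensor data: a shared predicted feature should raise the average between-trial pattern correlation.

```python
import numpy as np

def mean_pattern_similarity(patterns):
    # patterns: trials x sensors; average pairwise Pearson correlation between trial patterns
    corr = np.corrcoef(patterns)
    upper = corr[np.triu_indices_from(corr, k=1)]
    return upper.mean()

rng = np.random.default_rng(0)
shared = rng.normal(size=64)                             # shared "animacy" component
animate = shared + rng.normal(0, 1.0, size=(40, 64))     # 40 trials, 64 sensors (assumed shapes)
inanimate = rng.normal(0, 1.4, size=(40, 64))            # no shared structure

print(mean_pattern_similarity(animate), mean_pattern_similarity(inanimate))
```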




prediction

Modulations of Insular Projections by Prior Belief Mediate the Precision of Prediction Error during Tactile Learning

Awareness for surprising sensory events is shaped by prior belief inferred from past experience. Here, we combined hierarchical Bayesian modeling with fMRI on an associative learning task in 28 male human participants to characterize the effect of the prior belief of tactile events on connections mediating the outcome of perceptual decisions. Activity in the anterior insular cortex (AIC), premotor cortex (PMd), and inferior parietal lobule (IPL) was modulated by prior belief on unexpected targets compared with expected targets. On expected targets, prior belief decreased the connection strength from AIC to IPL, whereas it increased the connection strength from AIC to PMd when targets were unexpected. Individual differences in the modulatory strength of prior belief on insular projections correlated with the precision that increases the influence of prediction errors on belief updating. These results suggest complementary effects of prior belief on insular-frontoparietal projections mediating the precision of prediction during probabilistic tactile learning.

SIGNIFICANCE STATEMENT In a probabilistic environment, the prior belief of sensory events can be inferred from past experiences. How this prior belief modulates effective brain connectivity for updating expectations for future decision-making remains unexplored. Combining hierarchical Bayesian modeling with fMRI, we show that during tactile associative learning, prior expectations modulate connections originating in the anterior insula cortex and targeting salience-related and attention-related frontoparietal areas (i.e., parietal and premotor cortex). These connections seem to be involved in updating evidence based on the precision of ascending inputs to guide future decision-making.




prediction

Validation of a Clinical Prediction Rule to Distinguish Lyme Meningitis From Aseptic Meningitis

Available clinical prediction rules to identify children with cerebrospinal fluid pleocytosis at low risk for Lyme meningitis incorporate headache duration, cranial nerve palsy, and the percentage of cerebrospinal fluid mononuclear cells. These rules require independent validation.

These clinical prediction rules accurately identify patients at low risk for Lyme meningitis in our large multicenter cohort. Children at low risk may be considered for outpatient management while awaiting Lyme serology.




prediction

Prediction of Inflicted Brain Injury in Infants and Children Using Retinal Imaging

Retinal hemorrhages occur in accidental and inflicted traumatic brain injury (ITBI) and in some medical encephalopathies. Large numbers of retinal hemorrhages and a peripheral location are frequently cited as distinguishing features of ITBI in infants, but their predictive value has not been established.

This prospective retinal imaging study found that a diagnosis of ITBI in infants and children can be distinguished from other traumatic and nontraumatic causes by the presence of >25 dot-blot (intraretinal layer) hemorrhages (positive predictive value = 93%). (Read the full article)
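The positive predictive value quoted here follows the usual definition, true positives over all positive calls. A one-line check with invented counts (not the study's data):

# Positive predictive value = true positives / (true positives + false positives).
# Counts below are invented for illustration, not taken from the study.
true_positives, false_positives = 28, 2
ppv = true_positives / (true_positives + false_positives)
print(f"PPV = {ppv:.0%}")  # 93%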




prediction

Genotype Prediction of Adult Type 2 Diabetes From Adolescence in a Multiracial Population

Among middle-aged adults, genotype scores predict incident type 2 diabetes but do not improve prediction models based on clinical risk factors including family history and BMI. These clinical factors are more dynamic in adolescence, however.

A genotype score also predicts type 2 diabetes from adolescence over a mean 27 years of follow-up into adulthood but does not improve prediction models based on clinical risk factors assessed in adolescence. (Read the full article)
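A genotype score of this kind is typically a weighted count of risk alleles across a panel of variants. A minimal sketch follows; the SNP identifiers, allele counts, and weights are all hypothetical.

# Minimal weighted genotype risk score: sum of risk-allele counts times per-SNP weights.
# SNP names, allele counts, and weights are hypothetical.
risk_allele_counts = {"rs0001": 2, "rs0002": 0, "rs0003": 1}
weights            = {"rs0001": 0.12, "rs0002": 0.08, "rs0003": 0.20}
score = sum(weights[snp] * count for snp, count in risk_allele_counts.items())
print(round(score, 2))  # 0.44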




prediction

Prediction of Neonatal Outcomes in Extremely Preterm Neonates

Extremely preterm infants are at high risk of neonatal mortality or morbidities. Existing prediction models focus on mortality, specific morbidities, or composite mortality and morbidity outcomes and ignore differences in outcome severity.

A simple and practical statistical model was developed that can be applied on the first day after NICU admission to predict outcome severity spanning from no morbidity to mortality. The model is highly discriminative (C-statistic = 90%) and internally valid. (Read the full article)
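The C-statistic reported here is equivalent to the area under the ROC curve; for a binary outcome it can be computed directly, for example with scikit-learn. The labels and predicted risks below are toy values, not the study's data.

# The C-statistic for a binary outcome equals the area under the ROC curve.
# Labels and predicted risks below are toy values, not the study's data.
from sklearn.metrics import roc_auc_score

outcomes        = [0, 0, 1, 0, 1, 1, 0, 1]                 # 1 = adverse outcome
predicted_risks = [0.1, 0.6, 0.8, 0.3, 0.2, 0.9, 0.4, 0.7]
print(roc_auc_score(outcomes, predicted_risks))            # 0.8125 for this toy example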




prediction

A Clinical Prediction Rule for the Severity of Congenital Diaphragmatic Hernias in Newborns

Predicting high-risk populations in congenital diaphragmatic hernia (CDH) can help target care strategies. Prediction rules for infants with CDH often lack validation, are aimed at a prenatal population, and are of limited generalizability. We cannot currently discriminate the highest risk neonates during the crucial period shortly after birth.

This clinical prediction rule was developed and validated on an international database. It discriminates patients at high, intermediate, and low risk of mortality; is easy to apply; and is generalizable to most infants with CDH. (Read the full article)




prediction

Validation of a Clinical Prediction Rule for Pediatric Abusive Head Trauma

Pediatric Brain Injury Research Network investigators recently derived a highly sensitive clinical prediction rule for pediatric abusive head trauma (AHT).

The performance of this AHT screening tool has been validated. Four clinical variables, readily available at the time of admission, detect pediatric AHT with high sensitivity in intensive care settings. (Read the full article)




prediction

Validation of a Prediction Tool for Abusive Head Trauma

A previous multivariable statistical model, using individual patient data, estimated the probability of abusive head trauma based on the presence or absence of 6 clinical features: rib fracture, long-bone fracture, apnea, seizures, retinal hemorrhage, and head or neck bruising.

The model performed well in this validation, with a sensitivity of 72.3%, specificity of 85.7%, and area under the curve of 0.88. In children <3 years old with intracranial injury plus ≥3 features, the estimated probability of abuse is >81.5%. (Read the full article)
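The reported sensitivity and specificity translate into a posttest probability once a pretest probability is assumed. The Bayes calculation below uses the sensitivity and specificity quoted above, but the pretest probability is an assumption for illustration, not a figure from the study.

# Posttest probability of abuse given a positive rule result, via Bayes' theorem.
# Sensitivity and specificity are from the abstract; the pretest probability is assumed.
sensitivity, specificity = 0.723, 0.857
pretest = 0.30  # assumed pretest probability of abusive head trauma (illustrative)

positive_lr = sensitivity / (1 - specificity)     # likelihood ratio of a positive result
pretest_odds = pretest / (1 - pretest)
posttest_odds = pretest_odds * positive_lr
posttest_prob = posttest_odds / (1 + posttest_odds)
print(f"{posttest_prob:.1%}")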




prediction

Prediction of antibiotic susceptibility for urinary tract infection in a hospital setting [Epidemiology and Surveillance]

Objectives: Empiric antibiotic prescribing can be supported by guidelines and/or local antibiograms, but these have limitations. We sought to use statistical learning on data from a comprehensive electronic health record to develop predictive models for individual antibiotics that incorporate patient- and hospital-specific factors. This paper reports on the development and validation of these models in a large retrospective cohort.

Methods: This is a retrospective cohort study including hospitalized patients with positive urine cultures in the first 48 hours of hospitalization at a 1500-bed tertiary care hospital over a 4.5-year period. All first urine cultures with susceptibilities were included. Statistical learning techniques, including penalized logistic regression, were used to create predictive models for cefazolin, ceftriaxone, ciprofloxacin, cefepime, and piperacillin-tazobactam. These were validated on a held-out cohort.

Results: The final dataset used for analysis included 6,366 patients. Final model covariates included demographics, comorbidity score, recent antibiotic use, recent antimicrobial resistance, and antibiotic allergies. Models had acceptable to good discrimination in the training dataset and acceptable performance in the validation dataset, with a point estimate for area under the receiver operating characteristic curve (AUC) that ranged from 0.65 for ceftriaxone to 0.69 for cefazolin. All models had excellent calibration.

Conclusion: In this study we used electronic health record data to create predictive models to estimate antibiotic susceptibilities for UTIs in hospitalized patients. Our models had acceptable performance in a held-out validation cohort.
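A minimal sketch of the modelling approach described above, penalized logistic regression with a held-out validation split scored by AUC, using scikit-learn. The data are synthetic and the feature interpretation is assumed; this is not the authors' pipeline.

# Sketch of penalized (L2) logistic regression with a held-out validation set and AUC,
# in the spirit of the approach described above. Data are synthetic, not the cohort.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))  # stand-ins for demographics, comorbidity score, prior resistance, etc.
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=1000) > 0).astype(int)  # 1 = susceptible

X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.25, random_state=0)
model = LogisticRegression(penalty="l2", C=1.0, max_iter=1000).fit(X_train, y_train)
auc = roc_auc_score(y_valid, model.predict_proba(X_valid)[:, 1])
print(f"Validation AUC: {auc:.2f}")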




prediction

Validation of a Prediction Rule for Mortality in Congenital Diaphragmatic Hernia

BACKGROUND:

Congenital diaphragmatic hernia (CDH) is a rare congenital anomaly with a mortality of ~27%. The Congenital Diaphragmatic Hernia Study Group (CDHSG) developed a simple postnatal clinical prediction rule to predict mortality in newborns with CDH. Our aim in this study was to externally validate the CDHSG rule in the European population and to improve its prediction of mortality by adding prenatal variables.

METHODS:

We performed a European multicenter retrospective cohort study and included all newborns diagnosed with unilateral CDH who were born between 2008 and 2015. Newborns born from November 2011 onward were included for the external validation of the rule (n = 343). To improve the prediction rule, we included all patients born between 2008 and 2015 (n = 620) with prenatally diagnosed CDH and collected pre- and postnatal variables. We built a logistic regression model, performed bootstrap resampling, and computed calibration plots.

RESULTS:

With our validation data set, the CDHSG rule had an area under the curve of 79.0%, revealing fair predictive performance. For the new prediction rule, prenatal herniation of the liver was added, and the absent 5-minute Apgar score was removed. The new prediction rule revealed good calibration, and with an area under the curve of 84.6%, it had good discriminative abilities.

CONCLUSIONS:

In this study, we externally validated the CDHSG rule for the European population, which revealed fair predictive performance. The modified rule, with prenatal liver herniation as an additional variable, appears to further improve the model’s ability to predict mortality in a population of patients with prenatally diagnosed CDH.
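The bootstrap-and-calibration workflow mentioned in the methods can be sketched roughly as follows. The data are synthetic, the predictors are stand-ins, and scikit-learn's calibration_curve stands in for the calibration plots; this is not the CDHSG model.

# Rough sketch of bootstrap resampling of a logistic model's AUC plus a calibration curve,
# in the spirit of the methods above. Data are synthetic, not the CDH cohort.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.calibration import calibration_curve

rng = np.random.default_rng(1)
X = rng.normal(size=(620, 5))                                          # stand-ins for pre- and postnatal predictors
y = (X[:, 0] - 0.8 * X[:, 1] + rng.normal(size=620) > 0).astype(int)   # 1 = mortality

model = LogisticRegression(max_iter=1000).fit(X, y)
probs = model.predict_proba(X)[:, 1]

# Bootstrap the apparent AUC.
boot_aucs = []
for _ in range(200):
    idx = rng.integers(0, len(y), len(y))
    if len(set(y[idx])) == 2:                                          # need both classes in the resample
        boot_aucs.append(roc_auc_score(y[idx], probs[idx]))
print(f"Bootstrap AUC: {np.mean(boot_aucs):.2f} +/- {np.std(boot_aucs):.2f}")

# Calibration curve: observed event rate per bin of predicted risk.
observed, predicted = calibration_curve(y, probs, n_bins=5)
print(np.round(observed, 2), np.round(predicted, 2))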




prediction

Abnormal Pulmonary Outcomes in Premature Infants: Prediction From Oxygen Requirement in the Neonatal Period

Andrew T. Shennan. Oct 1, 1988; 82:527-532.




prediction

Gaming in 2020: 4 Reasonable Predictions and 2 Ridiculous Ones

2019 is nearly over, so let's look ahead to what awaits the video game industry in the first year of the new decade. Informed opinions and hot takes abound.




prediction

The 2025 User Experience: Predictions for the Future of Personalized Technology

History is witness to the many scientific leaders and technology visionaries who tried to predict which innovations would exist in the future. While not all of those predictions came to fruition, others were not so far off. We may not have flying cars like The...




prediction

Dominance Effects and Functional Enrichments Improve Prediction of Agronomic Traits in Hybrid Maize [Genomic Prediction]

Single-cross hybrids have been critical to the improvement of maize (Zea mays L.), but the characterization of their genetic architectures remains challenging. Previous studies of hybrid maize have shown the contribution of within-locus complementation effects (dominance) and their differential importance across functional classes of loci. However, they have generally considered panels of limited genetic diversity, and have shown little benefit from genomic prediction based on dominance or functional enrichments. This study investigates the relevance of dominance and functional classes of variants in genomic models for agronomic traits in diverse populations of hybrid maize. We based our analyses on a diverse panel of inbred lines crossed with two testers representative of the major heterotic groups in the U.S. (1106 hybrids), as well as a collection of 24 biparental populations crossed with a single tester (1640 hybrids). We investigated three agronomic traits: days to silking (DTS), plant height (PH), and grain yield (GY). Our results point to the presence of dominance for all traits, but also among-locus complementation (epistasis) for DTS and genotype-by-environment interactions for GY. Consistently, dominance improved genomic prediction for PH only. In addition, we assessed enrichment of genetic effects in classes defined by genic regions (gene annotation), structural features (recombination rate and chromatin openness), and evolutionary features (minor allele frequency and evolutionary constraint). We found support for enrichment in genic regions and subsequent improvement of genomic prediction for all traits. Our results suggest that dominance and gene annotations improve genomic prediction across diverse populations in hybrid maize.
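A toy illustration of how additive and dominance effects enter a genomic prediction model: markers coded 0/1/2 give the additive design, a 0/1/0 recoding of heterozygotes gives the dominance design, and both are fed to a ridge regression as a generic GBLUP-style stand-in. This is a sketch under simulated data, not the authors' model or enrichment analysis.

# Generic sketch of genomic prediction with additive + dominance marker codings
# (ridge regression as a stand-in for GBLUP). Genotypes and phenotypes are simulated.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(2)
genotypes = rng.integers(0, 3, size=(500, 200))        # 0/1/2 copies of the minor allele
additive  = genotypes.astype(float)                    # additive coding
dominance = (genotypes == 1).astype(float)             # 1 for heterozygotes, else 0

# Simulated trait with both additive and dominance contributions plus noise.
y = additive @ rng.normal(0, 0.1, 200) + dominance @ rng.normal(0, 0.05, 200) + rng.normal(0, 1, 500)

X = np.hstack([additive, dominance])
model = Ridge(alpha=50.0).fit(X[:400], y[:400])        # train on the first 400 simulated hybrids
print(f"Predictive r: {np.corrcoef(model.predict(X[400:]), y[400:])[0, 1]:.2f}")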




prediction

Direct kinetic measurements and theoretical predictions of an isoprene-derived Criegee intermediate [Chemistry]

Isoprene has the highest emission into Earth’s atmosphere of any nonmethane hydrocarbon. Atmospheric processing of alkenes, including isoprene, via ozonolysis leads to the formation of zwitterionic reactive intermediates, known as Criegee intermediates (CIs). Direct studies have revealed that reactions involving simple CIs can significantly impact the tropospheric oxidizing capacity, enhance...




prediction

Assessing the accuracy of direct-coupling analysis for RNA contact prediction [ARTICLE]

Many noncoding RNAs are known to play a role in the cell that is directly linked to their structure. Structure prediction based on the sequence alone is, however, a challenging task. On the other hand, thanks to the low cost of sequencing technologies, a very large number of homologous sequences are becoming available for many RNA families. In the protein community, the idea of exploiting the covariance of mutations within a family to predict the protein structure using the direct-coupling analysis (DCA) method has emerged in the last decade. The application of DCA to RNA systems has been limited so far. Here we assess the DCA method on 17 riboswitch families, comparing it with the commonly used mutual information analysis and with the state-of-the-art R-scape covariance method. We also compare different flavors of DCA, including mean-field, pseudolikelihood, and a proposed stochastic procedure (Boltzmann learning) for exactly solving the DCA inverse problem. Boltzmann learning outperforms the other methods in predicting contacts observed in high-resolution crystal structures.
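The mutual-information baseline that DCA is compared against can be written down compactly: for two alignment columns it is the mutual information of their joint letter frequencies. The sketch below uses a toy two-column alignment, not the paper's data or pipeline.

# Mutual information between two alignment columns, the simple covariance baseline
# that DCA-style methods are compared against. The toy alignment is invented.
from collections import Counter
from math import log

def column_mutual_information(col_i, col_j):
    n = len(col_i)
    p_i, p_j = Counter(col_i), Counter(col_j)
    p_ij = Counter(zip(col_i, col_j))
    return sum((c / n) * log((c / n) / ((p_i[a] / n) * (p_j[b] / n)))
               for (a, b), c in p_ij.items())

col_5  = "AAGGAAGG"   # toy RNA alignment columns
col_42 = "UUCCUUCC"   # perfectly covarying with col_5 -> high MI
print(round(column_mutual_information(col_5, col_42), 3))  # ~0.693 (log 2)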




prediction

Seismic pore pressure prediction at the Halten Terrace in the Norwegian Sea

Pre-drill pore pressure prediction is essential for safe and efficient drilling and is a key element in the risk-reducing toolbox when designing a well. On the Norwegian Continental Shelf, pore pressure prediction commonly relies on traditional 1D offset well analysis, whereas velocity data from seismic surveys are often not considered. Our work with seismic interval velocities shows that the velocity field can provide an important basis for pressure prediction and enable the construction of regional 3D pressure cubes. This may increase confidence in the pore pressure models and aid the pre-drill geohazard screening process. We demonstrate how a 3D velocity field can be converted to a 3D pore pressure cube using reported pressures in offset wells as calibration points. The method is applied to a regional dataset at the Halten Terrace in the Norwegian Sea, an area with a complex pattern of pore pressure anomalies that has traditionally been difficult to predict. The algorithm searches for a velocity-to-pore-pressure transform that best matches the reported pressures. The 3D velocity field, a proxy for rock velocity derived from seismic surveys, is verified against checkshot velocities and sonic data in the offset wells.
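One commonly used form of velocity-to-pore-pressure transform is Eaton's relation, in which pore pressure is backed out from the ratio of observed to normal-compaction velocity. The sketch below is generic and not necessarily the transform calibrated in this study; all numbers are illustrative.

# Generic Eaton-type velocity-to-pore-pressure transform (illustrative only; not
# necessarily the transform used in the study). All inputs are made-up values.
def eaton_pore_pressure(overburden, hydrostatic, v_observed, v_normal, exponent=3.0):
    """Pore pressure from the ratio of observed to normal-compaction velocity."""
    return overburden - (overburden - hydrostatic) * (v_observed / v_normal) ** exponent

# Example: slower-than-normal interval velocity implies overpressure.
print(eaton_pore_pressure(overburden=60.0, hydrostatic=30.0,    # MPa at the depth of interest
                          v_observed=2800.0, v_normal=3200.0))  # m/s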




prediction

Prediction of tunnelling impact on flow rates of adjacent extraction water wells

The decline or drying up of groundwater sources near a tunnel route is damaging to groundwater users. Therefore, forecasting the impact of a tunnel on nearby groundwater sources is a challenging task in tunnel design. In this study, numerical and analytical approaches were applied to the Qomroud water conveyance tunnel (located in Lorestan province, Iran) to assess the impact of tunnelling on the nearby extraction water wells. By simulating groundwater-level fluctuations owing to tunnelling, the drawdown at the well locations was determined. From the drawdowns, and using Dupuit's equation, the depletion of well flow rates after tunnelling was estimated. To evaluate the results, observed well flow rates before and after tunnelling were compared with the predicted flow rates. The observed and estimated water well flows (before and after tunnelling) showed a regression factor of 0.64, pointing to satisfactory results.
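The flow-rate depletion step rests on Dupuit's equation for steady flow to a well in an unconfined aquifer, which relates discharge to the saturated thickness at the well and at the radius of influence. A minimal sketch with invented parameter values (not the Qomroud data):

# Dupuit's equation for steady flow to a well in an unconfined aquifer:
#   Q = pi * K * (H^2 - h^2) / ln(R / r)
# Parameter values below are invented for illustration.
from math import pi, log

def dupuit_discharge(K, H, h, R, r):
    """K: hydraulic conductivity (m/day); H, h: saturated thickness at radius R and at the well radius r (m)."""
    return pi * K * (H**2 - h**2) / log(R / r)

q_before = dupuit_discharge(K=5.0, H=30.0, h=25.0, R=300.0, r=0.2)
q_after  = dupuit_discharge(K=5.0, H=27.0, h=22.0, R=300.0, r=0.2)  # assumed drawdown of 3 m from tunnelling
print(round(q_before, 1), round(q_after, 1))  # m^3/day before and after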




prediction

Response Prediction of 177Lu-PSMA-617 Radioligand Therapy Using Prostate-Specific Antigen, Chromogranin A, and Lactate Dehydrogenase

Neuroendocrine-like transdifferentiation of prostate cancer adenocarcinomas correlates with serum levels of chromogranin A (CgA) and drives treatment resistance. The aim of this work was to evaluate whether CgA can serve as a response predictor for 177Lu-prostate-specific membrane antigen 617 (PSMA) radioligand therapy (RLT) in comparison with the established tumor markers. Methods: One hundred consecutive patients with metastasized castration-resistant prostate cancer scheduled for PSMA RLT were evaluated for prostate-specific antigen (PSA), lactate dehydrogenase (LDH), and CgA at baseline and in follow-up of PSMA RLT. Tumor uptake of PSMA ligand, a known predictive marker for response, was assessed as a control variable. Results: Of the 100 evaluated patients, 35 had partial remission, 16 stable disease, 15 mixed response, and 36 progression of disease. Tumor uptake above salivary gland uptake translated into partial remission, with an odds ratio (OR) of 60.265 (95% confidence interval [CI], 5.038–720.922). Elevated LDH implied a reduced chance of partial remission, with an OR of 0.094 (95% CI, 0.017–0.518), but increased the frequency of progressive disease (OR, 2.717; 95% CI, 1.391–5.304). All patients who achieved partial remission had a normal baseline LDH. A factor-of-2 elevation of CgA increased the risk of progression, with an OR of 3.089 (95% CI, 1.302–7.332). Baseline PSA had no prognostic value for response prediction. Conclusion: In our cohort, baseline PSA had no prognostic value for response prediction. LDH was the marker with the strongest prognostic value, and elevated LDH increased the risk of progression of disease under PSMA RLT. Elevated CgA demonstrated a moderate impact as a negative prognostic marker in general but was explicitly related to the presence of liver metastases. Well in line with the literature, sufficient tumor uptake is a prerequisite for achieving tumor response.
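The odds ratios and confidence intervals quoted above come from standard 2x2-table (or logistic regression) estimates. A minimal sketch of the odds ratio with a Wald 95% interval, using invented counts rather than the study's data:

# Odds ratio with a Wald 95% confidence interval from a 2x2 table.
# Counts are invented for illustration, not the study's data.
from math import log, exp, sqrt

def odds_ratio_ci(a, b, c, d):
    """a/b: responders/non-responders with the marker elevated; c/d: responders/non-responders without."""
    or_ = (a * d) / (b * c)
    se = sqrt(1/a + 1/b + 1/c + 1/d)
    return or_, exp(log(or_) - 1.96 * se), exp(log(or_) + 1.96 * se)

print(odds_ratio_ci(20, 15, 10, 25))  # (OR, lower, upper)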




prediction

Using Genetic Distance from Archived Samples for the Prediction of Antibiotic Resistance in Escherichia coli [Epidemiology and Surveillance]

The rising rates of antibiotic resistance increasingly compromise empirical treatment. Knowing the antibiotic susceptibility of a pathogen’s close genetic relative(s) may improve empirical antibiotic selection. Using genomic and phenotypic data for Escherichia coli isolates from three separate clinically derived databases, we evaluated multiple genomic methods and statistical models for predicting antibiotic susceptibility, focusing on potentially rapidly available information, such as lineage or genetic distance from archived isolates. We applied these methods to derive and validate the prediction of antibiotic susceptibility to common antibiotics. We evaluated 968 separate episodes of suspected and confirmed infection with Escherichia coli from three geographically and temporally separated databases in Ontario, Canada, from 2010 to 2018. Across all approaches, model performance (area under the curve [AUC]) ranges for predicting antibiotic susceptibility were the greatest for ciprofloxacin (AUC, 0.76 to 0.97) and the lowest for trimethoprim-sulfamethoxazole (AUC, 0.51 to 0.80). When a model predicted that an isolate was susceptible, the resulting (posttest) probabilities of susceptibility were sufficient to warrant empirical therapy for most antibiotics (mean, 92%). An approach combining multiple models could permit the use of narrower-spectrum oral agents in 2 out of every 3 patients while maintaining high treatment adequacy (~90%). Methods based on genetic relatedness to archived samples of E. coli could be used to predict antibiotic resistance and improve antibiotic selection.
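A bare-bones version of "predict from the nearest archived relative" is a nearest-neighbour lookup on genetic distance. The sketch below uses made-up distances, isolate names, and phenotypes, and is far simpler than the models evaluated in the study.

# Toy nearest-neighbour prediction of susceptibility from genetic distance to archived isolates.
# Distances, isolate names, and phenotypes are invented for illustration.
archived = {
    "iso_A": {"distance": 120, "ciprofloxacin_susceptible": True},
    "iso_B": {"distance":  35, "ciprofloxacin_susceptible": False},
    "iso_C": {"distance": 410, "ciprofloxacin_susceptible": True},
}
nearest = min(archived.values(), key=lambda rec: rec["distance"])   # closest genetic relative
print(nearest["ciprofloxacin_susceptible"])                         # predicted susceptibility: False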




prediction

Genetic and Circulating Biomarker Data Improve Risk Prediction for Pancreatic Cancer in the General Population

Background:

Pancreatic cancer is the third leading cause of cancer death in the United States, and 80% of patients present with advanced, incurable disease. Risk markers for pancreatic cancer have been characterized, but combined models are not used clinically to identify individuals at high risk for the disease.

Methods:

Within a nested case–control study of 500 pancreatic cancer cases diagnosed after blood collection and 1,091 matched controls enrolled in four U.S. prospective cohorts, we characterized absolute risk models that included clinical factors (e.g., body mass index, history of diabetes), germline genetic polymorphisms, and circulating biomarkers.

Results:

Model discrimination showed an area under the ROC curve of 0.62 via cross-validation. Our final integrated model identified 3.7% of men and 2.6% of women whose risk in the ensuing 10 years was at least 3 times greater than average. Individuals within the top risk percentile had a 4% risk of developing pancreatic cancer by age 80 years and a 2% 10-year risk at age 70 years.

Conclusions:

Risk models that include established clinical, genetic, and circulating factors improved disease discrimination over models using clinical factors alone.

Impact:

Absolute risk models for pancreatic cancer may help identify individuals in the general population appropriate for disease interception.




prediction

FIFA 20 TOTS La Liga Predictions for next Team of the Season So Far players



FIFA 20 Team Of The Season So Far continues this week, hopefully with the release of the La Liga FUT Squad. Here are some predictions on who might be included, plus when they'll be announced.