data analysis

Assessment of the Amaruq gold deposit signature in glacial sediments using multivariate geochemical data analysis and indicator minerals

de Bronac de Vazelhes, V; McMartin, I; Côté-Mantha, O; Boulianne-Verschelden, N. Journal of Geochemical Exploration vol. 228, 106800, 2021 p. 1-17, https://doi.org/10.1016/j.gexplo.2021.106800




data analysis

An empirical study on construction emergency disaster management and risk assessment in shield tunnel construction project with big data analysis

Emergency disaster management presents substantial risks and obstacles to shield tunnel construction projects, particularly in the event of water leakage accidents. Contemporary water leak detection is critical for guaranteeing safety by reducing the likelihood of disasters and the severity of any resulting damage. However, detecting leaks reliably can be difficult. Deep learning models can analyse images taken inside the tunnel to look for signs of water damage. This study introduces a strategy that combines two deep learning techniques, generative adversarial networks (GAN) and long short-term memory (LSTM), for water leakage detection in shield tunnel construction (WLD-STC), performing classification and prediction tasks on a massive image dataset. The results demonstrate that, for identifying and analysing water leakage episodes during shield tunnel construction, the WLD-STC strategy using LSTM-based GAN networks outperformed other methods, particularly on large datasets.




data analysis

Evaluation method of teaching reform quality in colleges and universities based on big data analysis

Research on the quality evaluation of teaching reforms plays an important role in promoting improvements in teaching quality. Therefore, an evaluation method of teaching reform quality in colleges and universities based on big data analysis is proposed. A multivariate logistic model is used to select the evaluation indicators for the quality evaluation of teaching reforms in universities. The evaluation indicator data are then clustered and cleaned through big data analysis. The evaluation indicator data are used as input vectors, and the results of the teaching reform quality evaluation are used as output vectors. A support vector machine model based on the whale algorithm is built to obtain the relevant evaluation results. Experimental results show that the proposed method achieves a minimum recall rate of 98.7% for evaluation indicator data, a minimum data processing time of 96.3 ms, and an accuracy rate consistently above 97.1%.




data analysis

Blogs – The New Source of Data Analysis




data analysis

The role of data analysis in workplace safety

What role does data analysis play in ensuring the safety of connected workers, and how can it be used to identify potential hazards and prevent incidents?




data analysis

Automated selection of nanoparticle models for small-angle X-ray scattering data analysis using machine learning

Small-angle X-ray scattering (SAXS) is widely used to analyze the shape and size of nanoparticles in solution. A multitude of models, describing the SAXS intensity resulting from nanoparticles of various shapes, have been developed by the scientific community and are used for data analysis. Choosing the optimal model is a crucial step in data analysis, which can be difficult and time-consuming, especially for non-expert users. An algorithm is proposed, based on machine learning, representation learning and SAXS-specific preprocessing methods, which instantly selects the nanoparticle model best suited to describe SAXS data. The different algorithms compared are trained and evaluated on a simulated database. This database includes 75 000 scattering spectra from nine nanoparticle models, and realistically simulates two distinct device configurations. It will be made freely available to serve as a basis of comparison for future work. Deploying a universal solution for automatic nanoparticle model selection is a challenge made more difficult by the diversity of SAXS instruments and their flexible settings. The poor transferability of classification rules learned on one device configuration to another is highlighted. It is shown that training on several device configurations enables the algorithm to be generalized, without degrading performance compared with configuration-specific training. Finally, the classification algorithm is evaluated on a real data set obtained by performing SAXS experiments on nanoparticles for each of the instrumental configurations, which have been characterized by transmission electron microscopy. This data set, although very limited, allows estimation of the transferability of the classification rules learned on simulated data to real data.




data analysis

MuscleX: data analysis software for fiber diffraction patterns from muscle

MuscleX is an integrated, open-source computer software suite for data reduction of X-ray fiber diffraction patterns from striated muscle and other fibrous systems. It is written in Python and runs on Linux, Microsoft Windows or macOS. Most modules can be run either from a graphical user interface or in a 'headless mode' from the command line, suitable for incorporation into beamline control systems. Here, we provide an overview of the general structure of the MuscleX software package and describe the specific features of the individual modules as well as examples of applications.




data analysis

Foreword to the special virtual issue on X-ray spectroscopy to understand functional materials: instrumentation, applications, data analysis




data analysis

Marquis Who's Who Honors Cindy R. Ford, PhD, for Success in Data Analysis, Strategic Problem-Solving, and Consulting

Cindy R. Ford, PhD, is noted for her expertise as a methodological and statistical consultant with AXYZ analytics




data analysis

Using Pandas and SQL Together for Data Analysis

In this tutorial, we’ll explore when and how SQL functionality can be integrated within the Pandas framework, as well as its limitations.
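As a rough illustration of the kind of integration the tutorial describes, the sketch below runs the same aggregation once in SQL and once in Pandas. The table name `orders`, its columns, and the in-memory SQLite database are assumptions made for this example; they are not taken from the tutorial.

```python
import sqlite3

import pandas as pd

# Hypothetical sample data (not from the tutorial).
orders = pd.DataFrame(
    {
        "customer": ["ann", "bob", "ann", "cy"],
        "amount": [10.0, 5.0, 7.5, 3.0],
    }
)

# Load the DataFrame into an in-memory SQLite table, then query it with SQL.
con = sqlite3.connect(":memory:")
orders.to_sql("orders", con, index=False)
totals_sql = pd.read_sql_query(
    "SELECT customer, SUM(amount) AS total "
    "FROM orders GROUP BY customer ORDER BY customer",
    con,
)
con.close()

# The same aggregation expressed directly in Pandas.
totals_pd = (
    orders.groupby("customer", as_index=False)["amount"]
    .sum()
    .rename(columns={"amount": "total"})
)

print(totals_sql)
print(totals_pd)
```

Both frames should hold one row per customer with the summed amount. SQL can be convenient when the query already exists or the data lives in a database, while the Pandas form avoids the round trip through a table.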





data analysis

From Curve Fitting to Machine Learning An Illustrative Guide to Scientific Data Analysis and Computational Intelligence

Location: Electronic Resource




data analysis

Mine Seismology: Data Analysis and Interpretation Palabora Mine Caving Process as Revealed by Induced Seismicity

Location: Electronic Resource




data analysis

IoT on tap: Defeating drought through data analysis

Bengaluru start-up digitises water infrastructure of industries, smart cities to halt wastage




data analysis

Practical data analysis with JMP [electronic resource] / Robert Carver.

Cary, NC : SAS Institute, 2019.




data analysis

The determinants of invention in electricity generation technologies: A patent data analysis - Environment Working Paper No. 45

This paper analyses the determinants of invention in efficiency-enhancing electricity generation technologies that have the potential to facilitate climate change mitigation efforts, including fossil fuel based technologies aimed at reducing carbon emissions, renewables and nuclear technologies.




data analysis

GIDVis: a comprehensive software tool for geometry-independent grazing-incidence X-ray diffraction data analysis and pole-figure calculations

GIDVis is a software package based on MATLAB specialized for, but not limited to, the visualization and analysis of grazing-incidence thin-film X-ray diffraction data obtained during sample rotation around the surface normal. GIDVis allows the user to perform detector calibration, data stitching, intensity corrections, standard data evaluation (e.g. cuts and integrations along specific reciprocal-space directions), crystal phase analysis etc. To take full advantage of the measured data in the case of sample rotation, pole figures can easily be calculated from the experimental data for any value of the scattering angle covered. As an example, GIDVis is applied to phase analysis and the evaluation of the epitaxial alignment of pentacenequinone crystallites on a single-crystalline Au(111) surface.




data analysis

How can ‘omics’ technologies – which enable large-scale, speedy biological data analysis – improve environmental risk assessment and management?

High-throughput ‘omics’ technologies, which allow exact and synchronised study of thousands of DNA, RNA, proteins and other molecules, are rapidly becoming more advanced and affordable. As these technologies develop, it is becoming quicker, easier and more affordable to generate unprecedented amounts of biological data, much of which could usefully inform environmental management. So far, however, the application of omics information in environmental management has failed to keep pace with the rapid development of omics-based research, meaning there is untapped potential. A recent study highlights the value of bringing omics information into environmental management and outlines practical ways in which omics can contribute to the risk assessment and management of chemicals.




data analysis

Optimization to identify nearest objects in a dataset for data analysis

In one embodiment, a plurality of objects associated with a dataset and a specified number of nearest objects to be identified are received. The received objects are sorted in a structured format. Further, a key object and a number of adjacent objects corresponding to the key object are selected from the sorted plurality of objects, wherein the number of adjacent objects is selected based on the specified number of nearest objects to be identified. Furthermore, distances between the key object and the number of adjacent objects are determined to identify the specified number of nearest objects, wherein the distances are determined until the specified number of nearest objects is identified. Based on the determined distances, the specified number of nearest objects in the dataset is identified for data analysis.
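One possible reading of the steps above, restricted to one-dimensional numeric objects, can be sketched as follows. The function name `k_nearest` and the choice of absolute difference as the distance are assumptions for illustration; the abstract does not specify the object type, the sort key, or the distance measure.

```python
def k_nearest(values, key_index, k):
    """Return the k values closest to the key object.

    Hypothetical 1-D sketch: `key_index` is the key's position in the
    sorted order, and distance is the absolute difference.
    """
    # Step 1: sort the received objects.
    ordered = sorted(values)
    key = ordered[key_index]

    # Step 2: select up to k adjacent objects on each side of the key.
    # In sorted 1-D data the k nearest objects must lie in this window.
    lo = max(0, key_index - k)
    hi = min(len(ordered), key_index + k + 1)
    candidates = [ordered[j] for j in range(lo, hi) if j != key_index]

    # Step 3: compute distances only for the candidates and keep the k smallest.
    candidates.sort(key=lambda v: abs(v - key))
    return candidates[:k]


values = [10, 1, 4, 7, 20, 5]
print(k_nearest(values, 2, 2))  # key is 5, the third value in sorted order
```

The point of the window is that distances are computed for at most 2k candidates rather than for every object in the dataset.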




data analysis

Bayesian Data Analysis, 3rd Edition [pdf]

https://news.ycombinator.com/item?id=23091359




data analysis

Testing goodness of fit for point processes via topological data analysis

Christophe A. N. Biscio, Nicolas Chenavier, Christian Hirsch, Anne Marie Svane.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 1024--1074.

Abstract:
We introduce tests for the goodness of fit of point patterns via methods from topological data analysis. More precisely, the persistent Betti numbers give rise to a bivariate functional summary statistic for observed point patterns that is asymptotically Gaussian in large observation windows. We analyze the power of tests derived from this statistic on simulated point patterns and compare its performance with global envelope tests. Finally, we apply the tests to a point pattern from an application context in neuroscience. As the main methodological contribution, we derive sufficient conditions for a functional central limit theorem on bounded persistent Betti numbers of point processes with exponential decay of correlations.




data analysis

Time series of count data: A review, empirical comparisons and data analysis

Glaura C. Franco, Helio S. Migon, Marcos O. Prates.

Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 4, 756--781.

Abstract:
Observation- and parameter-driven models are commonly used in the literature to analyse time series of counts. In this paper, we study the characteristics of a variety of models and point out the main differences and similarities among these procedures, concerning parameter estimation, model fitting and forecasting. In contrast to the literature, all inference was performed under the Bayesian paradigm. The models are fitted with a latent AR($p$) process in the mean, which accounts for autocorrelation in the data. An extensive simulation study shows that the estimates for the covariate parameters are remarkably similar across the different models. However, estimates for autoregressive coefficients and forecasts of future values depend heavily on the underlying process which generates the data. A real data set of bankruptcy in the United States is also analysed.




data analysis

Sampling random graph homomorphisms and applications to network data analysis. (arXiv:1910.09483v2 [math.PR] UPDATED)

A graph homomorphism is a map between two graphs that preserves adjacency relations. We consider the problem of sampling a random graph homomorphism from a graph $F$ into a large network $\mathcal{G}$. We propose two complementary MCMC algorithms for sampling random graph homomorphisms and establish bounds on their mixing times and concentration of their time averages. Based on our sampling algorithms, we propose a novel framework for network data analysis that circumvents some of the drawbacks in methods based on independent and neighborhood sampling. Various time averages of the MCMC trajectory give us various computable observables, including well-known ones such as homomorphism density and average clustering coefficient and their generalizations. Furthermore, we show that these network observables are stable with respect to a suitably renormalized cut distance between networks. We provide various examples and simulations demonstrating our framework through synthetic networks. We also apply our framework for network clustering and classification problems using the Facebook100 dataset and Word Adjacency Networks of a set of classic novels.




data analysis

Intrinsic Riemannian functional data analysis

Zhenhua Lin, Fang Yao.

Source: The Annals of Statistics, Volume 47, Number 6, 3533--3577.

Abstract:
In this work we develop a novel and foundational framework for analyzing general Riemannian functional data, in particular a new development of tensor Hilbert spaces along curves on a manifold. Such spaces enable us to derive Karhunen–Loève expansion for Riemannian random processes. This framework also features an approach to compare objects from different tensor Hilbert spaces, which paves the way for asymptotic analysis in Riemannian functional data analysis. Built upon intrinsic geometric concepts such as vector field, Levi-Civita connection and parallel transport on Riemannian manifolds, the developed framework applies to not only Euclidean submanifolds but also manifolds without a natural ambient space. As applications of this framework, we develop intrinsic Riemannian functional principal component analysis (iRFPCA) and intrinsic Riemannian functional linear regression (iRFLR) that are distinct from their traditional and ambient counterparts. We also provide estimation procedures for iRFPCA and iRFLR, and investigate their asymptotic properties within the intrinsic geometry. Numerical performance is illustrated by simulated and real examples.




data analysis

Bayesian mixed effects models for zero-inflated compositions in microbiome data analysis

Boyu Ren, Sergio Bacallado, Stefano Favaro, Tommi Vatanen, Curtis Huttenhower, Lorenzo Trippa.

Source: The Annals of Applied Statistics, Volume 14, Number 1, 494--517.

Abstract:
Detecting associations between microbial compositions and sample characteristics is one of the most important tasks in microbiome studies. Most of the existing methods apply univariate models to single microbial species separately, with adjustments for multiple hypothesis testing. We propose a Bayesian analysis for a generalized mixed effects linear model tailored to this application. The marginal prior on each microbial composition is a Dirichlet process, and dependence across compositions is induced through a linear combination of individual covariates, such as disease biomarkers or the subject’s age, and latent factors. The latent factors capture residual variability and their dimensionality is learned from the data in a fully Bayesian procedure. The proposed model is tested in data analyses and simulation studies with zero-inflated compositions. In these settings and within each sample, a large proportion of counts per microbial species are equal to zero. In our Bayesian model a priori the probability of compositions with absent microbial species is strictly positive. We propose an efficient algorithm to sample from the posterior and visualizations of model parameters which reveal associations between covariates and microbial compositions. We evaluate the proposed method in simulation studies, and then analyze a microbiome dataset for infants with type 1 diabetes which contains a large proportion of zeros in the sample-specific microbial compositions.




data analysis

Bayesian Sparse Multivariate Regression with Asymmetric Nonlocal Priors for Microbiome Data Analysis

Kurtis Shuler, Marilou Sison-Mangus, Juhee Lee.

Source: Bayesian Analysis, Volume 15, Number 2, 559--578.

Abstract:
We propose a Bayesian sparse multivariate regression method to model the relationship between microbe abundance and environmental factors for microbiome data. We model abundance counts of operational taxonomic units (OTUs) with a negative binomial distribution and relate covariates to the counts through regression. Extending conventional nonlocal priors, we construct asymmetric nonlocal priors for regression coefficients to efficiently identify relevant covariates and their effect directions. We build a hierarchical model to facilitate pooling of information across OTUs that produces parsimonious results with improved accuracy. We present simulation studies that compare variable selection performance under the proposed model to those under Bayesian sparse regression models with asymmetric and symmetric local priors and two frequentist models. The simulations show the proposed model identifies important covariates and yields coefficient estimates with favorable accuracy compared with the alternatives. The proposed model is applied to analyze an ocean microbiome dataset collected over time to study the association of harmful algal bloom conditions with microbial communities.




data analysis

Comment on “Automated Versus Do-It-Yourself Methods for Causal Inference: Lessons Learned from a Data Analysis Competition”

Susan Gruber, Mark J. van der Laan.

Source: Statistical Science, Volume 34, Number 1, 82--85.

Abstract:
Dorie and co-authors (DHSSC) are to be congratulated for initiating the ACIC Data Challenge. Their project engaged the community and accelerated research by providing a level playing field for comparing the performance of a priori specified algorithms. DHSSC identified themes concerning characteristics of the DGP, properties of the estimators, and inference. We discuss these themes in the context of targeted learning.




data analysis

IBM AI – Watson’s role must be expanded to data analysis and forecasting trends

ICMR, at present, is only using Watson for backend reporting, but it also needs to deploy it for data analysis and forecasting trends.




data analysis

Characterizing and inferring quantitative cell cycle phase in single-cell RNA-seq data analysis [METHOD]

Cellular heterogeneity in gene expression is driven by cellular processes, such as cell cycle and cell-type identity, and cellular environment such as spatial location. The cell cycle, in particular, is thought to be a key driver of cell-to-cell heterogeneity in gene expression, even in otherwise homogeneous cell populations. Recent advances in single-cell RNA-sequencing (scRNA-seq) facilitate detailed characterization of gene expression heterogeneity and can thus shed new light on the processes driving heterogeneity. Here, we combined fluorescence imaging with scRNA-seq to measure cell cycle phase and gene expression levels in human induced pluripotent stem cells (iPSCs). By using these data, we developed a novel approach to characterize cell cycle progression. Although standard methods assign cells to discrete cell cycle stages, our method goes beyond this and quantifies cell cycle progression on a continuum. We found that, on average, scRNA-seq data from only five genes predicted a cell's position on the cell cycle continuum to within 14% of the entire cycle and that using more genes did not improve this accuracy. Our data and predictor of cell cycle phase can directly help future studies to account for cell cycle–related heterogeneity in iPSCs. Our results and methods also provide a foundation for future work to characterize the effects of the cell cycle on expression heterogeneity in other cell types.




data analysis

Medidata analysis shows COVID-19 impact on trials

The company's global analysis from thousands of studies and sites indicates dramatic shifts in enrollment across several countries since the pandemic began.




data analysis

Tax-News.com: Tax Agencies Meet To Discuss Use Of Big Data Analysis

Tax agencies from 31 countries discussed the ways they are using data analysis tools to improve tax enforcement and administration at the first meeting of the IOTA Forum on Tax Debt Management, held in Prague, Czech Republic, on October 1-3, 2019.




data analysis

Release of a discussion draft on BEPS Action 11 (Data Analysis)

Public Comments are invited on a discussion draft which deals with Action 11 (Improving the analysis of BEPS) of the BEPS Action Plan.




data analysis

Public comments received on discussion draft on BEPS Action 11 (Data Analysis)

On 16 April 2015, interested parties were invited to comment on the discussion draft on Action 11 (Data Analysis) of the BEPS Action Plan. The OECD is grateful to the commentators for their input and is now publishing the comments received.




data analysis

Cellphone location data analysis in Wuhan virology lab suggests 'hazardous event', October shutdown

The report, which was based on commercially available cellphone location data, indicates that there might have been a 'hazardous event' at China's Wuhan Institute of Virology in early October.




data analysis

Analyzing time interval data : introducing an information system for time interval data analysis [Electronic book] / Philipp Meisen.

Wiesbaden : Springer Vieweg, [2016]




data analysis

Modeling and data analysis: an introduction with environmental applications / John B. Little

Hayden Library - GE45.M37 L57 2019




data analysis

Bayesian data analysis / Andrew Gelman, John B. Carlin, Hal S. Stern, David B. Dunson, Aki Vehtari, Donald B. Rubin

Gelman, Andrew, author




data analysis

Qualitative data analysis : a methods sourcebook / Matthew B. Miles, A. Michael Huberman, Johnny Saldaña

Miles, Matthew B., author




data analysis

Genomics data analysis: false discovery rates and empirical Bayes methods / David R. Bickel, University of Ottawa, Ottawa Institute of Systems Biology, Department of Biochemistry, Microbiology and Immunology, Department of Mathematics and Statistics

Dewey Library - QH438.4.S73 B53 2019




data analysis

An introduction to data analysis : quantitative, qualitative and mixed methods / Tiffany Bergin

Bergin, Tiffany, author




data analysis

Understanding downhole microseismic data analysis: with applications in hydraulic fracture monitoring / Jubran Akram

Online Resource




data analysis

Data analysis for Omic sciences: methods and applications / edited by Joaquim Jaumot, Carmen Bedia, Romà Tauler

Hayden Library - QA76.9.Q36 D38 2018




data analysis

Asteroseismic data analysis: foundations and techniques / Sarbani Basu and William J. Chaplin

Hayden Library - QB812.B37 2017




data analysis

Astronomical Data Analysis Software and Systems XXVIII: proceedings of a conference held at The Hotel at the University of Maryland, College Park, Maryland, USA, 11-15 November 2018 / edited by Peter J. Teuben, Marc W. Pound, Brian A. Thomas, Elizabeth M.

Hayden Library - QB51.3.E43 A88 2018




data analysis

Astronomical Data Analysis Software and Systems XXVI: proceedings of a conference held at Stazione Marittima, Trieste, Italy, 16-20 October 2016 / edited by Marco Molinaro, Keith Shortridge, Fabio Pasian

Hayden Library - QB51.3.E43 A88 2019




data analysis

Astronomical Data Analysis Software and Systems XXVII: proceedings of a conference held at Sheraton Santiago Convention Center, Santiago de Chile, Chile, 22-26 October 2017 / edited by Pascal Ballester, Jorge Ibsen, Mauricio Solar, Keith Shortridge

Dewey Library - QB51.3.E43 A88 2017




data analysis

Data Analysis and Reduction for Big Scientific Data (DRBSD-5), IEEE/ACM International Workshop on [electronic journal].

IEEE / Institute of Electrical and Electronics Engineers Incorporated




data analysis

2019 IEEE/ACM 5th International Workshop on Data Analysis and Reduction for Big Scientific Data (DRBSD-5) [electronic journal].

IEEE / Institute of Electrical and Electronics Engineers Incorporated



