data

The data on extreme human ageing is flawed

most "blue zones," concentrated areas of supercentenarians, can be attributed to pension fraud or bad record-keeping




data

Cruftbox’s Halloween 2024 costume data

for the last 19 years, Michael Pusateri has tracked children's Halloween costumes at his front door and published the stats online





data

No additional Treasury funds for PSNI data breach

NI is not currently in line to get one-off funding to cover the costs of the PSNI data breach.




data

Adding structured data support for Product Variants

In 2022, Google expanded support for Product structured data, enabling enhanced product experiences in Google Search. Then, in 2023 we added support for shipping and returns structured data. Today, we are adding structured data support for Product variants, allowing merchants to easily show more variations of the products they sell, and show shoppers more relevant, helpful results. Providing variant structured data will also complement and enhance merchant center feeds, including automated feeds.
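
As a rough illustration of what variant markup can look like, the sketch below builds a schema.org `ProductGroup` as JSON-LD from Python. The `ProductGroup` type and the `productGroupID`, `variesBy`, and `hasVariant` properties exist in schema.org; the helper name, the sample product, and the choice of fields are assumptions made for illustration, and the announcement itself does not say which properties Google requires.

```python
import json


def product_group(group_id, name, varies_by, variants):
    """Build a schema.org ProductGroup dict (hypothetical helper, not a Google API)."""
    return {
        "@context": "https://schema.org",
        "@type": "ProductGroup",
        "productGroupID": group_id,
        "name": name,
        # Properties along which the variants differ, e.g. size and color.
        "variesBy": [f"https://schema.org/{prop}" for prop in varies_by],
        "hasVariant": [
            {
                "@type": "Product",
                "sku": v["sku"],
                "name": f'{name} ({v["color"]}, {v["size"]})',
                "color": v["color"],
                "size": v["size"],
                "offers": {
                    "@type": "Offer",
                    "price": v["price"],
                    "priceCurrency": v["currency"],
                },
            }
            for v in variants
        ],
    }


# Invented sample data for a two-variant product.
markup = product_group(
    group_id="coat-001",
    name="Wool winter coat",
    varies_by=["size", "color"],
    variants=[
        {"sku": "coat-001-s-grey", "color": "Grey", "size": "S", "price": "119.00", "currency": "USD"},
        {"sku": "coat-001-m-navy", "color": "Navy", "size": "M", "price": "119.00", "currency": "USD"},
    ],
)
json_ld = json.dumps(markup, indent=2)
```

The resulting `json_ld` string would normally be embedded in a `<script type="application/ld+json">` element on the product page.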




data

Integrating Personal Web Data through Semantically Enhanced Web Portal

Currently, the World Wide Web is mostly composed of isolated and loosely connected "data islands". Connecting them together and retrieving only the information that is of interest to the user is the common Web usage process. Creating infrastructure that would support automation of that process by aggregating and integrating Web data in accordance with the user's personal preferences would greatly improve today's Web usage. A significant part of Web data is available only through login and password protected applications. As that data is very important for the usefulness of the described process, the proposed infrastructure needs to support authorized access to the user's personal data. In this paper we propose a semantically enhanced Web portal that presents a unique personalized user entry point to domain-specific Web information. We also propose an identity management system that supports authorized access to the protected Web data. To verify the proposed solution, we have built Sweb - a semantically enhanced Web portal that uses the proposed identity management system.




data

Knowledge Extraction from RDF Data with Activation Patterns

RDF data can be analyzed with various query languages such as SPARQL. However, due to their nature these query languages do not support fuzzy queries that would allow us to extract a broad range of additional information. In this article we present a new method that transforms the information presented by subject-relation-object relations within RDF data into Activation Patterns. These patterns represent a common model that is the basis for a number of sophisticated analysis methods such as semantic relation analysis, semantic search queries, unsupervised clustering, supervised learning or anomaly detection. In this article, we explain the Activation Patterns concept and apply it to an RDF representation of the well-known CIA World Factbook.




data

A feature-based model selection approach using web traffic for tourism data

The increased volume of accessible internet data creates an opportunity for researchers and practitioners to improve time series forecasting for many indicators. In our study, we assess the value of web traffic data in forecasting the number of short-term visitors travelling to Australia. We propose a feature-based model selection framework which combines a random forest with a feature ranking process to select the best performing model using a limited but informative number of features extracted from web traffic data. The data was obtained for several tourist attraction and tourism information websites that could be visited by potential tourists to find out more about their destinations. The results of the random forest models were evaluated over 3- and 12-month forecasting horizons. Features from web traffic data appear in the final model for short-term forecasting. Further, the model with additional data performs better on unseen data after the COVID-19 pandemic. Our study shows that web traffic data adds value to tourism forecasting and can assist tourist destination site managers and decision makers in forming timely decisions to prepare for changes in tourism demand.




data

Transactions on Data Privacy 12:2 (2019)

Transactions on Data Privacy, Volume 12 Issue 2 (2019) has been published.




data

Transactions on Data Privacy 12:3 (2019)

Transactions on Data Privacy, Volume 12 Issue 3 (2019) has been published.




data

Transactions on Data Privacy 13:1 (2020)

Transactions on Data Privacy, Volume 13 Issue 1 (2020) has been published.




data

Transactions on Data Privacy 13:2 (2020)

Transactions on Data Privacy, Volume 13 Issue 2 (2020) has been published.




data

Transactions on Data Privacy 13:3 (2020)

Transactions on Data Privacy, Volume 13 Issue 3 (2020) has been published.




data

Transactions on Data Privacy 14:1 (2021)

Transactions on Data Privacy, Volume 14 Issue 1 (2021) has been published.




data

Transactions on Data Privacy 14:2 (2021)

Transactions on Data Privacy, Volume 14 Issue 2 (2021) has been published.




data

Transactions on Data Privacy 14:3 (2021)

Transactions on Data Privacy, Volume 14 Issue 3 (2021) has been published.




data

Transactions on Data Privacy 15:1 (2022)

Transactions on Data Privacy, Volume 15 Issue 1 (2022) has been published.




data

Transactions on Data Privacy 15:2 (2022)

Transactions on Data Privacy, Volume 15 Issue 2 (2022) has been published.




data

Transactions on Data Privacy 15:3 (2022)

Transactions on Data Privacy, Volume 15 Issue 3 (2022) has been published.




data

Transactions on Data Privacy 16:1 (2023)

Transactions on Data Privacy, Volume 16 Issue 1 (2023) has been published.




data

Transactions on Data Privacy 16:3 (2023)

Transactions on Data Privacy, Volume 16 Issue 3 (2023) has been published.




data

Transactions on Data Privacy 17:1 (2024)

Transactions on Data Privacy, Volume 17 Issue 1 (2024) has been published.




data

Transactions on Data Privacy 17:2 (2024)

Transactions on Data Privacy, Volume 17 Issue 2 (2024) has been published.




data

Transactions on Data Privacy 17:3 (2024)

Transactions on Data Privacy, Volume 17 Issue 3 (2024) has been published.




data

Research on Weibo marketing advertising push method based on social network data mining

The current advertising push methods have low accuracy and poor advertising conversion effects. Therefore, a Weibo marketing advertising push method based on social network data mining is studied. Firstly, establish a social network graph and use graph clustering algorithm to mine the association relationships of users in the network. Secondly, through sparsisation processing, the association between nodes in the social network graph is excavated. Then, evaluate the tightness between user preferences and other nodes in the social network, and use the TF-IDF algorithm to extract user interest features. Finally, an attention mechanism is introduced to improve the deep learning model, which matches user interests with advertising domain features and outputs push results. The experimental results show that the push accuracy of this method is higher than 95%, with a maximum advertising click through rate of 82.7% and a maximum advertising conversion rate of 60.7%.
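
The TF-IDF step named in the abstract can be sketched in a few lines. This is a minimal illustration of standard TF-IDF weighting only, not the paper's pipeline: the graph clustering, sparsification, and attention-based matching stages are omitted, and the user names, posts, and helper functions are invented for the example.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenised document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def top_interests(term_weights, k=2):
    """Pick the k highest-weighted terms, breaking ties alphabetically."""
    ranked = sorted(term_weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical posts, already tokenised and grouped per user.
posts = {
    "user_a": "phone camera phone review battery".split(),
    "user_b": "recipe kitchen review recipe".split(),
    "user_c": "football match review goal".split(),
}
users = list(posts)
scores = dict(zip(users, tf_idf([posts[u] for u in users])))
interests = {u: top_interests(scores[u]) for u in users}
```

A term such as "review" that appears in every user's posts gets weight zero, so it never surfaces as an interest feature; terms concentrated in one user's posts rank highest.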




data

Data dissemination and policy enforcement in multi-level secure multi-domain environments

Several challenges exist in disseminating multi-level secure (MLS) data in multi-domain environments. First, the security domains participating in data dissemination generally use different MLS labels and lattice structures. Second, when MLS data objects are transferred across multiple domains, there is a need for an agreed security policy that must be properly applied, and correctly enforced for the data objects. Moreover, the data sender may not be able to predetermine the data recipients located beyond its trust boundary. To address these challenges, we propose a new framework that enables secure dissemination and access of the data as intended by the owner. Our novel framework leverages simple public key infrastructure and active bundle, and allows domains to securely disseminate data without the need to repackage it for each domain.




data

An effective differential privacy protection method of location data based on perturbation loss constraint

Differential privacy is usually applied to location privacy protection scenarios, which confuses real data by adding interference noise to location points to achieve the purpose of protecting privacy. However, this method can result in a significant amount of redundant noisy data and impact the accuracy of the location. Considering the security and practicability of location data, an effective differential privacy protection method of location data based on perturbation loss constraint is proposed. After applying the Laplace mechanism under the condition of differential privacy to perturb the location data, the Savitzky-Golay filtering technology is used to correct the data with noise, and the data with large deviation and low availability is optimised. The introduction of the Savitzky-Golay filtering mechanism in differential privacy can reduce the error caused by noisy data while protecting user privacy. The experimental results indicate that the scheme improves the practicability of location data and is feasible.
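
The two stages described above, Laplace perturbation followed by Savitzky-Golay smoothing, can be sketched as follows. This is an illustrative toy under stated assumptions, not the paper's method: noise is added independently to each coordinate with scale `sensitivity / epsilon`, the filter is the fixed 5-point quadratic Savitzky-Golay kernel, and the perturbation loss constraint itself is not modelled. All function names and the sample track are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample: the difference of two i.i.d. exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb(points, sensitivity, epsilon, rng):
    """Add Laplace noise with scale sensitivity/epsilon to each coordinate."""
    scale = sensitivity / epsilon
    return [
        (x + laplace_noise(scale, rng), y + laplace_noise(scale, rng))
        for x, y in points
    ]


# 5-point quadratic Savitzky-Golay smoothing coefficients (normalised by 35).
SG_COEFFS = (-3, 12, 17, 12, -3)


def savgol5(values):
    """Smooth interior samples; the two samples at each end are left unchanged."""
    out = list(values)
    for i in range(2, len(values) - 2):
        window = values[i - 2:i + 3]
        out[i] = sum(c * v for c, v in zip(SG_COEFFS, window)) / 35.0
    return out


rng = random.Random(0)  # fixed seed so the example is repeatable
track = [(float(i), 2.0 * i) for i in range(10)]  # invented straight-line track
noisy = perturb(track, sensitivity=1.0, epsilon=0.5, rng=rng)
smoothed = list(zip(
    savgol5([p[0] for p in noisy]),
    savgol5([p[1] for p in noisy]),
))
```

Because the smoothing only post-processes the already perturbed points and never touches the raw track, it does not weaken the differential privacy guarantee of the Laplace step.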




data

Integrating big data collaboration models: advancements in health security and infectious disease early warning systems

In order to further improve the public health assurance system and the infectious diseases early warning system to give play to their positive roles and enhance their collaborative capacity, this paper, based on the big and thick data analytics technology, designs a 'rolling-type' data synergy model. This model covers districts and counties, municipalities, provinces, and the country. It forms a data blockchain for the public health assurance system and enables high sharing of data from existing system platforms such as the infectious diseases early warning system, the hospital medical record management system, the public health data management system, and the health big and thick data management system. Additionally, it realises prevention, control and early warning by utilising data mining and synergy technologies, and ideally solves problems of traditional public health assurance system platforms such as excessive pressure on the 'central node', poor data tamper-proofing capacity, low transmission efficiency of big and thick data, bad timeliness of emergency response, and so on. The realisation of this technology can greatly improve the application and analytics of big and thick data and further enhance the public health assurance capacity.




data

Access controllable multi-blockchain platform for enterprise R&D data management

In the era of big data, enterprises have accumulated a large amount of research and development data. Effective management of their precipitated data and safe sharing of data can improve the collaboration efficiency of research and development personnel, which has become the top priority of enterprise development. This paper proposes to use blockchain technology to assist the collaboration efficiency of enterprise R&D personnel. Firstly, the multi-chain blockchain platform is used to realise the data sharing of internal data of enterprise R&D data department, project internal data and enterprise data centre, and then the process of construction of multi-chain structure and data sharing is analysed. Finally, searchable encryption was introduced to achieve data retrieval and secure sharing, improving the collaboration efficiency of enterprise research and development personnel and maximising the value of data assets. Through the experimental verification, the multi-chain structure improves the collaboration efficiency of researchers and data security sharing.




data

Human resource management and organisation decision optimisation based on data mining

The utilisation of big data presents significant opportunities for businesses to create value and gain a competitive edge. This capability enables firms to anticipate and uncover information quickly and intelligently. The author introduces a human resource scheduling optimisation strategy using a parallel network fusion structure model. The author's approach involves designing a set of network structures based on parallel networks and streaming media, enabling the macro implementation of the enterprise parallel network fusion structure. Furthermore, the author proposes a human resource scheduling optimisation method based on a parallel deep learning network fusion structure. It combines convolutional neural networks and transformer networks to fuse streaming media features, thereby achieving comprehensive identification of the effectiveness of the current human resource scheduling in enterprises. The result shows that the macro and deep learning methods achieve a recognition rate of 87.53%, making it feasible to assess the current state of human resource scheduling in enterprises.




data

An empirical study on construction emergency disaster management and risk assessment in shield tunnel construction project with big data analysis

Emergency disaster management presents substantial risks and obstacles to shield tunnel building projects, particularly in the event of water leakage accidents. Contemporary water leak detection is critical for guaranteeing safety by reducing the likelihood of disasters and the severity of any resulting damages. However, it can be difficult. Deep learning models can analyse images taken inside the tunnel to look for signs of water damage. This study introduces a unique strategy that employs deep learning techniques, generative adversarial networks (GAN) with long short-term memory (LSTM) for water leakage detection in shield tunnel construction (WLD-STC), to conduct classification and prediction tasks on the massive image dataset. The results demonstrate that for identifying and analysing water leakage episodes during shield tunnel construction, the WLD-STC strategy using LSTM-based GAN networks outperformed other methods, particularly on huge data.




data

Dual network control system for bottom hole throttling pressure control based on RBF with big data computing

In the context of smart city development, the managed pressure drilling (MPD) process faces many uncertainties, but the characteristics of the process are complex and require accurate wellbore pressure control. However, this process runs the risk of introducing un-modelled dynamics into the system. To address this problem, this paper employs neural network control techniques to construct a dual-network system for throttle pressure control; the design encompasses both the controller and identifier components. The radial basis function (RBF) network and proportional features are connected in parallel in the controller structure, and the RBF network learning algorithm is used to train the identifier structure. The simulation results show that the actual wellbore pressure can quickly track the reference pressure value when the pressure setpoint changes. In addition, the controller based on the neural network realises effective control, which enables the system to track the input target quickly and achieve stable convergence.




data

Design of data mining system for sports training biochemical indicators based on artificial intelligence and association rules

Physiological indicators are an important basis for reflecting the physiological health status of the human body and play an important role in medical practice. Association rules have also been one of the important research hotspots in recent years. This study aims to create a data mining system of association rules and artificial intelligence in biochemical indicators of sports training. This article uses Markov logic for network creation and system training, and tests whether the Markov logic network can be associated with the training system. The results show that the accuracy and recall rate obtained are about 90%, which shows that it is feasible to establish biochemical indicators of sports training based on Markov logic network, and the system has universal, guiding and constructive significance, ensuring that the construction of training system indicators will not go in the wrong direction.




data

International Journal of Data Mining and Bioinformatics




data

A Realistic Data Warehouse Project: An Integration of Microsoft Access® and Microsoft Excel® Advanced Features and Skills




data

Algorithm Visualization System for Teaching Spatial Data Algorithms




data

Database Security: What Students Need to Know




data

A Tools-Based Approach to Teaching Data Mining Methods




data

Automatic Grading of Spreadsheet and Database Skills




data

A Database Practicum for Teaching Database Administration and Software Development at Regis University




data

Using Educational Data Mining to Predict Students’ Academic Performance for Applying Early Interventions

Aim/Purpose: One of the main objectives of higher education institutions is to provide a high-quality education to their students and reduce dropout rates. This can be achieved by predicting students’ academic achievement early using Educational Data Mining (EDM). This study aims to predict students’ final grades and identify honorary students at an early stage. Background: EDM research has emerged as an exciting research area, which can unfold valuable knowledge from educational databases for many purposes, such as identifying the dropouts and students who need special attention and discovering honorary students for allocating scholarships. Methodology: In this work, we have collected 300 undergraduate students’ records from three departments of a Computer and Information Science College at a university located in Saudi Arabia. We compared the performance of six data mining methods in predicting academic achievement. Those methods are C4.5, Simple CART, LADTree, Naïve Bayes, Bayes Net with ADTree, and Random Forest. Contribution: We tested the significance of correlation attribute predictors using four different methods. We found 9 out of 18 proposed features with a significant correlation for predicting students’ academic achievement after their 4th semester. Those features are student GPA during the first four semesters, the number of failed courses during the first four semesters, and the grades of three core courses, i.e., database fundamentals, programming language (1), and computer network fundamentals. 
Findings: The empirical results show the following: (i) the main features that can predict students’ academic achievement are the student GPA during the first four semesters, the number of failed courses during the first four semesters, and the grades of three core courses; (ii) Naïve Bayes classifier performed better than Tree-based Models in predicting students’ academic achievement in general, however, Random Forest outperformed Naïve Bayes in predicting honorary students; (iii) English language skills do not play an essential role in students’ success at the college of Computer and Information Sciences; and (iv) studying an orientation year does not contribute to students’ success. Recommendations for Practitioners: We would recommend instructors to consider using EDM in predicting students’ academic achievement and benefit from that in customizing students’ learning experience based on their different needs. Recommendation for Researchers: We would highly endorse that researchers apply more EDM studies across various universities and compare between them. For example, future research could investigate the effects of offering tutoring sessions for students who fail core courses in their first semesters, examine the role of language skills in social science programs, and examine the role of the orientation year in other programs. Impact on Society: The prediction of academic performance can help both teachers and students in many ways. It also enables the early discovery of honorary students. Thus, well-deserved opportunities can be offered; for example, scholarships, internships, and workshops. It can also help identify students who require special attention to take an appropriate intervention at the earliest stage possible. Moreover, instructors can be aware of each student’s capability and customize the teaching tasks based on students’ needs. Future Research: For future work, the experiment can be repeated with a larger dataset. 
It could also be extended with more distinctive attributes to reach more accurate results that are useful for improving the students’ learning outcomes. Moreover, experiments could be done using other data mining algorithms to get a broader approach and more valuable and accurate outputs.




data

Implementing Team-Based Learning: Findings From a Database Class

Aim/Purpose: The complexity of today’s organizational databases highlights the importance of hard technical skills as well as soft skills including teamwork, communication, and problem-solving. Therefore, when teaching students about databases it follows that using a team approach would be useful. Background: Team-based learning (TBL) has been developed and tested as an instructional strategy that leverages learning in small groups in order to achieve increased overall effectiveness. This research studies the impact of utilizing team-based learning strategies in an undergraduate Database Management course in order to determine if the methodology is effective for student learning related to database technology concepts in addition to student preparation for working in database teams. Methodology: In this study, a team-based learning strategy is implemented in an undergraduate Database Management course over the course of two semesters. Students were assessed both individually and in teams in order to see if students were able to effectively learn and apply course concepts on their own and in collaboration with their team. Quantitative and qualitative data was collected and analyzed in order to determine if the team approach improved learning effectiveness and allowed for soft skills development. The results from this study are compared to previous semesters when team-based learning was not adopted. Additionally, student perceptions and feedback are captured. Contribution: This research contributes to the literature on database education and team-based learning and presents a team-based learning process for faculty looking to adopt this methodology in their database courses. This research contributes by showing how the collaborative assessment aspect of team-based learning can provide a solution for the conceptual and collaborative needs of database education. 
Findings: Findings related to student learning and perceptions are presented illustrating that team-based learning can lead to improvements in performance and provides a solution for the conceptual and collaborative needs of database education. Specifically, the findings do show that team scores were significantly higher than individual scores when completing class assessments. Student perceptions of both their team members and the team-based learning process were overall positive with a notable difference related to the perception of team preparedness based on gender. Recommendations for Practitioners: Educational implications highlight the challenges of team-based learning for assessment (e.g., gender differences in perceptions of team preparedness), as well as the benefits (e.g., development of soft skills including teamwork and communication). Recommendation for Researchers: This study provides research implications supporting the study of team assessment techniques for learning and engagement in the context of database education. Impact on Society: Faculty looking to develop student skills in relation to database concepts and application as well as in relation to teamwork and communication may find value in this approach, ultimately benefiting students, employers, and society. Future Research: Future research may examine the methodology from this study in different contexts as well as explore different strategies for group assignments, room layout, and the impact of an online environment.




data

Intelligent traffic congestion discrimination method based on wireless sensor network front-end data acquisition

Conventional intelligent traffic congestion discrimination methods mainly use GPS terminals to collect traffic congestion data, which is vulnerable to the influence of vehicle time distribution, resulting in a poor final discrimination effect. It is therefore necessary to design a new intelligent traffic congestion discrimination method based on wireless sensor network front-end data collection. The method uses the front-end data acquisition technology of a wireless sensor network to build a front-end data acquisition platform that obtains intelligent traffic congestion data, and then designs an intelligent traffic congestion discrimination algorithm based on traffic congestion rules so as to achieve intelligent traffic congestion discrimination. The experimental results show that the intelligent traffic congestion discrimination method designed based on the front-end data collection of the wireless sensor network has a good discrimination effect; the obtained discrimination data is more accurate and effective and has certain application value, contributing to a reduction in the frequency of urban traffic accidents.




data

High quality management of higher education based on data mining

In order to improve the quality of higher education, student satisfaction, and employment rate, a data mining based high-quality management method for higher education is proposed. Firstly, construct a high-quality evaluation system for higher education based on the principles of education quality evaluation. Secondly, the association rule mining method is used to construct a university education quality management model and determine the weight of the impact indicators for high-quality management of university education. Finally, the fuzzy evaluation method is used to determine the high-quality evaluation function of higher education, and the results of high-quality evaluation of higher education are obtained. High-quality management strategies are developed based on the evaluation results to improve the quality of education. The experimental results show that the student satisfaction rate of this method can reach 99.3%, and the student employment rate can reach 99.9%.




data

Reflections on strategies for psychological health education for college students based on data mining

In order to improve the mental health level of college students, a data mining based mental health education strategy for college students is proposed. Firstly, analyse the characteristics of data mining and its potential value in mental health education. Secondly, after denoising the mental health data of college students using wavelet transform, data mining methods are used to identify the psychological crisis status of college students. Finally, based on the psychological crisis status of college students, measures for mental health education are proposed from the following aspects: building a psychological counselling platform, launching psychological health promotion activities, establishing a psychological support network, strengthening academic guidance and stress management. The example analysis results show that after the application of the strategy in this article, the psychological health scores of college students have been effectively improved, with an average score of 93.5 points.




data

A data classification method for innovation and entrepreneurship in applied universities based on nearest neighbour criterion

Aiming to improve the accuracy, recall, and F1 value of data classification, this paper proposes an applied university innovation and entrepreneurship data classification method based on the nearest neighbour criterion. Firstly, the decision tree algorithm is used to mine innovation and entrepreneurship data from applied universities. Then, dynamic weight is introduced to improve the similarity calculation method based on edit distance, and the improved method is used to realise data de-duplication to avoid data over fitting. Finally, the nearest neighbour criterion method is used to classify applied university innovation and entrepreneurship data, and cosine similarity is used to calculate the similarity between the samples to be classified and each sample in the training data, achieving data classification. The experimental results demonstrate that the proposed method achieves a maximum accuracy of 96.5% and an average F1 score of 0.91. These findings indicate a high level of accuracy, recall, and F1 value for data classification using the proposed method.
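
The final classification step, nearest neighbour voting with cosine similarity, can be sketched as below. It is a minimal illustration under assumptions: the feature vectors, labels, and function names are invented, and the decision-tree mining and edit-distance de-duplication stages described in the abstract are not reproduced.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(sample, training, k=1):
    """Label a sample by majority vote among its k most similar training items."""
    ranked = sorted(
        training,
        key=lambda item: cosine_similarity(sample, item[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Invented three-feature records with hypothetical class labels.
training = [
    ([5.0, 1.0, 0.0], "innovation"),
    ([4.0, 2.0, 1.0], "innovation"),
    ([0.0, 1.0, 5.0], "entrepreneurship"),
    ([1.0, 0.0, 4.0], "entrepreneurship"),
]
label_a = classify([3.0, 1.0, 0.5], training, k=3)
label_b = classify([0.0, 0.5, 3.0], training, k=1)
```

Cosine similarity compares direction and ignores magnitude, so a short record and a long record with the same feature proportions count as near neighbours.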




data

Learning behaviour recognition method of English online course based on multimodal data fusion

The conventional methods for identifying English online course learning behaviours have the problems of low recognition accuracy and high time cost. Therefore, a multimodal data fusion-based method for identifying English online course learning behaviours is proposed. Firstly, the analytic hierarchy process is used for decision fusion of multimodal data of learning behaviour. Secondly, based on the fusion results of multimodal data, weight coefficients are set to minimise losses and extract learning behaviour features. Finally, based on the extracted learning behaviour characteristics, the optimal classification function is constructed to classify the learning behaviour of English online courses. Based on the transfer information of learning behaviour status, the identification of online course learning behaviour is completed. The experimental results show that the recognition accuracy of the proposed method is above 90% and that it can shorten the recognition time of learning behaviour, with high practical application reliability.




data

A method for evaluating the quality of college curriculum teaching reform based on data mining

In order to improve the evaluation effect of current university teaching reform, a new method for evaluating the quality of university course teaching reform is proposed based on data mining algorithms. Firstly, the optimal data clustering criterion was used to select evaluation indicators and a quality evaluation system for university curriculum teaching reform was established. Next, a reform quality evaluation model is constructed using BP neural network, and the training process is improved through genetic algorithm to obtain the model weight and threshold of the optimal solution. Finally, the calculated parameters are substituted into the model to achieve accurate evaluation of the quality of university curriculum teaching reform. Selecting evaluation accuracy and evaluation efficiency as evaluation indicators, the practicality of the proposed method was verified through experiments. The experimental results showed that the proposed method can mine teaching reform data and evaluate the quality of teaching reform. Its evaluation accuracy is higher than 96.3%, and the evaluation time is less than 10ms, which is much better than the comparison method, fully demonstrating the practicality of the method.




data

Evaluation method of teaching reform quality in colleges and universities based on big data analysis

Research on the quality evaluation of teaching reforms plays an important role in promoting improvements in teaching quality. Therefore, an evaluation method of teaching reform quality in colleges and universities based on big data analysis is proposed. A multivariate logistic model is used to select the evaluation indicators for the quality evaluation of teaching reforms in universities. Clustering and cleaning of the evaluation indicator data are then performed through big data analysis. The evaluation indicator data is used as input vectors, and the results of the teaching reform quality evaluation are used as output vectors. A support vector machine model based on the whale algorithm is built to obtain the relevant evaluation results. Experimental results show that the proposed method achieves a minimum recall rate of 98.7% for evaluation indicator data, a minimum data processing time of 96.3 ms, and an accuracy rate consistently above 97.1%.




data

A personalised recommendation method for English teaching resources on MOOC platform based on data mining

In order to enhance the accuracy of teaching resource recommendation results and optimise user experience, a personalised recommendation method for English teaching resources on the MOOC platform based on data mining is proposed. First, the learner's evaluation of resources and resource attributes are abstracted into the same space, and resource tags are established using the knowledge graph. Then, interest preference constraints are introduced to mine sequential patterns of user historical learning behaviour in the MOOC platform. Finally, a graph neural network is used to construct a recommendation model, which adjusts users' long-term and short-term interest parameters to achieve dynamic personalised recommendation of teaching resources. The experimental results show that the accuracy and recall of the resource recommendation results of the research method are always higher than 0.9, and the normalised sorting gain is always higher than 0.5.