1 |
Data Quality Model for Machine Learning
Nitesh Varma Rudraraju, Varun Boyanapally, January 2019
Context: Machine learning is a branch of artificial intelligence that continues to grow rapidly. Most internet services, such as social media, email spam filtering, e-commerce sites, and search engines, now use machine learning. The quality of machine learning output depends on the input data, so good-quality input data can give a machine learning system a better outcome. To obtain quality data, a data scientist can apply a data quality model to the data used for machine learning. Such a model can help data scientists monitor and control the input data. However, little research has been done on data quality attributes and data quality models for machine learning.
Objectives: The primary objectives of this paper are to find and understand the state of the art and state of practice on data quality attributes for machine learning, and to develop a data quality model for machine learning in collaboration with data scientists.
Methods: This paper consists of two studies. 1) A literature review was conducted across different databases to identify literature on data quality attributes and data quality models for machine learning. 2) An in-depth interview study was conducted to better understand and verify the data quality attributes identified in the literature review, in collaboration with data scientists from multiple locations. In total, 15 interviews were performed, and based on the interviewees' perspectives we proposed a data quality model.
Results: We identified 16 data quality attributes as important, based on the perspective of the experienced data scientists interviewed in this study. With these selected attributes, we proposed a data quality model with which data scientists can monitor and improve the quality of data for machine learning; the effects of these attributes on machine learning are also stated.
Conclusion: This study underlines the importance of data quality, for which we proposed a data quality model for machine learning based on the industrial experience of data scientists. Addressing this research gap benefits all machine learning practitioners and data scientists who intend to identify quality data for machine learning. To show that the data quality attributes in the model are important, a further experiment can be conducted, which is proposed as future work.
|
2 |
Data Quality Knowledge in Sport Informatics: A Scoping Review
Kremser, Wolfgang, 14 October 2022
As sport informatics research produces more and more digital data, effective data quality management becomes a necessity. This systematic scoping review investigates how data quality is currently understood in the field. The results show that a common data quality model is lacking. Combining data quality approaches from related fields such as Ambient Assisted Living and eHealth could be the first step toward a data quality model for sport informatics.
|