This research determines whether data quality mining can be used to describe, monitor and evaluate the scope and impact of semantic data quality problems in the learner enrolment data on the National Learners’ Records Database. Previous data quality mining work has focused on anomaly detection and has assumed that the data quality aspect being measured exists as a data value in the data set being mined. The method for this research is quantitative in that the data mining techniques and model that are best suited for semantic data quality deficiencies are identified and then applied to the data. The research determines that unsupervised data mining techniques that allow for weighted analysis of the data would be most suitable for the data mining of semantic data deficiencies. Further, the academic Knowledge Discovery in Databases model needs to be amended when applied to data mining semantic data quality deficiencies. / School of Computing / M. Tech. (Information Technology)
Identifer | oai:union.ndltd.org:netd.ac.za/oai:union.ndltd.org:unisa/oai:uir.unisa.ac.za:10500/25576 |
Date | 07 1900 |
Creators | Barth, Kirstin |
Contributors | Bankole, F. O., Omlin, Christian W. |
Source Sets | South African National ETD Portal |
Language | English |
Detected Language | English |
Type | Dissertation |
Format | 1 online resource (iii, 642 leaves) : illustrations, graphs |
Page generated in 0.0033 seconds