We classify data quality problems that are addressed by data cleaning and provide an overview of the main solution approaches. Data cleaning is especially required when integrating heterogeneous data sources and should be addressed together with schema-related data transformations. In data warehouses, data cleaning is a major part of the so-called ETL process. We also discuss current tool support for data cleaning.
Identifer | oai:union.ndltd.org:DRESDEN/oai:qucosa:de:qucosa:32968 |
Date | 04 February 2019 |
Creators | Rahm, Erhard, Do, Hong Hai |
Source Sets | Hochschulschriftenserver (HSSS) der SLUB Dresden |
Language | English |
Detected Language | English |
Type | info:eu-repo/semantics/acceptedVersion, doc-type:article, info:eu-repo/semantics/article, doc-type:Text |
Rights | info:eu-repo/semantics/openAccess |
Page generated in 0.0019 seconds