2. Detection and correction by machine learning
Recent work has shown that machine learning models can accurately identify problems in data and correct certain types of error through complex (semi-)automatic mechanisms, replacing corrections that were previously performed manually or with hard-to-maintain heuristics. New learning strategies can also determine which corrections are needed for a given analysis objective, since correcting every problem is not always necessary (or even feasible).
Learning-based approaches can be used to detect or correct erroneous data. They rely on examples of erroneous and correct records to train the model. However, models that are expressive enough for the task are also complex, and training them requires a large number of examples. Depending on the task (detection of outliers, duplicates, inconsistencies, etc.), creating these sets of examples can prove very difficult,...
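The idea of training a detector from examples of erroneous and correct records can be illustrated with a deliberately small sketch. It is not a method taken from the article: the records, the two features (`age_years`, `height_cm`), the labels, and the choice of a 1-nearest-neighbour classifier are all assumptions made for illustration.

```python
from math import dist
from statistics import mean, pstdev

# Hypothetical labelled examples: (age_years, height_cm) -> 1 if erroneous, 0 if correct.
LABELLED = [
    ((34, 172), 0), ((51, 165), 0), ((27, 181), 0), ((45, 158), 0), ((62, 170), 0),
    ((230, 170), 1),   # implausible age
    ((40, 17), 1),     # implausible height
    ((-5, 160), 1),    # negative age
    ((33, 1750), 1),   # height probably entered in millimetres
]

def fit_scaler(examples):
    """Per-feature (mean, std), computed on the correct examples only
    so that gross errors do not distort the scale."""
    clean = [record for record, label in examples if label == 0]
    return [(mean(col), pstdev(col) or 1.0) for col in zip(*clean)]

def scale(record, stats):
    """Standardise a record with the statistics returned by fit_scaler."""
    return tuple((x - m) / s for x, (m, s) in zip(record, stats))

def is_erroneous(record, examples, stats):
    """1-nearest-neighbour: copy the label of the closest labelled example."""
    query = scale(record, stats)
    _, label = min(examples, key=lambda ex: dist(query, scale(ex[0], stats)))
    return label

STATS = fit_scaler(LABELLED)

for rec in [(38, 168), (41, 20), (210, 175)]:
    verdict = "erroneous" if is_erroneous(rec, LABELLED, STATS) else "correct"
    print(rec, verdict)
```

With so few examples the detector only recognises errors that resemble one it has already seen. In this toy setup a record such as `(38, 300)` is labelled correct, because its nearest labelled neighbour is a correct record. This is one way to see why expressive models need large example sets.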
This article is included in
Software technologies and System architectures
Bibliography
Events
International conferences:
Very Large Databases (VLDB) Conference: http://vldb.org/conference.html
ACM SIGMOD (Special Interest Group on Management of Data): https://dl.acm.org/event.cfm?id=RE227
...
Standards and norms
- Data quality — Part 1: Overview https://www.iso.org/standard/50798.html - ISO/TS 8000-1 - 2011
- Data quality — Part 2: Vocabulary https://www.iso.org/standard/73456.html - ISO 8000-2 - 2017
- Data quality — Part 8: Information and data quality: Concepts and measuring https://www.iso.org/standard/60805.html - ISO 8000-8 - 2015
- Data quality — Part 61: Data quality management: Process reference model...