Record Linkage
Larsen, Stasha Ann Bown, 11 December 2013
This document explains the use of different metrics in record linkage. There are two forms of record linkage: deterministic and probabilistic. We focus on probabilistic record linkage as used in merging and updating two databases. Record pairs are compared using character-based and phonetic-based similarity metrics to determine how closely they match. Performance measures are then calculated and Receiver Operating Characteristic (ROC) curves are formed. Finally, an economic model is applied that returns the optimal tolerance level at which a record pair from the two databases should be declared a match in order to maximize profit.
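
To make the comparison step concrete, here is a minimal sketch in Python of one possible pipeline: score each record pair with a character-based metric, declare a match when the score reaches a tolerance level, and compute one ROC point (false positive rate, true positive rate) for that tolerance. The abstract does not name the specific metrics used, so the choice of normalized Levenshtein distance is an assumption, and the function names and the sample name pairs are hypothetical illustrations rather than data or code from the thesis.

```python
from typing import List, Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum insertions, deletions, substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def roc_point(pairs: List[Tuple[str, str, bool]],
              tolerance: float) -> Tuple[float, float]:
    """Return (false positive rate, true positive rate) at one tolerance."""
    tp = fp = fn = tn = 0
    for a, b, is_match in pairs:
        predicted = similarity(a, b) >= tolerance
        if predicted and is_match:
            tp += 1
        elif predicted and not is_match:
            fp += 1
        elif is_match:
            fn += 1
        else:
            tn += 1
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return fpr, tpr


# Hypothetical labeled pairs: (value in database A, value in database B, truly same?)
SAMPLE_PAIRS = [
    ("Larsen", "Larson", True),
    ("Smith", "Smyth", True),
    ("Smith", "Jones", False),
    ("Anna", "Anne", False),
]

if __name__ == "__main__":
    for tolerance in (0.7, 0.78, 0.9):
        print(tolerance, roc_point(SAMPLE_PAIRS, tolerance))
```

Sweeping the tolerance over a range of values and plotting the resulting points traces out a ROC curve. An economic model such as the one the abstract mentions would then attach costs and benefits to the four outcome counts in order to pick a tolerance.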