Historians and social scientists value longitudinal data constructed from historical sources because it creates opportunities to study people’s lives over time. Generating such data is challenging, however, because historical sources lack personal identifiers. At the University of Guelph, the People-in-Motion group has constructed a record linkage system that links the 1871 Canadian census to the 1881 Canadian census. In this thesis, we discuss one aspect of linking historical census data: the problem of disambiguating the multiple links that are created at the linkage step. We show that the disambiguation techniques explored in this thesis improve upon the linkage rate of the People-in-Motion system while maintaining a false positive rate no greater than 5%.
Identifier | oai:union.ndltd.org:LACETR/oai:collectionscanada.gc.ca:OGU.10214/7444 |
Date | 30 August 2013 |
Creators | Richards, Laura |
Contributors | Grewal, Gary, Antonie, Luiza, Areibi, Shawki |
Source Sets | Library and Archives Canada ETDs Repository / Centre d'archives des thèses électroniques de Bibliothèque et Archives Canada |
Language | English |
Detected Language | English |
Type | Thesis |