Identifying Duplicates : Disambiguating Bibsys

The digital information age has brought with it the information seekers. These seekers, who are ordinary people, are one step ahead of many libraries, and require all information to be retrievable by posting a query and/or by browsing through information related to their information needs. Disambiguating (identifying and managing ambiguous entries) the creators of publications makes it feasible to browse information related to a specified creator. This thesis proposes a framework, named iDup, for disambiguation of bibliographic information, and evaluates the original edit-distance and a specially designed time-frame measure for comparing entries in a collection of BIBSYS-MARC records. The strengths of the time-frame measure and the edit-distance are both shown, as is the weakness of the edit-distance.
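As a rough illustration of the two kinds of measure named in the abstract, the sketch below computes a classic Levenshtein edit-distance between two creator-name strings and checks whether two sets of publication years fall within a shared time frame. This is not the iDup implementation: the abstract does not define the time-frame measure, so `time_frames_overlap`, its `slack` parameter, and the sample names and years are all hypothetical assumptions made for illustration only.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions cost 1 each."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def time_frames_overlap(years_a, years_b, slack=0):
    """Hypothetical time-frame check (an assumption, not the thesis's measure):
    do the publication-year ranges of two entries overlap, allowing `slack` years?"""
    return (min(years_a) - slack <= max(years_b)
            and min(years_b) - slack <= max(years_a))


# Made-up sample entries: two spellings of a creator name with nearby years.
name_gap = edit_distance("Myrhaug, Kristian", "Myrhaug, Kristjan")
same_period = time_frames_overlap([2005, 2007], [2006, 2009])
print(name_gap, same_period)
```

A small edit-distance alone cannot separate two different people with near-identical names, which is one plausible reading of the weakness the abstract mentions; combining it with a time-based check is shown here only as an assumed design, not as the thesis's method.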

Identifier oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:ntnu-9508
Date January 2007
Creators Myrhaug, Kristian
Publisher Norges teknisk-naturvitenskapelige universitet, Institutt for datateknikk og informasjonsvitenskap
Source Sets DiVA Archive at Uppsala University
Language English
Detected Language English
Type Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format application/pdf
Rights info:eu-repo/semantics/openAccess
