1 |
DOL - an Interoperable Document Server
Melnik, Sergey, Rahm, Erhard, Sosna, Dieter 05 February 2019
We describe the design of, and experiences gained with, the database and web-based document server DOL, which we developed at the University of Leipzig (http://dol.uni-leipzig.de). The server provides a central repository for a variety of fulltext documents. In Leipzig, it has been used since 1998 as a university-wide digital library for documents by local authors, in particular Ph.D. theses, master's theses, research papers, and lecture notes, offering a central access point to the university's research results and educational material. Decentralized administration and different workflows are supported to meet the organizational and legal requirements of specific document types (e.g., Ph.D. theses). All documents are converted into several formats and can be downloaded or viewed online page by page. The documents are searchable in a flexible way using fulltext and bibliographic queries. Moreover, a multi-level navigation interface is provided, supporting browsing along several dimensions. DOL is interoperable with global digital libraries such as NCSTRL and can be ported to the needs of different organisations. It is also in use at Stanford University.
|
2 |
Training Selection for Tuning Entity Matching
Köpcke, Hanna, Rahm, Erhard 06 February 2019
Entity matching is a crucial and difficult task for data integration. An effective solution strategy typically has to combine several techniques and find suitable settings for critical configuration parameters such as similarity thresholds. Supervised (training-based) approaches promise to reduce the manual work for determining (learning) effective strategies for entity matching. However, they critically depend on training data selection, which is a difficult problem that has so far mostly been addressed manually by human experts. In this paper, we propose a training-based framework called STEM for entity matching and present different generic methods for automatically selecting training data to combine and configure several matching techniques. We evaluate the proposed methods for different match tasks and for small- and medium-sized training sets.
|