1 |
Release of the MySQL based implementation of the CTS protocol
Tiepmar, Jochen, 20 April 2016
In a project called "A Library of a Billion Words" we needed an implementation of the CTS protocol capable of handling a text collection containing at least 1 billion words. Because the existing solutions did not work at this scale or were still in development, I started an implementation of the CTS protocol using methods that MySQL provides. Last year we published a paper that introduced a prototype with the core functionalities that was not yet compliant with the CTS specifications (Tiepmar et al., 2013). The purpose of this paper is to describe and evaluate the MySQL based implementation now that it fulfills version 5.0 rc.1 of the specifications, and to mark it as finished and ready to use. Further information, online instances of CTS for all described datasets, and binaries can be accessed via the project's website.
Reference: Tiepmar J, Teichmann C, Heyer G, Berti M and Crane G. 2013. A new Implementation for Canonical Text Services. In Proceedings of the 8th Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH).
|
2 |
Release of the MySQL based implementation of the CTS protocol
Tiepmar, Jochen, January 2016
In a project called "A Library of a Billion Words" we needed an implementation of the CTS protocol capable of handling a text collection containing at least 1 billion words. Because the existing solutions did not work at this scale or were still in development, I started an implementation of the CTS protocol using methods that MySQL provides. Last year we published a paper that introduced a prototype with the core functionalities that was not yet compliant with the CTS specifications (Tiepmar et al., 2013). The purpose of this paper is to describe and evaluate the MySQL based implementation now that it fulfills version 5.0 rc.1 of the specifications, and to mark it as finished and ready to use. Further information, online instances of CTS for all described datasets, and binaries can be accessed via the project's website.
Reference: Tiepmar J, Teichmann C, Heyer G, Berti M and Crane G. 2013. A new Implementation for Canonical Text Services. In Proceedings of the 8th Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH).
|
3 |
Implementation and Evaluation of the Canonical Text Service Protocol as Part of a Research Infrastructure in the Digital Humanities
Tiepmar, Jochen, 23 May 2018
One of the defining factors of modern societies is the ongoing digitization of information, resources and in many ways even life itself. This trend is also reflected in today's research environments and heavily influences the direction in which academic and industrial projects are headed. It is nearly impossible to set up a modern project without including digital aspects, and many projects are set up for the sole purpose of digitizing a specific part of the world. One side effect of this trend is the emergence of new research fields at the intersections between the analog world -- represented for example by the humanities -- and the digital world -- represented for example by computer science. One such set of research fields is the digital humanities, the area of interest for this work.
In the course of this development, complex research questions, techniques, and principles that were developed independently of one another are brought together. A lot of work has to go into defining the communication between these concepts to prevent misunderstandings and misconceptions on both sides. This bridge-building process is one of the major tasks of the newly emerging research fields.
This work proposes such a bridge for the text-oriented digital humanities. It is based on a digital text reference system that was previously developed in the humanities and is reinterpreted in this work as a data communication protocol for computer science: the Canonical Text Service (CTS) protocol.
|