• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 8
  • 7
  • 1
  • Tagged with
  • 17
  • 17
  • 7
  • 6
  • 6
  • 6
  • 4
  • 4
  • 4
  • 4
  • 4
  • 3
  • 3
  • 2
  • 2
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Annotating figurative language: another perspective for digital Altertumswissenschaften

Beyer, Stefan, Di Biase-Dyson, Camilla, Wagenknecht, Nina January 2016 (has links)
Whereas past and current digital projects in ancient language studies have been concerned with the annotation of linguistic elements and metadata, there is now an increased interest in the annotation of elements above the linguistic level that are determined by context – like figurative language. Such projects bring their own set of problems (the automatisation of annotation is more difficult, for instance), but also allow us to develop new ways of examining the data. For this reason, we have attempted to take an already annotated database of Ancient Egyptian texts and develop a complementary tagging layer rather than starting from scratch with a new database. In this paper, we present our work in developing a metaphor annotation layer for the Late Egyptian text database of Projet Ramsès (Université de Liège) and in so doing address more general questions: 1) How to ‚tailor-make’ annotation layers to fit other databases? (Workflow) 2) How to make annotations that are flexible enough to be altered in the course of the annotation process? (Project design) 3) What kind of potential do such layers have for integration with existing and future annotations? (Sustainability)
12

[en] BUILDING AND EVALUATING A GOLD-STANDARD TREEBANK / [pt] CONSTRUÇÃO E AVALIAÇÃO DE UM TREEBANK PADRÃO OURO

ELVIS ALVES DE SOUZA 29 May 2023 (has links)
[pt] Esta dissertação apresenta o processo de desenvolvimento do PetroGold, um corpus anotado com informação morfossintática – um treebank – padrão ouro para o domínio do petróleo. O desenvolvimento do recurso é abordado sob duas lentes: do lado linguístico, estudamos a literatura gramatical e tomamos decisões linguisticamente motivadas para garantir a qualidade da anotação do corpus; do lado computacional, avaliamos o recurso considerando a sua utilidade para o processamento de linguagem natural (PLN). Recursos como o PetroGold recebem relevância especial no contexto atual, em que o PLN estatístico tem se beneficiado de recursos padrão ouro de domínios específicos para alimentar o aprendizado automático. No entanto, o treebank é útil também para tarefas como a avaliação de sistemas de anotação baseados em regras e para os estudos linguísticos. O PetroGold foi anotado segundo as diretivas do projeto Universal Dependencies, tendo como pressupostos a ideia de que a anotação de um corpus é um processo interpretativo, por um lado, e utilizando o paradigma da linguística empírica, por outro. Além de descrever a anotação propriamente, aplicamos alguns métodos para encontrar erros na anotação de treebanks e apresentamos uma ferramenta criada especificamente para busca, edição e avaliação de corpora anotados. Por fim, avaliamos o impacto da revisão de cada uma das categorias linguísticas do treebank no aprendizado automático de um modelo alimentado pelo PetroGold e disponibilizamos publicamente a terceira versão do corpus, a qual, quando submetida à avaliação intrínseca de um modelo, alcança métricas até 2,55 por cento melhores que a versão anterior. / [en] This thesis reports on the development process of PetroGold, a goldstandard annotated corpus with morphosyntactic information – a treebank – for the oil and gas domain. The development of the resource is seen from two perspectives: on the linguistic side, we study the grammatical literature and make linguistically motivated decisions to ensure the quality of corpus annotation; on the computational side, we evaluate the resource considering its usefulness for natural language processing (NLP). Resources like PetroGold receive special importance in the current context, where statistical NLP has benefited from domain-specific gold-standard resources to train machine learning models. However, the treebank is also useful for tasks such as evaluating rule-based annotation systems and for linguistic studies. PetroGold was annotated according to the guidelines of the Universal Dependencies project, having as theoretical assumptions the idea that the annotation of a corpus is an interpretative process, on the one hand, and using the empirical linguistics paradigm, on the other. In addition to describing the annotation itself, we apply some methods to find errors in the annotation of treebanks and present a tool created specifically for searching, editing and evaluating annotated corpora. Finally, we evaluate the impact of revising each of the treebank linguistic categories on the automatic learning of a model powered by PetroGold and make the third version of the corpus publicly available, which, when performing an intrinsic evaluation for a model using the corpus, achieves metrics up to 2.55 perecent better than the previous version.
13

Broad-domain Quantifier Scoping with RoBERTa

Rasmussen, Nathan Ellis 10 August 2022 (has links)
No description available.
14

A practical approach to the standardisation and elaboration of Zulu as a technical language

Van Huyssteen, Linda 30 November 2003 (has links)
The lack of terminology in Zulu can be overcome if it is developed to meet international scientific and technical demands. This lack of terminology can be traced back to the absence of proper language policy implementation with regard to the African languages. Even though Zulu possesses the basic elements that are necessary for its development, such as orthographical standards, dictionaries, grammars and published literature, a number of problems exist within the technical elaboration and standardisation processes: * Inconsistencies in the application of standard rules, in relation to both orthography and terminology. * The lack of standardisation of the (technical) word-formation patterns in Zulu. (Generally the role of culture in elaboration has largely been overlooked). * The avoidance of exploiting written technical text corpora as a resource for terminology. (Text encoding by means of corpus query tools in term extraction has just begun in Zulu and needs to be properly exemplified). * The avoidance of introducing oral technical corpora as a resource for improving the acceptability of technical terminology by, for instance, designing a type of reusable corpus annotation. This study contributes towards solving these problems by offering a practical approach within the context of the real written, standard and oral Zulu language, mainly within the medical terminological domain. This approach offers a reusable methodological foundation with proper language exemplification that can guide terminologists in terminological research, or to some extent even train them, to achieve effective technical elaboration and eventual standardisation. This thesis aims at attaining consistent standardisation on the orthographical level in order to ease the elaboration task of the terminologist. It also aims at standardising the methods of word- (term-) formation linking them to cultural factors, such as taboo. However, this thesis also emphasises the significance of using written and oral technical corpora as terminology resource. This, for instance, is made possible through the application of corpus linguistics, in semi-automatic term extraction from a written technical corpus to aid lemmatisation (listing entries) and in corpus annotation to improve the acceptability of terminology, based on the comparison of standard terms with oral terms. / Linguistics / D. Litt et Phil. (Linguistics)
15

A practical approach to the standardisation and elaboration of Zulu as a technical language

Van Huyssteen, Linda 30 November 2003 (has links)
The lack of terminology in Zulu can be overcome if it is developed to meet international scientific and technical demands. This lack of terminology can be traced back to the absence of proper language policy implementation with regard to the African languages. Even though Zulu possesses the basic elements that are necessary for its development, such as orthographical standards, dictionaries, grammars and published literature, a number of problems exist within the technical elaboration and standardisation processes: * Inconsistencies in the application of standard rules, in relation to both orthography and terminology. * The lack of standardisation of the (technical) word-formation patterns in Zulu. (Generally the role of culture in elaboration has largely been overlooked). * The avoidance of exploiting written technical text corpora as a resource for terminology. (Text encoding by means of corpus query tools in term extraction has just begun in Zulu and needs to be properly exemplified). * The avoidance of introducing oral technical corpora as a resource for improving the acceptability of technical terminology by, for instance, designing a type of reusable corpus annotation. This study contributes towards solving these problems by offering a practical approach within the context of the real written, standard and oral Zulu language, mainly within the medical terminological domain. This approach offers a reusable methodological foundation with proper language exemplification that can guide terminologists in terminological research, or to some extent even train them, to achieve effective technical elaboration and eventual standardisation. This thesis aims at attaining consistent standardisation on the orthographical level in order to ease the elaboration task of the terminologist. It also aims at standardising the methods of word- (term-) formation linking them to cultural factors, such as taboo. However, this thesis also emphasises the significance of using written and oral technical corpora as terminology resource. This, for instance, is made possible through the application of corpus linguistics, in semi-automatic term extraction from a written technical corpus to aid lemmatisation (listing entries) and in corpus annotation to improve the acceptability of terminology, based on the comparison of standard terms with oral terms. / Linguistics and Modern Languages / D. Litt et Phil. (Linguistics)
16

A critical investigation of deaf comprehension of signed tv news interpretation

Wehrmeyer, Jennifer Ella January 2013 (has links)
This study investigates factors hampering comprehension of sign language interpretations rendered on South African TV news bulletins in terms of Deaf viewers’ expectancy norms and corpus analysis of authentic interpretations. The research fills a gap in the emerging discipline of Sign Language Interpreting Studies, specifically with reference to corpus studies. The study presents a new model for translation/interpretation evaluation based on the introduction of Grounded Theory (GT) into a reception-oriented model. The research question is addressed holistically in terms of target audience competencies and expectations, aspects of the physical setting, interpreters’ use of language and interpreting choices. The South African Deaf community are incorporated as experts into the assessment process, thereby empirically grounding the research within the socio-dynamic context of the target audience. Triangulation in data collection and analysis was provided by applying multiple mixed data collection methods, namely questionnaires, interviews, eye-tracking and corpus tools. The primary variables identified by the study are the small picture size and use of dialect. Secondary variables identified include inconsistent or inadequate use of non-manual features, incoherent or non-simultaneous mouthing, careless or incorrect sign execution, too fast signing, loss of visibility against skin or clothing, omission of vital elements of sentence structure, adherence to source language structures, meaningless additions, incorrect referencing, oversimplification and violations of Deaf norms of restructuring, information transfer, gatekeeping and third person interpreting. The identification of these factors allows the construction of a series of testable hypotheses, thereby providing a broad platform for further research. Apart from pioneering corpus-driven sign language interpreting research, the study makes significant contributions to present knowledge of evaluative models, interpreting strategies and norms and systems of transcription and annotation. / Linguistics / Thesis (D. Litt.et Phil. (Linguistics)
17

A critical investigation of deaf comprehension of signed tv news interpretation

Wehrmeyer, Jennifer Ella January 2013 (has links)
This study investigates factors hampering comprehension of sign language interpretations rendered on South African TV news bulletins in terms of Deaf viewers’ expectancy norms and corpus analysis of authentic interpretations. The research fills a gap in the emerging discipline of Sign Language Interpreting Studies, specifically with reference to corpus studies. The study presents a new model for translation/interpretation evaluation based on the introduction of Grounded Theory (GT) into a reception-oriented model. The research question is addressed holistically in terms of target audience competencies and expectations, aspects of the physical setting, interpreters’ use of language and interpreting choices. The South African Deaf community are incorporated as experts into the assessment process, thereby empirically grounding the research within the socio-dynamic context of the target audience. Triangulation in data collection and analysis was provided by applying multiple mixed data collection methods, namely questionnaires, interviews, eye-tracking and corpus tools. The primary variables identified by the study are the small picture size and use of dialect. Secondary variables identified include inconsistent or inadequate use of non-manual features, incoherent or non-simultaneous mouthing, careless or incorrect sign execution, too fast signing, loss of visibility against skin or clothing, omission of vital elements of sentence structure, adherence to source language structures, meaningless additions, incorrect referencing, oversimplification and violations of Deaf norms of restructuring, information transfer, gatekeeping and third person interpreting. The identification of these factors allows the construction of a series of testable hypotheses, thereby providing a broad platform for further research. Apart from pioneering corpus-driven sign language interpreting research, the study makes significant contributions to present knowledge of evaluative models, interpreting strategies and norms and systems of transcription and annotation. / Linguistics and Modern Languages / Thesis (D. Litt.et Phil. (Linguistics)

Page generated in 0.1116 seconds