Global ETD Search

1	Frazémy ve dvojjazyčném slovníku / Phrasemes in a Bilingual Dictionary Ježková, Jaroslava January 2016 This thesis deals with the treatment of set phrasemes in dictionaries, specifically the treatment of somatisms. It consists of a theoretical and a practical part. The theoretical part covers phraseology in general, phrasemes (occasionally called phraseologisms), and their treatment in Czech and German linguistics. The first section considers somatic phrasemes as language units and the dependence of a phraseme's meaning on the context in which it appears. It then lists and describes the main characteristics that distinguish phrasemes from other linguistic phenomena. The theoretical part concludes with an overview of corpus linguistics and its application through corpus and co-occurrence analysis. Building on the first part, the practical part deals with the treatment of somatisms in a bilingual dictionary from a lexicographic point of view and proposes specific solutions. The appendix contains the processed results of database searches, in the form of entries that may be considered part of a bilingual dictionary. Keywords: phraseme, bilingual dictionary, corpus lexicography, corpus analysis, somatism
2	[en] THE CORPUS NEVER LIES: ON THE IDENTIFICATION AND USE OF MULTIWORD EXPRESSIONS / [pt] O CÓRPUS NÃO MENTE JAMAIS: SOBRE A IDENTIFICAÇÃO E USO DE COMBINAÇÕES MULTIVOCABULARES DO TIPO VERBO MAIS SINTAGMA NOMINAL MILENA DE UZEDA GARRAO 22 August 2006 Many recent studies on the identification and use of multiword expressions (MWEs) adopt a representational view of word meaning. This study argues that it is more productive to identify MWEs from a non-representational perspective, which avoids relying on time-consuming and controversial human intuitions for MWE identification and definition. The proposed methodology was tested on MWEs of the V+NP pattern, which is very frequent in Brazilian Portuguese. It is a corpus-based statistical analysis that can be summarized in three steps: 1) a robust corpus of Brazilian Portuguese as the basis of analysis; 2) application of a statistical test to the corpus, namely the log-likelihood test (Banerjee and Pedersen, 2003), to detect the most frequent MWEs of the V+NP pattern (such as tomar café) and to exclude chance syntactic co-occurrences of the same lexical items; 3) application of similarity measures (Baeza-Yates and Ribeiro-Neto, 1999) between all paragraphs containing a given MWE (for example, fazer campanha) and all paragraphs containing its noun outside the MWE (campanha). This last step is used to assess the degree of compositionality of the MWE. The study concludes that the higher the similarity between the paragraphs containing the MWE and the paragraphs containing the noun outside the expression, the more compositional the MWE. The work therefore has both theoretical and practical implications for semantics. Keywords: multiword expressions, verbal collocations, corpus lexicography, corpus semantics
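Steps 2 and 3 of the method described in this abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the abstract names a log-likelihood test and "similarity measures" without giving formulas, so the sketch assumes the standard G² statistic over a 2x2 contingency table of verb and noun counts, and cosine similarity over bag-of-words vectors as one possible similarity measure. The function names and the counts and tokens in the usage note are hypothetical.

```python
import math
from collections import Counter


def log_likelihood(k11, k12, k21, k22):
    """G^2 log-likelihood ratio for a 2x2 contingency table (assumed form).

    k11: verb and noun occur together
    k12: verb occurs without the noun
    k21: noun occurs without the verb
    k22: neither occurs
    """
    total = k11 + k12 + k21 + k22
    row_sums = (k11 + k12, k21 + k22)
    col_sums = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to the sum
                expected = row_sums[i] * col_sums[j] / total
                g2 += o * math.log(o / expected)
    return 2.0 * g2


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between two bag-of-words token lists."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

With made-up counts, log_likelihood(50, 5, 5, 1000) gives a large score (a strongly associated pair), while a table whose cells match the expected counts, such as log_likelihood(10, 10, 10, 10), gives zero. Under the abstract's conclusion, a higher cosine_similarity between the pooled tokens of paragraphs containing fazer campanha and those containing campanha alone would indicate a more compositional expression.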
3	[en] SUPPORT NOUNS: OPERATIONAL CRITERIA FOR CHARACTERIZATION / [pt] O SUBSTANTIVO-SUPORTE: CRITÉRIOS OPERACIONAIS DE CARACTERIZAÇÃO CLAUDIA MARIA GARCIA MEDEIROS DE OLIVEIRA 06 March 2007 The main goal of this work is to provide an operational criterion for characterizing nouns in noun-adjective combinations in which the noun behaves analogously to the so-called light verbs or support verbs, widely studied in recent years in both linguistics and natural language processing. The work lies at the confluence of linguistic, lexicographic, and computational studies and explores the potential of automatic corpus analysis and quantitative tools to ground the concepts that guide linguistic analysis more objectively. The research combines corpus-based investigation with the traditional dictionary to survey the main properties of N-Adj combinations, focusing on those with denominal adjectives. The lexicographic and contextual data show that there is a set of nouns that participate in these constructions in a way similar to support verbs in V-NP combinations. An automatic method for recognizing support nouns in texts is developed to give researchers an instrument capable of producing convincing evidence, given that intuitive judgments alone are insufficient to justify the delimitation of apparently irregular expressions. Keywords: linguistics, corpus lexicography, support noun, denominal adjective, part of speech
4	Srovnávací aspekty lotyšského a českého lexikonu (Materiály k sestavení lotyšsko-českého slovníku) / Comparative aspects of Latvian and Czech lexicons: Materials for assembling a Latvian-Czech dictionary Škrabal, Michal January 2016 Author: Mgr. Michal Škrabal. Department: Institute of the Czech National Corpus. Supervisor: prof. PhDr. František Čermák, DrSc. The primary aim of this work is to classify the Latvian lexicon, or rather its relevant segment, into individual groups that can be defined semantically, grammatically, syntagmatically, pragmatically, and so on, to find an ideal method of lexicographic treatment for these classes, and to apply it to an emerging Latvian-Czech dictionary (the very first handbook of its kind). To this end, the work uses modern instruments that have recently and radically changed the methodology of lexicographic work: on the one hand, linguistic corpora, which nowadays represent authentic language usage, and on the other hand, the specialized lexicographic software TshwaneLex, in which a lexical database of Latvian is built and from which the dictionary itself will subsequently be generated. Because of the limited size of the Latvian corpus, it was not possible to dispense with traditional sources entirely, and the author had to combine traditional and modern lexicographic methods. His primary source, however, remained the corpus...

