Global ETD Search

81	Αυτόματη εξαγωγή λεξικής - σημασιολογικής γνώσης από ηλεκτρονικά σώματα κειμένων με χρήση ελαχίστων πόρων / Automatic extraction of lexico - semantic knowledge from electronic text corpora using minimal resources Θανόπουλος, Αριστομένης 25 June 2007
This dissertation studies methods for the automatic extraction of collocations and lexico-semantic similarities from large text corpora. We follow an approach based on minimal linguistic resources in order to achieve unrestricted portability across languages and thematic domains. To evaluate the proposed methods we propose, assess and apply methodologies based on English gold-standard lexical resources such as WordNet. For the extraction of collocations we propose and test several novel measures for identifying statistically significant bigrams and, more generally, n-grams, which perform well in evaluation. For the extraction of lexico-semantic similarities we first follow a distributional, window-based approach. We study the contextual scope, the filtering of lexical co-occurrences and the performance of similarity measures. We propose incorporating the number of common parameters into the similarity measures, exploiting function words, and a method for eliminating systematic errors. Moreover, we propose a novel approach that exploits similarities between word sequences, which are common in thematically focused texts, using an algorithm based on cross-correlation of word sequences. We refine an approach for extracting word similarities from coordinations and propose a method for merging lexico-semantic similarity databases extracted via different principles and methods. Finally, the extracted similarity knowledge is transformed into soft hierarchical semantic clusters and is successfully incorporated into a machine-learning-based dialogue system, where it improves recognition of the user's plan by estimating the semantic role of unknown words.
Keywords: Lexical semantics; Automatic methods; Statistical methods; Natural language processing; Semantic similarity; Collocations; 410.285
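Note on entry 81: the abstract says that collocations are found by scoring bigrams with statistical association measures, but it does not state the thesis's own measures. As a rough illustration only, the sketch below swaps in pointwise mutual information (PMI), a standard association measure. The function name `bigram_pmi` and the `min_count` threshold are assumptions made for this example and are not taken from the thesis.

```python
import math
from collections import Counter


def bigram_pmi(tokens, min_count=2):
    """Rank adjacent word pairs by pointwise mutual information (PMI).

    PMI is used here as a generic stand-in association measure; it is
    NOT the measure proposed in the thesis. `min_count` (an assumed
    parameter) drops rare pairs, whose PMI estimates are unreliable.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_tokens = len(tokens)
    n_pairs = max(len(tokens) - 1, 1)

    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue
        p_pair = count / n_pairs
        p_w1 = unigrams[w1] / n_tokens
        p_w2 = unigrams[w2] / n_tokens
        # PMI = log2( P(w1, w2) / (P(w1) * P(w2)) )
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))

    # Highest-scoring pairs are the strongest collocation candidates.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    sample = "a b a b c c c c".split()
    for pair, score in bigram_pmi(sample):
        print(pair, round(score, 3))
```

Only a tokenized corpus is required, which is in the spirit of the minimal-resource setting the abstract describes; a real experiment would need a far larger corpus than the toy token list used here.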
82	Acquisition automatique de traductions d'unités lexicales complexes à partir du Web / Automatic acquisition of translations of complex lexical units from the Web Léon, Stéphanie 08 December 2008
Machine translation systems have recently improved by taking into account complex expressions such as "vol à main armée" ("armed robbery" in English). However, as soon as one moves beyond these lists of fixed expressions, translation errors quickly reappear. For example, the Systran translator renders "caisse centrale" as "central case" instead of "central fund". This expression could have been translated automatically with the help of the Web. The aim of this study is to build a bilingual French-English database for the machine translation of complex lexical units, acquired from the Web. We focus on translation difficulties such as polysemy and idiomaticity, and propose suitable treatments for them. Beyond the linguistic and technological aspects, we analyse how the Web is used in linguistics.
Keywords: [INFO] Computer Science; Machine translation; Automatic acquisition; Complex lexical units; Word sense disambiguation; World Wide Web; Corpus; Collocations; Information retrieval; Compositionality; Terminology
83	Kartläggning av KVINNA och MAN i August Strindbergs verk : En korpusstudie av sammansatta substantiv och kollokationer med ett diakront perspektiv / Mapping the use of WOMAN and MAN in August Strindberg's works : A corpus study of compound nouns and collocations with a diachronic view Guseinova, Fatima January 2018
Discourse prosody and semantic preference are inherent aspects of language: as soon as a word becomes part of a text, it ceases to exist as an isolated unit. The aim of this thesis was to study quantitatively the use of the lemmas woman and man, and the compound nouns containing them, in the works of August Strindberg. The material consists of his trilogy of novels The Red Room, Gothic Rooms and Black Banners. The occurrence of the two lemmas was examined with respect to frequency, the degree of emotional weight in compounds, and the distribution of lexicalized compounds. In addition, the context of the lemmas was examined from a diachronic perspective, through an analysis of the discourse prosody and semantic preference of the collocations of woman and man. The results showed that the lemma man occurs more often than woman in the trilogy. Most compounds with man are lexicalized, whereas compounds with woman are lexicalized far less often. Compounds containing woman are more often negatively charged, while compounds containing man are almost always emotionally neutral. The collocation analysis could not reliably establish the author's attitudes towards women and men; more data and deeper methods of analysis would be needed.
Keywords: semantic prosody; semantic preference; compounds; collocations; emotional weight; lexicalization; General Language Studies and Linguistics
84	Colocações criativas presentes no corpus literário paralelo Memórias póstumas de Brás Cubas sob a perspectiva de um novo olhar / Creative collocations in the literary parallel corpus: Memórias póstumas de Brás Cubas under a new perspective Teixeira, Luiz Gustavo [UNESP] 29 July 2016
Funding: Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
This study analyzes the translations of creative collocations in a literary parallel corpus comprising the original Portuguese text Memórias Póstumas de Brás Cubas (TO), by Machado de Assis (1891), and its three English translations: Epitaph of a Small Winner (TT¹), by Grossman (1953); Posthumous Reminiscences of Braz Cubas (TT²), by Ellis (1955); and The Posthumous Memoirs of Brás Cubas (TT³), by Rabassa (1997). The theoretical and methodological framework draws on Corpus Linguistics and its interface with Corpus-based Translation Studies and Literature, on the concept of creative collocations, and on the Machado scholarship of Bosi (1999, 2006) and Schwarz (1990), showing how the gaze of the deceased author is portrayed through the eyes of the characters in the selected passages.
To extract the words with the highest keyness, we used the program WordSmith Tools (SCOTT, 2012), which allowed a broader and more dynamic analysis of the data. As reference corpora we used the Brown Corpus for English and the Lácio-Ref corpus for Portuguese. The keyword extraction showed significant keyness for the node "olhos" in the original text (TO) and for "eyes" in the translated texts (TT¹, TT², TT³); the creative collocations related to these nodes were then extracted and analyzed. Both the keyword extraction and the analysis of the translations of creative collocations in the selected passages show that, although all the translators repeat the translations of some creative collocations, Grossman (TT¹) does so more frequently and therefore does not explore the creativity present in Machado's style. The analysis also suggests that in some passages the translators do not capture the sense of the creative collocations originally used, which reveals how difficult it is to translate Machado's style.
Keywords: Creative collocations; Corpus-based translation studies; Literary parallel corpus; Memórias póstumas de Brás Cubas
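Note on entry 84: keyness compares how often a word occurs in the study text with how often it occurs in a reference corpus. The abstract does not say which statistic was selected in WordSmith Tools, so the sketch below simply assumes the log-likelihood statistic that is widely used for keyword analysis, in its two-term form. The function and parameter names are hypothetical.

```python
import math


def log_likelihood_keyness(freq_study, size_study, freq_ref, size_ref):
    """Two-term log-likelihood keyness of one word.

    freq_study / freq_ref: occurrences of the word in the study text
    and in the reference corpus; size_study / size_ref: total tokens
    in each. All names are illustrative assumptions.
    """
    total_size = size_study + size_ref
    total_freq = freq_study + freq_ref

    # Expected counts if the word were equally frequent in both corpora.
    expected_study = size_study * total_freq / total_size
    expected_ref = size_ref * total_freq / total_size

    score = 0.0
    if freq_study > 0:
        score += freq_study * math.log(freq_study / expected_study)
    if freq_ref > 0:
        score += freq_ref * math.log(freq_ref / expected_ref)
    return 2 * score


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(log_likelihood_keyness(30, 1000, 10, 1000))
```

A score near zero means the word is about equally frequent in both corpora; larger values indicate a stronger difference. The statistic is symmetric, so the direction (overuse or underuse in the study text) has to be read from the relative frequencies.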
85	As colocações verbais em três dicionários bilíngues e bilemáticos de alemão-português / The verbal collocations in three bilingual and bilemmatical german-portuguese dictionaries Nara Cristina Sanseverino Mahler 06 November 2009
This research belongs to the field of Lexicography and deals with the inclusion of verbal collocations in bilingual, bilemmatic German-Portuguese dictionaries. Since most textbooks of German as a foreign language do not treat verbal collocations adequately, learners need to look them up in dictionaries, both when decoding and when encoding texts. For this research we selected a number of verbal collocations considered fundamental to building a basic vocabulary for routine communicative situations, and checked in three dictionaries whether and how they occur, in an attempt to establish how far these dictionaries provide effective help not only for comprehension but also, and mainly, for text production.
Keywords: German as foreign language; Foreign language learning; Verbal collocations; Lexicography; Bilingual dictionaries german-portuguese
86	Názvy současných profesí ve zdravotnictví / Names of contemporary professions in healthcare Hubková, Helena January 2016
This thesis describes the typology of current names of healthcare professions from an onomasiological and a structural-morphological perspective. Its starting point is the existing linguistic description of naming structures realized as one-word names and nominal collocations. The language material consists of the names of healthcare professions that are currently in use. The names are classified in terms of word formation, lexical semantics and word origin.
87	Estrategias Didácticas para la Enseñanza del Enfoque Léxico y las Nuevas Tecnologías de la Información y de la Comunicación (TIC) en la Clase de Español como Lengua Extranjera (ELE) / Teaching Strategies for the Lexical Approach and the New Information and Communication Technologies (ICT) in the Spanish as a Foreign Language (ELE) Classroom Reed, Stella L 05 1900
The purpose of this research was to use a lexical approach and information and communication technologies (ICT) as strategies to facilitate learning for students of Spanish as a foreign language, and to answer two research questions: (1) whether the use of multiword lexical units improves lexical competence, and (2) whether the use of technology is an effective strategy to facilitate learning. A lesson plan with different activities was designed and put into practice with two groups of intermediate-level students (experimental and control) at the University of North Texas (UNT). The collected data were analyzed within the quantitative paradigm, using repeated-measures ANOVA, and within the qualitative or interpretive paradigm, to offer a broader perspective on the learning of multiword lexical units (collocations and idiomatic expressions). The results answered the research questions, confirmed the effectiveness of the lexical approach and ICT in the teaching-learning process, and indicated that they facilitated the students' acquisition of the L2.
Keywords: collocations; idiomatic expressions
88	Implementace nových metod řešení úloh dynamiky v programovém systému ANSYS / Implementation a new methods for the dynamic analysis using the program system ANSYS Klaška, Petr January 2008
This thesis deals with the implementation of the trigonometric collocation method in the ANSYS program system and its application together with databases of the additional stiffness, mass and damping effects of journals or hydrodynamic dampers.
89	La conception d’un dictionnaire bilingue arabe/français/ français/arabe de verbes supports destinés à la traduction / Creating a bilingual Arabic / French, French / Arabic dictionnary of light verbs intended for translation Aji, Hani 05 July 2019
This work studies light verb collocations, a phenomenon introduced by Z. Harris (1964) and described by several researchers following the work of the Laboratoire d'automatique documentaire et linguistique directed by M. Gross. One of its main goals is to understand how these constructions are built and how they can be modelled and anticipated. It sets out the characteristics that distinguish this phenomenon from other constructions such as fixed expressions, with particular attention to the semantic load of the light verb, which is wrongly regarded as empty of meaning. The research is carried out with a view to creating a bilingual dictionary of light verbs (Arabic-French and French-Arabic). It therefore also addresses the translation of these light verbs into Arabic, in order to design a new type of dictionary entry specifically for them. To create this entry, we seek to demonstrate that predicative nouns and their light verbs can be classified into categories. Predicative nouns are divided into lexical categories generated by a binary ontology built on four notions, which yields eleven lexical categories. The light verbs are in turn divided into semantic categories, or "scenarios", according to which they are classified and can be anticipated. These scenarios meet two criteria: a limited number of scenarios and exhaustive coverage in application.
Keywords: Light verb; Predicative noun; Collocation; Combinatorics; Lexicography; Scenario; Translation; Categorization; Ontology; Semantics; 492
90	Hodný, zlý a ošklivý (The Good, the Bad and the Ugly) : The representation of three minority groups in printed media discourse from the Czech Republic / Den gode, den onde, den fule : tre minoritetsgruppers representation i tjeckisk mediediskurs Elmerot, Irene January 2017
The aim of this work is to carry out a quantitative, corpus-based study of the linguistic othering of three minority groups (Roma, Vietnamese and Ukrainians) in Czech media discourse over 15 years, and to obtain a comprehensive, representative result by comparing the neutral, positive and negative adjectives that occur next to the search terms. The theoretical basis is how language helps to build and reinforce power structures, and how corpus searches for collocations and neighbouring words can make linguistic othering visible. A quantitative method is used to answer analytical questions about these designations. The material analysed is SYN version 4 of the Czech National Corpus, which in its entirety comprises 275 million sentences. No study of this kind appears to have been carried out before on such a large body of material, although some researchers have used similar methods. The results confirm that the groups are described in different ways and, given the size of the source material, provide evidence of how the language of media discourse reflects discourse in society at large.
Keywords: corpus analysis; linguistic othering; media discourse; collocations in Czech; stereotypes; frequency analysis; Specific Languages
