351 |
Text readability and summarisation for non-native reading comprehension
Xia, Menglin January 2019
This thesis focuses on two important aspects of non-native reading comprehension: text readability assessment, which estimates the reading difficulty of a given text for L2 learners, and learner summarisation assessment, which evaluates the quality of learner summaries to assess their reading comprehension. We approach both tasks as supervised machine learning problems and present automated assessment systems that achieve state-of-the-art performance. We first address the task of text readability assessment for L2 learners. One of the major challenges for a data-driven approach to text readability assessment is the lack of sizeable level-annotated data aimed at L2 learners. We present a dataset of CEFR-graded texts tailored for L2 learners and look into a range of linguistic features affecting text readability. We compare the text readability measures for native and L2 learners and explore methods that make use of the more plentiful data aimed at native readers to help improve L2 readability assessment. We then present a summarisation task for evaluating non-native reading comprehension and demonstrate an automated summarisation assessment system aimed at evaluating the quality of learner summaries. We propose three novel machine learning approaches to assessing learner summaries. In the first approach, we use several NLP techniques to extract features that measure the content similarity between the reading passage and the summary. In the second approach, we calculate a similarity matrix and apply a convolutional neural network (CNN) model to assess the summary quality using the similarity matrix. In the third approach, we build an end-to-end summarisation assessment model using recurrent neural networks (RNNs). Further, we combine the three approaches into a single system using a parallel ensemble modelling technique. We show that our models outperform traditional approaches that rely on exact word match on the task and that our best model produces quality assessments close to those of professional examiners.
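The similarity matrix used in the second approach can be pictured with a small sketch. The abstract does not say how the matrix entries are computed, so the code below assumes cosine similarity between word vectors; the toy vectors and the names `TOY_VECTORS`, `cosine` and `similarity_matrix` are hypothetical and only illustrate the shape of the input a CNN could consume.

```python
import math

# Hypothetical toy word vectors; a real system would use learned embeddings.
TOY_VECTORS = {
    "cat": [1.0, 0.0, 0.0],
    "kitten": [0.9, 0.1, 0.0],
    "sat": [0.0, 1.0, 0.0],
    "mat": [0.0, 0.0, 1.0],
}


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(passage_tokens, summary_tokens, vectors):
    """Rows index passage tokens, columns index summary tokens."""
    # Unknown words fall back to a zero vector (an assumption of this sketch).
    zero = [0.0] * len(next(iter(vectors.values())))
    return [
        [cosine(vectors.get(p, zero), vectors.get(s, zero)) for s in summary_tokens]
        for p in passage_tokens
    ]


matrix = similarity_matrix(["cat", "sat", "mat"], ["kitten", "sat"], TOY_VECTORS)
```

Unlike exact word match, the entry for "cat" against "kitten" is high even though the two tokens differ, which is the kind of signal the abstract contrasts with traditional approaches.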
|
352 |
O conceito de semelhança: uma proposta de ensino
Maciel, Alexsandra Camara 07 October 2004
The aim of this research was to analyse the difficulties faced by first-year Brazilian high school (Ensino Médio) students in forming the concept of similarity, and to check to what extent a teaching sequence that uses the concept of homothety, integrated with Geometric Optics, can favour the learning of that concept. Our preliminary studies led us to observe that teaching-learning problems with the concept of similarity relate to aspects linked to configurations, to necessary and sufficient conditions, and to context. Basing our study on three hypotheses, we tried to answer the following question: does a teaching sequence that uses the concept of homothety, integrated with Geometric Optics, give the student meaningful learning of the concept of similarity? To validate our hypotheses, we designed and applied a teaching sequence, using as theoretical support the Geometry Teaching Framework of Bernard Parsysz and the Register of Representation and Functioning of a Figure of Raymond Duval. After applying the teaching sequence, we carried out a post-test and a qualitative and quantitative analysis, raising discussions aimed at validating our hypotheses and answering the research question. We concluded that our hypotheses seem pertinent: developing activities based on the formation of shadows and of images in a dark chamber (camera obscura), linked to the concept of homothety, allowed us to approach the concept of similarity broadly, working with various types of configurations and with the necessary and sufficient conditions for similarity between two figures.
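The link between homothety and similarity that the teaching sequence relies on can be checked numerically. The sketch below is only an illustration and is not taken from the dissertation: the function names, the triangle and the centre are made up. It applies a homothety of ratio k to a triangle and confirms that every side of the image is k times the corresponding original side, so the two triangles are similar.

```python
import math


def homothety(point, center, k):
    """Image of a point under the homothety with the given center and ratio k."""
    px, py = point
    cx, cy = center
    return (cx + k * (px - cx), cy + k * (py - cy))


def side_lengths(triangle):
    """Lengths of the sides AB, BC and CA of a triangle (A, B, C)."""
    a, b, c = triangle
    return [math.dist(a, b), math.dist(b, c), math.dist(c, a)]


# Hypothetical example: a 3-4-5 right triangle enlarged from the centre (1, 1).
original = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
image = [homothety(p, center=(1.0, 1.0), k=2.0) for p in original]

# Ratio of each image side to the corresponding original side.
ratios = [new / old for new, old in zip(side_lengths(image), side_lengths(original))]
```

A shadow or a dark-chamber image corresponds to the same construction, with the light source or the pinhole playing the role of the centre.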
|
353 |
A Floristic Study of the Woody Vegetation of the North American Cross Timbers
Harrison, Thieron Pike 12 1900
This research represents the first systematic collection of the woody plants throughout the Cross Timbers. It provides the first keys to these plants in their vegetative condition, plant descriptions, distribution maps, and some quantitative measurements used for descriptive purposes. Descriptions of the woody plants were constructed as an aid in verification after a specimen has been identified by use of the keys. The measurements given pertain only to the woody plants as they occur in the Cross Timbers. Distribution maps are provided for all the taxa considered in this research. Apart from those species that have the ecological amplitude to grow throughout the Cross Timbers, the distribution of most of the remaining species seems to be most strongly influenced by average annual precipitation. In a few instances, conditions associated with latitude appear to govern the distribution of species or varieties within the Cross Timbers. Throughout the Cross Timbers, post oak (Quercus stellata), blackjack oak (Quercus marilandica), and hickory (Carya texana) dominate the upland forests. The streamside forests are dominated by willow (Salix nigra), cottonwood (Populus deltoides), and hackberry (Celtis laevigata). The variation in the vegetation of the Cross Timbers is not due to any change in dominant species, but rather to the distribution of the associated species which occur in the two prominent community types.
|
354 |
Metaphor and metonymy in Cantonese and English body-part idioms: a comparative, cognitive semantic study, with pedagogic applications. / CUHK electronic theses & dissertations collection
January 2009
Although it is generally accepted that L2 learners of English need to gain a good grasp of idioms, the teaching and learning of idioms in L2 is no easy task. One of the reasons is that idioms are enormous in number. The Collins COBUILD Idioms Dictionary (2002) lists over 6,000 idioms which are in contemporary, everyday use. Another reason is that a considerable number of idioms are figurative in nature; that is, their overall meaning cannot be obtained by simply adding up the literal meanings of the components involved. Related to these factors is the fact that the traditional vocabulary listing method adopted in most ESL/EFL textbooks presents each idiom entry and its meaning in such a way that the choice of each single word in the idiom seems random, and the overall figurative meaning deriving from the combination of the constituent words appears inexplicable. Taken together, these factors make idioms one of the most difficult aspects of L2 teaching and learning. / Idioms, which are a type of phraseological unit and are largely figurative in nature, are pervasive in human language. A significant part of the everyday L1 linguistic repertoire is formed by idioms and idiom-like constructions. In fact, the level of command of idioms serves as an important indicator of L2 proficiency. In other words, fluent and native-like language, a concern particularly for many advanced L2 learners, entails a good mastery of idioms. / In the light of the above problems, the present study, founded on Cognitive Linguistics (CL), aims to shed light on a more effective and manageable teaching and learning of L2 idioms by examining the CL theoretical assumptions compatible with L2 pedagogy. The CL feature which possesses the greatest potential for complementing language pedagogy is the notion of 'motivation.' 
In other words, the author explores, on the one hand, the potential of CL notions such as metaphor and metonymy for providing motivation for L2 idiom pedagogy, and, on the other, the potential of the above notions for comparing L1 and L2 idioms both linguistically and conceptually. Such a linguistic and conceptual comparative analysis of L1 and L2 idioms enables us to anticipate the possible difficulties encountered by L2 learners in learning idioms. / There are three methodologically independent but theoretically coherent research components in the present study. Study One is the elicitation of the body-part idioms in English (L2) and Cantonese (L1). Nine body parts are involved: head, eye/eyes, face, mouth, hand/hands, heart, foot/feet, body and bones. There is an examination of the underlying cognitive motivation (i.e. pure metaphor, pure metonymy, metaphor and metonymy) of each of the elicited idioms in both languages. Study Two is the think-aloud experiment which aims at eliciting from a group of Cantonese advanced L2 learners of English the mental images they produce in response to the English body-part idioms. These mental images should provide insight into the conceptual and linguistic similarities and differences between idioms in the two languages. Study Three is the experimental study. This aims at testing empirically the pedagogical soundness of teaching English idioms using conceptual metaphor and metonymy as well as an English-Cantonese idiom comparison. A total number of 106 Cantonese advanced L2 learners of English majoring in English were invited to participate in the experiment. They were divided into three groups, each of which was treated with a particular L2 idiom learning method. Experimental results show that students receiving the method involving conceptual metaphor, conceptual metonymy and a cross-linguistic comparison (i.e. 
the '3Cs' method) outperformed students in the other two groups, thus implying the pedagogical soundness of the 3Cs method in enhancing L2 idiom teaching and learning. / Leung, Chung Hong. / Adviser: Peter Crisp. / Source: Dissertation Abstracts International, Volume: 70-09, Section: A, page: . / Thesis submitted in: November 2008. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2009. / Includes bibliographical references (leaves 351-363). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstracts in English and Chinese. / School code: 1307.
|
355 |
Vers une théorie de l'anticipation du sens : Principes d'analyse structurale / Towards a theory of the meaning anticipation : Structural analysis principles
Decobert, Bernard 24 October 2014
This thesis questions the fundamental structure of language. It belongs to the Greimassian tradition in that it assumes its affiliation with the theories of Hjelmslev, Jakobson, Lévi-Strauss and Propp. Beyond that, it looks for what these theories have in common and may not yet have addressed sufficiently, in particular certain relations with the theory of Thom and Petitot. We postulate that discourse is constrained by an underlying, structuring semantic-syntactic order. More precisely, we postulate that discourse tends to organise itself from and/or around a single underlying chain of fundamental semantic categories that is hierarchical, constraining, non-random, transphrastic (extending beyond the sentence) and restricted. Our work therefore falls within the morphodynamic field. Starting from a dual semasiological and distributional approach, and then using statistical and mathematical methods, we show that certain semes, comparable to universals and having underlying topological properties, are naturally incremented in the body of utterances. Consequently, what we study is no longer mainly isolated semes but series, chains and syntagmatic arrangements that lead to a form of "proto-reasoning" associating semes. Finally, we show by experiment that other regularities probably also exist within this continuous structure; these regularities support the hypothesis of a self-similar principle between deep structure and surface structure. The thesis is mainly concerned with describing such a structural phenomenon and justifying its existence.
|
356 |
Index pro podobnostní vyhledávání ve vysokodimenzionálních prostorech / Index Suitable for Similar Search in High-dimensional Spaces
Krejčová, Martina January 2012
In this paper, we focus on indexing and searching in high-dimensional data. To this end, we implemented the Metric Index, a model of similarity search based on metric spaces that employs many known principles of partitioning and filtering. The metric space is a general model of similarity, which allows the implemented index to be used for various kinds of data. With this index, stored data can be searched effectively. The internal structure of the data is hidden; we only require an implementation of a feature extraction function, which produces a vector representing the data, and a metric function applicable to the given data. The Metric Index was implemented as a data cartridge, the mechanism for extending the capabilities of the Oracle server. This data cartridge enables indexing of large unstructured data, known as LOBs, in the Oracle server.
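One filtering principle commonly used in metric-space indexes is pivot filtering, which follows from the triangle inequality. The sketch below is a generic illustration of that principle and is not the Metric Index or its Oracle data cartridge; the class name `PivotFilterIndex`, the single pivot and the Euclidean metric are all assumptions made for the example.

```python
import math


def euclidean(a, b):
    """A metric on coordinate tuples; any function obeying the metric axioms works."""
    return math.dist(a, b)


class PivotFilterIndex:
    """Stores each object together with its precomputed distance to one pivot."""

    def __init__(self, objects, pivot, metric=euclidean):
        self.metric = metric
        self.pivot = pivot
        self.entries = [(obj, metric(obj, pivot)) for obj in objects]
        self.distance_computations = 0  # object distances evaluated by queries

    def range_query(self, query, radius):
        """Return every stored object within `radius` of `query`."""
        d_qp = self.metric(query, self.pivot)
        results = []
        for obj, d_op in self.entries:
            # Triangle inequality: d(query, obj) >= |d(query, pivot) - d(obj, pivot)|.
            # If that lower bound already exceeds the radius, skip the object.
            if abs(d_qp - d_op) > radius:
                continue
            self.distance_computations += 1
            if self.metric(query, obj) <= radius:
                results.append(obj)
        return results


points = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (9.0, 9.0)]
index = PivotFilterIndex(points, pivot=(0.0, 0.0))
hits = index.range_query((1.0, 0.0), radius=1.5)
```

In this example two of the four objects are discarded without computing their distance to the query, which matters when the metric is expensive, as it usually is for high-dimensional feature vectors.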
|
357 |
Emprego de técnicas de pré-processamento textual e algoritmos de comparação como suporte à correção de questões dissertativas: experimentos, análises e contribuições / Employing texts preprocessing techniques and string-matching algorithms to support correction of essay questions: experiments, analyzes and contributions.
Ricardo Lima Feitosa Ávila 23 August 2013
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / This master's thesis presents a study of techniques that can be used to support the correction of essay questions, based on adapting string-matching algorithms combined with text preprocessing techniques. The main challenge in designing a tool for this kind of application is the ambiguity of natural language. To analyse situations that arise in correcting subjective questions, tests were performed with these algorithms, and a tool was developed for this purpose. By comparing student answers with the answer key of questions set in subjective tests, we analysed the individual performance of the algorithms and of a set of preprocessing techniques found in the literature, in isolation and in combination. To work around specific false-negative and false-positive situations, some auxiliary techniques are proposed as a contribution of this work. After analysing the experiments, the similarity index results between answers indicate that the solution can be used to support the correction of essay questions; it may also be applied to plagiarism detection and integrated into a virtual learning environment.
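The combination of preprocessing and string comparison described above can be sketched as follows. The abstract does not name the algorithms that were tested, so this example uses Python's `difflib.SequenceMatcher` as a stand-in; the stopword list and the function names `preprocess` and `similarity_index` are hypothetical.

```python
import difflib
import string
import unicodedata

# Hypothetical stopword list; a real tool would use a fuller, language-specific one.
STOPWORDS = {"a", "an", "the", "is", "of", "to", "in"}


def preprocess(text):
    """Lowercase, strip accents and ASCII punctuation, and drop stopwords."""
    text = unicodedata.normalize("NFKD", text.lower())
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = text.translate(str.maketrans("", "", string.punctuation))
    return [token for token in text.split() if token not in STOPWORDS]


def similarity_index(student_answer, reference_answer):
    """Similarity in [0, 1] between the preprocessed token sequences."""
    student_tokens = preprocess(student_answer)
    reference_tokens = preprocess(reference_answer)
    return difflib.SequenceMatcher(None, student_tokens, reference_tokens).ratio()


score = similarity_index(
    "The mitochondria is the powerhouse of the cell.",
    "Mitochondria: powerhouse of a cell",
)
unrelated = similarity_index("cell wall", "powerhouse")
```

Without preprocessing, the two answers in the first call differ in case, punctuation and function words; after it they reduce to the same token sequence. This shows how preprocessing can reduce false negatives, and why over-aggressive normalisation risks false positives.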
|
358 |
Diversidade funcional em uma floresta de restinga / Functional diversity in a restinga forest
Juliana Lopes Vendrami 07 July 2014
Understanding the processes underlying the origin and maintenance of species diversity in communities is a central question in ecology. Among the many processes proposed to explain the diversity of organisms, we highlight environmental filtering and limiting similarity. Environmental filtering restricts the variation and distribution of organisms in a given environment, whereas limiting similarity pushes organisms' characteristics to differentiate, because the coexistence of individuals depends on divergence in resource use. The functional approach has been used to test the processes responsible for species coexistence; it consists of comparing the functional similarity of the species in a community by quantifying their traits. The combination of different traits in an organism defines its ecological strategy and, consequently, its distribution across habitats. Restinga forests are suitable for testing hypotheses of species coexistence in communities, because they have well-marked environmental gradients that define resource availability. This study therefore aimed to evaluate: i) the effect of soil condition (drained and flooded) on the functional traits and ecological strategies of tree species of high restinga; and ii) the effect of functional traits and ecological strategies on the plants' habitat preference. We conducted the study in an area of high restinga on Cardoso Island (Ilha do Cardoso, SP) that comprises two soil types: drained and flooded. We collected five functional traits (leaf area, specific leaf area, leaf thickness, leaf dry matter content and wood density) for 44 tree species. We selected 30 individuals of each species, 15 in each soil type. We used model selection for the statistical analyses: linear mixed models to assess the effect of soil on the mean values of the traits and ecological strategies, and simple linear models to assess its effect on their variation. We found a soil effect on the coefficients of variation (CV) of leaf dry matter content (LDMC) and specific leaf area (SLA), which were higher in the flooded soil. For the CV of SLA, the effect was significant only when palms were excluded from the analyses. We found no soil effect on the variation of the other functional traits or of the ecological strategies, nor on the type of ecological strategy. These results suggest that, in the flooded environment, limiting similarity is the predominant process structuring this community, which differs from what other studies in tropical forests have reported. We found no effect of traits or ecological strategies on species' habitat preference, except for the CV of LDMC and of SLA. Again, for the CV of SLA, the effect was significant only when Euterpe edulis (juçara palm) was excluded from the analyses. This result reinforces the importance of phenotypic plasticity in defining the occurrence of species in different habitats.
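The coefficient of variation that carries the main result is the standard deviation of a trait divided by its mean. The sketch below shows the calculation for one species in each soil type; the measurements are invented for illustration and are not data from the study, and the use of the sample standard deviation is an assumption.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("the coefficient of variation is undefined for a zero mean")
    return statistics.stdev(values) / mean


# Invented specific leaf area (SLA) values for one species; not data from the study.
sla_drained = [10.0, 11.0, 9.0, 10.0]
sla_flooded = [6.0, 14.0, 8.0, 12.0]

cv_drained = coefficient_of_variation(sla_drained)
cv_flooded = coefficient_of_variation(sla_flooded)
```

Both invented samples have the same mean, so a comparison of means would show no soil effect, while the CV is several times larger in the flooded sample. That is the pattern of result the abstract reports for LDMC and SLA.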
|
359 |
A adaptação de empréstimos recentes no papiamentu moderno / The adaptation of recent loanwords in modern papiamentu
Manuele Bandeira de Menezes 08 February 2013
This study investigates the phonological and morphological processes involved in the adaptation/nativization of lexical items from other languages (especially Spanish, Dutch, English, and Portuguese) incorporated during the twentieth century into modern Papiamentu, a language spoken on the islands of Aruba, Bonaire and Curaçao by over 200 thousand people. Based on a corpus made up of a selection of approximately a thousand recently nativized words, the study limits itself to the lexical domains of Sport, Politics, Economics, Technology, and Development. This choice was made to restrict the data to recent loans only, that is, to build a corpus of words incorporated from the twentieth century onwards. The material used for the study was taken from sources such as sports manuals and books (DANIELS, 1986; PERSAUD, 1992; COSTER, 2010); guides and books on politics and economics (HOYER, 1944; SINDIKATONAN, 1985; HEILIGERSHALABI, 1988; KOMERSIO, 1996); the printed newspaper Èxtra (2011), produced and sold in Curaçao; and a Papiamentu/English bilingual dictionary (RATZLAFF-HENRIQUEZ, 2008). The analysis of recently adapted lexical items seeks to evaluate whether the loanwords are nativized according to the linguistic patterns of Papiamentu or whether there is a special grammar for these words (PARADIS & LABEL, 1994; PARADIS, 1996; KENSTOWICZ, 2001; KENSTOWICZ & SUCHATO, 2004), and also to identify the phonological and morphological patterns found in the adaptation. To this end, the lexical items collected from the written sources were recorded with at least two native speakers for each word. The data were then analysed on the basis of patterns of sound correspondence between Papiamentu words and loanwords from other languages. A total of 930 recent loan items were collected. Of these, a single source language could be traced for 892 words, while the remaining 38 items are either hybrid formations, in which the lexical item is built from source words of two or more languages, or words with more than one possible etymon. The figures for Spanish and Portuguese across all the lexical fields show that these two languages contribute most to the borrowing of lexical items into Papiamentu. This predominance indicates that modern Papiamentu continues to be influenced by the languages that formed its superstratum, although the significant number of items of Dutch and English origin in its recent vocabulary should not be ignored. Taking as its theoretical framework the Theory of Constraints and Repair Strategies (TCRS), proposed by Paradis (1988), and the Theory of Perceptual Similarity (STERIADE, 2002; FLEISCHHACKER, 2001, 2002; WALKER 2003), the study confirms that segmental information tends to be maximally preserved. Deletion was not the most commonly used resource, occurring only when it was needed to avoid violating a constraint in the L1. In addition, speakers tend to maximize the perceptual similarity between the adapted form and the foreign input in the borrowing process.
|
360 |
Towards effective geographic ontology semantic similarity assessment
Hess, Guillermo Nudelman January 2008 (has links)
Integration of geographic information is becoming more important every day, owing to the ease of exchanging data over the Internet and the high cost of producing this kind of information. With the advent of the semantic web, the use of ontologies to describe geographic information is becoming popular. To enable integration, one of the steps on which much research is focusing is the matching of geographic ontologies. Matching consists of measuring the similarity between the elements, namely concepts and instances, of two or more geographic ontologies. The main problem in ontology matching is that the ontologies may be described by different people (or groups), using different vocabularies and varied perspectives. For geographic ontologies the difficulties are even greater, because of the particularities of geographic information (geometry, spatial location and spatial relationships), the lack of a widely adopted model for describing geographic ontologies, and the fact that ontologies are often described at different levels of semantic granularity. These particularities make conventional matchers unsuitable for matching geographic ontologies. On the other hand, the existing matchers for the geographic domain are considerably limited and work only for ontologies described in a specific model.
To overcome these limitations, this work presents algorithms and similarity expressions (metrics) to measure the similarity between two geographic ontologies effectively, at both the instance and the concept level. The proposed algorithms combine metrics that measure similarity over the non-geographic aspects of concepts and instances (the so-called conventional features) with expressions created specifically to handle geographic characteristics. Furthermore, this work proposes a generic geographic ontology model (meta-model), which can serve as a basis for creating geographic ontologies in a standardized way. This model is compliant with the OGC recommendations and is the basis upon which the algorithms are defined. To validate the algorithms, a software architecture called IG-MATCH was created; it can also enrich the semantics of geographic ontologies with topological relationships and generalization/specialization (parent-child) relationships through the analysis of their instances.
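As a rough illustration of what combining a conventional similarity measure with a geography-specific one could look like at the instance level, here is a minimal Python sketch. It is not the IG-MATCH algorithm or any metric from the thesis: the string measure (`difflib.SequenceMatcher`), the bounding-box overlap, the dictionary layout of an instance, and the equal weights are all assumptions chosen only to make the idea concrete.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Conventional (non-geographic) part: case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def bbox_overlap(box_a, box_b) -> float:
    """Geographic part (assumed): intersection-over-union of two
    axis-aligned boxes given as (min_x, min_y, max_x, max_y)."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


def instance_similarity(inst_a, inst_b, w_name=0.5, w_geo=0.5) -> float:
    """Weighted sum of the two parts; the weights are hypothetical."""
    return (w_name * name_similarity(inst_a["name"], inst_b["name"])
            + w_geo * bbox_overlap(inst_a["bbox"], inst_b["bbox"]))


# Hypothetical instances taken from two different ontologies.
lake_a = {"name": "Lake A", "bbox": (0.0, 0.0, 2.0, 2.0)}
lake_b = {"name": "Lake A", "bbox": (1.0, 0.0, 3.0, 2.0)}
score = instance_similarity(lake_a, lake_b)
```

A real matcher for geographic ontologies would also have to account for exact geometries, topological relationships and differences in semantic granularity, which this sketch ignores.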
|