271
Spelling Normalisation and Linguistic Analysis of Historical Text for Information Extraction. Pettersson, Eva, January 2016.
Historical text constitutes a rich source of information for historians and other researchers in the humanities. Many texts are, however, not available in electronic format, and even when they are, there is a lack of NLP tools designed to handle historical text. In my thesis, I aim to provide a generic workflow for automatic linguistic analysis and information extraction from historical text, with spelling normalisation as a core component of the pipeline. In the spelling normalisation step, the historical input text is automatically normalised to a more modern spelling, enabling the use of existing taggers and parsers trained on modern language data in the subsequent linguistic analysis step. In the final information extraction step, certain linguistic structures are identified based on the annotation labels assigned by the NLP tools and ranked according to the specific information need expressed by the user. An important consideration in my implementation is that the pipeline should be applicable to different languages, time periods, genres, and information needs simply by substituting the language resources used in each module. Furthermore, reusing existing NLP tools developed for the modern language is crucial: the lack of linguistically annotated historical data, combined with the high variability of historical text, makes it hard to train NLP tools aimed specifically at analysing historical text. In my evaluation, I show that spelling normalisation can be a very useful technique for giving easy access to historical information content, even in cases where little (or no) annotated historical training data is available. For the specific information extraction task of automatically identifying verb phrases describing work in Early Modern Swedish text, 91 out of the 100 top-ranked instances are true positives in the best setting.
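As an illustration of the normalisation step this abstract describes, the sketch below maps historical spellings to their closest modern forms with a fuzzy lexicon lookup before handing the text to modern taggers and parsers. The toy lexicon, the example spellings and the function names are assumptions for illustration only, not material from the thesis.

```python
from difflib import get_close_matches

# Tiny modern wordlist standing in for a full modern Swedish lexicon (assumption).
MODERN_LEXICON = {"och", "hustru", "arbete", "hemma", "skall"}

def normalise_token(token: str, cutoff: float = 0.75) -> str:
    """Map a historical spelling to its closest modern form, if one is close enough."""
    if token in MODERN_LEXICON:
        return token  # already modern
    candidates = get_close_matches(token, MODERN_LEXICON, n=1, cutoff=cutoff)
    return candidates[0] if candidates else token  # fall back to the original spelling

def normalise(sentence: str) -> str:
    return " ".join(normalise_token(t) for t in sentence.lower().split())

# Hypothetical Early Modern Swedish spellings, invented for the example.
print(normalise("hustrw skal hema"))
# After this step the text can be fed to a tagger/parser trained on modern Swedish.
```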
272
Automatic Text Ontological Representation and Classification via Fundamental to Specific Conceptual Elements (TOR-FUSE). Razavi, Amir Hossein, 16 July 2012.
In this dissertation, we introduce a novel text representation method used mainly for text classification. The representation is initially based on a variety of closeness relationships between pairs of words in text passages across the entire corpus. This representation is then used as the basis for our multi-level lightweight ontological representation method (TOR-FUSE), in which documents are represented according to their contexts and the goal of the learning task. The method differs from traditional representation methods, in which all documents are represented solely in terms of their constituent words, in isolation from the goal for which they are represented. We believe that choosing the correct granularity of representation features is an important aspect of text classification. Interpreting data in a more general space, with fewer dimensions, can convey more discriminative knowledge and decrease the level of learning perplexity. The multi-level model allows data to be interpreted in a more conceptual space, rather than only as scattered words occurring in texts. It aims to extract the knowledge tailored to the classification task by automatically creating a lightweight ontological hierarchy of representations. In the last step, we train a tailored ensemble learner over a stack of representations at different conceptual granularities. The final result is a mapping and a weighting of the targeted concept of the original learning task over a stack of representations and the granular conceptual elements of its different levels (a hierarchical mapping instead of a linear mapping over a vector). Finally, the entire algorithm is applied to a variety of general text classification tasks, and its performance is evaluated against well-known algorithms.
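The multi-level idea can be pictured, very loosely, as training an ensemble over document representations of different granularities. The sketch below is only a generic illustration of that idea with off-the-shelf components (a toy corpus, TF-IDF plus truncated SVD at several sizes, and soft voting); it is not the TOR-FUSE construction itself.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy two-class corpus and labels (assumptions for illustration only).
corpus = [
    "the team won the football match", "a great goal in the last minute",
    "the league title race is over", "players trained hard before the game",
    "the election results were announced", "the senate passed a new budget bill",
    "voters went to the polls today", "the minister resigned after the vote",
]
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # 0 = sports, 1 = politics

# One model per granularity: fewer SVD dimensions = a coarser, more "conceptual" space.
stack = [
    make_pipeline(TfidfVectorizer(),
                  TruncatedSVD(n_components=k, random_state=0),
                  LogisticRegression()).fit(corpus, labels)
    for k in (2, 4, 6)
]

# Simple soft-voting ensemble over the stack of representations.
probs = np.mean([m.predict_proba(["the team played a great game"]) for m in stack], axis=0)
print(probs.argmax(axis=1))  # ensemble prediction for the unseen toy sentence
```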
273
Développement d’un système d’appariement pour l’e-recrutement / Development of a matching system for e-recruitment. Dieng, Mamadou Alimou, 04 1900.
This thesis addresses a very important problem in the recruitment field: matching job postings with candidates.
In our case, we have thousands of job postings and millions of profiles collected from dedicated websites and provided by an industrial partner specialised in recruitment.
Job postings and candidate profiles on professional social networks are generally intended for human readers, namely recruiters and job seekers.
Automatically selecting profiles for a job posting therefore runs into a number of difficulties, which we seek to resolve in this thesis.
We use natural language processing (NLP) techniques to automatically extract the relevant information from a job posting and use it to build a query against our profile database.
To validate our extraction model for occupation, skills and experience, we evaluate these three tasks separately against a reference set of one hundred Canadian job postings that we annotated manually. To validate our matching tool, we had a recruitment expert evaluate the matching results for ten Canadian job postings.
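To make the extract-then-query idea concrete, here is a deliberately simple sketch: occupation, skills and years of experience are pulled out of a posting with keyword and regex heuristics and turned into a structured query. The skill list, the field names and the query shape are assumptions for illustration; the thesis relies on proper NLP extraction rather than these heuristics.

```python
import re

SKILL_LEXICON = {"python", "java", "sql", "communication", "accounting"}  # assumption

def extract(posting: str) -> dict:
    """Pull occupation, skills and minimum experience out of a posting (toy heuristics)."""
    text = posting.lower()
    skills = sorted(s for s in SKILL_LEXICON if s in text)
    years = re.search(r"(\d+)\s*(?:\+\s*)?years?", text)
    occupation = text.splitlines()[0].strip()  # assume the first line names the job
    return {"occupation": occupation,
            "skills": skills,
            "min_experience": int(years.group(1)) if years else 0}

def to_query(fields: dict) -> dict:
    # Shape of a query against a hypothetical profile index (e.g. a document store).
    return {"must": [{"match": {"occupation": fields["occupation"]}}],
            "should": [{"match": {"skills": s}} for s in fields["skills"]],
            "filter": [{"range": {"experience": {"gte": fields["min_experience"]}}}]}

posting = "Data analyst\nWe require 3+ years of experience with SQL and Python."
print(to_query(extract(posting)))
```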
274
Answer Triggering Mechanisms in Neural Reading Comprehension-based Question Answering Systems. Trembczyk, Max, January 2019.
We implement a state-of-the-art question answering system based on convolutional neural networks and attention mechanisms and include four different variants of answer triggering that have been discussed in the recent literature. The mechanisms are placed at different points in the architecture and work with different information. We train, develop and test our models on the popular SQuAD data set for question answering based on reading comprehension, which in its latest version has been equipped with additional non-answerable questions that the systems have to identify. We test the models against baselines and against each other and provide an extensive evaluation, both of general question answering and of the explicit performance of the answer triggering mechanisms. We show that the answer triggering mechanisms all clearly improve the model over the baseline without answer triggering, by as much as 19.6% to 31.3% depending on the model and the metric. The best performance in general question answering is shown by a model that we call Candidate:No, which treats the possibility that no answer can be found in the document as just another answer candidate, instead of adding a separate decision step somewhere in the model's architecture as the other three mechanisms do. The performance on detecting the non-answerable questions is very similar for three of the four mechanisms, while one performs notably worse. We give suggestions on which approach to use when a more or less conservative behaviour is desired, and discuss directions for future development.
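The Candidate:No idea can be illustrated in a few lines: candidate span scores are extended with a learned no-answer score, and a single softmax over all candidates decides whether to abstain. The scores and shapes below are toy assumptions, not the architecture evaluated in the thesis.

```python
import torch

span_scores = torch.tensor([2.1, 0.3, 1.7])               # scores for three candidate spans (toy)
no_answer_score = torch.nn.Parameter(torch.tensor(2.5))   # learned "abstain" candidate

# One softmax over spans plus the no-answer candidate: abstaining is just another choice.
probs = torch.softmax(torch.cat([span_scores, no_answer_score.unsqueeze(0)]), dim=0)
best = int(torch.argmax(probs))
if best == len(span_scores):
    print("no answer", float(probs[-1]))
else:
    print("answer span", best, float(probs[best]))
```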
275
Deep neural semantic parsing: translating from natural language into SPARQL / Análise semântica neural profunda: traduzindo de linguagem natural para SPARQL. Luz, Fabiano Ferreira, 07 February 2019.
Semantic parsing is the process of mapping a natural-language sentence into a machine-readable, formal representation of its meaning. The LSTM encoder-decoder is a neural architecture able to map a source language into a target one. We are interested in the problem of mapping natural language into SPARQL queries, and we seek to contribute strategies that do not rely on handcrafted rules, high-quality lexicons, manually built templates or other handmade complex structures. In this context, we present two contributions to the semantic parsing problem, departing from the LSTM encoder-decoder. While natural language has well-defined vector representation methods that exploit very large volumes of text, formal languages such as SPARQL queries suffer from a lack of suitable methods for vector representation. In the first contribution we improve the representation of SPARQL vectors. We start by obtaining an alignment matrix between the two vocabularies, natural-language and SPARQL terms, which allows us to refine a vectorial representation of SPARQL items. With this refinement we obtained better results in the subsequent training of the semantic parsing model. In the second contribution we propose a neural architecture, which we call Encoder CFG-Decoder, whose output conforms to a given context-free grammar. Unlike the traditional LSTM encoder-decoder, our model provides a grammatical guarantee for the mapping process, which is particularly important for practical cases where grammatical errors can cause critical failures, for instance in a compiler or interpreter. Results confirm that any output generated by our model obeys the given CFG, and we observe an improvement in translation accuracy compared with other results from the literature.
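The grammatical guarantee can be illustrated by masking decoder logits so that only tokens allowed in the current derivation state can be chosen. The tiny automaton, vocabulary and random logits below are assumptions standing in for a real SPARQL grammar and a trained decoder, not the Encoder CFG-Decoder itself.

```python
import numpy as np

VOCAB = ["SELECT", "?x", "?y", "WHERE", "{", "a", "Type", "}"]  # toy SPARQL-like tokens
# A tiny automaton standing in for the state of a CFG derivation (an assumption).
NEXT = {"start": {"SELECT": "sel"}, "sel": {"?x": "where", "?y": "where"},
        "where": {"WHERE": "open"}, "open": {"{": "subj"},
        "subj": {"?x": "pred", "?y": "pred"}, "pred": {"a": "obj"},
        "obj": {"Type": "close"}, "close": {"}": "done"}}

rng = np.random.default_rng(0)
state, output = "start", []
while state != "done":
    logits = rng.normal(size=len(VOCAB))               # stand-in for the decoder's logits
    allowed = NEXT[state]
    logits[[i for i, t in enumerate(VOCAB) if t not in allowed]] = -np.inf  # forbid the rest
    token = VOCAB[int(np.argmax(logits))]              # best *grammatical* token
    output.append(token)
    state = allowed[token]
print(" ".join(output))  # always a sequence the toy grammar can derive
```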
276
"Programação neurolinguística: transformação e persuasão no metamodelo" / Neuro-Linguistic Programming: transformation and persuasion in meta-model.Regina Maria Azevedo 19 April 2006 (has links)
This study presents the origins of Neuro-Linguistic Programming (NLP), its main ideas, theoretical presuppositions and goals. It analyses the meta-model, its relationship with language and its exploitation through the modeling process, based on the book The Structure of Magic I: A Book About Language and Therapy by Richard Bandler and John Grinder, the founders of NLP. It also examines the transformations obtained through the derivation process, based on Noam Chomsky's transformational-generative grammar, in order to verify their relationship with the meta-model. By exploring the discourse of a subject undergoing the modeling process, we verify to what extent the new semantic contents revealed by the transformations could influence the subject to the point of changing his view of the world. For this analysis, we also investigate the classical theories of argumentation, especially the concepts of conviction and persuasion, finding that modeling offers the subject resources to broaden his linguistic repertoire, to grasp new meanings from his own utterances and, through deliberation with himself, to convince and persuade himself.
278
Treinadores de sentido: notas etnográficas sobre atividades motivacionais modernas / Trainers of meaning: ethnographic notes on modern motivational activities. Oliveira Junior, Jorge Gonçalves de, 07 December 2015.
The object of this dissertation is "modern motivational activities", that is, activities aimed at creating an incentive in individuals who, it is assumed, need a meaning for their work and/or their life. Such activities call for interruptions of routine and a reflection on the participant's own trajectory, and there is a wide range of products for this purpose offered by specialized companies. In this dissertation we focus on analysing motivational talks, a weekend encounter group called Leader Training, and coaching associated with Neuro-Linguistic Programming (NLP). We conducted participant observation, interviews and analyses of promotional material, books, videos and websites in order to grasp these practices in their various facets. Our analysis and interpretation sought to understand the mechanisms and strategies through which these activities build their effectiveness: narratives and performances that seek to realign the participants' intentions with a conventional notion of success and happiness, observing the different ways in which scientific discourse is invoked and religious discourse is subtly suggested in the development of their content. By comparing field descriptions with anthropological theory, in intersection with an analysis based on the assumptions of science studies and their way of conceiving the modern in its symmetries with the non-modern, we believe we have contributed to broadening the view of this subject.
279
Uma abordagem de redes neurais convolucionais para análise de sentimento multi-lingual / A convolutional neural network approach to multilingual sentiment analysis. Becker, Willian Eduardo, 24 November 2017.
Nowadays, the use of social media has become a daily activity of our society. The huge and uninterrupted flow of information in these spaces opens up the possibility of exploring this data in different ways. Sentiment Analysis (SA) is a task that aims to obtain knowledge about the polarity of a given text, relying on several Natural Language Processing techniques, and most solutions deal with only one language at a time. Approaches that are not restricted to a single language, however, come closer to extracting the full knowledge and possibilities of these data. Recent Machine Learning approaches propose to solve SA mainly with deep neural networks, which have obtained good results in this task. In this work, three Convolutional Neural Network architectures are proposed that deal with multilingual Twitter data in four languages. The first and second proposed models require substantially fewer learnable parameters than the other baselines considered, while being more accurate than several other deep neural architectures. The third proposed model performs multitask classification, identifying both the polarity of a given sentence and its language. This model reaches an accuracy of 74.43% for SA and 98.40% for language identification on the four-language multilingual dataset. The results confirm that the proposed model is the best choice for both sentiment and language classification, outperforming the baselines considered.
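In the spirit of the third, multitask model, the sketch below shows a shared convolutional text encoder with two output heads, one for sentiment polarity and one for language identification. All sizes, the random batch and the labels are assumptions; this is not the architecture or data used in the dissertation.

```python
import torch
import torch.nn as nn

class MultitaskCNN(nn.Module):
    def __init__(self, vocab_size=5000, emb=64, n_filters=32, n_langs=4, n_polarities=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb)
        self.conv = nn.Conv1d(emb, n_filters, kernel_size=3, padding=1)
        self.pool = nn.AdaptiveMaxPool1d(1)
        self.sentiment_head = nn.Linear(n_filters, n_polarities)  # polarity output
        self.language_head = nn.Linear(n_filters, n_langs)        # language-ID output

    def forward(self, token_ids):                          # token_ids: (batch, seq_len)
        x = self.emb(token_ids).transpose(1, 2)             # (batch, emb, seq_len)
        h = self.pool(torch.relu(self.conv(x))).squeeze(-1)  # shared tweet representation
        return self.sentiment_head(h), self.language_head(h)

model = MultitaskCNN()
batch = torch.randint(0, 5000, (8, 20))                    # 8 toy tweets of 20 token ids
sent_logits, lang_logits = model(batch)
loss = (nn.functional.cross_entropy(sent_logits, torch.randint(0, 2, (8,))) +
        nn.functional.cross_entropy(lang_logits, torch.randint(0, 4, (8,))))
loss.backward()                                            # both heads train the shared encoder
print(sent_logits.shape, lang_logits.shape)
```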
280
Une approche hybride de l'extraction d'information : sous-langages et lexique-grammaire / A hybrid approach to information extraction: sublanguages and lexicon-grammar. Watrin, Patrick, 25 October 2006.
Information extraction usually consists in filling in, from a set of documents, the fields of a form established in advance and organised around a specific scenario. In this work, we study the relevance of the syntactic databases of the lexicon-grammar for addressing the questions and challenges raised by this application domain (adaptability, performance, ...).
The elementary sentence (a <predicate, essential complements> pair) is the minimal meaningful unit of this linguistic theory (M. Gross, 1975), which brings together lexicon and syntax in a single formalism. Each of these sentences delineates the meaning of a predicate by means of both distributional and transformational criteria. In a generic setting, unfortunately, these sentences cannot be characterised any further: the syntactico-semantic analysis derived from the formalism must adapt to any situation of utterance. However, if the analysis is restricted to a particular context or sublanguage, for instance that of an extraction scenario, in other words if the situation of utterance is bounded, it becomes possible to specify the semantics of the predicate and of its essential complements. Elementary sentences can thus be treated as so many extraction patterns.
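The idea of treating elementary sentences as extraction patterns can be sketched as matching a <predicate, essential complements> pair, specialised for a sublanguage, against shallow parses in order to fill a form. The parses, the pattern and the slot names below are assumptions for illustration, not the lexicon-grammar tables used in the thesis.

```python
from dataclasses import dataclass

@dataclass
class ElementarySentence:            # <predicate, essential complements>
    predicate: str
    complements: tuple               # semantic roles expected by the predicate

PATTERN = ElementarySentence("appoint", ("Company", "Person", "Position"))

# Toy shallow parses: (predicate, {role: filler}) pairs (assumption).
parses = [
    ("appoint", {"Company": "Acme Corp", "Person": "J. Smith", "Position": "CFO"}),
    ("meet", {"Person": "J. Smith", "Company": "Acme Corp"}),
]

def extract(parse, pattern):
    """Fill the form's fields when the parse realises the elementary-sentence pattern."""
    predicate, args = parse
    if predicate == pattern.predicate and all(r in args for r in pattern.complements):
        return {role: args[role] for role in pattern.complements}
    return None

for p in parses:
    print(extract(p, PATTERN))
```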