Global ETD Search

381	Otázky literární hermeneutiky / Issues in Literary Hermeneutics Válková, Natalia January 2013
The goal of this thesis is to explore whether literary hermeneutics can be systematized as a method of interpretation. Three selective studies of hermeneutic theory (on the problems of language, understanding and textuality) provide the theoretical and philosophical framework for the interpretative part of the thesis, which is devoted to a literary text, Joseph Brodsky's poem Isaac and Abraham. The thesis also explores the concept of literary hermeneutics, which stands between phenomenological-ontological hermeneutics and a methodologically and normatively oriented theory of interpretation. Despite the explicit tension between these two attitudes, literary hermeneutics also leaves room for an inspiring dialogue between them.
382	Classificação de textos com redes complexas / Using complex networks to classify texts Amancio, Diego Raphael 29 October 2013
The automatic classification of texts into pre-established categories has drawn increasing interest in recent years owing to the need to organize the ever-growing number of electronic documents. The prevailing approach to classification is based on analysis of textual content. In this thesis, we investigate the applicability of attributes based on textual style to traditional classification tasks, using the complex network (CN) representation, in which nodes represent words and edges represent adjacency relations. We studied the suitability of CN measurements for natural language processing tasks, with classification assisted by supervised and unsupervised machine learning methods. A detailed study of topological measurements revealed that several of them are informative, in the sense that they distinguish texts written in natural language from texts whose words are distributed at random. Moreover, most network measurements depend on syntactic factors, while intermittency measurements are more sensitive to semantic factors. As for the use of the CN model in practical scenarios, there is a significant dependence between authors' style and network topology. In an authorship recognition task on 40 novels written by eight authors, network and word intermittency measurements achieved an accuracy of 65%. In the stylistic analysis, we also found that books belonging to the same literary movement tend to have similar topological structures.
The network model also proved useful for disambiguating word senses. Using only topological information to characterize the nodes that represent polysemous words, we found a non-trivial relationship between syntax and semantics. For some words, the CN approach performed even better than the method based on recurrence patterns of neighboring words. The studies carried out in this thesis confirm that stylistic and semantic aspects influence the structural organization of word adjacency networks. The word adjacency model investigated here may therefore be useful not only for understanding the underlying mechanisms of language, but also for improving real applications when combined with traditional text processing methods.
Complex networks Pattern recognition Text classification Text processing
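As a rough illustration of the word adjacency model described in entry 382, the following Python sketch builds an undirected network in which nodes are words and edges link words that appear next to each other, then computes two common topological measurements. The function names, the tokenizer, and the choice of degree and clustering coefficient are assumptions made for this example; the abstract does not say which measurements or tools the thesis used, and the intermittency measurements and machine learning classifiers it mentions are omitted here.

```python
import re
from collections import defaultdict


def build_adjacency_network(text):
    """Build an undirected word adjacency network.

    Nodes are lower-cased words; an edge links two distinct words
    that occur consecutively in the text.
    """
    # [^\W\d_]+ matches runs of Unicode letters (no digits, no underscore).
    words = re.findall(r"[^\W\d_]+", text.lower())
    graph = defaultdict(set)
    for left, right in zip(words, words[1:]):
        if left != right:  # ignore self-loops from repeated words
            graph[left].add(right)
            graph[right].add(left)
    return dict(graph)


def degree(graph, node):
    """Number of distinct neighbours of a node."""
    return len(graph[node])


def clustering_coefficient(graph, node):
    """Fraction of pairs of neighbours that are themselves connected."""
    neighbours = list(graph[node])
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if neighbours[j] in graph[neighbours[i]]
    )
    return 2.0 * links / (k * (k - 1))


# Toy input invented for this example.
network = build_adjacency_network("the cat saw the dog and the cat ran")
features = {w: (degree(network, w), clustering_coefficient(network, w))
            for w in network}
```

Per-word values such as these could be aggregated (for example, averaged over a text) into a feature vector for a classifier; how the thesis aggregates them is not stated in the abstract.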
383	Um método para a fusão automática de sentenças similares em português / A method for automatic fusion of similar sentence in portuguese Seno, Eloize Rossi Marques 24 May 2010
In recent years there has been increasing interest in Natural Language Processing (NLP) applications that process a collection of texts on the same subject and generate a new output text, such as a summary or an answer to a given question. To generate quality texts, these applications need to cope with phenomena such as redundancy, contradiction and complementarity of information. In this context, a process that identifies the information common to a set of related sentences and generates a new sentence by merging information from the input sentences, without redundancies or contradictions, is of great relevance to applications that process multiple texts. Automatic sentence fusion is a relatively new research topic in the NLP literature, and for Portuguese in particular we are not aware of any such work. This work proposes a new method for fusing similar sentences in Portuguese, based on a symbolic, domain-independent approach, and produces Zíper, a sentence fusion system that implements the proposed method. Zíper is the first system to generate sentences that express all the information in the input sentences, i.e., the union of the input set. It can also generate sentences that express only the redundant information of the set (considered the most important), i.e., the intersection of the input sentences.
The system was evaluated intrinsically, and the results show that, in general, the generated sentences are well formed and preserve the original message of the set (the entire message in fusion by union, and only the main message in fusion by intersection). Zíper was also evaluated extrinsically in the context of a Portuguese multi-document summarizer. The results suggest that the proposed method helps improve the quality of summaries by reducing information redundancy, which often causes loss of cohesion and coherence.
Automatic sentence fusion Text-on-text generation
384	A revisão textual nos anos iniciais da escolaridade: percursos e procedimentos / Text revision in early years of schooling: pathways and procedures Dutra, Érica de Faria 28 March 2011
Writing a meaningful text that a recipient can understand and that serves a given purpose is not a simple task, especially when the writers are newly literate children. From this perspective, revising the text contributes significantly to a production better suited to the dialogue that writing sets up. Revision is therefore a practice that makes it possible to reflect on many aspects of written language, and it can be regarded as essential content for the appropriation of textual skills.
The assumptions underlying this work rest on a socio-historical conception of teaching and learning, which highlights the importance of interaction and the complexity of the writing process. Indeed, besides establishing the interlocutory situation, writing presupposes familiarity with the genre and the ability to plan, textualize, revise and even edit, where applicable. We start from Bakhtin's conception of language, which considers writing a dialogic process, and from a teaching of writing centered on interlocutory practices between active and responsive subjects. Within this framework, we study the practice of revision as an inexhaustible source of reflection and learning. Our goal is to investigate the main revision trends among children in the first and second years of elementary school over an interval of seven months, comparing versions produced individually and in pairs. We are also interested in analyzing the resources used in the changes made to the texts. To this end, students at a school in São Paulo were asked to rewrite the tale "Toads and Diamonds" and to revise that production at two different moments. Based on the 54 texts that make up the corpus of this research (18 rewritings and 36 revisions), we established two relevant axes of analysis, the discursive and the notational, from which five criteria emerged as significant: plot, language, punctuation, word segmentation and spelling.
The data collected show that, even without systematic knowledge of the aspects revised, the children were capable of varied reflections on language, which allows us to rethink the paradigms of the traditional pedagogical setting: beyond correcting spelling and improving the legibility of the text, newly literate children are also able to deal with aspects of language, a procedure normally considered feasible only for more experienced writers. Moreover, the research shows that the practice of revising itself builds knowledge, which develops through systematic participation in situations where students are encouraged to improve their textual production. The results, however, are not immediate; the gains from revision practices depend on a trajectory that, beyond the specific gains (the revision of each text), justifies in the long term the development of the learning of written language.
interaction literacy text production text revision written language
385	New data-driven approaches to text simplification Štajner, Sanja January 2015
Many texts we encounter in our everyday lives are lexically and syntactically very complex. This makes them difficult to understand for people with intellectual or reading impairments, and difficult for various natural language processing systems to process. This motivates text simplification (TS), which transforms texts into simpler variants. Given that this is still a relatively new research area, many challenges remain. The focus of this thesis is on better understanding the current problems in automatic text simplification (ATS) and proposing new data-driven approaches to solving them. We propose methods for learning sentence splitting and deletion decisions, built upon parallel corpora of original and manually simplified Spanish texts, which outperform existing similar systems. Our experiments in adapting those methods to different text genres and target populations report promising results, thus offering one possible solution to the scarcity of parallel corpora for text simplification aimed at specific target populations, which is currently one of the main issues in ATS. The results of our extensive analysis of the phrase-based statistical machine translation (PB-SMT) approach to ATS reject the widespread assumption that the success of that approach largely depends on the size of the training and development datasets. They point to more influential factors for the success of the PB-SMT approach to ATS, and reveal some important differences between cross-lingual MT and the monolingual MT used in ATS. Our event-based system for simplifying news stories in English (EventSimplify) overcomes some of the main problems in ATS. It requires neither a large number of handcrafted simplification rules nor parallel data, and it performs significant content reduction.
The automatic and human evaluations conducted show that it produces grammatical text and increases readability, preserving and simplifying relevant content and reducing irrelevant content. Finally, this thesis addresses another important issue in TS: how to evaluate the performance of TS systems automatically, given that access to the target users might be difficult. Our experiments indicate that existing readability metrics can successfully be used for this task when enriched with human evaluation of grammaticality and preservation of meaning.
386	Metodologia para mapeamento de informações não estruturadas descritas em laudos médicos para uma representação atributo-valor / A methodology for mapping non-structured medical findings to the attribute-value table format Honorato, Daniel de Faveri 29 April 2008
Because biomedical information in natural language is so easily recorded and stored in digital form, retrieving information from patient records in this unstructured format poses several open problems. Extracting structured information (for example, in the so-called attribute-value format) from free-text records is therefore an important research problem. Furthermore, once free-text records are represented in the attribute-value format, a wide variety of pattern extraction methods can be applied to them. To map free-text medical records into the attribute-value format, we propose a methodology that can be used to automatically (or semi-automatically, with the help of a domain expert) map the medical information of interest, stored in patient records and described in natural language, into a structured format. This methodology has been implemented in a computational system called TP-DISCOVER, which generates a table in the attribute-value format from a set of patient records (documents).
In order to identify important entities in the set of documents, as well as significant relations among these entities, we propose a hybrid linguistic/statistical terminology extraction approach, which selects words and phrases that appear with a frequency above a given threshold by applying statistical measures. The underlying assumption of this hybrid approach is that specialized documents are characterized by the repeated use of certain lexical units or morpho-syntactic constructions. Our goal is to reduce the effort spent on manual modelling by observing regularities in the texts and mapping them to attribute names in the attribute-value representation. The proposed methodology was evaluated by automatically structuring a collection of 6000 documents containing upper digestive endoscopy exam results described in natural language. The experimental results can be considered lower bounds, since they would improve considerably if the methodology were applied semi-automatically together with a domain expert. They show that the proposed methodology is suitable and reduces the time a specialist needs to analyse large amounts of medical records.
Terminology extraction Text mining Text pre-processing
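The frequency-threshold idea in entry 386 can be sketched in a few lines of Python. This is only the statistical half of such an approach, under stated assumptions: candidate terms are word n-grams, the threshold is a minimum document frequency, and each selected term becomes a binary attribute. The function names, parameters, and toy reports below are invented for illustration and are not taken from TP-DISCOVER, whose linguistic processing and actual measures are not described in the abstract.

```python
import re
from collections import Counter


def tokenize(document):
    """Lower-case a document and keep only runs of letters."""
    return re.findall(r"[^\W\d_]+", document.lower())


def candidate_terms(documents, max_len=2, min_doc_freq=2):
    """Select n-grams (1..max_len words) found in at least min_doc_freq documents."""
    doc_freq = Counter()
    for document in documents:
        tokens = tokenize(document)
        seen = set()  # count each n-gram once per document
        for n in range(1, max_len + 1):
            for i in range(len(tokens) - n + 1):
                seen.add(" ".join(tokens[i:i + n]))
        doc_freq.update(seen)
    return sorted(term for term, freq in doc_freq.items() if freq >= min_doc_freq)


def attribute_value_table(documents, terms):
    """One row per document; each term is a binary attribute (present or absent)."""
    rows = []
    for document in documents:
        padded = " " + " ".join(tokenize(document)) + " "
        rows.append({term: (" " + term + " ") in padded for term in terms})
    return rows


# Toy reports invented for this example; they are not data from the thesis.
reports = [
    "Esophagus normal. Stomach with mild gastritis.",
    "Esophagus normal. Duodenum normal.",
    "Stomach with erosive gastritis.",
]
terms = candidate_terms(reports)
table = attribute_value_table(reports, terms)
```

A binary presence flag is the simplest possible attribute value; a real system would more plausibly pair an attribute such as an anatomical site with the finding reported for it, which requires the linguistic analysis this sketch leaves out.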
387	Pré-natal do parceiro: uso da estratégia PRENACEL para melhorar o envolvimento masculino no pré-natal / Prenatal care of partner: use the PRENACEL strategy to improve the male involvement in prenatal care Lívia Pimenta Bonifácio 21 September 2018
Introduction: The partner's presence during the prenatal care, birth and postpartum care of the woman has shown positive results for maternal and child health and also for the health of the man. It is an important strategy to bring future fathers closer to health services and to strengthen their bond with fatherhood. Aim: To evaluate whether the implementation of SMS technology, through the PRENACEL program for the partner as a health education program, is a useful supplement to standard prenatal monitoring. Method: A parallel cluster randomized trial, with the clusters representing health units. Twenty health units were selected and randomly allocated according to pre-established criteria, 10 to control and 10 to intervention. The study population consisted of the partners of pregnant women who started prenatal care before the 20th week of gestation. Partners enrolled in PRENACEL received periodic short text messages via mobile phone with information about pregnancy and birth. In the control group units the partners, together with the women, received standard prenatal care. Results: 186 partners were interviewed: 62 from the PRENACEL group, 73 from the intervention group who did not opt for PRENACEL, and 51 from the control group. The mean age was 30 years, and the majority of respondents (51%) declared themselves to be of brown race/color. The interviewees had a mean of 9.3 years of study. The majority of the men (57.5%) cohabited with their partner, and 63.7% were classified as socioeconomic class C. Adherence to the PRENACEL program was 53.4%.
Partners in the PRENACEL group participated more in prenatal consultations and were more often present as the woman's companion at the moment of birth than partners in the other groups. Conclusion: The study showed that a health education strategy using communication technology seems to have good acceptability and a promising role in engaging men in the prenatal care, birth and postpartum care of their partners.
mHealth paternal involvement prenatal SMS text text messaging
388	Knowledge-enhanced text classification : descriptive modelling and new approaches Martinez-Alvarez, Miguel January 2014
The knowledge available to be exploited by text classification and information retrieval systems has changed significantly, both in nature and quantity, in recent years. There are now several sources of information that can potentially improve the classification process, and systems should be able to adapt to incorporate multiple sources of available data in different formats. This is especially important in environments where the required information changes rapidly and its utility may be contingent on timely implementation. For these reasons, the importance of adaptability and flexibility in information systems is growing rapidly. Current systems are usually developed for specific scenarios. As a result, significant engineering effort is needed to adapt them when new knowledge appears or the information needs change. This research investigates the use of knowledge within text classification from two perspectives. On the one hand, it applies descriptive approaches to the seamless modelling of text classification, focusing on knowledge integration and complex data representation. The main goal is a scalable and efficient approach to rapid prototyping for text classification that can incorporate different sources and types of knowledge and that minimises the gap between the mathematical definition and the modelling of a solution. On the other hand, it improves steps of the classification process where knowledge exploitation has traditionally not been applied. In particular, this thesis introduces two classification sub-tasks, namely Semi-Automatic Text Classification (SATC) and Document Performance Prediction (DPP), and several methods to address them.
SATC focuses on selecting the documents most likely to be wrongly assigned by the system so that they can be classified manually, while automatically labelling the rest. DPP estimates the classification quality that will be achieved for a document, given a classifier. In addition, we propose a family of evaluation metrics to measure degrees of misclassification, and an adaptive variation of k-NN.
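The SATC idea in entry 388 (route the documents most likely to be misclassified to a human, and label the rest automatically) can be illustrated with a small Python sketch. It assumes the classifier exposes a score per label for each document and uses the margin between the two best scores as a stand-in for "likelihood of being wrong". That margin heuristic, the function name, and the `manual_fraction` parameter are assumptions made for this example; the abstract does not describe the selection methods the thesis actually proposes.

```python
import math


def split_for_manual_review(scores, manual_fraction=0.25):
    """Split documents into an automatic set and a manual-review set.

    scores: dict mapping doc_id -> dict mapping label -> classifier score.
    Documents with the smallest margin between their two best scores are
    treated as the least certain and sent to manual review.
    Returns (auto_labels, manual_ids).
    """
    def margin(label_scores):
        ranked = sorted(label_scores.values(), reverse=True)
        return ranked[0] - ranked[1] if len(ranked) > 1 else ranked[0]

    # Least certain documents first.
    ordered = sorted(scores, key=lambda doc_id: margin(scores[doc_id]))
    n_manual = math.ceil(manual_fraction * len(ordered))
    manual_ids = ordered[:n_manual]
    # Every remaining document gets its top-scoring label.
    auto_labels = {
        doc_id: max(scores[doc_id], key=scores[doc_id].get)
        for doc_id in ordered[n_manual:]
    }
    return auto_labels, manual_ids


# Toy scores invented for this example.
example_scores = {
    "d1": {"sport": 0.9, "politics": 0.1},
    "d2": {"sport": 0.55, "politics": 0.45},
    "d3": {"sport": 0.2, "politics": 0.8},
    "d4": {"sport": 0.7, "politics": 0.3},
}
auto_labels, manual_ids = split_for_manual_review(example_scores)
```

The trade-off is controlled by `manual_fraction`: a larger value sends more documents to human annotators and leaves fewer, more confident decisions to the classifier.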
389	Uso do minerador de textos Sobek como ferramenta de apoio à compreensão textual / Use of the Sobek text miner as a tool to support text comprehension Epstein, Daniel January 2017 This thesis investigates the effects of using the Sobek text miner on students' reading and text comprehension. Sobek extracts relevant information from texts and represents it graphically. The work is grounded in meaningful learning theory, in the use of concept maps for knowledge representation, and in research indicating that graphical representations of words aid reading and decoding. According to David Ausubel, meaningful learning occurs through the assimilation of new concepts and ideas and their association with what the person already knows. By using a text miner with graphical representation of information, we seek to provide students with a visual representation of a text. This representation resembles a concept map and is meant to help students comprehend and assimilate the information. In it, links between the terms the miner considers relevant help clarify those terms and represent relationships present in the text, which may help students understand the text better and relate new information to what they already know. The study used Sobek in the classroom to support students' literacy activities and combined qualitative and quantitative approaches. It was conducted in two primary schools, with 5th-grade and 8th-grade students. Data were collected through questionnaires and interviews with teachers and students, which asked about the tool's main functions and its ability to help students from a literacy point of view, and through assessments designed to verify the tool's contribution. Students who used Sobek gave more correct answers in text interpretation activities: on average, they answered 66% of the questions correctly when using the text miner, against only 47% of the questions answered without it. Both students and teachers reported a high degree of satisfaction with the software and agreed that it improves students' text comprehension. The thesis also presents an evaluation of Sobek's ability to extract terms considered relevant to the text. Educational technology Meaningful learning Text comprehension Sobek Reading Text mining Graphs Literacy
390	Intégration du web social dans les systèmes de recommandation / Social web integration in recommendation systems Nana jipmo, Coriane 19 December 2017 The social Web keeps growing and gives access to a wide variety of resources, which come from sharing sites such as del.icio.us, messaging services such as Twitter, professional social networks such as LinkedIn, and more general-purpose social networks such as Facebook and LiveJournal. The same individual can be registered and active on several social networks, potentially with different purposes, where he or she publishes varied and constantly growing information such as name, locality, communities, and activities. Given the international dimension of the Web, this textual information is inherently multilingual and intrinsically ambiguous, since individuals of different origins write it in natural language with a free vocabulary. It is also a valuable source of data, especially for applications that seek to know their users in order to better understand their needs, activities, and interests. The objective of our research is to exploit, relying mainly on the Wikipedia encyclopedia, the textual resources extracted from a user's different social networks in order to build an extended profile that characterizes the user and can be used by applications such as recommendation systems. In particular, we conducted a study to characterize users' personality traits. Many experiments, analyses, and evaluations were carried out on real data collected from different social networks. Social Web Text mining Multilingual processing Wikipedia Personality
