1 |
Lexical Chains and Sliding Locality Windows in Content-based Text Similarity Detection. Nahnsen, Thade; Uzuner, Ozlem; Katz, Boris. 19 May 2005
We present a system to determine content similarity of documents. More specifically, our goal is to identify book chapters that are translations of the same original chapter; this task requires identification of not only the different topics in the documents but also the particular flow of these topics. We experiment with different representations employing n-grams of lexical chains and test these representations on a corpus of approximately 1000 chapters gathered from books with multiple parallel translations. Our representations include the cosine similarity of attribute vectors of n-grams of lexical chains, the cosine similarity of tf*idf-weighted keywords, and the cosine similarity of unweighted lexical chains (unigrams of lexical chains) as well as multiplicative combinations of the similarity measures produced by these approaches. Our results identify fourgrams of unordered lexical chains as a particularly useful representation for text similarity evaluation.
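To make the representation concrete, the sketch below (not the authors' implementation) computes cosine similarity over unordered n-grams of lexical-chain labels; the chain labels and chapter sequences are hypothetical, and the lexical-chain extraction itself is assumed to have happened upstream.

    # Minimal sketch: cosine similarity over n-grams of lexical-chain labels.
    from collections import Counter
    from math import sqrt

    def ngram_vector(chains, n=4, ordered=False):
        """Count n-grams over a sequence of lexical-chain labels."""
        grams = []
        for i in range(len(chains) - n + 1):
            window = chains[i:i + n]
            grams.append(tuple(window) if ordered else tuple(sorted(window)))
        return Counter(grams)

    def cosine(u, v):
        dot = sum(u[g] * v[g] for g in u if g in v)
        norm = sqrt(sum(c * c for c in u.values())) * sqrt(sum(c * c for c in v.values()))
        return dot / norm if norm else 0.0

    chapter_a = ["WAR", "FAMILY", "DEATH", "FAMILY", "LOVE", "WAR"]   # hypothetical chain labels
    chapter_b = ["FAMILY", "WAR", "DEATH", "LOVE", "FAMILY", "WAR"]   # hypothetical chain labels
    print(cosine(ngram_vector(chapter_a), ngram_vector(chapter_b)))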
|
2 |
Text readability and summarisation for non-native reading comprehension. Xia, Menglin. January 2019
This thesis focuses on two important aspects of non-native reading comprehension: text readability assessment, which estimates the reading difficulty of a given text for L2 learners, and learner summarisation assessment, which evaluates the quality of learner summaries to assess their reading comprehension. We approach both tasks as supervised machine learning problems and present automated assessment systems that achieve state-of-the-art performance. We first address the task of text readability assessment for L2 learners. One of the major challenges for a data-driven approach to text readability assessment is the lack of sufficiently large level-annotated data aimed at L2 learners. We present a dataset of CEFR-graded texts tailored for L2 learners and look into a range of linguistic features affecting text readability. We compare the text readability measures for native and L2 learners and explore methods that make use of the more plentiful data aimed at native readers to help improve L2 readability assessment. We then present a summarisation task for evaluating non-native reading comprehension and demonstrate an automated summarisation assessment system aimed at evaluating the quality of learner summaries. We propose three novel machine learning approaches to assessing learner summaries. In the first approach, we examine using several NLP techniques to extract features to measure the content similarity between the reading passage and the summary. In the second approach, we calculate a similarity matrix and apply a convolutional neural network (CNN) model to assess the summary quality using the similarity matrix. In the third approach, we build an end-to-end summarisation assessment model using recurrent neural networks (RNNs). Further, we combine the three approaches into a single system using a parallel ensemble modelling technique. We show that, on this task, our models outperform traditional approaches that rely on exact word match, and that our best model produces quality assessments close to those of professional examiners.
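As a rough illustration of the second approach, the sketch below builds a passage-summary similarity matrix from word vectors; the embeddings are random placeholders rather than the vectors used in the thesis, and the CNN that would consume the matrix is omitted.

    # Each cell holds the cosine similarity between a passage word and a summary word;
    # a CNN would then read this matrix like an image.
    import numpy as np

    rng = np.random.default_rng(0)
    vocab = {w: rng.normal(size=50) for w in
             "the river rose quickly and flooded the village homes".split()}

    def embed(tokens):
        vecs = np.stack([vocab[t] for t in tokens])
        return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)

    passage = "the river rose quickly and flooded the village".split()
    summary = "the village flooded".split()
    sim_matrix = embed(passage) @ embed(summary).T   # shape: (len(passage), len(summary))
    print(sim_matrix.shape)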
|
3 |
Termediator-II: Identification of Interdisciplinary Term Ambiguity Through Hierarchical Cluster Analysis. Riley, Owen G. 23 April 2014
Technical disciplines are evolving rapidly, leading to changes in their associated vocabularies. Confusion in interdisciplinary communication occurs due to this evolving terminology. Two causes of confusion are multiple definitions (overloaded terms) and synonymous terms; the formal names for these two problems are polysemy and synonymy. Termediator-I, a web application built on top of a collection of glossaries, uses definition count as a measure of term confusion. This tool was an attempt to identify confusing cross-disciplinary terms. As more glossaries were added to the collection, this measure became ineffective. This thesis provides a measure of term polysemy. Term polysemy is effectively measured by semantically clustering the text concepts, or definitions, of each term and counting the number of resulting clusters. Hierarchical clustering uses a measure of proximity between the text concepts. Three such measures are evaluated: cosine similarity, latent semantic indexing, and latent Dirichlet allocation. Two linkage types, for determining cluster proximity during the hierarchical clustering process, are also evaluated: complete linkage and average linkage. A crowdsourcing web application was used in an unsuccessful attempt to obtain a viable clustering threshold by public consensus. An alternative metric, the convergence value, is identified and tested as a viable clustering threshold. Six resulting lists of terms ranked by cluster count based on convergence values are generated, one for each combination of similarity measure and linkage type. Each combination produces a competitive list, and no combination can clearly be determined as superior. Semantic clustering successfully identifies polysemous terms, but each combination of similarity measure and linkage type provides slightly different results.
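The clustering step can be pictured with the following sketch, which clusters one term's definitions hierarchically under cosine distance with average linkage and counts the clusters at a threshold; the definitions and the 0.8 cut-off are invented for illustration, whereas the thesis derives its threshold from convergence values.

    # Polysemy as "number of definition clusters" for a single term.
    from scipy.cluster.hierarchy import linkage, fcluster
    from scipy.spatial.distance import pdist
    from sklearn.feature_extraction.text import TfidfVectorizer

    definitions = [
        "a sequence of instructions executed by a computer",       # "thread" in computing
        "a lightweight unit of scheduling within a process",
        "a thin strand of cotton or other fibre used in sewing",   # "thread" in textiles
    ]
    vectors = TfidfVectorizer().fit_transform(definitions).toarray()
    distances = pdist(vectors, metric="cosine")
    tree = linkage(distances, method="average")
    clusters = fcluster(tree, t=0.8, criterion="distance")
    print(len(set(clusters)))   # number of senses found at this threshold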
|
4 |
Comparing Text Similarity Functions For Outlier Detection: In a Dataset with Small Collections of Titles. Rabo, Vide; Winbladh, Erik. January 2022
Detecting when a title is put in an incorrect data category can be of interest for commercial digital services, such as streaming platforms, since they group movies by genre. Another beneficiary is price comparison services, which categorise offers by their respective product. In order to find data points that are significantly different from the majority (outliers), outlier detection can be applied. A title in the wrong category is an example of an outlier. Outlier detection algorithms may require a metric that quantifies dissimilarity between two points. Text similarity functions can provide such a metric when comparing text data. The question therefore arises: "Which text similarity function is best suited for detecting incorrect titles in practical environments such as commercial digital services?" In this thesis, different text similarity functions are evaluated when set to detect outlying (incorrect) product titles, with both efficiency and effectiveness taken into consideration. Results show that the variance in performance between functions is generally small, with a few exceptions. The overall top performer is Sørensen-Dice, a function that divides the number of common words by the total number of words found in both strings. While the function is efficient in the sense that it identifies most outliers in a practical time-frame, it is unlikely to find all of them and is therefore deemed not effective enough to be applied in practice. It might therefore be better applied as part of a larger system, or in combination with manual analysis. / Att identifiera när en titel placeras i en felaktig datakategori kan vara av intresse för kommersiella digitala tjänster, såsom plattformar för filmströmning, eftersom filmer delas upp i genrer. Också prisjämförelsetjänster, som kategoriserar erbjudanden efter produkt, skulle dra nytta. Outlier detection kan appliceras för att finna datapunkter som skiljer sig signifikant från de övriga (outliers). En titel i en felaktig kategori är ett exempel på en sådan outlier. Outlier detection-algoritmer kan kräva ett mått som kvantifierar hur olika två datapunkter är. Text similarity functions kvantifierar skillnaden mellan textsträngar och kan därför integreras i dessa algoritmer. Med detta uppkommer en följdfråga: "Vilken text similarity function är bäst lämpad för att hitta avvikande titlar i praktiska miljöer såsom kommersiella digitala tjänster?". I detta examensarbete kommer därför olika text similarity functions att jämföras när de används för att finna felaktiga produkttitlar. Jämförelsen tar hänsyn till både tidseffektivitet och korrekthet. Resultat visar att variationen i prestation mellan funktioner generellt är liten, med ett fåtal undantag. Den totalt sett högst presterande funktionen är Sørensen-Dice, vilken dividerar antalet gemensamma ord med det totala antalet ord i båda texttitlarna. Funktionen är effektiv då den identifierar de flesta outliers inom en praktisk tidsram, men kommer sannolikt inte hitta alla. Istället för att användas som en fullständig lösning, skulle det därför vara fördelaktigt att kombinera den med manuell analys eller en mer övergripande lösning.
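For reference, a minimal sketch of the Sørensen-Dice coefficient over title words follows, in its standard form 2|A∩B| / (|A| + |B|); the product titles are invented, and an outlier detector would typically turn the score into a distance as 1 - dice.

    def dice(a: str, b: str) -> float:
        """Sørensen-Dice similarity over the word sets of two titles."""
        wa, wb = set(a.lower().split()), set(b.lower().split())
        if not wa and not wb:
            return 1.0
        return 2 * len(wa & wb) / (len(wa) + len(wb))

    print(dice("Apple iPhone 13 128GB black", "Apple iPhone 13 128GB svart"))
    print(dice("Apple iPhone 13 128GB black", "Samsung Galaxy S22 256GB silver"))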
|
5 |
AURA: a hybrid approach to identify framework evolution. Wu, Wei. 02 1900
Les cadriciels et les bibliothèques sont indispensables aux systèmes logiciels d'aujourd'hui. Quand ils évoluent, il est souvent fastidieux et coûteux pour les développeurs de faire la mise à jour de leur code.
Par conséquent, des approches ont été proposées pour aider les développeurs à migrer leur code. Généralement, ces approches ne peuvent identifier automatiquement les règles de modification une-remplacée-par-plusieurs méthodes et plusieurs-remplacées-par-une méthode. De plus, elles font souvent un compromis entre rappel et précision dans leurs résultats en utilisant un ou plusieurs seuils expérimentaux.
Nous présentons AURA (AUtomatic change Rule Assistant), une nouvelle approche hybride qui combine call dependency analysis et text similarity analysis pour surmonter ces limitations. Nous avons implanté AURA en Java et comparé ses résultats sur cinq cadriciels avec trois approches précédentes, celles de Dagenais et Robillard, de M. Kim et al. et de Schäfer et al. Les résultats de cette comparaison montrent que, en moyenne, le rappel d'AURA est 53,07 % plus élevé que celui des autres approches, avec une précision similaire (0,10 % de moins). / Software frameworks and libraries are indispensable to today's software systems. As they evolve, it is often time-consuming for developers to keep their code up-to-date.
Approaches have been proposed to facilitate this. Usually, these approaches cannot automatically identify change rules for one-replaced-by-many and many-replaced-by-one methods, and they trade off recall for higher precision using one or more experimentally-evaluated thresholds.
We introduce AURA (AUtomatic change Rule Assistant), a novel hybrid approach that combines call dependency and text similarity analyses to overcome these limitations. We implemented AURA in Java and compared it on five frameworks with three previous approaches by Dagenais and Robillard, M. Kim et al., and Schäfer et al. The comparison shows that, on average, the recall of AURA is 53.07% higher while its precision is similar (0.10% lower).
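The hybrid idea can be sketched as below; this is not AURA itself, only an illustration of combining call-dependency overlap with textual similarity to rank candidate replacement methods, and the method names, call sets, and equal weighting are invented.

    # Rank candidate replacements for a removed method by a hybrid score:
    # Jaccard overlap of call dependencies + string similarity of signatures.
    from difflib import SequenceMatcher

    def hybrid_score(old, new, weight=0.5):
        calls_old, calls_new = set(old["calls"]), set(new["calls"])
        union = calls_old | calls_new
        call_sim = len(calls_old & calls_new) / len(union) if union else 0.0
        text_sim = SequenceMatcher(None, old["signature"], new["signature"]).ratio()
        return weight * call_sim + (1 - weight) * text_sim

    old_method = {"signature": "getSize()", "calls": {"length", "checkBounds"}}
    candidates = [
        {"signature": "getWidth()",  "calls": {"length", "checkBounds"}},
        {"signature": "getHeight()", "calls": {"length", "checkBounds"}},
        {"signature": "toString()",  "calls": {"format"}},
    ]
    ranked = sorted(candidates, key=lambda m: hybrid_score(old_method, m), reverse=True)
    print([m["signature"] for m in ranked])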
|
6 |
Arcabouço semiautomático para apoio à participação de avaliação em fóruns de EaD. / Semiautomatic framework to support the evaluation of participation in Distance Education forums. MEDEIROS, Danielle Chaves de. 11 December 2017
A seleção de critérios para a análise de informações durante o processo de avaliação da participação em fóruns de discussão de cursos de Educação a Distância (EaD) é um grande desafio. São muitas as variáveis que devem ser consideradas neste processo, além da subjetividade inerente à análise realizada pelo docente, passível de erro humano. Os docentes geralmente não possuem a seu dispor todos os recursos necessários, tornando-se necessário o uso de uma metodologia ou ferramenta que os auxilie no processo de avaliação. Diante desta demanda e a partir de um estudo dos principais indicadores qualitativos/quantitativos utilizados pelos professores de EaD, foi desenvolvido um arcabouço para a análise da participação dos alunos em fóruns. O objetivo deste arcabouço é servir de apoio à tomada de decisão do professor, fornecendo um mecanismo mais efetivo para a mensuração da quantidade e da qualidade das interações, passível de adaptação à metodologia tradicional adotada por cada docente. A validação deste arcabouço deu-se a partir da administração de questionários para a sondagem da opinião de docentes atuantes na área de ensino a distância, assim como pela realização de estudos de caso envolvendo a avaliação da acurácia de instâncias do arcabouço para o cálculo da nota de participação de alunos. Foi desenvolvido um Sistema Especialista (SE) para o processamento dos dados, com o uso de funções de similaridade para realizar, de forma semiautomática, a avaliação do conteúdo das mensagens dos alunos. Assim, as notas de participação calculadas foram confrontadas com as notas atribuídas pelo docente utilizando a abordagem tradicional. Os resultados obtidos demonstraram que, em três das cinco turmas observadas, não foi possível verificar a existência de diferenças estatísticas significativas entre o desempenho das abordagens estudadas. Um estudo da acurácia e correlação revela que, em todos os casos analisados, há uma forte relação entre os dados e o erro médio encontrado foi inferior a 3%, demonstrando a aplicabilidade do arcabouço ao contexto da avaliação da participação em fóruns. / The selection of criteria for analysing information during the process of evaluating participation in discussion forums of distance education courses is a major challenge. There are many variables to consider in this process, in addition to the subjectivity inherent in the analysis carried out by the instructor, which is subject to human error. Instructors generally do not have all the necessary resources at their disposal, so a methodology or tool that can help them with this process is needed. Facing this demand, and after a study of the main qualitative/quantitative indicators used by distance education teachers, we developed a framework for the analysis of student participation in forums. The aim of this framework is to support the teacher's decision-making by providing a more effective mechanism for measuring the quantity and quality of interactions, one that can be adapted to the traditional methodology adopted by each teacher. The framework was validated through questionnaires surveying the opinion of teachers active in distance learning, as well as through case studies assessing the accuracy of instances of the framework for calculating students' participation grades. An Expert System was developed to process the data, using similarity functions to assess, semi-automatically, the content of the students' messages. The calculated participation grades were then compared with the grades assigned by the teacher using the traditional approach. The results showed that, in three of the five classes observed, no statistically significant differences could be found between the performance of the two approaches studied. A study of accuracy and correlation shows that, in all the cases analyzed, there is a strong relationship between the data, and the average error was below 3%, demonstrating the applicability of the proposed framework to the assessment of student participation in forums.
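A deliberately simplified sketch of the general idea follows; it is not the expert system described above, and the Jaccard measure, weights, reference answer, and example posts are stand-ins chosen for brevity.

    # Mix participation quantity with content similarity of posts to a reference answer.
    def jaccard(a: str, b: str) -> float:
        wa, wb = set(a.lower().split()), set(b.lower().split())
        return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

    def participation_grade(messages, reference, expected_posts=5, w_quality=0.7):
        quantity = min(len(messages) / expected_posts, 1.0)
        quality = (sum(jaccard(m, reference) for m in messages) / len(messages)
                   if messages else 0.0)
        return 10 * (w_quality * quality + (1 - w_quality) * quantity)

    posts = ["recursion calls the function itself until a base case is reached",
             "i agree with the previous answer about the base case"]
    print(round(participation_grade(posts, "recursion is a function calling itself with a base case"), 2))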
|
7 |
Evaluation of Sentence Representations in Semantic Text Similarity Tasks / Utvärdering av meningsrepresentation för semantisk textlikhet. Balzar Ekenbäck, Nils. January 2021
This thesis explores methods of constructing sentence representations for semantic text similarity using word embeddings and benchmarks them against sentence-based evaluation test sets. Two methods were used to evaluate the representations: the STS Benchmark and the STS Benchmark converted to a binary similarity task. Results showed that preprocessing of the word vectors could significantly boost performance in both tasks, and the study concludes that word embeddings still provide an acceptable solution for specific applications. The study also concluded that the dataset used might not be ideal for this type of evaluation, as the sentence pairs in general had a high lexical overlap. To tackle this, the study suggests that a paraphrasing dataset could act as a complement, but that further investigation would be needed. / Denna avhandling undersöker metoder för att representera meningar i vektorform för semantisk textlikhet och jämför dem med meningsbaserade testmängder. För att utvärdera representationerna användes två metoder: STS Benchmark, en vedertagen metod för att utvärdera språkmodellers förmåga att utvärdera semantisk likhet, och STS Benchmark konverterad till en binär likhetsuppgift. Resultaten visade att förbehandling av texten och ordvektorerna kunde ge en signifikant ökning i resultatet för dessa uppgifter. Studien konkluderade även att datamängden som användes kanske inte är ideal för denna typ av utvärdering, då meningsparen i stort hade ett högt lexikalt överlapp. Som komplement föreslår studien en parafrasdatamängd, något som skulle kräva ytterligare studier.
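A minimal sketch of the kind of baseline evaluated here follows: sentences represented as averaged word vectors and scored with cosine similarity. The toy embeddings and the stop-word filtering merely stand in for whatever pretrained vectors and preprocessing the thesis actually used.

    # Sentence representation = mean of word vectors; STS score = cosine similarity.
    import numpy as np

    rng = np.random.default_rng(1)
    embeddings = {w: rng.normal(size=50) for w in
                  "a the cat sat on mat dog lay rug".split()}
    stop_words = {"a", "the", "on"}

    def sentence_vector(sentence):
        tokens = [t for t in sentence.lower().split() if t not in stop_words]
        return np.mean([embeddings[t] for t in tokens], axis=0)

    def sts_score(s1, s2):
        v1, v2 = sentence_vector(s1), sentence_vector(s2)
        return float(v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2)))

    print(sts_score("The cat sat on the mat", "A dog lay on the rug"))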
|
8 |
SubRosa – Multi-Feature-Ähnlichkeitsvergleiche von Untertiteln. Luhmann, Jan; Burghardt, Manuel; Tiepmar, Jochen. 20 June 2024
No description available.
|
9 |
Improving customer support efficiency through decision support powered by machine learning. Boman, Simon. January 2023
More and more aspects of today's healthcare are becoming integrated with medical technology and dependent on medical IT systems, which consequently puts stricter requirements on the companies delivering these solutions. As a result, companies delivering medical technology solutions need to spend considerable resources maintaining high-quality, responsive customer support. In this report, possible ways of increasing customer support efficiency using machine learning and NLP are examined at Sectra, a medical technology company. This is done through a qualitative case study, where empirical data collection methods are used to elicit requirements and find ways of adding decision support. Next, a prototype is built featuring a ticket recommendation system powered by GPT-3 and based on 65 000 available support tickets, which is integrated with the customer support workflow. Lastly, this is evaluated by having six end users test the prototype for five weeks, followed by a qualitative evaluation consisting of interviews and a quantitative measurement of the user-perceived usability of the proposed prototype. The results provide some support for the claim that machine learning can be used to create decision support in a customer support context, as six out of six test users believed that their long-term efficiency could improve using the prototype in terms of reducing the average ticket resolution time. However, one of the six test users expressed some skepticism towards the relevance of the recommendations generated by the system, indicating that improvements to the model must be made. The study also indicates that the use of state-of-the-art NLP models for semantic textual similarity can possibly outperform keyword searches.
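A sketch in the spirit of the prototype's retrieval step is shown below; TF-IDF is used purely as a runnable stand-in for the GPT-3-based representations mentioned in the abstract, and the tickets are invented.

    # Recommend the resolved tickets most similar to a new support ticket.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    resolved_tickets = [
        "PACS viewer crashes when opening large CT series",
        "Worklist not updating after server restart",
        "Images load slowly over VPN connection",
    ]
    new_ticket = "viewer crash while loading a large CT study"

    vectorizer = TfidfVectorizer()
    matrix = vectorizer.fit_transform(resolved_tickets)
    scores = cosine_similarity(vectorizer.transform([new_ticket]), matrix)[0]
    for i in scores.argsort()[::-1][:2]:
        print(f"{scores[i]:.2f}  {resolved_tickets[i]}")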
|