111.
Automated Essay Scoring for English Using Different Neural Network Models for Text Classification. Deng, Xindi. January 2021.
Written skills are an essential evaluation criterion for a student’s creativity, knowledge, and intellect. Consequently, academic writing is a common part of university and college admissions applications, standardized tests, and classroom assessments. Essay scoring, however, is a daunting task for teachers, and Automated Essay Scoring can be a helpful tool to support their decision-making. There have been many successful models with supervised or unsupervised machine learning algorithms in the field of Automated Essay Scoring. This thesis work makes a comparative study among various neural network models with supervised machine learning algorithms and different linguistic feature combinations. It also shows that the same linguistic features are applicable to more than one language. The models studied in this experiment include TextCNN, TextRNN_LSTM, TextRNN_GRU, and TextRCNN, trained with the essays from the Automated Student Assessment Prize (ASAP) from Kaggle competitions. Each essay is represented with linguistic features measuring linguistic complexity. Those features are divided into four groups: count-based, morphological, syntactic, and lexical features, and the four groups can form a total of 14 combinations. The models are evaluated via three measurements: accuracy, F1 score, and Quadratic Weighted Kappa. The experimental results show that models trained only with count-based features outperform the models trained using other feature combinations. In addition, TextRNN_LSTM performs best, with an accuracy of 54.79%, an F1 score of 0.55, and a Quadratic Weighted Kappa of 0.59, which beats the statistically-based baseline models.
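Quadratic Weighted Kappa, one of the three evaluation measures above, penalizes disagreements between predicted and gold scores by the square of their distance. A minimal sketch of the metric (illustrative score lists, not the ASAP data):

```python
def quadratic_weighted_kappa(rater_a, rater_b, num_ratings):
    """Quadratic Weighted Kappa between two lists of integer scores
    in the range 0..num_ratings-1."""
    total = len(rater_a)
    # Observed score-pair counts
    observed = [[0.0] * num_ratings for _ in range(num_ratings)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1
    # Marginal histograms give the expected counts under independence
    hist_a = [rater_a.count(i) for i in range(num_ratings)]
    hist_b = [rater_b.count(i) for i in range(num_ratings)]
    num = den = 0.0
    for i in range(num_ratings):
        for j in range(num_ratings):
            weight = (i - j) ** 2 / (num_ratings - 1) ** 2
            expected = hist_a[i] * hist_b[j] / total
            num += weight * observed[i][j]
            den += weight * expected
    return 1.0 - num / den

print(quadratic_weighted_kappa([0, 1, 2, 2], [0, 1, 2, 1], 3))  # 0.8
```

A value of 1 means perfect agreement; 0 means chance-level agreement.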
112.
Targeted Topic Modeling for Levantine Arabic. Zahra, Shorouq. January 2020.
Topic models for focused analysis aim to capture topics within the limiting scope of a targeted aspect (which can be thought of as some inner topic within a certain domain). To serve their analytic purposes, topics are expected to be semantically coherent and closely aligned with human intuition; this in itself poses a major challenge for the more common topic modeling algorithms which, in a broader sense, perform a full analysis that covers all aspects and themes within a collection of texts. The thesis attempts to construct a viable focused-analysis topic model which learns topics from Twitter data written in a closely related group of non-standardized varieties of Arabic widely spoken in the Levant region (i.e. Levantine Arabic). Results are compared to a baseline model as well as another targeted topic model designed precisely to serve the purpose of focused analysis. Judged overall, the model is capable of adequately capturing topics containing terms which fall within the scope of the targeted aspect. Nevertheless, it fails to produce human-friendly and semantically coherent topics: several topics contained a number of intruding terms, while others contained terms which, though still relevant to the targeted aspect, were thrown together seemingly at random.
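The semantic coherence being judged here is often approximated automatically from document co-occurrence counts, as in the UMass coherence score: word pairs from a topic that co-occur in many documents score high, pairs that never co-occur score low. A minimal sketch (the corpus and topics below are invented toy data, not the Twitter corpus):

```python
import math

def umass_coherence(topic_words, documents):
    """UMass-style coherence over a list of tokenized documents.
    Assumes every topic word occurs in at least one document."""
    doc_sets = [set(doc) for doc in documents]
    def doc_freq(*words):
        return sum(all(w in d for w in words) for d in doc_sets)
    score = 0.0
    for i in range(1, len(topic_words)):
        for j in range(i):
            w_i, w_j = topic_words[i], topic_words[j]
            # +1 smoothing keeps the log defined for non-co-occurring pairs
            score += math.log((doc_freq(w_i, w_j) + 1) / doc_freq(w_j))
    return score

docs = [["cat", "dog", "pet"], ["cat", "dog"],
        ["stock", "market", "price"], ["stock", "market"]]
print(umass_coherence(["cat", "dog"], docs) > umass_coherence(["cat", "stock"], docs))  # True
```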
113.
Constructiveness-Based Product Review Classification. Loobuyck, Ugo. January 2020.
Promoting constructiveness in online comment sections is an essential step toward making the internet a more productive place. On online marketplaces, customers often have the opportunity to voice their opinion and relate their experience with a given product. In this thesis, we investigate the possibility of modeling constructiveness in product reviews in order to promote the most informative and argumentative customer feedback. We develop a new 4-class constructiveness taxonomy based on heuristics and specific categorical criteria. We use this taxonomy to annotate 4000 Amazon customer reviews as our training set, referred to as the Corpus for Review Constructiveness (CRC). In addition to the 4-class constructiveness tag, we include a binary tag to compare modeling performance with previous work. We train and test several computational models such as Bidirectional Encoder Representations from Transformers (BERT), a stacked bidirectional LSTM, and a Gradient Boosting Machine. We demonstrate our annotation scheme’s reliability with a set of inter-annotator agreement experiments, and show that good levels of performance can be reached in both the multiclass setting (0.69 F1 and 57% error reduction over the baseline) and the binary setting (0.85 F1 and 71% error reduction). Different features are evaluated individually and in combination. Moreover, we compare the advantages, downsides, and performance of both feature-based and neural network models. Finally, these models trained on CRC are tested on out-of-domain data (news article comments) and shown to be nearly as proficient as on in-domain data. This work extends constructiveness modeling to a new type of data and provides a new non-binary taxonomy for data labeling.
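Inter-annotator agreement experiments like those above are typically summarized with a chance-corrected statistic such as Cohen's kappa. A minimal sketch (the label lists are illustrative, not the CRC annotations):

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: observed agreement between two annotators,
    corrected for the agreement expected by chance."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    return (observed - expected) / (1 - expected)

print(cohens_kappa([0, 1, 1, 0], [0, 1, 0, 0]))  # 0.5
```

Kappa near 1 indicates a reliable annotation scheme; values near 0 mean the annotators agree no more than chance would predict.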
114.
Exploring Cross-lingual Sublanguage Classification with Multi-lingual Word Embeddings. Shih, Min-Chun. January 2020.
Cross-lingual text classification is an important task due to globalization and the increased availability of multilingual data. This thesis explores methods for cross-lingual classification on Swedish and English medical corpora. Specifically, it explores a simple convolutional neural network (CNN) with MUSE pre-trained word embeddings for binary classification of sublanguages (“lay” and “specialized”), transferring from Swedish healthcare texts to English healthcare texts. MUSE is a library that provides state-of-the-art multilingual word embeddings and large-scale high-quality bilingual dictionaries. The thesis presents experiments with imbalanced and balanced class distributions in the training and test data to examine the effect of class distribution, and also examines the influence of clean versus noisy test data. The results show that a balanced class distribution in the training data performs significantly better than an imbalanced one, and that clean test data benefits the transfer of labels from one language to another. The thesis also compares the performance of the simple convolutional neural network model with a Naive Bayes baseline. Results show that on this task a simple Naive Bayes classifier based on bag-of-words translated using the MUSE English-Swedish dictionary outperforms a simple CNN model based on MUSE pre-trained word embeddings in several experimental settings.
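The winning baseline, Naive Bayes over bag-of-words translated token by token with a bilingual dictionary, can be sketched in a few lines. Everything below (the dictionary, documents, and labels) is invented toy data, not the MUSE resources or the thesis corpora:

```python
import math
from collections import Counter

def train_nb(docs, labels):
    """Multinomial Naive Bayes with Laplace smoothing over token counts."""
    classes = set(labels)
    vocab = {tok for doc in docs for tok in doc}
    counts = {c: Counter() for c in classes}
    for doc, label in zip(docs, labels):
        counts[label].update(doc)
    priors = {c: labels.count(c) / len(labels) for c in classes}
    return classes, vocab, counts, priors

def predict(model, doc):
    classes, vocab, counts, priors = model
    def log_prob(c):
        total = sum(counts[c].values())
        return math.log(priors[c]) + sum(
            math.log((counts[c][tok] + 1) / (total + len(vocab))) for tok in doc)
    return max(classes, key=log_prob)

# English training documents for a toy lay/specialized distinction
docs = [["heart", "attack", "pain"], ["doctor", "visit", "heart"],
        ["myocardial", "infarction", "diagnosis"], ["cardiac", "arrhythmia", "treatment"]]
labels = ["lay", "lay", "specialized", "specialized"]
model = train_nb(docs, labels)

# A Swedish test document, translated token by token with a toy dictionary
sv_en = {"hjärta": "heart", "attack": "attack"}
print(predict(model, [sv_en[t] for t in ["hjärta", "attack"]]))  # lay
```

The dictionary translation step is what makes the monolingual classifier cross-lingual: the model never sees Swedish tokens, only their English counterparts.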
115.
Cross-lingual and Multilingual Automatic Speech Recognition for Scandinavian Languages. Černiavski, Rafal. January 2022.
Research into Automatic Speech Recognition (ASR), the task of transforming speech into text, remains highly relevant due to its countless applications in industry and academia. State-of-the-art ASR models are able to produce nearly perfect, sometimes described as human-like, transcriptions; however, accurate ASR models are most often available only in high-resource languages. Furthermore, the vast majority of ASR models are monolingual, that is, only able to handle one language at a time. In this thesis, we extensively evaluate the quality of existing monolingual ASR models for Swedish, Danish, and Norwegian. In addition, we search for parallels between monolingual ASR models and native speakers' comprehension of foreign languages. Lastly, we extend the Swedish monolingual model to handle all three languages. The research conducted in this thesis project is divided into two main sections, namely monolingual and multilingual models. In the former, we analyse and compare the performance of monolingual ASR models for Scandinavian languages in monolingual and cross-lingual settings. We compare these results against the levels of mutual intelligibility of Scandinavian languages in native speakers of Swedish, Danish, and Norwegian to see whether the monolingual models favour the same languages as native speakers. We also examine the performance of the monolingual models on the regional dialects of all three languages and perform qualitative analysis of the most common errors. As for multilingual models, we expand the most accurate monolingual ASR model to handle all three languages. To do so, we explore the most suitable settings via trial models. In addition, we propose an extension to the well-established Wav2Vec 2.0-CTC architecture by incorporating a language classification component. The extension enables the usage of language models, thus boosting the overall performance of the multilingual models.
The results reported in this thesis suggest that in a cross-lingual setting, monolingual ASR models for Scandinavian languages perform better on the languages that are easier to comprehend for native speakers. Furthermore, the addition of a statistical language model boosts the performance of ASR models in monolingual, cross-lingual, and multilingual settings. ASR models appear to favour certain regional dialects, though the gap narrows in a multilingual setting. Contrary to our expectations, our multilingual model performs comparably with the monolingual Swedish ASR models and outperforms the Danish and Norwegian models. The multilingual architecture proposed in this thesis project is fairly simple yet effective, and with greater computational resources at hand, the extensions outlined in the conclusions might improve the models further.
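ASR quality in all of these settings is conventionally measured with Word Error Rate (WER), the word-level edit distance between the model's hypothesis and a reference transcript, normalized by reference length. A minimal sketch (the example strings are invented):

```python
def word_error_rate(reference, hypothesis):
    """WER: word-level Levenshtein distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dist[i][j] = edit distance between ref[:i] and hyp[:j]
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i
    for j in range(len(hyp) + 1):
        dist[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,        # deletion
                             dist[i][j - 1] + 1,        # insertion
                             dist[i - 1][j - 1] + cost) # substitution
    return dist[len(ref)][len(hyp)] / len(ref)

print(word_error_rate("jag heter anna", "jag heter anne"))  # one substitution out of three words
```

Lower is better; a cross-lingual evaluation simply scores, say, the Swedish model's hypotheses against Danish references with the same function.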
116.
Evaluating Transcription of Ciphers with Few-Shot Learning. Milioni, Nikolina. January 2022.
Ciphers are encrypted documents created to hide their content from anyone but the intended recipients of the message. Different types of symbols, such as zodiac signs, alchemical symbols, alphabet letters, or digits are used to compose the encrypted text, which needs to be decrypted to gain access to the content of the documents. The first step before decryption is the transcription of the cipher. The purpose of this thesis is to evaluate an automatic transcription tool that converts cipher images to a text format. We implement a supervised few-shot deep-learning model which is tested on different types of encrypted documents, and use various evaluation metrics to assess the results. We show that the few-shot model presents promising results on seen data, with Symbol Error Rates (SER) ranging from 8.21% to 47.55% and accuracy scores from 80.13% to 90.27%, whereas the SER on out-of-domain datasets reaches 79.91%. While a wide range of symbols are correctly transcribed, the erroneous symbols mainly contain diacritics or are punctuation marks.
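A common supervised few-shot setup classifies a new symbol by comparing its embedding to per-class prototypes averaged from the few labeled examples available. The sketch below shows that nearest-prototype idea with toy 2-D "embeddings"; the class names and vectors are invented, and the thesis model is a deep network, not this:

```python
import math
from collections import defaultdict

def build_prototypes(support_set):
    """Average the few labeled embedding vectors of each symbol class."""
    grouped = defaultdict(list)
    for vec, label in support_set:
        grouped[label].append(vec)
    return {label: tuple(sum(dim) / len(vecs) for dim in zip(*vecs))
            for label, vecs in grouped.items()}

def classify(prototypes, vec):
    """Assign the class whose prototype is nearest in Euclidean distance."""
    return min(prototypes, key=lambda label: math.dist(prototypes[label], vec))

# Two toy symbol classes with two labeled examples ("shots") each
support = [((0.0, 0.1), "zodiac_sun"), ((0.1, 0.0), "zodiac_sun"),
           ((1.0, 0.9), "zodiac_moon"), ((0.9, 1.0), "zodiac_moon")]
prototypes = build_prototypes(support)
print(classify(prototypes, (0.2, 0.1)))  # zodiac_sun
```

The appeal for cipher transcription is that a handful of transcribed examples per symbol type suffices, instead of the large labeled sets conventional classifiers need.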
117.
Towards the creation of a Clinical Summarizer. Gunnarsson, Axel. January 2022.
While Electronic Medical Records provide extensive information about patients, the vast amounts of data make it difficult to quickly retrieve the information needed to make accurate assessments and decisions directly concerned with patients’ health. This search process is naturally time-consuming and forces health professionals to focus on a labor-intensive task that diverts their attention from the main task of applying their knowledge to save lives. With the general aim of relieving professionals of this task of finding the information needed for an operational decision, this thesis explores the use of a general BERT model for extractive summarization of Swedish medical records, investigating its capability to extract sentences that convey important information to MRI physicists. To achieve this, a domain expert evaluation of medical histories was performed, creating the reference summaries used for model evaluation. Three implementations are included in this study, one of which is TextRank, a prominent unsupervised approach to extractive summarization. The other two are based on clustering and rely on BERT to encode the text. The implementations are evaluated using ROUGE metrics. The results support the use of a general BERT model for extractive summarization of medical records. Furthermore, the results are discussed in relation to the collected reference summaries, leading to a discussion about potential improvements to the domain expert evaluation, as well as possibilities for future work on the summarization of clinical documents.
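The ROUGE metrics used for evaluation reduce, in their simplest form, to n-gram overlap between a system summary and a reference summary. A minimal ROUGE-1 (unigram) sketch with invented example sentences:

```python
from collections import Counter

def rouge_1(candidate, reference):
    """ROUGE-1 precision, recall, and F1 from clipped unigram overlap."""
    cand, ref = Counter(candidate.split()), Counter(reference.split())
    overlap = sum((cand & ref).values())  # each unigram counted at most min(cand, ref) times
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return precision, recall, f1

p, r, f = rouge_1("the patient has a pacemaker",
                  "patient has a pacemaker implanted")
print(round(f, 2))  # 0.8
```

Recall is the usual headline number for extractive summarization, since it measures how much of the domain expert's reference the extracted sentences cover.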
118.
Automatic Post-editing and Quality Estimation in Machine Translation of Product Descriptions. Kukk, Kätriin. January 2022.
As a result of drastically improved machine translation quality in recent years, machine translation followed by manual post-editing is a trend in the language industry that is slowly but surely replacing manual translation from scratch. In this thesis, the applicability of machine translation to product descriptions of clothing items is studied. The focus lies on determining whether automatic post-editing is a viable approach for improving baseline translations when new training data becomes available, and on finding out whether there is an existing quality estimation system that can reliably assign quality scores to machine-translated texts. Machine translation is shown to be a promising approach for the target domain: according to the human evaluation carried out, the majority of the systems experimented with generate translations that on average are of almost publishable quality, meaning that only light post-editing is needed before they can be published. Automatic post-editing is shown to improve the worst baseline translations but struggles to improve overall translation quality due to its tendency to overcorrect good translations. Nevertheless, one of the trained post-editing systems is still rated higher than the baseline by human evaluators. A new finding is that training a post-editing model on more data with worse translations leads to better performance than training on less but higher-quality data. None of the quality estimation systems experimented with shows a strong correlation with human evaluation results, which is why it is suggested not to provide the confidence scores of the baseline model to the human evaluators responsible for correcting and approving translations.
The main contributions of this work are showing that the target domain of product descriptions is suitable for integrating machine translation into the translation workflow, proposing a translation workflow that is more automated than the current one, and the finding that it is better to train an automatic post-editing system on more data with poorer translations than on less data with higher-quality translations.
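The reported lack of correlation between quality estimation scores and human judgments is typically quantified with Pearson's r. A minimal sketch (the QE confidences and human ratings below are invented):

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)

qe_scores = [0.91, 0.42, 0.77, 0.55]  # hypothetical QE confidences
human_scores = [4, 1, 3, 2]           # hypothetical human adequacy ratings
print(round(pearson_r(qe_scores, human_scores), 3))
```

An r near 1 would justify showing QE confidences to post-editors; the thesis's finding of weak correlations is the reason it advises against doing so.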
119.
Translation Memory System Optimization: How to effectively implement translation memory system optimization. Chau, Ting-Hey. January 2015.
Translation of technical manuals is expensive, especially when a larger company needs to publish manuals for their whole product range in over 20 different languages. When a text segment (i.e. a phrase, sentence, or paragraph) is manually translated, we would like to reuse these translated segments in future translation tasks. A translated segment is stored with its corresponding source language, often called a language pair, in a Translation Memory System. A language pair in a Translation Memory represents a Translation Entry, also known as a Translation Unit. During a translation, when a text segment in a source document matches a segment in the Translation Memory, the available target languages in the Translation Unit will not require a human translation: the previously translated segment can be inserted into the target document. Such functionality is provided in the single-source publishing software Skribenta, developed by Excosoft. Skribenta requires text segments in source documents to find an exact or a full match in the Translation Memory in order to apply a translation to a target language. A full match can only be achieved if a source segment is stored in a standardized form, which requires manual tagging of entities and often reoccurring words such as model names and product numbers. This thesis investigates different ways to improve and optimize a Translation Memory System. One way was to aid users with the work of manually tagging entities, by developing heuristic algorithms to approach the problem of Named Entity Recognition (NER). The evaluation results from the developed heuristic algorithms were compared with the results from an off-the-shelf NER tool developed by Stanford. The results show that the developed heuristic algorithms are able to achieve a higher F-measure than the Stanford NER, and may be a great initial step toward helping Excosoft’s users improve their Translation Memories.
/ Translation of technical manuals is very costly, especially when larger organizations need to publish product manuals for their whole range in over 20 different languages. Once a text (e.g. a phrase, sentence, or paragraph) has been translated, we want to be able to reuse the translated text in future translation projects and documents. The translated texts are stored in a Translation Memory. Each text is stored in its source language together with its translation into another language, the so-called target language, and together they form a language pair in a Translation Memory System. A language pair stored in a Translation Memory constitutes a Translation Entry, also known as a Translation Unit. If a match is found when searching the Translation Memory for a given text string in the source language, translations into all available target languages for that string are returned, and these can in turn be inserted into the target document. Such functionality is offered in the publishing software Skribenta, developed by Excosoft. To apply a translation to a target language, Skribenta requires that text in the source language finds an exact or a so-called full match in the Translation Memory. A full match can only be achieved if a text is stored in a standardized form, which requires manual tagging of entities and frequently occurring words such as model names and product numbers. In this thesis I investigate how to effectively implement an optimization of a Translation Memory System, in part by facilitating the manual tagging of entities. This has been done with various heuristics that approach the problem of Named Entity Recognition (NER). The results from the developed heuristics have been compared with the results from the NER tool developed by Stanford. The results show that the heuristics I developed achieve a higher F-measure than the Stanford NER and can therefore be a good initial step toward helping Excosoft's users improve their Translation Memories.
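The heuristic entity tagging described above can be approximated with simple pattern rules, for example flagging product-number-like tokens in a source segment. An illustrative sketch; the pattern and example segment are invented, not Excosoft's actual rules:

```python
import re

# Hypothetical heuristic: tokens mixing uppercase letters and digits
# (optionally hyphenated) tend to be model names or product numbers.
PRODUCT_PATTERN = re.compile(r"\b[A-Z]+[A-Z0-9]*-?\d+[A-Z0-9]*\b")

def tag_entities(segment):
    """Return candidate entity tokens found in a source segment."""
    return PRODUCT_PATTERN.findall(segment)

print(tag_entities("Install the XR-2000 filter before starting the QX5 pump."))
# ['XR-2000', 'QX5']
```

Tagging such tokens lets differently numbered variants of the same sentence collapse to one standardized segment, raising the chance of a full match in the Translation Memory.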
120.
The Impact of Semantic and Stylistic Features in Genre Classification for News. Pei, Ziming. January 2022.
In this thesis, we investigate the usefulness of a group of features in genre classification for news. We choose a diverse feature set covering features related to the content and style of the texts. The features are divided into two groups: semantic and stylistic. More specifically, the semantic features include genre-exclusive words, emotional words, and synonyms; the stylistic features include character-level and document-level features. We use three traditional machine learning classification models and one neural network model to evaluate the effects of our features: Support Vector Machine, Complement Naive Bayes, k-Nearest Neighbors, and Convolutional Neural Networks. The results are evaluated by F1 score, precision, and recall (both micro- and macro-averaged). We compare the performance of different models to find the optimal feature set for this news genre classification task, and meanwhile seek the most suitable classifier. We show that genre-exclusive words and synonyms are beneficial to the classification task, in that they are the most informative features in the training process, whereas emotional words have a negative effect on the results. The best result, a macro-averaged F1 score, precision, and recall of 0.97, is achieved by the Complement Naive Bayes model on the feature set combining the preprocessed dataset with its context-based synonym sets. We discuss the results achieved in the experiments and the best-performing models, answer the research questions, and provide suggestions for future studies.
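The macro-averaged scores reported above average per-class metrics so that each genre counts equally regardless of its size. A minimal macro-F1 sketch (the genre labels are toy examples):

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: per-class F1 scores averaged with equal weight,
    so small genres weigh as much as large ones."""
    scores = []
    for cls in sorted(set(y_true)):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        scores.append(2 * precision * recall / (precision + recall)
                      if precision + recall else 0.0)
    return sum(scores) / len(scores)

y_true = ["news", "news", "sport", "sport"]
y_pred = ["news", "sport", "sport", "sport"]
print(round(macro_f1(y_true, y_pred), 3))  # 0.733
```

Micro-averaging, by contrast, pools all decisions before computing the metric, so frequent genres dominate; reporting both, as the thesis does, shows whether performance holds across rare genres too.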