1 |
The French Chair
Grossi, Roberta 17 December 2011
This play is a comedy that revolves around the importance of a French chair (Louis XIII) in the life of a family. A young newlywed woman discovers that her husband has a "secret" half-sister born out of wedlock. Her mother-in-law had an affair with a Frenchman, but her husband always believed his mother had been abused. The young woman senses that something is unclear and decides to look for her "half" sister-in-law. She finds her and manages to organize a get-together to bring everyone to the same place and finally understand what really happened. Eventually her mother-in-law, now a widow, falls in love again with her former lover, and they decide to marry. The French chair had been a source of quarrels within the young couple and with the mother-in-law. But in the end we learn that the young woman's half sister-in-law was conceived on this very chair.
|
2 |
AI-POWERED TEXT ANALYSIS TOOL FOR SENTIMENT ANALYSIS
Kebede, Dani; Tesfai, Naod January 2023
In today’s digital era, text data plays a ubiquitous role across various domains. This bachelor thesis focuses on sentiment analysis, specifically the task of classifying text as positive, negative, or neutral with the help of an AI tool. The key research questions are: (1) How can an accurate sentiment classification system be developed to categorize customer reviews as positive, negative, or neutral? (2) How can the performance of the sentiment analysis tool be optimized and evaluated, considering the factors that influence its accuracy? (3) How does Chat-GPT evaluate text-based feedback from customers with our results as input, i.e., can "Artificial General Intelligence" be adapted to solve a specific problem in the domain of this work? To answer these questions, the study uses RoBERTa, a transformer model renowned for its strength in natural language processing tasks. The model focuses mainly on Amazon review comments for one product, the Samsung Galaxy A53. A small comparative analysis is also carried out between the sentiment labels assigned by Chat-GPT and those assigned by the RoBERTa model. The results demonstrate the effectiveness of the RoBERTa model in sentiment classification, showing its ability to categorize sentiments for different review comments. The evaluation process identified key factors that influence the tool’s performance and provided insights into areas for further improvement. In conclusion, this thesis contributes to the field of sentiment analysis by providing a comprehensive overview of the development, optimization, and evaluation of an AI-powered text analysis tool for the sentiment classification of customer reviews. The results underline the importance of understanding customer sentiment and offer practical implications for businesses seeking to improve their decision-making processes and enhance customer satisfaction.
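As a minimal sketch of the last step of the three-way classification described above, the snippet below turns a vector of raw model scores (logits) into one of the three sentiment labels. The logit values, the order of `LABELS`, and the `classify` helper are assumptions made for illustration; no RoBERTa model is loaded here, and the thesis may post-process its outputs differently.

```python
import math

# Assumed label order; a real checkpoint defines its own id-to-label mapping.
LABELS = ("negative", "neutral", "positive")


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits):
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


# Hypothetical logits for a single review comment.
label, confidence = classify([-1.2, 0.3, 2.1])
print(label, round(confidence, 3))
```

Keeping the probability alongside the label makes it possible to flag low-confidence reviews for manual inspection instead of forcing every comment into one of the three classes.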
|
3 |
Detection of suicidal ideation in written communication
Bernsland, Melina January 2023
Suicide remains a global cause of mortality, presenting challenges in detection and prevention despite known warning signs. This work aimed to improve personal security management by leveraging machine learning advancements to identify suicidal ideation in written communications. Using a design science approach, six machine learning models based on the RoBERTa model were developed with different hyperparameter values. These models were trained on a well-balanced dataset comprising 1,114 instances of suicide letters and social media posts. The model achieving the highest accuracy (0.919) and F1 score (0.919) during training was evaluated on a dataset consisting of posts from the subreddits r/terraluna and r/Terra_Luna_crypto. These posts were published during a period when the cryptocurrency Terra Luna experienced a crash, leading to reported cases of alleged suicides. The fine-tuned model demonstrated a reasonably high accuracy (0.841) and weighted F1 score (0.913) when tested on this real-world dataset. Additionally, a smaller test was conducted on 34 selected posts from this dataset that contained mentions of specific words. The model achieved an accuracy of 0.852 and a weighted F1 score of 0.887 when classifying these posts. There is considerable potential for further research and development in this field. By expanding and improving the dataset used in this project and by incorporating additional features and contextual information, the accuracy and practicality of the model in real-life situations can be greatly enhanced. The ultimate objective is to create a resilient system that genuinely assists in the prevention of suicide. The results of this work offer hope and optimism for a future where advanced technology, combined with human compassion, addresses one of the most pressing public health issues of our time.
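The abstract above reports both accuracy and a weighted F1 score, and the two can diverge on imbalanced data. The sketch below computes both from scratch, assuming that "weighted F1" means the per-class F1 averaged with class support as the weight (the usual convention; the thesis does not spell out its definition). The toy labels are invented and are not taken from the thesis data.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with each class's support as its weight."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, count in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = count - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (count / total) * f1
    return score


# Invented, imbalanced toy data: 0 = no ideation, 1 = ideation.
y_true = [0] * 9 + [1]
y_pred = [0] * 7 + [1, 1] + [1]

acc = accuracy(y_true, y_pred)
wf1 = weighted_f1(y_true, y_pred)
print(acc, wf1)
```

On this toy data the weighted F1 comes out higher than the accuracy, because the majority class contributes most of the weight and its F1 is high even though two of its posts are misclassified.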
|
4 |
Java Syntax Error Repair Using RoBERTa
Xiang, Ziyi January 2022
Deep learning has achieved promising results for automatic program repair (APR). In this paper, we revisit this topic and propose an end-to-end approach, Classfix, to correct Java syntax errors. Classfix uses a RoBERTa classification model to localize the error and a RoBERTa encoder-decoder model to repair the located buggy line. Our work introduces a new localization method that enables us to fix a program of arbitrary length. Our approach categorises errors into symbol errors and word errors. We conduct a large-scale experiment to evaluate Classfix, and the results show that Classfix is able to repair 75.5% of symbol errors and 64.3% of word errors. In addition, Classfix achieves 97% and 84.7% accuracy in locating symbol errors and word errors, respectively.
|
5 |
Chicanská kulturní identita v USA: Tomás Rivera a Roberta Fernándezová / Chicano cultural identity in the USA: Tomás Rivera and Roberta Fernández
Paclíková, Edita January 2012
The thesis focuses on the theme of the cultural identity of Mexican Americans. The introduction is based on the shared history of Mexico and the United States of America (the question of immigration, the Chicano Movement, Chicano Spanish). Attention is paid to the conception of Mexican American literature and to the essayistic, poetic and narrative work of Tomás Rivera, the major representative of Chicano Movement literature. The most important part of this work is an analysis of some distinctive motifs in Rivera's cycle ...And the Earth Did Not Devour Him that create a picture of Mexican American life (the motifs of religion, despair, journey, etc.). To understand Mexican American literature as a whole (i.e. the literature of the Chicano Movement and Chicana literature), Rivera's work is compared with Roberta Fernández's novel in six stories, Fronterizas.
|
6 |
Efficient Sentiment Analysis and Topic Modeling in NLP using Knowledge Distillation and Transfer Learning / Effektiv sentimentanalys och ämnesmodellering inom NLP med användning av kunskapsdestillation och överföringsinlärning
Malki, George January 2023
This abstract presents a study in which knowledge distillation techniques were applied to a Large Language Model (LLM) to create smaller, more efficient models without sacrificing performance. Three configurations of the RoBERTa model were selected as "student" models to gain knowledge from a pre-trained "teacher" model. The study also measures the models' performance on two downstream tasks, sentiment analysis and topic modeling. Multiple steps were used to improve the knowledge distillation process, such as copying some weights from the teacher to the student model and defining a custom loss function. The task selected for the knowledge distillation process was sentiment analysis on the Amazon Reviews for Sentiment Analysis dataset. The resulting student models showed promising performance on the sentiment analysis task, capturing sentiment-related information from text. The smallest of the student models obtained 98% of the performance of the teacher model while being 45% lighter and taking less than a third of the time to analyze the entire IMDB Dataset of 50K Movie Reviews. However, the student models struggled to produce meaningful results on the topic modeling task. These results were consistent with the topic modeling results from the teacher model. In conclusion, the study showcases the efficacy of knowledge distillation techniques in enhancing the performance of LLMs on specific downstream tasks. While the model excelled in sentiment analysis, further improvements are needed to achieve desirable outcomes in topic modeling. These findings highlight the complexity of language understanding tasks and emphasize the importance of ongoing research and development to further advance the capabilities of NLP models.
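The abstract mentions a custom loss function for distillation without giving its form. As a hedged illustration only, the sketch below shows the widely used generic formulation: a temperature-softened KL term that pulls the student toward the teacher, blended with ordinary cross-entropy on the true label. The function names, the `temperature` and `alpha` defaults, and the example logits are all assumptions and should not be read as the thesis's actual loss.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperature gives softer probabilities."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=2.0, alpha=0.5):
    """Blend a soft-target KL term with hard-label cross-entropy.

    alpha weights the teacher-matching term; (1 - alpha) weights the
    cross-entropy on the ground-truth label. Both defaults are illustrative.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on the softened distributions.
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Standard cross-entropy of the student at temperature 1.
    hard = -math.log(softmax(student_logits)[true_label])
    # The temperature**2 factor keeps the soft term's gradient scale comparable.
    return alpha * temperature ** 2 * kl + (1 - alpha) * hard


# Hypothetical two-class example (0 = negative, 1 = positive).
teacher = [-1.0, 2.0]
agreeing_student = [-1.0, 2.0]
disagreeing_student = [2.0, -1.0]

print(distillation_loss(agreeing_student, teacher, true_label=1))
print(distillation_loss(disagreeing_student, teacher, true_label=1))
```

A student that reproduces the teacher's logits pays only the cross-entropy part of the loss, while a student that disagrees with the teacher is penalized by both terms.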
|
7 |
Analyzing the Anisotropy Phenomenon in Transformer-based Masked Language Models / En analys av anisotropifenomenet i transformer-baserade maskerade språkmodeller
Luo, Ziyang January 2021
In this thesis, we examine the anisotropy phenomenon in popular masked language models, BERT and RoBERTa, in detail. We propose a possible explanation for this unreasonable phenomenon. First, we demonstrate that the contextualized word vectors derived from pretrained masked language model-based encoders share a common, perhaps undesirable pattern across layers. Namely, we find cases of persistent outlier neurons within BERT and RoBERTa's hidden state vectors that consistently bear the smallest or largest values in said vectors. In an attempt to investigate the source of this information, we introduce a neuron-level analysis method, which reveals that the outliers are closely related to information captured by positional embeddings. Second, we find that a simple normalization method, whitening, can make the vector space isotropic. Lastly, we demonstrate that "clipping" the outliers or whitening makes it possible to distinguish word senses more accurately and leads to better sentence embeddings when mean pooling is used.
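To make the effect of outlier neurons concrete, the sketch below measures anisotropy as the average pairwise cosine similarity between vectors (a common proxy; the thesis may use a different measure) and then clips values to a fixed range. The toy vectors, the `clip` helper, and the `limit` value are hypothetical, and clamping is only one possible reading of "clipping"; whitening is not shown.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_cosine(vectors):
    """Mean cosine similarity over all distinct pairs; near 1 means anisotropic."""
    pairs = list(combinations(vectors, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


def clip(vectors, limit):
    """Clamp every coordinate into [-limit, limit] to tame outlier dimensions."""
    return [[max(-limit, min(limit, x)) for x in v] for v in vectors]


# Toy "hidden states": the last dimension plays the role of an outlier neuron.
vectors = [
    [0.5, -0.3, 20.0],
    [-0.4, 0.6, 21.0],
    [0.2, 0.1, 19.0],
]

before = average_cosine(vectors)
after = average_cosine(clip(vectors, limit=1.0))
print(before, after)
```

Because one dimension dominates every vector, all pairs point in almost the same direction before clipping; once that dimension is clamped, the remaining coordinates regain influence and the average similarity drops.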
|
8 |
Analysis of Syntactic Behaviour of Neural Network Models by Using Gradient-Based Saliency Method: Comparative Study of Chinese and English BERT, Multilingual BERT and RoBERTa
Zhang, Jiayi January 2022
Neural network models such as the Transformer-based BERT, mBERT and RoBERTa achieve impressive performance (Devlin et al., 2019; Lewis et al., 2020; Liu et al., 2019; Raffel et al., 2020; Y. Sun et al., 2019), but we still know little about their inner workings because of the complex techniques, such as multi-head self-attention, that they implement. Attention is commonly taken as a crucial way to explain model outputs, but recent studies argue that attention may not provide faithful and reliable explanations (Jain and Wallace, 2019; Pruthi et al., 2020; Serrano and Smith, 2019; Wiegreffe and Pinter, 2019). Bastings and Filippova (2020) therefore propose that saliency may give better model interpretations, since it is designed to find which tokens contribute to the prediction, i.e. the exact goal of explanation. In this thesis, we investigate the extent to which syntactic structure is reflected in BERT, mBERT and RoBERTa trained on English and Chinese by using a gradient-based saliency method introduced by Simonyan et al. (2014). We examine the dependencies that our models and baselines predict. We find that our models can predict some dependencies, especially those with a shorter mean distance and more fixed positions of heads and dependents, even though all our models can handle global dependencies in theory. In addition, BERT usually has the highest overall accuracy in connecting dependents to their corresponding heads, followed by mBERT and RoBERTa. Yet all three models in fact have similar results on individual relations. Moreover, models trained on English perform better than models trained on Chinese, possibly because of the flexibility of the Chinese language.
|
9 |
Cyberbullying Detection Using Weakly Supervised and Fully Supervised Learning
Abhishek, Abhinav 22 September 2022
No description available.
|
10 |
EXPLORING ENSEMBLE MODELS AND GAN-BASED APPROACHES FOR AUTOMATED DETECTION OF MACHINE-GENERATED TEXT
Surbhi Sharma (18437877) 29 April 2024
Automated detection of machine-generated text has become increasingly crucial in various fields such as cybersecurity, journalism, and content moderation due to the proliferation of generated content, including fake news, spam, and bot-generated comments. Traditional methods for detecting such content often rely on rule-based systems or supervised learning approaches, which may struggle to adapt to evolving generation techniques and sophisticated manipulations. In this thesis, we explore the use of ensemble models and Generative Adversarial Networks (GANs) for the automated detection of machine-generated text.
Ensemble models combine the strengths of different approaches, such as utilizing both rule-based systems and machine learning algorithms, to enhance detection accuracy and robustness. We investigate the integration of linguistic features, syntactic patterns, and semantic cues into machine learning pipelines, leveraging the power of Natural Language Processing (NLP) techniques. By combining multiple modalities of information, Ensemble models can effectively capture the subtle characteristics and nuances inherent in machine-generated text, improving detection performance.
In my latest experiments, I examined the performance of a Random Forest classifier trained on TF-IDF representations in combination with RoBERTa embeddings to calculate probabilities for machine-generated text detection. Test1 results showed promising accuracy rates, indicating the effectiveness of combining TF-IDF with RoBERTa probabilities. Test2 further validated these findings, demonstrating improved detection performance compared to standalone approaches.
These results suggest that leveraging Random Forest TF-IDF representation with RoBERTa embeddings to calculate probabilities can enhance the detection accuracy of machine-generated text.
Furthermore, we delve into the application of GAN-RoBERTa, a class of deep learning models comprising a generator and a discriminator trained adversarially, for generating and detecting machine-generated text. GANs have demonstrated remarkable capabilities in generating realistic text, making them a potential tool for adversaries to produce deceptive content. However, this same adversarial nature can be harnessed for detection purposes, where the discriminator is trained to distinguish between genuine and machine-generated text.
Overall, our findings suggest that the use of Ensemble models and GAN-RoBERTa architectures holds significant promise for the automated detection of machine-generated text. Through a combination of diverse approaches and adversarial training techniques, we have demonstrated improved detection accuracy and robustness, thereby addressing the challenges posed by the proliferation of generated content across various domains. Further research and refinement of these approaches will be essential to stay ahead of evolving generation techniques and ensure the integrity and trustworthiness of textual content in the digital landscape.
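The abstract does not say how the Random Forest (TF-IDF) output and the RoBERTa output are combined. One simple possibility, shown here purely as an assumption, is soft voting: a weighted average of the two probabilities that a text is machine-generated. The `soft_vote` helper, its weights, the threshold, and the example probabilities are all hypothetical.

```python
def soft_vote(p_rf, p_roberta, weight_rf=0.5, threshold=0.5):
    """Weighted average of two detector probabilities (soft voting).

    p_rf and p_roberta are each detector's probability that the text is
    machine-generated; weight_rf is the share given to the Random Forest.
    """
    if not 0.0 <= weight_rf <= 1.0:
        raise ValueError("weight_rf must lie in [0, 1]")
    combined = weight_rf * p_rf + (1.0 - weight_rf) * p_roberta
    label = "machine-generated" if combined >= threshold else "human-written"
    return label, combined


# Hypothetical case: the two detectors disagree about the same text.
label, combined = soft_vote(p_rf=0.4, p_roberta=0.9)
print(label, round(combined, 2))
```

Averaging lets a confident detector outweigh an uncertain one, which is one way an ensemble can outperform either model on its own; in practice the weight would be tuned on held-out data.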
|