11 |
Multicultural Emotional Reasoning in Vision Language Models
Mohamed, Youssef Sherif Mansour 03 1900
Human intelligence, with its many components, has been elusive. Until recently, the emphasis has been on facts and how humans perceive them; it is now time to enrich these facts with emotions and commentary. Emotional experiences and expressions play a critical role in human behavior and are shaped by language and cultural diversity. In this thesis, we explore the importance of emotions across multiple languages, such as Arabic, Chinese, and Spanish, and argue for the importance of collecting diverse emotional experiences, including negative ones. We aim to develop AI systems with a deeper understanding of emotional experiences. We open-source two datasets that emphasize diversity across emotions, languages, and cultures. ArtELingo contains affective annotations in the aforementioned languages, revealing valuable insights into how linguistic backgrounds shape emotional perception and expression, while ArtEmis 2.0 offers a balanced distribution of positive and negative emotional experiences. Studying emotional experiences in AI is crucial for creating applications that genuinely understand and resonate with users.
We identify and tackle challenges in popular existing affective captioning datasets, mainly unbalanced emotion distributions and generic captions, by proposing a contrastive data collection method. This approach results in a dataset with a balanced distribution of emotions, significantly enhancing the quality of trained neural speakers and emotion recognition models. Consequently, our trained speakers generate emotionally accurate and relevant captions, demonstrating the advantages of using a linguistically and emotionally diverse dataset in AI systems.
In addition, we explore the cultural aspects of emotional experiences and expressions, highlighting the importance of considering cultural differences in the development of AI applications. By incorporating these insights, our research lays the groundwork for future advancements in culturally diverse affective computing.
This thesis establishes a foundation for future research in emotionally and culturally diverse affective computing, contributing to the development of AI applications capable of effectively understanding and engaging with humans on a deeper emotional level, regardless of their cultural background.
|
12 |
Predicting the Unpredictable – Using Language Models to Assess Literary Quality
Wu, Yaru January 2023
People read for various purposes: to learn specific skills, to acquire foreign languages, or simply to enjoy the reading experience. That pure enjoyment may be credited to many aspects, such as the aesthetics of language, the beauty of rhyme, and the entertainment of being surprised by what happens next; the last of these is typical of fictional narratives and is the main topic of this project. In other words, “good” fiction may be better at entertaining readers by baffling and eluding their expectations, whereas “normal” narratives may contain more clichés and ready-made sentences that are easy to predict. This project therefore examines whether “good” fiction is less predictable than “normal” fiction, with the two groups operationalized as canonized and non-canonized literature. Predictability is statistically reflected by the probability of the next word being correctly predicted given the previous content, and is measured in terms of perplexity. Thanks to recent advances in deep learning, neural language models with billions of parameters can now be trained on terabytes of text to improve their performance in predicting unseen text. Generative pre-trained modeling and text generation are therefore combined to estimate the perplexities of canonized and non-canonized literature. Because the text on which advanced models were pre-trained may already contain the studied books, two series of models are designed to yield unbiased perplexity results: self-trained models and Generative Pre-trained Transformer 2 (GPT-2) models. Comparisons between these two groups of results establish the final hierarchy of architectures, constituted by five models for the further experiments.
During perplexity estimation, the perplexity variance can be computed at the same time; it denotes how predictability varies across fixed-length sequences within each piece of literature. Evaluated by this variance, the homogeneity of the two groups of literature can also be examined. The results from the five models indicate distinctions in both perplexity values and variances between canonized and non-canonized literature. Moreover, canonized literature shows higher perplexity values and variances in both median and mean, meaning it is less predictable and less homogeneous than non-canonized literature. The perplexity values and variances cannot, of course, define literary quality directly. They do, however, signal that perplexity can be an insightful metric for literary quality analysis using natural language processing techniques.
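The perplexity and perplexity-variance measures at the heart of this comparison can be sketched in a few lines, assuming per-token probabilities are available from any language model (the function names and toy probabilities below are illustrative, not taken from the thesis):

```python
import math

def perplexity(token_probs):
    """Perplexity of a sequence given the model's probability for each
    observed token: exp of the mean negative log-probability."""
    assert token_probs and all(0.0 < p <= 1.0 for p in token_probs)
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

def perplexity_variance(token_probs, window):
    """Variance of per-window perplexities: a rough proxy for how
    predictability fluctuates across a text (the homogeneity measure)."""
    ppls = [perplexity(token_probs[i:i + window])
            for i in range(0, len(token_probs) - window + 1, window)]
    mean = sum(ppls) / len(ppls)
    return sum((x - mean) ** 2 for x in ppls) / len(ppls)

# A text whose tokens the model finds likely has low perplexity.
predictable = [0.9, 0.8, 0.9, 0.85]
surprising = [0.2, 0.1, 0.3, 0.15]
```

In practice the per-token probabilities would come from the self-trained or GPT-2 models; the same exp-of-mean-negative-log-likelihood formula applies either way.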
|
13 |
Smart Compose for Live Chat Agent / Kundtjänstens automatiska kompletteringssystem
Zhang, Tonghua January 2021
In the digital business environment, customer service communication has become a labor-intensive task. Given high labor costs, automatic customer service could be a good alternative for many companies. However, communication with customers cannot be easily automated: customer service staff need task-specific knowledge and information that automated systems are unable to supply. Industries with frequent consumer communication therefore need a semi-automatic completion system to cut manpower costs. In this thesis project, I used the GPT-2 model pre-trained by OpenAI and fine-tuned it on the MultiWOZ dataset in an unsupervised way to train a full-fledged, task-oriented language model. On the basis of this autoregressive language model, I designed and deployed an auto-completion system that predicts the words or sentences a user may type next and provides quick completion suggestions for the subsequent dialogue. I then evaluated the performance of the language model and the practicability of the auto-completion system, and proposed a possible optimization framework to balance the system’s endogenous contradictions.
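The confidence-gated completion idea behind such a system can be sketched as follows, with a toy bigram table standing in for the fine-tuned GPT-2 (the threshold value, names, and table entries are my assumptions, not the thesis's):

```python
def suggest(model, last_word, min_conf=0.7, max_words=5):
    """Greedily extend the dialogue while the model's top next-word
    probability stays above min_conf; otherwise surface nothing more.
    This mimics showing only high-confidence completions to the agent."""
    completion = []
    cur = last_word
    for _ in range(max_words):
        dist = model.get(cur)
        if not dist:
            break
        word, prob = max(dist.items(), key=lambda kv: kv[1])
        if prob < min_conf:
            break  # model is unsure: stop the suggestion here
        completion.append(word)
        cur = word
    return completion

# toy next-word distributions conditioned on the previous word
bigram = {
    "how": {"can": 0.8, "do": 0.2},
    "can": {"i": 0.9, "we": 0.1},
    "i": {"help": 0.5, "track": 0.3, "cancel": 0.2},
}
```

With this table, typing "how" yields the suggestion "can i", and the suggestion stops where the distribution flattens out.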
|
14 |
Transformer-based Source Code Description Generation : An ensemble learning-based approach / Transformatorbaserad Generering av Källkodsbeskrivning : En ensemblemodell tillvägagångssätt
Antonios, Mantzaris January 2022
Code comprehension can benefit significantly from high-level source code summaries. For most developers, understanding another developer’s code, or code they themselves wrote in the past, is a time-consuming and frustrating task. Yet it is necessary in software maintenance and whenever several people work on the same project. A fast, reliable, and informative source code description generator can automate this procedure, which developers often avoid. The rise of Transformers has drawn attention to them, leading to various Transformer-based models that tackle source code summarization from different perspectives. Most of these models, however, are treated as competitors, although their complementarity could prove beneficial. To this end, an ensemble learning-based approach is followed to explore the feasibility and effectiveness of combining more than one powerful Transformer-based model. The base models used are PLBart and GraphCodeBERT, two models with different focuses, and the ensemble technique is stacking. The results show that such a model can improve on the performance and informativeness of the individual models. However, it requires changes to the configuration of the respective models, which might harm them, as well as further fine-tuning at the aggregation phase to find the most suitable combination of base-model weights and next-token probabilities for the ensemble at hand. The results also revealed the need for human evaluation, since metrics like BiLingual Evaluation Understudy (BLEU) are not always representative of the quality of the produced summary. Even though the outcome is promising, further work driven by this approach, and addressing the limitations left unresolved here, should follow toward a potential State Of The Art (SOTA) model.
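The aggregation step that combines the base models' next-token probabilities can be sketched as a weighted average (the toy distributions and weights below are illustrative; in the thesis the combination is tuned rather than hand-set):

```python
def ensemble_next_token(dists, weights):
    """Combine next-token distributions from several models by a
    weighted average of probabilities, then renormalise."""
    assert len(dists) == len(weights)
    total = sum(weights)
    combined = {}
    for dist, w in zip(dists, weights):
        for tok, p in dist.items():
            combined[tok] = combined.get(tok, 0.0) + w * p / total
    z = sum(combined.values())
    return {tok: p / z for tok, p in combined.items()}

# toy next-token distributions from two base models (invented numbers)
plbart = {"returns": 0.6, "computes": 0.3, "prints": 0.1}
graphcodebert = {"computes": 0.7, "returns": 0.2, "checks": 0.1}
```

Shifting the weights toward one base model shifts which token the ensemble prefers, which is exactly what makes the weight search at the aggregation phase necessary.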
|
15 |
Kan artificiell intelligens skapa en trovärdig hållbarhetsrapport? : En kvantitativ studie om förmågan att urskilja AI-genererade texter från mänskligt skrivna texter samt vilka faktorer som påverkar. / Can artificial intelligence create a credible sustainability report? : A quantitative study on the ability to distinguish AI-generated texts from human-written texts and which factors influence.
Högberg, Arvid; Karlström, William January 2024
Title: Can artificial intelligence create a credible sustainability report? A quantitative study on the ability to distinguish AI-generated texts from human-written texts and which factors influence.
Level: Bachelor’s degree thesis in business administration
Authors: Arvid Högberg and William Karlström
Supervisor: Asif M. Huq
Date: May 2024
Purpose: The experiment aims to see whether factors such as work experience, educational background, age, and gender affect respondents’ ability to distinguish AI-generated text from human-produced text.
Method: The study was carried out as a quantitative survey with 84 respondents. It opened with demographic questions, followed by two experiments in which two texts were presented and the respondent decided which was AI-generated. The analysis methods used were ANOVA (Analysis of Variance), frequency tables, and correlation analyses. The respondents were economics students, educators in economics, and professionals in accounting and auditing.
Results and conclusion: The study showed that factors such as gender, age, and experience had no significant impact on the ability to distinguish AI-generated text, but higher AI usage increased the chance of doing so.
Contribution of the thesis: The study contributes valuable information about what affects the ability to distinguish AI-generated text from human-written text. The results show that demographic factors play no significant role, whereas AI use has some impact. The study also shows how far AI technology has come and how difficult it can be to tell AI-generated text from human-written text.
Suggestions for further research: Future research should explore the effects of AI training and education on people’s ability to distinguish between different types of text, as well as the use of AI for text generation in various industries and how this affects that ability.
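The reported association between AI usage and detection success is the kind of relationship a correlation analysis captures; a minimal sketch follows, with invented numbers standing in for the study's data:

```python
def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# usage: self-reported AI-usage level; correct: 1 if the respondent
# identified the AI-generated text, 0 otherwise (toy data, not the study's)
usage = [1, 1, 2, 2, 3, 3, 4, 4]
correct = [0, 0, 0, 1, 1, 1, 1, 1]
```

A positive coefficient on such data would mirror the study's finding that heavier AI users spot AI-generated text more often.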
|
16 |
Generating and simplifying sentences / Génération et simplification des phrases
Narayan, Shashi 07 November 2014
Depending on the input representation, this dissertation investigates issues from two classes: meaning representation (MR)-to-text and text-to-text generation. In the first class (MR-to-text generation, "Generating Sentences"), we investigate how to make symbolic grammar-based surface realisation robust and efficient. We propose an efficient approach to surface realisation using an FB-LTAG and taking shallow dependency trees as input. Our algorithm combines techniques and ideas from the head-driven and lexicalist approaches. In addition, the input structure is used to filter the initial search space, using a concept called local polarity filtering, and to parallelise processes. To further improve robustness, we propose two error mining algorithms: first, an algorithm for mining dependency trees rather than sequential data, and second, an algorithm that structures the output of error mining into a tree so as to represent errors more meaningfully. We show that our realisers, together with these error mining algorithms, improve both efficiency and coverage by a wide margin. In the second class (text-to-text generation, "Simplifying Sentences"), we argue for using deep semantic representations (as opposed to syntax- or SMT-based approaches) to improve the sentence simplification task. We use Discourse Representation Structures as the deep semantic representation of the input. We propose two methods: a supervised approach (with state-of-the-art results) to hybrid simplification using deep semantics and SMT, and an unsupervised approach (with results competitive with state-of-the-art systems) to simplification using the comparable Wikipedia corpus.
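The core of error mining, scoring items by how often they occur in failing inputs, can be sketched on flat items; the thesis's algorithms operate on dependency trees and structure their output as trees, so this shows only the suspicion-score kernel, with invented tokens:

```python
from collections import Counter

def suspicion(fail_items, pass_items):
    """Error-mining score for each item (word, subtree label, ...):
    the fraction of its occurrences that fall in failing inputs.
    Items appearing mostly in failures are likely error causes."""
    fails, passes = Counter(fail_items), Counter(pass_items)
    return {it: fails[it] / (fails[it] + passes[it])
            for it in set(fails) | set(passes)}

# toy occurrence lists from failing and succeeding realiser inputs
fail = ["det", "noun", "rare_verb", "rare_verb", "det"]
ok = ["det", "noun", "noun", "det", "det"]
scores = suspicion(fail, ok)
```

Here `rare_verb` scores 1.0 because it never occurs in a succeeding input, flagging it for inspection first.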
|
17 |
Extractive Multi-document Summarization of News Articles
Grant, Harald January 2019
Publicly available data grows exponentially through web services and technological advancements. To comprehend large data streams, multi-document summarization (MDS) can be used. This research investigates the area of multi-document summarization. Multiple systems for extractive MDS are implemented using modern techniques, in the form of the pre-trained BERT language model for word embeddings and sentence classification, combined with well-proven techniques: the TextRank ranking algorithm, the Waterfall architecture, and anti-redundancy filtering. The systems are evaluated on the DUC-2002, 2006, and 2007 datasets using the ROUGE metric. The results show that the BM25 sentence representation, implemented in the TextRank model with the Waterfall architecture and an anti-redundancy technique, outperforms the other implementations and is competitive with other state-of-the-art systems. A cohesive model is derived from the leading system and tried in a user study using a real-world application: a real-time news detection application with users from the news domain. The study shows a clear preference for cohesive summaries in extractive multi-document summarization, with the cohesive summary preferred in the majority of cases.
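The TextRank ranking step can be sketched as power iteration over a sentence-similarity graph; the toy matrix below stands in for BM25-based similarities, and the Waterfall architecture and redundancy filtering are omitted:

```python
def textrank(sim, d=0.85, iters=50):
    """PageRank-style power iteration over a sentence-similarity
    graph: sim[i][j] is the (symmetric) similarity of sentences i and j."""
    n = len(sim)
    out = [sum(row) for row in sim]  # total outgoing weight per sentence
    rank = [1.0 / n] * n
    for _ in range(iters):
        rank = [(1 - d) / n
                + d * sum(sim[j][i] / out[j] * rank[j]
                          for j in range(n) if out[j] > 0)
                for i in range(n)]
    return rank

# toy similarity matrix for three sentences (invented overlap scores)
sim = [
    [0, 3, 1],
    [3, 0, 1],
    [1, 1, 0],
]
```

Sentences 0 and 1 are strongly linked and so outrank sentence 2; an extractive summary would keep the top-ranked sentences, subject to a redundancy check.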
|
18 |
Reasoning with qualitative spatial and temporal textual cases / Raisonnement qualitatif spatio-temporel à partir de cas textuels
Dufour-Lussier, Valmi 07 October 2014
This thesis proposes a practical model making it possible to implement a case-based reasoning system that adapts processes represented as natural language text in response to user queries. While the cases and the solutions are in textual form, the adaptation itself is performed on networks of temporal constraints expressed with a qualitative algebra, using a belief revision operator. Natural language processing methods are used to acquire case representations and to regenerate text based on the adaptation result.
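Composition, the operation underlying reasoning over such qualitative constraint networks, can be illustrated with the simple point algebra over the relations <, =, >; the thesis's algebra is richer, so this only shows the mechanics of composing sets of possible relations:

```python
# Composition table for the point algebra: if a R1 b and b R2 c hold,
# which relations can hold between a and c?
COMP = {
    ("<", "<"): {"<"}, ("<", "="): {"<"}, ("<", ">"): {"<", "=", ">"},
    ("=", "<"): {"<"}, ("=", "="): {"="}, ("=", ">"): {">"},
    (">", "<"): {"<", "=", ">"}, (">", "="): {">"}, (">", ">"): {">"},
}

def compose(rels1, rels2):
    """Compose two sets of possible point relations: the union of the
    table entries over every pair of basic relations."""
    out = set()
    for r1 in rels1:
        for r2 in rels2:
            out |= COMP[(r1, r2)]
    return out
```

Propagating such compositions through a constraint network is what lets the system detect inconsistencies and revise beliefs after an adaptation step.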
|
19 |
Fine-Tuning Pre-Trained Language Models for CEFR-Level and Keyword Conditioned Text Generation : A comparison between Google’s T5 and OpenAI’s GPT-2 / Finjustering av förtränade språkmodeller för CEFR-nivå och nyckelordsbetingad textgenerering : En jämförelse mellan Googles T5 och OpenAIs GPT-2
Roos, Quintus January 2022
This thesis investigates the possibilities of conditionally generating English sentences based on keywords framing the content and on different difficulty levels of vocabulary. It aims to contribute to the field of Conditional Text Generation (CTG), a type of Natural Language Generation (NLG) in which text is created subject to a set of conditions, such as words, topics, content, or perceived sentiments. Specifically, it compares the performance of two well-known model architectures, Sequence-to-Sequence (Seq2Seq) and Autoregressive (AR), applied to two tasks, individually and combined. The Common European Framework of Reference (CEFR) is used to assess the vocabulary level of the texts. In the absence of openly available CEFR-labelled datasets, the author developed a new methodology with the host company to generate suitable datasets. The generated texts are evaluated on the accuracy of their vocabulary levels and on readability using readily available formulas; the analysis combines four established readability metrics and assesses classification accuracy. Both models show a high degree of accuracy when classifying texts into different CEFR levels. However, the same models are weaker when generating sentences at a desired CEFR level. This study contributes empirical evidence suggesting that: (1) Seq2Seq models are more accurate than AR models in generating English sentences based on a desired CEFR level and keywords; (2) combining Multi-Task Learning (MTL) with instruction-tuning is an effective way to fine-tune models on text classification tasks; and (3) it is difficult to assess the quality of computer-generated language using readability metrics alone.
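One readily available readability formula of the kind mentioned, Flesch Reading Ease, can be sketched with a crude vowel-run syllable heuristic; this is illustrative and not necessarily one of the four metrics the thesis combines:

```python
def count_syllables(word):
    """Very rough English syllable count: number of vowel runs."""
    vowels = "aeiouy"
    count, prev = 0, False
    for ch in word.lower():
        isv = ch in vowels
        if isv and not prev:
            count += 1
        prev = isv
    return max(count, 1)

def flesch_reading_ease(text):
    """Flesch Reading Ease: 206.835 - 1.015*(words/sentences)
    - 84.6*(syllables/words). Higher scores mean easier text."""
    sentences = max(text.count(".") + text.count("!") + text.count("?"), 1)
    words = [w.strip(".,!?;:") for w in text.split()]
    words = [w for w in words if w]
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * len(words) / sentences - 84.6 * syllables / len(words)

easy = "The cat sat. The dog ran."
hard = "Comprehensive organizational infrastructures necessitate interdisciplinary collaboration."
```

A generator targeting a low CEFR level should produce text scoring closer to `easy` than to `hard`, which is the intuition behind evaluating generated sentences with such formulas.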
|
20 |
[pt] GERAÇÃO DE DESCRIÇÕES DE PRODUTOS A PARTIR DE AVALIAÇÕES DE USUÁRIOS USANDO UM LLM / [en] PRODUCT DESCRIPTION GENERATION FROM USER REVIEWS USING A LLM
Bruno Frederico Maciel Gutierrez 04 June 2024
[en] In the context of e-commerce, product descriptions have a great influence on the shopping experience. Well-made descriptions should ideally inform a potential consumer about relevant product details, clarifying potential doubts and facilitating the purchase. Generating good descriptions, however, is a costly activity that traditionally requires human effort, while a large number of products are launched every day. In this context, this work presents a new methodology for the automated generation of product descriptions, using reviews left by users as a source of information. The proposed method consists of three steps: (i) extracting sentences suitable for a description from the reviews; (ii) selecting sentences among the candidates; and (iii) generating the product description from the selected sentences using a Large Language Model (LLM) in a zero-shot way. We evaluate the quality of the descriptions generated by our method by comparing them with real product descriptions posted by the sellers themselves. In this evaluation, 30 evaluators collaborated, and we verified that our descriptions are preferred more often than the original descriptions, being considered more informative, readable, and relevant. Furthermore, in the same evaluation we replicated a method from the recent literature and performed a statistical test comparing its results with our method's, verifying that our method generates descriptions that are more informative and preferred overall.
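The three-step pipeline can be sketched as follows; the filtering and selection heuristics and the prompt wording are stand-ins for the thesis's actual, richer criteria, and no LLM is called here, only the zero-shot prompt is built:

```python
def extract_candidates(reviews, min_len=4, banned=("i", "my", "me")):
    """Step (i): keep review sentences that read like description
    material: long enough and not written in the first person."""
    candidates = []
    for review in reviews:
        for sent in review.split("."):
            words = sent.strip().lower().split()
            if len(words) >= min_len and not set(words) & set(banned):
                candidates.append(sent.strip())
    return candidates

def select_sentences(candidates, k=3):
    """Step (ii): a stand-in selection heuristic, preferring longer
    sentences; the thesis's selection criteria are richer."""
    return sorted(candidates, key=len, reverse=True)[:k]

def build_prompt(product, sentences):
    """Step (iii): zero-shot LLM prompt (wording is illustrative)."""
    facts = "\n".join(f"- {s}" for s in sentences)
    return (f"Write a product description for '{product}' "
            f"using only these facts from user reviews:\n{facts}")

reviews = [
    "I love it. The battery lasts two full days on a single charge.",
    "Great. The case is water resistant up to 50 meters. My wife likes it.",
]
```

Running the three steps on these toy reviews drops the first-person sentences and yields a prompt grounded only in the factual ones.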
|