351 |
Question-answering chatbot for Northvolt IT Support
Hjelm, Daniel January 2023 (has links)
Northvolt is a Swedish battery manufacturing company that specializes in the production of sustainable lithium-ion batteries for electric vehicles and energy storage systems. Established in 2016, the company has experienced significant growth in recent years. This growth has presented a major challenge for the IT Support team, as they face a substantial volume of IT-related inquiries. To address this challenge and allow the IT Support team to concentrate on more complex support tasks, a question-answering chatbot has been implemented as part of this thesis project. The chatbot has been developed using the Microsoft Bot Framework and leverages Microsoft cloud services, specifically Azure Cognitive Services, to provide intelligent and cognitive capabilities for answering employee questions directly within Microsoft Teams. The chatbot has undergone testing by a diverse group of employees from various teams within the organization and was evaluated based on three key metrics: effectiveness (including accuracy, precision, and intent recognition rate), efficiency (including response time and scalability), and satisfaction. The test results indicate that the accuracy, precision, and intent recognition rate fall below the required thresholds for production readiness. However, these metrics can be improved by expanding the knowledge base of the bot. The chatbot demonstrates impressive efficiency in terms of response time and scalability, and its user-friendly nature contributes to a positive user experience. Users express high levels of satisfaction with their interactions with the bot, and the majority would recommend it to their colleagues, recognizing it as a valuable service solution that will benefit all employees at Northvolt in the future. Moving forward, the primary focus should be on expanding the knowledge base and effectively communicating the bot’s purpose and scope to enhance effectiveness and satisfaction.
Additionally, integrating the bot with advanced AI features, such as OpenAI’s language models available within Microsoft’s ecosystem, would elevate the bot to the next level.
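The three effectiveness metrics above can be made concrete with a short sketch. This is a minimal illustration, not the thesis's actual evaluation code; the log format, field names, and sample data are invented:

```python
def evaluate_chatbot(interactions):
    """Compute effectiveness metrics from logged test interactions.

    Each interaction is a dict with:
      intent_recognized - bot matched the utterance to some intent
      answer_correct    - tester judged the final answer correct
    """
    n = len(interactions)
    recognized = [i for i in interactions if i["intent_recognized"]]
    # Intent recognition rate: share of questions mapped to any intent.
    irr = len(recognized) / n
    # Accuracy: correct answers over all questions asked.
    accuracy = sum(i["answer_correct"] for i in interactions) / n
    # Precision: correct answers over questions the bot attempted.
    precision = (sum(i["answer_correct"] for i in recognized) / len(recognized)
                 if recognized else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "intent_recognition_rate": irr}

log = [
    {"intent_recognized": True,  "answer_correct": True},   # vpn question, solved
    {"intent_recognized": True,  "answer_correct": False},  # wrong intent chosen
    {"intent_recognized": False, "answer_correct": False},  # not recognized at all
    {"intent_recognized": True,  "answer_correct": True},   # wifi question, solved
]
print(evaluate_chatbot(log))  # accuracy 0.5, precision ~0.667, recognition rate 0.75
```

Expanding the knowledge base, as the abstract suggests, would raise the recognition rate and, with it, accuracy.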
|
352 |
Distillation or loss of information? : The effects of distillation on model redundancy
Sventickaite, Eva Elzbieta January 2022 (has links)
The necessity for billions of parameters in large language models has lately been questioned, as there are still unanswered questions regarding how information is captured in the networks. It could be argued that without this knowledge, there may be a tendency to overparameterize the models. In turn, the investigation of model redundancy and the methods which minimize it is important to both academic and commercial entities. As such, the two main goals of this project were to, firstly, discover whether one such method, namely distillation, reduces the redundancy of language models without losing linguistic capabilities and, secondly, to determine whether the model architecture or multilingualism has a bigger effect on said reduction. To do so, ten models, monolingual and multilingual together with their distilled counterparts, were evaluated layer- and neuron-wise. In terms of layers, we evaluated the layer correlation of all models by visualising heatmaps and calculating the average per-layer similarity. For establishing the neuron-level redundancy, a classifier probe was applied to the model neurons, both on the whole model and on a reduced set obtained with a clustering algorithm, and its performance was assessed for two tasks, Part-of-Speech (POS) and Dependency (DEP) tagging. To determine the distillation effects on the multilingualism of the models, we investigated cross-lingual transfer for the same tasks and compared the results of the classifier as applied to multilingual models and one distilled variant in ten languages, nine Indo-European and one non-Indo-European. The results show that distillation reduces the number of redundant neurons at the cost of losing some of the linguistic knowledge. In addition, the redundancy in the distilled models is mainly attributed to the architecture on which they are based, with the multilingualism aspect having only a mild impact.
Finally, the cross-lingual transfer experiments have shown that after distillation the model loses the ability to capture some languages more than others. In turn, the outcome of the project suggests that distillation could be applied to reduce the size of billion-parameter models and is a promising method for reducing the redundancy in current language models.
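The layer-wise analysis described above (correlation heatmaps and average per-layer similarity) can be sketched as follows. This assumes Pearson correlation over flattened layer representations, which is one plausible reading; the thesis's exact similarity measure and data are not given here:

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def layer_similarity(layers):
    """Symmetric layer-by-layer correlation matrix (the heatmap data)."""
    return [[pearson(a, b) for b in layers] for a in layers]

def avg_similarity_per_layer(matrix):
    """Mean similarity of each layer to every other layer; values near
    1.0 suggest the layer is redundant."""
    n = len(matrix)
    return [(sum(row) - row[i]) / (n - 1) for i, row in enumerate(matrix)]

layers = [[1.0, 2.0, 3.0, 4.0],   # layer 0
          [4.0, 3.0, 2.0, 1.0],   # layer 1: anti-correlated with 0
          [2.0, 4.0, 6.0, 8.0]]   # layer 2: perfectly correlated with 0
sims = avg_similarity_per_layer(layer_similarity(layers))
print([round(s, 2) for s in sims])
```

On real models the `layers` list would hold (flattened or pooled) hidden states per layer; a distilled model with fewer near-duplicate layers should show lower off-diagonal similarity.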
|
353 |
LSTM vs Random Forest for Binary Classification of Insurance Related Text / LSTM vs Random Forest för binär klassificering av försäkringsrelaterad text
Kindbom, Hannes January 2019 (has links)
The field of natural language processing has received increased attention lately, but less focus is put on comparing models which differ in complexity. This thesis compares Random Forest to LSTM for the task of classifying a message as question or non-question. The comparison was done by training and optimizing the models on historic chat data from the Swedish insurance company Hedvig. Different types of word embeddings were also tested, such as Word2vec and Bag of Words. The results demonstrated that LSTM achieved slightly higher scores than Random Forest in terms of F1 and accuracy. The models’ performance was not significantly improved after optimization and was also dependent on which corpus the models were trained on. An investigation of how a chatbot would affect Hedvig’s adoption rate was also conducted, mainly by reviewing previous studies about chatbots’ effects on user experience. The potential effects on the innovation’s five attributes, relative advantage, compatibility, complexity, trialability and observability, were analyzed to answer the problem statement. The results showed that the adoption rate of Hedvig could be positively affected by improving the first two attributes. The effects a chatbot would have on complexity, trialability and observability were however suggested to be negligible, if not negative. / Det vetenskapliga området språkteknologi har fått ökad uppmärksamhet den senaste tiden, men mindre fokus riktas på att jämföra modeller som skiljer sig i komplexitet. Den här kandidatuppsatsen jämför Random Forest med LSTM, genom att undersöka hur väl modellerna kan användas för att klassificera ett meddelande som fråga eller icke-fråga. Jämförelsen gjordes genom att träna och optimera modellerna på historisk chattdata från det svenska försäkringsbolaget Hedvig. Olika typer av word embedding, så som Word2vec och Bag of Words, testades också. Resultaten visade att LSTM uppnådde något högre F1 och accuracy än Random Forest.
Modellernas prestanda förbättrades inte signifikant efter optimering och resultatet var också beroende av vilket korpus modellerna tränades på. En undersökning av hur en chattbot skulle påverka Hedvigs adoption rate genomfördes också, huvudsakligen genom att granska tidigare studier om chattbotars effekt på användarupplevelsen. De potentiella effekterna på en innovations fem attribut, relativ fördel, kompatibilitet, komplexitet, prövbarhet och observerbarhet, analyserades för att kunna svara på frågeställningen. Resultaten visade att Hedvigs adoption rate kan påverkas positivt genom att förbättra de två första attributen. Effekterna en chattbot skulle ha på komplexitet, prövbarhet och observerbarhet ansågs dock vara försumbara, om inte negativa.
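The Bag of Words representation mentioned above can be sketched in a few lines. This is a minimal term-count version with invented Swedish chat messages; a real system would add proper tokenization and feed the vectors to the LSTM or Random Forest classifier:

```python
def build_vocab(messages):
    """Collect the vocabulary over the training corpus."""
    tokens = sorted({tok for msg in messages for tok in msg.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}

def bow_vector(message, vocab):
    """Term-count Bag-of-Words vector; out-of-vocabulary tokens are dropped."""
    vec = [0] * len(vocab)
    for tok in message.lower().split():
        if tok in vocab:
            vec[vocab[tok]] += 1
    return vec

train = ["vad kostar försäkringen", "tack för hjälpen"]
vocab = build_vocab(train)
# "hemförsäkringen" is unseen, so only "vad" and "kostar" contribute:
print(bow_vector("vad kostar hemförsäkringen", vocab))  # [0, 0, 0, 1, 0, 1]
```

The dropped out-of-vocabulary token illustrates why dense embeddings such as Word2vec, which can place related words near each other, were tested alongside this sparse representation.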
|
354 |
A Step Toward GDPR Compliance : Processing of Personal Data in Email
Olby, Linnea, Thomander, Isabel January 2018 (has links)
The General Data Protection Regulation, enforced on the 25th of May 2018, is a response to the growing importance of IT in today’s society, accompanied by public demand for control over personal data. In contrast to the previous directive, the new regulation applies to personal data stored in an unstructured format, such as email, rather than solely structured data. Companies are now forced to accommodate this change, among others, in order to be compliant. This study aims to provide a code of conduct for the processing of personal data in email as a measure for reaching compliance. Furthermore, this study investigates whether Named Entity Recognition (NER) can aid this process as a means of finding personal data in the form of names. A literature review of current research and recommendations was conducted for the code of conduct proposal. A NER system was constructed using a hybrid approach with Binary Logistic Regression, hand-crafted rules and gazetteers. The model was applied to a selection of emails, including attachments, obtained from a small consultancy company in the automotive industry. The proposed code of conduct consists of six items, applied to the consultancy firm. The NER model demonstrated low ability to identify names and was therefore deemed insufficient for this task. / Dataskyddsförordningen började gälla den 25 maj 2018 och uppstod som ett svar på den ökande betydelsen av IT i dagens samhälle samt allmänhetens krav på ökad kontroll över personuppgifter för den enskilde individen. Till skillnad från det tidigare direktivet omfattar den nya förordningen även personuppgifter som är lagrade i ostrukturerad form, som till exempel e-post, snarare än endast i strukturerad form. Många företag tvingas därmed att anpassa sig efter detta, tillsammans med ett flertal andra nya krav, i syfte att efterfölja förordningen.
Den här studien syftar till att lägga fram ett förslag på en uppförandekod för behandling av personuppgifter i e-post, som ett verktyg för att uppnå regelefterlevnad. Utöver detta undersöks det om Named Entity Recognition (NER) kan användas som ett hjälpmedel vid identifiering av personuppgifter, mer specifikt namn. En litteraturstudie kring tidigare forskning och aktuella rekommendationer utfördes inför utformningen av uppförandekoden. Ett NER-system konstruerades med hjälp av binär logistisk regression, handgjorda regler och ordlistor. Modellen applicerades på ett urval av e-postmeddelanden, med eventuella bilagor, som tillhandahölls från ett litet konsultbolag aktivt inom bilindustrin. Den rekommenderade uppförandekoden består av sex punkter, applicerade på konsultbolaget. NER-modellen påvisade en låg förmåga att identifiera namn och ansågs därför inte vara lämplig för den aktuella uppgiften.
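The rule-and-gazetteer side of the hybrid NER approach can be sketched as below. This omits the Binary Logistic Regression component, and the gazetteer entries and the single hand-crafted rule are invented, so it illustrates the idea rather than the thesis's system:

```python
GAZETTEER = {"linnea", "isabel", "johan"}  # hypothetical first-name list
TITLE_CUES = {"mr", "mrs", "dr", "hej"}    # tokens that often precede a name

def tag_names(tokens):
    """Hybrid sketch: gazetteer lookup plus one hand-crafted rule
    (a capitalized token following a title cue). Returns per-token tags."""
    tags = []
    for i, tok in enumerate(tokens):
        in_gazetteer = tok.lower() in GAZETTEER
        after_cue = i > 0 and tokens[i - 1].lower().rstrip(".") in TITLE_CUES
        is_capitalized = tok[:1].isupper()
        tags.append("NAME" if in_gazetteer or (after_cue and is_capitalized)
                    else "O")
    return tags

print(tag_names("Hej Linnea , dr. Svensson ringde".split()))
# ['O', 'NAME', 'O', 'O', 'NAME', 'O']
```

"Linnea" is caught by the gazetteer, "Svensson" by the title-cue rule; a statistical classifier would then arbitrate cases neither source covers, which is where the thesis's model struggled.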
|
355 |
The Effect of Data Quantity on Dialog System Input Classification Models / Datamängdens effekt på modeller för avsiktsklassificering i chattkonversationer
Lipecki, Johan, Lundén, Viggo January 2018 (has links)
This paper researches how different amounts of data affect different word vector models for classification of dialog system user input. A hypothesis is tested that there is a data threshold for dense vector models to reach the state-of-the-art performance that has been shown in recent research, and that character-level n-gram word-vector classifiers are especially suited for Swedish, because of compounding and the character-level n-gram model's ability to vectorize out-of-vocabulary words. Also, a second hypothesis is put forward that models trained with single statements are more suitable for chat user input classification than models trained with full conversations. The results do not support either of our hypotheses but show that sparse vector models perform very well on the binary classification tasks used. Further, the results show that 799,544 words of data is insufficient for training dense vector models but that training the models with full conversations is sufficient for single statement classification, as the single-statement-trained models do not show any improvement in classifying single statements. / Detta arbete undersöker hur olika datamängder påverkar olika slags ordvektormodeller för klassificering av indata till dialogsystem. Hypotesen att det finns ett tröskelvärde för träningsdatamängden där täta ordvektormodeller når den högsta moderna utvecklingsnivån, samt att n-gram-ordvektor-klassificerare med bokstavsnoggrannhet lämpar sig särskilt väl för svenska klassificerare, söks bevisas med stöd i att sammansättningar är särskilt produktiva i svenskan och att bokstavsnoggrannhet i modellerna gör att tidigare osedda ord kan klassificeras. Dessutom utvärderas hypotesen att klassificerare som tränas med enkla påståenden är bättre lämpade att klassificera indata i chattkonversationer än klassificerare som tränats med hela chattkonversationer.
Resultaten stödjer ingendera hypotes utan visar istället att glesa vektormodeller presterar väldigt väl i de genomförda klassificeringstesterna. Utöver detta visar resultaten att datamängden 799 544 ord inte räcker till för att träna täta ordvektormodeller väl men att konversationer räcker gott och väl för att träna modeller för klassificering av frågor och påståenden i chattkonversationer, detta eftersom de modeller som tränats med användarindata, påstående för påstående, snarare än hela chattkonversationer, inte resulterar i bättre klassificerare för chattpåståenden.
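The character-level n-gram idea, where a Swedish compound unseen in training still shares subword features with its known parts, can be sketched as follows (fastText-style boundary markers; the window sizes are assumptions):

```python
def char_ngrams(word, n_min=3, n_max=4):
    """Subword features in the fastText style: pad with boundary
    markers and slide windows of length n_min..n_max over the word."""
    padded = f"<{word}>"
    grams = set()
    for n in range(n_min, n_max + 1):
        for i in range(len(padded) - n + 1):
            grams.add(padded[i:i + n])
    return grams

# A compound unseen in training ("hemförsäkring", home insurance) still
# shares many subword features with its known part ("försäkring"):
shared = char_ngrams("hemförsäkring") & char_ngrams("försäkring")
print(sorted(shared)[:5])
```

A model summing subword vectors can therefore assign a sensible vector to the out-of-vocabulary compound, which is the argument the hypothesis makes for Swedish.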
|
356 |
Distributionella representationer av ord för effektiv informationssökning : Algoritmer för sökning i kundsupportforum / Distributional Representations of Words for Effective Information Retrieval : Information Retrieval in Customer Support Forums
Lachmann, Tim, Sabel, Johan January 2017 (has links)
I takt med att informationsmängden ökar i samhället ställs högre krav på mer förfinade metoder för sökning och hantering av information. Att utvinna relevant data från företagsinterna system blir en mer komplex uppgift då större informationsmängder måste hanteras och mycket kommunikation förflyttas till digitala plattformar. Metoder för vektorbaserad ordinbäddning har under senare år gjort stora framsteg; i synnerhet visade Google 2013 banbrytande resultat med modellen Word2vec och överträffade äldre metoder. Vi implementerar en sökmotor som utnyttjar ordinbäddningar baserade på Word2vec och liknande modeller, avsedd att användas på IT-företaget Kundo och för produkten Kundo Forum. Resultaten visar på potential för informationssökning med markant bättre täckning utan minskad precision. Kopplat till huvudområdet informationssökning genomförs också en analys av vilka implikationer en förbättrad sökmotor har ur ett marknads- och produktutvecklingsperspektiv. / As the abundance of information in society increases, so does the need for more sophisticated methods of information retrieval. Extracting information from internal systems becomes a more complex task when handling larger amounts of information and when more communication is transferred to digital platforms. In recent years, methods for word embedding in vector space have gained traction. In 2013, Google sent ripples across the field of Natural Language Processing with a new method called Word2vec, significantly outperforming former practices. Among established methods for information retrieval, we implement a retrieval method utilizing Word2vec and related methods of word embedding for the search engine at the IT company Kundo and their product Kundo Forum. We demonstrate the potential to improve information retrieval recall by a significant margin without diminishing precision.
Coupled with the primary subject of information retrieval we also investigate potential market and product development implications related to a different kind of search engine.
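The retrieval idea, matching a query to documents through averaged word vectors rather than shared tokens, can be sketched with toy embeddings. The three-dimensional vectors below are invented; a real system would load vectors trained with Word2vec or a similar model:

```python
from math import sqrt

# Toy embeddings: "pris" (price) and "kostnad" (cost) are close in
# vector space even though they share no characters.
EMB = {
    "pris":    [0.9, 0.1, 0.0],
    "kostnad": [0.8, 0.2, 0.1],
    "avtal":   [0.0, 0.9, 0.2],
}

def doc_vector(text):
    """Mean of the word vectors for in-vocabulary tokens."""
    vecs = [EMB[t] for t in text.lower().split() if t in EMB]
    if not vecs:
        return [0.0, 0.0, 0.0]
    return [sum(dim) / len(vecs) for dim in zip(*vecs)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def search(query, docs):
    """Return the document most similar to the query in embedding space."""
    q = doc_vector(query)
    return max(docs, key=lambda d: cosine(q, doc_vector(d)))

docs = ["vad är kostnad", "nytt avtal"]
print(search("pris", docs))  # matches "vad är kostnad" despite no shared token
```

This is exactly how recall improves without hurting precision: the query "pris" retrieves a document about "kostnad" that keyword search would miss.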
|
357 |
Evaluation of methods for question answering data generation : Using large language models / Utvärdering av metoder för skapande av fråge-svar data
Bissessar, Daniel, Bois, Alexander January 2022 (has links)
One of the largest challenges in the field of artificial intelligence and machine learning is the acquisition of a large quantity of quality data to train models on. This thesis investigates and evaluates approaches to data generation in a telecom domain for the task of extractive QA. To do this, a pipeline was built using a combination of BERT-like models and T5 models for data generation. We then evaluated our generated data using the downstream task of QA on a telecom domain data set. We measured the performance using EM and F1-scores. We achieved results that are state of the art on the telecom domain data set. We found that synthetic data generation is a viable approach to obtaining synthetic telecom QA data with the potential of improving model performance when used in addition to human-annotated data. We also found that using models from the general domain provided results that are on par with or better than domain-specific models for the generation, which provides possibilities to use a single generation pipeline for many different domains. Furthermore, we found that increasing the amount of synthetic data provided little benefit for our models on the downstream task, with diminishing returns setting in quickly. We were unable to pinpoint the reason for this. In short, our approach works, but much more work remains to understand and optimize it for greater results.
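The EM and F1 metrics used for the downstream QA evaluation are standard for extractive QA and can be sketched as follows (normalization is simplified; the official SQuAD script also strips articles and punctuation):

```python
def exact_match(pred, gold):
    """1 if the prediction equals the gold answer after light normalization."""
    return int(pred.strip().lower() == gold.strip().lower())

def token_f1(pred, gold):
    """Token-level F1 as commonly used for extractive QA (SQuAD-style)."""
    p, g = pred.lower().split(), gold.lower().split()
    common = 0
    remaining = list(g)
    for tok in p:              # count overlapping tokens with multiplicity
        if tok in remaining:
            remaining.remove(tok)
            common += 1
    if common == 0:
        return 0.0
    precision = common / len(p)
    recall = common / len(g)
    return 2 * precision * recall / (precision + recall)

print(exact_match("5G core", "5g core"))             # 1
print(round(token_f1("the 5G core", "5G core"), 2))  # 0.8
```

EM only rewards verbatim spans, while F1 gives partial credit, which is why both are reported when comparing models trained on synthetic versus human-annotated data.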
|
358 |
Bridging Language & Data : Optimizing Text-to-SQL Generation in Large Language Models / Från ord till SQL : Optimering av text-till-SQL-generering i stora språkmodeller
Wretblad, Niklas, Gordh Riseby, Fredrik January 2024 (has links)
Text-to-SQL, which involves translating natural language into Structured Query Language (SQL), is crucial for enabling broad access to structured databases without expert knowledge. However, designing models for such tasks is challenging due to numerous factors, including the presence of ’noise,’ such as ambiguous questions and syntactical errors. This thesis provides an in-depth analysis of the distribution and types of noise in the widely used BIRD-Bench benchmark and the impact of noise on models. While BIRD-Bench was created to model dirty and noisy database values, it was not intended to contain noise and errors in the questions and gold queries. After a manual evaluation, we found that noise in questions and gold queries is highly prevalent in the financial domain of the dataset, and a further analysis of the other domains indicates the presence of noise in other parts as well. The presence of incorrect gold SQL queries, which then generate incorrect gold answers, has a significant impact on the benchmark’s reliability. Surprisingly, when evaluating models on corrected SQL queries, zero-shot baselines surpassed the performance of state-of-the-art prompting methods. The thesis then introduces the concept of classifying noise in natural language questions, aiming to prevent the entry of noisy questions into text-to-SQL models and to annotate noise in existing datasets. Experiments using GPT-3.5 and GPT-4 on a manually annotated dataset demonstrated the viability of this approach, with classifiers achieving up to 0.81 recall and 80% accuracy. Additionally, the thesis explored the use of LLMs for automatically correcting faulty SQL queries. This showed a 100% success rate for specific query corrections, highlighting the potential of LLMs for improving dataset quality. We conclude that informative noise labels and reliable benchmarks are crucial to developing new text-to-SQL methods that can handle varying types of noise.
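The recall and accuracy figures reported for the noise classifiers follow the standard definitions, sketched below with invented labels:

```python
def recall_and_accuracy(preds, gold, positive="noisy"):
    """Recall on the positive ('noisy') class and overall accuracy."""
    tp = sum(p == positive and g == positive for p, g in zip(preds, gold))
    fn = sum(p != positive and g == positive for p, g in zip(preds, gold))
    correct = sum(p == g for p, g in zip(preds, gold))
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = correct / len(gold)
    return recall, accuracy

# Hypothetical annotations for five benchmark questions:
gold  = ["noisy", "clean", "noisy", "clean", "noisy"]
preds = ["noisy", "clean", "noisy", "noisy", "clean"]
print(recall_and_accuracy(preds, gold))  # (0.666..., 0.6)
```

For a noise filter, recall on the noisy class matters most: a missed noisy question (the false negative above) slips into the text-to-SQL model, whereas a false alarm only costs a manual check.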
|
359 |
Wordlength inference in the Spade HDL : Seven implementations of wordlength inference and one implementation that actually works / Ordlängdsinferens i Spade HDL : Sju olika implementationer av ordlängdsinferens och en implementation som faktiskt fungerar
Thörnros, Edvard January 2023 (has links)
Compilers are complex programs with the potential to greatly facilitate software and hardware design. This thesis focuses on enhancing the Spade hardware description language, known for its user-friendly approach to hardware design. In hardware development, data size, known as "wordlength" for numerical values, plays a critical role in reducing hardware resource usage. This study presents an approach that integrates wordlength inference directly into the Spade language, enabling an over-estimation of numeric data sizes solely from the program's source code. The methodology involves iterative development, incorporating various smaller implementations and evaluations, reminiscent of an agile approach. To assess the efficacy of the wordlength inference, multiple place and route operations are performed on identical Spade code using various versions of nextpnr. Surprisingly, no discernible impact on hardware resource utilization emerges from the modifications introduced in this thesis. Nonetheless, the true significance of this endeavor lies in its potential to unlock more advanced language features in the Spade compiler. It is important to note that while the wordlength inference proposed in this thesis shows promise, it requires further integration efforts to realize its full potential.
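The kind of over-estimation a wordlength inference pass performs can be illustrated with standard bit-width rules for unsigned arithmetic. These are textbook rules sketched for illustration, not Spade's actual implementation:

```python
def wl_unsigned(max_value):
    """Bits needed to represent an unsigned value in [0, max_value]."""
    return max(1, max_value.bit_length())

def wl_add(wa, wb):
    # The sum of an a-bit and a b-bit unsigned value needs one extra bit.
    return max(wa, wb) + 1

def wl_mul(wa, wb):
    # The product needs at most the sum of the operand wordlengths.
    return wa + wb

# Inferring the wordlength of (a + b) * c, with a, b up to 255 and c up to 15:
a = b = wl_unsigned(255)        # 8 bits each
c = wl_unsigned(15)             # 4 bits
print(wl_mul(wl_add(a, b), c))  # 13 bits
```

Propagating such rules bottom-up over the expression tree yields a safe over-estimate for every signal, which is what lets a compiler size hardware without explicit annotations.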
|
360 |
A Prompting Framework for Natural Language Processing in the Medical Field : Assessing the Potential of Large Language Models for Swedish Healthcare / Ett ramverk för behandling av naturliga språkmodeller inom hälso- och sjukvården : Bedömningen av potentialen hos stora språkmodeller inom svensk sjukvård
Mondal, Anim January 2023 (has links)
The increasing digitisation of healthcare through the use of technology and artificial intelligence has affected the medical field in a multitude of ways. Generative Pre-trained Transformers (GPTs) are a collection of language models that have been trained on an extensive data set to generate human-like text and have been shown to achieve a strong understanding of natural language. This thesis aims to investigate whether GPT-SW3, a large language model for the Swedish language, is capable of responding to healthcare tasks accurately given prompts and context. To reach this goal, a framework was created, consisting of general medical questions, an evaluation of medical reasoning, and conversations between a doctor and a patient, designed to evaluate GPT-SW3's abilities in the respective areas. Each component has a ground truth which is used when evaluating the responses. Based on the results, GPT-SW3 is capable of dealing with specific medical tasks and shows, in particular instances, signs of understanding. In more basic tasks, GPT-SW3 manages to provide adequate answers to some questions. In more advanced scenarios, such as conversation and reasoning, GPT-SW3 struggles to provide coherent answers reminiscent of a human doctor's conversation. While there have been some great advancements in natural language processing, further work on a Swedish model will have to be conducted to create a model that is useful for healthcare. Whether the work is in fine-tuning the weights of the models or retraining the models with domain-specific data is left for subsequent works. / Den ökande digitaliseringen av vården genom användning av teknik och artificiell intelligens har påverkat det medicinska fältet på både positiva och negativa sätt. Generative Pre-trained Transformers (GPTs) är en samling språkmodeller som har tränats på en stor datamängd för att generera människoliknande text och har visat sig uppnå en stark förståelse av naturligt språk.
Syftet med den här uppsatsen är att undersöka om GPT-SW3, en stor språkmodell för det svenska språket, kan svara på hälso- och sjukvårdsuppgifter på ett korrekt sätt med hänsyn till uppmaningar och sammanhang. För att uppnå målet skapades ett ramverk. Ramverket bestod av allmänna medicinska frågor, en utvärdering av medicinska resonemang samt konversationer mellan en läkare och en patient, som skapades för att utvärdera GPT-SW3:s förmåga inom respektive områden. Varje komponent har en grundsanning som används vid utvärderingen av svaren. Generellt sett klarar GPT-SW3 av att hantera specifika medicinska uppgifter och modellen visar tecken på förståelse. I mer grundläggande uppgifter lyckas GPT-SW3 ge adekvata svar på vissa frågor. I mer avancerade scenarier, t.ex. samtal och resonemang, har GPT-SW3 svårt att ge sammanhängande svar. Även om det har gjorts stora framsteg inom språkteknologi måste ytterligare arbete med en svensk modell utföras för att skapa en modell som är användbar för hälso- och sjukvården. Huruvida arbetet består i att finjustera modellernas vikter eller att träna om modellerna med domänspecifika data lämnas till kommande arbeten.
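The prompting setup, context plus a few worked examples followed by the target question, can be sketched as a template builder. The Swedish labels and sample content are assumptions, not the exact prompts used in the thesis:

```python
def build_prompt(question, context=None, examples=()):
    """Assemble a few-shot prompt: optional context, worked examples,
    then the target question left open for the model to complete."""
    parts = []
    if context:
        parts.append(f"Kontext: {context}")
    for q, a in examples:
        parts.append(f"Fråga: {q}\nSvar: {a}")
    parts.append(f"Fråga: {question}\nSvar:")
    return "\n\n".join(parts)

prompt = build_prompt(
    "Vilka är symtomen på influensa?",
    context="Du är en medicinsk assistent.",
    examples=[("Vad är normal kroppstemperatur?", "Cirka 37 grader.")],
)
print(prompt)
```

The model's completion after the trailing "Svar:" is then compared against the component's ground truth, which is how each part of the framework is scored.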
|