401 |
Design of a Robust and Flexible Grammar for Speech Control. Ludyga, Tomasz, 28 May 2024
Voice interaction is an established automatization and accessibility feature. While many satisfactory speech recognition solutions are available today, the interpretation of text semantics remains difficult in some use cases. Two types of text semantic extraction models can be differentiated: probabilistic and purely rule-based. Rule-based reasoning is formalizable into grammars and enables fast language validation, transparent decision-making and easy customization. In this thesis we develop a context-free ANTLR semantic grammar to control software by speech in a medical, smart-glasses-related domain. The implementation is preceded by research into the state of the art, requirements consultation and a thorough design of reusable system abstractions. The design includes definitions of a DSL, a meta grammar, a generic system architecture and tool support. Additionally, we investigate trivial and experimental grammar improvement techniques. Due to the multifaceted flexibility and robustness of the designed framework, we indicate its usability in critical and adaptive systems. We determine 75% semantic recognition accuracy in the main medical use case. We compare it against semantic extraction using spaCy and two fine-tuned AI classifiers. The evaluation reveals high accuracy for BERT for sequence classification and big potential for hybrid solutions with AI techniques on top of grammars, especially for the detection of alerts. The accuracy is strongly dependent on input quality, highlighting the importance of speech recognition tailored to specific vocabulary.
1 Introduction
1.1 Motivation
1.2 CAIS.ME Project
1.3 Problem Statement
1.4 Thesis Overview
2 Related Work
3 Foundational Concepts and Systems
3.1 Human-Computer Interaction in Speech
3.2 Speech Recognition
3.2.1 Open-source technologies
3.2.2 Other technologies
3.3 Language Recognition
3.3.1 Regular expressions
3.3.2 Lexical tokenization
3.3.3 Parsing
3.3.4 Domain Specific Languages
3.3.5 Formal grammars
3.3.6 Natural Language Processing
3.3.7 Model-Driven Engineering
4 State-of-the-Art: Grammars
4.1 Overview
4.2 Workbenches for Grammar Design
4.2.1 ANTLR
4.2.2 Xtext
4.2.3 JetBrains MPS
4.2.4 Other tools
4.3 Design Approaches
5 Problem Analysis
5.1 Methodology
5.2 Identification of Use-Cases
5.3 Requirements Analysis
5.3.1 Functional requirements
5.3.2 Qualitative requirements
5.3.3 Acceptance criteria
6 Design
6.1 Preprocessing
6.2 Underlying Domain Specific Modelling
6.2.1 Language model definition
6.2.2 Formalization
6.2.3 Constraints
6.3 Generic Grammar Syntax
6.4 Architecture
6.5 Integration of AI Techniques
6.6 Grammar Improvement
6.6.1 Identification of synonyms
6.6.2 Automatic addition of synonyms
6.6.3 Addition of same-meaning strings
6.6.4 Addition and modification of rules
6.7 Processing of unrecognized input
6.8 Summary
7 Implementation and Evaluation
7.1 Development Environment
7.2 Implementation
7.2.1 Grammar model transformation
7.2.2 Output construction
7.2.3 Testing
7.2.4 Reusability for similar use-cases
7.3 Limitations and Challenges
7.4 Comparison to NLP Solutions
8 Conclusion
8.1 Summary of Findings
8.2 Future Research and Development
Acronyms
Bibliography
List of Figures
List of Tables
List of Listings
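To make the hybrid idea in the abstract above concrete (an AI classifier layered on top of a rule-based grammar), here is a minimal sketch. The command table, function names and fallback policy are invented for illustration; the thesis's actual pipeline is ANTLR-based and is not reproduced here.

```python
# Illustrative sketch of the hybrid approach from the abstract above: try the
# fast, transparent grammar first and fall back to a statistical classifier
# only when the grammar rejects the utterance. All names here are invented;
# the thesis's actual pipeline is built on ANTLR.

GRAMMAR_COMMANDS = {
    ("show", "vitals"): "SHOW_VITALS",
    ("start", "recording"): "START_RECORDING",
    ("raise", "alert"): "RAISE_ALERT",
}

def parse_with_grammar(utterance: str):
    """Stand-in for a grammar parse: exact rule match or None."""
    return GRAMMAR_COMMANDS.get(tuple(utterance.lower().split()))

def classify_with_model(utterance: str) -> str:
    """Stand-in for a fine-tuned sequence classifier (e.g. BERT)."""
    return "RAISE_ALERT" if "help" in utterance.lower() else "UNKNOWN"

def interpret(utterance: str) -> str:
    command = parse_with_grammar(utterance)
    # Grammar hit: transparent, validated decision. Miss: AI fallback, which
    # the evaluation found especially useful for detecting alerts.
    return command if command is not None else classify_with_model(utterance)

print(interpret("show vitals"))      # SHOW_VITALS   (grammar path)
print(interpret("I need help now"))  # RAISE_ALERT   (classifier fallback)
```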
|
402 |
Using Blockchain to Ensure Reputation Credibility in Decentralized Review Management. Zaccagni, Zachary James, 12 1900
In recent years, there have been incidents that decreased people's trust in some of the organizations and authorities responsible for ratings and accreditation. Among prominent examples, there was a security breach at Equifax (2017), misconduct was found at Standard & Poor's Ratings Services (2015), and the Accrediting Council for Independent Colleges and Schools (2022) validated some low-performing schools as delivering higher standards than they actually did. A natural solution to these types of issues is to decentralize the relevant trust management processes using blockchain technologies. The research problems tackled in this thesis consider the issue of trust in reputation for assessment and review credibility from different angles, in the context of blockchain applications.
We first explored the following questions. How can we trust courses in one college to provide students with the type and level of knowledge needed in a specific workplace? Micro-accreditation on a blockchain was our solution, including a peer-review system to determine the rigor of a course (through a consensus). Rigor is the level of difficulty with regard to a student's expected level of knowledge. Currently, we make assumptions about the quality and rigor of what is learned, but this is prone to human bias and misunderstandings. We present a decentralized approach that tracks student records throughout their academic progress at a school and helps match employers' requirements to students' knowledge. We do this by applying micro-accredited topics and Knowledge Units (KUs), defined by the NSA's Center of Academic Excellence, to courses and assignments. Using simulated datasets, we demonstrate that the system was successful in increasing the accuracy of hires, and that it is efficient as well as scalable. Another problem is how we can trust that the peer reviews are honest and reflect an accurate rigor score. Assigning reputation to peers is a natural method to ensure the correctness of these assessments. The reputation of the peers providing rigor scores needs to be taken into account for the overall rigor of a course, its topics, and its tasks. Specifically, those with a higher reputation should have more influence on the total score.
Hence, we focused on how a peer's reputation is managed. We explored decentralized reputation management for the peers, choosing a decentralized marketplace as a sample application. We presented an approach to ensuring review credibility, which is a particular aspect of trust in reviews and in the reputation of the parties who provide them. We use the Proof-of-Stake based Algorand system as the base of our implementation, since the system is open source and has rich community support. Specifically, we directly map reputation to stake, which allows us to deploy Algorand at the blockchain layer. Reviews are analyzed by the proposed evaluation component using Natural Language Processing (NLP). In our system, NLP gauges the positivity of the written review, compares that value to the scaled numerical rating given, and determines adjustments to a peer's reputation from the result. We demonstrate that this architecture ensures credible and trustworthy assessments. It also efficiently manages the reputation of the peers, while keeping consensus times reasonable.
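A minimal sketch of the evaluation component described above, assuming a generic sentiment model supplies the positivity score; the function name, tolerance and reward curve are illustrative assumptions, not the thesis's implementation.

```python
# Minimal sketch of the review-evaluation idea described above: gauge the
# positivity of the review text, compare it with the scaled numeric rating,
# and derive a reputation adjustment. All names and thresholds here are
# illustrative assumptions, not the thesis's actual implementation.

def evaluate_review(text_positivity: float, star_rating: int,
                    max_stars: int = 5, tolerance: float = 0.2) -> float:
    """Return a reputation delta in [-1, 1].

    text_positivity: sentiment score of the written review in [0, 1],
                     e.g. from an off-the-shelf NLP sentiment model.
    star_rating:     the numeric rating the reviewer gave (1..max_stars).
    """
    scaled_rating = (star_rating - 1) / (max_stars - 1)  # map to [0, 1]
    mismatch = abs(text_positivity - scaled_rating)
    # Consistent text and rating earn reputation; contradictions lose it.
    if mismatch <= tolerance:
        return 1.0 - mismatch / tolerance               # reward, up to +1
    return -(mismatch - tolerance) / (1.0 - tolerance)  # penalty, down to -1

# Example: a glowing review (0.9) with a 5-star rating is consistent.
delta = evaluate_review(text_positivity=0.9, star_rating=5)
print(f"reputation delta: {delta:+.2f}")
```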
We then turned our focus to ensuring that a peer's reputation is credible. This led us to introduce a new type of consensus called "Proof-of-Review". Our proposed implementation is again based on Algorand, since its modular architecture allows for easy modifications, such as adding extra components; this time, however, we modified the engine itself. The proposed model then provides trust in evaluations (review and assessment credibility) and in those who provide them (reputation credibility) using a blockchain. We introduce a blacklisting component, which prevents malicious nodes from participating in the protocol, and a minimum-reputation component, which limits the influence of under-performing users. Our results showed that the proposed blockchain system maintains liveness and completeness. Specifically, blacklisting and the minimum-reputation requirement (when properly tuned) do not affect these properties. We note that the Proof-of-Review concept can be deployed in other types of applications with similar needs for trust in assessments and the players providing them, such as sensor arrays, autonomous car groups (caravans), marketplaces, and more.
|
403 |
Comparative Analysis of User Satisfaction Between Keyword-based and GPT-based E-commerce Chatbots: A qualitative study utilizing user testing to compare user satisfaction based on the IKEA chatbot. Bitinas, Romas; Hassellöf, Axel, January 2024
Chatbots are computer programs that interact with users using natural language. Businesses benefit from chatbots because they can provide a better and more satisfactory customer experience. This thesis investigates differences in user satisfaction with two types of e-commerce chatbots: a keyword-based chatbot and a GPT-based chatbot. The study focuses on user interactions with IKEA's chatbot "Billie" compared to a prototype GPT-based chatbot designed for similar functionalities. Using a within-subjects experimental design, participants were tasked with typical e-commerce queries, followed by interviews to gather qualitative data about each participant's experience. The research aims to determine whether a chatbot based on GPT technology can offer a more intuitive, engaging and empathetic user experience compared to traditional keyword-based chatbots in the realm of e-commerce. Findings reveal that the GPT-based chatbot generally provided more accurate and relevant responses, enhancing user satisfaction. Participants appreciated the GPT chatbot's better comprehension and ability to handle natural language, though both systems still exhibited some unnatural interactions. The keyword-based chatbot often failed to understand user intent accurately, leading to user frustration and lower satisfaction. These results suggest that integrating advanced AI technologies like GPT-based chatbots could improve user satisfaction in e-commerce settings, highlighting the potential for more human-like and effective customer service.
|
404 |
Parametric Optimal Design of Uncertain Dynamical Systems. Hays, Joseph T., 02 September 2011
This research effort develops a comprehensive computational framework to support the parametric optimal design of uncertain dynamical systems. Uncertainty comes from various sources, such as system parameters, initial conditions, sensor and actuator noise, and external forcing. Treatment of uncertainty in design is of paramount practical importance because all real-life systems are affected by it; not accounting for uncertainty may result in poor robustness, sub-optimal performance and higher manufacturing costs.
Contemporary methods for the quantification of uncertainty in dynamical systems are computationally intensive, which has so far made a robust design optimization methodology prohibitive. Some existing algorithms address uncertainty in sensors and actuators during an optimal design; however, a comprehensive design framework that can treat all kinds of uncertainty with diverse distribution characteristics in a unified way is currently unavailable. The computational framework uses the Generalized Polynomial Chaos (gPC) methodology to quantify the effects of the various sources of uncertainty found in dynamical systems; a Least-Squares Collocation Method is used to solve the corresponding uncertain differential equations. This technique is significantly faster computationally than traditional sampling methods and makes the construction of a parametric optimal design framework for uncertain systems feasible.
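As a sketch of the underlying idea in standard gPC notation (illustrative, not quoted from the thesis): the uncertain response is expanded in orthogonal polynomials of the random variables, and least-squares collocation picks the coefficients that minimize the residual of the governing equations at sampled points.

```latex
% Standard gPC notation; an illustrative sketch, not the thesis's own equations.
% The uncertain response u(t, \xi) is expanded in orthogonal polynomials
% \Phi_i of the random variables \xi, truncated after P + 1 terms:
u(t, \xi) \approx \sum_{i=0}^{P} u_i(t)\, \Phi_i(\xi)

% Least-squares collocation: choose the coefficients u_i(t) that minimize the
% squared residual \mathcal{R} of the governing equations over N samples \xi_k:
\min_{u_0(t), \ldots, u_P(t)} \sum_{k=1}^{N}
    \Big\| \mathcal{R}\Big( \sum_{i=0}^{P} u_i(t)\, \Phi_i(\xi_k),\; t,\; \xi_k \Big) \Big\|^2
```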
The novel framework makes it possible to treat uncertainty directly in the parametric optimal design process. Specifically, the following design problems are addressed: motion planning of fully-actuated and under-actuated systems; multi-objective robust design optimization; and optimal uncertainty apportionment concurrent with robust design optimization. The framework advances the state-of-the-art and enables engineers to produce more robust and optimally performing designs at an optimal manufacturing cost. / Ph. D.
|
405 |
Natural Language Processing using Deep Learning in Social Media. Giménez Fayos, María Teresa, 02 September 2021
In recent years, Deep Learning (DL) has revolutionised the potential of automatic systems that handle Natural Language Processing (NLP) tasks.
We have witnessed a tremendous advance in the performance of these systems. Nowadays, we find such systems embedded ubiquitously, determining the intent of the text we write, the sentiment of our tweets or our political views, to cite some examples.
In this thesis, we proposed several NLP models for addressing tasks that deal with social media text. Concretely, this work is focused mainly on Sentiment Analysis and Personality Recognition tasks.
Sentiment Analysis, one of the leading problems in NLP, consists of determining the polarity of a text; it is a well-studied task for which the number of available resources and proposed models is vast.
In contrast, Personality Recognition is a breakthrough task that aims to determine users' personality from their writing style; it is a more niche task with fewer ad hoc resources, but one with great potential.
Despite the fact that the principal focus of this work was the development of Deep Learning models, we have also proposed models based on linguistic resources and classical Machine Learning models. Moreover, in this more straightforward setup, we have explored the nuances of different language devices, such as the impact of emotions on the correct classification of the sentiment expressed in a text.
Afterwards, DL models were developed, particularly Convolutional Neural Networks (CNNs), to address previously described tasks. In the case of Personality Recognition, we explored the two approaches, which allowed us to compare the models under the same circumstances.
Notably, NLP has evolved dramatically in the last years through the development of public evaluation campaigns, where multiple research teams compare the performance of their approaches under the same conditions. Most of the models presented here were either assessed in such an evaluation campaign or used the setup of a previously held one. Recognising the importance of this effort, we curated and developed an evaluation campaign for classifying political tweets.
In addition, as we advanced in the development of this work, we decided to study in depth how CNNs are applied to NLP tasks.
Two lines of work were explored in this regard.
Firstly, we proposed a semantic-based padding method for CNNs, which addresses how to represent text more appropriately for solving NLP tasks. Secondly, a theoretical framework was introduced for tackling one of the most frequent criticisms of Deep Learning: interpretability. This framework seeks to visualise what lexical patterns, if any, the CNN is learning in order to classify a sentence.
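For orientation, here is a minimal sketch of the kind of convolutional sentence classifier these chapters build on, written in PyTorch; the hyperparameters are invented, and the thesis's semantic-based padding is not reproduced (inputs are assumed to be ordinary padded token ids).

```python
import torch
import torch.nn as nn

class TextCNN(nn.Module):
    """Minimal convolutional sentence classifier of the kind discussed above.
    Hyperparameters are illustrative; the thesis's semantic padding is not
    reproduced here, so inputs are assumed to be ordinary padded token ids."""

    def __init__(self, vocab_size=10000, emb_dim=100, n_filters=64,
                 kernel_sizes=(3, 4, 5), n_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.convs = nn.ModuleList(
            nn.Conv1d(emb_dim, n_filters, k) for k in kernel_sizes)
        self.fc = nn.Linear(n_filters * len(kernel_sizes), n_classes)

    def forward(self, token_ids):                      # (batch, seq_len)
        x = self.embedding(token_ids).transpose(1, 2)  # (batch, emb, seq)
        # One feature per filter via max-over-time pooling, then concatenate.
        feats = [conv(x).relu().max(dim=2).values for conv in self.convs]
        return self.fc(torch.cat(feats, dim=1))        # (batch, n_classes)

model = TextCNN()
logits = model(torch.randint(1, 10000, (8, 40)))  # batch of 8 sentences
print(logits.shape)  # torch.Size([8, 2])
```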
In summary, the main achievements presented in this thesis are:
- The organisation of an evaluation campaign for Topic Classification from texts gathered from social media.
- The proposal of several Machine Learning models tackling the Sentiment Analysis task on social media. In addition, a study of the impact of linguistic devices, such as figurative language, on the task is presented.
- The development of a model for inferring the personality of a developer given the source code that they have written.
- The study of Personality Recognition from social media following two different approaches: models based on classical machine learning algorithms with handcrafted features, and models based on CNNs; both approaches were proposed and compared.
- The introduction of new semantic-based paddings for optimising how text is represented in CNNs.
- The definition of a theoretical framework to provide interpretable information about what CNNs learn internally. / Giménez Fayos, MT. (2021). Natural Language Processing using Deep Learning in Social Media [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/172164
|
406 |
Information Extraction from Pilot Weather Reports (PIREPs) using a Structured Two-Level Named Entity Recognition (NER) Approach. Shantanu Gupta (18881197), 03 July 2024
Weather conditions such as thunderstorms, wind shear, snowstorms, turbulence, icing, and fog can create potentially hazardous flying conditions in the National Airspace System (NAS) (FAA, 2021). In general aviation (GA), hazardous weather conditions are the most likely to cause accidents with fatalities (FAA, 2013). Therefore, it is critical to communicate weather conditions to pilots and controllers to increase awareness of such conditions, help pilots avoid weather hazards, and improve aviation safety (NTSB, 2017b). Pilot Reports (PIREPs) are one way to communicate pertinent weather conditions encountered by pilots (FAA, 2017a). However, in a hazardous weather situation, communication adds to pilot workload, and GA pilots may need to aviate and navigate to another area before feeling safe enough to communicate the weather conditions. The delay in communication may result in PIREPs that are both inaccurate and untimely, potentially misleading other pilots in the area with incorrect weather information (NTSB, 2017a). Therefore, it is crucial to enhance the PIREP submission process to improve the accuracy, timeliness, and usefulness of PIREPs, while simultaneously reducing the need for hands-on communication.

In this study, a potential method to incrementally improve the performance of an automated spoken-to-coded-PIREP system is explored. This research aims at improving the information extraction model within the spoken-to-coded-PIREP system by using underlying structures and patterns in pilot spoken phrases. The first part of this research explores the structural elements, patterns, and sub-level variability in the Location, Turbulence, and Icing pilot phrases. The second part develops and demonstrates a structured two-level Named Entity Recognition (NER) model that utilizes the underlying structures within pilot phrases. A structured two-level NER model is designed, developed, tested, and compared with the initial single-level NER model in the spoken-to-coded-PIREP system. The model follows a structured approach to extract information at two levels within three PIREP information categories: Location, Turbulence, and Icing. The two-level NER model is trained and tested using a total of 126 PIREPs containing Turbulence and Icing weather conditions. The performance of the structured two-level NER model is compared to the performance of a comparable single-level initial NER model using three metrics: precision, recall, and F1-score. The overall F1-score of the initial single-level NER model was in the range of 68%–77%, while the two-level NER model was able to achieve an overall F1-score in the range of 89%–92%. The two-level NER model was successful in recognizing and labelling specific phrases with broader entity labels such as Location, Turbulence, and Icing, and then processing those phrases to segregate their structural elements such as Distance, Location Name, Turbulence Intensity, and Icing Type. With improvements to the information extraction model, the performance of the overall spoken-to-coded-PIREP system may be increased, and the system may be better equipped to handle the variations in pilot phrases and weather situations.
Automating the PIREP submission process may reduce the pilot's hands-on task requirements when submitting a PIREP during hazardous weather situations, potentially increase the quality and quantity of PIREPs, and share accurate weather-related information in a timely manner, ultimately making GA flying safer.
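To illustrate the two-level structure described above, here is a toy sketch: a first level assigns a broad entity label, and a second level segregates the structural elements of the labelled span. The patterns and phrase formats are invented; the thesis trains NER models rather than matching regular expressions.

```python
import re

# Illustrative sketch of the two-level structure described above: a first
# level tags a span with a broad entity label (Location, Turbulence, Icing),
# and a second level segregates its structural elements. The patterns and
# phrase formats are invented for illustration; the thesis trains NER models
# rather than using regular expressions.

SECOND_LEVEL = {
    "Turbulence": re.compile(
        r"(?P<intensity>light|moderate|severe)\s+turbulence"),
    "Location": re.compile(
        r"(?P<distance>\d+)\s*miles?\s+(?:from|of)\s+(?P<name>\w+)"),
}

def parse_two_level(phrase: str, first_level_label: str) -> dict:
    """Second-level pass: extract sub-entities from an already-labelled span."""
    match = SECOND_LEVEL[first_level_label].search(phrase.lower())
    return match.groupdict() if match else {}

# The first level would label these spans "Turbulence" and "Location";
# the second level then refines each into its structural elements.
print(parse_two_level("moderate turbulence", "Turbulence"))
# {'intensity': 'moderate'}
print(parse_two_level("20 miles from Lafayette", "Location"))
# {'distance': '20', 'name': 'lafayette'}
```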
|
407 |
Natural Language Based AI Tools in Interaction Design Research: Using ChatGPT for Qualitative User Research Insight Analysis. Saare, Karmen, January 2024
This thesis investigates the use of Artificial Intelligence, specifically the Large Language Model (LLM) application ChatGPT, in the context of qualitative user research, with the goal of enhancing the user research interview analysis process. Through an empirical study in which ChatGPT was used in the process of a typical user research insight analysis, the limitations and opportunities of the AI tool are examined. The study's results highlight the most significant insights from the empirical investigation, serving as examples to raise awareness of the implications of using ChatGPT in the context of user interview analysis. The study concludes that ChatGPT has the potential to enhance the interpretation of primarily individual interviews by generating well-articulated summaries, provided their accuracy can be verified. Additionally, ChatGPT may be particularly useful in low-risk design projects where the consequences of potential misinterpretations are minimal. Finally, the significance of clearly articulated written instructions to ChatGPT for achieving the best results is pointed out.
|
408 |
Event-Cap – Event Ranking and Transformer-based Video Captioning. Cederqvist, Gabriel; Gustafsson, Henrik, January 2024
In the field of video surveillance, vast amounts of data are gathered each day. To identify what occurred during a recorded session, a human annotator has to go through the footage and annotate the different events. This is a tedious and expensive process that takes up a large amount of time. With the rise of machine learning, and in particular deep learning, the fields of both image and video captioning have seen large improvements. Contrastive Language-Image Pre-training is capable of efficiently learning a multimodal space, and is thus able to merge the understanding of text and images. This enables visual features to be extracted and processed into text describing the visual content. This thesis presents a system for extracting and ranking important events from surveillance videos, as well as a way of automatically generating a description of each event. By utilizing the pre-trained models X-CLIP and GPT-2 to extract visual information from the videos and process it into text, a video captioning model was created that requires very little training. Additionally, a ranking system was implemented to extract important parts of a video, utilizing anomaly detection as well as polynomial regression. Captions were evaluated using the metrics BLEU, METEOR, ROUGE and CIDEr, and the model receives scores comparable to other video captioning models. Additionally, captions were evaluated by experts in the field of video surveillance, who rated them on accuracy, reaching up to 62.9%, and semantic quality, reaching 99.2%. Furthermore, the ranking system was also evaluated by the experts, who agree with the ranking system 78% of the time.
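A minimal sketch of the ranking idea described above, assuming per-frame anomaly scores are already available; fitting a polynomial trend and ranking frames by their residual above it is an illustrative reading of the method, not the thesis's exact algorithm.

```python
import numpy as np

# Illustrative sketch: rank events by how sharply per-frame anomaly scores
# deviate from their smooth trend, using polynomial regression as the trend
# model. The degree and residual-based scoring are assumptions made for
# illustration, not the thesis's exact method.

def rank_events(anomaly_scores: np.ndarray, top_k: int = 3, degree: int = 3):
    """Return indices of the top_k frames ranked by residual above trend."""
    t = np.arange(len(anomaly_scores))
    coeffs = np.polyfit(t, anomaly_scores, deg=degree)  # fit smooth trend
    trend = np.polyval(coeffs, t)
    residuals = anomaly_scores - trend                  # spikes above trend
    return np.argsort(residuals)[::-1][:top_k]

rng = np.random.default_rng(0)
scores = rng.normal(0.2, 0.05, 500)
scores[120], scores[310] = 0.95, 0.80                   # two injected events
print(rank_events(scores))  # indices near 120 and 310 rank highest
```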
|
409 |
Clustering and Anomaly detection using Medical Enterprise system Logs (CAMEL). Ahlinder, Henrik; Kylesten, Tiger, January 2023
Research on automated anomaly detection in complex systems using log files has been on an upswing with the introduction of new deep-learning natural language processing methods. However, manually identifying and labelling anomalous logs is time-consuming, error-prone, and labor-intensive. This thesis instead uses an existing state-of-the-art method that learns from positive-unlabelled (PU) data as a baseline, and evaluates three extensions to it. The first extension provides insight into the effect of the choice of word embeddings on the downstream task. The second extension applies a re-labelling strategy to reduce problems from pseudo-labelling. The final extension removes the need for pseudo-labelling by applying a state-of-the-art loss function from the field of PU learning. The findings show that FastText and GloVe embeddings are viable options, with FastText providing faster training times but mixed results in terms of performance. Several of the methods studied in this thesis are shown to suffer from sporadically poor performance on one of the datasets studied. Finally, it is shown that using modified risk functions from the field of PU learning provides new state-of-the-art performance on the datasets considered in this thesis.
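As a representative example of the modified PU risk functions mentioned above, the non-negative PU (nnPU) estimator of Kiryo et al. (2017) can be written as follows; whether this is the exact loss used in the thesis is not stated here, so treat it as a sketch.

```python
import torch

def nnpu_risk(scores_pos, scores_unl, prior,
              loss=torch.nn.functional.softplus):
    """Non-negative PU risk (Kiryo et al., 2017), a representative example of
    the modified PU risk functions mentioned above; not necessarily the exact
    loss used in the thesis.

    scores_pos: model scores on labelled-positive logs
    scores_unl: model scores on unlabelled logs
    prior:      assumed class prior pi = P(y = +1)
    loss:       surrogate loss; softplus(-z) is a smooth stand-in for 0-1 loss
    """
    risk_pos = loss(-scores_pos).mean()      # positives scored as positive
    risk_pos_neg = loss(scores_pos).mean()   # positives scored as negative
    risk_unl_neg = loss(scores_unl).mean()   # unlabelled scored as negative
    neg_risk = risk_unl_neg - prior * risk_pos_neg
    # Clamping the negative-class risk at zero is the "non-negative"
    # correction that stops the estimator from overfitting into negative risk.
    return prior * risk_pos + torch.clamp(neg_risk, min=0.0)

risk = nnpu_risk(torch.randn(32), torch.randn(128), prior=0.05)
print(risk.item())
```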
|
410 |
Direct Preference Optimization for Improved Technical Writing Assistance: A Study of How Language Models Can Support the Writing of Technical Documentation at Saab. Bengtsson, Hannes; Habbe, Patrik, January 2024
This thesis explores the potential of Large Language Models (LLMs) to assist in the technical documentation process at Saab. With the increasing complexity of and regulatory demands on such documentation, the objective is to investigate advanced natural language processing techniques as a means of streamlining the creation of technical documentation. Although many standards exist, this thesis focuses in particular on ASD-STE100, Simplified Technical English (STE), a controlled language for technical documentation. STE's primary aim is to ensure that technical documents are understandable to individuals regardless of their native language or English proficiency. The study focuses on the implementation of Direct Preference Optimization (DPO) and Supervised Instruction Fine-Tuning (SIFT) to refine the capabilities of LLMs in producing clear and concise output that complies with STE. Through a series of experiments, we investigate the effectiveness of LLMs in interpreting and simplifying technical language, with a particular emphasis on adherence to the STE standard. The study utilizes a dataset comprising target data paired with synthetic source data generated by an LLM. We apply various model training strategies, including zero-shot inference, supervised instruction fine-tuning, and direct preference optimization. We evaluate the various models' output using established quantitative metrics for text simplification, and substitute human evaluators with company-internal software that evaluates adherence to company standards and STE. Our findings suggest that while LLMs can significantly contribute to the technical writing process, the choice of training methods and the quality of the data play crucial roles in the model's performance. This study shows how LLMs can improve productivity and reduce manual work; it also examines the remaining problems and suggests directions for improving the automation of technical documentation in the future.
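For reference, here is a minimal sketch of the standard DPO objective (Rafailov et al., 2023) that the thesis builds on; the per-sequence log-probabilities are assumed to be computed elsewhere, and the beta value is illustrative.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO objective (Rafailov et al., 2023): push the policy to
    prefer the chosen completion over the rejected one, relative to a frozen
    reference model. Inputs are per-sequence summed log-probabilities,
    assumed computed elsewhere; beta controls deviation from the reference.
    """
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    # -log sigmoid(beta * margin); minimised when chosen outranks rejected.
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()

# Toy batch of 4 preference pairs (e.g. STE-compliant vs. non-compliant drafts).
loss = dpo_loss(torch.tensor([-10., -12., -9., -11.]),
                torch.tensor([-14., -13., -15., -12.]),
                torch.tensor([-11., -12., -10., -11.]),
                torch.tensor([-13., -13., -14., -12.]))
print(loss.item())
```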
|