181 |
Propagation of online consumer-perceived negativity: Quantifying the effect of supply chain underperformance on passenger car sales
Singh, A., Jenamani, M., Thakker, J.J., Rana, Nripendra P. 10 April 2021
Yes / The paper presents a text analytics framework that analyses online reviews to explore how consumer-perceived negativity about the supply chain propagates over time and how it affects car sales. In particular, the framework integrates aspect-level sentiment analysis using SentiWordNet, time-series decomposition, and the bias-corrected least squares dummy variable (LSDVc) estimator, a panel data estimator. The framework serves the business community by providing a list of consumers’ contemporary interests in the form of frequently discussed product attributes; quantifying the consumer-perceived performance of supply chain (SC) partners and comparing competitors; and offering a model that assesses the sales performance of various firms. The proposed framework is demonstrated on the automobile supply chain using a review dataset obtained from a renowned car portal in India. Our findings suggest that consumer-voiced negativity is highest for dealers and lowest for manufacturing- and assembly-related features. Firm age, GDP, and review volume significantly influence car sales, whereas the sentiments corresponding to SC partners do not. The proposed research framework can help manufacturers inspect their SC partners, identify the consumer-cited factors critical to car sales, and predict sales accurately, which in turn can support better production planning, supply chain management, marketing, and consumer relationships.
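To make the idea of aspect-level negativity per supply chain partner concrete, here is a minimal sketch. It is not the paper's pipeline: the tiny `LEXICON` stands in for SentiWordNet scores, the `ASPECTS` keyword-to-partner mapping is hypothetical, and a sentence's whole negative score is attributed to every partner it mentions, which is a deliberate simplification.

```python
from collections import defaultdict

# Hypothetical mini-lexicon standing in for SentiWordNet polarity scores.
LEXICON = {"rude": -0.6, "late": -0.5, "poor": -0.6, "smooth": 0.5, "reliable": 0.6}

# Hypothetical mapping from aspect keywords to supply chain partners.
ASPECTS = {"dealer": "dealer", "delivery": "logistics", "engine": "manufacturing"}


def aspect_negativity(reviews):
    """Return the mean negative score per partner over sentences that mention it."""
    scores = defaultdict(list)
    for review in reviews:
        for sentence in review.lower().split("."):
            words = sentence.replace(",", " ").split()
            partners = {ASPECTS[w] for w in words if w in ASPECTS}
            # Sum the magnitudes of negative words only; positive words are ignored.
            negativity = sum(-LEXICON[w] for w in words if LEXICON.get(w, 0.0) < 0)
            for partner in partners:
                scores[partner].append(negativity)
    return {partner: sum(v) / len(v) for partner, v in scores.items()}


reviews = [
    "The dealer was rude. The engine is smooth and reliable.",
    "Delivery was late, dealer service poor.",
]
result = aspect_negativity(reviews)
```

Averaging such scores per month would give the kind of negativity time series that the abstract then decomposes and feeds into a panel model; that step is omitted here.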
|
182 |
Chatting Over Course Material: The Role of Retrieval Augmented Generation Systems in Enhancing Academic Chatbots.
Monteiro, Hélder January 2024
Large Language Models (LLMs) have the potential to enhance learning among students. These tools can be used in chatbot systems that allow students to ask questions about course material, in particular when combined with so-called Retrieval Augmented Generation (RAG) systems. RAGs give LLMs access to external knowledge, which helps tailor responses when used in a chatbot system. This thesis studies different RAGs through an experimental approach in which each RAG is constructed using different sets of parameters and tools, including small and large language models. We suggest which of the RAGs best adapts to high school courses in Physics and undergraduate courses in Mathematics, such that the retrieval systems together with the LLMs are able to return the most relevant answers from the provided course material. We conclude with two RAG-powered LLMs with different configurations achieving over 64% accuracy in physics and 66% in mathematics.
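The retrieval step that gives an LLM access to course material can be sketched as follows. This is an illustrative toy, not one of the thesis's configurations: it ranks text chunks by bag-of-words cosine similarity (real systems typically use learned embeddings), and the function names and prompt wording are assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and strip basic punctuation; a stand-in for a real tokenizer."""
    return [w.strip(".,?!").lower() for w in text.split() if w.strip(".,?!")]


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, chunks, k=1):
    """Return the k chunks most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(chunks, key=lambda c: cosine(q, Counter(tokenize(c))), reverse=True)
    return ranked[:k]


def build_prompt(question, chunks, k=1):
    """Assemble a prompt that grounds the answer in the retrieved material."""
    context = "\n".join(retrieve(question, chunks, k))
    return f"Answer using only this course material:\n{context}\n\nQuestion: {question}"


chunks = [
    "Newton's second law states that force equals mass times acceleration.",
    "The derivative of a function measures its rate of change.",
]
question = "What does the derivative of a function measure?"
prompt = build_prompt(question, chunks)
```

The resulting `prompt` would then be sent to whichever language model the chatbot uses; chunk size, `k`, and the similarity function are the kind of parameters such an experimental comparison would vary.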
|
183 |
Investigating the impact of Generative AI on newcomers' understanding of Software Projects
Larsen, Knud Ronau, Edvall, Magnus January 2024
Context: In both commercial and open-source software development, newcomers often join the development process in the advanced stages of the software development lifecycle. Newcomers frequently face barriers impeding their ability to make early contributions, often caused by a lack of understanding. For this purpose, we have developed an LLM-based tool called SPAC-B that facilitates project-specific question-answering to aid newcomers' understanding of software projects. Objective: Investigate the LLM-based tool's ability to assist newcomers in understanding software projects by measuring its accuracy and conducting an experiment. Method: In this study, a case study is conducted to investigate the accuracy of the tool, measured in relevance, completeness, and correctness. Furthermore, an experiment is performed among software developers to test the tool's ability to help newcomers formulate better plans for open-source issues. Results: SPAC-B achieved an accuracy of 4.60 in relevance, 4.30 in completeness, and 4.28 in correctness on a scale from 1 to 5. It improved the combined mean score of the plans of the 10 participants in our experiments from 1.90 to 2.70, and 8 out of 10 participants found the tool helpful. Conclusions: SPAC-B has demonstrated high accuracy and helpfulness, but further research is needed to confirm if these results can be generalized to a larger population and other contexts of use.
|
184 |
TEXT ANNOTATION IN PARLIAMENTARY RECORDS USING BERT MODELS
Eriksson, Fabian January 2024
This thesis investigates whether a transformer-based language model can be improved by training the model on context sequences, which are input sequences with a larger window of text; by combining a transformer model with a neural network for non-text features; or by domain-adaptive pre-training. Two types of context input sequences are tested: left context and full context. The three modifications are explored by applying BERT models to the Swedish Parliamentary Corpus to classify whether a text sequence is a heading. A standard BERT model is trained for sequence classification alongside a position model, which adds a feedforward neural network to the model. Each model is trained with and without context sequences as well as with and without domain-adaptive pre-training. A standard implementation of the BERT model with domain adaptation achieves an F1 score of 0.9358 and an accuracy of 0.9940 on the test set. The best-performing standard BERT model with a context input sequence achieves an F1 of 0.9636 and an accuracy of 0.9966, while the best-performing position model achieves an F1 of 0.9550 and an accuracy of 0.9957. The best-performing model that combines context input sequences with the position model achieves an F1 of 0.9908 and an accuracy of 0.9991 on the test set. Analysis of misclassified sequences suggests that the models with context input sequences and positional features are less likely to misclassify sequences that can appear both as a heading and as a non-heading in the corpus. However, McNemar's exact test indicates that only a position model with left context input sequences differs significantly from its standard BERT counterpart in the number of differing misclassifications at a 5% significance level. Furthermore, there is no experimental evidence that domain-adaptive pre-training improves classification performance on this specific sequence classification task.
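For readers unfamiliar with McNemar's exact test, the sketch below shows how two classifiers evaluated on the same test set are compared. Only the discordant pairs matter: `b` counts sequences that model A classifies correctly and model B does not, and `c` counts the reverse. The counts in the usage lines are made up for illustration and are not taken from the thesis.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the two discordant counts.

    Under the null hypothesis, each discordant pair is equally likely to
    favour either model, so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no disagreements, hence no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts, for illustration only.
p_clear = mcnemar_exact(1, 9)    # one model wins most disagreements
p_unclear = mcnemar_exact(4, 6)  # disagreements are roughly balanced
```

With these made-up counts, `p_clear` falls below 0.05 while `p_unclear` does not, which mirrors how two models with different F1 scores can still fail to differ significantly when their disagreements are few or balanced.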
|
185 |
Functional linguistic based motivations for a conversational software agent
Panesar, Kulvinder 07 October 2020
Yes / This chapter discusses a linguistically orientated model of a conversational software agent (CSA) (Panesar 2017) framework sensitive to natural language processing (NLP) concepts and the levels of adequacy of a functional linguistic theory (LT). We discuss the relationship between NLP and knowledge representation (KR), and connect this with the goals of a linguistic theory (Van Valin and LaPolla 1997), in particular Role and Reference Grammar (RRG) (Van Valin Jr 2005). We debate the advantages of RRG and consider its fitness and computational adequacy. We present a design of a computational model of the linking algorithm that utilises a speech act construction as a grammatical object (Nolan 2014a, Nolan 2014b) and the sub-model of belief, desire and intentions (BDI) (Rao and Georgeff 1995). This model has been successfully implemented in software, using the resource description framework (RDF), and we highlight some implementation issues that arose at the interface between language and knowledge representation (Panesar 2017).
|
186 |
Analysis of Security Findings and Reduction of False Positives through Large Language Models
Wagner, Jonas 18 October 2024
This thesis investigates the integration of State-of-the-Art (SOTA) Large Language Models (LLMs) into the process of reassessing security findings generated by Static Application Security Testing (SAST) tools. The primary objective is to determine whether LLMs are able to detect false positives (FPs) while maintaining a high true positive (TP) rate, thereby enhancing the efficiency and effectiveness of security assessments.
Four consecutive experiments were conducted, each addressing specific research questions. The initial experiment, using a dataset of security findings extracted from the OWASP Benchmark, identified the optimal combination of context items provided by the SAST tool SpotBugs, which, when used with GPT-3.5 Turbo, reduced FPs while minimizing the loss of TPs. The second experiment, conducted on the same dataset, demonstrated that advanced prompting techniques, particularly few-shot Chain-of-Thought (CoT) prompting combined with Self-Consistency (SC), further improved the reassessment process. The third experiment compared both proprietary and open-source LLMs on an OWASP Benchmark dataset about one-fourth the size of the previously used dataset. GPT-4o achieved the highest performance, detecting 80 out of 128 FPs without missing any TPs, resulting in a perfect TPR of 100% and a decrease in FPR of 41.27 percentage points. Meanwhile, Llama 3.1 70B detected 112 out of the 128 FPs but missed 10 TPs, resulting in a TPR of 94.94% and a reduction in FPR of 56.62 percentage points. To validate these findings in a real-world context, the approach was applied to a dataset generated from the open-source project Mnestix using multiple SAST tools. GPT-4o again emerged as the top performer, detecting 26 out of 68 FPs while missing only one TP, which lowered the TPR by 2.22 percentage points and the FPR by 37.57 percentage points.
Table of Contents
List of Figures
List of Tables
List of Source Codes
List of Abbreviations
1. Motivation
2. Background
3. Related Work
4. Concept
5. Preparing a Security Findings Dataset
6. Implementing a Workflow
7. Identifying Context Items
8. Comparing Prompting Techniques
9. Comparing Large Language Models
10. Evaluating Developed Approach
11. Discussion
12. Conclusion
A. Appendix: Figures
A.1. Repository Directory Tree
A.2. Precision-Recall Curve of Compared Large Language Models
A.3. Performance Metrics Self-Consistency on Mnestix Dataset
B. Appendix: Tables
B.1. Design Science Research Concept
C. Appendix: Code
C.1. Pydantic Base Config Documentation
C.2. Pydantic LLM Client Config Documentation
C.3. LLM BaseClient Class
C.4. Test Cases Removed From Dataset
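The Self-Consistency technique named in the abstract above can be sketched as a majority vote over several independently sampled model answers for the same finding. The sketch is illustrative rather than the thesis's implementation: the verdict labels, the function name, and the tie-breaking rule (keep the finding as a true positive, so that ties never discard a real vulnerability) are assumptions.

```python
from collections import Counter


def self_consistent_verdict(samples, default="true_positive"):
    """Majority-vote several sampled verdicts for one security finding.

    Returns the winning verdict and the share of samples that voted for it.
    On a tie, fall back to `default` (an assumed conservative choice).
    """
    if not samples:
        raise ValueError("at least one sampled verdict is required")
    counts = Counter(samples)
    top = max(counts.values())
    winners = [verdict for verdict, n in counts.items() if n == top]
    verdict = winners[0] if len(winners) == 1 else default
    return verdict, counts[verdict] / len(samples)


# Hypothetical verdicts from three sampled Chain-of-Thought completions.
verdict, agreement = self_consistent_verdict(
    ["false_positive", "false_positive", "true_positive"]
)
tie_verdict, tie_agreement = self_consistent_verdict(
    ["false_positive", "true_positive"]
)
```

The agreement share can also serve as a rough confidence signal, for example to dismiss a finding only when a large majority of samples call it a false positive.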
|
187 |
Deep Learning for Uncertainty Measurement
Kim, Alisa 12 February 2021
This thesis focuses on solving the problem of uncertainty measurement and its impact on business decisions while pursuing two goals. The first is to develop and validate accurate and robust models for uncertainty quantification, employing both well-established statistical models and newly developed machine learning tools, with a particular focus on deep learning. The second goal revolves around the industrial application of the proposed models, applying them to real-world cases in which measuring volatility or making a risky decision entails a direct and substantial gain or loss.
This thesis started with the exploration of implied volatility (IV) as a proxy for investors' perception of uncertainty for a new class of assets: crypto-currencies. The second paper focused on methods to identify risk-loving traders and employed DNN infrastructure to further investigate the risk-taking behavior of market actors, which both stems from and perpetuates uncertainty. The third paper addressed the challenging endeavor of fraud detection and offered a decision support model that allowed a more accurate and interpretable evaluation of financial reports submitted for audit.
Given the importance of risk assessment and agents' expectations in economic development, and building on the existing work of Baker (2016) and their economic policy uncertainty (EPU) index, the fourth paper offered a novel DL-NLP-based method for the quantification of economic policy uncertainty.
In summary, this thesis offers insights that are highly relevant to both researchers and practitioners. The new deep learning-based solutions exhibit superior performance to existing approaches to quantify and explain economic uncertainty, allowing for more accurate forecasting, enhanced planning capacities, and mitigated risks. The offered use-cases provide a road-map for further development of the DL tools in practice and constitute a platform for further research.
|
188 |
Ancient Geography goes digital: Representation of Spatial Orientation in Ancient Texts
Thiering, Martin, Goerz, Guenther, Ilysushechkina, Ekaterina 19 March 2018
No description available.
|
189 |
Evaluation of Approaches for Representation and Sentiment of Customer Reviews
Giorgis, Stavros January 2021
Classification of sentiment in customer reviews is a real-world application for many companies that offer text analytics and opinion extraction on customer reviews in domains such as consumer electronics, hotels, restaurants, and car rental agencies. Recent progress in Natural Language Processing has produced many new state-of-the-art approaches for representing the meaning of sentences, phrases, and words in text using vector space models, so-called embeddings. In this thesis, we evaluated the most current and most popular text representation techniques against traditional methods as a baseline. The evaluation dataset, provided by a text analysis company, consists of customer reviews of varying length from different domains, and the evaluation compares automatic analyses with human judgments. By exploring candidate training datasets, we evaluated which were the most suitable for this specific task. Furthermore, we explored different techniques that could be used to alter a language model’s decisions without retraining it. Finally, all methods were evaluated on their time performance and resource requirements to present an overall experimental assessment that could help the company decide which technique is the most appropriate to replace its system in a production environment.
|
190 |
A Prompting Framework for Natural Language Processing in the Medical Field: Assessing the Potential of Large Language Models for Swedish Healthcare
Mondal, Anim January 2023
The increasing digitisation of healthcare through the use of technology and artificial intelligence has affected the medical field in a multitude of ways. Generative Pre-trained Transformers (GPTs) are a collection of language models that have been trained on an extensive dataset to generate human-like text and have been shown to achieve a strong understanding of natural language. This thesis investigates whether GPT-SW3, a large language model for the Swedish language, is capable of responding to healthcare tasks accurately given prompts and context. To that end, a framework consisting of general medical questions, an evaluation of medical reasoning, and conversations between a doctor and a patient was created to evaluate GPT-SW3's abilities in the respective areas. Each component has a ground truth that is used when evaluating the responses. Based on the results, GPT-SW3 is capable of dealing with specific medical tasks and shows, in particular instances, signs of understanding. In more basic tasks, GPT-SW3 manages to provide adequate answers to some questions. In more advanced scenarios, such as conversation and reasoning, GPT-SW3 struggles to provide coherent answers reminiscent of a human doctor's conversation. While there have been great advancements in natural language processing, further work on a Swedish model is needed to create a model that is useful for healthcare. Whether that work should consist of fine-tuning the model weights or retraining the models with domain-specific data is left to future work.
|