131 |
The past, present or future? : A comparative NLP study of Naive Bayes, LSTM and BERT for classifying Swedish sentences based on their tense
Navér, Norah January 2021
Natural language processing (NLP) is a field in computer science that is becoming increasingly important. One important part of NLP is the ability to sort text into the past, present or future, depending on when the event occurred or will occur. The objective of this thesis was to use text classification to classify Swedish sentences by their tense: past, present or future. A further objective was to compare how lemmatisation affects the performance of the models. The problem was tackled by implementing three machine learning models, Naive Bayes, LSTM and BERT, on both lemmatised and non-lemmatised data. The results showed that overall performance was negatively affected when the data was lemmatised. The best-performing model was BERT, with an accuracy of 96.3%. The result was useful because the best-performing model had very high accuracy and performed well on newly constructed sentences.
|
132 |
Transfer Learning for Automatic Author Profiling with BERT Transformers and GloVe Embeddings
From, Viktor January 2022
Historically, author profiling has been used in forensic linguistics. However, it is only in recent decades that the analysis method has made its way into computer science and machine learning. In comparison, determining author profiling characteristics in machine learning is nothing new. This paper investigates the possibility of improving upon previous results with modern frameworks, using data sets that have seen limited usage. The purpose of this master thesis was to use pre-trained transformers or embeddings together with transfer learning, and to examine whether general author profiling characteristics of anonymous users on internet forums or in conversations on social media could be determined. The data sets used to investigate these questions were PAN15 and PANDORA, which contain text data from authors paired with ground-truth labels such as gender, age, and Big Five/OCEAN personality traits. Transfer learning with BERT and GloVe was used as a starting point to decrease the learning time for a new task. PAN15, a Twitter data set, did not contain enough data for training a model and was augmented using PANDORA, a Reddit-based data set. Ultimately, BERT obtained the best performance using a stacked approach, achieving 86–91% accuracy for each label on unseen data.
|
133 |
Multimodal Model for Construction Site Aversion Classification
Appelstål, Michael January 2020
Aversions on construction sites can be anything from missing material to fire hazards or insufficient cleaning. These aversions appear very often on construction sites, and the construction company needs to report and take care of them in order for the site to run correctly. The reports consist of an image of the aversion and a text describing the aversion. Report categorization is currently done manually, which is both time- and cost-ineffective. The task for this thesis was to implement and evaluate an automatic multimodal machine learning classifier for the reported aversions that utilized both the image and text data from the reports. The model presented is a late-fusion model consisting of a Swedish BERT text classifier and a VGG16 for image classification. The results showed that an automated classifier is feasible for this task and could be used in real life to make the classification task more time- and cost-efficient. The model scored 66.2% accuracy and 89.7% top-5 accuracy on the task, and the experiments revealed some areas of improvement in the data and model that could be further explored to potentially improve the performance.
|
134 |
Annotating Job Titles in Job Ads using Swedish Language Models
Ridhagen, Markus January 2023
This thesis investigates automated annotation approaches to assist public authorities in Sweden in optimizing resource allocation and gaining valuable insights to enhance the preservation of high-quality welfare. The study uses pre-trained Swedish language models for the named entity recognition (NER) task of finding job titles in job advertisements from The Swedish Public Employment Service, Arbetsförmedlingen. Specifically, it evaluates the performance of the Swedish Bidirectional Encoder Representations from Transformers (BERT), developed by the National Library of Sweden (KB), referred to as KB-BERT. The thesis explores the impact of training data size on the models’ performance and examines whether active learning can enhance efficiency and accuracy compared to random sampling. The findings reveal that even with a small training dataset of 220 job advertisements, KB-BERT achieves a commendable F1-score of 0.770 in predicting job titles. The model’s performance improves further by augmenting the training data with an additional 500 annotated job advertisements, yielding an F1-score of 0.834. Notably, the highest F1-score of 0.856 is achieved by applying the active learning strategy of uncertainty sampling and the measure of mean entropy. The test data provided by Arbetsförmedlingen was re-annotated to evaluate the complexity of the task. The human annotator achieved an F1-score of 0.883. Based on these findings, it can be inferred that KB-BERT performs satisfactorily in classifying job titles from job ads.
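The active learning strategy named in the abstract above (uncertainty sampling with a mean-entropy measure) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the dictionary-based pool, and the assumption that the model exposes one label distribution per token are all hypothetical, chosen only to show the idea of ranking unlabeled ads by average token-level entropy.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in nats) of one token's predicted label distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def mean_entropy(doc_token_probs):
    """Average token entropy for one document; 0.0 for an empty document."""
    if not doc_token_probs:
        return 0.0
    return sum(token_entropy(t) for t in doc_token_probs) / len(doc_token_probs)


def select_for_annotation(pool, k):
    """Return the ids of the k documents the model is least certain about.

    `pool` is a hypothetical structure: doc id -> list of per-token
    label distributions produced by some NER model.
    """
    ranked = sorted(pool, key=lambda doc_id: mean_entropy(pool[doc_id]), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    pool = {
        "ad_a": [[0.98, 0.01, 0.01], [0.90, 0.05, 0.05]],  # confident predictions
        "ad_b": [[0.40, 0.30, 0.30], [0.50, 0.25, 0.25]],  # uncertain predictions
        "ad_c": [[0.60, 0.20, 0.20]],
    }
    print(select_for_annotation(pool, 2))
```

Under this sketch, the selected ads would be sent to a human annotator and added to the training set, in contrast to random sampling, which picks ads without regard to model uncertainty.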
|
135 |
Characterizing, classifying and transforming language model distributions
Kniele, Annika January 2023
Large Language Models (LLMs) have become ever larger in recent years, typically demonstrating improved performance as the number of parameters increases. This thesis investigates how the probability distributions output by language models differ depending on the size of the model. For this purpose, three features for capturing the differences between the distributions are defined, namely the difference in entropy, the difference in probability mass in different slices of the distribution, and the difference in the number of tokens covering the top-p probability mass. The distributions are then put into different distribution classes based on how they differ from the distributions of the differently-sized model. Finally, the distributions are transformed to be more similar to the distributions of the other model. The results suggest that classifying distributions before transforming them, and adapting the transformations based on which class a distribution is in, improves the transformation results. It is also shown that letting a classifier choose the class label for each distribution yields better results than using random labels. Furthermore, the findings indicate that transforming the distributions using entropy and the number of tokens in the top-p probability mass makes the distributions more similar to the targets, while transforming them based on the probability mass of individual slices of the distributions makes the distributions more dissimilar.
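The three features listed in the abstract above (entropy difference, probability mass per slice, and number of tokens covering the top-p mass) can be sketched in plain Python. This is an illustration under stated assumptions rather than the thesis's code: the function names are hypothetical, and "slices" are assumed here to be contiguous rank ranges of the distribution sorted by descending probability, which the abstract does not specify.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def top_p_count(probs, p=0.9):
    """Smallest number of highest-probability tokens whose total mass reaches p."""
    total = 0.0
    for k, q in enumerate(sorted(probs, reverse=True), start=1):
        total += q
        if total >= p - 1e-12:  # small tolerance for floating-point rounding
            return k
    return len(probs)


def slice_masses(probs, n_slices=4):
    """Mass in at most n_slices contiguous rank slices (assumed slice definition)."""
    ranked = sorted(probs, reverse=True)
    size = math.ceil(len(ranked) / n_slices)
    return [sum(ranked[i:i + size]) for i in range(0, len(ranked), size)]


def compare(small, large, p=0.9, n_slices=4):
    """Feature differences (large minus small) between two next-token distributions."""
    return {
        "entropy_diff": entropy(large) - entropy(small),
        "top_p_count_diff": top_p_count(large, p) - top_p_count(small, p),
        "slice_mass_diff": [
            b - a
            for a, b in zip(slice_masses(small, n_slices), slice_masses(large, n_slices))
        ],
    }


if __name__ == "__main__":
    small_model = [0.7, 0.1, 0.1, 0.1]      # peaked toy distribution
    large_model = [0.25, 0.25, 0.25, 0.25]  # flat toy distribution
    print(compare(small_model, large_model, p=0.9, n_slices=2))
```

Both toy distributions are invented for the example; in practice each would be a model's output distribution over the same vocabulary for the same context.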
|
136 |
Clustering and Summarization of Chat Dialogues : To understand a company’s customer base / Klustring och Summering av Chatt-Dialoger
Hidén, Oskar, Björelind, David January 2021
The Customer Success department at Visma handles about 200 000 customer chats each year. The chat dialogues are stored and contain both questions and answers. In order to get an idea of what customers ask about, the Customer Success department has to read a random sample of the chat dialogues manually. This thesis develops and investigates an analysis tool for the chat data, using the approach of clustering and summarization. The approach aims to decrease the time spent on the analysis and increase its quality. Models for clustering (K-means, DBSCAN and HDBSCAN) and extractive summarization (K-means, LSA and TextRank) are compared. Each algorithm is combined with three different text representations (TFIDF, S-BERT and FastText) to create models for evaluation. These models are evaluated against a test set created for the purpose of this thesis. Silhouette Index and Adjusted Rand Index are used to evaluate the clustering models. The ROUGE measure, together with a qualitative evaluation, is used to evaluate the extractive summarization models. In addition, the best clustering model is further evaluated to understand how different data sizes impact performance. TFIDF Unigram together with HDBSCAN or K-means obtained the best results for clustering, whereas FastText together with TextRank obtained the best results for extractive summarization. This thesis applies known models to the textual domain of customer chat dialogues, something that, to our knowledge, has not previously been done in the literature.
|
137 |
Efficient Estimation for Small Multi-Rotor Air Vehicles Operating in Unknown, Indoor Environments
Macdonald, John Charles 07 December 2012
In this dissertation we present advances in developing an autonomous air vehicle capable of navigating through unknown, indoor environments. The problem imposes stringent limits on the computational power available onboard the vehicle, but the environment necessitates using 3D sensors such as stereo or RGB-D cameras whose data requires significant processing. We address the problem by proposing and developing key elements of a relative navigation scheme that moves as many processing tasks as possible out of the time-critical functions needed to maintain flight. We present in Chapter 2 analysis and results for an improved multirotor helicopter state estimator. The filter generates more accurate estimates by using an improved dynamic model for the vehicle and by properly accounting for the correlations that exist in the uncertainty during state propagation. As a result, the filter can rely more heavily on frequent and easy to process measurements from gyroscopes and accelerometers, making it more robust to error in the processing intensive information received from the exteroceptive sensors. In Chapter 3 we present BERT, a novel approach to map optimization. The goal of map optimization is to produce an accurate global map of the environment by refining the relative pose transformation estimates generated by the real-time navigation system. We develop BERT to jointly optimize the global poses and relative transformations. BERT exploits properties of independence and conditional independence to allow new information to efficiently flow through the network of transformations. We show that BERT achieves the same final solution as a leading iterative optimization algorithm. However, BERT delivers noticeably better intermediate results for the relative transformation estimates. The improved intermediate results, along with more readily available covariance estimates, make BERT especially applicable to our problem where computational resources are limited. 
We conclude in Chapter 4 with analysis and results that extend BERT beyond the simple example of Chapter 3. We identify important structure in the network of transformations and address challenges arising in more general map optimization problems. We demonstrate results from several variations of the algorithm and conclude the dissertation with a roadmap for future work.
|
138 |
Natural Language Processing for Swedish Nuclear Power Plants : A study of the challenges of applying Natural language processing in Operations and Maintenance and how BERT can be used in this industry
Kåhrström, Felix January 2022
In this study, the current use of natural language processing (NLP) in Swedish and international nuclear power plants has been investigated through semi-structured interviews. Furthermore, natural language processing techniques have been studied to find out how text data can be analyzed and utilized to aid operations and maintenance in the Swedish nuclear power plant industry. The state-of-the-art transformer model BERT was used to analyze text data from operations at a Swedish nuclear power plant. This study has not managed to find any current implementations of natural language processing techniques for operations and maintenance in Swedish nuclear power plants. Natural language processing does exist in examples such as embedded search functionalities internally or chatbots on the customer side, but these fall outside the scope of this project. Some international actors have successfully implemented natural language processing for the classification of text data such as corrective action programs. Furthermore, it was observed that the lingo and jargon in the nuclear power plant industry differ between utilities as well as from the native language. To tackle this, models further trained on domain-specific data could be beneficial to better analyze the text data and solve natural language processing tasks. As the data used in this study was unlabeled, expert input from the nuclear domain is required for a proper analysis of the results. Working toward a more data-driven industry would be valuable for the implementation of natural language processing. / Feasibility Study on Artificial Intelligence Technologies in Nuclear Applications
|
139 |
Semantically Aligned Sentence-Level Embeddings for Agent Autonomy and Natural Language Understanding
Fulda, Nancy Ellen 01 August 2019
Many applications of neural linguistic models rely on their use as pre-trained features for downstream tasks such as dialog modeling, machine translation, and question answering. This work presents an alternate paradigm: Rather than treating linguistic embeddings as input features, we treat them as common sense knowledge repositories that can be queried using simple mathematical operations within the embedding space, without the need for additional training. Because current state-of-the-art embedding models were not optimized for this purpose, this work presents a novel embedding model designed and trained specifically for the purpose of "reasoning in the linguistic domain". Our model jointly represents single words, multi-word phrases, and complex sentences in a unified embedding space. To facilitate common-sense reasoning beyond straightforward semantic associations, the embeddings produced by our model exhibit carefully curated properties including analogical coherence and polarity displacement. In other words, rather than training the model on a smorgasbord of tasks and hoping that the resulting embeddings will serve our purposes, we have instead crafted training tasks and placed constraints on the system that are explicitly designed to induce the properties we seek. The resulting embeddings perform competitively on the SemEval 2013 benchmark and outperform state-of-the-art models on two key semantic discernment tasks introduced in Chapter 8. The ultimate goal of this research is to empower agents to reason about low level behaviors in order to fulfill abstract natural language instructions in an autonomous fashion. An agent equipped with an embedding space of sufficient caliber could potentially reason about new situations based on their similarity to past experience, facilitating knowledge transfer and one-shot learning. As our embedding model continues to improve, we hope to see these and other abilities become a reality.
|
140 |
Fine-tuning a BERT-based NER Model for Positive Energy Districts
Ortega, Karen, Sun, Fei January 2023
This research presents an innovative approach to extracting information from Positive Energy Districts (PEDs), urban areas generating surplus energy. PEDs are integral to the European Commission's SET Plan, tackling housing challenges arising from population growth. The study refines BERT to categorize PED-related entities, producing a cutting-edge NER model and an integrated pipeline of diverse NER tools and data sources. The model achieves an accuracy of 0.81 and an F1 Score of 0.55 with notably high confidence scores through pipeline evaluations, confirming its practical applicability. While the F1 score falls short of expectations, this pioneering exploration in PED information extraction sets the stage for future refinements and studies, promising enhanced methodologies and impactful outcomes in this dynamic field. This research advances NER processes for Positive Energy Districts, supporting their development and implementation.
|