  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
921

Modeling Actions and State Changes for a Machine Reading Comprehension Dataset

January 2019 (has links)
abstract: Artificial general intelligence consists of many components, one of which is Natural Language Understanding (NLU). One application of NLU is Reading Comprehension, where a system is expected to understand all aspects of a text. Understanding natural procedure-describing text, which deals with the existence of entities and the effects of actions on those entities, while performing reasoning and inference at the same time, is a particularly difficult task. ProPara, a recent natural language dataset from the Allen Institute for Artificial Intelligence, addresses the challenge of determining entity existence and tracking entities in such text. As part of this work, an attempt is made to address the ProPara challenge. The Knowledge Representation and Reasoning (KRR) community has developed effective techniques for modeling and reasoning about actions, and similar techniques are used in this work. A system combining Inductive Logic Programming (ILP) and Answer Set Programming (ASP) is used to address the challenge; it achieves results close to the state of the art and provides an explainable model. An existing semantic role labeling parser is modified and used to parse the dataset. Analysis of the learnt model showed that some of the rules were not generic enough. To overcome this, the Proposition Bank dataset is used to add knowledge, in an attempt to generalize the ILP-learnt rules and improve the results. / Dissertation/Thesis / Masters Thesis Computer Science 2019
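The core of the ProPara task, deciding whether each entity exists and where it is after every step of a procedure, can be illustrated with a small state-transition sketch. The snippet below is only an illustration of the underlying action-effect idea that the KRR techniques formalize; the action names and the example procedure are made up, and it is not the ASP/ILP system described above.

```python
# Minimal sketch of ProPara-style entity tracking: each step of a procedure
# applies an action effect to a world state mapping entity -> location.
# "-" marks a non-existent entity, "?" an unknown location (ProPara convention).

def apply_step(state, action, entity, location=None):
    """Update the world state for one procedure step."""
    if action == "create":          # entity comes into existence
        state[entity] = location or "?"
    elif action == "destroy":       # entity ceases to exist
        state[entity] = "-"
    elif action == "move":          # entity changes location
        state[entity] = location
    return state

# Hypothetical photosynthesis-like procedure: (action, entity, location) per step.
steps = [
    ("move",    "water", "leaf"),
    ("destroy", "water", None),
    ("create",  "sugar", "leaf"),
]

state = {"water": "root", "sugar": "-"}   # initial participants
for action, entity, location in steps:
    state = apply_step(state, action, entity, location)
    print(action, entity, "->", dict(state))
```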
922

Similarity Search in Document Collections

Jordanov, Dimitar Dimitrov January 2009 (has links)
The main goal of this thesis is to assess the performance of the freely distributed Semantic Vectors package and the MoreLikeThis class from the Apache Lucene package. The thesis compares these two approaches and introduces methods that may lead to improved search quality.
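As a rough illustration of the kind of "more like this" similarity search being compared in this thesis, the sketch below ranks a small document collection by TF-IDF cosine similarity to a query. It uses scikit-learn rather than Apache Lucene's MoreLikeThis or the Semantic Vectors package, so it mirrors only the underlying idea, not either implementation.

```python
# Toy "more like this" search: rank a collection by cosine similarity of
# TF-IDF vectors to a query (illustrative only).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "full-text search with inverted indexes",
    "vector space models for document similarity",
    "cooking recipes for quick dinners",
]
query = "semantic similarity between documents"

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(docs)     # index the collection
query_vector = vectorizer.transform([query])     # vectorize the query

scores = cosine_similarity(query_vector, doc_vectors)[0]
for doc, score in sorted(zip(docs, scores), key=lambda x: -x[1]):
    print(f"{score:.3f}  {doc}")
```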
923

Modelování emocí v komunikačním agentu / Modelling Emotions in Communication Agents

Sivák, Martin Unknown Date (has links)
This work deals with current chatterbot systems. It describes their problems and possibilities for improvement, with emphasis on natural language processing and emotion modeling during conversation. The second part of the thesis introduces an implementation based on the described knowledge, together with an experimental evaluation of its success rate.
924

Power Outage Management using Social Sensing

Khan, Sifat Shahriar 02 July 2019 (has links)
No description available.
925

Neural Network Models for Tasks in Open-Domain and Closed-Domain Question Answering

Chen, Charles L. 01 June 2020 (has links)
No description available.
926

Text simplification in Swedish using transformer-based neural networks / Textförenkling på Svenska med transformer-baserade neurala nätverk

Söderberg, Samuel January 2023 (has links)
Text simplification involves modifying text to make it easier to read by replacing complex words, altering sentence structure, and/or removing unnecessary information. It can be used to make text more accessible to a larger audience. While research on text simplification exists for Swedish, the use of neural networks in the field is limited. Neural networks require large-scale, high-quality datasets, but such datasets are scarce for text simplification in Swedish. This study investigates the acquisition of datasets through paraphrase mining from web snapshots and through translation of existing English text-simplification datasets into Swedish, and assesses the performance of neural network models trained on the acquired data.
Three datasets with complex-to-simple sequence pairs were created: one by mining paraphrases from web data, another by translating a dataset from English to Swedish, and a third by combining the mined and translated datasets. These datasets were then used to fine-tune a BART neural network model pre-trained on large amounts of Swedish data. Evaluation was conducted through manual examination and categorization of output, and through automated assessment using the SARI and LIX metrics. Two test sets were evaluated, one translated from English and one manually constructed from Swedish texts. The automatic evaluation produced SARI scores close to, but not as high as, those reported in similar research on text simplification in English. In terms of LIX scores, the models performed on par with or better than existing research on automatic text simplification in Swedish. The manual evaluation revealed that the model trained on the mined paraphrases generally produced short sequences with many alterations compared to the original, while the model trained on the translated dataset often produced unchanged sequences or sequences with few alterations. However, the model trained on the mined dataset produced many more unusable sequences, either with corrupted Swedish or with the meaning of the sequence altered, than the model trained on the translated dataset. The model trained on the combined dataset reached a middle ground in these two regards, producing fewer unusable sequences than the model trained on the mined dataset and fewer unchanged sequences than the model trained on the translated dataset. Many sequences were successfully simplified by the three models, but the manual evaluation showed that a significant portion of the generated sequences remained unchanged or unusable, highlighting the need for further research, exploration of methods, and tool refinement.
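The two automatic metrics mentioned above are straightforward to reproduce in outline: LIX is a readability index computed from word and sentence counts, and SARI scores a simplification against the source and reference texts. The sketch below implements LIX from its standard formula; the SARI call is left as a comment because it assumes the interface of the Hugging Face `evaluate` library, and the example sentence is made up. This is an illustration, not the thesis's evaluation pipeline.

```python
# LIX readability index: average sentence length plus the percentage of
# long words (more than six characters). Lower scores mean easier text.
import re

def lix(text: str) -> float:
    words = re.findall(r"\w+", text)
    sentences = [s for s in re.split(r"[.!?:]+", text) if s.strip()]
    long_words = [w for w in words if len(w) > 6]
    return len(words) / max(len(sentences), 1) + 100 * len(long_words) / max(len(words), 1)

print(round(lix("Texten förenklas så att den blir lättare att läsa."), 1))

# SARI via the Hugging Face `evaluate` library (interface assumed):
# import evaluate
# sari = evaluate.load("sari")
# score = sari.compute(sources=[source_sentence],
#                      predictions=[simplified_sentence],
#                      references=[[reference_simplification]])
```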
927

Password habits of Sweden

Gustafsson, Daniel January 2023 (has links)
The password is the first line of defence in most modern web services, so it is critical to choose a strong password. Many previous studies have found patterns that could be improved in global users' password creation, but none have examined the patterns of Swedish users in particular. In this project, passwords of Swedish users were gathered from underground forums and analyzed to determine whether Swedish users create passwords differently from global users and whether there are any weak patterns in their passwords. We found that Swedish users often use words or names found in a Swedish NLP corpus in their passwords, and that they use lowercase letters more frequently than global users. We also found that several of the most popular Swedish websites use weak password policies, which might contribute to Swedish users choosing weak passwords.
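The corpus-based pattern analysis described above can be sketched in a few lines: count how many leaked passwords contain a word from a Swedish word list and how many consist of lowercase letters only. The word list and passwords below are placeholders, not the actual data sources used in the project.

```python
# Illustrative password pattern analysis: share of passwords containing a
# Swedish dictionary word and share made up of lowercase letters only.

swedish_words = {"sommar", "hej", "blomma", "stockholm"}     # placeholder word list
passwords = ["sommar2015", "Hejhej123", "qwerty", "blomma"]  # placeholder passwords

contains_word = sum(
    any(word in pw.lower() for word in swedish_words) for pw in passwords
)
lowercase_only = sum(pw.isalpha() and pw.islower() for pw in passwords)

print(f"contain a Swedish word: {contains_word / len(passwords):.0%}")
print(f"lowercase letters only: {lowercase_only / len(passwords):.0%}")
```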
928

Invoice Line Item Extraction using Machine Learning SaaS Models

Kadir, Avin January 2022 (has links)
Manual invoice processing is a time-consuming and error-prone task that can be performed more efficiently by automation software that minimizes the need for human input. Amazon Textract is a software-as-a-service offering from Amazon Web Services built for this purpose. It uses machine learning models to extract data from both general and financial documents, such as receipts and invoices. The service is available in multiple widely spoken languages, but not in Swedish at the time of writing. This thesis explores the potential and accuracy of Amazon Textract in extracting data from Swedish invoices using the English setting. Specifically, the accuracy of extracting line items and Swedish characters is examined. In addition, the potential for correcting incorrectly extracted data is explored. This is achieved by evaluating a set of defined categories on each invoice, comparing the Amazon Textract extractions with correctly labeled data. These categories include emptiness (no data extracted), equality, missing and added line items, and characters missing from or added to otherwise correct line item strings. The invoices themselves are divided into two categories: structured and semi-structured invoices. The tests are mainly conducted on the service's dedicated API method for data extraction from financial documents, but a comparison with the table extraction API method is also made to gain more insight into Amazon Textract's capabilities. The results suggest that Amazon Textract is quite inaccurate when extracting line item data from Swedish invoices, so manual post-processing of the data is generally needed to ensure its correctness. However, it showed better results on structured invoices, where it scored 70% in equality and 100% in 2 out of 6 invoice layouts. The Swedish character accuracy was 66%.
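For reference, Textract's dedicated API method for financial documents is the AnalyzeExpense operation. The sketch below shows roughly how line items can be read from its response using boto3; the traversal assumes the documented AnalyzeExpense response layout (ExpenseDocuments, LineItemGroups, LineItems, LineItemExpenseFields), the file name and region are placeholders, and the snippet is an illustration rather than the evaluation pipeline used in the thesis.

```python
# Rough sketch: extract line items from a (Swedish) invoice image with
# Amazon Textract's AnalyzeExpense API via boto3.
import boto3

textract = boto3.client("textract", region_name="eu-west-1")  # region is an example

with open("invoice.png", "rb") as f:          # placeholder invoice image
    response = textract.analyze_expense(Document={"Bytes": f.read()})

for document in response["ExpenseDocuments"]:
    for group in document.get("LineItemGroups", []):
        for line_item in group.get("LineItems", []):
            fields = {
                field["Type"]["Text"]: field.get("ValueDetection", {}).get("Text", "")
                for field in line_item["LineItemExpenseFields"]
            }
            print(fields)   # e.g. item description, quantity, price per line
```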
929

Evaluation and Implementation of Code Search using Transformers to Enhance Developer Productivity / Evaluering och Implementering av Kodsökning genom Transformers för att Förbättra Utvecklares Produktivitet

Fredrikson, Sara, Månsson, Clara January 2023 (has links)
Despite rapid advancements in the fields of Natural Language Processing and Artificial Intelligence, many aspects of their use cases and impact on productivity remain largely unexplored. Many recent machine learning models are based on an architecture called the Transformer, which allows for faster computation and for more context to be preserved. At the same time, tech companies face the challenge of navigating code bases spanning millions of lines of code. The aim of this thesis is to investigate whether the implementation and fine-tuning of a Transformer-based model can be used to improve the code search process at a tech company, leading to improvements in developer productivity. Specifically, the thesis evaluates the effectiveness of such an implementation from a productivity perspective in terms of velocity, quality, and satisfaction. The research uses a mixed-methods design consisting of two distinct methodologies as well as analyses of quantitative and qualitative data. To assess the level of accuracy that can be obtained by optimising a Transformer-based model on internal data, an evaluative experiment with various internal datasets was conducted. The second methodology was a usability test investigating potential impacts on velocity, quality, and satisfaction by testing a contextual code-search prototype with developers. Data from the tests was analysed through heat-map, trade-off, and template analyses. Results indicate that a Transformer-based model can be optimised for code search on internal data and has the potential to improve code search in terms of velocity, quality, and satisfaction.
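As an outline of what Transformer-based code search looks like in practice, the sketch below embeds a few code snippets and a natural-language query with a sentence-embedding model and ranks the snippets by cosine similarity. A public general-purpose model is used as a stand-in for the fine-tuned internal model evaluated in the thesis, and the snippets are made up.

```python
# Semantic code search sketch: embed code snippets and a query with a
# Transformer-based sentence-embedding model, rank by cosine similarity.
import numpy as np
from sentence_transformers import SentenceTransformer

snippets = [
    "def parse_config(path): return json.load(open(path))",
    "def retry(fn, attempts=3): ...",
    "class UserRepository: def find_by_email(self, email): ...",
]
query = "how do we look up a user by email address"

# Public general-purpose model standing in for an internally fine-tuned one.
model = SentenceTransformer("all-MiniLM-L6-v2")
snippet_vecs = model.encode(snippets)
query_vec = model.encode([query])[0]

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

ranked = sorted(zip(snippets, (cosine(query_vec, v) for v in snippet_vecs)),
                key=lambda x: -x[1])
for snippet, score in ranked:
    print(f"{score:.3f}  {snippet}")
```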
930

ChatGPT as a Software Development Tool : The Future of Development

Hörnemalm, Adam January 2023 (has links)
The purpose of this master's thesis was to research and evaluate how ChatGPT can be used as a tool in software developers' daily work. The work was conducted in two phases: an initial exploration phase and a data collection phase. In the initial exploration phase, five senior-level developers were interviewed about their day-to-day work, their opinions of generative AI, and the software development profession as a whole. From these interviews, a theoretical foundation for software development was formed, categorizing the daily work tasks of a software developer into coding, communication, or planning. This theoretical foundation was then used as the basis for the tasks and interviews in the data collection phase, in which seven developers, ranging from students to industry veterans, were asked to complete a set of representative tasks with the help of ChatGPT and afterwards participate in an interview. The tasks were based on the theoretical foundation and designed to be representative of the work software developers do day-to-day. Based on the tasks and interviews, it was found that ChatGPT did help make software developers more effective at coding and planning-based tasks, though not without risk, since junior developers trusted and relied more heavily on the answers given by ChatGPT. Although ChatGPT showed a positive effect, the tooling still needs improvement: developers had trouble with text formatting when completing communication-based tasks and expressed a desire for the tooling to be more integrated. This desire was not unexpected, since all of the developers involved showed interest in working with generative AI tooling for work-related tasks in the future.
