Global ETD Search

901	Text ranking based on semantic meaning of sentences / Textrankning baserad på semantisk betydelse hos meningar Stigeborn, Olivia January 2021 Finding a suitable candidate for a client is an important part of a consulting company's work. Recruiters spend considerable time and effort reading possibly hundreds of resumes to find a suitable candidate. Natural language processing can treat this as a ranking task, where the goal is to rank the resumes of the most suitable candidates highest. Recruiters then only need to look at the top-ranked resumes and can quickly get candidates out in the field. Previous research has used methods that count specific keywords in resumes to decide whether a candidate has a given experience or not. The main goal of this thesis is to use the semantic meaning of the text in the resumes to get a deeper understanding of a candidate's level of experience. It also evaluates whether the model can run on-device and whether the database can contain a mix of English and Swedish resumes. An algorithm was created that uses the word embedding model DistilRoBERTa, which is capable of capturing the semantic meaning of text. The algorithm was evaluated by generating a job description from each resume by creating a summary of that resume. The run time, the memory usage and the rank achieved by the wanted candidate were documented and used to analyze the results. A classification was considered correct when the candidate whose resume was used to generate the job description was ranked in the top 10. With this method, an accuracy of 68.3% was achieved. The results show that the algorithm is capable of ranking resumes. It can rank both Swedish and English resumes, with an accuracy of 67.7% for Swedish resumes and 74.7% for English. The run time was fast enough, at an average of 578 ms, but the memory usage was too large to make it possible to use the algorithm on-device. In conclusion, the semantic meaning of resumes can be used to rank resumes, and possible future work would be to combine this method with a method that counts keywords to investigate whether the accuracy would increase. Natural language processing Word Embedding Resume Ranking Semantic meaning Computer Sciences
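The top-10 evaluation described in entry 901 can be illustrated with a small sketch. It is not the thesis's implementation: the abstract does not state which similarity measure was used, so cosine similarity is an assumption here, and the two-dimensional vectors, the resume ids and the function names are hypothetical stand-ins for real DistilRoBERTa embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_resumes(query_vec, resume_vecs):
    """Return resume ids ordered from most to least similar to the query."""
    return sorted(
        resume_vecs,
        key=lambda rid: cosine(query_vec, resume_vecs[rid]),
        reverse=True,
    )


def top_k_accuracy(queries, resume_vecs, k=10):
    """Share of queries whose source resume appears among the top k results.

    `queries` is a list of (job_description_vector, source_resume_id) pairs.
    """
    hits = sum(
        1
        for vec, source_id in queries
        if source_id in rank_resumes(vec, resume_vecs)[:k]
    )
    return hits / len(queries)


# Hypothetical toy embeddings (assumption: real ones would come from a model).
resumes = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.7, 0.7]}
queries = [([0.9, 0.1], "a"), ([0.1, 0.9], "b"), ([1.0, 0.0], "b")]

print(top_k_accuracy(queries, resumes, k=1))
print(top_k_accuracy(queries, resumes, k=3))
```

With only three toy resumes, k is set to 1 and 3 rather than 10; the metric itself is unchanged.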
902	Unsupervised topic modeling for customer support chat: Comparing LDA and K-means Andersson, Fredrik, Idemark, Alexander January 2021 Fortnox receives many errands via its support chat. Some of the questions can be hard to interpret, making it difficult to know where to delegate them. It would be beneficial if the process were automated, instead of staff having to spend time analyzing the questions in order to delegate them. The main task is therefore to find an unsupervised model that can take questions and group them into topics. A literature review of NLP and clustering was needed to find the most suitable models and techniques for the problem. The models and techniques were then implemented and evaluated using support chat questions received by Fortnox. The unsupervised models tested in this thesis were LDA and K-means. The resulting models after training are analyzed, and some of the clusters are given a label. The authors of the thesis label a cluster after analyzing the most relevant words for that cluster. Three different sets of labels are analyzed and tested. The models are evaluated using five different score metrics: Silhouette, Adjusted Rand Index, Recall, Precision, and F1 score. K-means scores best on these metrics, with an F1 score of 0.417, but cannot handle very small documents. LDA does not perform well: it achieved an F1 score of 0.137 and is not able to group related documents together. LDA K-means Topic modeling Natural Language Processing clustering customer support unsupervised machine learning Computer Sciences
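Entry 902 reports precision, recall and F1 for labeled clusters. As a rough illustration of how these three metrics relate, the sketch below computes them for a single label. The label names and the toy assignments are hypothetical, and the abstract does not say whether the thesis's scores were averaged per label or computed some other way.

```python
def precision_recall_f1(gold, predicted, label):
    """Precision, recall and F1 for one label, given parallel label lists."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical topic labels for five support questions (assumption).
gold = ["billing", "billing", "login", "login", "billing"]
predicted = ["billing", "login", "login", "billing", "billing"]

p, r, f1 = precision_recall_f1(gold, predicted, "billing")
print(p, r, f1)
```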
903	Conversational Engine for Transportation Systems Sidås, Albin, Sandberg, Simon January 2021 Today's communication between operators and professional drivers takes place through direct conversations between the parties. This thesis project explores the possibility of supporting the operators in classifying the topic of incoming communications and identifying which entities are affected, through the use of named entity recognition and topic classification. With a synthetic training dataset developed for the purpose, a NER model and a topic classification model were developed and evaluated, achieving F1-scores of 71.4 and 61.8 respectively. These results were explained by a low variance in the synthetic dataset in comparison to a transcribed dataset from the real world, which included anomalies not represented in the synthetic dataset. The models were integrated into the dialogue framework Emora to seamlessly handle the back-and-forth communication and generate responses. Natural Language Processing Topic Classification Named Entity Classification NLP NER NERC
904	Transforming Legal Entity Recognition Andersson-Säll, Tim January 2021 Transformer-based architectures have in recent years advanced state-of-the-art performance in Natural Language Processing. Researchers have successfully adapted such models to downstream tasks within NLP in domain-specific settings. This thesis examines the application of these models to the legal domain by performing Named Entity Recognition (NER) in a setting of scarce training data. Three different pre-trained BERT models are fine-tuned on a set of 101 court case documents; one of the models is pre-trained on legal corpora and the other two on general corpora. Experiments are run to evaluate the models' predictive performance given smaller or larger quantities of data to fine-tune on. Results show that BERT models work reasonably well for NER with legal data. Unlike many other domain-specific BERT models, the BERT model trained on legal corpora does not outperform the base models. Modest amounts of annotated data seem sufficient for reasonably good performance. Natural Language Processing BERT Transformer Legal AI Transfer Learning Neural Networks Named Entity Recognition Probability Theory and Statistics
905	Sémantický parsing nezávislý na uspořádání vrcholů / Permutation-Invariant Semantic Parsing Samuel, David January 2021 Deep learning has been successfully applied to semantic graph parsing in recent years. However, to the best of our knowledge, all graph-based parsers depend on a strong assumption about the ordering of graph nodes. This work explores a permutation-invariant approach to sentence-to-graph semantic parsing. We present a versatile, cross-framework, and language-independent architecture for universal modeling of semantic structures. To empirically validate our method, we participated in the CoNLL 2020 shared task, Cross-Framework Meaning Representation Parsing (MRP 2020), which evaluated the competing systems on five different frameworks (AMR, DRG, EDS, PTG, and UCCA) across four languages. Our parsing system, called PERIN, was one of the winners of this shared task. Thus, we believe that permutation invariance is a promising new direction in the field of semantic parsing.
906	Toward a Real-Time Recommendation for Online Social Networks Albalawi, Rania 07 June 2021 The Internet increases the demand for the development of commercial applications and services that can provide better shopping experiences for customers globally. It is full of information and knowledge sources that might confuse customers. This requires customers to spend additional time and effort when they are trying to find relevant information about specific topics or objects. Recommendation systems are considered to be an important method for solving this issue. Incorporating recommendation systems in online social networks has led to a specific kind of recommendation system, called social recommendation systems, which have become popular with the global explosion in social media and online networks. They apply many prediction algorithms, such as data mining techniques, to address the problem of information overload and to analyze vast amounts of data. We believe that offering a real-time social recommendation system that can understand the real context of a user's conversation dynamically is essential to defining and recommending interesting objects at the ideal time. In this thesis, we propose an architecture for a real-time social recommendation system that aims to improve word usage and understanding in social media platforms, advance the performance and accuracy of recommendations, and propose a possible solution to the user cold-start problem. Moreover, we aim to find out whether the user's social context can be used as an input source to offer personalized and improved recommendations that will help users to find valuable items immediately, without interrupting their conversation flow. The suggested architecture works as a third-party social recommendation system that could be integrated with existing social networking sites (e.g. Facebook and Twitter). 
The novelty of our approach is the dynamic understanding of the user-generated content, achieved by detecting topics from the user’s extracted dialogue and then matching them with an appropriate task as a recommendation. Topic extraction is done through a modified Latent Dirichlet Allocation topic modeling method. We also develop a social chat app as a proof of concept to validate our proposed architecture. The results of our proposed architecture offer promising gains in enhancing the real-time social recommendations. Social Recommendation System Social Media Architecture Real-Time Advertisements Online Social Networks Natural Language Processing Topic Modeling
907	Natural language processing for research philosophies and paradigms dissertation (DFIT91) Mawila, Ntombhimuni 28 February 2021 Research philosophies and paradigms (RPPs) reveal researchers' assumptions and provide a systematic way in which research can be carried out effectively and appropriately. Different studies highlight the cognitive and comprehension challenges that RPP concepts pose at the postgraduate level. This study develops a natural language processing (NLP) supervised classification application that guides students in identifying the RPPs applicable to their study. By using algorithms rooted in a quantitative research approach, this study builds a corpus represented using the Bag of Words model to train the Naïve Bayes, Logistic Regression, and Support Vector Machine algorithms. Computer experiments conducted to evaluate the performance of the algorithms reveal that the Naïve Bayes algorithm achieves the highest accuracy and precision levels. In practice, user testing results show the varying impact of knowledge, performance, and effort expectancy. The findings contribute to minimizing the issues postgraduates encounter in identifying research philosophies and the underlying paradigms for their studies. / Science and Technology Education / MTech. (Information Technology) Research Philosophy Paradigm Corpus Algorithm Classification model Classifier Bag of words Naive Bayes Researcher 006.35
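Entry 907 trains a Naïve Bayes classifier on a Bag of Words corpus. The following is a minimal sketch of that combination using only the standard library, with add-one (Laplace) smoothing. It is not the dissertation's application: the class name, the four training sentences and their labels are hypothetical toy data chosen only to make the example run.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenization: the bag-of-words view of a text."""
    return text.lower().split()


class NaiveBayesClassifier:
    """Multinomial naive Bayes over word counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(text):
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log(
                        (counts[word] + 1) / (total_words + len(self.vocab))
                    )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy corpus (assumption, not data from the dissertation).
texts = [
    "objective measurement and hypothesis testing",
    "statistical surveys with measurable variables",
    "subjective meaning from interviews",
    "interpreting lived experience through interviews",
]
labels = ["positivism", "positivism", "interpretivism", "interpretivism"]

model = NaiveBayesClassifier().fit(texts, labels)
print(model.predict("hypothesis testing with surveys"))
print(model.predict("meaning from interviews"))
```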
908	A Comparative study of Knowledge Graph Embedding Models for use in Fake News Detection Frimodig, Matilda, Lanhed Sivertsson, Tom January 2021 During the past few years, online misinformation, generally referred to as fake news, has been identified as an increasingly dangerous threat. As the spread of misinformation online has increased, fake news detection has become an active line of research. One approach is to use knowledge graphs for the purpose of automated fake news detection. While large-scale knowledge graphs are openly available, they are rarely up to date and often lack the information needed for the task of fake news detection. Creating new knowledge graphs from online sources is one way to obtain the missing information. However, extracting information from unstructured text is far from straightforward. Using Natural Language Processing techniques, we developed a pre-processing pipeline for extracting information from text for the purpose of creating knowledge graphs. In order to classify news as fake or not fake with the use of knowledge graphs, the graphs need to be converted into a machine-understandable format, called knowledge graph embeddings. These embeddings also allow new information to be inferred or classified based on the information that already exists in the knowledge graph. Only one knowledge graph embedding model has previously been used for the purpose of fake news detection, while several new models have recently been developed. We compare the performance of three different embedding models, all relying on different fundamental architectures, in the specific context of fake news detection. The models used were the geometric model TransE, the tensor decomposition model ComplEx and the deep learning model ConvKB. The results of this study show that, out of the three models, ConvKB is the best performing. 
However, aspects other than performance need to be considered, and as such these results do not necessarily mean that a deep learning approach is the most suitable for real-world fake news detection. Machine Learning Fake News Detection Knowledge Graph Natural Language Processing Knowledge Graph Embedding Computer Sciences
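Of the three models compared in entry 908, TransE is the simplest to illustrate: it treats a relation as a translation in the embedding space, so a plausible triple (head, relation, tail) should satisfy head + relation ≈ tail. The sketch below scores triples by negative Euclidean distance. The embeddings are hand-set, hypothetical values rather than trained ones, and the way the thesis turns such scores into fake or not-fake decisions is not shown in the abstract.

```python
import math


def transe_score(head, relation, tail):
    """TransE plausibility: negative L2 distance between head + relation and tail.

    Scores closer to zero mean the triple fits the translation better.
    """
    return -math.sqrt(
        sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail))
    )


# Hypothetical hand-set 2-d embeddings (assumption: real ones are learned).
entities = {
    "Stockholm": [1.0, 0.0],
    "Sweden": [1.0, 1.0],
    "Norway": [3.0, 1.0],
}
relations = {"capital_of": [0.0, 1.0]}

true_triple = transe_score(
    entities["Stockholm"], relations["capital_of"], entities["Sweden"]
)
false_triple = transe_score(
    entities["Stockholm"], relations["capital_of"], entities["Norway"]
)
print(true_triple, false_triple)
```

A claim extracted from a news text could then be ranked against alternative tails; lower-scoring triples are less consistent with the graph.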
909	Concept Based Knowledge Discovery From Biomedical Literature Radovanovic, Aleksandar January 2009 Philosophiae Doctor - PhD / Advancement in biomedical research and the continuous growth of scientific literature available in electronic form call for innovative methods and tools for information management, knowledge discovery, and data integration. Many biomedical fields such as genomics, proteomics, metabolomics, genetics, and emerging disciplines like systems biology and conceptual biology require synergy between experimental, computational, data mining and text mining technologies. The large amount of biomedical information available in various repositories, such as the US National Library of Medicine Bibliographic Database, emerges as a potential source of textual data for knowledge discovery. Text mining, with its application of natural language processing and machine learning technologies to problems of knowledge discovery, is one of the most challenging fields in bioinformatics. This thesis describes and introduces novel methods for knowledge discovery and presents a software system that is able to extract information from biomedical literature and review interesting connections between various biomedical concepts, and in so doing generates new hypotheses. The experimental results obtained by using the methods described in this thesis are compared to currently published results obtained by other methods, and a number of case studies are described. This thesis shows how the technology presented can be integrated with the researchers' own knowledge, experimentation and observations for optimal progression of scientific research. Bioinformatics Text mining PubMed Entity recognition Information extraction Relation Extraction Levenshtein distance Supervised classification Natural Language Processing Machine learning
910	Rättssäker Textanalys Svensson, Henrik, Lindqvist, Kalle January 2019 Natural language processing is a research area in which new advances are constantly being made. A significant portion of the text analysis that takes place in this field aims to achieve a satisfactory application in the dialogue between human and computer. In this study, we instead want to focus on the impact natural language processing can have on the human learning process. At the same time, the context for our research has a future impact on one of the most basic principles of a legally secure society, namely the writing of police reports. By creating a theoretical foundation of ideas that combines aspects of natural language processing and official police report writing, and then implementing them in an educational web platform intended for police students, we are of the opinion that our research adds something new to both the computer science and the sociological fields. The purpose of this work is to act as the first steps towards a web application that supports Swedish police documentation. digital text analysis natural language processing nlp computational linguistics legal certainty Engineering and Technology
