911

Sémantický parsing nezávislý na uspořádání vrcholů / Permutation-Invariant Semantic Parsing

Samuel, David January 2021
Deep learning has been successfully applied to semantic graph parsing in recent years. However, to the best of our knowledge, all graph-based parsers depend on a strong assumption about the ordering of graph nodes. This work explores a permutation-invariant approach to sentence-to-graph semantic parsing. We present a versatile, cross-framework, and language-independent architecture for universal modeling of semantic structures. To empirically validate our method, we participated in the CoNLL 2020 shared task, Cross-Framework Meaning Representation Parsing (MRP 2020), which evaluated the competing systems on five different frameworks (AMR, DRG, EDS, PTG, and UCCA) across four languages. Our parsing system, called PERIN, was one of the winners of this shared task. Thus, we believe that permutation invariance is a promising new direction in the field of semantic parsing.
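The abstract does not spell out how permutation invariance is achieved; a common way to make a node-prediction loss independent of node ordering is to match predicted nodes to gold nodes with a minimum-cost bipartite assignment before scoring. A minimal sketch under that assumption, using a simple squared-distance cost (PERIN's actual training objective may differ):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def permutation_invariant_loss(predicted, gold):
    """Match predicted node embeddings to gold node embeddings with a
    minimum-cost bipartite assignment, then average the matched costs.
    Illustrative only; the thesis's actual objective may differ."""
    # cost[i, j] = squared distance between predicted node i and gold node j
    cost = ((predicted[:, None, :] - gold[None, :, :]) ** 2).sum(-1)
    rows, cols = linear_sum_assignment(cost)  # Hungarian algorithm
    # The result no longer depends on the order in which nodes are listed.
    return cost[rows, cols].mean()

pred = np.random.rand(4, 8)  # 4 predicted nodes, 8-dim embeddings
gold = pred[[2, 0, 3, 1]]    # the same nodes, listed in a different order
print(permutation_invariant_loss(pred, gold))  # 0.0 for any permutation
```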
912

Toward a Real-Time Recommendation for Online Social Networks

Albalawi, Rania 07 June 2021
The Internet increases the demand for the development of commercial applications and services that can provide better shopping experiences for customers globally. It is full of information and knowledge sources that might confuse customers, requiring them to spend additional time and effort when trying to find relevant information about specific topics or objects. Recommendation systems are an important method for solving this issue. Incorporating recommendation systems into online social networks has led to a specific kind of recommendation system, called social recommendation systems, which have become popular with the global explosion of social media and online networks. These systems apply prediction algorithms, such as data mining techniques, to address the problem of information overload and to analyze vast amounts of data. We believe that offering a real-time social recommendation system that can dynamically understand the real context of a user’s conversation is essential to defining and recommending interesting objects at the ideal time. In this thesis, we propose an architecture for a real-time social recommendation system that aims to improve word usage and understanding in social media platforms, advance the performance and accuracy of recommendations, and propose a possible solution to the user cold-start problem. Moreover, we aim to find out whether the user’s social context can be used as an input source to offer personalized and improved recommendations that help users find valuable items immediately, without interrupting their conversation flow. The suggested architecture works as a third-party social recommendation system that could be incorporated with existing social networking sites (e.g., Facebook and Twitter). The novelty of our approach is the dynamic understanding of user-generated content, achieved by detecting topics from the user’s extracted dialogue and then matching them with an appropriate task as a recommendation. Topic extraction is done through a modified Latent Dirichlet Allocation topic modeling method. We also develop a social chat app as a proof of concept to validate our proposed architecture. The results of our proposed architecture offer promising gains in enhancing real-time social recommendations.
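The abstract mentions topic extraction via a modified Latent Dirichlet Allocation method; the modification itself is not described here. A minimal sketch of plain LDA topic extraction over chat messages, using scikit-learn with a hypothetical corpus and parameters:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Hypothetical chat messages standing in for extracted user dialogue.
messages = [
    "looking for a good camera for travel photos",
    "any phone recommendations with a great camera?",
    "best hiking trails near the city this weekend",
]

# Bag-of-words counts, then fit LDA to uncover latent topics.
vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(messages)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(counts)

# Top words per topic could then be matched to recommendable items.
terms = vectorizer.get_feature_names_out()
for k, weights in enumerate(lda.components_):
    top = [terms[i] for i in weights.argsort()[-3:][::-1]]
    print(f"topic {k}: {top}")
```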
913

Natural language processing for research philosophies and paradigms dissertation (DFIT91)

Mawila, Ntombhimuni 28 February 2021
Research philosophies and paradigms (RPPs) reveal researchers’ assumptions and provide a systematic way in which research can be carried out effectively and appropriately. Different studies highlight the cognitive and comprehension challenges that RPP concepts pose at the postgraduate level. This study develops a natural language processing (NLP) supervised classification application that guides students in identifying the RPPs applicable to their study. Using algorithms rooted in a quantitative research approach, this study builds a corpus represented with the Bag of Words model to train the Naïve Bayes, Logistic Regression, and Support Vector Machine algorithms. Computer experiments conducted to evaluate the performance of the algorithms reveal that the Naïve Bayes algorithm achieves the highest accuracy and precision. In practice, user testing results show the varying impact of knowledge, performance, and effort expectancy. The findings contribute to minimizing the issues postgraduates encounter in identifying research philosophies and the underlying paradigms for their studies. / Science and Technology Education / MTech. (Information Technology)
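A minimal sketch of the Bag of Words plus Naïve Bayes pipeline described above, using scikit-learn; the labels and training sentences are hypothetical stand-ins for the study's RPP corpus:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical snippets of research descriptions, labeled by paradigm.
texts = [
    "we measure variables and test hypotheses statistically",
    "we interpret participants' lived experiences through interviews",
    "reality is objective and can be observed independently",
    "meaning is socially constructed by the actors involved",
]
labels = ["positivism", "interpretivism", "positivism", "interpretivism"]

# Bag of Words representation feeding a multinomial Naïve Bayes classifier.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(texts, labels)
print(model.predict(["our survey quantifies the relationship between variables"]))
```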
914

A Comparative study of Knowledge Graph Embedding Models for use in Fake News Detection

Frimodig, Matilda, Lanhed Sivertsson, Tom January 2021
During the past few years, online misinformation, generally referred to as fake news, has been identified as an increasingly dangerous threat. As the spread of misinformation online has increased, fake news detection has become an active line of research. One approach is to use knowledge graphs for automated fake news detection. While large-scale knowledge graphs are openly available, they are rarely up to date and often miss the relevant information needed for fake news detection. Creating new knowledge graphs from online sources is one way to obtain the missing information; however, extracting information from unstructured text is far from straightforward. Using natural language processing techniques, we developed a pre-processing pipeline for extracting information from text for the purpose of creating knowledge graphs. In order to classify news as fake or not with the use of knowledge graphs, the graphs need to be converted into a machine-understandable format called knowledge graph embeddings. These embeddings also allow new information to be inferred or classified based on the information already in the knowledge graph. Only one knowledge graph embedding model has previously been used for fake news detection, while several new models have recently been developed. We compare the performance of three embedding models, each relying on a different fundamental architecture, in the specific context of fake news detection: the geometric model TransE, the tensor decomposition model ComplEx, and the deep learning model ConvKB. The results of this study show that, of the three models, ConvKB performs best. However, aspects other than performance need to be considered, so these results do not necessarily mean that a deep learning approach is the most suitable for real-world fake news detection.
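For context on what these models score, the geometric intuition behind TransE is that a relation acts as a translation in embedding space: for a true triple (head, relation, tail), head + relation ≈ tail. A minimal sketch of the TransE scoring function, with random placeholder embeddings rather than trained values:

```python
import numpy as np

def transe_score(h, r, t):
    """TransE plausibility score: a smaller ||h + r - t|| means the
    triple (head, relation, tail) is judged more likely to be true."""
    return np.linalg.norm(h + r - t)

dim = 50
rng = np.random.default_rng(0)
h, r = rng.normal(size=dim), rng.normal(size=dim)
t_true = h + r + rng.normal(scale=0.01, size=dim)  # consistent triple
t_false = rng.normal(size=dim)                     # random, unrelated tail

print(transe_score(h, r, t_true))   # small
print(transe_score(h, r, t_false))  # large
```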
915

Concept Based Knowledge Discovery From Biomedical Literature

Radovanovic, Aleksandar January 2009
Philosophiae Doctor - PhD / Advancement in biomedical research and the continuous growth of scientific literature available in electronic form call for innovative methods and tools for information management, knowledge discovery, and data integration. Many biomedical fields such as genomics, proteomics, metabolomics, and genetics, as well as emerging disciplines like systems biology and conceptual biology, require synergy between experimental, computational, data mining, and text mining technologies. The large amount of biomedical information available in various repositories, such as the US National Library of Medicine Bibliographic Database, emerges as a potential source of textual data for knowledge discovery. Text mining, with its application of natural language processing and machine learning technologies to problems of knowledge discovery, is one of the most challenging fields in bioinformatics. This thesis introduces novel methods for knowledge discovery and presents a software system that is able to extract information from biomedical literature, review interesting connections between various biomedical concepts and, in so doing, generate new hypotheses. The experimental results obtained by using the methods described in this thesis are compared to currently published results obtained by other methods, and a number of case studies are described. This thesis shows how the technology presented can be integrated with researchers' own knowledge, experimentation, and observations for optimal progression of scientific research.
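The abstract does not name the discovery algorithm; a classic baseline for this kind of hypothesis generation is Swanson-style co-occurrence linking, in which two concepts that never appear together are hypothesized to be related if they share intermediate concepts. A minimal sketch under that assumption (the concept sets are hypothetical):

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical concept sets extracted from individual abstracts.
abstracts = [
    {"fish oil", "blood viscosity"},
    {"blood viscosity", "raynaud's disease"},
    {"fish oil", "platelet aggregation"},
    {"platelet aggregation", "raynaud's disease"},
]

# Record which concepts co-occur within the same abstract.
cooccur = defaultdict(set)
for concepts in abstracts:
    for a, b in combinations(sorted(concepts), 2):
        cooccur[a].add(b)
        cooccur[b].add(a)

# Hypothesis: A relates to C if they never co-occur but share B terms.
a, c = "fish oil", "raynaud's disease"
bridges = (cooccur[a] & cooccur[c]) - {a, c}
if c not in cooccur[a] and bridges:
    print(f"hypothesis: {a} -- {c} via {bridges}")
```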
916

Rättssäker Textanalys / Legally Secure Text Analysis

Svensson, Henrik, Lindqvist, Kalle January 2019
Natural language processing is a research area in which new advances are constantly being made. A significant portion of the text analysis that takes place in this field aims at achieving a satisfactory application in the dialogue between human and computer. In this study, we instead want to focus on the impact natural language processing can have on the human learning process. At the same time, the context for our research has a future impact on one of the most basic principles of a legally secure society, namely the writing of the police report. By creating a theoretical foundation of ideas that combines aspects of natural language processing and official police report writing, and then implementing them in an educational web platform intended for police students, we are of the opinion that our research adds something new to the computer science and sociological fields. The purpose of this work is to act as the first steps towards a web application that supports Swedish police documentation.
917

Chatbot: The future of customer feedback

Dinh, Kevin Hoang January 2020
This study examines how to convert a survey into a chatbot and distribute it across various communication channels so that organizations can collect feedback and improve themselves. What would be the most convenient way to gather feedback? Our daily lives are becoming more dependent on digital devices every day, and the rise in digital devices leads to a wider range of communication channels, creating a good opportunity to use these channels for several purposes. The study focuses on chatbots, survey systems, and communication channels, and on their ability to gather feedback from respondents and use it to increase the quality of goods, services, and perhaps life. Using the chatbot's language capabilities, people can engage with the bot in a conversation and answer survey questions in a different way. Through a RESTful API, the chatbot's quantitative answers can be extracted and analyzed for product and service development. Although the chatbot prototype is still rough and requires many adjustments, the work has shown many opportunities in running surveys, gathering feedback, and analyzing it. This could inform future research on chatbots or offer a new way to make surveys better.
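A minimal sketch of the extraction step described above, where survey answers collected in the chat are stored and later pulled back for analysis through a REST API; the endpoints and response shape are hypothetical placeholders, not the thesis's actual service:

```python
import requests

API = "https://example.com/api"  # hypothetical endpoint

# Store one survey answer collected during the chat conversation.
answer = {"survey_id": 7, "question_id": 3, "respondent": "u42", "rating": 4}
resp = requests.post(f"{API}/answers", json=answer, timeout=5)
resp.raise_for_status()

# Later, pull the quantitative results back for analysis.
summary = requests.get(f"{API}/surveys/7/summary", timeout=5)
print(summary.json())  # e.g. {"question_id": 3, "mean_rating": 4.2, "n": 130}
```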
918

Modélisation du langage à l'aide de pénalités structurées / Modeling language with structured penalties

Nelakanti, Anil Kumar 11 February 2014
Modeling natural language is among the fundamental challenges of artificial intelligence and the design of interactive machines, with applications spanning various domains such as dialogue systems, text generation, and machine translation. We propose a discriminatively trained log-linear model to learn the distribution of words following a given context. Due to data sparsity, it is necessary to appropriately regularize the model using a penalty term. We design a penalty term that properly encodes the structure of the feature space to avoid overfitting and improve generalization while appropriately capturing long-range dependencies. Some properties of specific structured penalties can be used to reduce the number of parameters required to encode the model. The outcome is an efficient model that suitably captures long dependencies in language without a significant increase in time or space requirements. In a log-linear model, both training and testing become increasingly expensive with a growing number of classes. The number of classes in a language model is the size of the vocabulary, which is typically very large. A common trick is to cluster classes and apply the model in two steps: the first step picks the most probable cluster and the second picks the most probable word from the chosen cluster. This idea can be generalized to a hierarchy of greater depth with multiple levels of clustering. However, the performance of the resulting hierarchical classifier depends on the suitability of the clustering to the problem. We study different strategies to build the hierarchy of categories from their observations.
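A minimal sketch of the two-step trick described above: pick the most probable cluster first, then the most probable word within it, so prediction normalizes over far fewer items than the full vocabulary. The scores are random placeholders standing in for the log-linear model's outputs:

```python
import numpy as np

rng = np.random.default_rng(0)
n_clusters, words_per_cluster = 10, 100  # vocabulary of 1000 words

def softmax(z):
    z = z - z.max()  # stabilize before exponentiating
    e = np.exp(z)
    return e / e.sum()

# Hypothetical model scores given some context: one score per cluster,
# and one score per word inside each cluster.
cluster_scores = rng.normal(size=n_clusters)
word_scores = rng.normal(size=(n_clusters, words_per_cluster))

# Two-step prediction: normalize over 10 + 100 items instead of 1000.
c = softmax(cluster_scores).argmax()
w = softmax(word_scores[c]).argmax()
print(f"predicted word index: {c * words_per_cluster + w}")

# The probability of any word factorizes as P(cluster) * P(word | cluster).
p = softmax(cluster_scores)[c] * softmax(word_scores[c])[w]
print(f"probability: {p:.4f}")
```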
919

Analysis and Decision-Making with Social Media

January 2019
abstract: The rapid advancements of technology have greatly extended the ubiquitous nature of smartphones acting as a gateway to numerous social media applications. This brings an immense convenience to the users of these applications wishing to stay connected to other individuals by sharing their statuses, posting their opinions, experiences, suggestions, etc., on online social networks (OSNs). Exploring and analyzing this data has great potential to enable deep and fine-grained insights into the behavior, emotions, and language of individuals in a society. This dissertation utilizes these online social footprints to pursue two main threads – 1) Analysis: studying the behavior of individuals online (content analysis) and 2) Synthesis: building models that influence the behavior of individuals offline (incomplete action models for decision-making). A large percentage of posts shared online are in an unrestricted natural language format that is meant for human consumption. One of the demanding problems in this context is to develop approaches that automatically extract important insights from this incessant, massive data pool. Efforts in this direction emphasize mining or extracting the wealth of latent information in the data from multiple OSNs independently. The first thread of this dissertation focuses on analytics to investigate the differentiated content-sharing behavior of individuals. The second thread attempts to build decision-making systems using social media data. The results emphasize the importance of considering multiple data types while interpreting the content shared on OSNs. They highlight the unique ways in which the data and the extracted patterns from text-based and visual-based platforms complement and contrast each other. The research demonstrates that, in many ways, results obtained by focusing on only the text or only the visual elements of content shared online can lead to biased insights. It also shows the power of sequential sets of patterns with precedence relationships, and of collaboration between humans and automated planners. / Dissertation/Thesis / Doctoral Dissertation Computer Science 2019
920

Understanding the Importance of Entities and Roles in Natural Language Inference: A Model and Datasets

January 2019
abstract: In this thesis, I present two new datasets and a modification to existing models in the form of a novel attention mechanism for Natural Language Inference (NLI). The new datasets have been carefully synthesized from various existing corpora released for different tasks. The task of NLI is to determine whether a sentence referred to as the “Hypothesis” can be true given that another sentence referred to as the “Premise” is true. In other words, the task is to identify whether the “Premise” entails, contradicts, or remains neutral with regard to the “Hypothesis”. NLI is a precursor to solving many Natural Language Processing (NLP) tasks such as Question Answering and Semantic Search. For example, in Question Answering systems, the question is paraphrased to form a declarative statement which is treated as the hypothesis, the options are treated as the premise, and the option with the maximum entailment score is considered the answer. Considering these applications, the importance of a strong NLI system cannot be overstated. Many large-scale datasets and models have been released to advance the field of NLI. While all of these models achieve good accuracy on the test sets of the datasets they were trained on, they fail to capture a basic understanding of “Entities” and “Roles”. They often make the mistake of inferring “John went to the market.” from “Peter went to the market.”, failing to capture the notion of “Entities”. In other cases, these models do not understand the difference in the “Roles” played by the same entities in the “Premise” and “Hypothesis” sentences and end up wrongly inferring “Peter drove John to the stadium.” from “John drove Peter to the stadium.” The lack of understanding of “Roles” can be attributed to the lack of such examples in the existing datasets. The existing models’ failure to capture the notion of “Entities” is not just due to the lack of such examples in existing NLI datasets; it can also be attributed to the strict use of vector similarity in the “word-to-word” attention mechanism used in existing architectures. To overcome these issues, I present two new datasets that help NLI systems capture the notions of “Entities” and “Roles”. The “NER Changed” (NC) dataset and the “Role-Switched” (RS) dataset contain examples of Premise-Hypothesis pairs that require an understanding of “Entities” and “Roles”, respectively, in order to make correct inferences. This work shows how existing architectures perform poorly on the “NER Changed” (NC) dataset even after being trained on the new datasets. To help existing architectures understand the notion of “Entities”, this work proposes a modification to the “word-to-word” attention mechanism: instead of relying on vector similarity alone, the modified architectures learn to incorporate “Symbolic Similarity” as well, using the Named-Entity features of the Premise and Hypothesis sentences. The modified architectures not only perform significantly better than the unmodified architectures on the “NER Changed” (NC) dataset but also perform as well on the existing datasets. / Dissertation/Thesis / Masters Thesis Computer Science 2019
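A minimal sketch of the core idea behind the proposed modification: augment vector similarity in word-to-word attention with a symbolic signal that fires when two tokens refer to the same named entity. The embeddings, the entity encoding, and the additive combination are illustrative assumptions, not the thesis's exact formulation:

```python
import numpy as np

def attention_scores(premise_vecs, hyp_vecs, premise_ents, hyp_ents, alpha=1.0):
    """Word-to-word attention combining cosine similarity with a symbolic
    entity-match bonus. The additive combination is an illustrative choice."""
    p = premise_vecs / np.linalg.norm(premise_vecs, axis=1, keepdims=True)
    h = hyp_vecs / np.linalg.norm(hyp_vecs, axis=1, keepdims=True)
    vector_sim = p @ h.T
    # Symbolic similarity: 1 only if both tokens are the same named entity.
    symbolic = np.array([[float(pe is not None and pe == he) for he in hyp_ents]
                         for pe in premise_ents])
    return vector_sim + alpha * symbolic

# "John" and "Peter" may have similar vectors, but they are different
# entities, so the symbolic term separates them where cosine cannot.
vecs = np.random.rand(2, 16)
scores = attention_scores(vecs, vecs, ["PER:John", "PER:Peter"],
                          ["PER:John", "PER:Peter"])
print(scores)  # diagonal (matching entities) gets the symbolic bonus
```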
