Global ETD Search

1	Exploring declarative rule-based probabilistic frameworks for link prediction in Knowledge Graphs Gao, Xiaoxu January 2017 A knowledge graph stores factual information from the web in the form of relationships between entities. The quality of a knowledge graph is determined by its completeness and accuracy. However, many current knowledge graphs miss facts or contain incorrect information. Current link prediction solutions suffer from scalability problems and high labor costs. This thesis proposes a declarative rule-based probabilistic framework to perform link prediction. The system incorporates a rule-mining model into a hinge-loss Markov random field to infer links. Moreover, three rule optimization strategies were developed to improve the quality of the rules. 
Compared with previous solutions, this work dramatically reduces manual costs and provides a more tractable model. Each proposed method has been evaluated with Average Precision or F-score on NELL and Freebase15k. It turns out that the rule optimization strategy performs the best. The MAP of the best model on NELL is 0.754, better than a state-of-the-art graphical model (0.306). The F-score of the best model on Freebase15k is 0.709. Knowledge Graph Link Prediction Probabilistic Soft Logic Hinge-loss Markov Random Fields Computer and Information Sciences
2	Integrating Linked Data search results using statistical relational learning approaches Al Shekaili, Dhahi January 2017 Linked Data (LD) follows the web in providing low barriers to publication, and in deploying web-scale keyword search as a central way of identifying relevant data. As in the web, searches initially identify results in broadly the form in which they were published, and the published form may be provided to the user as the result of a search. This will be satisfactory in some cases, but the diversity of publishers means that the results of the search may be obtained from many different sources, and described in many different ways. As such, there seems to be an opportunity to add value to search results by providing users with an integrated representation that brings together features from different sources. This involves an on-the-fly and automated data integration process being applied to search results, which raises the question as to what technologies might be most suitable for supporting the integration of LD search results. In this thesis we take the view that the problem of integrating LD search results is best approached by assimilating different forms of evidence that support the integration process. In particular, this dissertation shows how Statistical Relational Learning (SRL) formalisms (viz., Markov Logic Networks (MLN) and Probabilistic Soft Logic (PSL)) can be exploited to assimilate different sources of evidence in a principled way and to beneficial effect for users. Specifically, in this dissertation we consider syntactic evidence derived from LD search results and from matching algorithms, semantic evidence derived from LD vocabularies, and user evidence, in the form of feedback. 
This dissertation makes the following key contributions: (i) a characterisation of key features of LD search results that are relevant to their integration, and a description of some initial experiences in the use of MLN for interpreting search results; (ii) a PSL rule-base that models the uniform assimilation of diverse kinds of evidence; (iii) an empirical evaluation of how the contributed MLN and PSL approaches perform in terms of their ability to infer a structure for integrating LD search results; and (iv) concrete examples of how populating such inferred structures for presentation to the end user is beneficial, as well as guiding the collection of feedback whose assimilation further improves search results presentation. 004
3	Automated Vocabulary Building for Characterizing and Forecasting Elections using Social Media Analytics Mahendiran, Aravindan 12 February 2014 Twitter has become a popular data source in the past decade and has garnered a significant amount of attention as a surrogate data source for many important forecasting problems. Strong correlations have been observed between Twitter indicators and real-world trends spanning elections, stock markets, book sales, and flu outbreaks. A key ingredient of all methods that use Twitter for forecasting is an agreed domain-specific vocabulary for tracking the pertinent tweets, which is typically provided by subject matter experts (SMEs). The language used on Twitter differs drastically from other forms of online discourse, such as news articles and blogs. It constantly evolves over time as users adopt popular hashtags to express their opinions. Thus, the vocabulary used by forecasting algorithms needs to be dynamic in nature and should capture emerging trends over time. This thesis proposes a novel unsupervised learning algorithm that builds a dynamic vocabulary using Probabilistic Soft Logic (PSL), a framework for probabilistic reasoning over relational domains. Using eight presidential elections from Latin America, we show how our query expansion methodology improves the performance of traditional election forecasting algorithms. Through this approach we demonstrate how we can achieve close to a two-fold increase in the number of tweets retrieved for predictions and a 36.90% reduction in prediction error. / Master of Science Election Forecasting Twitter Query Expansion Social Group Modeling Probabilistic Soft Logic

