41

Dominant vectors of nonnegative matrices : application to information extraction in large graphs

Ninove, Laure 21 February 2008 (has links)
Objects such as documents, people, words or utilities that are related in some way, for instance by citations, friendship, appearance in definitions or physical connections, may be conveniently represented using graphs or networks. An increasing number of such relational databases, for instance the World Wide Web, digital libraries, social networking web sites or phone call logs, are available. Relevant information may be hidden in these networks. A user may for instance need to get authority web pages on a particular topic or a list of similar documents from a digital library, or to determine communities of friends from a social networking site or a phone call log. Unfortunately, extracting this information may not be easy. This thesis is devoted to the study of problems related to information extraction in large graphs with the help of dominant vectors of nonnegative matrices. The graph structure is indeed very useful for retrieving information from a relational database. The correspondence between nonnegative matrices and graphs makes Perron–Frobenius methods a powerful tool for the analysis of networks. In a first part, we analyze the fixed points of a normalized affine iteration used by a database matching algorithm. Then, we consider questions related to PageRank, a ranking method for web pages based on a random surfer model and used by the well-known web search engine Google. In a second part, we study optimal linkage strategies for a web master who wants to maximize the average PageRank score of a web site. Finally, the third part is devoted to the study of a nonlinear variant of PageRank. The simple model that we propose takes into account the mutual influence between web ranking and web surfing.
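For readers who want a concrete picture of the Perron–Frobenius machinery behind such rankings, here is a minimal sketch (not taken from the thesis; the matrix and function names are illustrative) of computing the dominant vector of a column-stochastic nonnegative matrix by power iteration, the standard workhorse behind PageRank-style scores:

```python
import numpy as np

def dominant_vector(A, tol=1e-10, max_iter=1000):
    """Power iteration for the dominant (Perron) eigenvector of a
    nonnegative column-stochastic matrix A."""
    n = A.shape[0]
    x = np.full(n, 1.0 / n)            # start from the uniform distribution
    for _ in range(max_iter):
        x_new = A @ x
        x_new /= x_new.sum()           # renormalize to keep a probability vector
        if np.linalg.norm(x_new - x, 1) < tol:
            break
        x = x_new
    return x_new

# Toy 3-node example: column j holds the out-link probabilities of node j.
A = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])
print(dominant_vector(A))              # uniform scores for this symmetric toy graph
```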
42

Authority identification in online communities and social networks

Budalakoti, Suratna 26 July 2013 (has links)
As Internet communities such as question-answer (Q&A) forums and online social networks (OSNs) grow in prominence as knowledge sources, traditional editorial filters are unable to scale to their size and pace. This absence hinders the exchange of knowledge online by creating an understandable lack of trust in information. A forum can partially overcome this mistrust by consistently providing reliable information, thus establishing itself as a reliable source. This work investigates how algorithmic approaches can contribute to building such a community of voluntary experts willing to contribute authoritative information. It identifies two approaches: a) reducing the cost of participation for experts by matching user queries to experts (question recommendation), and b) identifying authoritative contributors for incentivization (authority estimation). The question recommendation problem is addressed by extending existing approaches via a new generative model that augments textual data with expert preference information among different questions. Another contribution to this domain is the introduction of a set of formalized metrics that include the expert's experience besides the questioner's. This is essential for expert retention in a voluntary community and has not been addressed by previous work. The authority estimation problem is addressed by observing that the global graph structure of user interactions results from two factors: a user's performance in local one-to-one interactions, and their activity levels. By positing an intrinsic authority 'strength' for each user node in the graph that governs the outcome of individual interactions via the Bradley-Terry model for pairwise comparison, this research establishes a relationship between intrinsic user authority and global measures of influence. This approach overcomes many drawbacks of current measures of node importance in OSNs by naturally correcting for user activity levels and providing an explanation for the frequent disconnect between real-world reputation and online influence. Also, while existing research has been restricted to node ranking on a single OSN graph, this work demonstrates that co-ranking across multiple endorsement graphs drawn from the same OSN is a highly effective approach for aggregating complementary graph information. A new scalable co-ranking framework is introduced for this task. The resulting algorithms are evaluated on data from various online communities and empirically shown to outperform existing approaches by a large margin.
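As a hedged illustration of the Bradley-Terry model mentioned above (this is not the dissertation's code; the toy data and the MM fitting routine are assumptions made here for clarity), each user i gets a strength s_i with P(i beats j) = s_i / (s_i + s_j), which can be fit from observed pairwise outcomes:

```python
import numpy as np

def bradley_terry(comparisons, n_users, iters=200):
    """Fit Bradley-Terry strengths from (winner, loser) pairs using the
    classic MM (minorization-maximization) update."""
    s = np.ones(n_users)
    wins = np.zeros(n_users)
    for i, _ in comparisons:
        wins[i] += 1.0
    for _ in range(iters):
        denom = np.zeros(n_users)
        for i, j in comparisons:          # every comparison involves both players
            denom[i] += 1.0 / (s[i] + s[j])
            denom[j] += 1.0 / (s[i] + s[j])
        s = wins / denom
        s /= s.sum()                      # strengths are only defined up to scale
    return s

# Toy data: user 0 beats users 1 and 2, user 1 beats 2, user 2 beats 1 once.
print(bradley_terry([(0, 1), (0, 2), (1, 2), (2, 1)], n_users=3))
```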
43

Utvärdering av Random Indexing och PageRank som verktyg för automatisk textsammanfattning / Evaluation of Random Indexing and PageRank as tools for automatic text summarization

Gustavsson, Pär January 2009 (has links)
The amount of information on the internet is enormous and keeps growing, for better and for worse. In particular, it can be hard for groups such as the visually impaired and people with language difficulties to navigate and make use of all this information. There is therefore a need for well-functioning summarization tools for these groups, but also for other people who quickly need to be presented with the most important points of a set of texts. This study investigates how well the summarization system CogSum, which is based on Random Indexing, performs with and without the ranking algorithm PageRank enabled, on news texts and on texts from the Swedish Social Insurance Agency (Försäkringskassan). In addition, the summarization system SweSum is used as a baseline. The report includes a theoretical background covering automatic text summarization in general, including different evaluation methods, techniques and summarization systems. The evaluation was carried out using the automatic evaluation tool KTHxc on the news texts and another such tool, AutoSummENG, on the texts from Försäkringskassan. The results show that CogSum without PageRank performs better than CogSum with PageRank on 10 news texts, while the opposite holds for 5 texts from Försäkringskassan. SweSum, in turn, obtained the best result on the news texts and the worst on the texts from Försäkringskassan.
44

Hur går sökmotoroptimering till? - teori och praktik / How does search engine optimization work? Theory and practice

Backlund, Adam, Grevillius, Oliwer January 2013 (has links)
Today there are around 625 million websites on the internet. This report emphasizes the importance of being visible among them and how to get seen; the collective term for this is search engine optimization (SEO). In this study we treat the subject of search engine optimization with a focus on Google: what it is and how it is used in practice. Our purpose is to go deeper into this subject and form an understanding of what it involves. In the results we have, based on Google's guidelines, listed several concepts that help you optimize your website. We have also interviewed several businesses, since we want to know how they use search engine optimization. In the analysis we then compared Google's guidelines with the theory and the results to find out what stands out the most. We have concluded that an easily accessible page with good content is what characterizes search engine optimization and what makes a website more visible in the results list. We also found that Google highlights three specific aspects that characterize optimization for businesses: "optimize for the visitor", "anticipate search behaviour" and "site reputation".
45

ADAPTIVE AUTONOMY WITH UNRELIABLE COMMUNICATION

Moberg, Ragnar January 2017 (has links)
Underwater robotics faces severe constraints on wireless bandwidth, in the kilobit range. This makes a centralised approach to high-level mission management possibly less than ideal, due to inherent delays and possible temporary incompleteness of data during decision making. This thesis aims to propose, implement (in ROS) and test a distributed approach. An auction-based method was used for task assignment, together with a PageRank-based approach that models a trust-based hierarchy between autonomous agents, inferred from information exchange, in order to enforce decision conformity. Simulations were carried out using UWSim and a custom-made bandwidth limiter for ROS. It was concluded that the PageRank-based algorithm managed to uphold conformity and resolve conflicts during network slowdown, but did not always lead to the correct decisions being enforced.
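A minimal sketch of the trust-hierarchy idea, assuming an information-exchange graph between agents and using networkx's PageRank (this is an illustration of the concept, not the thesis's ROS implementation; agent names are made up):

```python
import networkx as nx

# Edge u -> v means agent u consulted (and trusted) information from agent v,
# so PageRank mass accumulates at frequently consulted agents.
exchanges = [("auv1", "auv2"), ("auv1", "auv3"), ("auv2", "auv3"), ("auv3", "auv1")]

G = nx.DiGraph()
G.add_edges_from(exchanges)

trust = nx.pagerank(G, alpha=0.85)    # 0.85 is the conventional damping factor
leader = max(trust, key=trust.get)    # defer conflicting decisions to the top-ranked agent
print(trust, leader)
```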
46

Indexation aléatoire et similarité inter-phrases appliquées au résumé automatique / Random indexing and inter-sentences similarity applied to automatic summarization

Vu, Hai Hieu 29 January 2016 (has links)
With the growing mass of textual data on the Web, automatic summarization of topic-oriented collections of documents has become an important research field of Natural Language Processing. The experiments described in this thesis were framed within this context. Evaluating the semantic similarity between sentences is central to our work, and we based our approach on distributional similarity and a vector representation of terms, with Wikipedia as a reference corpus. We proposed several similarity measures, which were evaluated and compared on different data sets: the SemEval 2014 challenge corpus for the English language and datasets we built ourselves for French. The good performance shown by our measures led us to use them in a multi-document summarization task, which implements a PageRank-type algorithm. The system was evaluated on the DUC 2007 datasets for English and the RPM2 corpus for French. This simple approach, based on a resource readily available in many languages, proved efficient and robust, and the encouraging outcomes open up real prospects for improvement.
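A minimal sketch of this kind of PageRank-based sentence ranking, assuming plain TF-IDF vectors and cosine similarity in place of the thesis's Wikipedia-based Random Indexing representation (an illustration, not the thesis's system):

```python
import numpy as np
import networkx as nx
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def rank_sentences(sentences, top_k=2):
    """Rank sentences by PageRank over a cosine-similarity graph (TextRank-style)."""
    vectors = TfidfVectorizer().fit_transform(sentences)  # TF-IDF stands in for Random Indexing
    sim = cosine_similarity(vectors)
    np.fill_diagonal(sim, 0.0)                            # drop self-similarity edges
    graph = nx.from_numpy_array(sim)                      # weighted, undirected similarity graph
    scores = nx.pagerank(graph, weight="weight")
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [sentences[i] for i in ranked[:top_k]]

docs = ["PageRank ranks nodes by link structure.",
        "Sentence similarity builds a graph over sentences.",
        "The weather was pleasant yesterday."]
print(rank_sentences(docs))
```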
47

Aplikace SEO technik pro marketing / Application of SEO for marketing

Knapovský, Martin January 2016 (has links)
The aim of this thesis is to describe in detail the current ways of optimizing websites for search engines (SEO) and to apply them in a broader marketing strategy for the Italian restaurant La Casa Degli Amici. The thesis presents evidence of the importance of SEO for marketing and can be used as a template for optimizing your own website. It is divided into two main parts, theoretical and practical. The theoretical part explains the concepts used in the design and implementation of the practical part and considers three main subjects: search engines, users and SEO, which covers website creation and optimization. It describes the history of search engines and the way search engines analyze and evaluate a website, as well as user search strategies and the way users read search results. In the SEO section we show its advantages, give a brief history, and describe SEO strategies, techniques and tools that can be used for SEO. In the practical part we put together a marketing strategy consisting of website creation, implementation of selected SEO techniques and social network marketing. The implementation of the marketing strategy is evaluated on data obtained during the period between 1. 8. 2016 and 31. 11. 2016. The owner of the restaurant provided sales data, which was used for the overall evaluation of the implemented marketing strategy.
48

Link Label Prediction in Signed Citation Network

Akujuobi, Uchenna Thankgod 12 April 2016 (has links)
Link label prediction is the problem of predicting the missing labels or signs of all the unlabeled edges in a network. For signed networks, these labels can be either positive or negative. In recent years, different algorithms have been proposed, using for example regression, trust propagation and matrix factorization. These approaches have tried to solve the link label prediction problem using ideas from social theories, and most of them predict a single missing label given that the labels of the other edges are known. However, in most real-world social graphs, the number of labeled edges is usually smaller than the number of unlabeled edges. Therefore, predicting a single edge label at a time would require multiple runs and is more computationally demanding. In this thesis, we look at the link label prediction problem on a signed citation network with missing edge labels. Our citation network consists of papers from three major machine learning and data mining conferences together with their references, and edges showing the relationship between them. An edge in our network is labeled positive (dataset relevant) if the reference is based on the dataset used in the paper, and negative otherwise. We present three approaches to predict the missing labels. The first approach converts the label prediction problem into a standard classification problem: we generate a set of features for each edge and then adopt Support Vector Machines to solve the classification problem. For the second approach, we reformulate the graph so that the edges are represented as nodes, with links showing similarities between them. We then adopt a label propagation method to propagate the labels of known nodes to those with unknown labels. In the third approach, we adopt a PageRank approach where we rank the nodes according to the number of incoming positive and negative edges and then set a threshold. Based on the ranks, we infer that an edge is positive if it points to a node above the threshold. Experimental results on our citation network corroborate the efficacy of these approaches. With each edge having a label, we also performed additional network analysis: we extracted a sub-network of the dataset-relevant edges and nodes in our citation network and then detected different communities in this extracted sub-network. To understand the detected communities, we performed a case study on several dataset communities. The study shows a relationship between the major topic areas in a dataset community and the data sources in the community.
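A rough sketch of the third, PageRank-based approach (an approximation of the idea as described, not the thesis's code; separate PageRank runs on the positive and negative subgraphs stand in for the incoming-edge counts, and the paper names are made up):

```python
import networkx as nx

def infer_edge_signs(labeled, unlabeled, threshold=0.0):
    """Label an unknown edge positive if its target node's (positive - negative)
    PageRank score exceeds a threshold, negative otherwise."""
    pos = nx.DiGraph([(u, v) for u, v, s in labeled if s > 0])
    neg = nx.DiGraph([(u, v) for u, v, s in labeled if s < 0])
    pr_pos = nx.pagerank(pos) if pos.number_of_edges() else {}
    pr_neg = nx.pagerank(neg) if neg.number_of_edges() else {}
    score = lambda v: pr_pos.get(v, 0.0) - pr_neg.get(v, 0.0)
    return {(u, v): (+1 if score(v) > threshold else -1) for u, v in unlabeled}

labeled = [("p1", "p2", +1), ("p3", "p2", +1), ("p4", "p5", -1)]
print(infer_edge_signs(labeled, unlabeled=[("p6", "p2"), ("p6", "p5")]))
# p2 is endorsed positively, p5 negatively, so the two unknown edges get +1 and -1.
```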
49

Damping Factor Analysis for PageRank

Scheie, Fredrik January 2022 (has links)
The purpose of this thesis is to present research on the damping factor in the PageRank algorithm, where symbolic calculations are used to compute the eigenvalues and eigenvectors of the Google matrix for both directed and undirected graphs. The graphs considered comprise all directed graphs on up to four vertices and, in addition, the undirected graphs on five vertices. A central research question has been to determine how the damping factor $d$ affects the dominant eigenvector of the corresponding graphs, and thereby how the PageRank is directly influenced. A few selected graphs, along with their calculations, were extracted and analyzed in terms of the parameter $d$. For the calculations, probability matrices were constructed for all graphs and computations were made using Matlab, which returned the eigenvalues and eigenvectors of the Google matrix along with the input probability matrix and the Google matrix itself. In addition, the thesis contains a theoretical portion on the theory behind PageRank, with the relevant proofs, theorems and definitions used throughout the thesis. Some brief mention of the historical background and applications of PageRank is also given. A discussion of the results is provided, involving the interaction of the damping factor with the dominant PageRank eigenvector. Lastly, a conclusion is given and future prospects relating to the topic are discussed. The work in this thesis is inspired by previous work by Silvestrov et al. in 2008; here we place further emphasis on the damping factor.
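For reference, the Google matrix studied here is commonly written $G = dS + \frac{1-d}{n}ee^T$, with $S$ a column-stochastic link matrix and $d$ the damping factor. The toy sketch below (an assumption-laden illustration, not the thesis's Matlab code) shows how the dominant eigenvector shifts as $d$ varies:

```python
import numpy as np

def google_pagerank(S, d):
    """Dominant eigenvector of the Google matrix G = d*S + (1-d)/n * ones(n,n),
    with S a column-stochastic link matrix and d the damping factor."""
    n = S.shape[0]
    G = d * S + (1.0 - d) / n * np.ones((n, n))
    vals, vecs = np.linalg.eig(G)
    idx = int(np.argmax(np.real(vals)))       # the eigenvalue 1 is the largest
    v = np.real(vecs[:, idx])
    return v / v.sum()                        # normalize to a probability vector

# Toy 3-node graph: 1 -> {2, 3}, 2 -> 3, 3 -> 1; columns of S sum to 1.
S = np.array([[0.0, 0.0, 1.0],
              [0.5, 0.0, 0.0],
              [0.5, 1.0, 0.0]])
for d in (0.5, 0.85, 0.99):
    print(d, np.round(google_pagerank(S, d), 3))   # scores spread away from uniform as d grows
```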
50

The Non-Backtracking Spectrum of a Graph and Non-Backtracking PageRank

Glover, Cory 15 July 2021 (has links)
This thesis studies two problems centered around non-backtracking walks on graphs. First, we analyze the spectrum of the non-backtracking matrix of a graph. We show how to obtain the eigenvectors of the non-backtracking matrix using a smaller matrix and in doing so, create a block diagonal decomposition which more clearly expresses the non-backtracking matrix eigenvalues. Additionally, we develop upper and lower bounds on the matrix spectrum and use the spectrum to investigate properties of the graph. Second, we investigate the difference between PageRank and non-backtracking PageRank. We show some instances where there is no difference and develop an algorithm to compare PageRank and non-backtracking PageRank under certain conditions using $\mu$-PageRank.
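For context, the non-backtracking matrix $B$ is the edge-indexed matrix given by the standard definition below (stated here for convenience; the thesis's exact notation may differ):

```latex
% Non-backtracking (Hashimoto) matrix: rows and columns are indexed by directed
% edges; an entry is 1 exactly when edge (u,v) can be followed by edge (x,y)
% without immediately reversing direction.
B_{(u \to v),(x \to y)} =
\begin{cases}
1 & \text{if } v = x \text{ and } y \neq u, \\
0 & \text{otherwise.}
\end{cases}
```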
