Spelling suggestions: "subject:"graph distance"" "subject:"raph distance""
1 |
Graph-theoretic techniques for web content mining [electronic resource] / by Adam Schenker.Schenker, Adam. January 2003 (has links)
Includes vita. / Title from PDF of title page. / Document formatted into pages; contains 145 pages. / Thesis (Ph.D.)--University of South Florida, 2003. / Includes bibliographical references. / Text (Electronic thesis) in PDF format. / ABSTRACT: In this dissertation we introduce several novel techniques for performing data mining on web documents which utilize graph representations of document content. Graphs are more robust than typical vector representations as they can model structural information that is usually lost when converting the original web document content to a vector representation. For example, we can capture information such as the location, order and proximity of term occurrence, which is discarded under the standard document vector representation models. Many machine learning methods rely on distance computations, centroid calculations, and other numerical techniques. Thus many of these methods have not been applied to data represented by graphs since no suitable graph-theoretical concepts were previously available. We introduce the novel Graph Hierarchy Construction Algorithm (GHCA), which performs topic-oriented hierarchical clustering of web search results modeled using graphs. / ABSTRACT: The system we created around this new algorithm and its prior version is compared with similar web search clustering systems to gauge its usefulness. An important advantage of this approach over conventional web search systems is that the results are better organized and more easily browsed by users. Next we present extensions to classical machine learning algorithms, such as the k-means clustering algorithm and the k-Nearest Neighbors classification algorithm, which allows the use of graphs as fundamental data items instead of vectors. We perform experiments comparing the performance of the new graph-based methods to the traditional vector-based methods for three web document collections. Our experimental results show an improvement for the graph approaches over the vector approaches for both clustering and classification of web documents. / ABSTRACT: An important advantage of the graph representations we propose is that they allow the computation of graph similarity in polynomial time; usually the determination of graph similarity with the techniques we use is an NP-Complete problem. In fact, there are some cases where the execution time of the graph-oriented approach was faster than the vector approaches. / System requirements: World Wide Web browser and PDF reader. / Mode of access: World Wide Web.
|
2 |
Graph-Theoretic Techniques for Web Content MiningSchenker, Adam 16 September 2003 (has links)
In this dissertation we introduce several novel techniques for performing data mining on web documents which utilize graph representations of document content. Graphs are more robust than typical vector representations as they can model structural information that is usually lost when converting the original web document content to a vector representation. For example, we can capture information such as the location, order and proximity of term occurrence, which is discarded under the standard document vector representation models. Many machine learning methods rely on distance computations, centroid calculations, and other numerical techniques. Thus many of these methods have not been applied to data represented by graphs since no suitable graph-theoretical concepts were previously available.
We introduce the novel Graph Hierarchy Construction Algorithm (GHCA), which performs topic-oriented hierarchical clustering of web search results modeled using graphs. The system we created around this new algorithm and its prior version is compared with similar web search clustering systems to gauge its usefulness. An important advantage of this approach over conventional web search systems is that the results are better organized and more easily browsed by users.
Next we present extensions to classical machine learning algorithms, such as the k-means clustering algorithm and the k-Nearest Neighbors classification algorithm, which allows the use of graphs as fundamental data items instead of vectors. We perform experiments comparing the performance of the new graph-based methods to the traditional vector-based methods for three web document collections. Our experimental results show an improvement for the graph approaches over the vector approaches for both clustering and classification of web documents. An important advantage of the graph representations we propose is that they allow the computation of graph similarity in polynomial time; usually the determination of graph similarity with the techniques we use is an NP-Complete problem. In fact, there are some cases where the execution time of the graph-oriented approach was faster than the vector approaches.
|
3 |
Distance Consistent Labellings and the Local List NumberHenricsson, Anders January 2023 (has links)
We study the local list number of graphs introduced by Lennerstad and Eriksson. A labelling of a graph on n vertices is a bijection from vertex set to the set {1,…, n}. Given such a labelling c a vertex u is distance consistent if for all vertices v and w |c(u)-c(v)|=|c(u)-c(w)|+1 implies d(u,w)≤ d(u,v). A graph G is k-distance consistent if there is a labelling with k distance-consistent vertices. The local list number of a graph G is the largest k such that G is k-distance consistent. We determine the local list number of cycles, complete bipartite graphs and some trees as well as prove bounds for some families of trees. We show that the local list number of even cycles is two, and of odd cycles is three. We also show that, if k, l≥ 3 , the complete bipartite graph Kk,l has local list number one, the star graph Sn=K1,n has local list number 3, and K2,k has local list number 2. Finally, we show that for each n≥3 and each k such that 3≤k≤n there is a tree with local list number k. / Vi studerar det lokala listtalet introducerat av Lennerstad och Eriksson. En märkning av en graf på n hörn är en bijektion från hörnmängden till mängden {1, . . . , n}. Givet en sådan märkning c är ett hörn u avståndskonsistent om för alla hörn v och w |c(u) − c(v)| = |c(u) − c(w)| = 1 implicerar d(u, w) ≤d(u, v). En graf G är k-avståndskonsistent om det nns en märkning med k avståndskonsistenta hörn. Det lokala listtalet av en graf G är det största k sådan att G är k -avståndskonsistent. Vi bestämmer den lokala listtalet av cykler, kompletta bipartita grafer och vissa träd och visar som gränser för några familjer av träd. Vi visar att det lokla listtalet av jämna cykler är två, och av udda cykler är tre. Vi visar också att, om k, l ≥ 3, den kompletta bipartita grafen Kk,l har lokalt listtal ett, stjärngrafen Sn = K1,n har lokalt listtal 3, och K2,k har lokalt listtal 2. Slutligen, visar vi att för varje n≥3 och varje k sådant att 3 ≤ k≤n finns ett träd med lokalt listtal k.
|
4 |
Une approche déclarative pour la génération de modèles / A Declarative Approach for Model GenerationFerdjoukh, Adel 20 October 2016 (has links)
Disposer de données dans le but de valider ou tester une approche ou un concept est d'une importance primordiale dans beaucoup de domaines différents. Malheureusement, ces données ne sont pas toujours disponibles, sont coûteuses à obtenir, ou bien ne répondent pas à certaines exigences de qualité ce qui les rend inutiles dans certains cas de figure.Un générateur automatique de données est un bon moyen pour obtenir facilement et rapidement des données valides, de différentes tailles, pertinentes et diversifiées. Dans cette thèse, nous proposons une nouvelle approche complète, dirigée par les modèles et basée sur la programmation par contraintes pour la génération de données. / Owning data is useful in many different fields. Data can be used to test and to validate approaches, algorithms and concepts. Unfortunately, data is rarely available, is cost to obtain, or is not adapted to most of cases due to a lack of quality.An automated data generator is a good way to generate quickly and easily data that are valid, in different sizes, likelihood and diverse.In this thesis, we propose a novel and complete model driven approach, based on constraint programming for automated data generation.
|
5 |
Bruhatovy-Titsovy budovy / Bruhat-Tits buildingsLachman, Dominik January 2017 (has links)
Bruhat-Tits buildings are a fundamental concept in the study of linear algebraic groups over general fields. The general goal of this thesis is to introduce buildings in the basic case of SLd(Qp) and to explicitly describe some of their geometrical and combinatorial properties - building are abstract simplicial complexes. After the general construction (Chapter 1) we focus in detail to the case of SL2(Qp). We work with simplices using certain matrix representatives. We explicitly describe the building and give a formula for graph distance. In Chapter 3 we consider the general case SLd(Qp), d ≥ 2. There we introduce a new concept of distance formulas. In Chapter 4 we prove some theorems which are satisfied by buildings in general. Chapter 5 studies the problem of determining so-called gallery distance of two simplices. In the last Chapter 6 we generalize the distance formulas to the case of three vertices. 1
|
Page generated in 0.0813 seconds