Global ETD Search

71	A graph representation of event intervals for efficient clustering and classification / En grafrepresentation av händelsesintervall föreffektiv klustering och klassificering Lee, Zed Heeje January 2020 (has links) Sequences of event intervals occur in several application domains, while their inherent complexity hinders scalable solutions to tasks such as clustering and classification. In this thesis, we propose a novel spectral embedding representation of event interval sequences that relies on bipartite graphs. More concretely, each event interval sequence is represented by a bipartite graph by following three main steps: (1) creating a hash table that can quickly convert a collection of event interval sequences into a bipartite graph representation, (2) creating and regularizing a bi-adjacency matrix corresponding to the bipartite graph, (3) defining a spectral embedding mapping on the bi-adjacency matrix. In addition, we show that substantial improvements can be achieved with regard to classification performance through pruning parameters that capture the nature of the relations formed by the event intervals. We demonstrate through extensive experimental evaluation on five real-world datasets that our approach can obtain runtime speedups of up to two orders of magnitude compared to other state-of-the-art methods and similar or better clustering and classification performance. / Sekvenser av händelsesintervall förekommer i flera applikationsdomäner, medan deras inneboende komplexitet hindrar skalbara lösningar på uppgifter som kluster och klassificering. I den här avhandlingen föreslår vi en ny spektral inbäddningsrepresentation av händelsens intervallsekvenser som förlitar sig på bipartitgrafer. Mer konkret representeras varje händelsesintervalsekvens av en bipartitgraf genom att följa tre huvudsteg: (1) skapa en hashtabell som snabbt kan konvertera en samling händelsintervalsekvenser till en bipartig grafrepresentation, (2) skapa och reglera en bi-adjacency-matris som motsvarar bipartitgrafen, (3) definiera en spektral inbäddning på bi-adjacensmatrisen. Dessutom visar vi att väsentliga förbättringar kan uppnås med avseende på klassificeringsprestanda genom beskärningsparametrar som fångar arten av relationerna som bildas av händelsesintervallen. Vi demonstrerar genom omfattande experimentell utvärdering på fem verkliga datasätt att vår strategi kan erhålla runtime-hastigheter på upp till två storlekar jämfört med andra modernaste metoder och liknande eller bättre kluster- och klassificerings- prestanda. event intervals bipartite graph spectral embedding clustering classification händelsesintervall bipartitgraf spektral inbäddning klustering klassificering Computer and Information Sciences Data- och informationsvetenskap
72	Sedimentological Characterization of Matrix-rich and Associated Matrix-poor Sandstones in Deep-marine Slope and Basin-floor Deposits Ningthoujam, Jagabir 03 October 2022 (has links) Deep-marine sandstones containing significant (> 10%) detrital mud (silt and clay) matrix have become increasingly recognized, but mostly in drill core or poorly exposed outcrops where details of their vertical and lateral variability are poorly captured. Exceptional vertical and along-strike exposures of matrix-rich and associated matrix-poor deposits in deep-marine strata of the passive margin Neoproterozoic Windermere Supergroup and foreland basin Ordovician Cloridorme Formation, provide an unparalleled opportunity to document such characteristics. In both study areas, strata form a 100s m long depositional continuum that at its upflow end consists of thick-bedded matrix-poor sandstone (<20% matrix) that transforms progressively downflow to medium- to thick-bedded muddy sandstone (20 – 50% matrix) to medium-bedded bipartite facies with a basal sandy (30 – 60% matrix) part overlain sharply by a muddier part (40 – 80% matrix), and then to thin-bedded sandy mudstone (50 – 90% matrix). This depositional continuum is then overlain everywhere by a thin- to very thin-bedded traction-structured sandstone and/or silty mudstone cap. This consistent lithofacies change is interpreted to reflect particle settling in a rapidly but systematically evolving, negligibly-sheared sand-mud suspension developed along the margins (Windermere) and downflow terminus (Cloridorme) of a high-energy, mud-enriched avulsion jet. Stratigraphically upward, beds of similar lithofacies type succeed one another vertically and transform to the next facies in the depositional continuum at about the same along-strike position, forming stratal units 2–9 beds thick whose grain-size distribution gradually decreases upward. This spatial and temporal regularity is interpreted to be caused by multiple surges of a single, progressively waning turbidity current, with sufficient lag between successive surges for the deposition of a traction-structured sandstone overlain by mudstone cap. Furthermore, the systematic backstepping or side-stepping recognized at the stratal unit scale in both the Windermere and Cloridorme is interpreted to be driven by a combination of knickpoint migration and local topographic steering of the flows, which continued until the supply of mud from local seafloor erosion became exhausted, the main channel avulsed elsewhere, or a new stratal element developed. deep-marine matrix-rich sandstones avulsion splay particle settling turbidity current pulsing knickpoint migration hybrid event bed bipartite facies
73	Rotulação de símbolos matemáticos manuscritos via casamento de expressões / Labeling of Handwritten Mathematical Symbols via Expression Matching Honda, Willian Yukio 23 January 2013 (has links) O problema de reconhecimento de expressões matemáticas manuscritas envolve três subproblemas importantes: segmentação de símbolos, reconhecimento de símbolos e análise estrutural de expressões. Para avaliar métodos e técnicas de reconhecimento, eles precisam ser testados sobre conjuntos de amostras representativos do domínio de aplicação. Uma das preocupações que tem sido apontada ultimamente é a quase inexistência de base de dados pública de expressões matemáticas, o que dificulta o desenvolvimento e comparação de diferentes abordagens. Em geral, os resultados de reconhecimento apresentados na literatura restringem-se a conjuntos de dados pequenos, não disponíveis publicamente, e muitas vezes formados por dados que visam avaliar apenas alguns aspectos específicos do reconhecimento. No caso de expressões online, para treinar e testar reconhecedores de símbolos, as amostras são em geral obtidas solicitando-se que as pessoas escrevam uma série de símbolos individualmente e repetidas vezes. Tal tarefa é monótona e cansativa. Uma abordagem alternativa para obter amostras de símbolos seria solicitar aos usuários a transcrição de expressões modelo previamente definidas. Dessa forma, a escrita dos símbolos seria realizada de forma natural, menos monótona, e várias amostras de símbolos poderiam ser obtidas de uma única expressão. Para evitar o trabalho de anotar manualmente cada símbolo das expressões transcritas, este trabalho propõe um método para casamento de expressões matemáticas manuscritas, no qual símbolos de uma expressão transcrita por um usuário são associados aos correspondentes símbolos (previamente identificados) da expressão modelo. O método proposto é baseado em uma formulação que reduz o problema a um problema de associação simples, no qual os custos são definidos em termos de características dos símbolos e estrutura da expressão. Resultados experimentais utilizando o método proposto mostram taxas médias de associação correta superiores a 99%. / The problem of recognizing handwritten mathematical expressions includes three important subproblems: symbol segmentation, symbol recognition, and structural analysis of expressions. In order to evaluate recognition methods and techniques, they should be tested on representative sample sets of the application domain. One of the concerns that are being repeatedly pointed recently is the almost non-existence of public representative datasets of mathematical expressions, which makes difficult the development and comparison of distinct approaches. In general, recognition results reported in the literature are restricted to small datasets, not publicly available, and often consisting of data aiming only evaluation of some specific aspects of the recognition. In the case of online expressions, to train and test symbol recognizers, samples are in general obtained asking users to write a series of symbols individually and repeatedly. Such task is boring and tiring. An alternative approach for obtaining samples of symbols would be to ask users to transcribe previously defined model expressions. By doing so, writing would be more natural and less boring, and several symbol samples could be obtained from one transcription. To avoid the task of manually labeling the symbols of the transcribed expressions, in this work a method for handwritten expression matching, in which symbols of a transcribed expression are assigned to the corresponding ones in the model expression, is proposed. The proposed method is based on a formulation that reduces the matching problem to a linear assignment problem, where costs are defined based on symbol features and expression structure. Experimental results using the proposed method show that mean correct assignment rate superior to 99% is achieved. bipartite graph matching casamento de expressões matemáticas emparelhamento de grafos bipartidos expressões matemáticas manuscritas handwritten mathematical expressions mathematical expressions matching mathematical symbols annotation mathematical symbols labeling rotulação de símbolos matemáticos
74	Um algoritmo para o Problema do Isomorfismo de Grafos Rodrigues, Edilson José January 2014 (has links) Orientador: Prof. Dr. Daniel Morgato Martin / Dissertação (mestrado) - Universidade Federal do ABC, Programa de Pós-Graduação em Ciências da Computação, 2014. / Neste trabalho estudamos o Problema do Isomorfismo de Grafos e a sua complexidade para resolvê-lo. Nossa principal contribuição é a proposta de um algoritmo para o caso geral do Problema, baseado no particionamento do conjunto de vértices e em emparelhamentos perfeitos de grafos bipartidos. Estudamos também o algoritmo de Brendan McKay, que é o mais rápido algoritmo para o Problema do Isomorfismo de Grafos conhecido. Ao final, implementamos o algoritmo proposto nesta dissertação e o algoritmo de McKay. Após a comparação dos dois algoritmos, verificamos que os resultados obtidos pelo algoritmo proposto não foram satisfatórios, porém apresentamos possíveis melhorias de como deixá-lo mais eficiente. / In this work we study the Graph Isomorphism Problem and their complexity to solve it. Our main contribution is to propose an algorithm for the general case of the Problem, based on partitioning the set vertex and perfect matchings of bipartite graphs. We also studied the Brendan McKay¿s algorithm, who is the fastest algorithm for the Graph Isomorphism Problem known. At the end, we implemented the algorithm proposed in this dissertation and McKay¿s algorithm. After comparison of the two algorithms, we found that the results obtained by the proposed algorithm were not satisfactory, but improvements are possible as to make it more efficient. PROBLEMA DO ISOMORFISMO DE GRAFOS EMPARELHAMENTOS EM GRAFOS PARTICIONAMENTO DO CONJUNTO DE VERTICES GRAPH ISOMORPHISM PROBLEM MATCHINGS IN BIPARTITE GRAPHS PARTITIONING OF VERTEX SET
75	Propagação em grafos bipartidos para extração de tópicos em fluxo de documentos textuais / Propagation in bipartite graphs for topic extraction in stream of textual data Faleiros, Thiago de Paulo 08 June 2016 (has links) Tratar grandes quantidades de dados é uma exigência dos modernos algoritmos de mineração de texto. Para algumas aplicações, documentos são constantemente publicados, o que demanda alto custo de armazenamento em longo prazo. Então, é necessário criar métodos de fácil adaptação para uma abordagem que considere documentos em fluxo, e que analise os dados em apenas um passo sem requerer alto custo de armazenamento. Outra exigência é a de que essa abordagem possa explorar heurísticas a fim de melhorar a qualidade dos resultados. Diversos modelos para a extração automática das informações latentes de uma coleção de documentos foram propostas na literatura, dentre eles destacando-se os modelos probabilísticos de tópicos. Modelos probabilísticos de tópicos apresentaram bons resultados práticos, sendo estendidos para diversos modelos com diversos tipos de informações inclusas. Entretanto, descrever corretamente esses modelos, derivá-los e em seguida obter o apropriado algoritmo de inferência são tarefas difíceis, exigindo um tratamento matemático rigoroso para as descrições das operações efetuadas no processo de descoberta das dimensões latentes. Assim, para a elaboração de um método simples e eficiente para resolver o problema da descoberta das dimensões latentes, é necessário uma apropriada representação dos dados. A hipótese desta tese é a de que, usando a representação de documentos em grafos bipartidos, é possível endereçar problemas de aprendizado de máquinas, para a descoberta de padrões latentes em relações entre objetos, por exemplo nas relações entre documentos e palavras, de forma simples e intuitiva. Para validar essa hipótese, foi desenvolvido um arcabouço baseado no algoritmo de propagação de rótulos utilizando a representação em grafos bipartidos. O arcabouço, denominado PBG (Propagation in Bipartite Graph), foi aplicado inicialmente para o contexto não supervisionado, considerando uma coleção estática de documentos. Em seguida, foi proposta uma versão semissupervisionada, que considera uma pequena quantidade de documentos rotulados para a tarefa de classificação transdutiva. E por fim, foi aplicado no contexto dinâmico, onde se considerou fluxo de documentos textuais. Análises comparativas foram realizadas, sendo que os resultados indicaram que o PBG é uma alternativa viável e competitiva para tarefas nos contextos não supervisionado e semissupervisionado. / Handling large amounts of data is a requirement for modern text mining algorithms. For some applications, documents are published constantly, which demand a high cost for long-term storage. So it is necessary easily adaptable methods for an approach that considers documents flow, and be capable of analyzing the data in one step without requiring the high cost of storage. Another requirement is that this approach can exploit heuristics in order to improve the quality of results. Several models for automatic extraction of latent information in a collection of documents have been proposed in the literature, among them probabilistic topic models are prominent. Probabilistic topic models achieve good practical results, and have been extended to several models with different types of information included. However, properly describe these models, derive them, and then get appropriate inference algorithms are difficult tasks, requiring a rigorous mathematical treatment for descriptions of operations performed in the latent dimensions discovery process. Thus, for the development of a simple and efficient method to tackle the problem of latent dimensions discovery, a proper representation of the data is required. The hypothesis of this thesis is that by using bipartite graph for representation of textual data one can address the task of latent patterns discovery, present in the relationships between documents and words, in a simple and intuitive way. For validation of this hypothesis, we have developed a framework based on label propagation algorithm using the bipartite graph representation. The framework, called PBG (Propagation in Bipartite Graph) was initially applied to the unsupervised context for a static collection of documents. Then a semi-supervised version was proposed which need only a small amount of labeled documents to the transductive classification task. Finally, it was applied in the dynamic context in which flow of textual data was considered. Comparative analyzes were performed, and the results indicated that the PBG is a viable and competitive alternative for tasks in the unsupervised and semi-supervised contexts. Aprendizado em grafos bipartidos Dimensionality reduction Extração de tópicos Fluxo de dados textuais Learning in bipartite graphs Redução de dimensionalidade Text data stream Topic extraction
76	圖形的訊息傳遞問題 / Message transmission problems of graphs 余銘芬, Yu, Ming Fen Unknown Date (has links) 給定一個圖形G，以及集合M，M為一描述圖形G中各點擁有訊息之情形的集合。圖形G相對於M的的傳遞數是指，於最短時間內，讓圖形中全部點皆獲得所有種類之訊息，並將符號記為t(G;M) 。傳遞過程中每個時間單位將受到下列限制: (1)圖形上的每個點只能與自己相鄰的點交換訊息。 (2)兩個相鄰的點在每個單位時間裡至多只能交換一個訊息。我們希望可以找到在最短的時間裡完成傳遞的方法，也就是讓圖形G中的每一個點都獲得所有種類之訊息，我們稱此類型問題為訊息傳遞問題。在本論文中，給定一個圖形G，且圖形G中每個點的訊息只有一個，G中任兩點的訊息都不會相同，符號t(G)代表完成傳遞所需最少的時間單位。我們給定圖形的傳遞數的上界與下界，並且定出一套公式計算樹圖、完全二部圖及雙環網路圖的傳遞數。 / Given a graph G together with a set M , the transmission number of G corresponding to M , denoted by t(G;M), is the minimum number of time needed to complete the transmission , that is, to let all the vertices in G know all the messages in M , subject to the constraints that at each time unit, each vertex can interchange messages with all its neighbors, but the number of messages that two vertices can interchange at each time unit is at most one. We want to find the minimum number of time units required to complete the transmission, that is, to let all the vertices in G know all the messages. We call such a problem the message transmission problem. Given a graph G, the transmission number of G, denoted t(G), is the minimum number of time units required to complete the transmission, under the condition that \|m(v)\|=1 for all v in V(G). In this thesis, we give upper and lower bounds for the transmission number of G, and give formulas to compute the transmission numbers of trees, complete bipartite graphs and double loop networks. 傳遞數傳遞集樹圖完全二部圖雙環網路 transmission number transmitting set tree complete bipartite graph double loop network.
77	Boxicity, Cubicity And Vertex Cover Shah, Chintan D 08 1900 (has links) The boxicity of a graph G, denoted as box(G), is the minimum dimension d for which each vertex of G can be mapped to a d-dimensional axis-parallel box in Rd such that two boxes intersect if and only if the corresponding vertices of G are adjacent. An axis-parallel box is a generalized rectangle with sides parallel to the coordinate axes. If additionally, we restrict all sides of the rectangle to be of unit length, the new parameter so obtained is called the cubicity of the graph G, denoted by cub(G). F.S. Roberts had shown that for a graph G with n vertices, box(G) ≤ and cub(G) ≤ . A minimum vertex cover of a graph G is a minimum cardinality subset S of the vertex set of G such that each edge of G has at least one endpoint in S. We show that box(G) ≤ +1 and cub(G)≤ t+ ⌈log2(n −t)⌉−1 where t is the cardinality of a minimum vertex cover. Both these bounds are tight. For a bipartite graph G, we show that box(G) ≤ and this bound is tight. We observe that there exist graphs of very high boxicity but with very low chromatic num-ber. For example, there exist bipartite (2 colorable) graphs with boxicity equal to . Interestingly, if boxicity is very close to , then the chromatic number also has to be very high. In particular, we show that if box(G) = −s, s ≥ 02, then x(G) ≥ where X(G) is the chromatic number of G. We also discuss some known techniques for ﬁndingan upper boundon the boxicityof a graph -representing the graph as the intersection of graphs with boxicity 1 (boxicity 1 graphs are known as interval graphs) and covering the complement of the graph by co-interval graphs (a co-interval graph is the complement of an interval graph). Boxicity Cubicity Vortex Cover Graphs - Boxicity Graphs - Cubicity Bipartite Graphs Interval Graphs Intersection Graphs Chromatic Number Box(G) Cub(G) Computer Science
78	A Combinatorial Algorithm for Minimizing the Maximum Laplacian Eigenvalue of Weighted Bipartite Graphs Helmberg, Christoph, Rocha, Israel, Schwerdtfeger, Uwe 13 November 2015 (has links) (PDF) We give a strongly polynomial time combinatorial algorithm to minimise the largest eigenvalue of the weighted Laplacian of a bipartite graph. This is accomplished by solving the dual graph embedding problem which arises from a semidefinite programming formulation. In particular, the problem for trees can be solved in time cubic in the number of vertices. Bipartiter Graph Baum gewichtete Laplace Matrix Einbettung von Graphen bipartite graph tree weighted Laplacian matrix graph embedding ddc:510 bipartiter Graph Baum
79	Codes, graphs and designs from maximal subgroups of alternating groups Mumba, Nephtale Bvalamanja January 2018 (has links) Philosophiae Doctor - PhD (Mathematics) / The main theme of this thesis is the construction of linear codes from adjacency matrices or sub-matrices of adjacency matrices of regular graphs. We first examine the binary codes from the row span of biadjacency matrices and their transposes for some classes of bipartite graphs. In this case we consider a sub-matrix of an adjacency matrix of a graph as the generator of the code. We then shift our attention to uniform subset graphs by exploring the automorphism groups of graph covers and some classes of uniform subset graphs. In the sequel, we explore equal codes from adjacency matrices of non-isomorphic uniform subset graphs and finally consider codes generated by an adjacency matrix formed by adding adjacency matrices of two classes of uniform subset graphs. Alternating groups Automorphism groups Antipodal graphs Bipartite graphs Designs Graphs Graph covers Incidence design Linear code Maximal subgroups Neighbourhood design Permutation decoding
80	Propagação em grafos bipartidos para extração de tópicos em fluxo de documentos textuais / Propagation in bipartite graphs for topic extraction in stream of textual data Thiago de Paulo Faleiros 08 June 2016 (has links) Tratar grandes quantidades de dados é uma exigência dos modernos algoritmos de mineração de texto. Para algumas aplicações, documentos são constantemente publicados, o que demanda alto custo de armazenamento em longo prazo. Então, é necessário criar métodos de fácil adaptação para uma abordagem que considere documentos em fluxo, e que analise os dados em apenas um passo sem requerer alto custo de armazenamento. Outra exigência é a de que essa abordagem possa explorar heurísticas a fim de melhorar a qualidade dos resultados. Diversos modelos para a extração automática das informações latentes de uma coleção de documentos foram propostas na literatura, dentre eles destacando-se os modelos probabilísticos de tópicos. Modelos probabilísticos de tópicos apresentaram bons resultados práticos, sendo estendidos para diversos modelos com diversos tipos de informações inclusas. Entretanto, descrever corretamente esses modelos, derivá-los e em seguida obter o apropriado algoritmo de inferência são tarefas difíceis, exigindo um tratamento matemático rigoroso para as descrições das operações efetuadas no processo de descoberta das dimensões latentes. Assim, para a elaboração de um método simples e eficiente para resolver o problema da descoberta das dimensões latentes, é necessário uma apropriada representação dos dados. A hipótese desta tese é a de que, usando a representação de documentos em grafos bipartidos, é possível endereçar problemas de aprendizado de máquinas, para a descoberta de padrões latentes em relações entre objetos, por exemplo nas relações entre documentos e palavras, de forma simples e intuitiva. Para validar essa hipótese, foi desenvolvido um arcabouço baseado no algoritmo de propagação de rótulos utilizando a representação em grafos bipartidos. O arcabouço, denominado PBG (Propagation in Bipartite Graph), foi aplicado inicialmente para o contexto não supervisionado, considerando uma coleção estática de documentos. Em seguida, foi proposta uma versão semissupervisionada, que considera uma pequena quantidade de documentos rotulados para a tarefa de classificação transdutiva. E por fim, foi aplicado no contexto dinâmico, onde se considerou fluxo de documentos textuais. Análises comparativas foram realizadas, sendo que os resultados indicaram que o PBG é uma alternativa viável e competitiva para tarefas nos contextos não supervisionado e semissupervisionado. / Handling large amounts of data is a requirement for modern text mining algorithms. For some applications, documents are published constantly, which demand a high cost for long-term storage. So it is necessary easily adaptable methods for an approach that considers documents flow, and be capable of analyzing the data in one step without requiring the high cost of storage. Another requirement is that this approach can exploit heuristics in order to improve the quality of results. Several models for automatic extraction of latent information in a collection of documents have been proposed in the literature, among them probabilistic topic models are prominent. Probabilistic topic models achieve good practical results, and have been extended to several models with different types of information included. However, properly describe these models, derive them, and then get appropriate inference algorithms are difficult tasks, requiring a rigorous mathematical treatment for descriptions of operations performed in the latent dimensions discovery process. Thus, for the development of a simple and efficient method to tackle the problem of latent dimensions discovery, a proper representation of the data is required. The hypothesis of this thesis is that by using bipartite graph for representation of textual data one can address the task of latent patterns discovery, present in the relationships between documents and words, in a simple and intuitive way. For validation of this hypothesis, we have developed a framework based on label propagation algorithm using the bipartite graph representation. The framework, called PBG (Propagation in Bipartite Graph) was initially applied to the unsupervised context for a static collection of documents. Then a semi-supervised version was proposed which need only a small amount of labeled documents to the transductive classification task. Finally, it was applied in the dynamic context in which flow of textual data was considered. Comparative analyzes were performed, and the results indicated that the PBG is a viable and competitive alternative for tasks in the unsupervised and semi-supervised contexts. Aprendizado em grafos bipartidos Extração de tópicos Fluxo de dados textuais Redução de dimensionalidade Dimensionality reduction Learning in bipartite graphs Text data stream Topic extraction

Search results