341

Escalonamento de painéis reforçados sujeitos a cargas de impacto. / Scaling of reinforced panels subjected to impact loads.

Mazzariol, Leonardo Monteiro 05 October 2012 (has links)
Esta dissertação avalia a aplicação de leis de similaridade distorcidas no contexto de impacto estrutural. A análise se apoia em um estudo teórico, numérico e experimental do impacto de um indentador contra um painel duplo. O modelo analítico descreve de forma simplificada o comportamento de partes desta estrutura e as simulações numéricas reproduzem os ensaios experimentais que utilizam um protótipo (tamanho real) e modelo (escala reduzida). A diferença nas propriedades mecânicas do material de construção do modelo e protótipo é considerada no procedimento de escalonamento, bem como os efeitos de escala por causa da taxa de deformação. Ainda, diante das limitações do aparato experimental, é desenvolvida uma formulação para as leis de similaridade que permite variações da massa de impacto e da velocidade inicial do elemento impactante no ensaio. Dessa forma, apresenta-se um procedimento que permite inferir o comportamento de estruturas em tamanho real sob carregamento de impacto através do uso de estruturas em escala, mesmo com as limitações de aparato ou diferenças das propriedades mecânicas do material. / This work evaluates distorted similarity laws applied to structural impact. The analysis is based on theoretical, numerical and experimental studies of the impact of an indenter against a reinforced panel. The theoretical approach describes, in a simplified manner, the behaviour of the structure components, while the numerical analysis reproduces the experiments performed in two scales: prototype (large scale) and model (small scale). Although the panels are made of different materials, this mismatch in mechanical behaviour is taken into account in the scaling procedure, as well as the scale effects due to strain rate. Owing to the limitations of the experimental apparatus, a formulation is developed that allows flexibility in experiment variables such as initial velocity and impact mass.
In summary, a procedure is developed that allows the behaviour of a full-scale structure under impact loading to be inferred from scaled models, even when prototype and model are made of different materials and the experimental apparatus imposes limitations.
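As an illustration of the kind of distorted-scaling bookkeeping the abstract describes, the sketch below applies classical similarity arguments (impact energy scaling with flow stress and geometric scale) and lets impact mass and velocity trade off at constant energy. The specific relations are textbook-style assumptions for illustration, not necessarily the exact laws derived in the dissertation.

```python
# Illustrative sketch of distorted scaling for impact tests.  The replica
# relations (mass ~ beta^3 * density ratio, velocity ~ sqrt(stress/density
# ratio)) follow classical similarity arguments and are assumptions here,
# not the dissertation's exact formulation.

def model_test_conditions(beta, sigma_ratio, rho_ratio, V_p, G_p, mass_factor=1.0):
    """Return (V_m, G_m) for a model test at geometric scale beta.

    beta        -- model/prototype length ratio (e.g. 1/4)
    sigma_ratio -- model/prototype flow stress ratio
    rho_ratio   -- model/prototype density ratio
    V_p, G_p    -- prototype impact velocity and mass
    mass_factor -- distortion knob: scale the model mass up or down and
                   compensate with velocity so impact energy is preserved
    """
    # Perfect-replica mass and velocity (identical materials would give
    # sigma_ratio = rho_ratio = 1 and hence V_m = V_p).
    G_replica = G_p * beta**3 * rho_ratio
    V_replica = V_p * (sigma_ratio / rho_ratio) ** 0.5
    # Required model kinetic energy.
    E_m = 0.5 * G_replica * V_replica**2
    # Distort: pick a different mass, recover the velocity from the energy.
    G_m = G_replica * mass_factor
    V_m = (2.0 * E_m / G_m) ** 0.5
    return V_m, G_m
```

Doubling `mass_factor` lowers the required impact velocity by a factor of sqrt(2) while keeping the model-scale impact energy unchanged, which is the kind of flexibility the abstract attributes to the extended similarity laws.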
342

Time Series Data Analytics

Ahsan, Ramoza 29 April 2019 (has links)
Given the ubiquity of time series data, and the exponential growth of databases, there has recently been an explosion of interest in time series data mining. Finding similar trends and patterns among time series data is critical for many applications ranging from financial planning, weather forecasting and stock analysis to policy making. With time series being high-dimensional objects, detection of similar trends, especially at the granularity of subsequences or among time series of different lengths and temporal misalignments, incurs prohibitively high computation costs. Finding trends using non-metric correlation measures further compounds the complexity, as traditional pruning techniques cannot be directly applied. My dissertation addresses these challenges while meeting the need to achieve near real-time responsiveness. First, for retrieving exact similarity results using Lp-norm distances, we design a two-layered time series index for subsequence matching. Time series relationships are compactly organized in a directed acyclic graph embedded with similarity vectors capturing subsequence similarities. Powerful pruning strategies leveraging the graph structure greatly reduce the number of time series as well as subsequence comparisons, resulting in speed-ups of several orders of magnitude. Second, to support a rich diversity of correlation analytics operations, we compress time series into Euclidean-based clusters augmented by a compact overlay graph encoding correlation relationships. Such a framework supports a rich variety of operations, including retrieving positive or negative correlations, self-correlations and finding groups of correlated sequences. Third, to support flexible similarity specification using computationally expensive warped distances such as Dynamic Time Warping, we design data reduction strategies leveraging the inexpensive Euclidean distance with subsequent time-warped matching on the reduced data.
This facilitates the comparison of sequences of different lengths and with flexible alignment still within a few seconds of response time. Comprehensive experimental studies using real-world and synthetic datasets demonstrate the efficiency, effectiveness and quality of the results achieved by our proposed techniques as compared to the state-of-the-art methods.
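A minimal sketch of the filter-and-refine idea described above: rank candidates with a cheap Euclidean distance on reduced (piecewise-aggregate) representations, then refine only a shortlist with exact DTW. The reduction and shortlist size here are illustrative assumptions; a production system would prune with a proven DTW lower bound such as LB_Keogh rather than plain PAA-Euclidean.

```python
def dtw(a, b):
    """Classic O(len(a)*len(b)) dynamic time warping distance."""
    n, m = len(a), len(b)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

def paa(series, segments):
    """Piecewise aggregate approximation: mean of equal-width chunks.
    Reduces any series to `segments` values, so series of different
    lengths become directly comparable."""
    n = len(series)
    return [sum(series[i * n // segments:(i + 1) * n // segments]) /
            max(1, (i + 1) * n // segments - i * n // segments)
            for i in range(segments)]

def filter_and_refine(query, candidates, k=1, segments=4):
    """Rank candidates by cheap PAA-Euclidean distance, then refine the
    best few with exact DTW (heuristic prune, for illustration only)."""
    q = paa(query, segments)
    cheap = sorted(candidates,
                   key=lambda c: sum((x - y) ** 2
                                     for x, y in zip(paa(c, segments), q)))
    shortlist = cheap[:max(k * 2, 2)]      # refine a small shortlist only
    return sorted(shortlist, key=lambda c: dtw(query, c))[:k]
```

Because both query and candidates are reduced to the same number of segments before the cheap comparison, the scheme also handles sequences of different lengths, as the abstract emphasizes.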
343

Analyse automatique des crises d'épilepsie du lobe temporal à partir des EEG de surface / Automatic analysis of temporal lobe epileptic seizures from scalp EEG

Caparos, Matthieu 05 October 2006 (has links)
L’objectif de la thèse est le développement d’une méthode de caractérisation des crises d’épilepsie du lobe temporal à partir des EEG de surface et plus particulièrement de la zone épileptogène (ZE) à l’origine des crises. Des travaux récents ont démontré une évolution des synchronisations entre structures cérébrales permettant une caractérisation de la dynamique des crises du lobe temporal. La comparaison de différentes méthodes de mesure de relation a permis la mise en évidence des avantages du coefficient de corrélation non-linéaire dans l’étude de l’épilepsie par les EEG de surface. L’exploitation de l’évolution de ce coefficient est à la base de trois applications de traitement automatique du signal EEG : - détermination de la latéralisation de la ZE au départ d’une crise, - recherche d’une signature épileptique, - classification des crises du lobe temporal en deux groupes / The objective of this work was the development of a methodology for characterizing temporal lobe epileptic seizures through scalp EEG analysis, and in particular the epileptogenic zone (EZ) at the origin of the seizures. Recent research has shown an evolution of the synchronization between cerebral structures, allowing a characterization of the dynamics of the seizures. The comparison of different relation-measurement methods demonstrated the advantages of the non-linear correlation coefficient in the study of epileptic seizures from scalp EEGs. The evolution of this coefficient is the basis of three signal processing applications: - determination of the lateralization of the EZ at seizure onset, - search for an epileptic signature at seizure onset, - classification of temporal lobe seizures into two groups.
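The non-linear correlation coefficient mentioned above (often written h²) measures how much of one signal's variance is explained by a possibly non-linear regression on another, reducing to r² when the relationship is linear. The bin-mean variant below is a simplified sketch; implementations in the EEG literature differ in how the regression curve is fitted, so this is not the thesis's exact estimator.

```python
def h2(x, y, bins=8):
    """Nonlinear correlation coefficient h2(y|x) in [0, 1].

    Approximates the regression curve of y on x by the mean of y within
    equal-width bins of x, then measures the fraction of the variance of
    y explained by that curve (a simplified, illustrative estimator).
    """
    lo, hi = min(x), max(x)
    width = (hi - lo) / bins or 1.0          # guard against constant x
    sums = [0.0] * bins
    counts = [0] * bins
    for xi, yi in zip(x, y):
        b = min(int((xi - lo) / width), bins - 1)
        sums[b] += yi
        counts[b] += 1
    mean_y = sum(y) / len(y)
    # Piecewise-constant regression curve: mean of y per bin.
    curve = [sums[b] / counts[b] if counts[b] else mean_y for b in range(bins)]
    ss_res = sum((yi - curve[min(int((xi - lo) / width), bins - 1)]) ** 2
                 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y) or 1.0
    return max(0.0, 1.0 - ss_res / ss_tot)
```

Unlike the linear correlation coefficient, h² stays high for deterministic but non-linear dependencies (e.g. y = x²), which is why it is favoured for tracking synchronization between EEG channels.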
344

Consultas por similaridade no modelo relacional / Similarity queries in the relational model

Pierro, Gabriel Vicente de 18 May 2015 (has links)
Os Sistemas de Gerenciamento de Bases de Dados Relacionais (SGBDR) foram concebidos para o armazenamento e recuperação de grandes volumes de dados. Tradicionalmente, estes sistemas suportam números, pequenas cadeias de caracteres e datas (que podem ser comparados por identidade ou por relações de ordem – RO), porém vem se tornando necessário organizar, armazenar e recuperar dados mais complexos, como por exemplo dados multimídia (imagens, áudio e vídeo), séries temporais etc. Quando se trata de dados complexos há uma mudança de paradigma, pois as comparações entre elementos são feitas por similaridade em vez das RO utilizadas tradicionalmente, tendo como mais frequentemente utilizados os operadores de comparação por abrangência (Rq) e por k-vizinhos mais próximos (k-NN). Embora muitos estudos estejam sendo feitos nessa área, quando lidando com consultas por similaridade grande parte do esforço é direcionado para criar as estruturas de indexação e dar suporte às operações necessárias para executar apenas o aspecto da consulta que trata da similaridade, sem focar em realizar uma integração homogênea das consultas que envolvam ambos os tipos de operadores simultaneamente nos ambientes dos SGBDRs. Um dos principais problemas nessa integração é lidar com as peculiaridades do operador de busca por k-NN. Todos os operadores de comparação por identidade e por RO são comutativos e associativos entre si. No entanto o operador de busca por k-NN não atende a nenhuma dessas propriedades. Com isso, a expressão de consultas em SQL, que usualmente pode ser feita sem que a expressão da ordem entre os predicados seja importante, precisa passar a considerar a ordem. Além disso, consultas que utilizam comparações por k-NN podem gerar múltiplos empates, e a falta de uma metodologia para resolvê-los pode levar a um processo de desempate arbitrário ou insensível ao contexto da consulta, onde usuários não têm poder para intervir de maneira significativa.
Em alguns casos, isso pode levar a uma mesma consulta a retornar resultados distintos em casos onde a estrutura interna dos dados estiver sujeita a modificações, como por exemplo em casos de transações concorrentes em um SGBDR. Este trabalho aborda os problemas gerados pela inserção de operadores de busca por similaridade nos SGBDR, mais especificamente o k-NN, e propõe novas maneiras de representação de consultas com múltiplos predicados, por similaridade ou RO, assim como novos operadores derivados do k-NN que são mais adequados para um ambiente relacional que permita consultas híbridas, e permitem também controle sobre o tratamento de empates. / The Relational Database Management Systems (RDBMS) were originally conceived to store and retrieve large volumes of data. Traditionally, these systems supported only numbers, small strings of characters and dates (which can be compared by identity or by an order relationship – OR). However, it has become increasingly necessary to organize, store and retrieve more complex data, such as multimedia (images, audio and video), time series etc. Dealing with those data types requires a paradigm shift, as comparisons between elements are made by similarity, not by the traditionally used identity or OR, with the most common similarity operators being the range (Rq) and k-nearest neighbors (k-NN) operators. Despite many studies in the field, when dealing with similarity queries a large part of the effort has been directed towards the data structures and the operations needed to execute only the similarity side of the query, not paying attention to a more homogeneous integration of queries that involve both operator types simultaneously in RDBMS environments. One of the main problems for such integration is the peculiarities of the k-NN operator. Identity and OR operators are commutative and associative amongst themselves, but the k-NN operator is neither.
As such, SQL queries, which can usually be expressed without regard to the order in which predicates appear, must now take predicate ordering into account. Furthermore, queries that use k-NN might generate multiple ties, and the lack of a methodology to solve them might lead to an arbitrary or context-insensitive tie-breaking process, where users have little or no power to intervene. In some applications, the lack of a controlled tie-breaking process may even lead to the same query yielding distinct results whenever the underlying structures are subject to change, as is the case with concurrent transactions in a Relational Database Management System (RDBMS). This work focuses on the problems that arise from the integration of similarity-based operators into RDBMS, more specifically the k-NN, and proposes new ways to represent queries with multiple predicates, whether similarity, identity or OR, as well as new operators derived from k-NN that are better suited to an RDBMS environment supporting hybrid queries and that also enable control over tie-breaking.
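A toy demonstration of why the k-NN operator does not commute with ordinary selection predicates, as discussed above (illustrative Python over a list of numbers, not tied to any specific RDBMS):

```python
def knn(data, center, k, dist=lambda a, b: abs(a - b)):
    """k nearest neighbours of `center` (ties broken by sort order)."""
    return sorted(sorted(data, key=lambda e: dist(e, center))[:k])

data = [1, 2, 3, 10, 11, 12]
pred = lambda e: e >= 10          # an ordinary relational predicate

# Filter first, then k-NN: nearest two of {10, 11, 12} to 0.
a = knn([e for e in data if pred(e)], center=0, k=2)

# k-NN first, then filter: nearest two of everything to 0, then keep >= 10.
b = [e for e in knn(data, center=0, k=2) if pred(e)]

assert a == [10, 11]
assert b == []                    # the two evaluation orders disagree
```

The same query plan reordering that is harmless for identity and OR predicates changes the answer once a k-NN operator is involved, which is exactly why predicate order must become part of the query semantics.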
345

Modelo para sumarização computacional de textos científicos. / Scientific text computational summarization model.

Tarafa Guzmán, Alejandro 07 March 2017 (has links)
Neste trabalho, propõe-se um modelo para a sumarização computacional extrativa de textos de artigos técnico-científicos em inglês. A metodologia utilizada baseia-se em um módulo de avaliação de similaridade semântica textual entre sentenças, desenvolvido especialmente para integrar o modelo de sumarização. A aplicação deste módulo de similaridade à extração de sentenças é feita por intermédio do conceito de uma janela deslizante de comprimento variável, que facilita a detecção de equivalência semântica entre frases do artigo e aquelas de um léxico de frases típicas, atribuíveis a uma estrutura básica dos artigos. Os sumários obtidos em aplicações do modelo apresentam qualidade razoável e utilizável, para os efeitos de antecipar a informação contida nos artigos. / In this work, a model is proposed for the computational extractive summarization of scientific papers in English. Its methodology is based on a semantic textual similarity module for evaluating equivalence between sentences, developed especially to integrate the summarization model. A sliding window of variable width facilitates the application of this module to detect semantic similarity between phrases in the article and those in a lexicon of typical phrases, assignable to a basic article structure. Summaries obtained in applications of the model show reasonable, usable quality for anticipating the information contained in the papers.
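The variable-width sliding window over sentences can be sketched as follows, with Jaccard token overlap standing in for the model's semantic textual similarity module (an assumption for illustration; the thesis uses a purpose-built similarity module, not Jaccard):

```python
def jaccard(a, b):
    """Crude token-overlap similarity between two strings."""
    a, b = set(a.lower().split()), set(b.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0

def best_window_match(sentences, cue, max_window=3, threshold=0.3):
    """Slide a variable-width window (1..max_window sentences) over the
    article; return (score, (start, end)) for the span most similar to
    the cue phrase, or (score, None) if nothing clears the threshold."""
    best = (0.0, None)
    for w in range(1, max_window + 1):
        for i in range(len(sentences) - w + 1):
            score = jaccard(" ".join(sentences[i:i + w]), cue)
            if score > best[0]:
                best = (score, (i, i + w))
    return best if best[0] >= threshold else (best[0], None)
```

In the summarization model, a cue phrase would come from the lexicon of typical phrases (e.g. "we propose a method for..."), and the matched spans are the candidate sentences for the extractive summary.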
346

Determinantes do efeito da similaridade visual na memória de trabalho / Factors influencing the visual similarity effect in working memory

Zar, Tamires 02 June 2017 (has links)
A similaridade fonológica tem sido estudada desde a década de 1970, tendo contribuído de maneira essencial para o entendimento acerca do funcionamento da memória de trabalho. Vários trabalhos têm se dedicado ao estudo da similaridade visual, da possibilidade de correspondência entre esta e a similaridade fonológica, sem, entretanto, chegar a um consenso sobre a natureza de seu efeito sobre o desempenho em tarefas de reconhecimento. No presente trabalho, tivemos como objetivo caracterizar os efeitos da similaridade através da análise de algumas variáveis que possivelmente estariam relacionadas aos efeitos da similaridade na memória de trabalho. Realizou-se inicialmente uma avaliação dos estímulos a serem utilizados a fim de validar a classificação destes em diferentes níveis de similaridade. Em um segundo momento, foi realizada uma tarefa de reconhecimento na qual foram manipulados o nível de similaridade entre estímulos, o intervalo de retenção e a forma de apresentação dos estímulos. Os resultados demonstram que a similaridade visual entre os estímulos na codificação, aliada à dissimilaridade na recuperação, favorece o desempenho na realização da tarefa proposta. Além disso, o intervalo de retenção maior sugere um prejuízo no desempenho, especialmente em condições de alta similaridade na recuperação. Tais resultados corroboram a literatura e contribuem para o entendimento sobre o efeito da similaridade visual na memória de trabalho. / Phonological similarity has been studied since the 1970s, contributing in an essential manner to the understanding of working memory. Researchers have studied visual similarity and its possible correspondence with phonological similarity without, however, arriving at a consensus about the nature of its effect on performance in recognition tasks. In this work, our objective was to characterize the effects of similarity by analyzing variables that could possibly be related to the effects of similarity in working memory.
First, an evaluation was conducted in order to validate the classification we made of the stimuli into different similarity levels. Second, we manipulated the similarity between stimuli, the duration of the retention interval and the stimulus presentation mode in an item recognition task. Results show that visual similarity between stimuli at encoding, together with dissimilarity at retrieval, favors better performance in this task. Moreover, a longer retention interval led to worse performance, especially under high similarity at retrieval. These results agree with the literature and contribute to the understanding of the visual similarity effect in working memory.
347

Infraestrutura computacional para avaliação da similaridade funcional composta entre microRNAs baseada em ontologias / Computational platform for evaluation of the composed functional similarity between microRNAs based on ontologies

Sasazaki, Mariana Yuri 19 August 2014 (has links)
MicroRNAs (miRNAs) são pequenos RNAs não codificadores de proteínas que atuam principalmente como silenciadores pós-transcricionais, inibindo a tradução de RNAs mensageiros. Evidências crescentes revelam que tais moléculas desempenham papéis críticos em muitos processos biológicos importantes. Uma vez que não existem anotações de termos de miRNAs na Gene Ontology (GO), tampouco um banco de dados de referência com anotações funcionais dos mesmos, o cálculo da medida de similaridade entre miRNAs de forma direta não possui um padrão estabelecido. Por outro lado, a existência de bancos de dados de genes-alvo de miRNAs, como o TarBase, e bases de dados contendo informações sobre associações de miRNAs e doenças humanas, como o HMDD, nos permite inferir a similaridade funcional dos miRNAs indiretamente, por meio da análise de seus genes-alvo na GO ou entre suas doenças relacionadas na ontologia MeSH. Além disso, de acordo com a estrutura da ontologia de miRNAs OMIT, um miRNA também pode ser anotado com outras informações, tais como a sua natureza de atuação como oncogênico ou supressor de tumor, o organismo em que se encontra, o tipo de experimento em que foi encontrado, suas associações com doenças, genes-alvo, proteínas e eventos patológicos. Dessa forma, a similaridade entre miRNAs pode ser inferida com base na combinação de um conjunto de informações contidas nas respectivas anotações, de forma que possamos obter um aproveitamento de várias informações existentes, definindo assim um cálculo de similaridade funcional composta. Assim, neste trabalho, propomos a criação e aplicação de um método chamado CFSim, aplicado sobre a OMIT e que utiliza a ontologia de doenças, MeSH, e a ontologia de genes, GO, para calcular a similaridade entre dois miRNAs, juntamente com informações contidas em suas anotações. 
A validação de nosso método foi realizada por meio da comparação com a similaridade funcional inferida considerando diferentes famílias de miRNAs e os resultados obtidos mostraram que nosso método é eficiente, no sentido de que a similaridade entre miRNAs pertencentes à mesma família é maior que a similaridade entre miRNAs de famílias distintas. Ainda, em comparação com os métodos de similaridade funcional já existentes na literatura, o CFSim obteve melhores resultados. Adicionalmente, para tornarmos viável a utilização do método proposto, foi projetado e implementado um ambiente contendo a infraestrutura necessária para que pesquisadores possam incluir dados obtidos de novas descobertas e consultar as informações sobre um determinado miRNA, assim como calcular a similaridade entre dois miRNAs, baseada no método proposto. / MicroRNAs (miRNAs) are small non-coding RNAs that mainly negatively regulate gene expression by inhibiting the translation of messenger RNAs. Increasing evidence shows that such molecules play critical roles in many important biological processes. Since there are no miRNA term annotations in the Gene Ontology (GO), nor a reference database of miRNA functional annotations, there is no established standard approach for directly calculating the functional similarity between miRNAs. However, the existence of miRNA target-gene databases, such as TarBase, and miRNA-disease association databases, such as HMDD, allows us to indirectly infer the functional similarity of miRNAs through the analysis of their target genes in GO or of their related diseases in MeSH. Moreover, according to the structure of the miRNA ontology OMIT, a miRNA can also be annotated with other information, such as whether it acts as an oncogene or a tumor suppressor, the organism to which it belongs, the experiment in which it was found, and its associations with diseases, target genes, proteins and pathological events.
Thus, miRNA similarity can be inferred from a combination of the information contained in their annotations, taking advantage of all available information to define a composed functional similarity. In this study, we propose the creation and application of a method called CFSim, applied over OMIT, which uses the disease ontology MeSH and the Gene Ontology (GO) to compute the similarity between two miRNAs together with the information contained in their annotations. We validated our method by comparing it with the functional similarity inferred from miRNA families, and the results showed that our method is efficient in the sense that the functional similarity between miRNAs in the same family is greater than that between miRNAs of distinct families. Furthermore, compared with existing functional similarity methods in the literature, CFSim achieved better results. Finally, to make the use of the proposed method feasible, an environment was designed and implemented containing the necessary infrastructure so that researchers can include data from new discoveries and consult information about a particular miRNA, as well as calculate the similarity between two miRNAs based on the proposed method.
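A common way to aggregate pairwise term similarities into a set-level score, and then combine per-facet scores (target genes against GO, diseases against MeSH, etc.), is the best-match-average scheme sketched below. The weighting and the toy exact-match term similarity are illustrative assumptions, not CFSim's actual formula.

```python
def best_match_average(set_a, set_b, sim):
    """Best-match-average aggregation of a pairwise term similarity:
    average, over each term, of its best match on the other side."""
    if not set_a or not set_b:
        return 0.0
    fwd = sum(max(sim(a, b) for b in set_b) for a in set_a) / len(set_a)
    bwd = sum(max(sim(a, b) for a in set_a) for b in set_b) / len(set_b)
    return (fwd + bwd) / 2.0

def composed_similarity(mirna1, mirna2, facet_sims, weights):
    """Weighted combination of per-facet similarities between two miRNA
    annotation records (dicts mapping facet name -> set of terms).
    `facet_sims` maps each facet to a set-level similarity function."""
    total = sum(weights.values())
    return sum(weights[f] * facet_sims[f](mirna1[f], mirna2[f])
               for f in weights) / total
```

In a real setting, the per-term `sim` would be an ontology-based semantic similarity over GO or MeSH (e.g. an information-content measure) rather than exact matching, and the weights would be tuned or learned.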
348

Modelo de custo para consultas por similaridade em espaços métricos / Cost model for similarity queries in metric spaces

Baioco, Gisele Busichia 24 January 2007 (has links)
Esta tese apresenta um modelo de custo para estimar o número de acessos a disco (custo de I/O) e o número de cálculos de distância (custo de CPU) para consultas por similaridade executadas sobre métodos de acesso métricos dinâmicos. O objetivo da criação do modelo é a otimização de consultas por similaridade em Sistemas de Gerenciamento de Bases de Dados relacionais e objeto-relacionais. Foram considerados dois tipos de consultas por similaridade: consulta por abrangência e consulta aos k-vizinhos mais próximos. Como base para a criação do modelo de custo foi utilizado o método de acesso métrico dinâmico Slim-Tree. O modelo estima a dimensão intrínseca do conjunto de dados pela sua dimensão de correlação fractal. A validação do modelo é confirmada por experimentos com conjuntos de dados sintéticos e reais, de variados tamanhos e dimensões, que mostram que as estimativas obtidas em geral estão dentro da faixa de variação medida em consultas reais. / This thesis presents a cost model to estimate the number of disk accesses (I/O costs) and the number of distance calculations (CPU costs) to process similarity queries over data indexed by dynamic metric access methods. The goal of the model is to optimize similarity queries on relational and object-relational Database Management Systems. Two types of similarity queries were taken into consideration: range queries and k-nearest neighbor queries. The dynamic metric access method Slim-Tree was used as the basis for the creation of the cost model. The model takes advantage of the intrinsic dimension of the data set, estimated by its correlation fractal dimension. Experiments were performed on real and synthetic data sets, with different sizes and dimensions, in order to validate the proposed model. They confirmed that the estimates are accurate, always falling within the range of variation measured in real queries.
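The correlation fractal dimension used by the cost model can be estimated as the slope of log C(r) versus log r, where C(r) is the fraction of point pairs within distance r. The brute-force O(n²) sketch below illustrates the idea; practical implementations (as in the Slim-Tree literature) use grid-based box-occupancy counts for scalability.

```python
import math

def correlation_dimension(points, radii):
    """Estimate the correlation fractal dimension D2 as the least-squares
    slope of log C(r) vs log r over the given radii, where C(r) is the
    fraction of point pairs closer than r (brute-force sketch)."""
    n = len(points)
    pairs = [(points[i], points[j]) for i in range(n) for j in range(i + 1, n)]
    dists = [math.dist(p, q) for p, q in pairs]
    xs, ys = [], []
    for r in radii:
        c = sum(1 for d in dists if d < r)
        if c:                                   # skip empty counts (log 0)
            xs.append(math.log(r))
            ys.append(math.log(c / len(pairs)))
    # Least-squares slope of ys against xs.
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys)) /
            sum((x - mx) ** 2 for x in xs))
```

For points spread along a line embedded in the plane the estimate is close to 1, even though the embedding dimension is 2 — this gap between intrinsic and embedding dimension is exactly what the cost model exploits.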
349

Semantic similarities at the core of generic indexing and clustering approaches / Les similarités sémantiques au cœur d’approches génériques d’indexation et de catégorisation

Fiorini, Nicolas 04 November 2015 (has links)
Pour exploiter efficacement une masse toujours croissante de documents électroniques, une branche de l'Intelligence Artificielle s'est focalisée sur la création et l'utilisation de systèmes à base de connaissance. Ces approches ont prouvé leur efficacité, notamment en recherche d'information. Cependant elles imposent une indexation sémantique des ressources exploitées, i.e. que soit associé à chaque ressource un ensemble de termes qui caractérise son contenu. Pour s'affranchir de toute ambiguïté liée au langage naturel, ces termes peuvent être remplacés par des concepts issus d'une ontologie de domaine, on parle alors d'indexation conceptuelle. Le plus souvent cette indexation est réalisée en procédant à l'extraction des concepts du contenu même des documents. On note, dans ce cas, une forte dépendance des techniques associées à ce traitement au type de document et à l'utilisation d'algorithmes dédiés. Pourtant une des forces des approches conceptuelles réside dans leur généricité. En effet, par l'exploitation d'indexation sémantique, ces approches permettent de traiter de la même manière un ensemble d'images, de gènes, de textes ou de personnes, pour peu que ceux-ci aient été correctement indexés. Cette thèse explore ce paradigme de généricité en proposant des systèmes génériques et en les comparant aux approches existantes qui font référence. L'idée est de se reposer sur les annotations sémantiques et d'utiliser des mesures de similarité sémantique afin de créer des approches performantes. De telles approches génériques peuvent par la suite être enrichies par des modules plus spécifiques afin d'améliorer le résultat final. Deux axes de recherche sont suivis dans cette thèse. Le premier et le plus riche est celui de l'indexation sémantique. L'approche proposée exploite la définition et l'utilisation de documents proches en contenu pour annoter un document cible.
Grâce à l'utilisation de similarités sémantiques entre les annotations des documents proches et à l'utilisation d'une heuristique, notre approche, USI (User-oriented Semantic Indexer), permet d'annoter des documents plus rapidement que les méthodes existantes en fournissant une qualité comparable. Ce processus a ensuite été étendu à une autre tâche, la classification. Le tri est une opération indispensable à laquelle l'Homme s'est attaché depuis l'Antiquité, qui est aujourd'hui de plus en plus automatisée. Nous proposons une approche de classification hiérarchique qui se base sur les annotations sémantiques des documents à classifier. Là encore, la méthode est indépendante des types de documents puisque l'approche repose uniquement sur leurs annotations. Un autre avantage de cette approche est le fait que lorsque des documents sont rassemblés, le groupe qu'ils forment est automatiquement annoté (suivant notre algorithme d'indexation). Par conséquent, le résultat fourni est une hiérarchie de classes contenant des documents, chaque classe étant annotée. Cela évite l'annotation manuelle fastidieuse des classes par l'exploration des documents qu'elle contient comme c'est souvent le cas. L'ensemble de nos travaux a montré que l'utilisation des ontologies permettait d'abstraire plusieurs processus et ainsi de réaliser des approches génériques. Cette généricité n'empêche en aucun cas d'être couplée à des approches plus spécifiques, mais constitue en soi une simplicité de mise en place dès lors que l'on dispose de documents annotés sémantiquement. / In order to improve the exploitation of an ever-growing number of electronic documents, Artificial Intelligence has dedicated a lot of effort to the creation and use of systems grounded on knowledge bases. In the information retrieval field in particular, such semantic approaches have proved their efficiency. Indexing documents is therefore a necessary task.
It consists of associating them with sets of terms that describe their content. These terms can be keywords but also concepts from an ontology, in which case the annotation is said to be semantic and benefits from an inherent property of ontologies: the absence of ambiguity. Most approaches designed to annotate documents have to parse them and extract concepts from this parsing. This underlines the dependence of such approaches on the type of document, since parsing requires dedicated algorithms. On the other hand, approaches that rely solely on semantic annotations can ignore the document type, enabling the creation of generic processes. This thesis capitalizes on genericity to build novel systems and compare them to state-of-the-art approaches. To this end, we rely on semantic annotations coupled with semantic similarity measures. Of course, such generic approaches can then be enriched with type-specific ones, which would further increase the quality of the results. First, this work explores the relevance of this paradigm for indexing documents. The idea is to rely on already annotated close documents to annotate a target document. We define a heuristic algorithm for this purpose that uses the semantic annotations of these close documents and semantic similarities to provide a generic indexing method. This results in USI (User-oriented Semantic Indexer), which we show performs as well as the best current systems while being faster. Second, this idea is extended to another task, clustering. Clustering is a very common and ancient process that is very useful for finding documents or understanding a set of documents. We propose a hierarchical clustering algorithm that reuses the same components of classical methods to provide a novel one applicable to any kind of documents. Another benefit of this approach is that when documents are grouped together, the group can be annotated by using our indexing algorithm.
The result is therefore not only a hierarchy of clusters containing documents: the clusters themselves are described by concepts as well, which greatly helps in understanding the clustering results. This thesis shows that, apart from enhancing classical approaches, building conceptual approaches allows us to abstract them into a generic framework. While providing easy-to-set-up methods – as long as documents are semantically annotated – genericity does not prevent mixing these methods with type-specific ones, in other words creating hybrid methods.
350

Sequence queries on temporal graphs

Zhu, Haohan 21 June 2016 (has links)
Graphs that evolve over time are called temporal graphs. They can be used to describe and represent real-world networks, including transportation networks, social networks, and communication networks, with higher fidelity and accuracy. However, research is still limited on how to manage large scale temporal graphs and execute queries over these graphs efficiently and effectively. This thesis investigates the problems of temporal graph data management related to node and edge sequence queries. In temporal graphs, nodes and edges can evolve over time. Therefore, sequence queries on nodes and edges can be key components in managing temporal graphs. In this thesis, the node sequence query decomposes into two parts: graph node similarity and subsequence matching. For node similarity, this thesis proposes a modified tree edit distance that is metric and polynomially computable and has a natural, intuitive interpretation. Note that the proposed node similarity works even for inter-graph nodes and therefore can be used for graph de-anonymization, network transfer learning, and cross-network mining, among other tasks. The subsequence matching query proposed in this thesis is a framework that can be adopted to index generic sequence and time-series data, including trajectory data and even DNA sequences for subsequence retrieval. For edge sequence queries, this thesis proposes an efficient storage and optimized indexing technique that allows for efficient retrieval of temporal subgraphs that satisfy certain temporal predicates. For this problem, this thesis develops a lightweight data management engine prototype that can support time-sensitive temporal graph analytics efficiently even on a single PC.
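A minimal sketch of retrieving a temporal subgraph under an interval predicate, as described above. This is a toy edge list with validity intervals; the thesis's engine adds optimized storage and indexing, which this illustration omits.

```python
class TemporalGraph:
    """Toy edge store with validity intervals, illustrating retrieval of
    temporal subgraphs that satisfy an interval predicate.  Real engines
    add interval indexes and compact snapshot/delta storage."""

    def __init__(self):
        self.edges = []                      # (u, v, start, end)

    def add_edge(self, u, v, start, end):
        """Edge (u, v) is alive during the half-open interval [start, end)."""
        self.edges.append((u, v, start, end))

    def subgraph(self, t1, t2, mode="overlaps"):
        """Edges whose lifetime overlaps -- or, with mode="contains",
        fully covers -- the query interval [t1, t2)."""
        if mode == "overlaps":
            keep = lambda s, e: s < t2 and e > t1
        else:                                # "contains"
            keep = lambda s, e: s <= t1 and e >= t2
        return [(u, v) for u, v, s, e in self.edges if keep(s, e)]
```

The two predicate modes already show why temporal semantics matter: an edge alive only for part of the query window appears under "overlaps" but not under "contains", and a query engine must make that distinction explicit.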
