• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 1
  • Tagged with
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Automata methods and techniques for graph-structured data

Shoaran, Maryam 23 April 2011 (has links)
Graph-structured data (GSD) is a popular model to represent complex information in a wide variety of applications such as social networks, biological data management, digital libraries, and traffic networks. The flexibility of this model allows the information to evolve and easily integrate with heterogeneous data from many sources. In this dissertation we study three important problems on GSD. A consistent theme of our work is the use of automata methods and techniques to process and reason about GSD. First, we address the problem of answering queries on GSD in a distributed environment. We focus on regular path queries (RPQs) – given by regular expressions matching paths in graph-data. RPQs are the building blocks of almost any mechanism for querying GSD. We present a fault-tolerant, message-efficient, and truly distributed algorithm for answering RPQs. Our algorithm works for the larger class of weighted RPQs on weighted GSDs. Second, we consider the problem of answering RPQs on incomplete GSD, where different data sources are represented by materialized database views. We explore the connection between “certain answers” (CAs) and answers obtained from “view-based rewritings” (VBRs) for RPQs. CAs are answers that can be obtained on each database consistent with the views. Computing all of CAs for RPQs is NP-hard, and one has to resort to an exponential algorithm in the size of the data–view materializations. On the other hand, VBRs are query reformulations in terms of the view definitions. They can be used to obtain query answers in polynomial time in the size of the data. These answers are CAs, but unfortunately for RPQs, not all of the CAs can be obtained in this way. In this work, we show the surprising result that for RPQs under local semantics, using VBRs to answer RPQs gives all the CAs. The importance of this result is that under such semantics, the CAs can be obtained in polynomial time in the size of the data. Third, we focus on XML–an important special case of GSD. The scenario we consider is streaming XML between exchanging parties. The problem we study is flexible validation of streaming XML under the realistic assumption that the schemas of the exchanging parties evolve, and thus diverge from one another. We represent schemas by using Visibly Pushdown Automata (VPAs), which recognize Visibly Pushdown Languages (VPLs). We model evolution for XML by defining formal language operators on VPLs. We show that VPLs are closed under the defined language operators and this enables us to expand the schemas (for XML) in order to account for flexible or constrained evolution. / Graduate

Page generated in 0.0665 seconds