Global ETD Search

41	Supporting Multi-Criteria Decision Support Queries over Disparate Data Sources Raghavan, Venkatesh 17 April 2012 (has links) In the era of the "big data revolution," marked by an exponential growth of information, extracting value from data enables analysts and businesses to address challenging problems such as drug discovery, fraud detection, and earthquake prediction. Multi-Criteria Decision Support (MCDS) queries are at the core of big-data analytics, resulting in several classes of MCDS queries such as OLAP, Top-K, Pareto-optimal, and nearest neighbor queries. The intuitive nature of specifying multi-dimensional preferences has made Pareto-optimal queries, also known as skyline queries, popular. Existing skyline algorithms, however, do not address several crucial issues such as performing skyline evaluation over disparate sources, progressively generating skyline results, or robustly handling workloads with multiple skyline-over-join queries. This dissertation thoroughly investigates topics in the area of skyline-aware query evaluation. First, we propose a novel execution framework called SKIN that treats skyline over joins as first class citizens during query processing. This is in contrast to existing techniques that treat skylines as an "add-on," loosely integrated with query processing by being placed on top of the query plan. SKIN is effective in exploiting the skyline characteristics of the tuples within individual data sources as well as across disparate sources. This enables SKIN to significantly reduce two primary costs, namely the cost of generating the join results and the cost of skyline comparisons to compute the final results. Second, we address the crucial business need to report results early, as soon as they are generated, so that users can formulate competitive decisions in near real-time. 
On top of SKIN, we built a progressive query evaluation framework, ProgXe, that makes the execution of queries involving skyline over joins non-blocking, i.e., it generates results progressively, early and often. By exploiting SKIN's principle of processing queries at multiple levels of abstraction, ProgXe is able to: (1) extract the output dependencies in the output spaces by analyzing both the input and output space, and (2) exploit this knowledge of abstract-level relationships to guarantee correctness of early output. Third, real-world applications handle query workloads with diverse Quality of Service (QoS) requirements, also referred to as contracts. Time-sensitive queries, such as fraud detection, require results to be output progressively with minimal delay, while ad-hoc and reporting queries can tolerate delay. Building on the principles of ProgXe, we propose the Contract-Aware Query Execution (CAQE) framework to address the open problem of contract-driven multi-query processing. CAQE employs an adaptive execution strategy to continuously monitor the run-time satisfaction of queries and aggressively take corrective steps whenever the contracts are not being met. Lastly, to elucidate the portability of the core principle of this dissertation, namely reasoning and query processing at different levels of data abstraction, we apply it to an orthogonal research question: auto-generating recommendation queries that help users explore a complex database system. User queries are often too strict or too broad, requiring a frustrating trial-and-error refinement process to meet the desired result cardinality while preserving original query semantics. Based on the principles of SKIN, we propose CAPRI to automatically generate refined queries that: (1) attain the desired cardinality and (2) minimize changes to the original query intentions. 
In our comprehensive experimental study of each part of this dissertation, we demonstrate the superiority of the proposed strategies over state-of-the-art techniques in both efficiency and resource consumption. Progressive Query Evaluation Query Processing Pareto-Optimal Queries Cardinality Assurance Query Refinement Multi-Criteria Decision Support Skyline Queries
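To make the notion of a Pareto-optimal (skyline) query in the abstract above concrete, the following is a minimal sketch of a generic block-nested-loop skyline computation. It is not the SKIN, ProgXe, or CAQE method described in the dissertation. The hotel data and the function names are hypothetical, and the sketch assumes that smaller values are preferred in every dimension.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is at least as good as b in every dimension and
    strictly better in at least one (assumption: smaller is better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points: List[Point]) -> List[Point]:
    """Generic block-nested-loop skyline: maintain a window of points
    that no point seen so far dominates."""
    window: List[Point] = []
    for p in points:
        if any(dominates(w, p) for w in window):
            continue  # p is dominated by a window point, so discard it
        # p survives: evict window points that p dominates, then add p
        window = [w for w in window if not dominates(p, w)]
        window.append(p)
    return window


# Hypothetical hotels as (price, distance_km); lower is better in both.
hotels = [(120.0, 2.0), (90.0, 5.0), (150.0, 1.0), (130.0, 3.0), (90.0, 6.0)]
result = skyline(hotels)
print(result)
```

In this sketch the skyline is computed only after all input tuples are available, which is the blocking behaviour that a progressive framework aims to avoid.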
42	VAMANA : A High Performance, Scalable and Cost Driven XPath Engine Raghavan, Venkatesh 05 May 2004 (has links) Many applications are migrating to or beginning to make use of native XML data. We anticipate that queries will emerge that emphasize the structural semantics of XML query languages like XPath and XQuery. This brings a need for an efficient query engine and database management system tailored for XML data, similar to traditional relational engines. While mapping large XML documents into relational database systems is possible, it makes it difficult to map XML queries to the less powerful relational query language SQL and creates a data model mismatch between relational tables and semi-structured XML data. Hence native solutions to efficiently store and query XML data have recently been developed. However, most of these systems thus far fail to demonstrate scalability with large document sizes, to provide robust support for the XPath query language, or to adequately address costing with respect to query optimization. In this thesis, we propose VAMANA, a novel cost-driven XPath engine that supports the scalable evaluation of ad-hoc XPath expressions. VAMANA makes use of the Multi-Axis Storage Structure (MASS), an efficient XML repository for storing and indexing large XML documents developed at WPI. VAMANA extensively uses indexes for query evaluation by considering index-only plans. To the best of our knowledge, it is the only XML query engine that supports an index plan approach for large XML documents. Our index-oriented query plans allow queries to be evaluated while reading only a fraction of the data, as all tuples for a particular context node are clustered together. The pipelined query framework minimizes the cost of handling intermediate data during query processing. Unlike other native solutions, VAMANA provides support for all 13 XPath axes. 
Our schema independent cost model provides dynamically calculated statistics that are then used for intelligent cost-based transformations, further improving performance. Our optimization strategy for increasing execution time performance is affirmed through our experimental studies on XMark benchmark data. VAMANA query execution is significantly faster than leading available XML query engines. query optimization cost estimation XPath engine query processing index-based execution XML (Document markup language) Query languages (Computer science)
43	Self Maintenance of Materialized XQuery Views via Query Containment and Re-Writing Nilekar, Shirish K. 24 April 2006 (has links) In recent years XML, the eXtensible Markup Language, has become the de facto standard for publishing and exchanging information on the web and in enterprise data integration systems. Materialized views are often used in information integration systems to present a unified schema for efficient querying of distributed and possibly heterogeneous data sources. Along similar lines, ACE-XQ, an XQuery-based semantic caching system, shows the significant performance gains achieved by caching query results (as materialized views) and using these materialized views along with query containment techniques for answering future queries over distributed XML data sources. To keep data in these materialized views of ACE-XQ up-to-date, the views must be maintained, i.e., whenever the base data changes, the corresponding cached data in the materialized view must also be updated. This thesis builds on the query containment ideas of ACE-XQ and proposes an efficient approach for self-maintenance of materialized views. Our experimental results illustrate the significant performance improvement achieved by this strategy over view re-computation for a variety of situations. XML Query Re-Writing View Maintenance Query Containment XML (Document markup language) Cache memory Database searching Query languages (Computer science)
44	Query Interface And Query Language For Domain Specific Web Service Discovery System Ozdil, Hilal 01 September 2011 (has links) (PDF) As the number of published web services increases, discovering web services with the desired functionality and quality is becoming a challenging process. Selecting the appropriate web services among the ones that offer the same functionality is also a challenging task. Web service repositories like UDDI (Universal Description Discovery and Integration) support only syntactic searches. Quality of service parameters for the published web services cannot be queried over these repositories. We have proposed a query language that aims to overcome these problems. It enables its users to query the web services both syntactically and semantically. We also allow the users to specify the quality of service criteria which the desired web services should satisfy. We have developed a graphical query interface to assist the users in the query sentence formulation process. The proposed work is developed as a submodule of the Domain Specific Web Service Discovery with Semantics (DSWSD-S) System. The aforementioned query language and query interface are explained in detail in this thesis.
45	Optimization and Execution of Complex Scientific Queries Fomkin, Ruslan January 2009 (has links) Large volumes of data produced and shared within scientific communities are analyzed by many researchers to investigate different scientific theories. Currently the analyses are implemented in traditional programming languages such as C++. This is inefficient for research productivity, since it is difficult to write, understand, and modify such programs. Furthermore, programs should scale over large data volumes and analysis complexity, which further complicates code development. This Thesis investigates the use of database technologies to implement scientific applications, in which data are complex objects describing measurements of independent events and the analyses are selections of events by applying conjunctions of complex numerical filters on each object separately. An example of such an application is analyses for the presence of Higgs bosons in collision events produced by the ATLAS experiment. For efficient implementation of such an ATLAS application, a new data stream management system SQISLE is developed. In SQISLE queries are specified over complex objects which are efficiently streamed from sources through the query engine. This streaming approach is compared with the conventional approach to load events into a database before querying. Since the queries implementing scientific analyses are large and complex, novel techniques are developed for efficient query processing. To obtain efficient plans for such queries SQISLE implements runtime query optimization strategies, which during query execution collect runtime statistics for a query, reoptimize the query using the collected statistics, and dynamically switch optimization strategies. The cost-based optimization utilizes a novel cost model for aggregate functions over nested subqueries. 
To alleviate estimation errors in large queries the fragments are decomposed into conjunctions of subqueries over which runtime statistics are measured. Performance is further improved by query transformation, view materialization, and partial evaluation. ATLAS queries in SQISLE using these query processing techniques perform close to or better than hard-coded C++ implementations of the same analyses. Scientific data are often stored in Grids, which manage both storage and computational resources. This Thesis includes a framework POQSEC that utilizes Grid resources to scale scientific queries over large data volumes by parallelizing the queries and shipping the data management system itself, e.g. SQISLE, to Grid computational nodes for the parallel query execution. scientific databases query processing data streams cost-based query optimization query rewritings databases and Grids Computer science Datavetenskap
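The idea of collecting runtime statistics during query execution and using them to reoptimize a conjunction of filters, as described in the abstract above, can be illustrated with a small sketch. This is a generic textbook heuristic and not SQISLE's optimizer. It periodically reorders filters so that those rejecting the most objects per unit of cost run first. The `Filter` class, the fixed `cost` values, and the `reorder_every` parameter are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, List, Tuple


@dataclass
class Filter:
    name: str
    predicate: Callable[[Any], bool]
    cost: float      # assumed relative cost per evaluation (hypothetical)
    seen: int = 0    # objects this filter has evaluated so far
    passed: int = 0  # objects that satisfied the filter

    def selectivity(self) -> float:
        # Observed pass fraction; 1.0 until some data has been seen.
        return self.passed / self.seen if self.seen else 1.0

    def rank(self) -> float:
        # Higher rank means more rejections per unit of cost.
        return (1.0 - self.selectivity()) / self.cost


def run(events: Iterable[Any], filters: List[Filter],
        reorder_every: int = 100) -> Tuple[List[Any], List[str]]:
    """Apply a conjunction of filters to a stream, reordering the filters
    from runtime statistics every `reorder_every` events."""
    order = list(filters)
    selected: List[Any] = []
    for i, event in enumerate(events, start=1):
        ok = True
        for f in order:
            f.seen += 1
            if f.predicate(event):
                f.passed += 1
            else:
                ok = False
                break  # conjunction failed, skip the remaining filters
        if ok:
            selected.append(event)
        if i % reorder_every == 0:
            order.sort(key=Filter.rank, reverse=True)
    return selected, [f.name for f in order]


# Hypothetical workload: an expensive, unselective filter listed first.
filters = [
    Filter("expensive", lambda x: x % 2 == 0, cost=5.0),
    Filter("cheap", lambda x: x % 10 == 0, cost=1.0),
]
selected, final_order = run(range(1000), filters)
print(len(selected), final_order)
```

One caveat of this sketch is that each filter's statistics are measured only on the objects that reach it, so the observed selectivities are conditional on the current order.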
46	Event-Driven Dynamic Query Model for Sleep Study Outcomes Research Jain, Sulabh 30 January 2012 (has links) No description available. Computer Science Query Interface Event Driven Sleep Study PSG Query System Dynamic Query Sleep Medicine Patient Cohort Identification
47	Querying graphs with data Vrgoc, Domagoj January 2014 (has links) Graph data is becoming more and more pervasive. Indeed, services such as social networks or the Semantic Web can no longer rely on the traditional relational model, as its structure is somewhat too rigid for the applications they have in mind. For this reason we have seen a continuous shift towards more non-standard models. First it was semi-structured data in the 1990s and XML in the 2000s, but even such models seem to be too restrictive for new applications that require navigational properties naturally modelled by graphs. Social networks fit into the graph model by their very design: users are nodes and their connections are specified by graph edges. The W3C committee, on the other hand, describes RDF, the model underlying the Semantic Web, by using graphs. The situation is quite similar with crime detection networks and tracking workflow provenance: they all have graphs built into their definition. With the pervasiveness of graph data, the important question of querying and maintaining it has emerged as one of the main priorities, both in the theoretical and the applied sense. Currently there seem to be two approaches to handling such data. On the one hand, to extract the actual data, practitioners use traditional relational languages that completely disregard various navigational patterns connecting the data. What makes this data interesting in modern applications, however, is precisely its ability to compactly represent intricate topological properties that envelop the data. To overcome this issue several languages that allow querying graph topology have been proposed and extensively studied. The problem with these languages is that they concentrate on navigation only, thus disregarding the data that is actually stored in the database. What we propose in this thesis is the ability to do both. 
Namely, we will study how query languages can be designed to allow specifying not only how the data is connected, but also how data changes along paths and patterns connecting it. To this end we will develop several query languages and show how adding different data manipulation capabilities and different navigational features affects the complexity of main reasoning tasks. The story here is somewhat similar to the early success of the relational data model, where theoretical considerations led to a better understanding of what makes certain tasks more challenging than others. Here we aim for languages that are both efficient and capable of expressing a wide variety of queries of interest to several groups of practitioners. To do so we will analyse how different requirements affect the language at hand and at the end provide a good base of primitives whose inclusion into a language should be considered, based on the applications one has in mind. Namely, we consider how adding a specific operation, mechanism, or capability to the language affects practical tasks that such an addition plans to tackle. In the end we arrive at several languages, all of them with their pros and cons, giving us a good overview of how specific capabilities of the language affect the design goals, thus providing a sound basis for practitioners to choose from, based on their requirements. 006.7
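As an illustration of a query that combines navigation with a condition on how data changes along a path, the sketch below finds all nodes reachable from a start node over edges with a given label such that the node data values strictly increase at every step. It is a hand-written traversal and not one of the query languages developed in the thesis. The graph, the `data` values, and the function name are hypothetical.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]  # (edge label, target node)


def reachable_increasing(graph: Dict[str, List[Edge]],
                         data: Dict[str, int],
                         start: str,
                         label: str) -> Set[str]:
    """Nodes reachable from `start` via `label` edges along a path on
    which the data value strictly increases at every step."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for edge_label, target in graph.get(node, []):
            # Navigation test (edge label) and data test (value increases).
            if (edge_label == label
                    and data[target] > data[node]
                    and target not in seen):
                seen.add(target)
                queue.append(target)
    seen.discard(start)
    return seen


# Hypothetical social graph; data holds a year associated with each user.
graph = {
    "a": [("knows", "b"), ("knows", "c")],
    "b": [("knows", "d")],
    "c": [("follows", "e")],
    "d": [("knows", "a")],
}
data = {"a": 1970, "b": 1980, "c": 1965, "d": 1990, "e": 2000}
answer = reachable_increasing(graph, data, "a", "knows")
print(sorted(answer))
```

Because the data test here compares only adjacent nodes, a per-node visited set is sufficient. Conditions that relate values further apart on a path, such as equality between the first and last node, would need more state per visited node.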
48	PDDS : a parallel deductive database system Cao, Hua January 1995 (has links) No description available. 005
49	Approaches to using word collocation in information retrieval Vechtomova, Olga January 2001 (has links) No description available. 020 IR; Query terms; Lexical cohesion
50	Quality of service aware optimization of sensor network queries Galpin, Ixent January 2010 (has links) Sensor networks comprise resource-constrained wireless nodes with the capability of gathering information about their surroundings and have recently risen to prominence with the promise of being an effective computing platform for diverse applications, ranging from event detection to environmental monitoring. The database community proposed the use of sensor network query processors (SNQPs) as a means to meet data collection requirements using a declarative query language. Declarative queries posed against a sensor network constitute an effective means to repurpose sensor networks and reduce the high software development costs associated with them. The range of sensor network applications is very broad. Such applications have diverse, and often conflicting, QoS expectations in terms of the delivery time of results, the acquisition interval at which data is collected, the total energy consumption of the deployment, or the network lifetime. The conflicting nature of these desiderata is aggravated by the resource-constrained nature of sensor networks as a computing fabric, making it particularly challenging to reconcile the trade-offs that arise. Previously, SNQPs have focussed on evaluating queries as energy-efficiently as possible. There has been comparatively less work on attempting to meet a broad range of optimization goals and constraints that capture these QoS expectations. In this respect, previous work on SNQPs has not aimed at being general purpose across the breadth of applications to which sensor networks have been applied. This PhD dissertation presents an approach for enabling QoS-awareness in SNQPs so that query evaluation plans are generated that exhibit good performance for a broader range of sensor network applications in terms of their QoS expectations. 
The research contributions reported here include (a) a functional decomposition of the decision-making steps required to compile a declarative query into a query evaluation plan in a sensor network setting; (b) algorithms to implement these decision-making steps; and (c) an empirical evaluation to show the benefits of QoS-awareness compared to a representative fixed-goal SNQP. 621.382
