  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

The HELLS-Join: A Heterogeneous Stream join for ExtremeLy Large windows

Karnagel, Tomas; Habich, Dirk; Schlegel, Benjamin; Lehner, Wolfgang 19 September 2022
Upcoming processors combine different computing units in a tightly coupled design that shares a unified memory hierarchy. This tight coupling leads to novel properties with regard to cooperation and interaction between the units. This paper demonstrates the advantages of such processors for a stream-join operator, an important data-intensive example. In detail, we propose our HELLS-Join approach, which employs all heterogeneous devices by outsourcing parts of the algorithm to the appropriate device. Our HELLS-Join performs better than CPU stream joins, allowing wider time windows, higher stream frequencies, and more streams to be joined than before.
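The abstract does not spell out the HELLS-Join algorithm itself. As a point of reference only, the kind of CPU stream join it is compared against can be sketched as a symmetric sliding-window join. The class name `WindowJoin`, the equality join key, and the assumption that timestamps arrive in non-decreasing order across both streams are illustrative choices, not details taken from the paper.

```python
from collections import deque


class WindowJoin:
    """Symmetric time-window equi-join of two streams (illustrative baseline).

    Assumption: timestamps are non-decreasing across both streams.
    """

    def __init__(self, window):
        self.window = window
        self.buffers = (deque(), deque())  # one buffer per input stream

    def insert(self, side, ts, key, payload):
        """Insert a tuple from stream `side` (0 or 1) and return new matches."""
        own, other = self.buffers[side], self.buffers[1 - side]
        # Expire tuples of the opposite stream that have left the window.
        while other and other[0][0] < ts - self.window:
            other.popleft()
        # Probe the opposite buffer; results are always (stream 0, stream 1).
        results = [
            (payload, p) if side == 0 else (p, payload)
            for (_, k, p) in other
            if k == key
        ]
        own.append((ts, key, payload))
        return results
```

Each insert scans the opposite buffer, so the work per tuple grows with window length and stream rate. That growth is why wider windows and higher frequencies are the limiting factors the abstract mentions.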
2

ASBJOIN: uma estratégia adaptativa para consultas envolvendo operadores de junção em Linked Data / ASBJOIN: an adaptive strategy for queries involving join operators on Linked Data

Macedo Sousa Maia 31 October 2013
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / Motivated by the success of Linked Data and driven by the growing number of RDF data sources available on the Web, new challenges for query processing are emerging, especially in distributed settings. Linked Data environments allow the execution of federated queries, which join data provided by multiple sources that are often unstable. The term federated query is used when a solution is built from information obtained from different sources. In this sense, designing new algorithms and adaptive strategies that execute joins efficiently is a major challenge. In this work, we present a solution for the adaptive execution of join operations in federated queries.
Adaptive join execution over distributed data sources relies on statistics collected at runtime. One such statistic for a given source is, for example, the elapsed time needed to obtain a result. To keep these statistics up to date, we use a module that collects them while the query executes and then stores them in a local database, which we call the statistics catalog. The module works in parallel with the query processor.
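The runtime statistics catalog described above can be sketched roughly as follows. This is a minimal illustration, not the ASBJOIN implementation: the names `StatsCatalog`, `timed_fetch`, and `order_sources` are hypothetical, an in-memory dictionary stands in for the local database, and the fastest-first ordering is an assumed use of the statistics.

```python
import time


class StatsCatalog:
    """In-memory stand-in for the local statistics catalog (assumption)."""

    def __init__(self):
        self._elapsed = {}  # source name -> (total seconds, number of calls)

    def record(self, source, seconds):
        total, calls = self._elapsed.get(source, (0.0, 0))
        self._elapsed[source] = (total + seconds, calls + 1)

    def mean_elapsed(self, source):
        total, calls = self._elapsed.get(source, (0.0, 0))
        return total / calls if calls else 0.0


def timed_fetch(catalog, source, fetch):
    """Run `fetch()` against a source and record its elapsed time."""
    start = time.perf_counter()
    rows = fetch()
    catalog.record(source, time.perf_counter() - start)
    return rows


def order_sources(catalog, sources):
    """Assumed policy: contact the sources with the lowest mean elapsed time first.

    Sources without statistics have a mean of 0.0 and are therefore tried first.
    """
    return sorted(sources, key=catalog.mean_elapsed)
```

Because `record` is called on every fetch, the ordering can change between join steps of the same query, which is the sense in which the execution adapts.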
3

Multiple Continuous Query Processing with Relative Window Predicates "Juggler"

Silva, Asima 27 May 2004
Efficient querying over streaming data is a critical technology that requires the ability to handle numerous and possibly similar queries in real-time dynamic environments such as the stock market and medical devices. Existing DBMS technology is not well suited for this domain since it was developed for static historical data. Queries over streams often contain relative window predicates, as in the query: "Heart rate decreased to fifty-two beats per second within four seconds after the patient's temperature started rising." Relative window predicates are a specific type of join between streams that is based on the tuples' timestamps. In our operator, called Juggler, predicates are classified into three types: attribute, join, and window. Attribute predicates compare stream values to a constant. Join predicates compare stream values to another stream's values. Window predicates are join predicates that compare the streams' timestamp values. Juggler's composite operator incorporates the processing of similar, though not identical, query functionalities as one complex computation process. This execution strategy handles multi-way joins for multiple selection and join predicates. It adaptively orders the execution of predicates by their selectivity to efficiently process multiple continuous queries based on stream characteristics. In Juggler, all similar predicates are grouped into lists. These indices are represented by a collection of bits. Every tuple contains the bit structure representation of the predicate lists, which encodes the tuple's predicate evaluation history. Every query also contains a similar bit structure to encode the predicates' relationship to the registered queries. The tuple's and query's bit structures are compared to assess whether the tuple has satisfied a query. Juggler is designed and implemented in Java. Experiments were conducted to verify correctness and to assess the performance of Juggler's three features.
Its adaptive reordering of predicate-type evaluation performed as well as the most selective predicate ordering. Its ability to exploit similar predicates in multiple queries reduced the number of comparisons. Its effectiveness when multiple queries are combined in a single Juggler operator indicated potential performance improvements after optimization of Juggler's data structures.
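The bit-structure comparison described above can be illustrated with a small sketch. It is written in Python for brevity, although Juggler itself is implemented in Java, and it is not the thesis code. The predicate list, the field names (`heart_rate`, `temp_rising`, `hr_ts`, `temp_ts`), and the two query masks are hypothetical.

```python
# Hypothetical shared predicate list; the list index is the bit position.
PREDICATES = [
    lambda t: t["heart_rate"] <= 52,                 # bit 0: attribute predicate
    lambda t: t["temp_rising"],                      # bit 1: attribute predicate
    lambda t: 0 <= t["hr_ts"] - t["temp_ts"] <= 4,   # bit 2: window predicate
]

# Each registered query is a mask of the predicates it requires (assumed layout).
QUERIES = {
    "q_alarm": 0b111,   # all three predicates
    "q_low_hr": 0b001,  # only the heart-rate predicate
}


def evaluate(joined_tuple):
    """Evaluate every shared predicate once and record the outcomes as bits."""
    bits = 0
    for position, predicate in enumerate(PREDICATES):
        if predicate(joined_tuple):
            bits |= 1 << position
    return bits


def satisfied_queries(joined_tuple):
    """Return the names of all queries whose mask is covered by the tuple's bits."""
    bits = evaluate(joined_tuple)
    return sorted(name for name, mask in QUERIES.items() if (bits & mask) == mask)
```

Evaluating each shared predicate once per tuple and then testing masks is one way similar predicates across queries can reduce the number of comparisons. The sketch evaluates predicates in a fixed order, whereas the abstract describes reordering them adaptively by selectivity.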
4

State Spill Policies for State Intensive Continuous Query Plan Evaluation

Jbantova, Mariana G 02 May 2007
The need of modern applications such as network monitoring systems, telecommunications data management, web applications, and remote medical monitoring for near-real-time results over continuous data streams has spurred the development of new data management systems called Data Stream Management Systems (DSMSs). Unlike traditional database systems, which answer one-time user queries only after the finite data has been captured on disk, DSMSs provide on-the-fly answers to user queries as data arrives at various rates in the form of continuous, potentially infinite streams of tuples. To meet the timeliness requirements of applications, DSMSs aim to keep all data in main memory. Queries with multiple stateful operators therefore put a major strain on memory. Existing adaptation techniques designed to address this issue are ineffective when faced with continuous bursts of high data rates. When system load exceeds system capacity, a DSMS has three options: 1) discard some new data; 2) crash; or 3) spill data to disk. Only option three allows it to produce delayed, yet accurate and complete, query results. However, this option involves disk-access overhead and changes the natural order of tuples flowing through the query plan tree. As not all stream operators can correctly process out-of-order tuples, data spilling may have a negative impact on the quality of the final results. Moreover, since operators in a query plan are interconnected, changes in the order of tuple flows inevitably affect the execution stages of downstream operators, for example data purging. Data purging is necessary for processing continuous queries composed of stateful operators. The state of such operators is divided into finite, non-overlapping sets of tuples called windows. After all the tuples for a window have been processed and all results output, these tuples can be discarded to free memory for new data.
To address these issues, we first redesigned the state structure of continuous operators into smaller, finite, non-overlapping sets of tuples, such as partitioned window groups, which incur less disk-access overhead. Second, we enable continuous operators to correctly process out-of-order tuples using punctuation pointers. Third, we design methods for downstream operators to synchronize their processing stages with those of upstream operators to achieve optimized query plan throughput. Putting these techniques together, we have designed a consolidated spilling adaptation strategy that considers all aspects of operators' interconnections in a query plan when making adaptation decisions. The effectiveness of our integrated approach was empirically tested in a comparative evaluation against several alternative spilling adaptation strategies. We conducted our experiments on CAPE, a DSMS developed at WPI, using different types of query plans composed of multiple partitioned window join operators. Our experiments show that, despite the higher overhead of a more synchronized adaptation approach, our consolidated strategy provides better query plan performance and higher plan throughput during periods of continuous bursts of high data rates.
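The interplay of windowed state, spilling, and purging can be sketched as follows. This is a simplified illustration, not CAPE's implementation: the class `WindowedState`, the tuple-count capacity, the pickle spill files, and the "spill the largest window" policy are all assumptions. Punctuation handling is reduced to an explicit `close_window` call.

```python
import os
import pickle
import tempfile
from collections import defaultdict


class WindowedState:
    """Operator state split into non-overlapping windows, with disk spilling."""

    def __init__(self, window_size, capacity, spill_dir):
        self.window_size = window_size
        self.capacity = capacity           # max tuples held in memory (assumed unit)
        self.spill_dir = spill_dir
        self.memory = defaultdict(list)    # window id -> in-memory tuples
        self.spilled = set()               # window ids that have a spill file

    def _path(self, wid):
        return os.path.join(self.spill_dir, f"window_{wid}.pkl")

    def insert(self, ts, value):
        wid = ts // self.window_size
        self.memory[wid].append((ts, value))
        if sum(len(v) for v in self.memory.values()) > self.capacity:
            # Assumed policy: spill the window holding the most tuples.
            self._spill(max(self.memory, key=lambda w: len(self.memory[w])))

    def _spill(self, wid):
        tuples = self.memory.pop(wid)
        if wid in self.spilled:
            # Keep earlier spilled tuples of the same window in front.
            with open(self._path(wid), "rb") as f:
                tuples = pickle.load(f) + tuples
        with open(self._path(wid), "wb") as f:
            pickle.dump(tuples, f)
        self.spilled.add(wid)

    def close_window(self, wid):
        """Return every tuple of a finished window, then purge its state."""
        tuples = []
        if wid in self.spilled:
            with open(self._path(wid), "rb") as f:
                tuples = pickle.load(f)
            os.remove(self._path(wid))
            self.spilled.discard(wid)
        tuples += self.memory.pop(wid, [])
        return tuples


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as spill_dir:
        state = WindowedState(window_size=10, capacity=3, spill_dir=spill_dir)
        for ts in (1, 2, 3, 12, 4):
            state.insert(ts, f"v{ts}")
        print(state.close_window(0))
```

Spilling whole windows keeps each disk read tied to one purge step, which loosely mirrors the abstract's point that smaller, non-overlapping state units reduce disk-access overhead. A spilled window is only merged back when it closes, so results for that window are delayed but complete.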
