  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
21

Towards Semantically Enabled Complex Event Processing

Keskisärkkä, Robin January 2017 (has links)
The Semantic Web provides a framework for semantically annotating data on the web, and the Resource Description Framework (RDF) supports the integration of structured data represented in heterogeneous formats. Traditionally, the Semantic Web has focused primarily on more or less static data, but information on the web today is becoming increasingly dynamic. RDF Stream Processing (RSP) systems address this issue by adding support for streaming data and continuous query processing. To some extent, RSP systems can be used to perform complex event processing (CEP), where meaningful high-level events are generated based on low-level events from multiple sources; however, there are several challenges with respect to using RSP in this context. Event models designed to represent static event information lack several features required for CEP, and are typically not well suited for stream reasoning. The dynamic nature of streaming data also greatly complicates the development and validation of RSP queries. Therefore, reusing queries that have been prepared ahead of time is important to be able to support real-time decision-making. Additionally, there are limitations in existing RSP implementations in terms of both scalability and expressiveness, where some features required in CEP are not supported by any of the current systems. The goal of this thesis work has been to address some of these challenges and the main contributions of the thesis are: (1) an event model ontology targeted at supporting CEP; (2) a model for representing parameterized RSP queries as reusable templates; and (3) an architecture that allows RSP systems to be integrated for use in CEP. The proposed event model tackles issues specifically related to event modeling in CEP that have not been sufficiently covered by other event models, includes support for event encapsulation and event payloads, and can easily be extended to fit specific use-cases. 
The model for representing RSP query templates was designed as an extension to SPIN, a vocabulary that supports modeling of SPARQL queries as RDF. The extended model supports the current version of the RSP Query Language (RSP-QL) developed by the RDF Stream Processing Community Group, along with some of the most popular RSP query languages. Finally, the proposed architecture views RSP queries as individual event processing agents in a more general CEP framework. Additional event processing components can be integrated to provide support for operations that are not supported in RSP, or to provide more efficient processing for specific tasks. We demonstrate the architecture in implementations for scenarios related to traffic-incident monitoring, criminal-activity monitoring, and electronic healthcare monitoring.
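The thesis's template model is expressed in RDF via SPIN, but the core idea of parameterized, reusable continuous queries can be sketched with plain string templating. The query text and parameter names below are hypothetical stand-ins, not actual RSP-QL syntax:

```python
from string import Template

# Hypothetical RSP-QL-like query text; the placeholders make one query,
# validated ahead of time, reusable across streams and event types.
RSP_TEMPLATE = Template(
    "REGISTER STREAM $out AS "
    "SELECT ?e WHERE { WINDOW :w RANGE $range ON $stream { ?e a $type } }"
)

def instantiate(out, stream, event_type, range_="PT10S"):
    """Bind the template's parameters to produce a concrete continuous query."""
    return RSP_TEMPLATE.substitute(
        out=out, stream=stream, type=event_type, range=range_
    )

query = instantiate("HighTempAlerts", ":sensorStream", ":TemperatureReading")
```

The point of the thesis's model is that such templates are themselves represented as RDF, so they can be stored, shared, and instantiated at decision time without re-validating the query logic.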
22

Stream processing optimizations for mobile sensing applications

Lai, Farley 01 August 2017 (has links)
Mobile sensing applications (MSAs) are an emerging class of applications that process continuous sensor data streams to make time-sensitive inferences. Representative application domains range from environmental monitoring and context-aware services to the recognition of physical activities and social interactions; example applications include city air quality assessment, indoor localization, pedometers, and speaker identification. The common application workflow is to read data streams from the sensors (e.g., accelerometers, microphone, GPS), extract statistical features, and then present the inferred high-level events to the user. MSAs in the healthcare domain have drawn particular attention in recent years, because sensor-based data collection and assessment offer finer granularity, better timeliness, and higher accuracy in greater quantity than the traditional, labor-intensive data-gathering mechanisms in use today, such as survey methods. The higher fidelity and accuracy of the collected data expose new research opportunities, improve the reliability and accuracy of medical decisions, and empower users to manage their personal health more effectively. Nonetheless, a critical challenge to the practical deployment of MSAs in the real world is to effectively manage the limited resources of mobile platforms to meet stringent quality-of-service (QoS) requirements, in terms of processing throughput and delay, while ensuring long-term robustness. To address this challenge, we model MSAs as dataflow graphs of processing elements connected by communication channels. The processing elements may execute in parallel as long as they have sufficient data to process. A key feature of the dataflow model is that it explicitly captures parallelism and data dependencies between processing elements. Based on this graph composition, we first proposed CSense, a stream-processing toolkit for robust and high-rate MSAs. 
In this work, CSense provides a simple language for developers to describe their sensing flow without having to deal with system intricacies such as memory allocation, concurrency control, and power management. The results show that up to a 19X performance improvement may be achieved automatically compared with a baseline using the default runtime concurrency and memory management. Following this direction, we saw opportunities to significantly improve the memory performance and energy efficiency of MSAs by exploiting their iterative execution. We therefore next focused on optimizing runtime memory management through compile-time analysis. The contribution is a stream compiler that captures whole-program memory behavior to generate an efficient memory layout for runtime access. Experiments show that our memory optimizations reduce the memory footprint by as much as 96% while matching or improving the performance of the StreamIt compiler with cache optimizations enabled. On the other hand, while a significant body of work has focused on optimizing the throughput or latency of processing sensor streams, little to no attention has been given to energy efficiency. We propose an accurate offline energy prediction model for MSAs that leverages their pipeline structure and iterative execution to search for the most energy-saving batching configuration with respect to a deadline constraint. Developers can thus visualize the energy-delay trade-off across the parameter space without runtime profiling. The evaluation shows worst-case prediction errors of about 7% for energy and 15% for latency, despite variable application workloads.
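The dataflow model described in this abstract, processing elements connected by communication channels that fire when input is available, can be sketched in a few lines of Python. This is an illustrative toy, not CSense's actual language or runtime:

```python
from collections import deque

class Element:
    """A processing element: fires whenever its input channel holds data."""
    def __init__(self, fn):
        self.fn = fn
        self.inbox = deque()   # the communication channel feeding this element

    def fire(self):
        out = []
        while self.inbox:      # data dependency: fire only while input exists
            out.append(self.fn(self.inbox.popleft()))
        return out

def run(pipeline, samples):
    """Push a batch of sensor samples through a linear chain of elements."""
    stream = list(samples)
    for elem in pipeline:
        elem.inbox.extend(stream)
        stream = elem.fire()
    return stream

# Toy stages standing in for sensor scaling and feature extraction.
scale = Element(lambda x: x * 2)
feature = Element(lambda x: x + 1)
```

Because each element exposes its data dependencies explicitly through its channel, a runtime could execute independent elements in parallel, which is the property the abstract highlights.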
23

CBPsp: complex business processes for stream processing

Kamaleswaran, Rishikesan 01 April 2011 (has links)
This thesis presents the framework of a complex-business-process-driven event stream processing system that produces meaningful output with direct implications for the business objectives of an organization. The framework is demonstrated with a case study instantiating the management of a newborn infant with hypoglycaemia. Business processes defined within guidelines are specified at build time, while critical knowledge found in their definitions is used to support their enactment for stream analysis. Four major research contributions are delivered. The first contribution enables the definition and enactment of complex business processes in real time. The second supports the extraction of business processes using knowledge found within their initial expression. The third allows the explicit use of temporal-abstraction and stream-analysis knowledge to support enactment in real time. Finally, the last contribution is the real-time integration of heterogeneous streams based on Service-Oriented Architecture principles. / UOIT
24

Quality-of-Service-Aware Data Stream Processing

Schmidt, Sven 21 March 2007 (has links) (PDF)
Data stream processing, in industry as well as academia, has gained more and more importance in recent years. Consider the monitoring of industrial processes as an example: sensors are mounted to gather large amounts of data within a short time range. Storing and post-processing these data may occasionally be useless or even impossible. On the one hand, only a small part of the monitored data is relevant; to use storage capacity efficiently, only a preselection of the data should be considered. On the other hand, the volume of incoming data may simply be too high to be stored in time, or, in other words, the technical effort of storing the data in time would be out of scale. Processing data streams in the context of this thesis means applying database operations to the stream in an on-the-fly manner, without explicitly storing the data. The challenge of this task lies in the limited amount of resources, while data streams are potentially infinite. Furthermore, data stream processing must be fast and the results have to be disseminated as soon as possible. This thesis focuses on the latter issue. The goal is to provide a so-called Quality of Service (QoS) for the data stream processing task. To that end, adequate QoS metrics such as maximum output delay and minimum result data rate are defined. Thereafter, a cost model for deriving the required processing resources from the specified QoS is presented. On that basis, the stream processing operations are scheduled. Depending on the required QoS and the available resources, weight can be shifted among the individual resources and QoS metrics, respectively. Calculating and scheduling resources requires considerable expert knowledge about the characteristics of the stream operations and the incoming data streams. Often this knowledge is based on experience, so a revision of the resource calculation and reservation becomes necessary from time to time. 
This leads to occasional interruptions of the continuous data stream processing, of the delivery of results, and thus of the negotiated Quality of Service. The proposed robustness concept supports the user and reduces the number of such interruptions by providing additional resources.
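The step from QoS metrics to resource requirements can be illustrated with a toy cost model. The formula and parameter names here are assumptions for illustration, not the thesis's actual cost model:

```python
def required_cpu_share(input_rate, cost_per_tuple, max_delay, batch=1):
    """Toy QoS cost model: the CPU share a stream operator needs so that it
    (a) keeps up with the input rate and (b) finishes a batch of tuples
    within the maximum output delay. input_rate is in tuples/second,
    cost_per_tuple in CPU-seconds, max_delay in seconds."""
    keep_up = input_rate * cost_per_tuple            # share needed for throughput
    meet_delay = batch * cost_per_tuple / max_delay  # share needed for the delay bound
    return max(keep_up, meet_delay)
```

For example, 100 tuples/s at 1 ms of CPU each needs only a 10% CPU share to keep up, but emitting batches of 10 tuples within a 50 ms output delay pushes the requirement to 20%, showing how tightening one QoS metric shifts weight onto the resource reservation.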
25

Escalonamento adaptativo para sistemas de processamento contínuo de eventos. / Adaptive scheduling for continuous event processing systems.

SOUSA, Rodrigo Duarte. 13 April 2018 (has links)
Previous issue date: 2014-08-04 / Continuous event processing systems have been used in applications that require near real-time processing. This need, together with the high volume of data processed by these applications, gives such systems strong performance and fault-tolerance requirements. Schedulers therefore generally use resource-utilization information from the system's machines (such as CPU, RAM, network, and disk usage) in an attempt to react to overloads that would increase resource utilization and degrade application performance. However, given the differing profiles of applications and components, the complexity of deciding, in a flexible and generic way, what should be monitored, and of judging what makes one resource more important than another at a given moment, can lead the scheduler to make poor choices. The work presented in this dissertation proposes a scheduling algorithm that, through a reactive approach, adapts to different application and load profiles, making decisions based on monitoring the variation in the performance of its operators. Periodically, the scheduler evaluates which operators have shown degraded performance and then tries to migrate those operators to less overloaded nodes. 
Experiments evaluating a prototype of the algorithm showed improved system performance, through reduced processing latency while maintaining the number of events processed. In runs with abrupt workload variations, the operators' average processing latency was reduced by more than 84%, while the number of processed events decreased by only 1.18%. / The usage of event stream processing systems has been growing lately, mainly in applications that require near real-time processing. That need, combined with the high amount of data processed by these applications, increases the dependency on the performance and fault tolerance of such systems. To handle these requirements, schedulers usually make use of resource-utilization information (such as CPU, RAM, disk, and network bandwidth) in an attempt to react to potential overloads that may further increase utilization, causing the application's performance to deteriorate. However, due to differing application profiles and components, the complexity of deciding, in a flexible and generic way, what resources should be monitored, and the difference between what makes one resource's utilization more important than another's at a given time, can lead the scheduler to perform wrong actions. In this work, we propose a scheduling algorithm that, via a reactive approach, adapts to different application profiles and loads, taking decisions based on the latency variation of its operators. Periodically, the system scheduler evaluates which operators show evidence of being in an overloaded state, then tries to migrate those operators to a machine with lower utilization. The experiments showed an improvement in system performance: in scenarios with a bursty workload, the operators' average processing latency was reduced by more than 84%, while the number of processed events decreased by only 1.18%.
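The reactive core of such a scheduler, flagging operators whose processing latency has degraded as migration candidates, might look like the following sketch. The split-halves comparison and the threshold are hypothetical choices, not the dissertation's algorithm:

```python
def pick_migration_candidates(latency_history, threshold=1.5):
    """Flag operators whose recent average latency grew past `threshold`
    times their earlier average -- a toy version of reactive detection
    of operator performance degradation."""
    flagged = []
    for op, samples in latency_history.items():
        half = len(samples) // 2
        earlier = sum(samples[:half]) / half
        recent = sum(samples[half:]) / (len(samples) - half)
        if recent > threshold * earlier:
            flagged.append(op)   # candidate to migrate to a less loaded node
    return flagged
```

Monitoring operator latency directly, rather than machine-level CPU or memory counters, is what lets this style of scheduler stay agnostic to application and resource profiles.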
26

A benchmark suite for distributed stream processing systems / Um benchmark suite para sistemas distribuídos de stream processing

Bordin, Maycon Viana January 2017 (has links)
A datum by itself has no value; only when it is interpreted, contextualized, and aggregated with other data does it acquire value and become information. In some classes of applications the value lies not only in the information, but also in the speed with which that information is obtained. High-frequency trading is a good example, where profitability is directly proportional to latency (LOVELESS; STOIKOV; WAEBER, 2013). With the evolution of hardware and data-processing tools, many applications that once took hours to produce results must now produce them in minutes or seconds (BARLOW, 2013). Besides requiring real-time or near real-time processing, this type of application is characterized by the continuous ingestion of large and unbounded amounts of data in the form of tuples or events. The growing demand for applications with these requirements led to the creation of systems that provide a programming model abstracting details such as scheduling, fault tolerance, processing, and query optimization. These systems are known as Stream Processing Systems (SPS), Data Stream Management Systems (DSMS) (CHAKRAVARTHY, 2009), or Stream Processing Engines (SPE) (ABADI et al., 2005). Recently these systems have adopted distributed architectures as a way to deal with ever-increasing amounts of data (ZAHARIA et al., 2012). Among these systems are S4, Storm, Spark Streaming, Flink Streaming, and more recently Samza and Apache Beam. These systems model data processing as a flow graph, with vertices representing operators and edges representing data streams. The similarities do not go much further than that, however, since each system has its own particularities regarding fault-tolerance and recovery mechanisms, operator scheduling and parallelism, and communication patterns. 
In this scenario it would be useful to have a tool for comparing these systems under different workloads, to help select the most suitable platform for a specific job. This work proposes a benchmark composed of applications from different areas, as well as a framework for the development and evaluation of distributed SPSs. / Recently, a new application domain characterized by the continuous, low-latency processing of large volumes of data has been gaining attention. The growing number of applications of this kind has led to the creation of Stream Processing Systems (SPSs), systems that abstract the details of real-time applications from the developer. More recently, ever-increasing volumes of data to be processed gave rise to distributed SPSs. There are currently several distributed SPSs on the market; however, the existing benchmarks designed for evaluating this kind of system cover only a few applications and workloads, while these systems have a much wider set of applications. In this work a benchmark for stream processing systems is proposed. Based on a survey of several papers on real-time and stream applications, the most used applications and areas were outlined, as well as the metrics most used in the performance evaluation of such applications. With this information the metrics of the benchmark were selected, as well as a list of possible applications to be part of the benchmark; these went through a workload characterization in order to select a diverse set of applications. To ease the evaluation of SPSs, a framework was created with an API to generalize application development and collect metrics, with the possibility of extending it to support other platforms in the future. To prove the usefulness of the benchmark, a subset of the applications was executed on Storm and Spark using the Azure platform, and the results demonstrated the usefulness of the benchmark suite in comparing these systems.
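The metrics such a benchmark typically reports, sustained throughput plus median and tail latency over a run, can be computed with a small helper. This is a generic sketch, not the benchmark framework's actual API:

```python
def summarize_run(latencies_ms, duration_s):
    """Report the metrics stream benchmarks usually compare: sustained
    throughput and latency percentiles over one run."""
    s = sorted(latencies_ms)

    def pct(p):
        # nearest-rank percentile, clamped to the last sample
        return s[min(len(s) - 1, int(p * len(s)))]

    return {
        "throughput_tps": len(s) / duration_s,
        "latency_p50_ms": pct(0.50),
        "latency_p99_ms": pct(0.99),
    }
```

Reporting percentiles rather than averages matters for SPS comparison, since systems with similar mean latency can differ sharply in tail behavior under bursty workloads.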
28

TupleSearch: A scalable framework based on sketches to process and store streaming temporal data for real time analytics

Karlsson, Henrik January 2017 (has links)
In many fields there is a need for quick analysis of data. As the number of devices connected to the Internet grows, so does the amount of data generated. The traditional way of analyzing large amounts of data has been batch processing, where already collected data is processed. This process is time consuming, which has led to another emerging trend: stream processing. Stream processing means processing and storing data as it arrives. Because of the velocity, volume, and variation of the data, stream processing is best carried out in main memory, which makes it a big challenge. This thesis focuses on developing a framework for processing and storing streaming temporal data so that the data can be analyzed in real time. For this purpose, a server application was created consisting of approximate in-memory data synopses, called sketches, to process and store the input data. Furthermore, a client web application was created to query and analyze the data. The results show that the framework can support simple aggregate queries with constant query time regardless of the volume of data. It can also process data 6.8 times faster than a traditional database system. This implies that the system is scalable, though it comes with a query-error vs. memory trade-off. For a distribution of ~3,000,000 unique items it was concluded that the framework can provide very accurate answers, with an error rate of less than 1.1%, for the trendiest data, using about 100 times less space than the actual size of the data set.
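A count-min sketch is a representative example of the approximate in-memory synopses ("sketches") this kind of framework relies on. Whether TupleSearch uses exactly this structure is an assumption; the sketch illustrates the constant-time, bounded-memory, overcount-only trade-off the abstract describes:

```python
import hashlib

class CountMinSketch:
    """Approximate per-item counts in fixed memory; estimates never
    undercount, and overcount only on hash collisions."""
    def __init__(self, width=1024, depth=4):
        self.width, self.depth = width, depth
        self.rows = [[0] * width for _ in range(depth)]

    def _cols(self, item):
        # one independent hash (and column) per row
        for i in range(self.depth):
            digest = hashlib.blake2b(f"{i}:{item}".encode(), digest_size=8).digest()
            yield int.from_bytes(digest, "big") % self.width

    def add(self, item):
        for row, c in zip(self.rows, self._cols(item)):
            row[c] += 1

    def estimate(self, item):
        # the minimum across rows is the least collision-inflated count
        return min(row[c] for row, c in zip(self.rows, self._cols(item)))
```

Both `add` and `estimate` touch a fixed `depth` number of cells, which is why query time stays constant regardless of the data volume, at the cost of the query-error vs. memory trade-off noted above.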
29

Handling Tradeoffs between Performance and Query-Result Quality in Data Stream Processing

Ji, Yuanzhen 27 March 2018 (has links) (PDF)
Data streams in the form of potentially unbounded sequences of tuples arise naturally in a large variety of domains including finance markets, sensor networks, social media, and network traffic management. The increasing number of applications that require processing data streams with high throughput and low latency have promoted the development of data stream processing systems (DSPS). A DSPS processes data streams with continuous queries, which are issued once and return query results to users continuously as new tuples arrive. For stream-based applications, both the query-execution performance (in terms of, e.g., throughput and end-to-end latency) and the quality of produced query results (in terms of, e.g., accuracy and completeness) are important. However, a DSPS often needs to make tradeoffs between these two requirements, either because of the data imperfection within the streams, or because of the limited computation capacity of the DSPS itself. Performance versus result-quality tradeoffs caused by data imperfection are inevitable, because the quality of the incoming data is beyond the control of a DSPS, whereas tradeoffs caused by system limitations can be alleviated—even erased—by enhancing the DSPS itself. This dissertation seeks to advance the state of the art on handling the performance versus result-quality tradeoffs in data stream processing caused by the above two aspects of reasons. For tradeoffs caused by data imperfection, this dissertation focuses on the typical data-imperfection problem of stream disorder and proposes the concept of quality-driven disorder handling (QDDH). QDDH enables a DSPS to make flexible and user-configurable tradeoffs between the end-to-end latency and the query-result quality when dealing with stream disorder. Moreover, compared to existing disorder handling approaches, QDDH can significantly reduce the end-to-end latency, and at the same time provide users with desired query-result quality. 
In this dissertation, a generic buffer-based QDDH framework and three instantiations of the generic framework for distinct query types are presented. For tradeoffs caused by system limitations, this dissertation proposes a system-enhancement approach that combines the row-oriented and the column-oriented data layout and processing techniques in data stream processing to improve the throughput. To fully exploit the potential of such hybrid execution of continuous queries, a static, cost-based query optimizer is introduced. The optimizer works at the operator level and takes the unique property of execution plans of continuous queries—feasibility—into account.
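Buffer-based disorder handling can be illustrated with a simple K-slack buffer, where the slack K is the knob that trades end-to-end latency against result quality. This is a generic textbook scheme sketched for illustration, not the dissertation's QDDH framework itself:

```python
import heapq

class KSlackBuffer:
    """Hold each tuple until the watermark (largest timestamp seen minus
    the slack K) passes it, then emit tuples in timestamp order. A larger
    K tolerates more disorder (better result quality) but adds latency."""
    def __init__(self, k):
        self.k = k
        self.heap = []    # min-heap ordered by timestamp
        self.max_ts = 0

    def insert(self, ts, value):
        heapq.heappush(self.heap, (ts, value))
        self.max_ts = max(self.max_ts, ts)
        emitted = []
        while self.heap and self.heap[0][0] <= self.max_ts - self.k:
            emitted.append(heapq.heappop(self.heap))
        return emitted
```

A quality-driven variant would adjust `k` at runtime from a user-specified latency or quality target instead of fixing it up front, which is the flexibility QDDH adds over static schemes like this one.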
30

A situation refinement model for complex event processing

Alakari, Alaa A. 07 January 2021 (has links)
Complex Event Processing (CEP) systems aim at processing large flows of events to discover situations of interest (SOI). Primarily, CEP uses predefined pattern templates to detect occurrences of complex events in an event stream. Extracting complex events is achieved by employing techniques such as filtering and aggregation to detect complex patterns across many simple events. In general, CEP systems rely on domain experts to define complex pattern rules to recognize SOI. However, the task of fine-tuning complex pattern rules in the event-streaming environment faces two main challenges: increased pattern complexity, and the event-streaming constraint that such rules must be acquired and processed in near real time. Therefore, to fine-tune a CEP pattern to identify SOI, the following requirements must be met. First, a minimum number of rules must be used to refine the CEP pattern, to avoid increased pattern complexity; second, domain knowledge must be incorporated in the refinement process to improve awareness of emerging situations. Furthermore, the event data must be processed upon arrival, to cope with the continuous arrival of events in the stream and to respond in near real time. In this dissertation, we present a Situation Refinement Model (SRM) that addresses these requirements, in particular by developing a Single-Scan Frequent Item Mining algorithm that acquires a minimal number of CEP rules and can adjust the level of refinement to fit the applied scenario. In addition, a cost-gain evaluation measure to determine the best tradeoff for identifying a particular SOI is presented. / Graduate
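The filtering-plus-aggregation style of CEP pattern detection described above can be sketched as a sliding-window rule over simple events. The event kinds, window, and threshold here are hypothetical, not taken from the dissertation:

```python
from collections import deque

def detect_soi(events, window, threshold):
    """Raise a complex 'SOI' event when `threshold` alert events fall
    within one sliding time window: filtering, then aggregation."""
    recent, detected = deque(), []
    for ts, kind in events:
        if kind != "alert":           # filtering: drop irrelevant simple events
            continue
        recent.append(ts)
        while recent and recent[0] <= ts - window:
            recent.popleft()          # expire events outside the window
        if len(recent) >= threshold:  # aggregation: count within the window
            detected.append(("SOI", ts))
            recent.clear()
    return detected
```

Each event is processed once, on arrival, matching the single-scan, near real-time constraint the abstract emphasizes.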
