1 |
Real-time data stream clustering over sliding windows
Badiozamany, Sobhan January 2016
In many applications, e.g. urban traffic monitoring, stock trading, and industrial sensor data monitoring, clustering algorithms are applied to data streams in real time to find current patterns. Here, sliding windows are commonly used because they capture concept drift. Real-time clustering over sliding windows means detecting continuously evolving clusters as soon as they occur in the stream, which requires efficient maintenance of cluster memberships that change as windows slide. Data stream management systems (DSMSs) provide high-level query languages for searching and analyzing streaming data. In this thesis we extend a DSMS with a real-time data stream clustering framework called the Generic 2-phase Continuous Summarization framework (G2CS). G2CS modularizes data stream clustering by taking as input clustering algorithms that are expressed in terms of a number of functions and indexing structures. G2CS supports real-time clustering through an efficient window sliding mechanism and algorithm-transparent indexing. A particular challenge for real-time detection of a large number of rapidly evolving clusters is the efficiency of window slides for clustering algorithms that do not support deletion of expired data, e.g. BIRCH. To that end, G2CS includes a novel window maintenance mechanism called Sliding Binary Merge (SBM). To further improve real-time sliding performance, G2CS uses generation-based multi-dimensional indexing, where indexing structures suitable for the clustering algorithms can be plugged in.
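To illustrate why window slides are awkward for summaries that cannot delete expired points, here is a minimal sketch. It is not the thesis's Sliding Binary Merge, whose details the abstract does not give. It assumes a single BIRCH-style additive cluster feature (count, linear sum, squared sum) over 1-D points, and keeps one summary per slide-sized pane so that expiry drops a whole pane instead of deleting individual points. The names `ClusterFeature` and `PaneWindow` are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterFeature:
    """Additive summary of 1-D points: count, linear sum, squared sum (assumed simplification)."""
    n: int = 0
    ls: float = 0.0
    ss: float = 0.0

    def add(self, x: float) -> "ClusterFeature":
        return ClusterFeature(self.n + 1, self.ls + x, self.ss + x * x)

    def merge(self, other: "ClusterFeature") -> "ClusterFeature":
        # Summaries can be merged cheaply, but a single point cannot be removed again.
        return ClusterFeature(self.n + other.n, self.ls + other.ls, self.ss + other.ss)

    def mean(self) -> float:
        return self.ls / self.n if self.n else 0.0


class PaneWindow:
    """Sliding window stored as one summary per pane (hypothetical helper)."""

    def __init__(self, panes_per_window: int):
        # A bounded deque discards the oldest pane when a new one is appended.
        self.panes = deque(maxlen=panes_per_window)

    def slide(self, new_points) -> None:
        pane = ClusterFeature()
        for x in new_points:
            pane = pane.add(x)
        self.panes.append(pane)

    def summary(self) -> ClusterFeature:
        total = ClusterFeature()
        for pane in self.panes:
            total = total.merge(pane)
        return total
```

Re-merging every pane on each slide costs time linear in the number of panes; reducing that cost is the kind of problem a dedicated window maintenance mechanism would address.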
|
2 |
The implementation of H.264 algorithm with parallel extended MMX instruction set
Shen, Cheng-Ying 20 August 2008
H.264 is an important method for multimedia transmission and computation, but it is difficult to run smoothly on embedded systems because of their low clock rates. Although many new multimedia instruction sets have been developed, real-time multimedia computation is still difficult to implement on embedded systems.
This paper therefore uses the "Multimedia Operation Register", a SIMD architecture, to implement the H.264 algorithm on an embedded system and improve multimedia processing performance. The Multimedia Operation Register, which executes multiple data streams in parallel, uses the bit-slice concept to design operation pairs that combine a bit storage cell with bit computation. Because the data items used together are often stored in memory at addresses separated by a constant distance, this paper uses a striping addressing mode, which works together with the parallel execution of multiple data streams, to load data at strided addresses from memory in a single instruction. In addition, this paper designs a new instruction set based on the Intel MMX instruction set and the operational characteristics of multimedia computation.
When a designer implements H.264 with a multimedia instruction set using a single data stream, more iterations are needed to do the same work in every block. This paper needs fewer iterations because the Multimedia Operation Register can process data from many different blocks at the same time through parallel execution of multiple data streams. In addition, by changing the working mode, the registers can be reallocated to the arithmetic units that will be used. New instructions also save much of the execution time of operations such as matrix transposition, data reordering, and the SAD (Sum of Absolute Differences) calculation. To reduce the number of memory accesses, this paper rotates data between two registers so that loaded data is reused as much as possible. Together, these methods greatly improve coding efficiency.
The conclusion of this paper is that parallel execution of multiple data streams will be a very important method for multimedia computation, and the paper proposes an innovative architecture to implement it. According to the simulation in Chapter 5, the Multimedia Operation Register processes H.264 more than four times faster than the MMX instruction set. For the SAD calculation, it is as much as ten times faster than MMX. Overall, its performance is even better than that of the latest multimedia instruction set, "SSE4".
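For readers unfamiliar with the SAD (Sum of Absolute Differences) calculation mentioned above, the following scalar sketch shows what is being computed. It is a plain reference version, not the paper's SIMD implementation; the function names `sad` and `best_match` are hypothetical, and blocks are assumed to be lists of rows of integer pixel values.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally shaped pixel blocks."""
    same_rows = len(block_a) == len(block_b)
    if not same_rows or any(len(ra) != len(rb) for ra, rb in zip(block_a, block_b)):
        raise ValueError("blocks must have the same shape")
    return sum(abs(a - b) for ra, rb in zip(block_a, block_b) for a, b in zip(ra, rb))


def best_match(current, candidates):
    """Index of the candidate block with the lowest SAD against the current block."""
    return min(range(len(candidates)), key=lambda i: sad(current, candidates[i]))
```

Every pixel pair is independent of the others, which is why this inner loop is a natural target for SIMD instructions that process several pixels, or several blocks, at once.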
|
3 |
Event stream analytics
Poppe, Olga 05 January 2018
Advances in hardware, software and communication networks have enabled applications to generate data at unprecedented volume and velocity. An important type of such data is the event stream, generated from financial transactions, health sensors, web logs, social media, mobile devices, and vehicles. The world is thus poised for a sea change in time-critical applications, from financial fraud detection to health care analytics, empowered by inferring insights from event streams in real time. Event processing systems continuously evaluate massive workloads of Kleene queries to detect and aggregate event trends of interest. Examples of these trends include check kites in financial fraud detection, irregular heartbeat in health care analytics, and vehicle trajectories in traffic control. These trends can be of any length. Worse yet, their number may grow exponentially in the number of events. State-of-the-art systems do not offer practical solutions for trend analytics and thus suffer from long delays and high memory costs. In this dissertation, we propose the following event trend detection and aggregation techniques. First, we solve the trade-off between CPU processing time and memory usage while computing event trends over high-rate event streams. Namely, our event trend detection approach guarantees minimal CPU processing time given limited memory. Second, we compute online event trend aggregation at multiple granularity levels from fine (per matched event), to medium (per event type), to coarse (per pattern). Thus, we minimize the number of aggregates, reducing both time and space complexity compared to the state-of-the-art approaches. Third, we share intermediate aggregates among multiple event sequence queries while avoiding the expensive construction of matched event sequences.
In several comprehensive experimental studies, we demonstrate the superiority of the proposed strategies over the state-of-the-art techniques with respect to latency, throughput, and memory costs.
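The exponential growth in the number of trends, and the benefit of aggregating without constructing them, can be seen in a toy case. The sketch below is not the dissertation's algorithm. It assumes a single Kleene pattern `A+` in which any non-empty subsequence of `A` events, taken in stream order, counts as one trend; that semantics and both function names are assumptions made for illustration.

```python
from itertools import combinations


def count_trends_online(events, event_type="A"):
    """Count A+ trends in one pass with a single running aggregate."""
    count = 0
    for e in events:
        if e == event_type:
            # Each existing trend can be extended by e, and e also starts a new trend.
            count = 2 * count + 1
    return count


def count_trends_bruteforce(events, event_type="A"):
    """Count the same trends by enumerating every one of them (exponential time)."""
    positions = [i for i, e in enumerate(events) if e == event_type]
    return sum(1 for k in range(1, len(positions) + 1) for _ in combinations(positions, k))
```

With n matching events there are 2^n - 1 trends under this assumed semantics, so the enumerating version quickly becomes infeasible, while the online version keeps one integer per pattern.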
|
4 |
A Client-Centric Data Streaming Technique for Smartphones: An Energy Evaluation
Abogharaf, Abdulhakim 04 1900
With advances in microelectronic and wireless communication technologies, smartphones have computer-like capabilities in terms of computing power and communication bandwidth. They allow users to use advanced applications that used to be run on computers only. Web browsing, email fetching, gaming, social networking, and multimedia streaming are examples of wide-spread smartphone applications. Unsurprisingly, network-related applications are dominant in the realm of smartphones. Users love to be connected while they are mobile. Streaming applications, as a part of network-related applications, are getting increasingly popular. Mobile TV, video on demand, and video sharing are some popular streaming services in the mobile world. Thus, the expected operational time of smartphones is rising rapidly.
On the other hand, the enormous growth of smartphone applications and services adds up to a significant increase in complexity in the context of computation and communication needs, and thus there is a growing demand for energy in smartphones. Unlike the exponential growth in computing and communication technologies, the growth in battery technologies is not keeping up with the rapidly growing energy demand of these devices. Therefore, the smartphone's utility has been severely constrained by its limited battery lifetime. It is very important to conserve the smartphone's battery power. Even though hardware components are the actual energy consumers, software applications utilize the hardware components through the operating system. Thus, by making smartphone applications energy-efficient, the battery lifetime can be extended. With this view, this work focuses on two main problems: i) developing an energy testing methodology for smartphone applications, and ii) evaluating the energy cost and designing an energy-friendly downloader for smartphone streaming applications.
The detailed contributions of this thesis are as follows: (i) it gives a generalized framework for energy performance testing and shows a detailed flowchart that application developers can easily follow to test their applications; (ii) it evaluates the energy cost of some popular streaming applications showing how the download strategy that an application developer adopts may adversely affect the energy savings; (iii) it develops a model of an energy-friendly downloader for streaming applications and studies the effects of the downloader's parameters regarding energy consumption; and finally, (iv) it gives a mathematical model for the proposed downloader and validates it by means of experiments.
|
5 |
Implementation of face detection algorithm with parallel extended-MMX instruction set
Tzeng, Hua-Yi 20 August 2008
Face detection has many applications. Considering both accuracy and the regularity of data layout in face detection, we chose a neural-network-based recognition algorithm for implementation. The implementation is divided into three parts: the Modified Census Transform, the computation of hypotheses, and the drawing of a square frame to mark the face. The Modified Census Transform is a regular computation over regularly arranged data, so it is well suited to SIMD execution, whereas the other parts have irregular data layouts and are not easy to parallelize. This paper uses a SIMD processor architecture developed in our laboratory, with its multi-data-streaming capability, to implement the Modified Census Transform. The picture is divided into four parts that are processed at the same time, and the execution mode is changed to suit each algorithm, so that data fetching is smooth and data movement is reduced. We add a new instruction that uses a 16-bit data format and four MMX registers to perform a 4×4 matrix transpose, and another that loads data and performs sign or zero extension at the same time. These instructions accelerate parallel execution over multiple data streams. We also support data streams that are not contiguous: a striping mode fetches multiple data items separated by the same distance, so that multiple data streams can be computed. In addition, we use the hypotheses to distinguish between persons when only one particular person is sought. We compare two hypotheses; if the difference between the hypotheses for two pictures is smaller than 0.3%, the pictures show the same person. Finally, we verify functional correctness on the UMVP-2500 platform. We compare performance against MMX and Xscale and analyze the benefits of the multi-data-streaming SIMD architecture: the speedup is 373% over MMX and 345% over Xscale.
These results show that the multi-data-streaming SIMD architecture achieves a speedup over other SIMD architectures. The architecture adds a new instruction for the 4×4 matrix transpose. Because the 4×4 transpose exchanges rows and columns, it provides a new abstraction: ordinary computation proceeds along a line, whereas the new abstraction works on a plane. MMX and Xscale do not offer this abstraction.
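The regularity that makes the Modified Census Transform a good fit for SIMD is visible in a scalar reference version. The sketch below assumes the commonly used definition: for each 3×3 neighbourhood, every pixel (centre included) is compared with the neighbourhood mean, giving a 9-bit code. It is not the paper's implementation, and the row-major bit ordering and the function name are assumptions.

```python
def modified_census_transform(image):
    """Return 9-bit MCT codes for the interior pixels of a 2-D grayscale image.

    `image` is a list of rows of integer intensities; border pixels are skipped.
    """
    height, width = len(image), len(image[0])
    codes = []
    for r in range(1, height - 1):
        row_codes = []
        for c in range(1, width - 1):
            window = [image[r + dr][c + dc] for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            total = sum(window)
            code = 0
            for pixel in window:
                # pixel > mean is tested as 9 * pixel > sum to stay in integers.
                code = (code << 1) | (1 if 9 * pixel > total else 0)
            row_codes.append(code)
        codes.append(row_codes)
    return codes
```

Every output pixel runs the same fixed sequence of loads, adds, and compares on neighbouring data, which is the regular access pattern the abstract contrasts with the irregular later stages.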
|
7 |
Building a high throughput microscope simulator using the Apache Kafka streaming framework
Lugnegård, Lovisa January 2018
Today, microscopy imaging is a widely used and powerful method for investigating biological processes. Microscopes can produce large amounts of data in a short time, so time and cost constraints make it impossible to analyse all the data thoroughly. HASTE (Hierarchical Analysis of Temporal and Spatial Image Data) is a collaborative research project between Uppsala University, AstraZeneca and Vironova which addresses this specific problem. The idea is to analyse the image data in real time to make fast decisions on whether to analyse further, store or throw away the data. To facilitate the development of this system, a microscope simulator has been designed and implemented with a strong focus on parameters relating to data throughput. Apart from building the simulator, the Apache Kafka framework has been evaluated for streaming large images. The results of this project are a working simulator whose performance is similar to that of the microscope and an evaluation of Apache Kafka showing that it is possible to stream image data with the framework.
|
8 |
Design and Analysis of a Real-time Data Monitoring Prototype for the LWA Radio Telescope
Vigraham, Sushrutha 11 March 2011
Increasing computing power has been helping researchers understand many complex scientific problems. Scientific computing helps to model and visualize complex processes in fields such as molecular modelling, medical imaging, astrophysics and space exploration by processing large sets of data streams collected through sensors or cameras. This produces a massive amount of data, which consumes a large amount of processing and storage resources. Monitoring the data streams and filtering unwanted information will enable efficient use of the available resources. This thesis proposes a data-centric system that can monitor high-speed data streams in real time. The proposed system provides a flexible environment where users can plug in application-specific data monitoring algorithms. The Long Wavelength Array telescope (LWA) is an astronomical apparatus that works with high-speed data streams, and the proposed data-centric platform is developed to evaluate FPGAs for implementing data monitoring algorithms in LWA. The throughput of the data-centric system has been modeled, and it is observed that the developed system can deliver a maximum throughput of 164 MB/s. / Master of Science
|
9 |
Evaluation of relational queries in a stream-oriented environment (Vyhodnocování relačních dotazů v proudově orientovaném prostředí)
Kikta, Marcel January 2014
This thesis deals with the design and implementation of an optimizer and a transformer of relational queries. Firstly, the thesis describes the theory of relational query compilers. Secondly, we present the data structures and algorithms used in the implemented tool. Finally, the important implementation details of the developed tool are discussed. Part of the thesis is the selection of the relational algebra operators to be used and the design of an appropriate input format. The input of the implemented software is a query written in an XML file in the form of relational algebra. The query is optimized and transformed into a physical plan, which is executed in the parallelization framework Bobox. The developed compiler outputs a physical plan written in the Bobolang language, which serves as input for Bobox.
|
10 |
Network Data Streaming: Algorithms for Network Measurement and Monitoring
Kumar, Abhishek 18 November 2005
With the emergence of computer networks as one of the primary modes of communication, and with their adoption for an increasingly wide range of applications, there is a growing need to understand and characterize the traffic they carry. The rise of large scale network attacks adds urgency to this need. However, the large size, high speed and increasing complexity of these networks imply that tracking and characterizing the traffic they carry is an increasingly difficult problem. Dealing with higher level aggregates, such as flows instead of packets, does not solve the problem because these aggregates tend to be quite numerous and exhibit dynamics of their own.
In this thesis, we investigate a novel approach to deal with the immense amounts of data associated with problems in network measurement and monitoring. Building upon the paradigm of Data Streaming, which processes a large stream of data using a small working memory to answer a class of queries, we develop an architecture for Network Data Streaming that can accommodate additional constraints imposed in the context of network monitoring. Using this architecture, we design algorithms for monitoring properties of network traffic that have traditionally been considered too difficult to monitor at high speed network links and routers. Our first algorithm provides the ability to accurately estimate the size of individual flows. A second algorithm to estimate the distribution of flow sizes enables network operators to monitor anomalies in the traffic. Incorporating the use of packet sampling, we can extend the latter algorithm to estimate the flow size distribution of arbitrary subpopulations.
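The data streaming idea of answering per-flow size queries from a small working memory can be illustrated with a count-min sketch. This is a well-known stand-in chosen for illustration, not one of the thesis's own algorithms, which the abstract does not describe. The class below, its default sizes, and the string flow identifiers are all assumptions.

```python
import hashlib


class CountMinSketch:
    """Fixed-size table of counters that estimates per-flow packet counts."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, flow_id):
        # One independent-looking hash per row, derived by salting BLAKE2b with the row number.
        digest = hashlib.blake2b(
            flow_id.encode(), digest_size=8, salt=row.to_bytes(4, "little")
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, flow_id, packets=1):
        for row in range(self.depth):
            self.rows[row][self._index(row, flow_id)] += packets

    def estimate(self, flow_id):
        # Collisions only inflate counters, so the minimum over rows never underestimates.
        return min(self.rows[row][self._index(row, flow_id)] for row in range(self.depth))
```

Memory use is `width * depth` counters regardless of how many flows pass through, at the cost of occasional overestimates when flows collide in every row.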
Finally, we apply the tools of Network Data Streaming to the operation of packet sampling itself. Using the ability to efficiently estimate flow-statistics such as approximate per-flow size, we design a family of mechanisms where the sampling decision is guided by this knowledge. The individual solutions developed in this thesis share a common architectural theme, supporting the monitoring of highly dynamic populations. Integrating this with the traditional sampling based framework for network monitoring will enable a broad range of applications for accurate and comprehensive monitoring of network traffic.
|