Global ETD Search

281	Advanced Memory Data Structures for Scalable Event Trace Analysis Knüpfer, Andreas 16 December 2008 The thesis presents a contribution to the analysis and visualization of computational performance based on event traces, with a particular focus on parallel programs and High Performance Computing (HPC). Event traces contain detailed information about specified incidents (events) during the run time of programs and allow minute investigation of dynamic program behavior, various performance metrics, and possible causes of performance flaws. Due to long-running and highly parallel programs and very fine detail resolutions, event traces can accumulate huge amounts of data, which become a challenge for interactive as well as automatic analysis and visualization tools. The thesis proposes a method of exploiting redundancy in the event traces in order to reduce the memory requirements and the computational complexity of event trace analysis. The sources of redundancy are repeated segments of the original program, either through iterative or recursive algorithms or through SPMD-style parallel programs, which produce equal or similar repeated event sequences. The data reduction technique is based on the novel Complete Call Graph (CCG) data structure, which allows domain-specific data compression for event traces in a combination of lossless and lossy methods. All deviations due to lossy data compression can be controlled by constant bounds. The compression of the CCG data structure is incorporated in the construction process, such that at no point do substantial uncompressed parts have to be stored. Experiments with real-world example traces reveal the potential for very high data compression. The results range from factors of 3 to 15 for small-scale compression with minimum deviation of the data to factors > 100 for large-scale compression with moderate deviation.
Based on the CCG data structure, new algorithms for the most common evaluation and analysis methods for event traces are presented, which require no explicit decompression. By avoiding repeated evaluation of formerly redundant event sequences, the computational effort of the new algorithms can be reduced to the same extent as memory consumption. The thesis includes a comprehensive discussion of the state of the art and related work, a detailed presentation of the design of the CCG data structure, an elaborate description of algorithms for construction, compression, and analysis of CCGs, and an extensive experimental validation of all components. / This dissertation presents a novel approach to the analysis and visualization of computational performance that is based on event tracing and is tailored in particular to parallel programs and High Performance Computing (HPC). Event traces contain detailed information about specified events during the run time of a program and allow a very precise investigation of dynamic behavior, various performance metrics, and potential performance problems. Because of long-running and highly parallel applications and the high level of detail, event tracing can produce very large amounts of data. These in turn pose a challenge for interactive and automatic analysis and visualization tools. This work presents a method that exploits redundancies in the event traces to reduce both the memory requirements and the run-time complexity of trace analysis. The redundancies stem from repeatedly executed program sections, either through iterative or recursive algorithms or through SPMD parallelization, which produce equal or similar event sequences.
The data reduction is based on the novel "Complete Call Graph" (CCG) data structure and allows a combination of lossless and lossy data compression. Constant bounds can be specified for all deviations caused by lossy compression. The data compression is integrated into the construction of the data structure, so that no extensive uncompressed parts have to be held in main memory before compression. The enormous compression capability of the new approach is demonstrated with a series of examples from real application scenarios. The results achieved range from compression factors of 3 to 5 with only minimal deviations due to lossy compression up to factors > 100 for high-grade compression. Based on the CCG data structure, new evaluation and analysis methods for event traces are also presented that work without explicit decompression. Because compressed event sequences are not analyzed more than once, the run-time complexity of the analysis can be reduced to the same degree as the main memory requirement. The dissertation contains a thorough presentation of the state of the art and of related work in this area, a detailed derivation of the newly introduced data structures and of the construction, compression, and analysis algorithms, and an extensive experimental evaluation and validation of all components. info:eu-repo/classification/ddc/004 ddc:004
282	Statistické jazykové modely založené na neuronových sítích / STATISTICAL LANGUAGE MODELS BASED ON NEURAL NETWORKS Mikolov, Tomáš January 2012 Statistical language models are an important part of many successful applications, including automatic speech recognition and machine translation (the well-known Google Translate is one example). Traditional techniques for estimating these models are based on N-grams. Despite the known shortcomings of these techniques and the enormous effort of research groups across many fields (speech recognition, machine translation, neuroscience, artificial intelligence, natural language processing, data compression, psychology, etc.), N-grams have essentially remained the most successful technique. The goal of this thesis is to present several language model architectures based on neural networks. Although these models are computationally more demanding than N-gram models, the techniques developed in this thesis make it possible to use them efficiently in real applications. The reduction in speech recognition errors relative to the best N-gram models reaches 20%. The model based on a recurrent neural network achieves the best published results on a very well-known dataset (Penn Treebank).
283	High-Throughput BitPacking Compression Lisa, Nusrat Jahan, Nguyen, Tuan Duy Anh, Habich, Dirk, Kumar, Akash, Lehner, Wolfgang 03 July 2023 To efficiently support analytical applications from a data management perspective, in-memory column store database systems are state of the art. In this kind of database system, lossless lightweight integer compression schemes are crucial to keep the memory footprint as low as possible and to speed up query processing. In this specific compression domain, BitPacking is one of the most frequently applied compression schemes. However, (de)compression should not come with any additional cost at run time, but should be provided transparently without compromising the overall system performance. To achieve that, we focus on accelerating BitPacking using Field Programmable Gate Arrays (FPGAs) and outline several FPGA designs for BitPacking in this paper. As we show in our evaluation, our designs provide the BitPacking compression scheme with high throughput. info:eu-repo/classification/ddc/004 ddc:004
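For readers unfamiliar with the scheme named in entry 283, the following is a minimal software sketch of bit packing: every integer in a block is stored with the smallest common bit width instead of a full machine word. It is an illustration only and says nothing about the FPGA designs in the paper; the function names `bit_width`, `pack`, and `unpack` are hypothetical, and the input is assumed to be a non-empty list of non-negative integers.

```python
def bit_width(values):
    """Smallest number of bits that can hold every value (at least 1)."""
    return max(1, max(values).bit_length())


def pack(values, width):
    """Concatenate the low `width` bits of each value into one byte string."""
    acc = 0
    for i, v in enumerate(values):
        if v < 0 or v >> width:
            raise ValueError(f"{v} does not fit in {width} bits")
        acc |= v << (i * width)  # value i occupies bits [i*width, (i+1)*width)
    n_bytes = (len(values) * width + 7) // 8  # round up to whole bytes
    return acc.to_bytes(n_bytes, "little")


def unpack(data, width, count):
    """Recover `count` values of `width` bits each from packed bytes."""
    acc = int.from_bytes(data, "little")
    mask = (1 << width) - 1
    return [(acc >> (i * width)) & mask for i in range(count)]


values = [3, 7, 1, 0, 5, 2, 6, 4]
width = bit_width(values)       # 3 bits per value
packed = pack(values, width)    # 8 values * 3 bits = 24 bits = 3 bytes
restored = unpack(packed, width, len(values))
```

The bit width and the value count have to be stored next to the packed bytes, since both are needed to decode.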
284	Differential pulse code modulation data compression Lum, Randall M. G. 01 January 1989 With the requirement to store and transmit information efficiently, data compression techniques have found an ever-increasing number of uses in diverse fields such as television, surveillance, remote sensing, medical processing, office automation, and robotics. Rapid increases in processing capabilities and the speed of complex integrated circuits make data compression techniques a prime candidate for application in the areas mentioned above. This report addresses, from a theoretical viewpoint, three major data compression techniques: Pixel Coding, Predictive Coding, and Transform Coding. It begins with a project description and continues with data compression techniques, focusing on Differential Pulse Code Modulation. Pulse-code modulation Data compression Telecommunication Image processing Digital techniques Optical data processing Electrical and Computer Engineering Electrical and Electronics Engineering
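As a rough illustration of the predictive coding idea that entry 284 focuses on, the sketch below implements a textbook first-order DPCM loop: each sample is predicted by the previously reconstructed sample, and only the quantized prediction error is stored. The previous-sample predictor, the uniform quantizer, and the names `dpcm_encode`, `dpcm_decode`, and `step` are assumptions made for this example and are not taken from the report.

```python
def dpcm_encode(samples, step):
    """Return quantized prediction errors for a sequence of samples."""
    codes, pred = [], 0
    for x in samples:
        q = round((x - pred) / step)  # quantized prediction error
        codes.append(q)
        pred += q * step              # track what the decoder will reconstruct
    return codes


def dpcm_decode(codes, step):
    """Rebuild the (approximate) samples from quantized prediction errors."""
    out, pred = [], 0
    for q in codes:
        pred += q * step
        out.append(pred)
    return out


samples = [100, 102, 105, 103, 110]
step = 2
codes = dpcm_encode(samples, step)
decoded = dpcm_decode(codes, step)
```

Because the encoder predicts from its own reconstruction rather than from the original samples, quantization errors do not accumulate, and each decoded sample stays within `step / 2` of its original.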
285	The Compression of IoT operational data time series in vehicle embedded systems Xing, Renzhi January 2018 This thesis examines compression algorithms for time series operational data collected from the Controller Area Network (CAN) bus in an automotive Internet of Things (IoT) setting. The purpose of a compression algorithm is to decrease the size of a set of time series data (such as vehicle speed, wheel speed, etc.) so that the data to be transmitted from the vehicle is small, thus decreasing the cost of transmission while potentially enabling better offboard data analysis. The project helped improve the quality of data collected by the data analysts and reduced the cost of data transmission. Since time series data compression mostly concerns data storage and transmission, the difficulty in this project was where to place the combination of data compression and transmission within the limited performance of the onboard embedded systems. These embedded systems have limited hardware and software resources, so the efficiency of the compression algorithm becomes very important. Additionally, there is a tradeoff between the compression ratio and real-time performance. Moreover, the error rate introduced by the compression algorithm must be smaller than an expected value. The compression algorithm contains two phases: (1) an online lossy compression algorithm, piecewise approximation, which shrinks the total number of data samples while maintaining a guaranteed precision, and (2) a lossless compression algorithm, Delta-XOR encoding, which compresses the output of the lossy algorithm. The algorithm was tested with four typical time series data samples from real CAN logs with different functions and properties. The similarities and differences between these logs are discussed. These differences helped to determine the algorithms that should be used.
After the experiments, which compared different algorithms and checked their performance, a simulation was implemented based on the experimental results. The results of this simulation show that the combined compression algorithm can meet a required compression ratio by controlling the error bound. Finally, the possibility of improving the compression algorithm in the future is discussed. / This thesis examines compression algorithms for time series operational data collected from a vehicle's CAN bus in an Internet of Things (IoT) context applied specifically to the automotive industry. The purpose of a compression algorithm is to reduce the size of a set of time series data (such as vehicle speed, wheel speed, etc.) so that the data to be transferred from the vehicle is small, thereby lowering the cost of transmission while enabling better data analysis outside the vehicle. The project helped improve the quality of data collected by data analysts and reduced the cost of data transmission. Since time series compression mainly concerns data storage and transmission, the difficulty in this project was locating the combination of data compression and transmission within the limited performance of the embedded systems. These embedded systems have limited resources (in terms of both hardware and software). The efficiency of the compression algorithm therefore becomes very important. In addition, there is a tradeoff between compression ratio and real-time performance. Furthermore, the error rate introduced by the compression algorithm must be smaller than a given threshold.
The compression algorithm in this thesis is called combined compression and contains two phases: (1) an online lossy algorithm that shrinks the total number of data samples while keeping the guaranteed error below a bounded level, and (2) a lossless compression algorithm that compresses the output of the first algorithm. The algorithm was tested with four typical time series data samples from real CAN logs with different functions and properties. The similarities and differences between these types are discussed. These differences helped to determine which algorithm should be chosen in each phase. After the experiments comparing the performance of different algorithms, a simulation is implemented based on the experimental results. The results of this simulation show that the combined compression algorithm can meet the need for a given compression ratio by controlling the error bound. Finally, the possibility of improving the compression algorithm in the future is discussed. Data compression Data transmission Time series data IoT CAN Vehicle connectivity Connected vehicles Computer and Information Sciences
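The two-phase pipeline described in entry 285 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: phase 1 is taken to be a greedy piecewise-constant approximation with an absolute error bound `eps`, and phase 2 is taken to be an XOR of consecutive 64-bit float bit patterns, which is one common reading of "Delta-XOR". All function names are hypothetical.

```python
import struct


def piecewise_constant(samples, eps):
    """Lossy phase: greedy (start_index, value) segments with |x - value| <= eps."""
    segments = []
    start, lo, hi = 0, samples[0], samples[0]
    for i in range(1, len(samples)):
        x = samples[i]
        new_lo, new_hi = min(lo, x), max(hi, x)
        if new_hi - new_lo > 2 * eps:
            # x does not fit: close the segment at its midrange value
            segments.append((start, (lo + hi) / 2))
            start, lo, hi = i, x, x
        else:
            lo, hi = new_lo, new_hi
    segments.append((start, (lo + hi) / 2))
    return segments


def reconstruct(segments, length):
    """Expand (start_index, value) segments back to one value per sample."""
    out = []
    for k, (start, value) in enumerate(segments):
        end = segments[k + 1][0] if k + 1 < len(segments) else length
        out.extend([value] * (end - start))
    return out


def float_bits(x):
    """IEEE 754 double as a 64-bit unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def bits_float(b):
    """Inverse of float_bits."""
    return struct.unpack("<d", struct.pack("<Q", b))[0]


def xor_encode(values):
    """Lossless phase: each bit pattern XORed with its predecessor (first one with 0)."""
    out, prev = [], 0
    for v in values:
        bits = float_bits(v)
        out.append(bits ^ prev)
        prev = bits
    return out


def xor_decode(codes):
    """Undo xor_encode by accumulating the XORs."""
    values, prev = [], 0
    for c in codes:
        prev ^= c
        values.append(bits_float(prev))
    return values


speeds = [10.0, 10.5, 11.0, 20.0, 20.5, 20.0, 30.0]
segments = piecewise_constant(speeds, eps=0.5)
codes = xor_encode([value for _, value in segments])
approx = reconstruct(segments, len(speeds))
```

The XOR step alone does not shrink anything. A practical encoder would follow it with a compact encoding of the leading and trailing zero bits that similar neighbouring values produce, which is omitted here.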
286	Experimental Study on Machine Learning with Approximation to Data Streams Jiang, Jiani January 2019 Real-time transfer of data streams enables many data analytics and machine learning applications in areas such as massive IoT and industrial automation. The large data volume of those streams is a significant burden or overhead not only for the transportation network but also for the corresponding application servers. Therefore, researchers focus on reducing the amount of data that needs to be transferred through data compression and approximation. Data compression techniques like lossy compression can significantly reduce data volume at the price of information loss. Meanwhile, how to do data compression is highly dependent on the corresponding applications. However, when the decompressed data is used in a data analysis application such as machine learning, the results may be affected by the information loss. In this paper, the author studies the impact of data compression on machine learning applications. In particular, from the experimental perspective, it shows the tradeoff among the approximation error bound, the compression ratio, and the prediction accuracy of multiple machine learning methods. The author believes that, with proper choices, data compression can dramatically reduce the amount of data transferred with limited impact on the machine learning applications. / Real-time transfer of data streams enables many data analytics and machine learning applications in areas such as massive IoT and industrial automation. The large data volume of these streams is a significant burden or overhead, not only for the transport network but also for the corresponding application servers. Researchers therefore focus on reducing the amount of data that needs to be transferred by means of data compression and approximation.
Data compression techniques such as lossy compression can reduce the data volume considerably at the price of information loss. At the same time, how data compression is done depends heavily on the corresponding applications. However, when decompressed data is used in a data analysis application such as machine learning, the results may be affected by the information loss. In this paper, the author studied the effects of data compression on machine learning applications. In particular, from an experimental perspective, it shows the tradeoff among the approximation error bound, the compression ratio, and the prediction accuracy of several machine learning methods. The author believes that, with the right choices, data compression can dramatically reduce the amount of data transferred with limited impact on the machine learning applications. Electrical Engineering and Electronics
287	Real-time Realistic Rendering And High Dynamic Range Image Display And Compression Xu, Ruifeng 01 January 2005 This dissertation focuses on the many issues that arise from the visual rendering problem. Of primary consideration is light transport simulation, which is known to be computationally expensive. Monte Carlo methods represent a simple and general class of algorithms often used for light transport computation. Unfortunately, the images resulting from Monte Carlo approaches generally suffer from visually unacceptable noise artifacts. The result of any light transport simulation is, by its very nature, an image of high dynamic range (HDR). This leads to the issues of the display of such images on conventional low dynamic range devices and the development of data compression algorithms to store and recover the corresponding large amounts of detail found in HDR images. This dissertation presents our contributions relevant to these issues. Our contributions to high dynamic range image processing include tone mapping and data compression algorithms. This research proposes and shows the efficacy of a novel level set based tone mapping method that preserves visual details in the display of high dynamic range images on low dynamic range display devices. The level set method is used to extract the high frequency information from HDR images. The details are then added to the range compressed low frequency information to reconstruct a visually accurate low dynamic range version of the image. Additional challenges associated with high dynamic range images include the requirements to reduce excessively large amounts of storage and transmission time. To alleviate these problems, this research presents two methods for efficient high dynamic range image data compression. One is based on the classical JPEG compression.
It first converts the raw image into RGBE representation, and then sends the color base and common exponent to classical discrete cosine transform based compression and lossless compression, respectively. The other is based on the wavelet transformation. It first transforms the raw image data into the logarithmic domain, then quantizes the logarithmic data into the integer domain, and finally applies the wavelet based JPEG2000 encoder for entropy compression and bit stream truncation to meet the desired bit rate requirement. We believe that these and similar contributions will make a wide application of high dynamic range images possible. The contributions to light transport simulation include Monte Carlo noise reduction, dynamic object rendering and complex scene rendering. Monte Carlo noise is an inescapable artifact in synthetic images rendered using stochastic algorithms. This dissertation proposes two noise reduction algorithms to obtain high quality synthetic images. The first one models the distribution of noise in the wavelet domain using a Laplacian function, and then suppresses the noise using a Bayesian method. The other extends the bilateral filtering method to reduce all types of Monte Carlo noise in a unified way. All our methods reduce Monte Carlo noise effectively. Rendering of dynamic objects adds a further dimension to the expensive light transport simulation issue. This dissertation presents a pre-computation based method. It pre-computes the surface radiance for each basis lighting and animation key frame, and then renders the objects by synthesizing the pre-computed data in real-time. Realistic rendering of complex scenes is computationally expensive. This research proposes a novel 3D space subdivision method, which leads to a new rendering framework. The light is first distributed to each local region to form local light fields, which are then used to illuminate the local scenes.
The method allows us to render complex scenes at interactive frame rates. Rendering has important applications in mixed reality. Consistent lighting and shadows between real scenes and virtual scenes are important features of visual integration. The dissertation proposes to render the virtual objects by irradiance rendering using live captured environmental lighting. This research also introduces a virtual shadow generation method that computes shadows cast by virtual objects onto the real background. We finally conclude the dissertation by discussing a number of future directions for rendering research, and presenting our proposed approaches. high dynamic range image data compression tone mapping Monte Carlo noise dynamic scene complex scene real-time rendering realistic rendering global illumination Computer Sciences Engineering
288	Conflict Detection-Based Run-Length Encoding: AVX-512 CD Instruction Set in Action Lehner, Wolfgang, Ungethum, Annett, Pietrzyk, Johannes, Damme, Patrick, Habich, Dirk 18 January 2023 Data as well as hardware characteristics are two key aspects for efficient data management. This holds in particular for the field of in-memory data processing. Aside from increasing main memory capacities, efficient in-memory processing benefits from novel processing concepts based on lightweight compressed data. Thus, an active research field deals with the adaptation of new hardware features such as vectorization using SIMD instructions to speed up lightweight data compression algorithms. Following this trend, we propose a novel approach for run-length encoding, a well-known and often applied lightweight compression technique. Our novel approach is based on newly introduced conflict detection (CD) instructions in Intel's AVX-512 instruction set extension. As we are going to show, our CD-based approach has unique properties and outperforms the state-of-the-art RLE approach for data sets with small run lengths. info:eu-repo/classification/ddc/005 ddc:005
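For context on entry 288, run-length encoding replaces each run of equal consecutive values with a (value, run length) pair. The sketch below is a plain scalar version for illustration only. It does not use SIMD or the AVX-512 conflict detection instructions the paper builds on, and the names `rle_encode` and `rle_decode` are hypothetical.

```python
def rle_encode(values):
    """Collapse runs of equal consecutive values into (value, length) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1      # extend the current run
        else:
            runs.append([v, 1])   # start a new run
    return [(v, n) for v, n in runs]


def rle_decode(runs):
    """Expand (value, length) pairs back into the original sequence."""
    out = []
    for v, n in runs:
        out.extend([v] * n)
    return out


column = [7, 7, 7, 2, 2, 9, 9, 9, 9, 4]
runs = rle_encode(column)       # [(7, 3), (2, 2), (9, 4), (4, 1)]
restored = rle_decode(runs)
```

With short runs, as in this example, the pair list is barely smaller than the input and the per-element branch dominates the cost, which is the regime the abstract singles out.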
289	A Benchmark Framework for Data Compression Techniques Damme, Patrick, Habich, Dirk, Lehner, Wolfgang 03 February 2023 Lightweight data compression is frequently applied in main memory database systems to improve query performance. The data processed by such systems is highly diverse. Moreover, there is a high number of existing lightweight compression techniques. Therefore, choosing the optimal technique for a given dataset is non-trivial. Existing approaches are based on simple rules, which do not suffice for such a complex decision. In contrast, our vision is a cost-based approach. However, this requires a detailed cost model, which can only be obtained from a systematic benchmarking of many compression algorithms on many different datasets. A naïve benchmark evaluates every algorithm under consideration separately. This yields many redundant steps and is thus inefficient. We propose an efficient and extensible benchmark framework for compression techniques. Given an ensemble of algorithms, it minimizes the overall run time of the evaluation. We experimentally show that our approach outperforms the naïve approach. info:eu-repo/classification/ddc/004 ddc:004
290	Computer Graphics and Visualization based Analysis and Record System for Hand Surgery and Therapy Practice Gokavarapu, Venkatamanikanta Subrahmanyakartheek 27 May 2016 No description available. Computer Engineering Computer Science Hand therapy Leap Motion Hand Wound Area Hand Surface Area Visualization Data compression Electronic Health Record Graphics based hand model
