  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
71

Approximate Data Analytics Systems

Le Quoc, Do 22 March 2018 (has links) (PDF)
Today, most modern online services make use of big data analytics systems to extract useful information from raw digital data. The data normally arrives as a continuous stream at high speed and in huge volumes, and the cost of handling this massive data can be significant. Providing interactive latency when processing the data is often impractical because the data grows exponentially, even faster than Moore’s law predicts. To overcome this problem, approximate computing has recently emerged as a promising solution. Approximate computing is based on the observation that many modern applications are amenable to an approximate, rather than an exact, output. Unlike traditional computing, approximate computing tolerates lower accuracy to achieve lower latency by computing over a subset of the input data instead of the entire dataset. Unfortunately, advancements in approximate computing are primarily geared towards batch analytics and cannot provide low-latency guarantees for stream processing, where new data continuously arrives as an unbounded stream. In this thesis, we design and implement approximate computing techniques for processing and interacting with high-speed and large-scale stream data to achieve low latency and efficient utilization of resources. To achieve these goals, we designed and built the following approximate data analytics systems:
• StreamApprox — a data stream analytics system for approximate computing. This system supports approximate computing for low-latency stream analytics in a transparent way and is able to adapt to rapid fluctuations of input data streams. For this system, we designed an online adaptive stratified reservoir sampling algorithm that produces approximate output with bounded error.
• IncApprox — a data analytics system for incremental approximate computing. This system combines approximate and incremental computing in stream processing to achieve high throughput and low latency with efficient resource utilization. For this system, we designed an online stratified sampling algorithm that uses self-adjusting computation to produce an incrementally updated approximate output with bounded error.
• PrivApprox — a data stream analytics system for privacy-preserving and approximate computing. This system supports high-utility, low-latency data analytics while preserving users’ privacy, and it is based on the combination of privacy-preserving data analytics and approximate computing.
• ApproxJoin — an approximate distributed join system. This system improves the performance of joins, which are critical but expensive operations in big data systems. In this system, we employ a sketching technique (Bloom filters) to avoid shuffling non-joinable data items over the network, and we propose a novel sampling mechanism that executes during the join to obtain an unbiased, representative sample of the join output.
Our evaluation, based on micro-benchmarks and real-world case studies, shows that these systems achieve significant performance speedups over state-of-the-art systems while tolerating negligible accuracy loss in the analytics output. In addition, our systems allow users to systematically trade accuracy for throughput/latency, and they require no or only minor modifications to existing applications.
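The abstract's core technique is online stratified reservoir sampling: keep a fixed-size random sample per stratum of an unbounded stream and aggregate over the samples. The following minimal Python sketch is an illustration of that general idea only, not StreamApprox's actual implementation; the class and method names are hypothetical.

```python
import random
from collections import defaultdict

class StratifiedReservoirSampler:
    """Minimal sketch: one fixed-size reservoir per stratum (illustrative only)."""

    def __init__(self, reservoir_size):
        self.reservoir_size = reservoir_size
        self.reservoirs = defaultdict(list)   # stratum -> sampled items
        self.seen = defaultdict(int)          # stratum -> items observed so far

    def add(self, stratum, value):
        """Standard reservoir-sampling update, applied independently per stratum."""
        self.seen[stratum] += 1
        reservoir = self.reservoirs[stratum]
        if len(reservoir) < self.reservoir_size:
            reservoir.append(value)
        else:
            j = random.randrange(self.seen[stratum])
            if j < self.reservoir_size:
                reservoir[j] = value

    def estimate_sum(self):
        """Scale each stratum's sample mean by the number of items seen in that stratum."""
        total = 0.0
        for stratum, reservoir in self.reservoirs.items():
            if reservoir:
                total += (sum(reservoir) / len(reservoir)) * self.seen[stratum]
        return total

# Usage: approximate the total of a stream keyed by source.
sampler = StratifiedReservoirSampler(reservoir_size=100)
for stratum, value in [("sensor-a", 1.0), ("sensor-b", 3.5), ("sensor-a", 2.0)]:
    sampler.add(stratum, value)
print(sampler.estimate_sum())
```

Memory stays bounded regardless of stream length, which is what makes low-latency approximate aggregation over an unbounded stream feasible.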
73

Use of Approximate Triple Modular Redundancy for Fault Tolerance in Digital Circuits

Albandes, Iuri 26 November 2018 (has links)
Triple modular redundancy (TMR) is a well-known fault-mitigation technique that provides strong protection against single faults, but at a high cost in area and power consumption. For this reason, partial redundancy is often applied to reduce these overheads. In this context, approximate TMR (ATMR), which implements the triple redundancy with approximate versions of the circuit to be protected, has emerged in recent years as an alternative to partial replication, with the advantage of achieving better trade-offs between fault coverage and overhead. Several techniques for generating approximate circuits have already been proposed in the literature, each with its pros and cons. This work studies the ATMR technique, evaluating the cost-benefit relation between the increase in resources (area) and the fault coverage. The first contribution is a new ATMR approach in which all redundant modules are approximate versions of the original design, enabling the generation of ATMR circuits with very low area overhead; this technique is called Full-ATMR (FATMR). The work also presents a second approach that implements ATMR automatically by combining a library of approximate gates (ApxLib) and a multi-objective genetic algorithm (MOOGA). The algorithm performs a blind search over the immense solution space, jointly optimizing fault coverage and area overhead. Experiments comparing our approach with state-of-the-art techniques show improved trade-offs for different benchmark circuits.
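TMR protects a circuit by running three copies and voting on their outputs; ATMR replaces some or all copies with approximate versions. The following toy Python sketch (not taken from the thesis) shows bit-wise majority voting and how a single faulty module is masked.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bit-wise 2-out-of-3 majority: the voter at the heart of TMR."""
    return (a & b) | (a & c) | (b & c)

# Three redundant modules computing the same 4-bit output.
golden = 0b1011            # fault-free output
faulty = golden ^ 0b0100   # one module hit by a single bit-flip

# A single faulty module is masked by the other two copies.
assert majority_vote(golden, golden, faulty) == golden

# With approximate modules (ATMR), a copy may differ from the exact design on
# some inputs; a fault is masked only where at least two copies still agree,
# which is the coverage/area trade-off the thesis explores.
approx = golden            # hypothetical approximate copy that matches on this input
assert majority_vote(approx, golden, faulty) == golden
```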
74

Accelerating Monte Carlo methods for Bayesian inference in dynamical models

Dahlin, Johan January 2016 (has links)
Making decisions and predictions from noisy observations are two important and challenging problems in many areas of society. Some examples of applications are recommendation systems for online shopping and streaming services, connecting genes with certain diseases, and modelling climate change. In this thesis, we make use of Bayesian statistics to construct probabilistic models given prior information and historical data, which can be used for decision support and predictions. The main obstacle with this approach is that it often results in mathematical problems lacking analytical solutions. To cope with this, we make use of statistical simulation algorithms known as Monte Carlo methods to approximate the intractable solution. These methods enjoy well-understood statistical properties but are often computationally prohibitive to employ. The main contribution of this thesis is the exploration of different strategies for accelerating inference methods based on sequential Monte Carlo (SMC) and Markov chain Monte Carlo (MCMC), that is, strategies for reducing the computational effort while keeping or improving the accuracy. A major part of the thesis is devoted to proposing such strategies for the MCMC method known as the particle Metropolis-Hastings (PMH) algorithm. We investigate two strategies: (i) introducing estimates of the gradient and Hessian of the target to better tailor the algorithm to the problem, and (ii) introducing a positive correlation between the point-wise estimates of the target. Furthermore, we propose an algorithm based on the combination of SMC and Gaussian process optimisation, which can provide reasonable estimates of the posterior with a significant decrease in computational effort compared with PMH. Moreover, we explore the use of sparseness priors for approximate inference in over-parametrised mixed effects models and autoregressive processes. This can potentially be a practical strategy for inference in the big data era. Finally, we propose a general method for increasing the accuracy of the parameter estimates in non-linear state space models by applying a designed input signal. / Should the Riksbank raise or lower the repo rate at its next meeting in order to reach the inflation target? Which genes are associated with a certain disease? How can Netflix and Spotify know which films and which music I want to watch and listen to next? These three problems are examples of questions where statistical models can be useful for providing support and a basis for decisions. Statistical models combine theoretical knowledge about, for example, the Swedish economic system with historical data to produce forecasts of future events. These forecasts can then be used to evaluate, for example, what would happen to inflation in Sweden if unemployment fell, or how the value of my pension savings changes when the Stockholm stock exchange crashes. Applications like these, and many others, make statistical models important for many parts of society. One way of constructing statistical models is to continuously update a model as more information is collected. This approach is called Bayesian statistics and is particularly useful when one already has good insight into the model or access to only a small amount of historical data for building it. A drawback of Bayesian statistics is that the computations required to update the model with the new information are often very complicated.
In such situations one can instead simulate the outcomes of millions of variants of the model and compare them against the historical observations at hand. One can then average over the variants that gave the best results and thereby obtain a final model. It can therefore sometimes take days or weeks to build a model. The problem becomes especially severe when more advanced models are used, which could give better forecasts but take too long to build. In this thesis we use a number of different strategies to simplify or improve these simulations. For example, we propose taking more insights about the system into account and thereby reducing the number of model variants that need to be examined; certain models can be ruled out from the start because we have a good idea of roughly what a good model should look like. We can also change the simulation so that it moves more easily between different types of models, so that the space of all possible models is explored more efficiently. We propose a number of combinations and modifications of existing methods to speed up the fitting of the model to the observations, and we show that the computation time in some cases can be reduced from several days to about an hour. Hopefully this will in the future make it practical to use more advanced models, which in turn will lead to better forecasts and decisions.
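The particle Metropolis-Hastings algorithm discussed above accepts or rejects proposed parameters using a noisy, particle-filter estimate of the likelihood. As a simplified illustration of that pseudo-marginal idea only, the following Python sketch uses a placeholder noisy likelihood estimator and a toy Gaussian model rather than the thesis's actual state space models and particle filter.

```python
import math
import random

def log_likelihood_estimate(theta, data, num_samples=100):
    """Placeholder noisy log-likelihood estimator standing in for a particle filter."""
    # Toy model: data treated as i.i.d. Normal(theta, 1); Monte Carlo-style noise added
    # only to mimic the noisy estimates that PMH works with.
    exact = sum(-0.5 * (x - theta) ** 2 - 0.5 * math.log(2 * math.pi) for x in data)
    return exact + random.gauss(0.0, 1.0 / math.sqrt(num_samples))

def pseudo_marginal_mh(data, num_iters=5000, step=0.5):
    theta = 0.0
    log_like = log_likelihood_estimate(theta, data)
    samples = []
    for _ in range(num_iters):
        proposal = theta + random.gauss(0.0, step)        # random-walk proposal
        prop_log_like = log_likelihood_estimate(proposal, data)
        # Metropolis acceptance using the *estimated* log-likelihoods (flat prior assumed).
        accept_prob = math.exp(min(0.0, prop_log_like - log_like))
        if random.random() < accept_prob:
            theta, log_like = proposal, prop_log_like      # keep the estimate with the state
        samples.append(theta)
    return samples

data = [random.gauss(1.5, 1.0) for _ in range(50)]
draws = pseudo_marginal_mh(data)
print(sum(draws[1000:]) / len(draws[1000:]))  # posterior mean estimate, roughly 1.5
```

The strategies in the thesis, such as gradient- and Hessian-informed proposals and correlated likelihood estimates, aim to make loops of exactly this shape mix faster or need fewer simulations per iteration.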
75

Variabilité démographique et adaptation de la gestion aux changements climatiques en forêt de montagne : calibration par Calcul Bayésien Approché et projection avec le modèle Samsara2 / Demographic variability and adaptation of mountain forest management to climate change : calibration by Approximate Bayesian Computation and projection with the Samsara2 model

Lagarrigues, Guillaume 16 December 2016 (has links)
Mountain beech-fir-spruce forests appear particularly threatened by global warming. To anticipate the future dynamics of these forests and adapt silviculture to the new conditions, a better knowledge of the environmental factors affecting the demographics of these species is needed. We addressed this issue by combining historical management data, the forest dynamics model Samsara2, and a calibration method based on Approximate Bayesian Computation, which allowed us to study the different demographic processes in these forests jointly. Our analyses show that forest demographics can vary strongly between stands and that climate is not always decisive in explaining these variations. Thus, despite the expected climate change, the uneven-aged management currently practised should make it possible to maintain the services provided by mixed stands in mesic conditions, whereas pure spruce forests and low-elevation stands could be strongly affected.
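Approximate Bayesian Computation, the calibration method named above, sidesteps an intractable likelihood by simulating data from candidate parameters and keeping those whose simulations resemble the observations. The following minimal rejection-ABC sketch in Python is generic and illustrative only; in the thesis the forward simulator is Samsara2 and the data are forest inventories, not this toy Gaussian model.

```python
import random

def simulate(theta, n=200):
    """Stand-in forward simulator; in the thesis this role is played by Samsara2."""
    return [random.gauss(theta, 1.0) for _ in range(n)]

def summary(data):
    """Summary statistic compared between observed and simulated data."""
    return sum(data) / len(data)

def abc_rejection(observed, num_draws=10000, tolerance=0.1):
    obs_stat = summary(observed)
    accepted = []
    for _ in range(num_draws):
        theta = random.uniform(-5.0, 5.0)            # draw a candidate from the prior
        sim_stat = summary(simulate(theta))
        if abs(sim_stat - obs_stat) < tolerance:     # keep parameters that reproduce the data
            accepted.append(theta)
    return accepted                                   # approximate posterior sample

observed = [random.gauss(2.0, 1.0) for _ in range(200)]
posterior = abc_rejection(observed)
print(len(posterior), sum(posterior) / max(len(posterior), 1))
```

The accepted parameters form an approximate posterior sample; tightening the tolerance trades computation time for accuracy, the same trade-off that drives the calibration of demographic parameters in the thesis.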
76

Controle hierárquico para a equação do calor via estratégia Stackelberg-Nash / Hierarchical control for the heat equation via the Stackelberg-Nash strategy

Albuquerque, Islanita Cecília Alcantara de 29 September 2011 (has links)
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / The main topic of this work is hierarchical control, which consists of a leader-and-followers system. In particular, we study the approximate controllability of the heat equation under the Stackelberg-Nash strategy, which aims to control the whole system through choices of local controls at the minimum possible cost.
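For readers unfamiliar with the setting, a standard Stackelberg-Nash formulation for the heat equation looks roughly as follows; the notation is illustrative and assumed here, not quoted from the dissertation. A leader control f and two follower controls v1, v2 act on sub-domains of the heat equation; for each fixed f, the followers reach a Nash equilibrium of their cost functionals, and the leader is then chosen so that the final state can be driven arbitrarily close to any target (approximate controllability).

```latex
% Illustrative Stackelberg--Nash setup for the heat equation (assumed notation).
\begin{aligned}
& y_t - \Delta y = f\,\mathbf{1}_{\omega} + v_1\,\mathbf{1}_{\omega_1} + v_2\,\mathbf{1}_{\omega_2}
  \quad \text{in } \Omega \times (0,T), \qquad
  y = 0 \ \text{on } \partial\Omega \times (0,T), \qquad y(\cdot,0) = y_0, \\
& J_i(f; v_1, v_2) = \frac{\alpha_i}{2} \int_0^T\!\!\int_{\omega_{i,d}} |y - y_{i,d}|^2 \,dx\,dt
  + \frac{\mu_i}{2} \int_0^T\!\!\int_{\omega_i} |v_i|^2 \,dx\,dt, \qquad i = 1,2, \\
& \text{followers: } (v_1, v_2) \text{ is a Nash equilibrium of } (J_1, J_2) \text{ for fixed } f; \\
& \text{leader: choose } f \text{ so that } y(\cdot,T) \text{ lies within any prescribed } \varepsilon \text{ of a target } y_T \text{ in } L^2(\Omega).
\end{aligned}
```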
77

Practical Implementations Of The Active Set Method For Support Vector Machine Training With Semi-definite Kernels

Sentelle, Christopher 01 January 2014 (has links)
The Support Vector Machine (SVM) is a popular binary classification model due to its superior generalization performance, relative ease of use, and applicability of kernel methods. SVM training entails solving an associated quadratic programming (QP) problem that presents significant challenges in terms of speed and memory constraints for very large datasets; therefore, research on numerical optimization techniques tailored to SVM training is vast. Slow training times are especially of concern when one considers that re-training is often necessary at several values of the model's regularization parameter, C, as well as associated kernel parameters. The active set method is suitable for solving the SVM problem and is in general ideal when the Hessian is dense and the solution is sparse, which is the case for the ℓ1-loss SVM formulation. There has recently been renewed interest in the active set method as a technique for exploring the entire SVM regularization path, which has been shown to compute the SVM solution at all points along the regularization path (all values of C) in not much more time than it takes, on average, to perform training at a single value of C with traditional methods. Unfortunately, the majority of active set implementations used for SVM training require positive definite kernels, and those implementations that do allow semi-definite kernels tend to be complex and can exhibit instability and, worse, lack of convergence. This severely limits applicability since it precludes the use of the linear kernel, can be an issue when duplicate data points exist, and does not allow the use of low-rank kernel approximations to improve tractability for large datasets. The difficulty, in the case of a semi-definite kernel, arises when a particular active set results in a singular KKT matrix (or the equality-constrained problem formed using the active set is semi-definite). Typically this is handled by explicitly detecting the rank of the KKT matrix. Unfortunately, this adds significant complexity to the implementation, and, if care is not taken, numerical instability or, worse, failure to converge can result. This research shows that the singular KKT system can be avoided altogether with simple modifications to the active set method. The result is a practical, easy-to-implement active set method that does not need to explicitly detect the rank of the KKT matrix nor modify factorization or solution methods based upon the rank. Simple and numerically stable methods are given both for conventional SVM training and for computing the regularization path. First, an efficient revised simplex method (SVM-RSQP) is implemented for SVM training with semi-definite kernels and is shown to outperform competing active set implementations in terms of training time, while performing on par with state-of-the-art SVM training algorithms such as SMO and SVMLight. Next, a new regularization path-following algorithm for semi-definite kernels (Simple SVMPath) is shown to be orders of magnitude faster, more accurate, and significantly less complex than competing methods, and it does not require the use of external solvers. Theoretical analysis reveals new insights into the nature of the path-following algorithms.
Finally, a method is given for computing the approximate regularization path and approximate kernel path using the warm-start capability of the proposed revised simplex method (SVM-RSQP), and it is shown to provide significant, orders-of-magnitude speed-ups relative to the traditional grid search in which re-training is performed at each parameter value. Surprisingly, it is also shown that even when the solution for the entire path is not desired, computing the approximate path can serve as a speed-up mechanism for obtaining the solution at a single value. New insights are given concerning the limiting behaviors of the regularization and kernel paths as well as the use of low-rank kernel approximations.
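The difficulty described above, a singular KKT system when the kernel matrix is only positive semi-definite, is easy to reproduce. The small Python check below (illustrative only, not code from the dissertation) shows that a linear kernel on low-dimensional or duplicated data yields a rank-deficient Gram matrix, which is exactly the situation a robust active set method must tolerate.

```python
import numpy as np

# Five training points in 2-D; with a linear kernel the Gram matrix K = X X^T
# has rank at most 2, so it is positive semi-definite but singular.
X = np.array([[1.0, 2.0],
              [2.0, 4.0],   # scalar multiple of the first point
              [0.5, 1.0],
              [3.0, 0.0],
              [1.0, 2.0]])  # exact duplicate of the first point
K = X @ X.T

eigenvalues = np.linalg.eigvalsh(K)           # ascending order
print("rank of K:", np.linalg.matrix_rank(K)) # 2, not 5
print("smallest eigenvalues:", eigenvalues[:3])  # numerically zero

# An equality-constrained active-set step solves a KKT system of the form
#   [ K_AA  1 ] [alpha]   [rhs]
#   [ 1^T   0 ] [  b  ] = [ y  ]
# and when K_AA is singular a naive factorization breaks down -- the case the
# dissertation's modified active set method is designed to avoid.
```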
78

Message Passing Approaches to Compressive Inference Under Structured Signal Priors

Ziniel, Justin A. January 2014 (has links)
No description available.
79

From Magnitudes to Math: Developmental Precursors of Quantitative Reasoning

Starr, Ariel January 2015 (has links)
The uniquely human mathematical mind sets us apart from all other animals. Although humans typically think about number symbolically, we also possess nonverbal representations of quantity that are present at birth and shared with many other animal species. These primitive numerical representations are thought to arise from an evolutionarily ancient system termed the Approximate Number System (ANS). The present dissertation aims to determine how these preverbal representations of quantity may serve as the foundation for more complex quantitative reasoning abilities. To this end, the five studies contained herein investigate the relations between representations of number, representations of other magnitude dimensions, and symbolic math proficiency in infants, children, and adults. The first empirical study, described in Chapter 2, investigated whether infants engage the ANS to represent the full range of natural numbers. The study presented in Chapter 3 compared infants' acuity for detecting changes in contour length to their acuity for detecting changes in number to assess whether representations of continuous quantities are primary to representations of number in infancy. The study presented in Chapter 4 compared individual differences in acuity for number, line length, and brightness in children and adults to determine how the relations between these magnitudes may change over development. Chapter 5 contains a longitudinal study investigating the relation between preverbal number sense in infancy and symbolic math abilities in preschool-aged children. Finally, the study presented in Chapter 6 investigated the mechanisms underlying the maturation of the number sense and determined which features of the number sense are predictive of symbolic math skill. Taken together, these findings confirm that number is a salient feature of the environment for infants and young children and suggest that approximate number representations are fundamental for the acquisition of symbolic math. / Dissertation
80

Variational Estimators in Statistical Multiscale Analysis

Li, Housen 17 February 2016 (has links)
No description available.
