71

Multidimensional flow mapping for proportional valves

Sitte, André, Koch, Oliver, Liu, Jianbin, Tautenhahn, Ralf, Weber, Jürgen 25 June 2020
Inverse, multidimensional input-output flow mapping is essential for using valves in precision motion control applications. Because cartridge valves have highly nonlinear characteristics and an uncertain model structure, modelling their flow mappings is hard to formulate as a simple parameter estimation problem. This contribution conducts a comprehensive analysis and validation of three- and four-dimensional input-output mapping approaches for proportional pilot-operated seat valves. To this end, a virtual and a physical test-rig setup are used for initial measurement, implementation and assessment. After the valve under consideration has been modelled and validated as a function of flow, pressure and temperature, different mapping methods are investigated. More specifically, state-of-the-art approaches, deep-learning methods and a newly developed approach (extPoly) are examined. ANNs and polynomials in particular show reasonable approximation results even for more than two inputs. However, the results depend strongly on the structure and distribution of the input data points. Besides the identification effort, invertibility was also investigated.
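To make the notion of an inverse flow mapping concrete, the sketch below inverts a forward map numerically. The forward map `flow`, its exponent and the gain `k` are hypothetical placeholders chosen only because they are monotone in the valve command; they are not the valve model or the extPoly approach from the paper.

```python
import math

def flow(u, dp, k=2.0):
    """Hypothetical forward map: flow for a valve command u in [0, 1]
    and a pressure drop dp >= 0 (illustrative, not a measured valve)."""
    return k * u ** 1.5 * math.sqrt(dp)

def inverse_flow(q_target, dp, lo=0.0, hi=1.0, tol=1e-9):
    """Find the command u with flow(u, dp) == q_target by bisection.
    This only works because flow is monotonically increasing in u."""
    if not flow(lo, dp) <= q_target <= flow(hi, dp):
        raise ValueError("target flow is outside the reachable range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if flow(mid, dp) < q_target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Which command yields a flow of 3.0 at a pressure drop of 9.0?
u_cmd = inverse_flow(q_target=3.0, dp=9.0)
```

Bisection is used here because it needs nothing beyond monotonicity; a fitted polynomial or ANN map would need the same property (or a restricted domain) before it could be inverted this way.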
72

Data-driven modeling and simulation of spatiotemporal processes with a view toward applications in biology

Maddu Kondaiah, Suryanarayana 11 January 2022
Mathematical modeling and simulation have emerged as a fundamental means of understanding the physical processes around us, with countless real-world applications in applied science and engineering. However, heavy reliance on first principles, symmetry relations, and conservation laws has limited their applicability to a few scientific domains and even fewer real-world scenarios. Especially in disciplines like biology, the underlying living constituents exhibit a myriad of complexities such as non-linearities, non-equilibrium physics, self-organization and plasticity that routinely escape mathematical treatment based on governing laws. Meanwhile, recent decades have witnessed rapid advancement in computing hardware, sensing technologies, and algorithmic innovations in machine learning. This progress has helped propel data-driven paradigms to unprecedented practical success in fields such as image processing and computer vision, natural language processing, and autonomous transport. In this thesis, we explore, apply, and advance statistical and machine learning strategies that help bridge the gap between data and mathematical models, with a view toward modeling and simulation of spatiotemporal processes in biology. First, we address the problem of learning interpretable mathematical models of biological processes from limited and noisy data. For this, we propose a statistical learning framework called PDE-STRIDE, based on the theory of stability selection and ℓ0-based sparse regularization for parsimonious model selection. The PDE-STRIDE framework enables model learning with relaxed dependencies on tuning parameters, sample size and noise levels. We demonstrate the practical applicability of our method on real-world data by considering a purely data-driven re-evaluation of the advective triggering hypothesis explaining the embryonic patterning event in the C. elegans zygote.
As a natural next step, we extend our PDE-STRIDE framework to leverage prior knowledge from physical principles in order to learn biologically plausible and physically consistent models rather than models that simply fit the data best. For this, we modify the PDE-STRIDE framework to handle structured sparsity constraints for grouping features, which enables us to: 1) enforce conservation laws, 2) extract spatially varying non-observables, 3) encode symmetry relations associated with the underlying biological process. We show several applications from systems biology demonstrating the claim that enforcing priors dramatically enhances the robustness and consistency of data-driven approaches. In the following part, we apply our statistical learning framework to learning mean-field deterministic equations of active matter systems directly from stochastic self-propelled active particle simulations. We investigate two examples of particle models that differ in the microscopic interaction rules being used. First, we consider a self-propelled particle model endowed with density-dependent motility. For the chosen hydrodynamic variables, our data-driven framework learns continuum partial differential equations that are in excellent agreement with analytically derived coarse-grained equations from the Boltzmann approach. In addition, our structured sparsity framework is able to decode the hidden dependency between particle speed and the local density intrinsic to the self-propelled particle model. As a second example, the learning framework is applied to coarse-graining a popular stochastic particle model employed for studying collective cell motion in epithelial sheets. The PDE-STRIDE framework is able to infer a novel PDE model that quantitatively captures the flow statistics of the particle model in the regime of low density fluctuations.
Modern microscopy techniques produce gigabytes (GB) and terabytes (TB) of data while imaging the spatiotemporal developmental dynamics of living organisms. However, classical statistical learning based on penalized linear regression models struggles with issues such as the accurate computation of derivatives in the candidate library and with computational scalability when applied to "big" and noisy data sets. For this reason, we exploit the rich parameterization of neural networks, which can learn efficiently from large data sets. Specifically, we explore the framework of Physics-Informed Neural Networks (PINNs), which allow for seamless integration of physics priors with measurement data. We propose novel strategies for multi-objective optimization that allow for adapting the PINN architecture to multi-scale modeling problems arising in biology. We showcase application examples for both forward and inverse modeling of the mesoscale active turbulence phenomenon observed in dense bacterial suspensions. Employing our strategies, we demonstrate orders-of-magnitude gains in accuracy and convergence in comparison with the conventional formulation for solving multi-objective optimization in PINNs. In the concluding chapter of the thesis, we set aside model interpretability and focus on learning computable models directly from noisy data for the purpose of pure dynamics forecasting. We propose STENCIL-NET, an artificial neural network architecture that learns a solution-adaptive spatial discretization of an unknown PDE model that can be stably integrated in time with negligible loss in accuracy. To support this claim, we present numerical experiments on long-term forecasting of chaotic PDE solutions on coarse spatio-temporal grids, and also showcase a de-noising application that helps separate spatiotemporal dynamics from the noise in an equation-free manner.
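The sparse-regression idea that this abstract builds on (selecting a few active terms from a candidate library) can be illustrated on a toy problem. The sketch below is a plain least-squares fit followed by one hard-thresholding step, a simple ℓ0-flavoured heuristic; it is not PDE-STRIDE and contains no stability selection. The two-term library, the threshold and the test signal u(t) = exp(-2t) are all assumptions made for illustration.

```python
import math

def fit_two_terms(f1, f2, y):
    """Least squares for y ~ c1*f1 + c2*f2 via the 2x2 normal equations."""
    a11 = sum(a * a for a in f1)
    a12 = sum(a * b for a, b in zip(f1, f2))
    a22 = sum(b * b for b in f2)
    b1 = sum(a * t for a, t in zip(f1, y))
    b2 = sum(b * t for b, t in zip(f2, y))
    det = a11 * a22 - a12 * a12
    return [(b1 * a22 - b2 * a12) / det, (a11 * b2 - a12 * b1) / det]

def fit_one_term(f, y):
    """Least squares for y ~ c*f."""
    return sum(a * t for a, t in zip(f, y)) / sum(a * a for a in f)

def sparse_fit(f1, f2, y, threshold=0.1):
    """Fit both terms, zero out small coefficients, refit the survivor."""
    c = fit_two_terms(f1, f2, y)
    keep = [abs(ci) >= threshold for ci in c]
    if all(keep):
        return c
    if keep[0]:
        return [fit_one_term(f1, y), 0.0]
    if keep[1]:
        return [0.0, fit_one_term(f2, y)]
    return [0.0, 0.0]

# Toy data: u(t) = exp(-2t), which satisfies du/dt = -2*u.
h = 0.05
u = [math.exp(-2.0 * h * i) for i in range(41)]
# Central finite differences on the interior points approximate du/dt.
du = [(u[i + 1] - u[i - 1]) / (2.0 * h) for i in range(1, 40)]
u_in = u[1:40]
library_u = u_in                      # candidate term u
library_u2 = [v * v for v in u_in]    # candidate term u^2
coeffs = sparse_fit(library_u, library_u2, du)
```

On this noise-free signal the u^2 term is pruned and the remaining coefficient is close to -2, with a small bias from the finite-difference derivative. With noisy data a single threshold is fragile, which is the kind of sensitivity the abstract says stability selection is meant to relax.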
73

Neurodynamische Module zur Bewegungssteuerung autonomer mobiler Roboter (Neurodynamic modules for the motion control of autonomous mobile robots)

Hild, Manfred 07 January 2008
This thesis investigates how well recurrent neural networks are suited to the motion control of autonomous robots. It first discusses oscillators for four-legged robots, then homeostatic ring modules for segmented robots, and finally monostable neural modules that can drive complex motion sequences on robots with many degrees of freedom. The mathematical theory of the neural modules is given equal weight with their practical implementation on real robot platforms. This includes their functional embedding into the overall system as well as concrete aspects of the underlying hardware, such as computational accuracy, temporal resolution and the influence of the materials used. Interesting electronic circuit principles are discussed in detail. Overall, the thesis contains all the theoretical and practical information needed to equip individual robot systems with an appropriate motion controller. A further aim of the thesis is to open up a new approach to the theory of recurrent neural networks from the perspective of classical engineering science. Selective comparisons of the neural modules with analog electronic circuits, physical models, and digital signal processing algorithms can ease the understanding of neural dynamics.
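As a rough illustration of a recurrent neural oscillator, the sketch below iterates a two-neuron discrete-time network whose weight matrix is a scaled rotation. This is a generic construction assumed here for illustration; the abstract does not specify the oscillator design, and the angle `phi`, the `gain` and the initial state are arbitrary choices.

```python
import math

def two_neuron_oscillator(steps, phi=0.5, gain=1.1, x0=(0.01, 0.0)):
    """Iterate x <- tanh(W x) with W = gain * rotation(phi).
    A gain slightly above 1 makes the resting state unstable, while
    tanh keeps both outputs bounded inside (-1, 1)."""
    w_self = gain * math.cos(phi)   # self-coupling of each neuron
    w_cross = gain * math.sin(phi)  # cross-coupling between the neurons
    x, y = x0
    trace = []
    for _ in range(steps):
        x, y = (math.tanh(w_self * x + w_cross * y),
                math.tanh(-w_cross * x + w_self * y))
        trace.append((x, y))
    return trace

trace = two_neuron_oscillator(300)
first_neuron = [x for x, _ in trace]
```

The two outputs form a phase-shifted pair, which is the kind of signal one could map to alternating leg movements; how such a signal is wired to actuators is outside this sketch.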
74

Artificial Neural Networks in Greenhouse Modelling

Miranda Trujillo, Luis Carlos 24 August 2018
One facet of current developments in precision horticulture is highly technified production under cover. Intensive production in modern greenhouses relies heavily on instrumentation and control techniques to automate many tasks. Among these techniques are control strategies, which can include methods developed within the field of Artificial Intelligence. This document presents research on Artificial Neural Networks (ANNs), a technique derived from Artificial Intelligence, and aims to shed light on their applicability in greenhouse vegetable production. In particular, this work focuses on the suitability of ANN-based models as building blocks of greenhouse climate control strategies. To this end, two models were built: a short-term climate prediction model (air temperature and relative humidity on a time scale of minutes), and a model of the plant response to the climate, the latter estimating phytometric measurements of leaf temperature, transpiration rate and photosynthesis rate. A dataset comprising three years of tomato cultivation was used to build and test the models. It was found that models of this kind are very sensitive to the choice of metaparameters and network architecture, and that they can produce different results even with the same architecture. Nevertheless, it was shown that ANNs are useful for simulating complex plant signals and for estimating future microclimate trends. Furthermore, two connection schemes are proposed for assembling several models in order to generate more complex simulations, namely long-term climate prediction chains and photosynthesis forecasts. It was concluded that ANNs could be used in greenhouse automation systems as part of the control strategy, as they are robust and can cope with the complexity of the system. However, a number of problems and difficulties are pointed out, including the importance of the architecture, the need for large datasets to build the models, and problems arising from the different time constants in the greenhouse system as a whole.
75

Evaluation Functions in General Game Playing

Michulke, Daniel 24 July 2012
While in traditional computer game playing, agents were designed solely for the purpose of playing one single game, General Game Playing is concerned with agents capable of playing classes of games. Given the game's rules and a few minutes' time, the agent is supposed to play any game of the class and eventually win it. Since the game is unknown beforehand, previously optimized data structures or human-provided features are not applicable. Instead, the agent must derive a strategy on its own. One approach to obtaining such a strategy is to analyze the game rules and create a state evaluation function that can subsequently be used to direct the agent to promising states in the match. In this thesis we discuss existing methods and present a general approach to constructing such an evaluation function. Each topic is discussed in a modular fashion and evaluated in terms of quality and efficiency, resulting in a strong agent.
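How a state evaluation function can direct an agent toward promising states may be sketched with a one-step greedy player. The toy game (reach a target number using the moves "add one" and "double"), the distance-based `evaluate` function and all names below are hypothetical; the thesis derives its evaluation functions from the game rules automatically, which this sketch does not attempt.

```python
def successors(state):
    """Hypothetical toy game: from a number, either add one or double it."""
    return [state + 1, state * 2]

def evaluate(state, goal=10):
    """Hand-written evaluation: states closer to the goal score higher."""
    return -abs(goal - state)

def greedy_play(start, goal=10, max_moves=20):
    """Repeatedly move to the successor state with the best evaluation."""
    path = [start]
    state = start
    while state != goal and len(path) <= max_moves:
        state = max(successors(state), key=lambda s: evaluate(s, goal))
        path.append(state)
    return path

path = greedy_play(1)
```

A one-step lookahead is the simplest way to use an evaluation function; game-playing agents typically combine it with deeper search, so that the quality of the evaluation matters most at the search horizon.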
76

On the Efficient Utilization of Dense Nonlocal Adjacency Information In Graph Neural Networks

Bünger, Dominik 14 December 2021
Over the past few years, graph learning (the subdomain of machine learning on graph data) has taken big leaps forward through the development of specialized Graph Neural Networks (GNNs) that have mathematical foundations in spectral graph theory. In addition to natural graph data, these methods can be applied to non-graph data sets by constructing a graph artificially using a predefined notion of adjacency between samples. The state of the art is to connect each sample to only a small number of neighbors in order to simultaneously mimic the sparse behavior of natural graphs, play into the strengths of existing GNN methods, and avoid quadratic scaling in the number of nodes, which would make the approach infeasible for large problem sizes. In this thesis, we shed light on the alternative construction of kernel-based fully-connected graphs. Here the connections of each sample explicitly quantify the similarities to all other samples. Hence the graph contains a quadratic number of edges, which encode local and non-local neighborhood information. Though this approach is well studied in other settings, including the solution of partial differential equations, it is typically dismissed in machine learning nowadays because of its dense adjacency matrices.
We thus dedicate a large portion of this work to showcasing numerical techniques for fast evaluations, especially eigenvalue computations, in important special cases where samples are described by low-dimensional feature vectors (e.g., three-dimensional point clouds) or by a small set of categorical attributes. We then investigate how this dense adjacency information can be utilized in graph learning settings. In particular, we present our own transductive learning method, a version of a Graph Convolutional Network (GCN) tailored to the spectral and spatial properties of dense graphs. We furthermore outline the application of kernel-based adjacency matrices to speeding up the successful PointNet++ architecture. Throughout this work, we evaluate our methods in extensive numerical experiments. In addition to the empirical accuracy of our neural networks, we focus on competitive runtimes in order to decrease the computational and energy cost of our methods.
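A minimal sketch of a kernel-based fully-connected graph follows. It uses a Gaussian kernel and the symmetric degree normalization common in spectral graph methods; the kernel choice, the width `sigma`, the zero diagonal and the sample points are assumptions for illustration, and the explicit O(n^2) loops are exactly the cost that the fast techniques in the thesis are meant to avoid.

```python
import math

def gaussian_adjacency(points, sigma=1.0):
    """Dense adjacency W[i][j] = exp(-||x_i - x_j||^2 / sigma^2), zero diagonal."""
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                w[i][j] = math.exp(-d2 / sigma ** 2)
    return w

def normalize(w):
    """Symmetric normalization A = D^(-1/2) W D^(-1/2) with degrees D = row sums."""
    deg = [sum(row) for row in w]
    n = len(w)
    a = [[w[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
         for i in range(n)]
    return a, deg

# Four illustrative points in three dimensions (a tiny "point cloud").
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (2.0, 2.0, 1.0)]
adjacency, degrees = normalize(gaussian_adjacency(points))
```

Every pair of samples receives a nonzero weight, so both nearby and distant points contribute. A useful sanity check is that the vector of square-rooted degrees is an eigenvector of the normalized matrix with eigenvalue 1.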
77

Lane Change Prediction in the Urban Area

Griesbach, Karoline 18 July 2019
The development of Advanced Driver Assistance Systems and autonomous driving is one of the main research fields in vehicle development today. Initially, research in this area focused on analyzing and predicting driving maneuvers on highways. Nowadays, a vast amount of research focuses on urban areas as well. Driving maneuvers in urban areas are more complex and therefore more difficult to predict than driving maneuvers on highways. The goals of predicting and understanding driving maneuvers are to reduce accidents, to improve traffic flow, and to develop reliable algorithms for autonomous driving. To reach these goals, driving behavior during different driving maneuvers, such as turning at intersections, emergency braking or lane changes, is analyzed. This thesis focuses on driving behavior around lane changes and predicts lane changes in urban areas with an Echo State Network. First, existing methods were evaluated with a special focus on their input variables and results, in order to analyze the properties of candidate input variables with regard to lane change and no lane change sequences. The data for these first analyses were obtained from a naturalistic driving study. Based on these results, the final set of variables (steering angle, turn signal and gazes to the left and right) was chosen for further computations. The parameters of the Echo State Network were then optimized using the data of the naturalistic driving study and the final set of variables. Finally, left and right lane changes were predicted. Furthermore, the Echo State Network was compared to a feedforward neural network. The Echo State Network predicted left and right lane changes more successfully than the feedforward neural network.
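For readers unfamiliar with Echo State Networks, the sketch below shows only the reservoir update and the "echo state" behaviour: two different initial states driven by the same input converge. The reservoir size, the weight scaling and the sine input standing in for a signal such as the steering angle are assumptions; the trained linear readout that would produce the lane change prediction is omitted.

```python
import math
import random

def make_reservoir(n, norm=0.9, seed=0):
    """Random recurrent weights, rescaled so every row's absolute sum is at
    most `norm` (< 1). This is a simple sufficient condition for the echo
    state property; scaling the spectral radius is the more common recipe."""
    rng = random.Random(seed)
    w = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    max_row = max(sum(abs(v) for v in row) for row in w)
    w = [[v * norm / max_row for v in row] for row in w]
    w_in = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return w, w_in

def step(w, w_in, state, u):
    """One reservoir update: x <- tanh(W x + w_in * u)."""
    return [math.tanh(sum(wij * sj for wij, sj in zip(row, state)) + wi * u)
            for row, wi in zip(w, w_in)]

def run(w, w_in, state, inputs):
    """Drive the reservoir with an input sequence and return the final state."""
    for u in inputs:
        state = step(w, w_in, state, u)
    return state

n = 20
w, w_in = make_reservoir(n)
inputs = [math.sin(0.1 * t) for t in range(100)]  # stand-in input signal
state_a = run(w, w_in, [0.0] * n, inputs)
state_b = run(w, w_in, [0.5] * n, inputs)
gap = max(abs(a - b) for a, b in zip(state_a, state_b))
```

Because the reservoir forgets its initial condition, its state becomes a function of the recent input history alone, and only a linear readout on that state has to be trained.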
78

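Methodische Aspekte bei der Entwicklung mechanischer Simulationen zur Messung der Funktionalitäten eines Handballschuhs (Methodological aspects in the development of mechanical simulations for measuring the functionalities of a handball shoe)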

Krumm, Dominik 17 March 2020
The aim of this work was to systematically investigate the methodological aspects involved in developing mechanical simulations for measuring the functionalities of handball shoes, and to derive general conclusions about the degree of abstraction from the results. The investigations of the four methodological aspects, namely measuring instrument, evaluation model, influence quantity and input quantity, showed that three of them had an influence on the measured value; the exception was the input quantity. Based on the results, it could be deduced that the degree of abstraction has an influence on the measured values.
79

Evaluation Functions in General Game Playing

Michulke, Daniel 22 June 2012
While in traditional computer game playing, agents were designed solely for the purpose of playing one single game, General Game Playing is concerned with agents capable of playing classes of games. Given the game's rules and a few minutes' time, the agent is supposed to play any game of the class and eventually win it. Since the game is unknown beforehand, previously optimized data structures or human-provided features are not applicable. Instead, the agent must derive a strategy on its own. One approach to obtaining such a strategy is to analyze the game rules and create a state evaluation function that can subsequently be used to direct the agent to promising states in the match. In this thesis we discuss existing methods and present a general approach to constructing such an evaluation function. Each topic is discussed in a modular fashion and evaluated in terms of quality and efficiency, resulting in a strong agent.
Contents: Introduction; Game Playing; Evaluation Functions I - Aggregation; Evaluation Functions II - Features; General Evaluation; Related Work; Discussion
80

Design, Analysis, and Applications of Approximate Arithmetic Modules

Ullah, Salim 06 April 2022
From the initial computing machines, Colossus of 1943 and ENIAC of 1945, to modern high-performance data centers and the Internet of Things (IoT), four design goals, i.e., high performance, energy efficiency, resource utilization, and ease of programmability, have remained a beacon of development for the computing industry. During this period, the computing industry has exploited the advantages of technology scaling and microarchitectural enhancements to achieve these goals. However, with the end of Dennard scaling, these techniques yield diminishing energy and performance advantages. Therefore, it is necessary to explore alternative techniques for satisfying the computational and energy requirements of modern applications. Towards this end, one promising technique is analyzing and surrendering the strict notion of correctness in various layers of the computation stack. Most modern applications across the computing spectrum, from data centers to IoT devices, interact with and analyze real-world data and take decisions accordingly. These applications are broadly classified as Recognition, Mining, and Synthesis (RMS). Instead of producing a single golden answer, these applications produce several feasible answers. They possess an inherent error resilience to the inexactness of processed data and the corresponding operations. Utilizing this inherent error resilience, the paradigm of Approximate Computing relaxes the strict notion of computation correctness to realize high-performance and energy-efficient systems with acceptable output quality. Prior work on circuit-level approximations has mainly focused on Application-Specific Integrated Circuits (ASICs). However, ASIC-based solutions suffer from long time-to-market and high-cost development cycles. These limitations of ASICs can be overcome by utilizing the reconfigurable nature of Field Programmable Gate Arrays (FPGAs).
However, due to architectural differences between ASICs and FPGAs, applying ASIC-based approximation techniques to FPGA-based systems does not result in proportional performance and energy gains. Therefore, to exploit the principles of approximate computing in FPGA-based hardware accelerators for error-resilient applications, FPGA-optimized approximation techniques are required. Further, most state-of-the-art approximate arithmetic operators do not have a generic approximation methodology for implementing new approximate designs as an application's accuracy and performance requirements change. These works also lack a methodology in which a machine learning model can be used to correlate an approximate operator with its impact on the output quality of an application. This thesis addresses these research challenges by designing and exploring FPGA-optimized logic-based approximate arithmetic operators. Because multiplication is one of the most computationally complex and most frequently used arithmetic operations in various modern applications, such as Artificial Neural Networks (ANNs), we consider it for most of the approximation techniques proposed in this thesis. The primary focus of the work is to provide a framework for generating FPGA-optimized approximate arithmetic operators and efficient techniques for exploring approximate operators when implementing hardware accelerators for error-resilient applications. Towards this end, we first present various designs of resource-optimized, high-performance, and energy-efficient accurate multipliers. Although modern FPGAs host high-performance DSP blocks to perform multiplication and other arithmetic operations, our analysis and results show that the orthogonal approach of having resource-efficient and high-performance multipliers is necessary for implementing high-performance accelerators.
Due to the differences in the type of data processed by various applications, the thesis presents individual designs for unsigned, signed, and constant multipliers. Compared to the multiplier IPs provided by the FPGA Synthesis tool, our proposed designs provide significant performance gains. We then explore the designed accurate multipliers and provide a library of approximate unsigned/signed multipliers. The proposed approximations target the reduction in the total utilized resources, critical path delay, and energy consumption of the multipliers. We have explored various statistical error metrics to characterize the approximation-induced accuracy degradation of the approximate multipliers. We have also utilized the designed multipliers in various error-resilient applications to evaluate their impact on applications' output quality and performance. Based on our analysis of the designed approximate multipliers, we identify the need for a framework to design application-specific approximate arithmetic operators. An application-specific approximate arithmetic operator intends to implement only the logic that can satisfy the application's overall output accuracy and performance constraints. Towards this end, we present a generic design methodology for implementing FPGA-based application-specific approximate arithmetic operators from their accurate implementations according to the applications' accuracy and performance requirements. In this regard, we utilize various machine learning models to identify feasible approximate arithmetic configurations for various applications. We also utilize different machine learning models and optimization techniques to efficiently explore the large design space of individual operators and their utilization in various applications. In this thesis, we have used the proposed methodology to design approximate adders and multipliers. 
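The trade-off between dropped logic and accuracy can be made concrete with a small behavioural model. The sketch below approximates an unsigned 4 × 4 multiplication by omitting the partial products in the least significant columns and then computes a few common error statistics over all input pairs. This truncation scheme is a generic illustration assumed here; it is not one of the thesis's designs, and the function and parameter names are hypothetical.

```python
def approx_mul(a, b, bits=4, drop_columns=2):
    """Unsigned multiply that skips every partial product a_i*b_j whose
    column index i + j is below drop_columns (a behavioural model only)."""
    total = 0
    for i in range(bits):
        for j in range(bits):
            if i + j >= drop_columns:
                total += (((a >> i) & 1) & ((b >> j) & 1)) << (i + j)
    return total

def error_metrics(bits=4, drop_columns=2):
    """Exhaustively compare against the exact product for all input pairs."""
    errors = [a * b - approx_mul(a, b, bits, drop_columns)
              for a in range(2 ** bits) for b in range(2 ** bits)]
    return {
        "max_error": max(errors),                        # worst-case error distance
        "mean_error": sum(errors) / len(errors),         # mean error distance
        "error_rate": sum(e != 0 for e in errors) / len(errors),
    }

metrics = error_metrics(bits=4, drop_columns=2)
```

Exhaustive evaluation is feasible here because a 4 × 4 multiplier has only 256 input pairs; for wider operators, error statistics are usually estimated by sampling instead.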
This thesis also explores other layers of the computation stack (cross-layer) for possible approximations to satisfy an application's accuracy and performance requirements. Towards this end, we first present a low bit-width and highly accurate quantization scheme for pre-trained Deep Neural Networks (DNNs). The proposed quantization scheme does not require re-training (fine-tuning the parameters) after quantization. We also present a resource-efficient FPGA-based multiplier that utilizes our proposed quantization scheme. Finally, we present a framework to allow the intelligent exploration and highly accurate identification of the feasible design points in the large design space enabled by cross-layer approximations. The proposed framework utilizes a novel Polynomial Regression (PR)-based method to model approximate arithmetic operators. The PR-based representation enables machine learning models to better correlate an approximate operator's coefficients with their impact on an application's output quality.

1. Introduction 1.1 Inherent Error Resilience of Applications 1.2 Approximate Computing Paradigm 1.2.1 Software Layer Approximation 1.2.2 Architecture Layer Approximation 1.2.3 Circuit Layer Approximation 1.3 Problem Statement 1.4 Focus of the Thesis 1.5 Key Contributions and Thesis Overview
2. Preliminaries 2.1 Xilinx FPGA Slice Structure 2.2 Multiplication Algorithms 2.2.1 Baugh-Wooley’s Multiplication Algorithm 2.2.2 Booth’s Multiplication Algorithm 2.2.3 Sign Extension for Booth’s Multiplier 2.3 Statistical Error Metrics 2.4 Design Space Exploration and Optimization Techniques 2.4.1 Genetic Algorithm 2.4.2 Bayesian Optimization 2.5 Artificial Neural Networks
3. Accurate Multipliers 3.1 Introduction 3.2 Related Work 3.3 Unsigned Multiplier Architecture 3.4 Motivation for Signed Multipliers 3.5 Baugh-Wooley’s Multiplier 3.6 Booth’s Algorithm-based Signed Multipliers 3.6.1 Booth-Mult Design 3.6.2 Booth-Opt Design 3.6.3 Booth-Par Design 3.7 Constant Multipliers 3.8 Results and Discussion 3.8.1 Experimental Setup and Tool Flow 3.8.2 Performance comparison of the proposed accurate unsigned multiplier 3.8.3 Performance comparison of the proposed accurate signed multiplier with the state-of-the-art accurate multipliers 3.8.4 Performance comparison of the proposed constant multiplier with the state-of-the-art accurate multipliers 3.9 Conclusion
4. Approximate Multipliers 4.1 Introduction 4.2 Related Work 4.3 Unsigned Approximate Multipliers 4.3.1 Approximate 4 × 4 Multiplier (Approx-1) 4.3.2 Approximate 4 × 4 Multiplier (Approx-2) 4.3.3 Approximate 4 × 4 Multiplier (Approx-3) 4.4 Designing Higher Order Approximate Unsigned Multipliers 4.4.1 Accurate Adders for Implementing 8 × 8 Approximate Multipliers from 4 × 4 Approximate Multipliers 4.4.2 Approximate Adders for Implementing Higher-order Approximate Multipliers 4.5 Approximate Signed Multipliers (Booth-Approx) 4.6 Results and Discussion 4.6.1 Experimental Setup and Tool Flow 4.6.2 Evaluation of the Proposed Approximate Unsigned Multipliers 4.6.3 Evaluation of the Proposed Approximate Signed Multiplier 4.7 Conclusion
5. Designing Application-specific Approximate Operators 5.1 Introduction 5.2 Related Work 5.3 Modeling Approximate Arithmetic Operators 5.3.1 Accurate Multiplier Design 5.3.2 Approximation Methodology 5.3.3 Approximate Adders 5.4 DSE for FPGA-based Approximate Operators Synthesis 5.4.1 DSE using Bayesian Optimization 5.4.2 MOEA-based Optimization 5.4.3 Machine Learning Models for DSE 5.5 Results and Discussion 5.5.1 Experimental Setup and Tool Flow 5.5.2 Accuracy-Performance Analysis of Approximate Adders 5.5.3 Accuracy-Performance Analysis of Approximate Multipliers 5.5.4 AppAxO MBO 5.5.5 ML Modeling 5.5.6 DSE using ML Models 5.5.7 Proposed Approximate Operators 5.6 Conclusion
6. Quantization of Pre-trained Deep Neural Networks 6.1 Introduction 6.2 Related Work 6.2.1 Commonly Used Quantization Techniques 6.3 Proposed Quantization Techniques 6.3.1 L2L: Log_2_Lead Quantization 6.3.2 ALigN: Adaptive Log_2_Lead Quantization 6.3.3 Quantitative Analysis of the Proposed Quantization Schemes 6.3.4 Proposed Quantization Technique-based Multiplier 6.4 Results and Discussion 6.4.1 Experimental Setup and Tool Flow 6.4.2 Image Classification 6.4.3 Semantic Segmentation 6.4.4 Hardware Implementation Results 6.5 Conclusion
7. A Framework for Cross-layer Approximations 7.1 Introduction 7.2 Related Work 7.3 Error-analysis of approximate arithmetic units 7.3.1 Application Independent Error-analysis of Approximate Multipliers 7.3.2 Application Specific Error Analysis 7.4 Accelerator Performance Estimation 7.5 DSE Methodology 7.6 Results and Discussion 7.6.1 Experimental Setup and Tool Flow 7.6.2 Behavioral Analysis 7.6.3 Accelerator Performance Estimation 7.6.4 DSE Performance 7.7 Conclusion
8. Conclusions and Future Work
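The abstract mentions quantizing pre-trained DNN weights without re-training and pairing the quantized format with a resource-efficient multiplier. As a rough illustration of why logarithmic formats simplify multiplication, the sketch below uses plain power-of-two quantization. This is a generic textbook scheme, not the thesis's L2L or ALigN method; the names `quantize_pow2` and `shift_mul`, the 4-bit exponent range, and the assumption that weight magnitudes do not exceed 1 are all hypothetical choices for the example.

```python
import numpy as np


def quantize_pow2(weights, exp_bits=4):
    """Round each weight to sign * 2**e, with e clipped to [-(2**exp_bits - 1), 0].

    Assumes |w| <= 1 (illustrative choice); exact zeros stay zero via sign == 0.
    Returns (sign, exponent, dequantized value) arrays.
    """
    w = np.asarray(weights, dtype=np.float64)
    sign = np.sign(w)
    mag = np.abs(w)
    min_exp = -(2 ** exp_bits - 1)
    safe = np.where(mag > 0, mag, 1.0)  # avoid log2(0)
    exp = np.clip(np.rint(np.log2(safe)), min_exp, 0).astype(int)
    deq = sign * np.exp2(exp.astype(float))
    return sign.astype(int), exp, deq


def shift_mul(x, sign, exp):
    """Multiply a non-negative integer activation by sign * 2**exp (exp <= 0)
    using only a right shift and a sign flip; low-order bits are truncated."""
    return sign * (x >> -exp)
```

For example, `quantize_pow2([0.5, -0.25, 0.0, 0.3])` maps the weights to 0.5, -0.25, 0.0 and 0.25 respectively, and multiplying an integer activation by such a weight reduces to a shift. Rounding in the log domain is coarse for weights far from a power of two, which is one reason more refined schemes exist.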
