131

Em busca da cultura espacial

Borges, Fabiane Morais 14 June 2013 (has links)
This thesis arises from the complex Internet networks devoted to building new Space paradigms based on free-software, free-hardware, and open-source practices that have appeared, roughly, since the turn of the century. In order to examine these new paradigms, I consider prior processes connected to the Space Culture. The text goes back to the history of the Space Race: the first rockets, the first satellites, and some tenets of the international politics that guided the Cold War in the years after the Second World War. I bring up noteworthy elements of the Space programs of both the US and the USSR, as well as the main technicians and scientists behind the engineering of the rockets. The research dives into Nazi Germany, which first invested in the production of rockets, and moves on to communist Russia as well as to liberal post-war America. The thesis brings up ideas concerning Space utopias and science fiction in literature and cinema, and engages with the difference between Space exploration carried out by humans and by robots. It examines the first rocket flights and the first artificial satellites placed in outer Space, paying attention to the particulars of each of those first endeavors, to their purpose, and to how far they accomplished their missions. The thesis then questions the importance of the Space Race to the human imagination and analyses the realm of Space dreams from the late 19th century up to now. The last part of the thesis is concerned with the groups that are pursuing Space travel in an independent way, moved either by ideological or by commercial reasons. The investigation uncovers the ideas of each of those groups concerning Space exploration, and goes on to consider the relation between the makers behind such exploration and a possible industrial revolution. Finally, the thesis raises some criticisms of the creative processes of individuals, groups, networks, and social movements that are concerned with outer Space. / This thesis emerges from the complex Internet networks devoted to the construction of new Space paradigms, based on free software and hardware practices and open-source systems, which appeared, roughly speaking, from the 2000s onward. To reach that point, it was necessary to investigate earlier processes related to Space Culture. The text retraces the history of the Space Race: the first rockets, the first satellites, and the politics in force during the Cold War years following the Second World War. It seeks to identify the points of tension in the Soviet and American space programs, as well as the main technicians behind the whole engineering of rockets. It dives into Nazi Germany, the first to invest without restriction in the production of rockets, and passes through communist Russia and American liberalism. The thesis brings up ideas about space utopias and science fiction in literature and cinema, and analyses the difference between human and robotic space exploration. It provides tables of the first space flights and the first satellites carried into Space, attending to the particularities of each of them, what they served for, and what became of them.
It raises questions about the importance of the Space Race for the human imagination and analyses the arc of space dreams from the end of the 19th century to the present day. The end of the thesis is dedicated to the groups that are taking up the question of space travel independently, whether ideological or more entrepreneurial, and to the ideas of each of them regarding space exploration. It considers the relation of the makers to a possible industrial revolution and raises some criticisms of the creative processes of the individuals, groups, networks, and social movements dedicated to space.
132

Design, Analysis, and Applications of Approximate Arithmetic Modules

Ullah, Salim 06 April 2022 (has links)
From the initial computing machines, Colossus of 1943 and ENIAC of 1945, to modern high-performance data centers and the Internet of Things (IoT), four design goals, i.e., high performance, energy efficiency, resource utilization, and ease of programmability, have remained a beacon of development for the computing industry. During this period, the computing industry has exploited the advantages of technology scaling and microarchitectural enhancements to achieve these goals. However, with the end of Dennard scaling, these techniques now offer diminishing energy and performance advantages. Therefore, it is necessary to explore alternative techniques for satisfying the computational and energy requirements of modern applications. Towards this end, one promising technique is analyzing and surrendering the strict notion of correctness in various layers of the computation stack. Most modern applications across the computing spectrum, from data centers to IoT devices, interact with and analyze real-world data and make decisions accordingly. These applications are broadly classified as Recognition, Mining, and Synthesis (RMS). Instead of producing a single golden answer, these applications produce several feasible answers, and they possess an inherent error-resilience to the inexactness of processed data and the corresponding operations. Utilizing this inherent error-resilience, the paradigm of Approximate Computing relaxes the strict notion of computation correctness to realize high-performance and energy-efficient systems with acceptable quality outputs. Prior works on circuit-level approximations have mainly focused on Application-specific Integrated Circuits (ASICs). However, ASIC-based solutions suffer from long time-to-market and costly development cycles. These limitations of ASICs can be overcome by utilizing the reconfigurable nature of Field Programmable Gate Arrays (FPGAs). However, due to architectural differences between ASICs and FPGAs, applying ASIC-based approximation techniques to FPGA-based systems does not result in proportional performance and energy gains. Therefore, to exploit the principles of approximate computing in FPGA-based hardware accelerators for error-resilient applications, FPGA-optimized approximation techniques are required. Further, most state-of-the-art approximate arithmetic operators do not have a generic approximation methodology to implement new approximate designs for an application's changing accuracy and performance requirements. These works also lack a methodology where a machine learning model can be used to correlate an approximate operator with its impact on the output quality of an application. This thesis addresses these research challenges by designing and exploring FPGA-optimized logic-based approximate arithmetic operators. As multiplication is one of the most computationally complex and frequently used arithmetic operations in various modern applications, such as Artificial Neural Networks (ANNs), it is the focus of most of the approximation techniques proposed in this thesis. The primary focus of the work is to provide a framework for generating FPGA-optimized approximate arithmetic operators and efficient techniques to explore approximate operators for implementing hardware accelerators for error-resilient applications. Towards this end, we first present various designs of resource-optimized, high-performance, and energy-efficient accurate multipliers.
Although modern FPGAs host high-performance DSP blocks to perform multiplication and other arithmetic operations, our analysis and results show that the orthogonal approach of providing resource-efficient and high-performance logic-based multipliers is necessary for implementing high-performance accelerators. Due to the differences in the type of data processed by various applications, the thesis presents individual designs for unsigned, signed, and constant multipliers. Compared to the multiplier IPs provided by the FPGA synthesis tool, our proposed designs provide significant performance gains. We then build on the designed accurate multipliers to provide a library of approximate unsigned and signed multipliers. The proposed approximations target reductions in the total utilized resources, the critical path delay, and the energy consumption of the multipliers. We have explored various statistical error metrics to characterize the approximation-induced accuracy degradation of the approximate multipliers, and we have utilized the designed multipliers in various error-resilient applications to evaluate their impact on the applications' output quality and performance. Based on our analysis of the designed approximate multipliers, we identify the need for a framework to design application-specific approximate arithmetic operators. An application-specific approximate arithmetic operator implements only the logic needed to satisfy the application's overall output accuracy and performance constraints. Towards this end, we present a generic design methodology for implementing FPGA-based application-specific approximate arithmetic operators from their accurate implementations according to the application's accuracy and performance requirements. In this regard, we utilize various machine learning models to identify feasible approximate arithmetic configurations for various applications, and we employ different machine learning models and optimization techniques to efficiently explore the large design space of individual operators and their utilization in various applications. In this thesis, we have used the proposed methodology to design approximate adders and multipliers. The thesis also explores other layers of the computation stack (cross-layer) for possible approximations to satisfy an application's accuracy and performance requirements. Towards this end, we first present a low bit-width and highly accurate quantization scheme for pre-trained Deep Neural Networks (DNNs); the proposed scheme does not require re-training (fine-tuning the parameters) after quantization. We also present a resource-efficient FPGA-based multiplier that utilizes our proposed quantization scheme. Finally, we present a framework to allow the intelligent exploration and highly accurate identification of the feasible design points in the large design space enabled by cross-layer approximations. The proposed framework utilizes a novel Polynomial Regression (PR)-based method to model approximate arithmetic operators. The PR-based representation enables machine learning models to better correlate an approximate operator's coefficients with their impact on an application's output quality.
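To make the error characterization concrete, the sketch below computes standard statistical error metrics (error rate, mean absolute error, mean relative error distance, worst-case error) over an exhaustive 8-bit input sweep. The multiplier being evaluated is a hypothetical truncation-based approximation invented for illustration; the thesis's Approx-1/2/3 designs are LUT-level circuits and are not reproduced here.

```python
import itertools

def approx_mul_truncated(a, b, cut=2):
    # Toy approximation: drop the 'cut' least-significant bits of each operand
    # before multiplying, then rescale. NOT one of the thesis's designs.
    return ((a >> cut) * (b >> cut)) << (2 * cut)

def error_metrics(bits=8, cut=2):
    n = 1 << bits
    err_count, abs_err_sum, rel_err_sum, max_err = 0, 0, 0.0, 0
    for a, b in itertools.product(range(n), repeat=2):  # exhaustive sweep
        exact = a * b
        e = abs(exact - approx_mul_truncated(a, b, cut))
        err_count += e != 0
        abs_err_sum += e
        max_err = max(max_err, e)
        if exact:
            rel_err_sum += e / exact
    total = n * n
    return {
        "error_rate": err_count / total,        # fraction of erroneous products
        "mean_abs_error": abs_err_sum / total,  # MAE
        "mean_rel_error": rel_err_sum / total,  # MRED
        "max_abs_error": max_err,               # worst-case error
    }

print(error_metrics(bits=8, cut=2))
```

Exhaustive enumeration is feasible for small operand widths; for wider operators, such metrics are typically estimated by random sampling of the input space.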
Contents:
1. Introduction: 1.1 Inherent Error Resilience of Applications; 1.2 Approximate Computing Paradigm (1.2.1 Software Layer Approximation; 1.2.2 Architecture Layer Approximation; 1.2.3 Circuit Layer Approximation); 1.3 Problem Statement; 1.4 Focus of the Thesis; 1.5 Key Contributions and Thesis Overview
2. Preliminaries: 2.1 Xilinx FPGA Slice Structure; 2.2 Multiplication Algorithms (2.2.1 Baugh-Wooley's Multiplication Algorithm; 2.2.2 Booth's Multiplication Algorithm; 2.2.3 Sign Extension for Booth's Multiplier); 2.3 Statistical Error Metrics; 2.4 Design Space Exploration and Optimization Techniques (2.4.1 Genetic Algorithm; 2.4.2 Bayesian Optimization); 2.5 Artificial Neural Networks
3. Accurate Multipliers: 3.1 Introduction; 3.2 Related Work; 3.3 Unsigned Multiplier Architecture; 3.4 Motivation for Signed Multipliers; 3.5 Baugh-Wooley's Multiplier; 3.6 Booth's Algorithm-based Signed Multipliers (3.6.1 Booth-Mult Design; 3.6.2 Booth-Opt Design; 3.6.3 Booth-Par Design); 3.7 Constant Multipliers; 3.8 Results and Discussion (3.8.1 Experimental Setup and Tool Flow; 3.8.2 Performance comparison of the proposed accurate unsigned multiplier; 3.8.3 Performance comparison of the proposed accurate signed multiplier with the state-of-the-art accurate multipliers; 3.8.4 Performance comparison of the proposed constant multiplier with the state-of-the-art accurate multipliers); 3.9 Conclusion
4. Approximate Multipliers: 4.1 Introduction; 4.2 Related Work; 4.3 Unsigned Approximate Multipliers (4.3.1 Approximate 4x4 Multiplier (Approx-1); 4.3.2 Approximate 4x4 Multiplier (Approx-2); 4.3.3 Approximate 4x4 Multiplier (Approx-3)); 4.4 Designing Higher Order Approximate Unsigned Multipliers (4.4.1 Accurate Adders for Implementing 8x8 Approximate Multipliers from 4x4 Approximate Multipliers; 4.4.2 Approximate Adders for Implementing Higher-order Approximate Multipliers); 4.5 Approximate Signed Multipliers (Booth-Approx); 4.6 Results and Discussion (4.6.1 Experimental Setup and Tool Flow; 4.6.2 Evaluation of the Proposed Approximate Unsigned Multipliers; 4.6.3 Evaluation of the Proposed Approximate Signed Multiplier); 4.7 Conclusion
5. Designing Application-specific Approximate Operators: 5.1 Introduction; 5.2 Related Work; 5.3 Modeling Approximate Arithmetic Operators (5.3.1 Accurate Multiplier Design; 5.3.2 Approximation Methodology; 5.3.3 Approximate Adders); 5.4 DSE for FPGA-based Approximate Operators Synthesis (5.4.1 DSE using Bayesian Optimization; 5.4.2 MOEA-based Optimization; 5.4.3 Machine Learning Models for DSE); 5.5 Results and Discussion (5.5.1 Experimental Setup and Tool Flow; 5.5.2 Accuracy-Performance Analysis of Approximate Adders; 5.5.3 Accuracy-Performance Analysis of Approximate Multipliers; 5.5.4 AppAxO MBO; 5.5.5 ML Modeling; 5.5.6 DSE using ML Models; 5.5.7 Proposed Approximate Operators); 5.6 Conclusion
6. Quantization of Pre-trained Deep Neural Networks: 6.1 Introduction; 6.2 Related Work (6.2.1 Commonly Used Quantization Techniques); 6.3 Proposed Quantization Techniques (6.3.1 L2L: Log_2_Lead Quantization; 6.3.2 ALigN: Adaptive Log_2_Lead Quantization; 6.3.3 Quantitative Analysis of the Proposed Quantization Schemes; 6.3.4 Proposed Quantization Technique-based Multiplier); 6.4 Results and Discussion (6.4.1 Experimental Setup and Tool Flow; 6.4.2 Image Classification; 6.4.3 Semantic Segmentation; 6.4.4 Hardware Implementation Results); 6.5 Conclusion
7. A Framework for Cross-layer Approximations: 7.1 Introduction; 7.2 Related Work; 7.3 Error-analysis of approximate arithmetic units (7.3.1 Application Independent Error-analysis of Approximate Multipliers; 7.3.2 Application Specific Error Analysis); 7.4 Accelerator Performance Estimation; 7.5 DSE Methodology; 7.6 Results and Discussion (7.6.1 Experimental Setup and Tool Flow; 7.6.2 Behavioral Analysis; 7.6.3 Accelerator Performance Estimation; 7.6.4 DSE Performance); 7.7 Conclusion
8. Conclusions and Future Work
133

Processor design-space exploration through fast simulation / Exploration de l'espace de conception de processeurs via simulation accélérée

Khan, Taj Muhammad 12 May 2011 (has links)
We focus on sampling as a simulation technique to reduce simulation time. Sampling rests on the fact that a program's execution is composed of repeating parts of code, the phases. Hence the observation that one can avoid simulating a program in its entirety, simulate each phase just once, and compute the whole program's performance from the performance of the phases. Two important questions arise: which parts of the program should be simulated, and how can the state of the system be restored before each simulation? For the first question, two solutions exist: one analyses the program's execution in terms of phases and chooses to simulate each phase once (representative sampling); the other advocates choosing the samples randomly (statistical sampling). For the second question, that of restoring the system state, techniques have recently been developed that restore (warm up) the system state adaptively, according to the needs of the simulated code fragment. Sample-selection techniques either ignore warm-up mechanisms entirely or propose alternatives that require substantial modification of the simulator, while adaptive warm-up techniques are incompatible with most sampling techniques. In this thesis we focus on reconciling sampling techniques with adaptive warm-up, to develop a mechanism that is at once easy to use, accurate in its results, and transparent to the user. We took representative and statistical sampling and modified the adaptive warm-up techniques to make them compatible with both within a single mechanism, and we were able to show that adaptive warm-up techniques can indeed be employed in sampling. Our results are comparable to the state of the art in terms of accuracy, but by relieving the user of the warm-up problems and hiding the simulation details, we make the process easier. We also found that statistical sampling gives better results than representative sampling. / Simulation is a vital tool used by architects to develop new architectures. However, because of the complexity of modern architectures and the length of recent benchmarks, detailed simulation of programs can take extremely long times. This impedes the exploration of the processor design space which architects need to perform to find the optimal configuration of processor parameters. Sampling is one technique which reduces the simulation time without adversely affecting the accuracy of the results. Yet, most sampling techniques either ignore the warm-up issue or require significant development effort on the part of the user. In this thesis we tackle the problem of reconciling state-of-the-art warm-up techniques and the latest sampling mechanisms with the triple objective of keeping user effort to a minimum, achieving good accuracy, and remaining agnostic to software and hardware changes. We show that both representative and statistical sampling techniques can be adapted to use warm-up mechanisms which accommodate the underlying architecture's warm-up requirements on the fly. We present experimental results showing accuracy and speed comparable to the latest research.
Also, we leverage statistical calculations to provide an estimate of the robustness of the final results.
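As an illustration of the sampling idea, the following sketch estimates the mean per-instruction cost of a synthetic phased "program" from a handful of randomly placed samples, simulating (but not measuring) a warm-up window before each one, and reports a confidence interval as a robustness estimate. The trace model and all numbers are invented; a real implementation would drive a cycle-accurate simulator rather than averaging a list.

```python
import random

random.seed(0)

# Toy "program": per-instruction cost with phase behaviour (e.g. cache-friendly
# vs cache-hostile phases). A real setup would model the microarchitecture.
PHASES = [1.0, 3.0, 1.5, 4.0, 2.0]
trace = [PHASES[(i // 200_000) % len(PHASES)] + random.gauss(0, 0.1)
         for i in range(1_000_000)]

def sampled_cpi(trace, n_samples=30, sample_len=5_000, warmup_len=2_000):
    """Statistical sampling: measure random windows, each preceded by a
    warm-up window that is executed (to restore caches, branch predictors,
    etc.) but not measured. Here the warm-up is symbolic."""
    measured = []
    for _ in range(n_samples):
        start = random.randrange(warmup_len, len(trace) - sample_len)
        _ = trace[start - warmup_len:start]       # warm-up: simulate, discard
        chunk = trace[start:start + sample_len]   # detailed, measured window
        measured.append(sum(chunk) / len(chunk))
    mean = sum(measured) / len(measured)
    var = sum((m - mean) ** 2 for m in measured) / (len(measured) - 1)
    return mean, (var / len(measured)) ** 0.5     # estimate and standard error

full_cpi = sum(trace) / len(trace)
est, se = sampled_cpi(trace)
print(f"full: {full_cpi:.3f}  sampled: {est:.3f} +/- {1.96 * se:.3f} (95% CI)")
```

The standard error across samples is exactly the kind of statistical robustness estimate mentioned above: it quantifies how far the sampled estimate is likely to sit from the full-simulation result.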
134

Dynamic instruction set extension of microprocessors with embedded FPGAs

Bauer, Heiner 13 April 2017 (has links) (PDF)
Increasingly complex applications and recent shifts in technology scaling have created a large demand for microprocessors which can perform tasks more quickly and more energy-efficiently. Conventional microarchitectures exploit multiple levels of parallelism to increase instruction throughput and use application-specific instruction sets or hardware accelerators to increase energy efficiency. Reconfigurable microprocessors adopt the same principle of providing application-specific hardware, however, with the significant advantage of post-fabrication flexibility. Not only does this offer similar gains in performance but also the flexibility to configure each device individually. This thesis explored the benefit of a tightly coupled, fine-grained reconfigurable microprocessor. In contrast to previous research, a detailed design space exploration of logical architectures for island-style field programmable gate arrays (FPGAs) has been performed in the context of a commercial 22nm process technology. Other research projects either reused general-purpose architectures or spent little effort to design and characterize custom fabrics, which are critical to system performance and to the practicality of frequently proposed high-level software techniques. Here, detailed circuit implementations and a custom area model were used to estimate the performance of over 200 different logical FPGA architectures with single-driver routing. The results of this exploration revealed tradeoffs and trends similar to those described by previous studies: the number of lookup table (LUT) inputs and the structure of the global routing network were shown to have a major impact on the area-delay product. However, the results suggested a much larger region of efficient architectures than previously reported. Finally, an architecture with 5-LUTs and 8 logic elements per cluster was selected. Modifications to the microprocessor, which was based on an industry-proven instruction set architecture, and to its software toolchain provided access to this embedded reconfigurable fabric via custom instructions. The baseline microprocessor was characterized with estimates from signoff data for a 28nm hardware implementation. A modified academic FPGA tool flow was used to transform Verilog implementations of custom instructions into a post-routing netlist with timing annotations. Simulation-based verification of the system was performed with a cycle-accurate processor model and diverse application benchmarks, ranging from signal processing over encryption to the computation of elementary functions. For these benchmarks, a significant increase in performance, with speedups from 3 to 15 relative to the baseline microprocessor, was achieved with the extended instruction set. Except in one case, application speedup clearly outweighed the area overhead of the extended system, even though the modeled fabric architecture was primitive and contained no explicit arithmetic enhancements. Insights into fundamental tradeoffs of island-style FPGA architectures, the developed exploration flow, and a concrete cost model are relevant for the development of more advanced architectures. Hence, this work is a successful proof of concept and has laid the basis for further investigations into architectural extensions and physical implementations. Potential for further optimization was identified on multiple levels, and numerous directions for future research were described.
/ Increasingly complex applications and the peculiarities of modern semiconductor technologies have created great demand for microprocessors that are both powerful and highly energy-efficient. Conventional architectures try to increase instruction throughput through parallelization and provide application-specific instruction sets or hardware accelerators to improve energy efficiency. Reconfigurable processors enable similar performance gains and at the same time have the enormous advantage that the specialization for a particular application can take place after fabrication. In this diploma thesis, a reconfigurable microprocessor with a tightly coupled FPGA was investigated. In contrast to earlier research approaches, an extensive design space exploration of the FPGA architecture was carried out in the context of a commercial 22nm fabrication process. Most previous research projects either used commercial architectures, which are not necessarily tailored to this use case, or the proposed FPGA components were only insufficiently examined and characterized, even though precisely this building block is decisive for the performance of the overall system. Therefore, over 200 different logical FPGA architectures were examined in this work, using concrete circuit topologies and an area model tailored to the fabrication process. In general, the same trends were observed as in previous, similarly extensive studies: here too, the results were largely determined by the size of the lookup tables (LUTs) and the structure of the routing network. At the same time, a much broader range of architectures with nearly equal efficiency was identified. For further evaluation, an FPGA architecture with 5-LUTs and 8 logic elements was selected. The performance of the chosen microprocessor, which builds on a proven instruction set architecture, was estimated with results from a 28nm test chip. A modified collection of academic software tools was used to map custom instructions onto the modeled FPGA architecture and to generate a netlist for subsequent simulation and verification. For a range of different application benchmarks, a relative performance gain between 3 and 15 over the original processor was determined. Although the proposed FPGA architecture is comparatively primitive and possesses no arithmetic extensions whatsoever, no disproportionate increase in chip area had to be accepted, except in one case. The insights gained into the dependencies between the architecture parameters, the developed exploration flow, and the concrete cost model are essential for further improvements of the FPGA architecture. This work has thus successfully demonstrated the advantage of the investigated system architecture and paved the way for possible extensions and hardware implementations. In addition, a number of architectural optimizations and further potential research directions were identified.
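The flavor of such an exploration can be sketched with a toy area-delay model swept over LUT size and cluster size. The model below (gate counts, depth, and all cost constants) is an invented first-order stand-in for the thesis's circuit-level 22nm characterisation, not its data; it only illustrates why the area-delay product has an interior optimum rather than favoring the smallest or largest LUTs.

```python
import itertools
import math

# Toy model: map a fixed netlist (G two-input gates, logic depth D) onto
# clusters of N k-input LUTs. Bigger LUTs absorb more logic per level but
# cost exponentially more area; bigger clusters amortise global routing but
# need larger local crossbars. All constants are illustrative assumptions.
G, D = 10_000, 40

def area_delay_product(k, n):
    luts = G / (k - 1)                        # gates absorbed per k-LUT
    levels = math.ceil(D / math.log2(k))      # LUT levels on the critical path
    area = (luts * (2 ** k + 8 * k)           # LUT SRAM bits + input muxes
            + luts * 2 * n                    # local crossbar, grows with N
            + (luts / n) * 100)               # per-cluster global routing
    delay = levels * (0.2 * k + 1.0 + 0.02 * n)  # LUT + routing delay per level
    return area * delay

ranked = sorted(itertools.product(range(3, 8), (2, 4, 6, 8, 10)),
                key=lambda kn: area_delay_product(*kn))
for k, n in ranked[:3]:
    print(f"k={k}-LUT, N={n}: ADP = {area_delay_product(k, n):.3g}")
```

With such a model, several (k, N) points cluster near the optimum, which mirrors the thesis's observation of a broad region of nearly equally efficient architectures.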
135

Methods for parameterizing and exploring Pareto frontiers using barycentric coordinates

Daskilewicz, Matthew John 08 April 2013 (has links)
The research objective of this dissertation is to create and demonstrate methods for parameterizing the Pareto frontiers of continuous multi-attribute design problems using barycentric coordinates, and in doing so, to enable intuitive exploration of optimal trade spaces. This work is enabled by two observations about Pareto frontiers that have not been previously addressed in the engineering design literature. First, the observation that the mapping between non-dominated designs and Pareto efficient response vectors is a bijection almost everywhere suggests that points on the Pareto frontier can be inverted to find their corresponding design variable vectors. Second, the observation that certain common classes of Pareto frontiers are topologically equivalent to simplices suggests that a barycentric coordinate system will be more useful for parameterizing the frontier than the Cartesian coordinate systems typically used to parameterize the design and objective spaces. By defining such a coordinate system, the design problem may be reformulated from y = f(x) to (y,x) = g(p), where x is a vector of design variables, y is a vector of attributes, and p is a vector of barycentric coordinates. Exploration of the design problem using p as the independent variables has the following desirable properties: 1) every vector p corresponds to a particular Pareto efficient design, and every Pareto efficient design corresponds to a particular vector p; 2) the number of p-coordinates is equal to the number of attributes, regardless of the number of design variables; 3) each attribute y_i has a corresponding coordinate p_i such that increasing the value of p_i corresponds to a motion along the Pareto frontier that improves y_i monotonically. The primary contribution of this work is the development of three methods for forming a barycentric coordinate system on the Pareto frontier, two of which are entirely original. The first method, named "non-domination level coordinates," constructs a coordinate system based on the (k-1)-attribute non-domination levels of a discretely sampled Pareto frontier. The second method is based on a modification to an existing "normal boundary intersection" multi-objective optimizer that adaptively redistributes its search basepoints in order to sample the entire frontier uniformly; the weights associated with each basepoint can then serve as a coordinate system on the frontier. The third method, named "Pareto simplex self-organizing maps," uses a modified self-organizing map training algorithm with a barycentric-grid node topology to iteratively conform a coordinate grid to the sampled Pareto frontier.
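For a convex toy problem, the barycentric parameterization can be demonstrated directly: with three quadratic attributes, the weighted-sum optimum for weights p is available in closed form, so each barycentric vector p maps to one Pareto-efficient design and its attribute vector, (y, x) = g(p). This is a minimal sketch in the spirit of the weight-based (second) method; the target points and the closed-form trick are invented for illustration and are not the dissertation's adaptive normal-boundary-intersection implementation.

```python
import itertools

# Toy convex tri-attribute problem: each objective is the squared distance of
# the design x (in R^2) from an objective-specific target point (assumed data).
targets = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

def g(p):
    """Map barycentric coordinates p (p_i >= 0, sum(p) == 1) to a Pareto
    efficient design x and attribute vector y. For the weighted sum of these
    quadratics, the optimum is the p-weighted centroid of the targets."""
    x = tuple(sum(pi * t[d] for pi, t in zip(p, targets)) for d in (0, 1))
    y = tuple((x[0] - t[0]) ** 2 + (x[1] - t[1]) ** 2 for t in targets)
    return y, x

# Sweep a barycentric grid: every p yields one Pareto point, and raising p_i
# pulls x toward target i, improving attribute y_i.
steps = 4
for i, j in itertools.product(range(steps + 1), repeat=2):
    if i + j <= steps:
        p = (i / steps, j / steps, (steps - i - j) / steps)
        y, x = g(p)
        print(f"p={p}  x=({x[0]:.2f},{x[1]:.2f})  "
              f"y=({y[0]:.2f},{y[1]:.2f},{y[2]:.2f})")
```

Note that the grid has as many coordinates as attributes (three), regardless of the dimension of x, which is exactly property 2 above.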
136

Design, Implementation and Evaluation of a Configurable NoC for AcENoCs FPGA Accelerated Emulation Platform

Lotlikar, Swapnil Subhash August 2010 (has links)
The heterogeneous nature and the demand for extensive parallel processing in modern applications have resulted in the widespread use of multicore System-on-Chip (SoC) architectures. The emerging Network-on-Chip (NoC) architecture provides an energy-efficient and scalable communication solution for multicore SoCs, serving as a powerful replacement for traditional bus-based solutions. The key to the successful realization of such architectures is a flexible, fast, and robust emulation platform for fast design space exploration. In this research, we present the design and evaluation of a highly configurable NoC used in AcENoCs (Accelerated Emulation platform for NoCs), a flexible and cycle-accurate field programmable gate array (FPGA) emulation platform for validating NoC architectures. Along with the implementation details, we also discuss the various design optimizations and tradeoffs and assess the performance improvements of AcENoCs over existing simulators and emulators. We design a hardware library consisting of routers and links using the Verilog hardware description language (HDL). The router is parameterized and has a configurable number of physical ports, virtual channels (VCs), and pipeline depth. A packet-switched NoC is constructed by connecting the routers in either a 2D-Mesh or 2D-Torus topology. The NoC is integrated in the AcENoCs platform and prototyped on a Xilinx Virtex-5 FPGA. The NoC was evaluated under various synthetic and realistic workloads generated by AcENoCs' traffic generators implemented on the Xilinx MicroBlaze embedded processor. In order to validate the NoC design, performance metrics like average latency and throughput were measured and compared against results obtained using standard network simulators. FPGA implementation of the NoC using Xilinx tools indicated a 76% LUT utilization for a 5x5 2D-Mesh network. A VC allocator was found to be the single largest consumer of hardware resources within a router. The router design synthesized at frequencies of 135MHz, 124MHz, and 109MHz for 3-port, 4-port, and 5-port configurations, respectively. The operational frequency of the router in the AcENoCs environment was limited only by the software execution latency, even though the hardware itself could be clocked at a much higher rate. The AcENoCs emulator showed speedup improvements of 10000-12000X over HDL simulators and 5-15X over software simulators, without sacrificing cycle accuracy.
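A back-of-the-envelope view of the latency metric can be sketched as below: average XY-routing hop count under uniform random traffic for the two supported topologies, scaled by an assumed router pipeline depth. This is a textbook zero-load approximation with invented parameters, not AcENoCs' measured results.

```python
import itertools

def avg_hops(k, torus=False):
    """Average dimension-ordered hop count between all source/destination
    pairs (including self-traffic) in a k x k mesh or torus."""
    nodes = [(x, y) for x in range(k) for y in range(k)]
    def dist(a, b):
        d = abs(a - b)
        return min(d, k - d) if torus else d   # torus wraps around
    total = sum(dist(sx, dx) + dist(sy, dy)
                for (sx, sy), (dx, dy) in itertools.product(nodes, repeat=2))
    return total / (len(nodes) ** 2)

PIPELINE_DEPTH = 4  # assumed router pipeline stages per hop
for name, torus in (("2D-Mesh", False), ("2D-Torus", True)):
    h = avg_hops(5, torus)
    print(f"5x5 {name}: avg hops = {h:.2f}, "
          f"zero-load latency ~ {h * PIPELINE_DEPTH:.1f} cycles")
```

The wrap-around links of the torus cut the average hop count relative to the mesh, which is one reason an emulator must support both topologies when comparing latency under load.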
137

A Markovian state-space framework for integrating flexibility into space system design decisions

Lafleur, Jarret Marshall 16 December 2011 (has links)
The past decades have seen the state of the art in aerospace system design progress from a scope of simple optimization to one including robustness, with the objective of permitting a single system to perform well even in off-nominal future environments. Integrating flexibility, or the capability to easily modify a system after it has been fielded in response to changing environments, into system design represents a further step forward. One challenge in accomplishing this rests in that the decision-maker must consider not only the present system design decision, but also sequential future design and operation decisions. Despite extensive interest in the topic, the state of the art in designing flexibility into aerospace systems, and particularly space systems, tends to be limited to analyses that are qualitative, deterministic, single-objective, and/or limited to consider a single future time period. To address these gaps, this thesis develops a stochastic, multi-objective, and multi-period framework for integrating flexibility into space system design decisions. Central to the framework are five steps. First, system configuration options are identified and costs of switching from one configuration to another are compiled into a cost transition matrix. Second, probabilities that demand on the system will transition from one mission to another are compiled into a mission demand Markov chain. Third, one performance matrix for each design objective is populated to describe how well the identified system configurations perform in each of the identified mission demand environments. The fourth step employs multi-period decision analysis techniques, including Markov decision processes (MDPs) from the field of operations research, to find efficient paths and policies a decision-maker may follow. The final step examines the implications of these paths and policies for the primary goal of informing initial system selection. Overall, this thesis unifies state-centric concepts of flexibility from economics and engineering literature with sequential decision-making techniques from operations research. The end objective of this thesis' framework and its supporting analytic and computational tools is to enable selection of the next-generation space systems today, tailored to decision-maker budget and performance preferences, that will be best able to adapt and perform in a future of changing environments and requirements. Following extensive theoretical development, the framework and its steps are applied to space system planning problems of (1) DARPA-motivated multiple- or distributed-payload satellite selection and (2) NASA human space exploration architecture selection.
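A toy instance of the framework's five steps can be written down directly: a cost transition matrix, a mission demand Markov chain, a performance matrix recast as operating cost, and value iteration over the resulting Markov decision process. All numbers below are invented for illustration; the thesis's problems are far larger and multi-objective.

```python
# State = (current config i, current mission m); action = next config j.
switch_cost = [[0.0, 5.0],   # step 1: cost to switch from config i to config j
               [4.0, 0.0]]
demand_P = [[0.8, 0.2],      # step 2: mission demand Markov chain
            [0.3, 0.7]]
op_cost = [[1.0, 6.0],       # step 3: performance matrix recast as the
           [5.0, 1.5]]       # operating cost of config j under mission m
GAMMA = 0.95                 # discount factor

def q(V, i, m, j):
    # One-step cost of choosing config j, plus discounted expected future cost.
    return (switch_cost[i][j] + op_cost[m][j]
            + GAMMA * sum(demand_P[m][m2] * V[(j, m2)] for m2 in range(2)))

def value_iteration(tol=1e-8):  # step 4: solve the MDP
    V = {(i, m): 0.0 for i in range(2) for m in range(2)}
    while True:
        policy = {s: min(range(2), key=lambda j: q(V, *s, j)) for s in V}
        V_new = {s: q(V, *s, policy[s]) for s in V}
        if max(abs(V_new[s] - V[s]) for s in V) < tol:
            return V_new, policy
        V = V_new

V, policy = value_iteration()
print("policy (config, mission) -> next config:", policy)
print("expected discounted cost from (config 0, mission 0): %.2f" % V[(0, 0)])
```

Step 5 corresponds to reading the resulting policy and value function back into the initial selection decision: the configuration with the lowest expected discounted cost across likely starting missions is the preferred first acquisition.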
138

A Method for Optimised Allocation of System Architectures with Real-time Constraints

Ventovaara, Marcus; Hasanbegović, Arman January 2018
Optimised allocation of system architectures is a well-researched area, as it can greatly reduce the development cost of systems and increase performance and reliability in their respective applications. In conjunction with the recent shift from federated to integrated architectures in automotive, and the increasing complexity of computer systems in terms of both software and hardware, the applications of design space exploration and optimised allocation of system architectures are of great interest. This thesis proposes a method to derive architectures and their allocations for systems with real-time constraints. The method implements integer linear programming to solve for an optimised allocation of system architectures according to a set of linear constraints, taking resource requirements, communication dependencies, and manual design choices into account; a sketch of such a formulation follows below. Additionally, this thesis describes and evaluates an industrial use case of the method, wherein the timing characteristics of a system were evaluated and the method was applied to simultaneously derive a system architecture and an optimised allocation of it. This thesis presents evidence and validations that suggest the viability of the method and its use case in an industrial setting. The work sets a precedent for future research and development, as well as for future applications of the method in both industry and academia.
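A minimal sketch of such an ILP formulation, written with the PuLP modelling library, might look like the following. The task set, node capacities, costs, and the utilisation bound (a crude stand-in for real response-time or schedulability constraints) are all hypothetical, not the thesis's industrial model.

```python
from pulp import LpProblem, LpMinimize, LpVariable, lpSum, LpBinary

tasks = {"t1": 0.30, "t2": 0.25, "t3": 0.40, "t4": 0.20}  # CPU utilisation
nodes = {"n1": 0.69, "n2": 1.00}                           # capacity bound
links = [("t1", "t2"), ("t2", "t3")]                       # comm. dependencies
node_cost = {"n1": 1.0, "n2": 3.0}                         # e.g. hardware cost

prob = LpProblem("allocation", LpMinimize)
x = {(t, n): LpVariable(f"x_{t}_{n}", cat=LpBinary)        # task-to-node vars
     for t in tasks for n in nodes}
split = {l: LpVariable(f"split_{l[0]}_{l[1]}", cat=LpBinary) for l in links}

# Objective: node cost of the chosen allocation plus a cross-node comm penalty.
prob += (lpSum(node_cost[n] * x[t, n] for t in tasks for n in nodes)
         + 2.0 * lpSum(split.values()))
for t in tasks:                              # each task runs on exactly one node
    prob += lpSum(x[t, n] for n in nodes) == 1
for n, cap in nodes.items():                 # utilisation (schedulability) bound
    prob += lpSum(tasks[t] * x[t, n] for t in tasks) <= cap
for (a, b) in links:                         # split = 1 if a and b differ
    for n in nodes:
        prob += x[a, n] - x[b, n] <= split[(a, b)]

prob.solve()
for t in tasks:
    print(t, "->", next(n for n in nodes if x[t, n].value() == 1))
```

The split variables linearise the "tasks on different nodes" indicator; manual design choices would enter as additional fixed assignments (e.g. `prob += x["t1", "n1"] == 1`).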
139

High level design and control of adaptive multiprocessor system-on-chips / Conception et contrôle de haut niveau pour les systèmes sur puce multiprocesseurs adaptatifs

An, Xin 16 October 2013 (has links)
The design of modern embedded systems is becoming more and more complex, as more functionality is integrated into them. At the same time, in order to meet computational requirements while keeping power consumption low, MPSoCs have emerged as the main solution for such embedded systems. Moreover, embedded systems are increasingly adaptive, as adaptivity brings a number of benefits, such as software flexibility and energy efficiency. This thesis targets the safe design of such adaptive MPSoCs. First, each system configuration must be analysed with respect to its functional and non-functional properties. We present an abstract design and analysis framework that allows faster and more cost-effective implementation decisions. This framework is intended as an intermediate reasoning support for system-level software/hardware co-design environments; it can prune the design space at its largest and identify candidate design solutions quickly and efficiently. Within this framework, we use an abstract clock-based encoding to model system behaviours. Different mapping and scheduling scenarios of applications on MPSoCs are analysed via clock traces that represent system simulations. The properties of interest are functional behavioural correctness, temporal performance, and energy consumption. Second, the reconfiguration management of adaptive MPSoCs must be addressed. We are particularly interested in MPSoCs implemented on reconfigurable hardware architectures (e.g., FPGA fabrics), which offer good flexibility and computational efficiency for adaptive MPSoCs. We propose a general design framework based on the discrete controller synthesis (DCS) technique to address this problem. The main advantage of this technique is that it enables the automatic synthesis of a controller with respect to a given specification of control objectives. In this framework, the system's reconfiguration behaviour is modelled in terms of synchronous parallel automata. The problem of computing the reconfiguration management with respect to multiple objectives concerning, for example, resource usage, performance, and energy consumption is encoded as a DCS problem. The existing BZR programming language and the Sigali tool are employed to perform DCS and generate a controller that satisfies the system requirements. Finally, we study two different ways of combining the two proposed design frameworks for adaptive MPSoCs. First, they are combined to build a complete design flow for adaptive MPSoCs. Second, they are combined to show how the run-time manager designed with the second framework can be integrated into the first framework, so that high-level simulations can be performed to evaluate the run-time manager. / The design of modern embedded systems is getting more and more complex, as more functionality is integrated into these systems. At the same time, in order to meet the computational requirements while keeping power consumption low, MPSoCs have emerged as the main solutions for such embedded systems.
Furthermore, embedded systems are becoming more and more adaptive, as adaptivity can bring a number of benefits, such as software flexibility and energy efficiency. This thesis targets the safe design of such adaptive MPSoCs. First, each system configuration must be analyzed concerning its functional and non-functional properties. We present an abstract design and analysis framework, which allows for faster and more cost-effective implementation decisions. This framework is intended as an intermediate reasoning support for system-level software/hardware co-design environments. It can prune the design space at its largest and identify candidate design solutions in a fast and efficient way. In the framework, we use an abstract clock-based encoding to model system behaviors. Different mapping and scheduling scenarios of applications on MPSoCs are analyzed via clock traces representing system simulations. Among the properties of interest are functional behavioral correctness, temporal performance, and energy consumption. Second, the reconfiguration management of adaptive MPSoCs must be addressed. We are especially interested in MPSoCs implemented on reconfigurable hardware architectures (i.e., FPGA fabrics), which provide good flexibility and computational efficiency for adaptive MPSoCs. We propose a general design framework based on the discrete controller synthesis (DCS) technique to address this issue. The main advantage of this technique is that it allows automatic controller synthesis w.r.t. a given specification of control objectives. In the framework, the system reconfiguration behavior is modeled in terms of synchronous parallel automata. The reconfiguration management computation problem w.r.t. multiple objectives regarding, e.g., resource usage, performance, and power consumption is encoded as a DCS problem. The existing BZR programming language and Sigali tool are employed to perform DCS and generate a controller that satisfies the system requirements. Finally, we investigate two different ways of combining the two proposed design frameworks for adaptive MPSoCs. Firstly, they are combined to construct a complete design flow for adaptive MPSoCs. Secondly, they are combined to show how the run-time manager designed with the second framework can be integrated into the first framework, so that high-level simulations can be performed to assess the run-time manager.
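The abstract clock-based encoding can be illustrated with a small sketch: binary activation traces per (task, processor) pair, from which a makespan, a resource-conflict check, and a crude energy figure are derived. The traces and power numbers below are invented, and the analysis is a deliberately simplified stand-in for the framework's much richer property checks.

```python
# Each task's activity on a processor is a binary clock trace
# (1 = active in that abstract instant). All values are assumptions.
traces = {
    ("fft",  "proc0"): [1, 1, 1, 0, 0, 0, 1, 1],
    ("fir",  "proc1"): [0, 1, 1, 1, 0, 0, 0, 0],
    ("sink", "proc0"): [0, 0, 0, 1, 1, 0, 0, 0],
}
ACTIVE_mW = {"proc0": 90, "proc1": 60}   # invented power figures
IDLE_mW = {"proc0": 9, "proc1": 6}

length = max(len(t) for t in traces.values())
busy = {p: [0] * length for p in ACTIVE_mW}
for (task, proc), trace in traces.items():
    for i, tick in enumerate(trace):
        busy[proc][i] |= tick

# Functional sanity check: a processor is never claimed twice per instant.
for p in ACTIVE_mW:
    for i in range(length):
        assert sum(tr[i] for (_, q), tr in traces.items() if q == p) <= 1

# Temporal and energy properties derived from the traces.
makespan = max(i for slots in busy.values()
               for i, b in enumerate(slots) if b) + 1
energy = sum(ACTIVE_mW[p] * sum(b) + IDLE_mW[p] * (makespan - sum(b))
             for p, b in busy.items())
print(f"makespan: {makespan} abstract instants, energy: {energy} mW*instant")
```

Comparing such figures across candidate mappings is what lets the framework prune the design space before any detailed co-simulation.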
140

Visualisation d’information pour une décision informée en exploration d’espace de conception par shopping / Information visualization for an informed decision to design space exploration by shopping

Abi Akle, Audrey 10 July 2015 (has links)
Lors de l’exploration d’espace de conception, les données résultantes de la simulation d’un grand nombre d’alternatives de conception peuvent conduire à la surcharge d’information quand il s’agit de choisir une bonne solution de conception. Cette exploration d’espace de conception s’apparente à une méthode d’optimisation en conception multicritère mais en mode manuel pour lequel des outils appropriés à la visualisation de données multidimensionnelle sont employés. Pour le concepteur, un processus en trois phases – découverte, optimisation, sélection – est suivi selon un paradigme dit de Design by Shopping. Le fait de « parcourir » l’espace de conception permet de gagner en intuition sur les sous-espaces de solutions faisables et infaisables et sur les solutions offrant de bons compromis. Le concepteur apprend au cours de ces manipulations graphiques de données. La sélection d’une solution optimale se fait donc sur la base d’une décision dite informée. L’objectif de cette recherche est la performance des représentations graphiques pour l’exploration d’espace de conception, pour les trois phases du processus en Design by Shopping. Pour cela, cinq représentations graphiques, identifiées comme potentiellement performantes, sont testées à travers deux expérimentations. Dans la première, trente participants ont testé trois graphiques, pour la phase de sélection dans une situation multi-attribut, à travers trois scénarios de conception où une voiture doit être choisie parmi quarante selon des préférences énoncées. Pour cela, un indice de qualité est proposé pour calculer la qualité de la solution du concepteur pour un des trois scénarios définis, la solution optimale selon cet indice étant comparée à celles obtenues après manipulation des graphiques. Dans la deuxième expérimentation, quarante-deux concepteurs novices ont résolu deux problèmes de conception à l’aide de trois graphiques. Dans ce cas, la performance des graphiques est testée pour la prise de décision informée et pour les trois phases du processus dans une situation multi-objectif. Les résultats révèlent qu’un graphique est adapté à chacune des trois phases du Design by Shopping :: le graphique Scatter Plot Matrix pour la phase de découverte et pour la prise de décision informée, le graphique Simple Scatter pour la phase d’optimisation et le graphique Parallel Coordinate Plot pour la phase de sélection aussi bien dans une situation multi-attribut que multi-objectif. / In Design space exploration, the resulting data, from simulation of large amount of new design alternatives, can lead to information overload when one good design solution must be chosen. The design space exploration relates to a multi-criteria optimization method in design but in manual mode, for which appropriate tools to support multi-dimensional data visualization are employed. For the designer, a three-phase process - discovery, optimization, selection - is followed according to a paradigm called Design by Shopping. Exploring the design space helps to gain insight into both feasible and infeasible solutions subspaces, and into solutions presenting good trade-offs. Designers learn during these graphical data manipulations and the selection of an optimal solution is based on a so-called informed decision. The objective of this research is the performance of graphs for design space exploration according to the three phases of the Design by Shopping process. In consequence, five graphs, identified as potentially efficient, are tested through two experiments. 
In the first, thirty participants tested three graphs for the selection phase in a multi-attribute situation with stated preferences, through three design scenarios in which one car must be chosen out of forty. A response quality index is proposed to compute the choice quality for each of the three given scenarios, the optimal solutions being compared to the ones resulting from the graphical manipulations. In the second experiment, forty-two novice designers solved two design problems with three graphs. In this case, the performance of graphs is tested for informed decision-making and for the three phases of the process in a multi-objective situation. The results reveal three efficient graphs for design space exploration: the Scatter Plot Matrix for the discovery phase and for informed decision-making, the Simple Scatter Plot for the optimization phase, and the Parallel Coordinate Plot for the selection phase, in multi-attribute as well as multi-objective situations.
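For a feel of the graphs involved, the sketch below generates a mock design space and renders two of the tested representations with pandas' built-in plotting helpers: a scatter plot matrix (discovery) and a parallel coordinate plot with the Pareto set highlighted (selection). The data, attribute names, and preference directions are invented stand-ins for the experiments' simulated car alternatives.

```python
import numpy as np
import pandas as pd
from pandas.plotting import parallel_coordinates, scatter_matrix
import matplotlib.pyplot as plt

# Mock design space: 500 car-like alternatives with four attributes.
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "price":       rng.uniform(10, 60, 500),
    "consumption": rng.uniform(3, 12, 500),
    "power":       rng.uniform(60, 300, 500),
    "comfort":     rng.uniform(0, 10, 500),
})

# Discovery: scatter plot matrix for an overview of pairwise trade-offs.
scatter_matrix(df, figsize=(8, 8), diagonal="hist")

# Selection: parallel coordinates over normalised attributes, highlighting
# the Pareto set (minimise price/consumption, maximise power/comfort).
signs = np.array([1, 1, -1, -1])        # flip maximised attributes to "min"
vals = df.values * signs
pareto = [i for i in range(len(df))
          if not ((vals <= vals[i]).all(axis=1)
                  & (vals < vals[i]).any(axis=1)).any()]
norm = (df - df.min()) / (df.max() - df.min())
norm["set"] = ["Pareto" if i in pareto else "dominated" for i in range(len(df))]
plt.figure(figsize=(8, 4))
parallel_coordinates(norm, "set", color=("lightgray", "tab:blue"), alpha=0.6)
plt.show()
```

Plotting the dominated designs in a muted colour behind the Pareto set is one simple way to let a viewer "shop" along the frontier while still seeing how much of the space it dominates.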
