1

Performance Improvement of Adaptive Processors

Döbrich, Stefan 03 August 2017 (has links) (PDF)
Improving a computer's performance has been of major interest to users around the world, from computing centers to private persons, ever since computer science entered the stage, and then the spotlight, in the 1940s. Most often, this is achieved either by replacing parts of the computer with better-performing ones, called an upgrade, or simply by buying a newer and better computer. Another approach, which originates in the scientific community, is optimizing the source code of an application. Here, the application programmer draws on knowledge of the underlying platform and its tool chain to produce tuned binary code that performs better. Clearly, this technique will never be an option for consumer electronics or for people outside the field of programming and software development. Traditionally, these users stick with the upgrade-or-buy-new approach. In recent years, consumer electronics have evolved into multi-purpose devices capable of almost any functionality, thanks to their internet connection and their ability to dynamically download and install new software. Inevitably, an application may be too demanding for a given hardware revision. As these devices are built in a monolithic way, a hardware upgrade is not an option, yet most users do not want to buy a new device every time this happens. Thus, the processor itself must be able to adapt to a given application at runtime and thereby improve its own performance. This thesis presents three major approaches to such runtime dynamic application acceleration.
2

Hardware Synthesis of Synchronous Data Flow Models

Koecher, Matthew R. 06 April 2004 (has links) (PDF)
Synchronous Dataflow (SDF) graphs are a convenient way to represent many signal processing and dataflow operations. Nodes within SDF graphs represent computation, while arcs represent dependencies between nodes. Using a graph representation, SDF graphs formally specify a dataflow algorithm without any assumptions about the final implementation. This allows an SDF model to be synthesized into a variety of implementations, including both software and hardware. This thesis presents a technique for generating an abstract hardware representation from SDF models. The techniques presented here operate on SDF models defined structurally within the Ptolemy modeling environment. The behavior of the nodes within Ptolemy SDF models is specified in software and can be simple, such as a single arithmetic operation, or arbitrarily complex. Specifically, the thesis presents a technique for extracting the behavior of a limited class of SDF nodes defined in software and generating a structural description of the SDF model based on primitive arithmetic and logical operations. This synthesized graph can be used for subsequent hardware synthesis transformations.
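As an illustration of the model this abstract describes, the sketch below shows one way an SDF graph could be represented and how its repetition vector (the number of firings per actor in one balanced iteration) follows from the token rates on the arcs. The names (Arc, SdfGraph, repetition_vector) are hypothetical and do not reflect Ptolemy's actual API; this is a minimal sketch assuming a connected, consistent graph.

```python
# Minimal SDF sketch; class and function names are illustrative, not Ptolemy's API.
from dataclasses import dataclass
from fractions import Fraction
from math import gcd

@dataclass
class Arc:
    src: str        # producing actor
    dst: str        # consuming actor
    produce: int    # tokens written per firing of src
    consume: int    # tokens read per firing of dst

@dataclass
class SdfGraph:
    actors: list
    arcs: list

    def repetition_vector(self):
        """Solve the balance equations q[src] * produce == q[dst] * consume
        (assumes a connected, consistent graph)."""
        q = {self.actors[0]: Fraction(1)}
        changed = True
        while changed:
            changed = False
            for a in self.arcs:
                if a.src in q and a.dst not in q:
                    q[a.dst] = q[a.src] * a.produce / a.consume
                    changed = True
                elif a.dst in q and a.src not in q:
                    q[a.src] = q[a.dst] * a.consume / a.produce
                    changed = True
        # Scale the rational solution to the smallest integer vector.
        scale = 1
        for v in q.values():
            scale = scale * v.denominator // gcd(scale, v.denominator)
        return {name: int(v * scale) for name, v in q.items()}

# A source feeding a 4-input consumer must fire four times per iteration.
g = SdfGraph(["src", "fir", "sink"],
             [Arc("src", "fir", 1, 4), Arc("fir", "sink", 1, 1)])
print(g.repetition_vector())   # {'src': 4, 'fir': 1, 'sink': 1}
```

Such a repetition vector is the usual starting point for a static schedule, which in a hardware synthesis flow fixes how often each synthesized block fires per graph iteration.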
3

System-Level Hardware Synthesis of Dataflow Programs with HEVC as Study Use Case / Synthèse matérielle au niveau système des programmes flots-de-données : étude de cas du décodeur HEVC

Abid, Mariem 28 April 2016 (has links)
Les applications de traitement d'image et vidéo sont caractérisées par le traitement d'une grande quantité de données. La conception de ces applications complexes avec des méthodologies de conception traditionnelles bas niveau provoque l'augmentation des coûts de développement. Afin de résoudre ces défis, des outils de synthèse haut niveau ont été proposés. Le principe de base est de modéliser le comportement de l'ensemble du système en utilisant des spécifications haut niveau afin de permettre la synthèse automatique vers des spécifications bas niveau pour une implémentation efficace en FPGA. Cependant, l'inconvénient principal de ces outils de synthèse haut niveau est le manque de prise en compte de la totalité du système, c.-à-d. la création de la communication entre les différents composants pour atteindre le niveau système n'est pas considérée. Le but de cette thèse est d'élever le niveau d'abstraction dans la conception des systèmes embarqués au niveau système. Nous proposons un flot de conception qui permet une synthèse matérielle efficace des applications de traitement vidéo décrites en utilisant un langage spécifique à un domaine pour la programmation flot-de-données. Le flot de conception combine un compilateur flot-de-données pour générer des descriptions à base de code C et un synthétiseur pour générer des descriptions au niveau transfert de registre. Le défi majeur de l'implémentation en FPGA des canaux de communication des programmes flot-de-données basés sur un modèle de calcul est la minimisation des surcoûts de communication. Pour cela, nous avons introduit une nouvelle approche de synthèse d'interface qui mappe les grandes quantités de données vidéo vers des mémoires partagées sur FPGA, ce qui conduit à une diminution considérable de la latence et à une augmentation du débit. Ces résultats ont été démontrés sur la synthèse matérielle du standard vidéo émergent High-Efficiency Video Coding (HEVC). / Image and video processing applications are characterized by the processing of a huge amount of data. The design of such complex applications with traditional design methodologies at a low level of abstraction causes increasing development costs. To address these challenges, Electronic System Level (ESL) synthesis or High-Level Synthesis (HLS) tools were proposed. The basic premise is to model the behavior of the entire system using high-level specifications, and to enable automatic synthesis to low-level specifications for efficient implementation on a Field-Programmable Gate Array (FPGA). However, the main downside of HLS tools is that they do not consider the entire system, i.e. the communication between the generated components needed to reach the system level is not yet established. The purpose of this thesis is to raise the level of abstraction in the design of embedded systems to the system level. A novel design flow is proposed that enables an efficient hardware implementation of video processing applications described using a Domain-Specific Language (DSL) for dataflow programming. The design flow combines a dataflow compiler, which generates C-based HLS descriptions from a dataflow description, with a C-to-gate synthesizer, which generates Register-Transfer Level (RTL) descriptions. The main challenge when implementing the communication channels of dataflow programs, which rely on a Model of Computation (MoC), on an FPGA is minimizing the communication overhead.
To address this issue, we introduce a new interface synthesis approach that maps the large amounts of data processed by multimedia and image processing applications to shared memories on the FPGA. This leads to a tremendous decrease in latency and an increase in throughput. These results were demonstrated on the hardware synthesis of the emerging High-Efficiency Video Coding (HEVC) standard.
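To make the shared-memory idea concrete, here is a deliberately simplified, software-only sketch (not the thesis' actual tool output) contrasting a token-by-token FIFO channel with an interface that keeps the frame in one shared buffer and only passes a small descriptor through the channel. The buffer and descriptor names are hypothetical.

```python
# Illustrative only: per-pixel FIFO traffic vs. a descriptor-based
# shared-memory interface for one (toy-sized) video frame.
from collections import deque

WIDTH, HEIGHT = 16, 8            # a real HD frame would be ~2 million pixels
N = WIDTH * HEIGHT

# (a) Token-by-token FIFO: every pixel is pushed through the channel once,
# so channel traffic grows with the frame size.
fifo = deque()
producer_frame = bytes(N)                      # dummy frame data
for px in producer_frame:
    fifo.append(px)
consumed = bytes(fifo.popleft() for _ in range(N))

# (b) Shared-memory interface: the frame stays in one buffer (standing in for
# an on-chip memory); the channel only carries a small (offset, size) descriptor.
shared = bytearray(N)
desc_fifo = deque()
desc_fifo.append((0, N))                       # producer publishes the frame location
offset, size = desc_fifo.popleft()
frame_view = memoryview(shared)[offset:offset + size]   # consumer reads in place
assert len(frame_view) == len(consumed)
```

The point mirrored by the interface synthesis approach described above: moving bulk data into a shared memory turns per-token channel traffic into a handful of control tokens, which is where the latency and throughput gains come from.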
4

Performance Improvement of Adaptive Processors: Hardware Synthesis, Instruction Folding and Microcode Assembly

Döbrich, Stefan 28 January 2013 (has links)
Improving a computer's performance has been of major interest to users around the world, from computing centers to private persons, ever since computer science entered the stage, and then the spotlight, in the 1940s. Most often, this is achieved either by replacing parts of the computer with better-performing ones, called an upgrade, or simply by buying a newer and better computer. Another approach, which originates in the scientific community, is optimizing the source code of an application. Here, the application programmer draws on knowledge of the underlying platform and its tool chain to produce tuned binary code that performs better. Clearly, this technique will never be an option for consumer electronics or for people outside the field of programming and software development. Traditionally, these users stick with the upgrade-or-buy-new approach. In recent years, consumer electronics have evolved into multi-purpose devices capable of almost any functionality, thanks to their internet connection and their ability to dynamically download and install new software. Inevitably, an application may be too demanding for a given hardware revision. As these devices are built in a monolithic way, a hardware upgrade is not an option, yet most users do not want to buy a new device every time this happens. Thus, the processor itself must be able to adapt to a given application at runtime and thereby improve its own performance. This thesis presents three major approaches to such runtime dynamic application acceleration.
Contents:
1 Introduction 5; 1.1 Motivation 5; 1.2 Targets and Aims 7; 1.3 Thesis Outline 8
2 AMIDAR - A Runtime Reconfigurable Processor 11; 2.1 Overall Processor Architecture 11; 2.2 Principle of Operation 14; 2.3 Applicability of the AMIDAR Model 15; 2.4 Adaptivity in AMIDAR Processors 16; 2.5 Relations to Existing Processor Architectures 19
3 Applicability to Different Instruction Set Architectures 23; 3.1 Supported Instruction Set Architectures 23; 3.2 Selecting an ISA for Hardware Acceleration 25; 3.3 A Detailed Look at an AMIDAR Based Java Processor 29; 3.4 Example Token Sequence and Execution Trace 31; 3.5 Performance Comparison of AMIDAR and IA32 Processors 34
4 Hotspot Evaluation 37
5 Runtime Reconfiguration of Processors 41; 5.1 The Idea of Processor Reconfiguration 41; 5.2 Targets and Aims for Efficient Processor Extensibility 43
6 Hardware Synthesis 47; 6.1 The Evolution of Coarse Grain Reconfigurable Computing 47; 6.2 The CGRA Target Architecture 71; 6.3 Hardware Synthesis 79; 6.4 Evaluation and Results of Hardware Synthesis 97; 6.5 Saving Hardware With Heterogeneous CGRAs 103; 6.6 The Size of Token Sets for Synthesized Functional Units 107; 6.7 The Runtime Consumption of Performance Acceleration 108
7 Instruction Folding 113; 7.1 The General Idea Behind Instruction Folding 113; 7.2 General Classification of Folding Strategies 114; 7.3 Folding Based on Instruction Type Pattern 116; 7.4 Java Bytecode Folding Based on Behavioural Pattern 121; 7.5 Common Applications of Instruction Folding 125; 7.6 Instruction Folding and the AMIDAR Execution Model 126
8 Assembly of Microinstruction Groups 151; 8.1 Motivation and General Idea 151; 8.2 The Basic Token Set Assembly Algorithm 159; 8.3 Algorithmic Extensions 179; 8.4 Synthilation for an Unaltered Basic Processor 182; 8.5 Synthilation Performance on Multi-ALU Processors 191; 8.6 Runtime Characteristics of Synthilation Algorithms 195
9 Comparison 197; 9.1 Speedup Comparison 197; 9.2 Runtime and Complexity 198; 9.3 Token Memory Consumption 200; 9.4 Consumed Hardware Resources 201
10 Conclusion 203; 10.1 Realization of Targets and Aims 203; 10.2 The Ideal Use Case for Each Acceleration Approach 204; 10.3 Limitations and Drawbacks 206; 10.4 Summary 207
A Benchmark Applications 209; A.1 Cryptographic Ciphers 209; A.2 Hash Functions and Message Digests 210; A.3 Image Processing Filters 212; A.4 Jpeg Encoder 212
B Benchmark Measurement Values 213; B.1 Measurements of Instruction Set Evaluation 213; B.2 Measurement Values of Hardware Synthesis 217; B.3 Measurement Values of Instruction Folding 227; B.4 Measurement Values of Token Set Synthilation 243
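As a rough illustration of the instruction-folding idea named in chapter 7 of the outline above, the sketch below folds the classic load/load/operate/store pattern of a stack-machine bytecode into a single register-style operation. The names (FoldedOp, fold) and the pattern set are hypothetical simplifications, not the AMIDAR implementation.

```python
# Hypothetical, simplified folding pass over a Java-like stack bytecode.
from dataclasses import dataclass

@dataclass
class FoldedOp:
    op: str          # operator bytecode, e.g. "iadd"
    sources: tuple   # local-variable slots feeding the operator
    dest: int        # local-variable slot receiving the result

LOADS = {"iload"}
OPS = {"iadd", "isub", "imul"}
STORES = {"istore"}

def fold(bytecode):
    """Scan linearly; merge load/load/op/store windows, pass everything else through."""
    out, i = [], 0
    while i < len(bytecode):
        w = bytecode[i:i + 4]
        if (len(w) == 4 and w[0][0] in LOADS and w[1][0] in LOADS
                and w[2][0] in OPS and w[3][0] in STORES):
            out.append(FoldedOp(w[2][0], (w[0][1], w[1][1]), w[3][1]))
            i += 4
        else:
            out.append(bytecode[i])
            i += 1
    return out

# c = a + b compiles roughly to the four stack operations below; folding
# removes the intermediate operand-stack traffic and yields one operation.
print(fold([("iload", 1), ("iload", 2), ("iadd", None), ("istore", 3)]))
# -> [FoldedOp(op='iadd', sources=(1, 2), dest=3)]
```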
5

High-Level-Synthese von Operationseigenschaften / High-Level Synthesis Using Operation Properties

Langer, Jan 12 December 2011 (has links) (PDF)
In der formalen Verifikation digitaler Schaltkreise hat sich die Methodik der vollständigen Verifikation anhand spezieller Operationseigenschaften bewährt. Operationseigenschaften beschreiben das Verhalten einer Schaltung in einem festen Zeitintervall und können sequentiell miteinander verknüpft werden, um so das Gesamtverhalten zu spezifizieren. Zusätzlich beweist eine formale Vollständigkeitsprüfung, dass die Menge der Eigenschaften für jede Folge von Eingangssignalwerten die Ausgänge der zu verifizierenden Schaltung eindeutig und lückenlos determiniert. In dieser Arbeit wird untersucht, wie aus Operationseigenschaften, deren Vollständigkeit erfolgreich bewiesen wurde, automatisiert eine Schaltungsbeschreibung abgeleitet werden kann. Gegenüber der traditionellen Entwurfsmethodik auf Register-Transfer-Ebene (RTL) bietet dieses Verfahren zwei Vorteile. Zum einen vermeidet der Vollständigkeitsbeweis viele Arten von Entwurfsfehlern, zum anderen ähnelt eine Beschreibung mit Hilfe von Operationseigenschaften den in Spezifikationen häufig genutzten Zeitdiagrammen, sodass die Entwurfsebene der Spezifikationsebene angenähert wird und Fehler durch manuelle Verfeinerungsschritte vermieden werden. Das Entwurfswerkzeug vhisyn führt die High-Level-Synthese (HLS) einer vollständigen Menge von Operationseigenschaften zu einer Beschreibung auf RTL durch. Die Ergebnisse zeigen, dass sowohl die verwendeten Synthesealgorithmen als auch die erzeugten Schaltungen effizient sind und somit die Realisierung größerer Beispiele zulassen. Anhand zweier Fallstudien kann dies praktisch nachgewiesen werden. / The complete verification approach using special operation properties is an accepted methodology for the formal verification of digital circuits. Operation properties describe the behavior of a circuit during a certain time interval. They can be sequentially concatenated in order to specify the overall behavior. Additionally, a formal completeness check proves that the set of properties consistently determines the exact value of the output signals for every valid sequence of input signal values. This work examines how a circuit description can be automatically derived from a set of operation properties whose completeness has been proven. In contrast to the traditional design flow at register-transfer level (RTL), this method offers two advantages. First, the proof of completeness helps to avoid many kinds of design errors. Second, the design of operation properties resembles the design of timing diagrams often used in textual specifications. Therefore, the design level is closer to the specification level and errors caused by manual refinement steps are avoided. The design tool vhisyn performs the high-level synthesis from a complete set of operation properties to a description at RTL. The results show that both the synthesis algorithms and the generated circuit descriptions are efficient and allow the design of larger applications. This is demonstrated by means of two case studies.
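For readers unfamiliar with operation properties, the following toy model (hypothetical names; not the vhisyn input format) captures the idea described above: each property covers a fixed time interval, starts in a conceptual state, fixes the outputs, and hands over to a successor state; a simplified case-split check stands in for the formal completeness proof.

```python
# Toy model of operation properties and a simplified completeness check.
from dataclasses import dataclass
from typing import Callable

@dataclass
class OperationProperty:
    state: str                        # conceptual state the property starts in
    trigger: Callable[[dict], bool]   # condition on the inputs at the start
    length: int                       # clock cycles covered by the property
    outputs: Callable[[dict], dict]   # output values the property determines
    next_state: str                   # conceptual state reached at the end

PROPS = [
    OperationProperty("IDLE", lambda i: i["start"],     1, lambda i: {"busy": 1}, "RUN"),
    OperationProperty("IDLE", lambda i: not i["start"], 1, lambda i: {"busy": 0}, "IDLE"),
    OperationProperty("RUN",  lambda i: True,           2, lambda i: {"busy": 0}, "IDLE"),
]

def case_split_complete(props, states, input_valuations):
    """Simplified stand-in for the completeness check: in every state, exactly one
    property must trigger for every input valuation, so the outputs are determined
    uniquely and without gaps."""
    for state in states:
        for inputs in input_valuations:
            hits = [p for p in props if p.state == state and p.trigger(inputs)]
            if len(hits) != 1:
                return False
    return True

print(case_split_complete(PROPS, {"IDLE", "RUN"},
                          [{"start": False}, {"start": True}]))   # True
```

A real completeness check must also establish determination of outputs and successor states symbolically rather than by enumeration, but the case split above is the core intuition.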
6

High-Level-Synthese von Operationseigenschaften

Langer, Jan 23 November 2011 (has links)
In der formalen Verifikation digitaler Schaltkreise hat sich die Methodik der vollständigen Verifikation anhand spezieller Operationseigenschaften bewährt. Operationseigenschaften beschreiben das Verhalten einer Schaltung in einem festen Zeitintervall und können sequentiell miteinander verknüpft werden, um so das Gesamtverhalten zu spezifizieren. Zusätzlich beweist eine formale Vollständigkeitsprüfung, dass die Menge der Eigenschaften für jede Folge von Eingangssignalwerten die Ausgänge der zu verifizierenden Schaltung eindeutig und lückenlos determiniert. In dieser Arbeit wird untersucht, wie aus Operationseigenschaften, deren Vollständigkeit erfolgreich bewiesen wurde, automatisiert eine Schaltungsbeschreibung abgeleitet werden kann. Gegenüber der traditionellen Entwurfsmethodik auf Register-Transfer-Ebene (RTL) bietet dieses Verfahren zwei Vorteile. Zum einen vermeidet der Vollständigkeitsbeweis viele Arten von Entwurfsfehlern, zum anderen ähnelt eine Beschreibung mit Hilfe von Operationseigenschaften den in Spezifikationen häufig genutzten Zeitdiagrammen, sodass die Entwurfsebene der Spezifikationsebene angenähert wird und Fehler durch manuelle Verfeinerungsschritte vermieden werden. Das Entwurfswerkzeug vhisyn führt die High-Level-Synthese (HLS) einer vollständigen Menge von Operationseigenschaften zu einer Beschreibung auf RTL durch. Die Ergebnisse zeigen, dass sowohl die verwendeten Synthesealgorithmen als auch die erzeugten Schaltungen effizient sind und somit die Realisierung größerer Beispiele zulassen. Anhand zweier Fallstudien kann dies praktisch nachgewiesen werden. / The complete verification approach using special operation properties is an accepted methodology for the formal verification of digital circuits. Operation properties describe the behavior of a circuit during a certain time interval. They can be sequentially concatenated in order to specify the overall behavior. Additionally, a formal completeness check proves that the set of properties consistently determines the exact value of the output signals for every valid sequence of input signal values. This work examines how a circuit description can be automatically derived from a set of operation properties whose completeness has been proven. In contrast to the traditional design flow at register-transfer level (RTL), this method offers two advantages. First, the proof of completeness helps to avoid many kinds of design errors. Second, the design of operation properties resembles the design of timing diagrams often used in textual specifications. Therefore, the design level is closer to the specification level and errors caused by manual refinement steps are avoided. The design tool vhisyn performs the high-level synthesis from a complete set of operation properties to a description at RTL. The results show that both the synthesis algorithms and the generated circuit descriptions are efficient and allow the design of larger applications. This is demonstrated by means of two case studies.
