• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 53
  • 13
  • 6
  • 6
  • 5
  • 5
  • 5
  • 4
  • 3
  • 3
  • 1
  • Tagged with
  • 116
  • 116
  • 56
  • 52
  • 26
  • 25
  • 25
  • 22
  • 20
  • 19
  • 18
  • 17
  • 15
  • 15
  • 14
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
31

Towards Efficient Resource Allocation for Embedded Systems

Hasler, Mattis 06 June 2023 (has links)
Das Hauptthema ist die dynamische Ressourcenverwaltung in eingebetteten Systemen, insbesondere die Verwaltung von Rechenzeit und Netzwerkverkehr auf einem MPSoC. Die Idee besteht darin, eine Pipeline für die Verarbeitung von Mobiler Kommunikation auf dem Chip dynamisch zu schedulen, um die Effizienz der Hardwareressourcen zu verbessern, ohne den Ressourcenverbrauch des dynamischen Schedulings dramatisch zu erhöhen. Sowohl Software- als auch Hardwaremodule werden auf Hotspots im Ressourcenverbrauch untersucht und optimiert, um diese zu entfernen. Da Applikationen im Bereich der Signalverarbeitung normalerweise mit Hilfe von SDF-Diagrammen beschrieben werden können, wird deren dynamisches Scheduling optimiert, um den Ressourcenverbrauch gegenüber dem üblicherweise verwendeten statischen Scheduling zu verbessern. Es wird ein hybrider dynamischer Scheduler vorgestellt, der die Vorteile von Processing-Networks und der Planung von Task-Graphen kombiniert. Es ermöglicht dem Scheduler, ein Gleichgewicht zwischen der Parallelisierung der Berechnung und der Zunahme des dynamischen Scheduling-Aufands optimal abzuwägen. Der resultierende dynamisch erstellte Schedule reduziert den Ressourcenverbrauch um etwa 50%, wobei die Laufzeit im Vergleich zu einem statischen Schedule nur um 20% erhöht wird. Zusätzlich wird ein verteilter dynamischer SDF-Scheduler vorgeschlagen, der das Scheduling in verschiedene Teile zerlegt, die dann zu einer Pipeline verbunden werden, um mehrere parallele Prozessoren einzubeziehen. Jeder Scheduling-Teil wird zu einem Cluster mit Load-Balancing erweitert, um die Anzahl der parallel laufenden Scheduling-Jobs weiter zu erhöhen. Auf diese Weise wird dem vorhandene Engpass bei dem dynamischen Scheduling eines zentralisierten Schedulers entgegengewirkt, sodass 7x mehr Prozessoren mit dem Pipelined-Clustered-Dynamic-Scheduler für eine typische Signalverarbeitungsanwendung verwendet werden können. Das neue dynamische Scheduling-System setzt das Vorhandensein von drei verschiedenen Kommunikationsmodi zwischen den Verarbeitungskernen voraus. Bei der Emulation auf Basis des häufig verwendeten RDMA-Protokolls treten Leistungsprobleme auf. Sehr gut kann RDMA für einmalige Punkt-zu-Punkt-Datenübertragungen verwendet werden, wie sie bei der Ausführung von Task-Graphen verwendet werden. Process-Networks verwenden normalerweise Datenströme mit hohem Volumen und hoher Bandbreite. Es wird eine FIFO-basierte Kommunikationslösung vorgestellt, die einen zyklischen Puffer sowohl im Sender als auch im Empfänger implementiert, um diesen Bedarf zu decken. Die Pufferbehandlung und die Datenübertragung zwischen ihnen erfolgen ausschließlich in Hardware, um den Software-Overhead aus der Anwendung zu entfernen. Die Implementierung verbessert die Zugriffsverwaltung mehrerer Nutzer auf flächen-effiziente Single-Port Speichermodule. Es werden 0,8 der theoretisch möglichen Bandbreite, die normalerweise nur mit flächenmäßig teureren Dual-Port-Speichern erreicht wird. Der dritte Kommunikationsmodus definiert eine einfache Message-Passing-Implementierung, die ohne einen Verbindungszustand auskommt. Dieser Modus wird für eine effiziente prozessübergreifende Kommunikation des verteilten Scheduling-Systems und der engen Ansteuerung der restlichen Prozessoren benötigt. Eine Flusskontrolle in Hardware stellt sicher, dass eine große Anzahl von Sendern Nachrichten an denselben Empfänger senden kann. Dabei wird garantiert, dass alle Nachrichten korrekt empfangen werden, ohne dass eine Verbindung hergestellt werden muss und die Nachrichtenlaufzeit gering bleibt. Die Arbeit konzentriert sich auf die Optimierung des Codesigns von Hardware und Software, um die kompromisslose Ressourceneffizienz der dynamischen SDF-Graphen-Planung zu erhöhen. Besonderes Augenmerk wird auf die Abhängigkeiten zwischen den Ebenen eines verteilten Scheduling-Systems gelegt, das auf der Verfügbarkeit spezifischer hardwarebeschleunigter Kommunikationsmethoden beruht.:1 Introduction 1.1 Motivation 1.2 The Multiprocessor System on Chip Architecture 1.3 Concrete MPSoC Architecture 1.4 Representing LTE/5G baseband processing as Static Data Flow 1.5 Compuation Stack 1.6 Performance Hotspots Addressed 1.7 State of the Art 1.8 Overview of the Work 2 Hybrid SDF Execution 2.1 Addressed Performance Hotspot 2.2 State of the Art 2.3 Static Data Flow Graphs 2.4 Runtime Environment 2.5 Overhead of Deloying Tasks to a MPSoC 2.6 Interpretation of SDF Graphs as Task Graphs 2.7 Interpreting SDF Graphs as Process Networks 2.8 Hybrid Interpretation 2.9 Graph Topology Considerations 2.10 Theoretic Impact of Hybrid Interpretation 2.11 Simulating Hybrid Execution 2.12 Pipeline SDF Graph Example 2.13 Random SDF Graphs 2.14 LTE-like SDF Graph 2.15 Key Lernings 3 Distribution of Management 3.1 Addressed Performance Hotspot 3.2 State of the Art 3.3 Revising Deployment Overhead 3.4 Distribution of Overhead 3.5 Impact of Management Distribution to Resource Utilization 3.6 Reconfigurability 3.7 Key Lernings 4 Sliced FIFO Hardware 4.1 Addressed Performance Hotspot 4.2 State of the Art 4.3 System Environment 4.4 Sliced Windowed FIFO buffer 4.5 Single FIFO Evaluation 4.6 Multiple FIFO Evalutaion 4.7 Hardware Implementation 4.8 Key Lernings 5 Message Passing Hardware 5.1 Addressed Performance Hotspot 5.2 State of the Art 5.3 Message Passing Regarded as Queueing 5.4 A Remote Direct Memory Access Based Implementation 5.5 Hardware Implementation Concept 5.6 Evalutation of Performance 5.7 Key Lernings 6 Summary / The main topic is the dynamic resource allocation in embedded systems, especially the allocation of computing time and network traffic on an multi processor system on chip (MPSoC). The idea is to dynamically schedule a mobile communication signal processing pipeline on the chip to improve hardware resource efficiency while not dramatically improve resource consumption because of dynamic scheduling overhead. Both software and hardware modules are examined for resource consumption hotspots and optimized to remove them. Since signal processing can usually be described with the help of static data flow (SDF) graphs, the dynamic handling of those is optimized to improve resource consumption over the commonly used static scheduling approach. A hybrid dynamic scheduler is presented that combines benefits from both processing networks and task graph scheduling. It allows the scheduler to optimally balance parallelization of computation and addition of dynamic scheduling overhead. The resulting dynamically created schedule reduces resource consumption by about 50%, with a runtime increase of only 20% compared to a static schedule. Additionally, a distributed dynamic SDF scheduler is proposed that splits the scheduling into different parts, which are then connected to a scheduling pipeli ne to incorporate multiple parallel working processors. Each scheduling stage is reworked into a load-balanced cluster to increase the number of parallel scheduling jobs further. This way, the still existing dynamic scheduling bottleneck of a centralized scheduler is widened, allowing handling 7x more processors with the pipelined, clustered dynamic scheduler for a typical signal processing application. The presented dynamic scheduling system assumes the presence of three different communication modes between the processing cores. When emulated on top of the commonly used remote direct memory access (RDMA) protocol, performance issues are encountered. Firstly, RDMA can neatly be used for single-shot point-to-point data transfers, like used in task graph scheduling. Process networks usually make use of high-volume and high-bandwidth data streams. A first in first out (FIFO) communication solution is presented that implements a cyclic buffer on both sender and receiver to serve this need. The buffer handling and data transfer between them are done purely in hardware to remove software overhead from the application. The implementation improves the multi-user access to area-efficient single port on-chip memory modules. It achieves 0.8 of the theoretically possible bandwidth, usually only achieved with area expensive dual-port memories. The third communication mode defines a lightweight message passing (MP) implementation that is truly connectionless. It is needed for efficient inter-process communication of the distributed and clustered scheduling system and the worker processing units’ tight coupling. A hardware flow control assures that an arbitrary number of senders can spontaneously start sending messages to the same receiver. Yet, all messages are guaranteed to be correctly received while eliminating the need for connection establishment and keeping a low message delay. The work focuses on the hardware-software codesign optimization to increase the uncompromised resource efficiency of dynamic SDF graph scheduling. Special attention is paid to the inter-level dependencies in developing a distributed scheduling system, which relies on the availability of specific hardwareaccelerated communication methods.:1 Introduction 1.1 Motivation 1.2 The Multiprocessor System on Chip Architecture 1.3 Concrete MPSoC Architecture 1.4 Representing LTE/5G baseband processing as Static Data Flow 1.5 Compuation Stack 1.6 Performance Hotspots Addressed 1.7 State of the Art 1.8 Overview of the Work 2 Hybrid SDF Execution 2.1 Addressed Performance Hotspot 2.2 State of the Art 2.3 Static Data Flow Graphs 2.4 Runtime Environment 2.5 Overhead of Deloying Tasks to a MPSoC 2.6 Interpretation of SDF Graphs as Task Graphs 2.7 Interpreting SDF Graphs as Process Networks 2.8 Hybrid Interpretation 2.9 Graph Topology Considerations 2.10 Theoretic Impact of Hybrid Interpretation 2.11 Simulating Hybrid Execution 2.12 Pipeline SDF Graph Example 2.13 Random SDF Graphs 2.14 LTE-like SDF Graph 2.15 Key Lernings 3 Distribution of Management 3.1 Addressed Performance Hotspot 3.2 State of the Art 3.3 Revising Deployment Overhead 3.4 Distribution of Overhead 3.5 Impact of Management Distribution to Resource Utilization 3.6 Reconfigurability 3.7 Key Lernings 4 Sliced FIFO Hardware 4.1 Addressed Performance Hotspot 4.2 State of the Art 4.3 System Environment 4.4 Sliced Windowed FIFO buffer 4.5 Single FIFO Evaluation 4.6 Multiple FIFO Evalutaion 4.7 Hardware Implementation 4.8 Key Lernings 5 Message Passing Hardware 5.1 Addressed Performance Hotspot 5.2 State of the Art 5.3 Message Passing Regarded as Queueing 5.4 A Remote Direct Memory Access Based Implementation 5.5 Hardware Implementation Concept 5.6 Evalutation of Performance 5.7 Key Lernings 6 Summary
32

Lower Bound-oriented Parameter Calculation for AN Coding

Lehner, Wolfgang, Hildebrandt, Juliana, Kolditz, Till, Habich, Dirk 18 January 2023 (has links)
The hardware as well as software communities have recently experienced a shift towards mitigating bit flips issues in software, rather than completely mitigating only in hardware. For this software error mitigation, arithmetic error coding schemes like AN coding are increasingly applied because arithmetic operations can be directly executed without decoding and bit flip detection is provided in an end-to-end fashion. In this case, each encoded data word is computed by multiplying the original data word with a constant integer value A. To reliably detect b bit flips in each code word, the value A has to be well-chosen, so that a minimum Hamming distance of b + 1 can be guaranteed. However, the value A depends on the data word length as well as on the desired minimum Hamming distance. Up to now, a very expensive brute force approach for computation of the value for A is applied. To tackle that in a more efficient way, we present a lower bound-oriented approach for this calculation in this paper.
33

An Efficient Architecture for Dynamic Profiling of Multicore Systems

Sargur, Sudarshan Lakshminarasimhan January 2015 (has links)
Application profiling is an important step in the design and optimization of embedded systems. Accurately identifying and analyzing the execution of frequently executed computational kernels is needed to effectively optimize the system implementation, both at design time and runtime. In a traditional design process, it suffices to perform the profiling and optimization steps offline, during design time. The offline profiling guides the design space exploration, hardware software codesign, or power and performance optimizations. When the system implementation can be finalized at design time, this approach works well. However, dynamic optimization techniques, which adapt and reconfigure the system at runtime, require dynamic profiling with minimum runtime overheads. Existing profiling methods are usually software based and incur significant overheads that may be prohibitive or impractical for profiling embedded systems at runtime. In addition, these profiling methods typically focus on profiling the execution of specific tasks executing on a single processor core, but do not consider accurate and holistic profiling across multiple processor cores. Directly utilizing existing profiling approaches and naively combining isolated profiles from multiple processor cores can lead to significant profile inaccuracies of up to 35%. To address these challenges, a hardware-based dynamic application profiler for non-intrusively and accurately profiling software applications in multicore embedded systems is presented. The profiler provides a detailed execution profile for computational kernels and maintains profile accuracy across multiple processor cores. The hardware-based profiler achieves an average error of less than 0.5% for the percentage execution time of profiled applications while being area efficient.
34

ADAPT : architectural and design exploration for application specific instruction-set processor technologies

Shee, Seng Lin, Computer Science & Engineering, Faculty of Engineering, UNSW January 2007 (has links)
This thesis presents design automation methodologies for extensible processor platforms in application specific domains. The work presents first a single processor approach for customization; a methodology that can rapidly create different processor configurations by the removal of unused instructions sets from the architecture. A profile directed approach is used to identify frequently used instructions and to eliminate unused opcodes from the available instruction pool. A coprocessor approach is next explored to create an SoC (System-on-Chip) to speedup the application while reducing energy consumption. Loops in applications are identified and accelerated by tightly coupling a coprocessor to an ASIP (Application Specific Instruction-set Processor). Latency hiding is used to exploit the parallelism provided by this architecture. A case study has been performed on a JPEG encoding algorithm; comparing two different coprocessor approaches: a high-level synthesis approach and our custom coprocessor approach. The thesis concludes by introducing a heterogenous multi-processor system using ASIPs as processing entities in a pipeline configuration. The problem of mapping each algorithmic stage in the system to an ASIP configuration is formulated. We proposed an estimation technique to calculate runtimes of the configured multiprocessor system without running cycle-accurate simulations, which could take a significant amount of time. We present two heuristics to efficiently search the design space of a pipeline-based multi ASIP system and compare the results against an exhaustive approach. In our first approach, we show that, on average, processor size can be reduced by 30%, energy consumption by 24%, while performance is improved by 24%. In the coprocessor approach, compared with the use of a main processor alone, a loop performance improvement of 2.57x is achieved using the custom coprocessor approach, as against 1.58x for the high level synthesis method, and 1.33x for the customized instruction approach. Energy savings are 57%, 28% and 19%, respectively. Our multiprocessor design provides a performance improvement of at least 4.03x for JPEG and 3.31x for MP3, for a single processor design system. The minimum cost obtained using our heuristic was within 0.43% and 0.29% of the optimum values for the JPEG and MP3 benchmarks respectively.
35

Compiler Directed Codesign for FPGA-based Embedded Systems

Hauff, Martin Anthony, marty@extendabilities.com.au January 2008 (has links)
As embedded systems designers increasingly turn to programmable logic technologies in place of off-the-shelf microprocessors, there is a growing interest in the development of optimised custom processing cores that can be designed on a per-application basis. FPGAs blur the traditional distinction between hardware and software and offer the promise of application specific hardware acceleration. But realizing this in a general sense requires a significant departure from traditional embedded systems development flows. Whereas off-the-shelf processors have a fixed architecture, the same cannot be said of purpose-built FPGA-based processors. With this freedom comes the challenge of empirically determining the optimal boundary point between hardware and software. The fluidity of the hardware/software partition also poses an interesting challenge for compiler developers. This thesis presents a tool and methodology that addresses these codesign challenges in a new way. Described as 'compiler-directed codesign', it makes use of a suitably modified compiler to help direct the development of a custom processor core on a per-application basis. By exposing the compiler's internal representation of a compiled target program, visibility into those instructions, and hardware resources, that are most sought after by the compiler can be gained. This information is then used to inform further processor development and to determine the optimal partition between hardware and software. At each design iteration, the machine model is updated to reflect the available hardware resources, the compiler is rebuilt, and the target application is compiled once again. By including the compiler 'in-the-loop' of custom processor design, developers can accurately quantify the impact on performance caused by the addition or removal of specific hardware resources and iteratively converge on an optimal solution. Compiler Directed Codesign has advantages over existing codesign methodologies because it offers both a concrete point from which to begin the partitioning process as well as providing quantifiable and rapid feedback of the merits of different partitioning choices. When applied to an Adaptive PCM Encoder/Decoder case study, the Compiler Directed Codesign technique yielded a custom processor core that was between 36% and 73% smaller, consumed between 11% to 19% less memory, and performed up to 10X faster than comparable general-purpose FPGA-based processor cores. The conclusion of this work is that a suitably modified compiler can serve a valuable role in directing hardware/software partitioning on a per-application basis.
36

Conception des systèmes logiciel/matériel : du partitionnement logiciel/matériel au prototypage sur plateformes reconfigurables

Rousseau, F. 08 July 2005 (has links) (PDF)
Ce document retrace mes activités de recherche depuis ma thèse soutenue en juillet 1997. Certains des travaux présentés sont achevés, d'autres sont en cours ou encore dans un stade exploratoire. De 1993 à 1999, je me suis intéressé aux différents aspects du <br />partitionnement logiciel/matériel dans la conception de systèmes intégrés numériques de télécommunications. Depuis 1999, mes travaux ont porté sur la conception de systèmes multiprocesseurs monopuces, et plus particulièrement sur ce qui a trait aux relations entre <br />logiciel et matériel. Ces systèmes sont généralement dédiés à une application ou à une classe d'applications, ce qui permet d'optimiser l'architecture et les programmes. Mes recherches ses sont donc <br />focalisées sur l'architecture mémoire, les interfaces de <br />communication entre composants et le prototypage. Pour ces trois axes de recherche, des méthodes et des outils d'aide à la conception ont été définis et développés. Des travaux toujours en cours portent sur la généralisation d'une méthode de conception de composants d'interface matériels à partir <br />d'une spécification sous forme de services requis et fournis. Une telle spécification est déjà utilisée pour représenter des protocoles dans les réseaux de communication et pour le développement<br />des couches logicielles de communication. Son extension à la conception des interfaces matérielles homogénéiserait les langages, méthodes et outils de l'environnement de conception. Mes travaux futurs s'orientent vers deux axes : l'intégration <br />logiciel/matériel et l'adéquation entre architecture et système d'exploitation. Dans les deux cas, les relations étroites entre les ressources physiques de l'architecture et les couches logicielles qui y accèdent doivent permettre d'améliorer sensiblement les performances.
37

Enabling Hardware/Software Co-design in High-level Synthesis

Choi, Jongsok 21 November 2012 (has links)
A hardware implementation can bring orders of magnitude improvements in performance and energy consumption over a software implementation. Hardware design, however, can be extremely difficult. High-level synthesis, the process of compiling software to hardware, promises to make hardware design easier. However, compiling an entire software program to hardware can be inefficient. This thesis proposes hardware/software co-design, where computationally intensive functions are accelerated by hardware, while remaining program segments execute in software. The work in this thesis builds a framework where user-designated software functions are automatically compiled to hardware accelerators, which can execute serially or in parallel to work in tandem with a processor. To support multiple parallel accelerators, new multi-ported cache designs are presented. These caches provide low-latency high-bandwidth data to further improve the performance of accelerators. An extensive range of cache architectures are explored, and results show that certain cache architectures significantly outperform others in a processor/accelerator system.
38

Enabling Hardware/Software Co-design in High-level Synthesis

Choi, Jongsok 21 November 2012 (has links)
A hardware implementation can bring orders of magnitude improvements in performance and energy consumption over a software implementation. Hardware design, however, can be extremely difficult. High-level synthesis, the process of compiling software to hardware, promises to make hardware design easier. However, compiling an entire software program to hardware can be inefficient. This thesis proposes hardware/software co-design, where computationally intensive functions are accelerated by hardware, while remaining program segments execute in software. The work in this thesis builds a framework where user-designated software functions are automatically compiled to hardware accelerators, which can execute serially or in parallel to work in tandem with a processor. To support multiple parallel accelerators, new multi-ported cache designs are presented. These caches provide low-latency high-bandwidth data to further improve the performance of accelerators. An extensive range of cache architectures are explored, and results show that certain cache architectures significantly outperform others in a processor/accelerator system.
39

Hardware accelerators for embedded fingerprint-based personal recognition systems

Fons Lluís, Mariano 29 May 2012 (has links)
Abstract The development of automatic biometrics-based personal recognition systems is a reality in the current technological age. Not only those operations demanding stringent security levels but also many daily use consumer applications request the existence of computational platforms in charge of recognizing the identity of one individual based on the analysis of his/her physiological and/or behavioural characteristics. The state of the art points out two main open problems in the implementation of such applications: on the one hand, the needed reliability improvement in terms of recognition accuracy, overall security and real-time performances; and on the other hand, the cost reduction of those physical platforms in charge of the processing. This work aims at finding the proper system architecture able to address those limitations of current personal recognition applications. Embedded system solutions based on hardware-software co-design techniques and programmable (and run-time reconfigurable) logic devices under FPGAs or SOPCs is proven to be an efficient alternative to those existing multiprocessor systems based on HPCs, GPUs or PC platforms in the development of that kind of high-performance applications at low cost / El desenvolupament de sistemes automàtics de reconeixement personal basats en tècniques biomètriques esdevé una realitat en l’era tecnològica actual. No només aquelles operacions que exigeixen un elevat nivell de seguretat sinó també moltes aplicacions quotidianes demanen l’existència de plataformes computacionals encarregades de reconèixer la identitat d’un individu a partir de l’anàlisi de les seves característiques fisiològiques i/o comportamentals. L’estat de l’art de la tècnica identifica dues limitacions importants en la implementació d’aquest tipus d’aplicacions: per una banda, és necessària la millora de la fiabilitat d’aquests sistemes en termes de precisió en el procés de reconeixement personal, seguretat i execució en temps real; i per altra banda, és necessari reduir notablement el cost dels sistemes electrònics encarregats del processat biomètric. Aquest treball té per objectiu la cerca de l’arquitectura adequada a nivell de sistema que permeti fer front a les limitacions de les aplicacions de reconeixement personal actuals. Es demostra que la proposta de sistemes empotrats basats en tècniques de codisseny hardware-software i dispositius lògics programables (i reconfigurables en temps d’execució) sobre FPGAs o SOPCs resulta ser una alternativa eficient en front d’aquells sistemes multiprocessadors existents basats en HPCs, GPUs o plataformes PC per al desenvolupament d’aquests tipus d’aplicacions que requereixen un alt nivell de prestacions a baix cost. / El desarrollo de sistemas automáticos de reconocimiento personal basados en técnicas biométricas se ha convertido en una realidad en la era tecnológica actual. No tan solo aquellas operaciones que requieren un alto nivel de seguridad sino también muchas otras aplicaciones cotidianas exigen la existencia de plataformas computacionales encargadas de verificar la identidad de un individuo a partir del análisis de sus características fisiológicas y/o comportamentales. El estado del arte de la técnica identifica dos limitaciones importantes en la implementación de este tipo de aplicaciones: por un lado, es necesario mejorar la fiabilidad que presentan estos sistemas en términos de precisión en el proceso de reconocimiento personal, seguridad y ejecución en tiempo real; y por otro lado, es necesario reducir notablemente el coste de los sistemas electrónicos encargados de dicho procesado biométrico. Este trabajo tiene por objetivo la búsqueda de aquella arquitectura adecuada a nivel de sistema que permita hacer frente a las limitaciones de los sistemas de reconocimiento personal actuales. Se demuestra que la propuesta basada en sistemas embebidos implementados mediante técnicas de codiseño hardware-software y dispositivos lógicos programables (y reconfigurables en tiempo de ejecución) sobre FPGAs o SOPCs resulta ser una alternativa eficiente frente a aquellos sistemas multiprocesador actuales basados en HPCs, GPUs o plataformas PC en el ámbito del desarrollo de aplicaciones que demandan un alto nivel de prestaciones a bajo coste
40

High-Level Synthesis of Software Function Calls

TOMIYAMA, Hiroyuki, KANBARA, Hiroyuki, ISHIMORI, Yoshiyuki, ISHIURA, Nagisa, NISHIMURA, Masanari 01 December 2008 (has links)
No description available.

Page generated in 0.0589 seconds