Global ETD Search

81	Energy and Design Cost Efficiency for Streaming Applications on Systems-on-Chip Zhu, Jun January 2009 (has links) <p>With the increasing capacity of today's integrated circuits, a number ofheterogeneous system-on-chip (SoC) architectures in embedded systemshave been proposed. In order to achieve energy and design cost efficientstreaming applications on these systems, new design space explorationframeworks and performance analysis approaches are required. Thisthesis considers three state-of-the-art SoCs architectures, i.e., themulti-processor SoCs (MPSoCs) with network-on-chip (NoC) communication,the hybrid CPU/FPGA architectures, and the run-time reconfigurable (RTR)FPGAs. The main topic of the author?s research is to model and capturethe application scheduling, architecture customization, and bufferdimensioning problems, according to the real-time requirement. Sincethese problems are NP-complete, heuristic algorithms and constraintprogramming solver are used to compute a solution.For NoC communication based MPSoCs, an approach to optimize thereal-time streaming applications with customized processorvoltage-frequency levels and memory sizes is presented. A multi-clockedsynchronous model of computation (MoC) framework is proposed inheterogeneous timing analysis and energy estimation. Using heuristicsearching (i.e., greedy and taboo search), the experiments show anenergy reduction (up to 21%) without any loss in application throughputcompared with an ad-hoc approach.On hybrid CPU/FPGA architectures, the buffer minimization scheduling ofreal-time streaming applications is addressed. Based on event models,the problem has been formalized decoratively as constraint basescheduling, and solved by public domain constraint solver Gecode.Compared with traditional PAPS method, the proposed method needssignificantly smaller buffers (2.4% of PAPS in the best case), whilehigh throughput guarantees can still be achieved.Furthermore, a novel compile-time analysis approach based on iterativetiming phases is proposed for run-time reconfigurations in adaptivereal-time streaming applications on RTR FPGAs. Finally, thereconfigurations analysis and design trade-offs analysis capabilities ofthe proposed framework have been exemplified with experiments on bothexample and industrial applications.</p> / Andres Streaming applications Systems-on-chip Synchronous dataflow energy efficiency buffer minimization performance analysis Informatik, data- och systemvetenskap
82	Caracterização analítica de carga de trabalho baseada em cenários de aplicações multimídia. / Analytical characterization of workload based on scenarios of multimedia applications. Patiño Alvarez, Gustavo Adolfo 07 December 2012 (has links) As metodologias clássicas de análise de desempenho de sistemas sobre silício (System on chip, SoC) geralmente são descritas em função do tempo de execução do pior-caso1 das tarefas a serem executadas. No entanto, nas aplicações do mundo real, o tempo de execução destas tarefas pode variar devido à presença dos diferentes eventos de entrada que ativam o sistema, colocando uma exigência diferente de execução sobre os recursos do sistema. Geralmente, um modelo da carga de trabalho é uma parte integrante de um modelo de desempenho utilizado para avaliar o desempenho de um sistema. O quão bom for um modelo de carga de trabalho determina em grande medida a qualidade das soluções do projeto e a precisão das estimativas de desempenho baseadas nele. Nesta tese, é abordado o problema de modelar a carga de trabalho para o projeto de sistemas de tempo-real cuja funcionalidade envolve processamento de fluxos de multimídia, isto é, fluxos de dados representando áudio, imagens ou vídeo. O problema de modelar a carga de trabalho é abordado sob a premissa de que uma caracterização acurada do comportamento temporal do software embarcado permite ao projetista identificar diversas exigências variáveis de execução, apresentadas para os diversos recursos de arquitetura do sistema, tanto na operação individual do conjunto de tarefas de software, assim como na execução global da aplicação, em fase de projeto. A caracterização do comportamento de cada tarefa foi definida a partir de uma análise temporal dos códigos de software associados às diferentes tarefas de uma aplicação, a fim de identificar os múltiplos modos de operação que o código pode apresentar dentro de um processador. Esta caracterização é feita através da realização de uma análise estática das rotas do código executável, de forma que para cada rota de execução encontrada, estimam-se os tempos extremos de execução (WCET e BCET)2, baseando-se na modelagem da microarquitetura de um processador on-chip. Desta forma, cada rota do código executável junto aos seus respectivos tempos de execução, constitui um modo de operação do código analisado. A fim de agrupar os diversos modos de operação que apresentam um grau de semelhança entre si de acordo a uma perspectiva da medida de processamento utilizado do processador modelado, foi utilizado o conceito de cenário, o qual diferencia o comportamento de cada tarefa em relação às entradas que a aplicação sob análise pode receber. Partindo desta caracterização temporal de cada tarefa de software, as exigências da execução global da aplicação são representadas através de um modelo analítico de eventos. O modelo considera as diferentes tarefas como atores temporais de um grafo de fluxo síncrono de dados, de modo que os diferentes cenários de operação da aplicação são definidos em função dos tempos variáveis de execução identificados previamente na caracterização de cada tarefa. Uma descrição matemática deste modelo, baseada na Álgebra de Max-Plus, permite caracterizar analiticamente os diferentes fluxos de eventos entre a entrada e a saída da aplicação, assim como os fluxos de eventos entre as diferentes tarefas, considerando as mudanças nas exigências de processamento associadas aos diversos cenários previamente identificados. Esta caracterização analítica dos diversos fluxos de eventos de entrada e saída é a base para um modelo de curvas de carga de trabalho baseada em cenários de aplicação, e um modelo de curvas de serviços baseada também em cenários, que dão lugar a caracterizar o dinamismo comportamental da aplicação analisada, determinado pela diversidade de eventos de entrada que podem ativar diferentes comportamentos do sistema em fase de execução. / Classical methods for performance analysis of Multiprocessor System-on-chip (MPSoCs) are usually described in terms of Worst-Case Execution Times (WCET) of the executed tasks. Nevertheless, in real-world applications the running time of tasks varies due to different input events that trigger the system, imposing a different workload on the system resources. Usually, a workload model is a part of a performance model used to evaluate the performance of a system. How good is a workload model largely determines the quality of design solutions and the accuracy of performance estimations based on it. This thesis addresses the problem of modeling the workload for the design of real-time systems which functionality involves multimedia streams processing, i.e, data streams representing audio, images or video. The workload modeling problem is addressed from the assumption that an accurate characterization of timing behavior of real-time embedded software enables the designer to identify several variable execution requirements that the individual operation of the software tasks and the overall execution of the application will present to the several system resources of an architecture, in design phase. The software task characterization was defined from a timing analysis of the source code in order to identify the multiple operating modes the code can exhibit within a processor. This characterization is done by performing a static path analysis on the code, so that for each given path the worst-case and bestcase execution times (WCET and BCET) were estimated, based on a microarchitectural modeling of an on-chip processor. Thus, every execution path of the code, with its estimated execution times, defines an operation mode of the analyzed code. In order to cluster the several operation modes that exhibit certain degree of similarity according to the required amount of processing in the modeled processor, the concept of scenario was used, which differentiates every task behavior with respect to the several inputs the application under analysis may receive. From this timing characterization of every application task, the global execution requirements of the application are represented by an analytical event model. It describes the tasks as timed actors of a synchronous dataflow graph, so that the multiple application scenarios are defined in terms of the variable execution times previously identified in the task characterization. A mathematical description of this model based on the Max-Plus Algebra allows one to characterize the different event sequences incoming to, and exiting from, the application as well as the event sequences between the different tasks, having in count changes in the processing requirements associated with the various scenarios previously identified. This analytical characterization between the input event sequences and the output event sequences states the basis for a model of scenario-based workload curves and a model of scenario-based service curves that allow characterizing the behavioral dynamism of the application determined by the several input events that activate several system behaviors, in the execution phase. Análise de desempenho Carga de trabalho Dataflow model Embedded systems Modelo analítico Modelos de computação Models of computation Multimídia (aplicações) Network on chip Sistemas embutidos
83	Caracterização analítica de carga de trabalho baseada em cenários de aplicações multimídia. / Analytical characterization of workload based on scenarios of multimedia applications. Gustavo Adolfo Patiño Alvarez 07 December 2012 (has links) As metodologias clássicas de análise de desempenho de sistemas sobre silício (System on chip, SoC) geralmente são descritas em função do tempo de execução do pior-caso1 das tarefas a serem executadas. No entanto, nas aplicações do mundo real, o tempo de execução destas tarefas pode variar devido à presença dos diferentes eventos de entrada que ativam o sistema, colocando uma exigência diferente de execução sobre os recursos do sistema. Geralmente, um modelo da carga de trabalho é uma parte integrante de um modelo de desempenho utilizado para avaliar o desempenho de um sistema. O quão bom for um modelo de carga de trabalho determina em grande medida a qualidade das soluções do projeto e a precisão das estimativas de desempenho baseadas nele. Nesta tese, é abordado o problema de modelar a carga de trabalho para o projeto de sistemas de tempo-real cuja funcionalidade envolve processamento de fluxos de multimídia, isto é, fluxos de dados representando áudio, imagens ou vídeo. O problema de modelar a carga de trabalho é abordado sob a premissa de que uma caracterização acurada do comportamento temporal do software embarcado permite ao projetista identificar diversas exigências variáveis de execução, apresentadas para os diversos recursos de arquitetura do sistema, tanto na operação individual do conjunto de tarefas de software, assim como na execução global da aplicação, em fase de projeto. A caracterização do comportamento de cada tarefa foi definida a partir de uma análise temporal dos códigos de software associados às diferentes tarefas de uma aplicação, a fim de identificar os múltiplos modos de operação que o código pode apresentar dentro de um processador. Esta caracterização é feita através da realização de uma análise estática das rotas do código executável, de forma que para cada rota de execução encontrada, estimam-se os tempos extremos de execução (WCET e BCET)2, baseando-se na modelagem da microarquitetura de um processador on-chip. Desta forma, cada rota do código executável junto aos seus respectivos tempos de execução, constitui um modo de operação do código analisado. A fim de agrupar os diversos modos de operação que apresentam um grau de semelhança entre si de acordo a uma perspectiva da medida de processamento utilizado do processador modelado, foi utilizado o conceito de cenário, o qual diferencia o comportamento de cada tarefa em relação às entradas que a aplicação sob análise pode receber. Partindo desta caracterização temporal de cada tarefa de software, as exigências da execução global da aplicação são representadas através de um modelo analítico de eventos. O modelo considera as diferentes tarefas como atores temporais de um grafo de fluxo síncrono de dados, de modo que os diferentes cenários de operação da aplicação são definidos em função dos tempos variáveis de execução identificados previamente na caracterização de cada tarefa. Uma descrição matemática deste modelo, baseada na Álgebra de Max-Plus, permite caracterizar analiticamente os diferentes fluxos de eventos entre a entrada e a saída da aplicação, assim como os fluxos de eventos entre as diferentes tarefas, considerando as mudanças nas exigências de processamento associadas aos diversos cenários previamente identificados. Esta caracterização analítica dos diversos fluxos de eventos de entrada e saída é a base para um modelo de curvas de carga de trabalho baseada em cenários de aplicação, e um modelo de curvas de serviços baseada também em cenários, que dão lugar a caracterizar o dinamismo comportamental da aplicação analisada, determinado pela diversidade de eventos de entrada que podem ativar diferentes comportamentos do sistema em fase de execução. / Classical methods for performance analysis of Multiprocessor System-on-chip (MPSoCs) are usually described in terms of Worst-Case Execution Times (WCET) of the executed tasks. Nevertheless, in real-world applications the running time of tasks varies due to different input events that trigger the system, imposing a different workload on the system resources. Usually, a workload model is a part of a performance model used to evaluate the performance of a system. How good is a workload model largely determines the quality of design solutions and the accuracy of performance estimations based on it. This thesis addresses the problem of modeling the workload for the design of real-time systems which functionality involves multimedia streams processing, i.e, data streams representing audio, images or video. The workload modeling problem is addressed from the assumption that an accurate characterization of timing behavior of real-time embedded software enables the designer to identify several variable execution requirements that the individual operation of the software tasks and the overall execution of the application will present to the several system resources of an architecture, in design phase. The software task characterization was defined from a timing analysis of the source code in order to identify the multiple operating modes the code can exhibit within a processor. This characterization is done by performing a static path analysis on the code, so that for each given path the worst-case and bestcase execution times (WCET and BCET) were estimated, based on a microarchitectural modeling of an on-chip processor. Thus, every execution path of the code, with its estimated execution times, defines an operation mode of the analyzed code. In order to cluster the several operation modes that exhibit certain degree of similarity according to the required amount of processing in the modeled processor, the concept of scenario was used, which differentiates every task behavior with respect to the several inputs the application under analysis may receive. From this timing characterization of every application task, the global execution requirements of the application are represented by an analytical event model. It describes the tasks as timed actors of a synchronous dataflow graph, so that the multiple application scenarios are defined in terms of the variable execution times previously identified in the task characterization. A mathematical description of this model based on the Max-Plus Algebra allows one to characterize the different event sequences incoming to, and exiting from, the application as well as the event sequences between the different tasks, having in count changes in the processing requirements associated with the various scenarios previously identified. This analytical characterization between the input event sequences and the output event sequences states the basis for a model of scenario-based workload curves and a model of scenario-based service curves that allow characterizing the behavioral dynamism of the application determined by the several input events that activate several system behaviors, in the execution phase. Análise de desempenho Carga de trabalho Modelo analítico Modelos de computação Multimídia (aplicações) Sistemas embutidos Dataflow model Embedded systems Models of computation Network on chip
84	ChipCflow - uma ferramenta para execução de algoritmos utilizando o modelo a fluxo de dados dinâmico em hardware reconfigurável / ChipCflow - a tool to executing algorithms using dynamic dataflow architecture in FPGA Lopes, Joelmir José 29 June 2012 (has links) Devido à complexidade das aplicações, a demanda crescente por sistemas que usam milhões de transistores e hardware complexo; tem sido desenvolvidas ferramentas que convertem C em Linguagem de Descrição de Hardware, tais como VHDL e Verilog. Neste contexto, esta tese apresenta o projeto ChipCflow, o qual usa arquitetura a fluxo de dados, para implementar lógica de alto desempenho em Field Programmable Gate Array (FPGA). Maquinas a fluxo de dados são computadores programáveis, cujo hardware é otimizado para computação paralela de granularidade fina dirigida por dados. Em outras palavras, a execução de programas é determinado pela disponibilidade dos dados, assim, o paralelismo é intrínseco neste sistema. Por outro lado, com o avanço da tecnologia da microeletrônica, o FPGA tem sido utilizado principalmente devido a sua flexibilidade, facilidade para implementar sistemas complexos e paralelismo intrínseco. Um dos desafios é criar ferramentas para programadores que usam linguagem de alto nível (HLL), como a linguagem C, e produzir hardware diretamente. Essas ferramentas devem usar a máxima experiência dos programadores, o paralelismo das arquiteturas a fluxo de dados dinâmica, a flexibilidade e o paralelismo do FPGA, para produzir um hardware eficiente, otimizado para alto desempenho e baixo consumo de energia. O projeto ChipCflow é uma ferramenta que converte os programas de aplicação escritos em linguagem C para a linguagem VHDL, baseado na arquitetura a fluxo de dados dinâmica. O principal objetivo dessa tese é definir e implementar os operadores do ChipCflow, usando a arquitetura a fluxo de dados dinâmica em FPGA. Esses operadores usam tagged tokens para identificar dados, com base em instâncias de operadores. A implementação dos operadores e das instâncias usam um modelo de implementação assíncrono em FPGA para obter maior velocidade e menor consumo / Due to the complexity of applications, the growing demand for both systems using millions of transistors and consecutive complex hardware, tools that convert C into a Hardware Description Language (HDL), as VHDL and Verilog, have been developed. In this context this thesis presents the ChipCflow project, which uses dataflow architecture to implement high-performance logics in Field Programmable Gate Array (FPGA). Dataflow machines are programmable computers whose hardware is optimized for fine-grain data-flow parallel computation. In other words the execution of programs is determined by data availability, thus parallelism is intrinsic in these systems. On the other hand, with the advance of technology of microelectronics, the FPGA has been used mainly because of its flexibility, facilities to implement complex systems and intrinsic parallelism. One of the challenges is to create tools for programmers who use HLL (High Level Language), such as C language, producing hardware directly. These tools should use the utmost experience of the programmers, the parallelism of dynamic dataflow architecture and the flexibility and parallelism of FPGA to produce efficient hardware optimized for high performance and lower power consumption. The ChipCflow project is a tool that converts application programs written in C language into VHDL, based on the dynamic dataflow architecture. The main goal in this thesis is to define and implement the operators of ChipCflow using dynamic dataflow architecture in FPGA. These operators use tagged tokens to identify data based on instances of operators and their implementation and instances use an asynchronous implementation model in FPGA to achieve faster speed and lower consumption Arquiteturas a fluxo de dados dinâmicas Dynamic dataflow architecture Parallel systems Sistemas paralelos
85	ChipCflow - em hardware dinamicamente reconfigurável / ChipCflow - in dynamically reconfigurable hardware Astolfi, Vitor Fiorotto 04 December 2009 (has links) Nos últimos anos, houve um grande avanço na computação reconfigurável, em particular em hardware que emprega Field-Programmable Gate Arrays. Porém, esse aumento de capacidade e desempenho aumentou a distância entre a capacidade de projeto e a disponibilidade de tecnologia para o desenvolvimento do projeto. As linguagens de programação imperativas de alto nível, como C, são mais apropriadas para o desenvolvimento de aplicativos complexos que as linguagens de descrição de hardware. Por isso, surgiram diversas ferramentas para o desenvolvimento de hardware a partir de código em C. A ferramenta ChipCflow, da qual faz parte este projeto, é uma delas. A execução dos programas por meio dessa ferramenta será completamente baseada em seu fluxo de dados, seguindo o modelo dinâmico encontrado nas arquiteturas de computadores a fluxo de dados, aproveitando ao máximo o paralelismo considerado natural desse modelo e as características do hardware parcialmente reconfigurável. Neste projeto em particular, o objetivo é a prova de conceito (proof of concept) para a criação de instâncias, em forma de operadores, de um algoritmo ChipCflow em hardware parcialmente reconfigurável, tendo como base a plataforma Virtex da Xilinx / In recent years, reconfigurable computing has become increasingly more advanced, especially in hardware that uses Field-Programmable Gate Arrays. However, the increase of performance in FPGAs accumulated the gap between design capacity and technology for the development of the design. Imperative high-level programming languages such as C are more appropriate for the development of complex algorithms than hardware description languages (HDL). For this reason, many ANSI C-like programming tools for the development of hardware came to existence. The ChipCflow project, of which this project is part, is one of these tools. The execution of algorithms through this tool will be completely directed by data flow, according to the dynamic model found on Dataflow Architectures, taking advantage of its natural high levels of parallelism and the characteristics of the partially reconfigurable hardware. In this project, the objective is a proof of concept for the creation of instances, in the form of operators, of a ChipCflow algorithm on a partially reconfigurable hardware, taking as reference the Xilinx Virtex boards Arquitetura a fluxo de dados Computação reconfigurável Dataflow architecture Dynamic partial reconfiguration FPGA FPGA Reconfigurable computing Reconfiguração parcial dinâmica Virtex Virtex
86	Interconnection Optimization for Dataflow Architectures Moser, Nico, Gremzow, Carsten, Menge, Matthias 08 June 2007 (has links) (PDF) In this paper we present a dataflow processor architecture based on [1], which is driven by controlflow generated tokens. We will show the special properties of this architecture with regard to scalability, extensibility, and parallelism. In this context we outline the application scope and compare our approach with related work. Advantages and disadvantages will be discussed and we suggest solutions to solve the disadvantages. Finally an example of the implementation of this architecture will be given and we have a look at further developments. We believe the features of this basic approach predestines the architecture especially for embedded systems and system on chips. Chaining methods Controlflow driven Dataflow architecture adjustment datapaths to FPGA-structures embedded systems ddc:004 ddc:500 Eingebettetes System Systementwurf
87	Traitement d'images bas niveau intégré dans un capteur de vision CMOS / integrated low level image processing in a CMOS imager Amhaz, Hawraa 10 July 2012 (has links) Le traitement d’images classique est basé sur l’évaluation des données délivrées par un système à basede capteur de vision sous forme d’images. L’information lumineuse captée est extraiteséquentiellement de chaque élément photosensible (pixel) de la matrice avec un certain cadencementet à fréquence fixe. Ces données, une fois mémorisées, forment une matrice de données qui estréactualisée de manière exhaustive à l’arrivée de chaque nouvelle image. De fait, Pour des capteurs àforte résolution, le volume de données à gérer est extrêmement important. De plus, le système neprend pas en compte le fait que l’information stockée ai changé ou non par rapport à l’imageprécédente. Cette probabilité est, en effet, assez importante. Ceci nous mène donc, selon « l’activité »de la scène filmée à un haut niveau de redondances temporelles. De même, la méthode de lectureusuelle ne prend pas en compte le fait que le pixel en phase de lecture a la même valeur ou non que lepixel voisin lu juste avant. Cela rajoute aux redondances temporelles un taux de redondances spatialesplus ou moins élevé selon le spectre de fréquences spatiales de la scène filmée. Dans cette thèse, nousavons développé plusieurs solutions qui visent contrôler le flot de données en sortie de l’imageur enessayant de réduire les redondances spatiales et temporelles des pixels. Les contraintes de simplicité etd’« intelligence » des techniques de lecture développées font la différence entre ce que nousprésentons et ce qui a été publié dans la littérature. En effet, les travaux présentés dans l’état de l’artproposent des solutions à cette problématique, qui en général, exigent de gros sacrifices en terme desurface du pixel, vu qu’elles implémentent des fonctions électroniques complexes in situ.Les principes de fonctionnement, les émulations sous MATLAB, la conception et les simulationsélectriques ainsi que les résultats expérimentaux des techniques proposées sont présentés en détailsdans ce manuscrit. / The classical image processing is based on the evaluation of data delivered by a vision sensor systemas images. The captured light information is extracted sequentially from each photosensitive element(pixel) of the matrix with a fixed frequency called frame rate. These data, once stored, form a matrixof data that is entirely updated at the acquisition of each new image. Therefore, for high resolutionimagers, the data flow is huge. Moreover, the conventional systems do not take into account the factthat the stored data have changed or not compared to the previously acquired image. Indeed, there is ahigh probability that this information is not changed. Therefore, this leads, depending on the "activity"of the filmed scene, to a high level of temporal redundancies. Similarly, the usual scanning methodsdo not take into account that the read pixel has or not the same value of his neighbor pixel read oncebefore. This adds to the temporal redundancies, spatial redundancies rate that depends on the spatialfrequency spectrum of the scene. In this thesis, we have developed several solutions that aim to controlthe output data flow from the imager trying to reduce both spatial and temporal pixels redundancies. Aconstraint of simplicity and "Smartness" of the developed readout techniques makes the differencebetween what we present and what has been published in the literature. Indeed, the works presented inthe literature suggest several solutions to this problem, but in general, these solutions require largesacrifices in terms of pixel area, since they implement complex electronic functions in situ.The operating principles, the emulation in MATLAB, the electrical design and simulations and theexperimental results of the proposed techniques are explained in detail in this manuscript Pixel Capteur de vision CMOS « intelligent » Techniques de lecture Contrôle du flot de données Pixel “smart” CMOS image sensor Readout schemes Dataflow control
88	Application d'un langage de programmation de type flot de données à la synthèse haut-niveau de système de vision en temps-réel sur matériel reconfigurable / Application of a dataflow programming language to the high level synthesis of real time vision systems on reconfigurable hardware Ahmed, Sameer 24 January 2013 (has links) Les circuits reconfigurables de type FPGA (Field Programmable Gate Arrays) peuvent désormais surpasser les processeurs généralistes pour certaines applications offrant un fort degré de parallélisme intrinsèque. Ces circuits sont traditionnellement programmés en utilisant des langages de type HDL (Hardware Description Languages), comme Verilog et VHDL. L'usage de ces langages permet d'exploiter au mieux les performances offertes par ces circuits mais requiert des programmeurs une très bonne connaissance des techniques de conception numérique. Ce pré-requis limite fortement l'utilisation des FPGA par la communauté des concepteurs de logiciel en général. Afin de pallier cette limitation, un certain nombre d'outils de plus haut niveau ont été développés, tant dans le monde industriel qu'académique. Parmi les approches proposées, celles fondées sur une transformation plus ou moins automatique de langages de type C ou équivalent, largement utilisés dans le domaine logiciel, ont été les plus explorées. Malheureusement, ces approches ne permettent pas, en général, d'obtenir des performances comparables à celles issues d'une formulation directe avec un langage de type HDL, en raison, essentiellement, de l'incapacité de ces langages à exprimer le parallélisme intrinsèque des applications. Une solution possible à ce problème passe par un changement du modèle de programmation même. Dans le contexte qui est le notre, le modèle flot de données apparaît comme un bon candidat. Cette thèse explore donc l'adoption d'un modèle de programmation flot de données pour la programmation de circuits de type FPGA. Plus précisément, nous évaluons l'adéquation de CAPH, un langage orienté domaine (Domain Specific Language) à la description et à l'implantation sur FPGA d'application opérant à la volée des capteurs (stream processing applications). L'expressivité du langage et l'efficacité du code généré sont évaluées expérimentalement en utilisant un large spectre d'applications, allant du traitement d'images bas niveau (filtrage, convolution) à des applications de complexité réaliste telles que la détection de mouvement, l'étiquetage en composantes connexes ou l'encodage JPEG. / Field Programmable Gate Arrays (FPGAs) are reconfigurable devices which can outperform General Purpose Processors (GPPs) for applications exhibiting parallelism. Traditionally, FPGAs are programmed using Hardware Description Languages (HDLs) such as Verilog and VHDL. Using these languages generally offers the best performances but the programmer must be familiar with digital design. This creates a barrier for the software community to use FPGAs and limits their adoption as a computing solution. To make FPGAs accessible to both software and hardware programmers, a number of tools have been proposed both by academia and industry providing high-level programming environment. A widely used approach is to convert C-like languages to HDLs, making it easier for software programmers to use FPGAs. But these approaches generally do not provide performances on the par with those obtained with HDL languages. The primary reason is the inability of C-like approaches to express parallelism. Our claim is that in order to have a high level programming language for FPGAs as well as not to compromise on performance, a shift in programming paradigm is required. We think that the Dataflowow / actor programming model is a good candidate for this. This thesis explores the adoption of Dataflow / actor programming model for programming FPGAs. More precisely, we assess the suitability of CAPH, a domain-specific language based on this programming model for the description and implementation of stream-processing applications on FPGAs. The expressivity of the language and the efficiency of the generated code are assessed experimentally using a set of test bench applications ranging from very simple applications (basic image filtering) to more complex realistic applications such as motion detection, Connected Component Labeling (CCL) and JPEG encoder. Modèle flot de données FPGA Traitement d'images Vision par ordinateur Dataflow programming Stream-processing applications FPGA Computer vision
89	ChipCflow - em hardware dinamicamente reconfigurável / ChipCflow - in dynamically reconfigurable hardware Vitor Fiorotto Astolfi 04 December 2009 (has links) Nos últimos anos, houve um grande avanço na computação reconfigurável, em particular em hardware que emprega Field-Programmable Gate Arrays. Porém, esse aumento de capacidade e desempenho aumentou a distância entre a capacidade de projeto e a disponibilidade de tecnologia para o desenvolvimento do projeto. As linguagens de programação imperativas de alto nível, como C, são mais apropriadas para o desenvolvimento de aplicativos complexos que as linguagens de descrição de hardware. Por isso, surgiram diversas ferramentas para o desenvolvimento de hardware a partir de código em C. A ferramenta ChipCflow, da qual faz parte este projeto, é uma delas. A execução dos programas por meio dessa ferramenta será completamente baseada em seu fluxo de dados, seguindo o modelo dinâmico encontrado nas arquiteturas de computadores a fluxo de dados, aproveitando ao máximo o paralelismo considerado natural desse modelo e as características do hardware parcialmente reconfigurável. Neste projeto em particular, o objetivo é a prova de conceito (proof of concept) para a criação de instâncias, em forma de operadores, de um algoritmo ChipCflow em hardware parcialmente reconfigurável, tendo como base a plataforma Virtex da Xilinx / In recent years, reconfigurable computing has become increasingly more advanced, especially in hardware that uses Field-Programmable Gate Arrays. However, the increase of performance in FPGAs accumulated the gap between design capacity and technology for the development of the design. Imperative high-level programming languages such as C are more appropriate for the development of complex algorithms than hardware description languages (HDL). For this reason, many ANSI C-like programming tools for the development of hardware came to existence. The ChipCflow project, of which this project is part, is one of these tools. The execution of algorithms through this tool will be completely directed by data flow, according to the dynamic model found on Dataflow Architectures, taking advantage of its natural high levels of parallelism and the characteristics of the partially reconfigurable hardware. In this project, the objective is a proof of concept for the creation of instances, in the form of operators, of a ChipCflow algorithm on a partially reconfigurable hardware, taking as reference the Xilinx Virtex boards Arquitetura a fluxo de dados Computação reconfigurável FPGA Reconfiguração parcial dinâmica Virtex Dataflow architecture Dynamic partial reconfiguration FPGA Reconfigurable computing Virtex
90	ChipCflow - uma ferramenta para execução de algoritmos utilizando o modelo a fluxo de dados dinâmico em hardware reconfigurável / ChipCflow - a tool to executing algorithms using dynamic dataflow architecture in FPGA Joelmir José Lopes 29 June 2012 (has links) Devido à complexidade das aplicações, a demanda crescente por sistemas que usam milhões de transistores e hardware complexo; tem sido desenvolvidas ferramentas que convertem C em Linguagem de Descrição de Hardware, tais como VHDL e Verilog. Neste contexto, esta tese apresenta o projeto ChipCflow, o qual usa arquitetura a fluxo de dados, para implementar lógica de alto desempenho em Field Programmable Gate Array (FPGA). Maquinas a fluxo de dados são computadores programáveis, cujo hardware é otimizado para computação paralela de granularidade fina dirigida por dados. Em outras palavras, a execução de programas é determinado pela disponibilidade dos dados, assim, o paralelismo é intrínseco neste sistema. Por outro lado, com o avanço da tecnologia da microeletrônica, o FPGA tem sido utilizado principalmente devido a sua flexibilidade, facilidade para implementar sistemas complexos e paralelismo intrínseco. Um dos desafios é criar ferramentas para programadores que usam linguagem de alto nível (HLL), como a linguagem C, e produzir hardware diretamente. Essas ferramentas devem usar a máxima experiência dos programadores, o paralelismo das arquiteturas a fluxo de dados dinâmica, a flexibilidade e o paralelismo do FPGA, para produzir um hardware eficiente, otimizado para alto desempenho e baixo consumo de energia. O projeto ChipCflow é uma ferramenta que converte os programas de aplicação escritos em linguagem C para a linguagem VHDL, baseado na arquitetura a fluxo de dados dinâmica. O principal objetivo dessa tese é definir e implementar os operadores do ChipCflow, usando a arquitetura a fluxo de dados dinâmica em FPGA. Esses operadores usam tagged tokens para identificar dados, com base em instâncias de operadores. A implementação dos operadores e das instâncias usam um modelo de implementação assíncrono em FPGA para obter maior velocidade e menor consumo / Due to the complexity of applications, the growing demand for both systems using millions of transistors and consecutive complex hardware, tools that convert C into a Hardware Description Language (HDL), as VHDL and Verilog, have been developed. In this context this thesis presents the ChipCflow project, which uses dataflow architecture to implement high-performance logics in Field Programmable Gate Array (FPGA). Dataflow machines are programmable computers whose hardware is optimized for fine-grain data-flow parallel computation. In other words the execution of programs is determined by data availability, thus parallelism is intrinsic in these systems. On the other hand, with the advance of technology of microelectronics, the FPGA has been used mainly because of its flexibility, facilities to implement complex systems and intrinsic parallelism. One of the challenges is to create tools for programmers who use HLL (High Level Language), such as C language, producing hardware directly. These tools should use the utmost experience of the programmers, the parallelism of dynamic dataflow architecture and the flexibility and parallelism of FPGA to produce efficient hardware optimized for high performance and lower power consumption. The ChipCflow project is a tool that converts application programs written in C language into VHDL, based on the dynamic dataflow architecture. The main goal in this thesis is to define and implement the operators of ChipCflow using dynamic dataflow architecture in FPGA. These operators use tagged tokens to identify data based on instances of operators and their implementation and instances use an asynchronous implementation model in FPGA to achieve faster speed and lower consumption Arquiteturas a fluxo de dados dinâmicas Sistemas paralelos Dynamic dataflow architecture Parallel systems

Search results