  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
31

Low Density Parity Check Encoder and Decoder on SiLago Coarse Grain Reconfigurable Architecture

Kong, Weijiang January 2019
Low-density parity-check (LDPC) codes are error-correcting codes that have been widely adopted as an optional error-correction scheme in most of today's communication protocols. Current ASIC- and FPGA-based LDPC accelerators can reach Gbit/s data rates. However, the hardware cost of ASIC-based methods and their interfaces is too high for integration into coarse-grain reconfigurable architectures (CGRAs). Moreover, on platforms aiming at high-level or system-level synthesis, they do not provide flexibility for low-performance, low-cost design scenarios. In this degree project, we establish connectivity between the SiLago CGRA and a typical QC-LDPC code defined in the IEEE 802.11n standard. We design lightweight LDPC encoder and decoder blocks using the FSM+Datapath design pattern. The encoder provides sufficient throughput and consumes very little area and power. The decoder provides sufficient performance for low-speed modulations while consuming significantly fewer hardware resources. Both the encoder and the decoder can cooperate with the SiLago-based DRRA through DiMArch, the standard Network-on-Chip (NoC) based shared memory, so extra interface hardware is no longer necessary. We verified our design through RTL simulation and synthesis: the encoder went through logic and physical synthesis, while the decoder went through logic synthesis only. The results show that our design is closely coupled with the SiLago CGRA while providing a low-performance, low-cost solution.
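
As a rough illustration of the quasi-cyclic (QC) structure named in the abstract, the sketch below expands a small base matrix of circulant shift values into a binary parity-check matrix and computes a syndrome. The base matrix, the block size Z = 4, and every function name are assumptions made for illustration only; they are not the IEEE 802.11n matrices and say nothing about the thesis's FSM+Datapath hardware.

```python
def circulant(z, shift):
    """Z x Z identity matrix cyclically shifted by `shift`; -1 means an all-zero block."""
    if shift < 0:
        return [[0] * z for _ in range(z)]
    return [[1 if (r + shift) % z == c else 0 for c in range(z)] for r in range(z)]

def expand(base, z):
    """Expand a base matrix of shift values into the full binary parity-check matrix."""
    H = []
    for block_row in base:
        blocks = [circulant(z, s) for s in block_row]
        for r in range(z):
            # Concatenate row r of every block in this block row.
            H.append([bit for blk in blocks for bit in blk[r]])
    return H

def syndrome(H, word):
    """Syndrome over GF(2); all zeros means every parity check is satisfied."""
    return [sum(h & w for h, w in zip(row, word)) % 2 for row in H]

# Toy base matrix (hypothetical, NOT taken from IEEE 802.11n).
BASE = [[0, 1, -1, 2],
        [2, -1, 3, 0]]
Z = 4
H = expand(BASE, Z)  # 8 parity checks over 16 code bits

zero_word = [0] * len(H[0])   # the all-zero word is always a codeword
flipped = list(zero_word)
flipped[5] = 1                # inject a single bit error
```

Here `syndrome(H, zero_word)` is all zeros, while `syndrome(H, flipped)` has a nonzero entry, which is the signal a decoder uses to detect that a correction is needed. Storing only the shift values, and not the expanded matrix, is what makes QC codes attractive for small hardware.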
32

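Sistema embarcado reconfigurável de forma estática por programação genética utilizando hardware evolucionário híbrido [Statically reconfigurable embedded system driven by genetic programming using hybrid evolvable hardware]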

Almeida, Manoel Aranda de 04 March 2016
No funding was received. / The use of Field Programmable Gate Arrays (FPGAs), a reconfigurable technology, to solve a variety of current problems has become a frequent object of study. The technique is a feasible and promising option for developing embedded systems; its main problem is the difficulty of finding a flexible and efficient way to apply it. This work presents a virtual and reconfigurable architecture (AVR) implemented on an FPGA for hardware applications. Genetic Programming software is used to derive an optimal reconfiguration for this AVR, in order to build hardware capable of performing a given task in an embedded system. Compared with other reconfigurable hardware techniques, the proposal is a simple, flexible and efficient way to build suitable embedded applications. The phenotype in the proposed evolutionary system is represented as a two-dimensional network of function elements (EF). The GPLAB tool for MATLAB is used for the Genetic Programming, and the solution it finds is converted into a memory mapping holding the chromosome of the best solution, which is then used to reconfigure the hardware. In the tests, GPLAB found solutions for logic circuits within a few generations, and for image filters it found efficient solutions with low hardware occupation, particularly of memory in the cases presented. The resulting chromosome is small, which demonstrates the efficiency of the proposal.
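
The idea of a chromosome that configures a two-dimensional network of function elements can be sketched in a few lines. Everything below is an assumption for illustration: the 2 x 2 grid, the gene layout (function code plus two source selectors), the function set, and the simple mutate-and-keep search, which stands in for the GPLAB genetic programming run used in the thesis and is not a reimplementation of it.

```python
import random

# Function set each element can be configured to (illustrative choice).
FUNCS = [
    lambda a, b: a & b,        # 0: AND
    lambda a, b: a | b,        # 1: OR
    lambda a, b: a ^ b,        # 2: XOR
    lambda a, b: 1 - (a & b),  # 3: NAND
]

def evaluate(chromosome, inputs, rows=2, cols=2):
    """Run a rows x cols network of function elements, column by column.

    Each gene is (func, src_a, src_b); the sources index the signals produced
    by the previous column (the primary inputs for column 0). The circuit
    output is the first element of the last column.
    """
    signals = list(inputs)
    for c in range(cols):
        column = chromosome[c * rows:(c + 1) * rows]
        signals = [FUNCS[f](signals[a], signals[b]) for f, a, b in column]
    return signals[0]

def fitness(chromosome, truth_table):
    """Number of truth-table rows the configured network reproduces."""
    return sum(evaluate(chromosome, ins) == out for ins, out in truth_table)

def mutate(chromosome, rng, n_sources=2):
    """Replace one randomly chosen gene with a random one."""
    child = list(chromosome)
    i = rng.randrange(len(child))
    child[i] = (rng.randrange(len(FUNCS)),
                rng.randrange(n_sources),
                rng.randrange(n_sources))
    return child

def evolve(truth_table, generations=200, seed=0):
    """Mutate-and-keep search: accept a child that is at least as fit."""
    rng = random.Random(seed)
    best = [(0, 0, 0)] * 4
    for _ in range(generations):
        child = mutate(best, rng)
        if fitness(child, truth_table) >= fitness(best, truth_table):
            best = child
    return best

XOR_TABLE = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
# Hand-built configuration: OR and NAND in column 0, AND of both in column 1.
HAND_BUILT = [(1, 0, 1), (3, 0, 1), (0, 0, 1), (0, 0, 0)]
```

`fitness(HAND_BUILT, XOR_TABLE)` reaches the maximum of 4, and `evolve(XOR_TABLE)` returns a flat list of genes that could be written into a configuration memory as-is, which mirrors the "memory mapping" step described in the abstract.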
33

Compiling For Coarse-Grained Reconfigurable Architectures Based On Dataflow Execution Paradigm

Alle, Mythri 12 1900
Coarse-Grained Reconfigurable Architectures (CGRAs) can be employed to accelerate computational workloads that demand both flexibility and performance. A CGRA comprises a set of compute elements interconnected by a network; this interconnection of compute elements is referred to as a reconfigurable fabric. The size of the application that can be accommodated on the reconfigurable fabric is limited by the size of the instruction buffers associated with each compute element. When an application cannot be accommodated entirely, it is partitioned so that each partition can be executed on the reconfigurable fabric. These partitions are scheduled by an orchestrator, which employs the dynamic dataflow execution paradigm. Dynamic dataflow execution has inherent support for synchronization and helps exploit the parallelism that exists across application partitions. In this thesis, we present a compiler that targets such CGRAs. The compiler accepts applications written in the C89 standard. To enable architectural design space exploration, the compiler is designed so that it can be customized, by specifying the appropriate configuration parameters, for several instances of CGRAs that employ the dataflow execution paradigm at the orchestrator. The focus of this thesis is to provide efficient support for various kinds of parallelism while ensuring correctness. The compiler supports fine-grained task-level parallelism that exists across loop iterations and function calls. It also supports pipeline parallelism, where a loop is split into multiple stages that execute in a pipelined manner. We demonstrate a prototype compiler and use it to target multiple variants of CGRAs employing the dataflow execution paradigm, varying the reconfigurable fabric, the orchestration mechanism, and the size of the instruction buffers. We chose applications from two domains: cryptography and linear algebra. The execution time on the CGRA (the best among all instances) is compared against an Intel quad-core processor. Cryptography applications show a performance improvement ranging from more than one order of magnitude to nearly two orders of magnitude. These applications contain large amounts of instruction-level parallelism (ILP), and our compiler successfully exposes it. Domain customization also played an important role in achieving good performance: we employed two custom functional units to accelerate cryptography applications, and the compiler used them efficiently. In linear algebra kernels, we observe multiple iterations of a loop executing in parallel, effectively exploiting loop-level parallelism at runtime. In spite of this, we observe close to an order of magnitude of performance degradation, which can be attributed to the use of non-pipelined floating-point units and to memory access delays. Pipeline parallelism was demonstrated with this compiler for FFT and QR factorization. Thus, the compiler efficiently supports different kinds of parallelism and the complete C89 standard, and it can target different instances of CGRAs employing the dataflow execution paradigm.
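
The firing rule behind dataflow orchestration (a partition runs as soon as all of its input tokens have arrived) can be sketched as follows. This is a simplified, single-firing model written for illustration: the partition names, the edge encoding, and `run_dataflow` itself are hypothetical, and the tagging that a dynamic dataflow machine uses to tell loop iterations apart is omitted.

```python
from collections import deque

def run_dataflow(partitions, edges, initial_tokens):
    """Fire each partition once all of its input tokens have arrived.

    partitions: name -> (number of inputs, function over the input list)
    edges: producer name -> list of (consumer name, input slot)
    initial_tokens: list of (consumer name, input slot, value)
    Returns the firing order and each partition's result.
    """
    pending = {name: {} for name in partitions}
    ready = deque()
    order, results = [], {}

    def deliver(name, slot, value):
        pending[name][slot] = value
        # A partition becomes ready when every input slot holds a token.
        if len(pending[name]) == partitions[name][0]:
            ready.append(name)

    for name, slot, value in initial_tokens:
        deliver(name, slot, value)

    while ready:
        name = ready.popleft()
        arity, fn = partitions[name]
        args = [pending[name][i] for i in range(arity)]
        results[name] = fn(args)
        order.append(name)
        # Forward the result token to every consumer of this partition.
        for consumer, slot in edges.get(name, []):
            deliver(consumer, slot, results[name])
    return order, results

# Hypothetical application split into three partitions: (a + b) * (a - b).
PARTITIONS = {
    "add": (2, lambda v: v[0] + v[1]),
    "sub": (2, lambda v: v[0] - v[1]),
    "mul": (2, lambda v: v[0] * v[1]),
}
EDGES = {"add": [("mul", 0)], "sub": [("mul", 1)]}
TOKENS = [("add", 0, 5), ("sub", 0, 5), ("add", 1, 3), ("sub", 1, 3)]
order, results = run_dataflow(PARTITIONS, EDGES, TOKENS)
```

No explicit schedule is written anywhere: `mul` fires only after both `add` and `sub` have delivered their tokens, and `add` and `sub` are independent, so an orchestrator would be free to run them in parallel. That is the sense in which synchronization is inherent to the paradigm.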
