21 |
Performance Assessment of Model-Driven FPGA-based Software-Defined Radio Development
Allen, Matthew S 20 August 2014
"This thesis presents technologies that integrate field programmable gate arrays (FPGAs), model-driven design tools, and software-defined radios (SDRs). Specifically, it assesses current state-of-the-art practices for applying model-driven development techniques to SDR systems. FPGAs have become increasingly versatile computing devices due to their size and resource enhancements, advanced core generation, partial reconfigurability, and system-on-a-chip (SoC) implementations. Although FPGAs offer better performance per watt than central processing units (CPUs) or graphics processing units (GPUs), they have often been avoided because of long development cycles and higher implementation costs, which stem from the steep learning curves and low levels of abstraction associated with hardware description languages (HDLs). This thesis conducts a performance assessment of SDR designs using both a model-driven design approach developed with MathWorks HDL Coder and a hand-optimized design approach created from the model-driven VHDL. Each design was implemented on the FPGA fabric of a Zynq-7000 SoC, using a ZedBoard evaluation platform for hardware verification. Furthermore, a set of guidelines and best practices for applying model-driven design techniques to the development of SDR systems using HDL Coder is presented."
|
22 |
Implementation of optical flow algorithms in FPGA platforms with embedded CPU
Santos, João Pedro Ramos de Oliveira January 2009
Integrated master's thesis. Electrical and Computer Engineering. Faculdade de Engenharia. Universidade do Porto. 2009
|
23 |
FPGA implementation of algorithms for signal estimation in coherent optical systems (original title: Implementação de algoritmos em FPGA para estimação de sinal em sistemas ópticos coerentes)
Pinto, Nuno José de Moura January 2009
Integrated master's thesis. Electrical and Computer Engineering (Telecommunications major). Faculdade de Engenharia. Universidade do Porto. 2009
|
24 |
Driving actuators in a Suzaku board featuring FPGA+PowerPC
Carvalhosa, André Manuel Ferraz January 2009
Integrated master's thesis. Electrical and Computer Engineering (Automation branch). Faculdade de Engenharia. Universidade do Porto. 2009
|
25 |
Design of hardware for running sblock-based experiments (original title: Konstruksjon av maskinvare for kjøring av sblokkbaserte eksperimenter)
Djupdal, Asbjørn January 2003
A CompactPCI computer with a NallaTech BenERA FPGA board has been purchased for use in research on evolvable hardware. This computer can reconfigure and run an arbitrary circuit on an FPGA, and can communicate with the FPGA quickly over a PCI bus. The maximum rate of communication with modules on the FPGA is limited by the CompactPCI bus to 132 MB per second.
This thesis concerns the design of a system built around the BenERA FPGA board, to be used for running sblock-based experiments with a focus on development. The project builds on earlier research on sblocks and development at NTNU.
A system has been developed in which sblock matrices can be downloaded to an FPGA and run there. The circuit also includes purpose-built hardware that can run development on the sblock matrix. The sblock matrix can be inspected at any time by software on the CompactPCI machine by reading data back over the PCI bus. All control is performed in software and can be automated.
The circuit is implemented as a pipelined coprocessor. The coprocessor can process two sblocks per cycle, both when running development steps and when configuring the sblock matrix. When the sblock matrix is run, all sblocks are updated every cycle. When data is read back from the sblock matrix, 8 sblocks are handled per cycle.
Synthesis achieved a clock frequency of 80 MHz.
|
27 |
Modeling and reduction of dynamic power in field-programmable gate arrays
Lamoureux, Julien 05 1900
Field-Programmable Gate Arrays (FPGAs) are one of the most popular platforms for implementing digital circuits. Their main advantages include the ability to be (re)programmed in the field, a shorter time-to-market, and lower non-recurring engineering costs. This programmability, however, is afforded through a significant amount of additional circuitry, which makes FPGAs significantly slower and less power-efficient compared to Application Specific Integrated Circuits (ASICs).
This thesis investigates three aspects of low-power FPGA design: switching activity estimation, switching activity minimization, and low-power FPGA clock network design. In our investigation of switching activity estimation, we compare new and existing techniques to determine which are most appropriate in the context of FPGAs. Specifically, we compare how each technique affects the accuracy of FPGA power models and the ability of power-aware CAD tools to minimize power. We then present a new publicly available activity estimation tool called ACE-2.0 that incorporates the most appropriate techniques. Using activities estimated by ACE-2.0, power estimates and power savings were both within 1% of results obtained using simulated activities. Moreover, the new tool was 69 and 7.2 times faster than circuit simulation for combinational and sequential circuits, respectively.
In our investigation of switching activity minimization, we propose a technique for reducing power in FPGAs by minimizing unnecessary transitions called glitches. The technique involves adding programmable delay elements at inputs of the logic elements of the FPGA to align the arrival times, thereby preventing new glitches from being generated. On average, the proposed technique eliminates 87% of the glitching, which reduces overall FPGA power by 17%. The added circuitry increases the overall FPGA area by 6% and critical-path delay by less than 1%.
Finally, in our investigation of low-power FPGA clock networks, we examine the tradeoff between the power consumption of FPGA clock networks and the cost of the constraints they impose on FPGA CAD tools. Specifically, we present a parameterized framework for describing FPGA clock networks, we describe new clock-aware placement techniques, and we perform an empirical study to examine how the clock network parameters affect the overall power consumption of FPGAs. The results show that the techniques used to produce a legal placement can have a significant influence on power and delay. On average, circuits placed using the most effective techniques dissipate 9.9% less energy and were 2.4% faster than circuits placed using the least effective techniques. Moreover, the results show that the architecture of the clock network is also important. On average, FPGAs with an efficient clock network were up to 12.5% more energy efficient and 7.2% faster than other FPGAs.
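The arrival-time alignment behind the glitch-reduction technique above can be illustrated with a small sketch. It assumes ideal delay elements that can add any non-negative delay; the function name and the example arrival times are hypothetical and are not taken from the thesis, whose actual delay elements may well be quantized and range-limited.

```python
def alignment_delays(arrival_times_ns):
    """Return the extra delay each input needs so all inputs arrive together.

    Assumption: ideal delay elements with unlimited resolution and range.
    The latest-arriving input gets no added delay; every earlier input is
    delayed until it matches the latest one.
    """
    latest = max(arrival_times_ns)
    return [latest - t for t in arrival_times_ns]


# Hypothetical arrival times (in ns) at the three inputs of one logic element.
delays = alignment_delays([1.0, 2.5, 0.5])
aligned = [t + d for t, d in zip([1.0, 2.5, 0.5], delays)]
```

Once all inputs switch at the same instant, the logic element's output can settle in a single transition instead of passing through intermediate values.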
|
28 |
Parallelizing Simulated Annealing Placement for GPGPU
Choong, Alexander 17 December 2010
Field Programmable Gate Array (FPGA) devices are increasing in capacity at an exponential rate, and thus there is an increasingly strong demand to accelerate simulated annealing placement. Graphics Processing Units (GPUs) offer a unique opportunity to accelerate simulated annealing placement on a manycore architecture using only commodity hardware. GPUs are optimized for applications that can tolerate high single-thread latency, which lets them provide high throughput across many threads. However, simulated annealing is not embarrassingly parallel, so single-thread latency should be minimized to improve run time. Thus it is questionable whether GPUs can achieve any speedup over a sequential implementation. In this thesis, a novel subset-based simulated annealing placement framework is proposed, which specifically targets the GPU architecture. A highly optimized framework is implemented which, on average, achieves an order of magnitude speedup with less than 1% degradation for wirelength and no loss in quality for timing on realistic architectures.
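For context, a conventional sequential simulated annealing placer can be sketched as follows. This is a generic textbook-style baseline, not the subset-based GPU framework proposed in the thesis; the function names, the swap move, the half-perimeter wirelength cost, and the cooling parameters are all assumptions chosen for illustration.

```python
import math
import random


def hpwl(nets, pos):
    """Half-perimeter wirelength: sum of each net's bounding-box half-perimeter."""
    total = 0
    for net in nets:
        xs = [pos[c][0] for c in net]
        ys = [pos[c][1] for c in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


def anneal_placement(num_cells, nets, width, height, seed=0,
                     t_start=5.0, t_end=0.01, alpha=0.9, moves_per_temp=200):
    """Sequential annealer; assumes 2 <= num_cells <= width * height."""
    rng = random.Random(seed)
    sites = [(x, y) for x in range(width) for y in range(height)]
    rng.shuffle(sites)
    pos = {c: sites[c] for c in range(num_cells)}  # random initial placement
    cost = initial_cost = hpwl(nets, pos)
    best_pos, best_cost = dict(pos), cost
    t = t_start
    while t > t_end:
        for _ in range(moves_per_temp):
            # Each move starts from the state left by the previous move.
            a, b = rng.sample(range(num_cells), 2)
            pos[a], pos[b] = pos[b], pos[a]
            new_cost = hpwl(nets, pos)
            delta = new_cost - cost
            # Metropolis rule: always accept improvements, sometimes accept worse.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                cost = new_cost
                if cost < best_cost:
                    best_pos, best_cost = dict(pos), cost
            else:
                pos[a], pos[b] = pos[b], pos[a]  # reject: undo the swap
        t *= alpha  # geometric cooling
    return best_pos, best_cost, initial_cost


# Hypothetical 9-cell netlist placed on a 3x3 grid.
nets = [(0, 1), (1, 2), (2, 3), (0, 3), (4, 5), (5, 6, 7), (3, 8)]
placement, best, initial = anneal_placement(9, nets, 3, 3)
```

The inner loop shows why the algorithm resists parallelization: every accept/reject decision depends on the placement produced by the move before it.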
|
30 |
Field-programmable gate-array (FPGA) implementation of low-density parity-check (LDPC) decoder in digital video broadcasting - second generation satellite (DVB-S2)
Loi, Kung Chi Cinnati 22 September 2010
In recent years, LDPC codes have gained a lot of attention among researchers. Their near-Shannon performance, combined with a highly parallel architecture and lower complexity than Turbo codes, has made LDPC codes one of the most popular forward error correction (FEC) codes in most of the recently ratified wireless communication standards. This thesis focuses on one of these standards, namely the DVB-S2 standard, which was ratified in 2005.
In this thesis, the design and architecture of an FPGA implementation of an LDPC decoder for the DVB-S2 standard are presented. The decoder architecture is an improvement over others published in the current literature. Novel algorithms are devised to use a memory mapping scheme that allows 360 functional units (FUs) used in decoding to be implemented using the Sum-Product Algorithm (SPA). The FUs are optimized for reduced hardware resource utilization on an FPGA with a large number of configurable logic blocks (CLBs) and memory blocks. A novel design of a parity-check module (PCM) is presented that verifies the parity-check equations of the LDPC codes. Furthermore, a special characteristic of five of the codes defined in the DVB-S2 standard and their influence on the decoder design are discussed.
Three versions of the LDPC decoder are implemented, namely the 360-FU decoder, the 180-FU decoder and the hybrid 360/180-FU decoder. The decoders are synthesized for two FPGAs. A Xilinx Virtex-II Pro family FPGA is used for comparison purposes and a Xilinx Virtex-6 family FPGA is used to demonstrate the portability of the design. The synthesis results show that the hardware resource utilization and minimum throughput of the decoders presented are competitive with a DVB-S2 LDPC decoder found in the current literature that also uses FPGA technology.
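What it means to verify the parity-check equations of a code can be shown with a short software sketch. It uses a tiny (7,4) Hamming parity-check matrix as a stand-in, since the DVB-S2 matrices are far larger; the function names and the sparse row representation are assumptions for illustration and do not describe the thesis's hardware module.

```python
def syndrome(check_rows, bits):
    """Evaluate every parity-check equation over GF(2).

    check_rows: one list per check, holding the indices of the bits it covers
                (a sparse view of the parity-check matrix H).
    bits:       hard-decision bits (0 or 1) of the candidate codeword.
    Returns one value per check; 0 means that equation is satisfied.
    """
    return [sum(bits[i] for i in row) % 2 for row in check_rows]


def is_codeword(check_rows, bits):
    """A word is a valid codeword exactly when every check evaluates to 0."""
    return not any(syndrome(check_rows, bits))


# Toy stand-in: parity-check matrix of the (7,4) Hamming code, not DVB-S2.
H_ROWS = [
    [0, 2, 4, 6],
    [1, 2, 5, 6],
    [3, 4, 5, 6],
]

valid = [1, 1, 1, 0, 0, 0, 0]
corrupted = [1, 1, 1, 0, 1, 0, 0]  # same word with bit 4 flipped
```

Each check is just an XOR over the bits it covers, which is why such a test maps naturally onto parallel hardware.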
|