181 |
Techniques de multiplexage pour un système d'émulation et de prototypage rapide à base de FPGA / Multiplexing techniques for FPGA-based emulation and prototyping platform
Turki, Mariem. 17 September 2014
Nowadays, the complexity of integrated-circuit and software design grows steadily, increasing the need for verification at every stage of the design cycle. Hardware prototyping on a multi-FPGA platform offers the best trade-off between the design time of a circuit and the execution time of an application on that circuit. To implement a circuit on such a platform, a partitioning step first creates partitions that each fit into one FPGA. As a consequence, signals cut at the partition boundaries must pass from one FPGA to another. However, the number of physical inter-FPGA traces is limited, which creates routability problems for the prototyped circuit. This thesis mainly deals with the post-partitioning task and addresses the problem of inter-FPGA routing. Its main contributions are as follows. First, we develop a benchmark generator which, from a simple architectural description of the benchmark, generates a circuit modelled in the hardware description language VHDL. The generator draws on a set of industrial components, giving the benchmarks realistic behaviour similar to that of industrial circuits. These benchmarks are used to evaluate the performance of the techniques developed in this thesis. Second, we propose a specific tool that acts after partitioning to handle the constraint imposed by the limited number of interconnections between FPGAs. This tool is based on an iterative approach and aims to reduce the multiplexing ratio (the number of signals that share the same physical wire). The routing itself is performed by an adapted version of the Pathfinder routing algorithm, which is widely used by academic and industrial researchers. This algorithm serves as the starting point for the routing techniques developed in this thesis, with suitable adaptations to target an inter-FPGA routing network. Next, we try to identify the best shape of the routed signals (two-point or multi-point) and the appropriate routing graph. To this end, we propose test scenarios to select the criteria that give the best system frequency. We then present a detailed description of the architecture of the multiplexing IPs. These IPs are inserted in the transmitting and receiving FPGAs of a communication channel and include specific components called SERDES for serialization/deserialization of the data. The insertion of these IPs can create intra-FPGA routability problems. Finally, we propose a placement algorithm based on congestion estimation to improve the routability of the circuit.
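The multiplexing ratio mentioned in this abstract can be illustrated with a small calculation. The sketch below is not taken from the thesis: it assumes the cut signals between two FPGAs are spread evenly over the available physical wires, and the function name and the example figures are hypothetical.

```python
import math


def multiplexing_ratio(cut_signals: int, physical_wires: int) -> int:
    """Smallest number of signals per wire if the cut signals are spread evenly.

    Assumption for illustration only: every physical wire between the two
    FPGAs carries the same number of time-multiplexed signals.
    """
    if cut_signals < 0:
        raise ValueError("cut_signals must be non-negative")
    if physical_wires <= 0:
        raise ValueError("physical_wires must be positive")
    return math.ceil(cut_signals / physical_wires)


if __name__ == "__main__":
    # Hypothetical figures: 1000 cut signals, 120 inter-FPGA traces.
    print(multiplexing_ratio(1000, 120))
```

Under this even-spread assumption, a lower ratio means fewer signals are serialized on each wire, which is the quantity the iterative tool described above tries to reduce.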
|
182 |
FPGA accelerated packet capture with eBPF: Performance considerations of using SoC FPGA accelerators for packet capturing / FPGA-accelererad paketfångst med eBPF: Prestandaöverväganden vid användning av SoC FPGA acceleratorer för paketering
Duchniewicz, Jakub. January 2022
With the rise of the Internet of Things and the proliferation of embedded devices equipped with an accelerator, a need for efficient resource utilization has arisen. Hardware acceleration is a complex topic that requires specialized domain knowledge about the platform and about the trade-offs that have to be made, especially in the area of power consumption. Efficient work offloading strives to reduce, or at least maintain, the total power consumption of the system. Packet capture is usually offloaded in more powerful devices, so little research exists on network packet acceleration in embedded devices. The thesis focuses on accelerating network packet capture using a Field Programmable Gate Array in an embedded Linux system. The solution is based on a custom Linux distribution assembled with the Buildroot tool, a specially configured and patched Linux kernel, the U-Boot bootloader, and the programmable logic for packet acceleration. The system is evaluated on a De0-Nano System on Chip development board by varying burst lengths, packet sizes, and the programmable logic clock frequency. Metrics include packet capturing time, time per packet, and consumed power. Finally, the results are contrasted with baseline embedded Linux packet processing by inspecting a packet's path through the kernel. The collected results provide a deeper understanding of the packet acceleration problem in embedded devices, and the resulting system gives a solid starting point for possible extensions such as packet filtering. Key findings include an improvement in packet processing speed as the clock frequency and burst length are increased, with power consumption unchanged. Additionally, the solution performs better when packet sizes are above 64 bytes, because the overhead of the additional logic needed to process them is then compensated. The project is also found to be significantly faster than regular in-kernel processing, with the caveat that it provides only packet capturing whereas Linux contains a full network stack.
|
183 |
Instrumentation of CdZnTe detectors for measuring prompt gamma-rays emitted during particle therapy / Instrumentierung von CdZnTe Detektoren zur Messung prompter Gammastrahlung während der Teilchentherapie
Födisch, Philipp. 15 May 2017
Background: The irradiation of cancer patients with charged particles, mainly protons and carbon ions, has become an established method for the treatment of specific types of tumors. In comparison with the use of X-rays or gamma-rays, particle therapy has the advantage that the dose distribution in the patient can be precisely controlled. Tissue or organs lying near the tumor will be spared. A verification of the treatment plan with the actual dose deposition by means of a measurement can be done through range assessment of the particle beam. For this purpose, prompt gamma-rays are detected, which are emitted by the affected target volume during irradiation.
Motivation: The detection of prompt gamma-rays is a task related to radiation detection and measurement. Nuclear applications in medicine can be found in particular for in vivo diagnosis. In that respect, the spatially resolved measurement of gamma-rays is an essential technique for nuclear imaging; however, the technical requirements of radiation measurement during particle therapy are much more challenging than those of classical applications. For this purpose, appropriate instruments beyond the state of the art need to be developed and tested for detecting prompt gamma-rays. Hence, the success of a method for range assessment of particle beams is largely determined by the implementation of the electronics. In practice, this means that a suitable detector material with adapted readout electronics, signal and information processing, and data interface must be utilized to solve the challenges. Thus, the parameters of the system (e.g. segmentation, time or energy resolution) can be optimized depending on the method (e.g. slit camera, time-of-flight measurement or Compton camera). Regardless of the method, the detector system must have a high count rate capability and a large measuring range (>7 MeV). For a subsequent evaluation of a suitable method for imaging, the mentioned parameters must not be restricted by the electronics. Digital signal processing is predestined for multipurpose tasks, and the performance of such an implementation has to be determined with respect to these demands.
Materials and methods: In this study, the instrumentation of a detector system for prompt gamma-rays emitted during particle therapy is limited to the use of a cadmium zinc telluride (CdZnTe, CZT) semiconductor detector. The detector crystal is divided into an 8x8 pixel array by segmented electrodes. Analog and digital signal processing are tested by way of example with this type of detector, aiming at the application of a Compton camera to range assessment. The electronics are implemented with commercial off-the-shelf (COTS) components. Where possible, functional units of the detector system are digitized and implemented in a field-programmable gate array (FPGA). An efficient implementation of the algorithms in terms of timing and logic utilization is fundamental to the design of digital circuits. The measurement system is characterized with radioactive sources to determine the measurement dynamic range and resolution. Finally, the performance is examined in terms of the requirements of particle therapy with experiments at particle accelerators.
Results: A detector system based on a CZT pixel detector has been developed and tested. Although the use of an application-specific integrated circuit is convenient, this approach was rejected because there was no circuit available which met the requirements. Instead, a multichannel, compact, and low-noise analog amplifier circuit with COTS components has been implemented. Finally, the 65 information channels of a detector are digitized, processed and visualized.
Advanced digital signal processing transforms the traditional approaches of nuclear electronics into algorithms and digital filter structures for an FPGA. With regard to the characteristic signals (e.g. varying rise times, depth-dependent energy measurement) of a CZT pixel detector, it could be shown that digital pulse processing results in a very good energy resolution (~2% FWHM at 511 keV) and permits a time measurement in the range of some tens of nanoseconds. Furthermore, the experimental results have shown that the dynamic range of the detector system could be significantly improved compared to the existing prototype of the Compton camera (~10 keV..7 MeV). Even count rates of ~100 kcps in a high-energy beam could ultimately be processed with the CZT pixel detector. This, however, is merely a limit of the detector due to its volume and is not related to the electronics. In addition, the versatility of digital signal processing has been demonstrated with other detector materials (e.g. CeBr3). In anticipation of high data throughput in a distributed data acquisition from multiple detectors, a Gigabit Ethernet link has been implemented as the data interface.
Conclusions: To fully exploit the capabilities of a CZT pixel detector, digital signal processing is absolutely necessary. A decisive advantage of the digital approach is its ease of use in a multichannel system. With digitization, a necessary step has thus been taken to master the complexity of a Compton camera. Furthermore, the technology assessment shows that a CZT pixel detector meets the requirements of measuring prompt gamma-rays during particle therapy. The previously used orthogonal strip detector must be replaced by the pixel detector in favor of increased efficiency and improved energy resolution. With the integration of the developed digital detector system into a Compton camera, it must ultimately be proven whether this method is applicable for range assessment in particle therapy. Even if another method is more convenient in a clinical environment due to practical considerations, the detector system of that method may benefit from the instrumentation shown here of a digital signal processing system for nuclear applications.
|
184 |
Fast Code Exploration for Pipeline Processing in FPGA Accelerators / Exploração Rápida de Códigos para Processamento Pipeline em Aceleradores FPGA
Rosa, Leandro de Souza. 31 May 2019
The increasing demand for energy-efficient computing has encouraged the use of Field-Programmable Gate Arrays to create hardware accelerators for large and complex codes. However, implementing such accelerators involves two complex decisions. The first lies in deciding which code snippet is the best candidate for an accelerator, and the second lies in how to implement the accelerator. When both decisions are considered together, the problem becomes more complicated, since the implementation of a code snippet affects the choice of snippet, creating a combined design space to be explored. As such, a fast design space exploration for the accelerator implementation is crucial to allow the exploration of different code snippets. However, such design space exploration suffers from several time-consuming tasks during the compilation and evaluation steps, which makes it impractical for snippet exploration. In this work, we focus on the efficient implementation of pipelined hardware accelerators and present our contributions on speeding up pipeline creation and its design space exploration. For loop pipelining, the proposed approaches achieve up to 100× speed-up compared to state-of-the-art methods, saving 164 hours in a full design space exploration with less than 1% impact on the quality of the final results. For design space exploration, the proposed methods achieve up to 9.5× speed-up, with less than 1% impact on the quality of the results.
|
185 |
Contribution à la continuité de service des convertisseurs statiques multiniveaux / Contribution to the continuity of service of multilevel converters
Becker, Florent. 04 December 2017
This thesis deals with the continuity of service of multilevel power converters after the failure of one of their power components. The converter topologies studied, both widely used in industrial applications, are the Neutral Point Clamped (NPC) and the Neutral Point Piloted (NPP), or T-Type, topologies. First, to reduce the failure rate of the converter, an advanced control is proposed; it increases the lifetime of the power components by minimizing, over each period, the number of turn-on and turn-off switchings of the controllable components. This idea is based on the fact that a multilevel converter makes it possible to generate the same output voltage level from several different switching sequences. The principle of the proposed control is developed in a general way and then applied to 5-level "H-bridge" topologies of the NPP (or T-Type) and NPC kinds. Then, the continuity of service in nominal mode is studied for a 5-level "H-bridge" NPP (or T-Type) converter when an open-circuit failure occurs on a power component. We first propose a fault diagnosis consisting of a fault detection step followed by the precise location of the faulty component. An original fault-tolerant converter topology then ensures the continuity of service of the system by modifying the control according to the located faulty component. Reconfigurable digital electronic architectures based on Field Programmable Gate Array (FPGA) components are dedicated to the diagnosis and to the reconfiguration of the control; they achieve high timing performance. All the results presented in this thesis are validated by modeling and simulation, and then experimentally on a test bench.
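The idea of exploiting redundant switching sequences to limit commutations can be sketched as follows. This is an illustration only, not the control law of the thesis: the redundancy table below is hypothetical and does not describe a real NPC or NPP converter, and the selection rule (fewest switches changing state) is an assumed simplification.

```python
from typing import Dict, List, Tuple

# A switch state is a tuple of 0/1 values, one per controllable switch.
State = Tuple[int, ...]

# Hypothetical table: for each output level, the switch states that produce it.
# These values are made up for illustration and model no real topology.
REDUNDANT_STATES: Dict[int, List[State]] = {
    0: [(0, 0, 0, 0), (1, 1, 1, 1)],
    1: [(1, 0, 0, 0), (1, 1, 1, 0)],
}


def commutations(a: State, b: State) -> int:
    """Count how many switches change position between two states."""
    return sum(x != y for x, y in zip(a, b))


def next_state(current: State, level: int,
               table: Dict[int, List[State]] = REDUNDANT_STATES) -> State:
    """Among the states giving the requested level, pick the one that
    requires the fewest commutations from the current state."""
    return min(table[level], key=lambda s: commutations(current, s))


if __name__ == "__main__":
    print(next_state((1, 1, 1, 1), 1))
```

With this hypothetical table, moving to level 1 from (1, 1, 1, 1) selects the candidate that flips a single switch rather than the one that flips three.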
|
186 |
FPGA prototyping of custom GPGPUs
Nigania, Nimit. 08 January 2014
Prototyping new systems on hardware is a time-consuming task with limited scope for architectural exploration. The aim of this work was to perform fast prototyping of general-purpose graphics processing units (GPGPUs) on field programmable gate arrays (FPGAs) using a novel tool chain. This hardware flow combined with the higher level simulation flow using the same source code allowed us to create a whole tool chain to study and build future architectures using new technologies. It also gave us enough flexibility at different granularities to make architectural decisions. We will also discuss some example systems that were built using this tool chain along with some results.
|
187 |
Analog signal processing on a reconfigurable platform
Schlottmann, Craig Richard 08 July 2009
The Cooperative Analog/Digital Signal Processing (CADSP) research group's approach to signal processing is to see what opportunities lie in adjusting the line between what is traditionally computed in digital and what can be done in analog. By allowing more computation to be done in analog, we can take advantage of its low power, continuous-domain operation, and parallel capabilities. One setback keeping Analog Signal Processing (ASP) from achieving more widespread use, however, is its lack of programmability. The design cycle for a typical analog system often involves several iterations of the fabrication step, which is labor-intensive, time-consuming, and expensive. These costs in both time and money reduce the likelihood that engineers will consider an analog solution. With CADSP's development of a reconfigurable analog platform, a Field-Programmable Analog Array (FPAA), it has become much more practical for systems to incorporate processing in the analog domain. In this thesis, I present an entire chain of tools that allows one to design simply at the system block level and then compile that design onto analog hardware. This tool chain uses the Simulink design environment and a custom library of blocks to create analog systems. I also present several of these ASP blocks, covering a broad range of functions from matrix computation to interfacing. In addition to these tools and blocks, the most recent FPAA architectures are discussed. These include the latest RASP general-purpose FPAAs as well as an adapted version geared toward high-speed applications.
|
188 |
Cryptography and cryptanalysis on reconfigurable devices: security implementations for hardware and reprogrammable devices
Güneysu, Tim Erhan January 2009
Also published as: dissertation, University of Bochum, 2009
|
189 |
Uma plataforma de hardware para processamento de imagem baseada na transformada imagem-floresta / A hardware platform for image processing based on the image foresting transform
Cappabianco, Fabio Augusto Menocci 15 February 2006
Advisors: Guido Costa Souza de Araujo, Alexandre Xavier Falcão / Dissertation (master's) - Universidade Estadual de Campinas, Instituto de Computação
Summary: Implementations of image processing operators on hardware platforms have achieved excellent results owing to their parallel operation on several regions of the image. At the same time, the IFT (Image Foresting Transform) has proven to be an efficient technique for reducing image processing problems to a path-forest problem on a graph, whose solution is obtained in time linear in the number of pixels. This work contains the implementation of a hardware platform, called SIFT (Silicon Image Foresting Transform), that executes the IFT algorithm in parallel. The SIFT processing and storage model serves as a basis for other image processing architectures and broadens the understanding of some concepts of the predecessor and label maps used by the IFT. / Abstract: Great results have been achieved by using hardware platforms to implement image processing operators. This success is due to the use of multiple processors working in parallel on several regions of the image. On the other hand, the IFT (Image Foresting Transform), a software technique that reduces image processing problems to a graph path-forest problem, performs image operations in time linear in the number of pixels in most applications. The main goal of this work was to build a hardware platform that implements an algorithm based on the IFT in a fast and efficient way. / Master's / Master in Computer Science
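The path-forest formulation mentioned in this abstract can be sketched in software as a Dijkstra-style propagation from labeled seeds. The following is a minimal sequential sketch, not the SIFT hardware design: it assumes a 4-connected grid, a max-arc path cost (the largest intensity step along the path), and a binary heap in place of the bucket queue usually credited with the linear-time bound. The function name `ift` and its arguments are illustrative.

```python
import heapq


def ift(image, seeds):
    """Sketch of an image foresting transform on a 2D grid.

    image: list of rows of integer intensities.
    seeds: dict mapping (row, col) -> label.
    Returns (cost, pred, label) maps as lists of rows.
    """
    rows, cols = len(image), len(image[0])
    inf = float("inf")
    cost = [[inf] * cols for _ in range(rows)]
    pred = [[None] * cols for _ in range(rows)]
    label = [[None] * cols for _ in range(rows)]

    heap = []
    for (r, c), lab in seeds.items():
        cost[r][c] = 0
        label[r][c] = lab
        heapq.heappush(heap, (0, r, c))

    while heap:
        cst, r, c = heapq.heappop(heap)
        if cst > cost[r][c]:
            continue  # stale queue entry, a cheaper path was already found
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                # Assumed path cost: maximum intensity step along the path.
                new_cost = max(cst, abs(image[nr][nc] - image[r][c]))
                if new_cost < cost[nr][nc]:
                    cost[nr][nc] = new_cost
                    pred[nr][nc] = (r, c)          # predecessor map
                    label[nr][nc] = label[r][c]    # label inherited from the root
                    heapq.heappush(heap, (new_cost, nr, nc))
    return cost, pred, label


if __name__ == "__main__":
    img = [[0, 0, 9, 9],
           [0, 0, 9, 9]]
    _, _, lab = ift(img, {(0, 0): 1, (0, 3): 2})
    for row in lab:
        print(row)
```

The predecessor and label maps returned here correspond to the concepts the abstract refers to; how SIFT stores and updates them in parallel is not described in this listing.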
|
190 |
Implementação de codificador LDPC para um sistema de TV digital usando ferramentas de prototipagem rapida / Implementation of an LDPC encoder for a digital TV system using rapid prototyping tools
Garcia, Fábio Lumertz, 1979- 21 December 2006
Advisors: Dalton Soares Arantes, Fabbryccio A. Cardoso / Dissertation (master's) - Universidade Estadual de Campinas, Faculdade de Engenharia Eletrica e de Computação
Summary: The aim of this work is to present the various stages of implementing an LDPC encoder for a digital television system, developed using some innovative technologies for rapid prototyping on FPGAs. The implemented encoder was based on an eIRA LDPC code, an extended class of irregular repeat-accumulate codes, with a codeword of 9792 bits and a rate of 3/4. To bring other emerging technologies into the Digital TV project, the proposed system was developed to operate over the Internet Protocol (IP). This work was part of a broader effort by a consortium of Brazilian universities aimed at the conception, design, simulation, and hardware implementation of an Innovative Modulation System for the SBTVD. The strong synergy achieved in this project and the intensive use of FPGA rapid prototyping tools made it possible to obtain a proof of concept implemented and tested within only 12 months. / Abstract: This work presents the several phases in the implementation of an LDPC encoder for a digital television system, developed using innovative technologies for rapid prototyping on Field Programmable Gate Array (FPGA) devices. The implemented encoder was based on an eIRA (extended Irregular Repeat Accumulate) LDPC code with a codeword length of 9792 bits and rate 3/4. The proposed system was developed to work with video streaming over the Internet Protocol (IP). This work is part of a more ambitious project that resulted in the development of an advanced Modulation System for the Brazilian Digital TV System (SBTVD). / Master's / Telecommunications and Telematics / Master in Electrical Engineering
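The code parameters quoted in this abstract fix the block sizes: a 9792-bit codeword at rate 3/4 carries 7344 information bits and 2448 parity bits. The sketch below shows systematic repeat-accumulate encoding with those sizes. It is only an illustration: the abstract does not give the eIRA parity-check matrix, so the sparse connection pattern is a random placeholder, and `make_connections`, `ira_encode`, and `row_weight` are hypothetical names.

```python
import random

N = 9792          # codeword length in bits (from the abstract)
K = N * 3 // 4    # information bits at rate 3/4
M = N - K         # parity bits


def make_connections(k, m, row_weight, seed=0):
    """Placeholder sparse pattern: info-bit indices feeding each parity check.

    The real eIRA matrix is not given in the abstract; this is random.
    """
    rng = random.Random(seed)
    return [rng.sample(range(k), row_weight) for _ in range(m)]


def ira_encode(info, connections):
    """Systematic encoding with an accumulator.

    parity[i] = parity[i-1] XOR (XOR of the info bits selected by check i).
    """
    parity = []
    acc = 0
    for idxs in connections:
        s = 0
        for j in idxs:
            s ^= info[j]
        acc ^= s
        parity.append(acc)
    return list(info) + parity


if __name__ == "__main__":
    conn = make_connections(K, M, row_weight=3)
    rng = random.Random(1)
    message = [rng.randint(0, 1) for _ in range(K)]
    codeword = ira_encode(message, conn)
    print(K, M, len(codeword))
```

The accumulator makes each parity bit depend only on the previous parity bit and a few information bits, which is one reason this family of codes is attractive for streaming hardware encoders such as an FPGA implementation.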
|