121

FPGA Reservoir Computing Networks for Dynamic Spectrum Sensing

Shears, Osaze Yahya 14 June 2022
The rise of 5G and beyond systems has fuelled research in merging machine learning with wireless communications to achieve cognitive radios. However, the portability and limited power supply of radio frequency devices limit engineers' ability to combine them with powerful predictive models. This hinders the ability to support advanced 5G applications such as device-to-device (D2D) communication and dynamic spectrum sharing (DSS). This challenge has inspired a wave of research in energy-efficient machine learning hardware with low computational and area overhead. In particular, hardware implementations of the delayed feedback reservoir (DFR) model show promising results for meeting these constraints while achieving high accuracy in cognitive radio applications. This thesis answers two research questions surrounding the applicability of FPGA DFR systems for DSS. First, can a DFR network implemented on an FPGA run faster and with lower power than a purely software approach? Second, can the system be implemented efficiently on an edge device running at less than 10 watts? Two systems are proposed that prove FPGA DFRs can achieve these feats: a mixed-signal circuit, followed by a high-level synthesis circuit. The implementations execute up to 58 times faster, and operate at more than 90% lower power than the software models. Furthermore, the lowest recorded average power of 0.130 watts proves that these approaches meet typical edge device constraints. When validated on the NARMA10 benchmark, the systems achieve a normalized error of 0.21 compared to state-of-the-art error values of 0.15. In a DSS task, the systems are able to predict spectrum occupancy with up to 0.87 AUC in high-noise, multiple-input, multiple-output (MIMO) antenna configurations compared to 0.99 AUC in other works. At the end of this thesis, the trade-offs between the approaches are analyzed, and future directions for advancing this study are proposed.
/ Master of Science / The rise of 5G and beyond systems has fuelled research in merging machine learning with wireless communications to achieve cognitive radios. However, the portability and limited power supply of radio frequency devices limit engineers' ability to combine them with powerful predictive models. This hinders the ability to support advanced 5G and internet-of-things (IoT) applications. This challenge has inspired a wave of research in energy-efficient machine learning hardware with low computational and area overhead. In particular, hardware implementations of a low-complexity neural network model, called the delayed feedback reservoir, show promising results for meeting these constraints while achieving high accuracy in cognitive radio applications. This thesis answers two research questions surrounding the applicability of field-programmable gate array (FPGA) delayed feedback reservoir systems for wireless communication applications. First, can this network implemented on an FPGA run faster and with lower power than a purely software approach? Second, can the network be implemented efficiently on an edge device running at less than 10 watts? Two systems are proposed that prove the FPGA networks can achieve these feats. The systems demonstrate lower power consumption and latency than the software models. Additionally, the systems maintain high accuracy on traditional neural network benchmarks and wireless communications tasks. The second implementation is further demonstrated in a software-defined radio architecture. At the end of this thesis, the trade-offs between the approaches are analyzed, and future directions for advancing this study are proposed.
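For readers unfamiliar with the NARMA10 benchmark cited in the abstract, the following is a minimal sketch of the commonly used tenth-order recurrence together with a normalized root-mean-square error. It assumes the standard coefficients (0.3, 0.05, 1.5, 0.1) and inputs drawn uniformly from [0, 0.5]; the thesis may use a different variant or a different error normalization, and the names `narma10` and `nrmse` are illustrative only.

```python
import math
import random


def narma10(n_steps, seed=0):
    """Generate an input/target pair for the standard NARMA10 recurrence.

    Assumed form (not taken from the thesis):
    y(t+1) = 0.3*y(t) + 0.05*y(t)*sum(y(t-9..t)) + 1.5*u(t-9)*u(t) + 0.1
    """
    rng = random.Random(seed)
    u = [rng.uniform(0.0, 0.5) for _ in range(n_steps)]
    y = [0.0] * n_steps
    for t in range(9, n_steps - 1):
        window = sum(y[t - 9:t + 1])  # the ten most recent outputs
        y[t + 1] = (0.3 * y[t] + 0.05 * y[t] * window
                    + 1.5 * u[t - 9] * u[t] + 0.1)
    return u, y


def nrmse(target, prediction):
    """Root-mean-square error normalized by the target's standard deviation."""
    n = len(target)
    mean = sum(target) / n
    variance = sum((v - mean) ** 2 for v in target) / n
    mse = sum((a - b) ** 2 for a, b in zip(target, prediction)) / n
    return math.sqrt(mse / variance)
```

Under this normalization a predictor that always outputs the target mean scores 1.0, which gives a rough scale for reading error values such as those quoted above.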
122

UTILIZATION OF FIELD PROGRAMMABLE GATE ARRAYS AND DIGITAL SIGNAL PROCESSING MICROPROCESSORS IN AN ADVANCED PC TT&C SATCOM SYSTEM

Meyers, Tom 10 1900
International Telemetering Conference Proceedings / October 25-28, 1999 / Riviera Hotel and Convention Center, Las Vegas, Nevada / L-3 Communications Telemetry & Instrumentation (L-3 T&I) has developed an advanced IBM PC-AT Telemetry, Tracking, and Commanding (TT&C) SATCOM system based on the utilization of Field Programmable Gate Array / Digital Signal Processing (FPGA/DSP) microprocessors. This system includes up-link, down-link, and range processing sections. Physically, the system consists of one IF Transceiver and two or more FPGA/DSP microprocessor boards called Advanced Processing Microprocessors (APMs). The form factor of these PWBs is compliant with full length, full height IBM PC PCI bus cards. This paper describes the features and functionality of an advanced Telemetry, Tracking, and Commanding Processing System (TT&CPS) based on the implementation of FPGA and DSP microprocessors. The high-level functional attributes of the TT&CPS are depicted in Figure 1. There are four main functional blocks: the IF Transceiver, the Down-Link Processing Section, the Up-Link Processing Section, and the Range Processor. The analog/IF circuitry in the IF Transceiver card interfaces between the 68–72 MHz (70 MHz, nominal) IF I/O signals and the Up-Link and Down-Link Processing Section's DSP equipment. The down-link portion of the IF Transceiver card has two user-selected input ports. From the selected input, the signal is processed through selectable bandwidth limiting, gain control, Doppler correction (optional), quadrature down-conversion to zero hertz (baseband), selectable baseband filtering, and precision Analog-to-Digital (A/D) conversion. The up-link portion of the IF Transceiver card takes I/Q digital data from the APM performing the up-link processing functions. 
This baseband I/Q digital data is Digital-to-Analog (D/A) converted, filtered, quadrature up-converted to 68–72 MHz, up-link Doppler corrected (optional), output level detected and level controlled, and sent to a two-position output selector switch. The down-link portion of the TT&CPS provides main carrier linear PM or BPSK or QPSK demodulation and can also, in composite linear PM demodulation mode, receive and demodulate FSK and/or BPSK subcarriers and ranging signals. The demodulators use symbol timing loops and bit decision circuits (matched filters) to perform the bit synchronization function. Several decoding algorithms, including differential, de-interleaving, Viterbi, and Reed-Solomon, are available for the down-link telemetry. Command format checking and CRC status are also available on FSK-demodulated data. Direct carrier BPSK/QPSK demodulation has decoding and frame synchronization capabilities. Because of the modular construction of the firmware and the use of FPGAs and DSPs, the system can be loaded with only the functions in use, lowering initial setup time while increasing overall system capability. To support a particular function, the card is downloaded with an “image,” which programs the FPGAs and DSPs at initialization. The user can change configurations by simply downloading a new set of instructions to the FPGA/DSP on the fly to keep the ground station running with minimal downtime. The flexibility of the design minimizes spare board costs, while achieving greater programmability at the end-user location.
123

Implementation of a protocol and channel coding strategy for use in ground-satellite applications

Wiid, Riaan 03 1900
Thesis (MScEng)--Stellenbosch University, 2012. / ENGLISH ABSTRACT: A collaboration between the Katholieke Universiteit van Leuven (KUL) and Stellenbosch University (SU) resulted in the development of a satellite-based platform for use in agricultural sensing applications. This will primarily serve as a test platform for a digitally beam-steerable antenna array (SAA) that was developed by KUL. SU developed all flight- and ground-station-based hardware and software, enabling ground-to-flight communications and interfacing with the KUL SAA. Although most components had already been completed at the start of this M.Sc.Eng. project, final systems integration was still unfinished. Modules necessary for communication were also outstanding. This project implemented an automatic repeat request (ARQ) strategy for reliable file transfer across the wireless link. Channel coding has also been implemented on a field programmable gate array (FPGA). This layer includes an advanced forward error correction (FEC) scheme, namely a low-density parity-check (LDPC) code, which outperforms traditional FEC techniques. A flexible architecture for channel coding has been designed that allows speed and complexity trade-offs on the FPGA. All components have successfully been implemented, tested and integrated. Simulations of LDPC on the FPGA have been shown to provide excellent error-correcting performance. The prototype has been completed and was recently demonstrated successfully at KUL. Data was reliably transferred between the satellite platform and a ground station during this event. / AFRIKAANS SUMMARY: Under a collaboration agreement between the Katholieke Universiteit van Leuven (KUL) and Stellenbosch University (SU), a satellite system was developed for sensor-network applications in the agricultural industry. This system will primarily serve as a test medium for a digitally steerable antenna (SAA) developed by KUL.
SU developed all hardware and software components needed to establish communication via the SAA between the satellite and a ground station. At the start of this M.Sc.Eng. project most components had already been developed and implemented, but final system integration still had to be completed. Modules that would enable communication were also still outstanding. This project implemented an ARQ protocol that could transfer data reliably between the satellite and a ground station. Channel coding was also implemented on a field programmable gate array (FPGA). An advanced error-correction scheme, namely a low-density parity-check (LDPC) code, which surpasses the effectiveness of traditional error-correction schemes, is implemented on this FPGA. A channel-coding architecture was also developed to trade off data processing speed against the amount of FPGA logic used. All components were successfully implemented, tested and integrated with the whole system. Simulations of LDPC on the FPGA delivered excellent error-correction results. A working prototype was recently completed and successfully demonstrated at KUL. Reliable data transfer between the satellite and the ground station was confirmed during this demonstration.
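The abstract does not say which ARQ variant was implemented. As an illustration of the general idea only, the sketch below simulates the simplest variant, stop-and-wait, over a channel that randomly corrupts frames. It assumes a CRC-32 for error detection and loss-free acknowledgements, and every name in it is hypothetical rather than taken from the thesis.

```python
import random
import zlib


def make_frame(seq, payload):
    """Prefix a 1-bit sequence number and append a CRC-32 over the body."""
    body = bytes([seq]) + payload
    return body + zlib.crc32(body).to_bytes(4, "big")


def check_frame(frame):
    """Return (seq, payload) if the CRC matches, otherwise None."""
    if len(frame) < 5:
        return None
    body, crc = frame[:-4], frame[-4:]
    if zlib.crc32(body).to_bytes(4, "big") != crc:
        return None
    return body[0], body[1:]


def lossy_channel(frame, rng, error_rate):
    """Corrupt one randomly chosen byte with probability error_rate."""
    if rng.random() < error_rate:
        damaged = bytearray(frame)
        damaged[rng.randrange(len(damaged))] ^= 0xFF
        return bytes(damaged)
    return frame


def send_file(chunks, error_rate=0.3, seed=1, max_tries=50):
    """Stop-and-wait: resend each chunk until the receiver accepts it."""
    rng = random.Random(seed)
    received, transmissions = [], 0
    for index, chunk in enumerate(chunks):
        seq = index % 2  # alternating-bit sequence number
        for _ in range(max_tries):
            transmissions += 1
            arrived = lossy_channel(make_frame(seq, chunk), rng, error_rate)
            result = check_frame(arrived)
            if result is not None and result[0] == seq:
                received.append(result[1])  # receiver accepts and ACKs
                break
        else:
            raise RuntimeError("link failed: retry limit reached")
    return received, transmissions
```

The number of transmissions returned by `send_file` shows the cost of retransmission at a given error rate, which is the overhead that a stronger channel code such as LDPC is meant to reduce.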
124

Redundant Number Systems for Optimising Digital Signal Processing Performance in Field Programmable Gate Array

Kamp, William Hermanus Michael January 2010
Speeding up addition is the key to faster digital signal processing (DSP). This can be achieved by exploiting the properties of redundant number systems. Their expanded symbol (digit) alphabet gives them multiple representations for most values. Utilising redundant representations at the output of an adder permits addition to be performed without carry-propagation, yielding fast, constant time performance irrespective of the word length. A resource-efficient implementation of this fast adder structure is developed that re-purposes the fast carry logic of low-cost field programmable gate arrays (FPGAs). Experiments confirm constant time addition and show that it outperforms binary ripple carry addition at word lengths greater than 44 bits in a Xilinx Spartan 3 FPGA and 24 bits in an Altera Cyclone III FPGA. Redundancy also provides other properties that can be exploited for performance gain. Some redundant representations will have more zero-symbols than others. These maximise the opportunities to exploit the multiplicative absorbing and additive identity properties of zero that, when exercised, reduce superfluous calculations. A serial recoding algorithm is developed that generates a redundant representation for a specified value with as few nonzero symbols as possible. Unlike previously published methods, it accepts a wide specification of number systems including those with irregularly spaced symbol alphabets. A Markov analysis and an analysis of the elementary cycles in the formulated state machine provide average and worst case measures for the tested number system. Typically, the average number of non-zero symbols is less than a third and the worst case is less than a half. Further to the increase in zero-symbols, zero-dominance is proposed as a new property of redundant number representations. It promotes a set of representations that have uniquely positioned zero-symbols, in a Pareto-optimal sense.
This set covers all representations of a value and is used to select representations to optimise the calculation of a dot-product. The dot-product or vector-multiply is a fundamental operation in DSP, since it is employed in filtering, correlation and convolution. The nonzero partial products can be packed together, substantially reducing the calculation time. The application of redundant number systems provides a two-fold benefit. Firstly, the number of nonzero partial products is reduced. Secondly, a novel opportunity is identified to use the representations in the zero-dominant set to optimise the packing further, gaining an extra 18% improvement. An implementation of the proposed dot-product with partial product packing is developed for a Cyclone II FPGA. It outperforms a quad-multiplier binary implementation in throughput by 50%. Redundant number systems excel at increasing performance in particular DSP subsystems, those that are numerically intensive and consist of considerable accumulation. The conversion back to a binary result is the performance bottleneck in the DSP algorithm, taking time proportional to that of a binary adder. Therefore, redundant number systems are best utilised when this conversion cost can be amortised over many fast redundant additions, which is typical in many DSP and communications applications.
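The idea of recoding a value so that it has few nonzero symbols can be illustrated with a well-known special case, the non-adjacent form (NAF) over the digit set {-1, 0, 1}. The sketch below shows that special case only. It is not the thesis's serial recoding algorithm, which accepts far more general symbol alphabets, and the function names are hypothetical.

```python
def naf(value):
    """Non-adjacent form of a non-negative integer, least significant digit first.

    Each digit is -1, 0 or +1 and no two neighbouring digits are both nonzero.
    """
    digits = []
    while value != 0:
        if value & 1:
            digit = 2 - (value % 4)  # +1 if value % 4 == 1, -1 if value % 4 == 3
            value -= digit           # remainder is now divisible by 4
        else:
            digit = 0
        digits.append(digit)
        value //= 2
    return digits


def multiply_with_naf(x, constant):
    """Multiply x by a constant using one shifted add or subtract per nonzero digit."""
    total = 0
    for shift, digit in enumerate(naf(constant)):
        if digit:  # zero digits contribute no partial product
            total += digit * (x << shift)
    return total
```

For example, 7 is 111 in binary (three nonzero digits) but recodes to 8 - 1, so `naf(7)` has only two nonzero digits and `multiply_with_naf` needs two partial products instead of three.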
125

Fast Code Exploration for Pipeline Processing in FPGA Accelerators / Exploração Rápida de Códigos para Processamento Pipeline em Aceleradores FPGA

Rosa, Leandro de Souza 31 May 2019
The increasing demand for energy-efficient computing has endorsed the usage of Field-Programmable Gate Arrays to create hardware accelerators for large and complex codes. However, implementing such accelerators involves two complex decisions. The first one lies in deciding which code snippet is the best to create an accelerator, and the second one lies in how to implement the accelerator. When considering both decisions concomitantly, the problem becomes more complicated since the code snippet implementation affects the code snippet choice, creating a combined design space to be explored. As such, a fast design space exploration for the accelerator implementation is crucial to allow the exploration of different code snippets. However, such design space exploration suffers from several time-consuming tasks during the compilation and evaluation steps, making it impractical for snippet exploration. In this work, we focus on the efficient implementation of pipelined hardware accelerators and present our contributions on speeding up pipeline creation and its design space exploration. Towards loop pipelining, the proposed approaches achieve up to 100× speed-up when compared to the state-of-the-art methods, saving 164 hours in a full design space exploration with less than 1% impact on the quality of the final results. Towards design space exploration, the proposed methods achieve up to 9.5× speed-up, keeping less than 1% impact on the quality of the results. / The growing demand for energy-efficient computing has endorsed the use of Field-Programmable Gate Arrays to create hardware accelerators for large and complex codes. However, implementing such accelerators involves two complex decisions. The first lies in deciding which code snippet is the best one from which to create the accelerator, and the second lies in how to implement that accelerator.
When both decisions are considered concomitantly, the problem becomes even more complicated, since the implementation of the code snippet affects the selection of code snippets, creating a combinatorial design space to be explored. Thus, a fast design space exploration for the implementation of accelerators is crucial to enable the exploration of different code snippets. However, such design space exploration is hindered by several time-consuming tasks during the compilation and analysis steps, which makes the exploration of code snippets infeasible. In this work, we focus on the efficient implementation of pipelined hardware accelerators and present our contributions to speeding up pipeline creation and its design space exploration. Regarding pipeline creation, the proposed approaches achieve a speed-up of up to 100× compared with state-of-the-art approaches, saving 164 hours in a complete design space exploration with less than 1% impact on the quality of the results. Regarding design space exploration, the proposed approaches achieve a speed-up of up to 9.5×, keeping the impact on the quality of the results below 1%.
126

Fully FPGA-based Sensorless Control for synchronous AC drive using an Extended Kalman Filter

Idkhajine, Lahoucine 24 November 2010
The objective of the work carried out in this thesis is to show the value of using FPGAs (Field Programmable Gate Arrays) as a platform for implementing complex algorithms dedicated to the control of electrical machines. To this end, an FPGA-based control without a mechanical sensor, using an extended Kalman filter, is realized. This control is intended to drive a salient-pole synchronous machine. The d-q model of the machine based on the infinite-inertia approximation is implemented. The order of the Kalman filter is therefore equal to 4, and the total complexity of the control loop is evaluated at nearly 700 arithmetic operations (of which more than 53% are multiplications). The contributions of FPGA solutions in terms of control performance and in terms of integration capacity are quantified. In terms of control performance, it has been demonstrated that with such hardware solutions the computation time is greatly reduced (on the order of 5 µs, 5% of the sampling period). This computation speed permits quasi-instantaneous control, which improves the bandwidth of the control loop. In this regard, a comparison is made with the performance obtained with a software solution such as a DSP. In both cases, the dynamic behavior of the sensorless speed control loop is quantified. In terms of integration capacity, it is possible to develop a common architecture that can be adapted to several systems. For example, it is possible to develop, on a single FPGA, a Kalman filter capable of estimating the quantities of several systems without affecting control performance. In addition, a development methodology dedicated to such complex algorithms is proposed. It is an adaptation of the methodologies proposed in previous thesis work, [62] and [63].
Specifically, a preliminary system specification step and additional optimization procedures are introduced into it. The latter are particularly necessary in the case of complex controllers and allow the developed algorithm to be matched to the corresponding FPGA architecture. Moreover, this methodology has been organized so as to distinguish the algorithm development step from the FPGA architecture development step. A state of the art of FPGA technologies is also presented. The internal structure of recent FPGAs is described. Their contribution in the field of electrical machine control is quantified. The different steps of the development methodology are presented. The development of a digital (FPGA-based) controller for a permanent magnet synchronous machine associated with a Resolver position sensor is then addressed. This application falls within an avionics context in which the objective was to have a highly integrated FPGA solution. To this end, the Actel Fusion FPGA is used. This component integrates an analog-to-digital converter. The control, the Resolver signal processing and the analog-to-digital conversion are implemented on the same component. As for the sensorless control based on the extended Kalman filter, it was decided to structure the corresponding chapters according to the proposed development methodology. Thus, the preliminary system specification phase, the algorithm development phase, the FPGA architecture development phase and the experimentation phase are treated separately. During the experimentation phase, the "Hardware-In-the-Loop (HIL)" procedure is included in order to validate the operation of the developed architecture once the physical implementation phase is complete.
/ The aim of this thesis is to demonstrate the value of using Field Programmable Gate Array (FPGA) devices for the implementation of complex AC drive controllers. The case of a sensorless speed controller using the Extended Kalman Filter (EKF) has been chosen and applied to a Salient Synchronous Machine (SSM). The d-q model based on the infinite inertia hypothesis has been implemented. The corresponding EKF order is then equal to 4 and the complexity of the whole sensorless controller is about 700 arithmetic operations (more than 53% of them multiplications). The contribution of FPGAs in this field has been quantified in terms of control performance and in terms of system integration. In terms of control performance, the proposed FPGA-based solution ensures a short execution time of around 5 µs (5% of the sampling period). This computation speed ensures a quasi-instantaneous control, which improves the control bandwidth. In this regard, a comparison with a software DSP-based solution is made. The dynamic behavior and the influence of the execution time, in both cases, on the control bandwidth have been quantified. In terms of integration capacity, it is possible to implement a generic FPGA architecture that can be adapted to the control of several systems. Thus, it is possible to develop a common EKF architecture that is able to estimate variables from many systems without affecting the control performance. In addition, a design methodology adapted to such complex controllers has been proposed. The particularity of this updated methodology, compared to the previous ones ([62], [63]), is to provide an enlarged set of steps starting from the preliminary system specification to the ultimate experimentation. Optimization procedures have also been introduced. These optimizations are necessary in the case of complex controllers and match the developed algorithm to the corresponding hardware FPGA architecture.
A state of the art of FPGA technology is also presented. The internal structure of the recent devices and their corresponding technology are discussed. Their contribution in the field of AC drive applications is quantified. An in-depth presentation of the proposed design methodology is made. In addition, the development of a fully integrated FPGA-based controller for a Permanent Magnet Synchronous Machine (PMSM) associated with a Resolver sensor is presented. This controller has been developed for an aircraft application where the main objective was to develop a fully integrated FPGA solution. The Actel Fusion FPGA device has been used. This device integrates an Analog to Digital Converter (ADC). The current controller, the Resolver Processing Unit (RPU) and the analog to digital conversion are implemented within the same device. When it comes to the sensorless controller, the corresponding chapters have been structured according to the presented design methodology: the preliminary system specification, the algorithm development, the FPGA architecture development and finally the experimentation. The latter includes Hardware-In-the-Loop (HIL) tests and the final experimental validation.
127

Digital Timing Generator for Control of Plasma Discharges

Liao, Hao Hsiang January 2019
This thesis report presents a new design of a synchronization unit for high power impulse magnetron sputtering (HiPIMS) applications used for depositing thin films. The proposed system is composed of two major hardware parts: a microcontroller unit (MCU) and a field-programmable gate array (FPGA). The control range of the new system is increased by at least ten times compared to the existing synchronization unit designed by Ionautics AB. In order to verify the system and benchmark its innovations, several batches of thin films have been deposited using the new technology. It is shown that HiPIMS with synchronized pulsed substrate bias can effectively improve coating performance. Pulsed substrate bias with user-defined pulse width and delay time can be used in the new control mode proposed in this master's thesis work, Bias mode. As a result, this master's thesis work enables users to flexibly control the HiPIMS processes.
128

Object-Oriented Development for Reconfigurable Architectures

Fröhlich, Dominik 30 November 2009
Reconfigurable hardware architectures have been available now for several years. Yet the application development for such architectures is still a challenging and error-prone task, since the methods, languages, and tools being used for development are inappropriate to handle the complexity of the problem. This thesis introduces a novel approach that tackles the complexity challenge by raising the level of abstraction to system-level and increasing the degree of automation. The approach is centered around the paradigms of object-orientation, platforms, and modeling. An application and all platforms being used for its design, implementation, and deployment are modeled with objects using UML and an action language. The application model is then transformed into an implementation, whereby the transformation is steered by the platform models. In this thesis solutions for the relevant problems behind this approach are discussed. It is shown how UML can be used for complete and precise modeling of applications and platforms. Application development is done at the system-level using a set of well-defined, orthogonal platform models. The core features of object-orientation (data abstraction, encapsulation, inheritance, and polymorphism) are fully supported. Novel algorithms are presented that allow for an automatic mapping of such application models to the target architecture. In this context, the problems of platform mapping, estimation of implementation characteristics, and synthesis of UML models are discussed. The thesis explores the utilization of platform models for generation of highly optimized implementations in an automatic yet adaptable way. The approach is evaluated by a number of relevant applications. The execution of the generated implementations is supported by a run-time service. This service manages the hardware configurations and objects comprising the application. Moreover, it serves as broker for hardware objects.
The efficient management of configurations and objects at run-time is discussed and optimized life cycles for these entities are proposed. Mechanisms are presented that make the approach portable among different physical hardware architectures. Further, this thesis presents UML profiles and example platforms that support system-level design. These extensions are embodied in a novel type of model compiler. The compiler is accompanied by an implementation of the run-time service. Both have been used to evaluate and improve the presented concepts and algorithms.
129

Design Flow für IP basierte, dynamisch rekonfigurierbare, eingebettete Systeme

Meisel, Andre 22 June 2010
The eighth volume of the scientific series EINGEBETTETE, SELBSTORGANISIERENDE SYSTEME is devoted to the synthesis of partially dynamically reconfigurable embedded systems. The ability to reconfigure hardware blocks on programmable devices at run-time allows greater flexibility to be integrated into embedded systems than a fixed hardware implementation permits. At the same time, these systems are characterized by higher performance than software. The flexibility can be exploited to use smaller circuits with the same range of functions. Integrating reconfiguration requires additional design steps in the design flow. In his work, Mr. Meisel presents a design methodology for this purpose and addresses in particular partitioning, placement, and control in dynamically reconfigurable embedded systems. To obtain a partitioning of the system that can be realized comparatively efficiently, the overlaying technique from the field of memory management was adapted for dynamic reconfiguration. For the placement method, reconfigurations were modeled as a Markov chain in order to minimize the average reconfiguration time. The reconfiguration control presented focuses on a resource-saving hardware implementation. A design example clearly illustrates the advantages and results of the approach. The reader can thus appreciate the power of the developed approach and is motivated to apply the developed methodology to further use cases. / Volume 8 of the scientific series EINGEBETTETE, SELBSTORGANISIERENDE SYSTEME (Embedded Self-Organized Systems) addresses the synthesis of partially dynamically reconfigurable embedded systems. With the ability to configure hardware blocks during run-time, more flexibility can be integrated in embedded systems.
At the same time, these systems have better performance than functions implemented in software. Through this flexibility it is possible to use smaller circuits without limiting the functionality. For the integration of reconfiguration into embedded systems, additional design steps are required. Mr. Meisel presents a design methodology for the design flow and primarily addresses the problems of partitioning, placement, and reconfiguration control in dynamically reconfigurable embedded systems. The implemented partitioning of the system is based on the overlaying concept adapted from memory management. For the placement method the reconfigurations are modeled as a Markov chain, in order to minimize the average reconfiguration time. The presented reconfiguration control unit focuses on a resource-saving hardware implementation. The benefits and results of the approach are clearly illustrated with a design example. The reader can appreciate the power of the developed approach and is motivated to transfer the developed methodology to more use cases.
130

Hardware Acceleration of a Monte Carlo Simulation for Photodynamic Therapy Treatment Planning

Lo, William Chun Yip 15 February 2010
Monte Carlo (MC) simulations are widely used in the field of medical biophysics, particularly for modelling light propagation in biological tissue. The iterative nature of MC simulations and their high computation time currently limit their use to solving the forward solution for a given set of source characteristics and tissue optical properties. However, applications such as photodynamic therapy treatment planning or image reconstruction in diffuse optical tomography require solving the inverse problem given a desired light dose distribution or absorber distribution, respectively. A faster means for performing MC simulations would enable the use of MC-based models for such tasks. In this thesis, a gold standard MC code called MCML was accelerated using two distinct hardware-based approaches, namely designing custom hardware on field-programmable gate arrays (FPGAs) and programming commodity graphics processing units (GPUs). Currently, the GPU-based approach is promising, offering approximately 1000-fold speedup with 4 GPUs compared to an Intel Xeon CPU.
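To make the computational pattern concrete, here is a minimal sketch of two steps that photon-transport Monte Carlo codes of this kind typically repeat: sampling a free path length from an exponential distribution, and depositing a fraction of a photon packet's weight at each interaction. It is a simplification under stated assumptions (homogeneous medium, no scattering directions, boundaries or roulette), not the MCML code or the thesis's hardware design, and all names are illustrative.

```python
import math
import random


def sample_step(mu_t, rng):
    """Free path length (cm) for a total interaction coefficient mu_t (1/cm)."""
    xi = 1.0 - rng.random()  # uniform in (0, 1], so log(0) cannot occur
    return -math.log(xi) / mu_t


def absorb(weight, mu_a, mu_s):
    """Split a packet weight into (deposited, remaining) at one interaction."""
    deposited = weight * mu_a / (mu_a + mu_s)
    return deposited, weight - deposited


def mean_free_path(mu_a, mu_s, n_photons=20000, seed=0):
    """Average sampled step length; should approach 1 / (mu_a + mu_s)."""
    rng = random.Random(seed)
    mu_t = mu_a + mu_s
    total = sum(sample_step(mu_t, rng) for _ in range(n_photons))
    return total / n_photons
```

Each photon packet is independent of the others, and that independence is the property that parallel hardware such as FPGAs and GPUs can exploit.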
