751 |
Architectures numériques adaptatives pour les systèmes de transmission sans fils fiables / Adaptive Digital Architecture for Reliable Wireless Transmission Systems. Chehaitly, Mouhamad. 29 June 2017.
Les travaux de thèse présentés dans ce manuscrit portent sur le développement d'une nouvelle architecture de transmission spécifiquement dédiée aux réseaux de capteurs sans fil et adaptée aux caractéristiques particulières de ceux-ci. L'approche, basée sur les techniques de radio impulsionnelle pour la transmission à large bande, est développée selon deux aspects de recherche principaux: fonctionnel et matériel. L'aspect fonctionnel vise à définir les caractéristiques du signal transmis ainsi que les algorithmes de traitement (modulation et démodulation) associés. Plus largement, il s'agit de définir l'architecture fonctionnelle de la chaîne de transmission, selon deux modes différents d'exploitation: mono-utilisateur et multi-utilisateurs. L'approche proposée pour transmettre des signaux impulsionnels est basée sur l'emploi de la transformée discrète en paquets d'ondelettes (DWPT) au niveau du récepteur et de la transformée inverse (IDWPT) au niveau de l'émetteur. La nature orthogonale des ondelettes permet de réaliser, sans nécessiter une couche MAC complexe, des communications multi-utilisateurs, simultanées ou non, sur un canal large bande, grâce à la forte discrimination entre les impulsions transmises. Le deuxième aspect porte sur le développement des architectures matérielles permettant l'implantation des algorithmes de traitement développés dans la partie fonctionnelle. La recherche de performances élevées (ratio élevé entre vitesse de traitement et coût matériel) et de flexibilité (configurabilité, extensibilité) est particulièrement importante pour les fonctionnalités liées aux transformées discrètes en paquets d'ondelettes, qui constituent le cœur critique de la chaîne de transmission. Des techniques de parallélisation massive et générique sont développées et mises en œuvre, permettant d'atteindre les niveaux de performances et de flexibilité requis. La validation a été réalisée à l'aide, respectivement, de modélisations et simulations sous Simulink/Matlab (de MathWorks) pour les aspects fonctionnels, et de modélisations VHDL (au niveau RTL [Register Transfer Level]) et d'implantations sur FPGA pour les aspects matériels.
The thesis work presented in this manuscript focuses on the development of a new transmission architecture specifically dedicated to wireless sensor networks and adapted to the particular characteristics of the latter. The approach, based on impulse radio techniques for wideband transmission, is developed according to two main research aspects: functional and hardware. The functional aspect aims at defining the characteristics of the transmitted signal as well as the associated processing algorithms (modulation and demodulation). More broadly, it involves defining the functional architecture of the transmission chain, according to two different operating modes: single-user and multi-user. The proposed approach for transmitting pulse signals is based on the use of the discrete wavelet packet transform (DWPT) at the receiver and the inverse transform (IDWPT) at the transmitter. The orthogonal nature of the wavelets makes it possible, without requiring a complex MAC layer, to carry out multi-user communications, whether simultaneous or not, over a wideband channel, thanks to the strong discrimination between the transmitted pulses. The second aspect relates to the development of hardware architectures allowing the implementation of the processing algorithms developed in the functional part. The search for high performance (a high ratio between processing speed and hardware cost) and flexibility (configurability, extensibility) is particularly important for the functions related to the discrete wavelet packet transforms, which constitute the critical core of the transmission chain. Massive and generic parallelization techniques are developed and implemented to achieve the required levels of performance and flexibility. Validation was carried out using Simulink/Matlab (MathWorks) modeling and simulation for the functional aspects, and VHDL modeling (at the Register Transfer Level, RTL) and FPGA implementation for the hardware aspects.
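To make the orthogonality argument concrete, the following minimal Python sketch (an illustration, not the thesis implementation) assigns each user one pulse from a depth-2 Haar wavelet-packet basis; because the basis is orthonormal, the receiver separates the superimposed users with simple projections, without any MAC-level coordination. The basis size and symbol values are assumptions made for the example.

```python
import numpy as np

# Depth-2 Haar wavelet-packet basis over 4 samples: an orthonormal set of pulses.
# Row k of W is the pulse shape assigned to user k.
W = 0.5 * np.array([[1,  1,  1,  1],
                    [1,  1, -1, -1],
                    [1, -1,  1, -1],
                    [1, -1, -1,  1]], dtype=float)

symbols = np.array([+1.0, -1.0, +1.0, -1.0])  # one BPSK symbol per user (assumed)

tx = symbols @ W      # IDWPT-like synthesis: superposition of the users' pulses
rx = W @ tx           # DWPT-like analysis: project the received signal onto each pulse

print(np.allclose(rx, symbols))  # True: orthogonality separates the users
```

In the actual chain the pulses are longer, the transforms are computed by the parallel DWPT/IDWPT hardware, and channel noise is present, but the separation principle is the same.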
|
752 |
Contribution à la parallélisation automatique : un modèle de processeur à beaucoup de coeurs parallélisant. / Contribution to the automatic parallelization : the model of the manycore parallelizing processor. Porada, Katarzyna. 14 November 2017.
Depuis les premiers ordinateurs, on est en quête de machines plus rapides, plus puissantes, plus performantes. Après avoir épuisé le filon de l'augmentation de la fréquence, les constructeurs se sont tournés vers les multi-cœurs. Le modèle de calcul actuel repose sur les threads de l'OS qu'on exploite à travers différents langages à constructions parallèles. Cependant, la programmation multithread reste un art délicat car le calcul parallèle découpé en threads souffre d'un grand défaut : il est non déterministe. Pourtant, on peut faire du calcul parallèle déterministe, à condition de remplacer le modèle des threads par un modèle s'appuyant sur l'ordre partiel des dépendances. Dans cette thèse, nous proposons un modèle alternatif d'architecture qui exploite le parallélisme d'instructions (ILP) présent dans les programmes. Nous proposons de nombreuses techniques pour s'affranchir de la plupart des dépendances architecturales et obtenir ainsi un ILP qui croît avec la taille de l'exécution. L'ILP qu'on atteint de cette façon est suffisant pour permettre d'alimenter plusieurs milliers de cœurs. Les dépendances architecturales sérialisantes ayant été supprimées, l'ILP peut être bien mieux exploité que dans les architectures actuelles. Un code VHDL au niveau RTL de l'architecture a été développé pour en mesurer les avantages. Les résultats de synthèse d'un processeur allant de 2 à 64 cœurs montrent que la vitesse du matériel que nous proposons reste constante et que sa surface varie linéairement avec le nombre de cœurs. Cela prouve que le modèle d'interconnexion proposé est extensible.
The pursuit of faster and more powerful machines began with the first computers. After exhausting frequency increases, manufacturers turned to another solution and started to introduce multiple cores on a chip. Today's computational model is based on OS threads, exploited through different languages offering parallel constructs. However, parallel programming remains an art because thread management by the operating system is not deterministic. Nonetheless, it is possible to compute in a deterministic parallel way if the thread model is replaced by a model built on the partial order of dependencies. In this thesis, we present an alternative architectural model exploiting the Instruction Level Parallelism (ILP) naturally present in applications. We propose many techniques to remove most of the architectural dependencies, which leads to an ILP that increases with the execution length. The ILP reached this way is sufficient to feed thousands of cores. Eliminating the architectural dependencies that serialize the run allows the ILP to be exploited much better than in current microarchitectures. VHDL code at the RTL level has been implemented to measure the benefits of our design. The synthesis results of a processor ranging from 2 to 64 cores are reported. They show that the speed of the proposed hardware remains constant and its area grows linearly with the number of cores: the proposed interconnect solution is scalable.
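The difference between thread scheduling and dependency-ordered execution can be illustrated with a small Python sketch (illustrative only; the instruction trace and its dependencies are invented): each instruction fires as soon as its operands are available, so the schedule follows the partial order of data dependencies and the ILP is the ratio of instruction count to critical-path length.

```python
# Toy dataflow scheduling: firing is driven by data availability, not by OS threads,
# so the resulting schedule is deterministic.
deps = {                                   # instruction -> producers it depends on (assumed)
    "i1": [], "i2": [], "i3": ["i1", "i2"],
    "i4": [], "i5": ["i3", "i4"], "i6": ["i1"],
}

level = {}
def ready_step(instr):
    """Earliest step at which `instr` can execute: one past its longest producer chain."""
    if instr not in level:
        level[instr] = 0 if not deps[instr] else 1 + max(ready_step(p) for p in deps[instr])
    return level[instr]

steps = {i: ready_step(i) for i in deps}
critical_path = 1 + max(steps.values())
print(steps)                                              # i1, i2, i4 fire first; i5 last
print(f"ILP = {len(deps)} / {critical_path} = {len(deps) / critical_path:.2f}")
```

On a real trace, removing serializing architectural dependencies lowers the critical path and therefore raises this ratio, which is the quantity the proposed architecture tries to keep growing with the execution length.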
|
753 |
Projet CLEAR : Horloge composite numérique polyvalente : Asservissement en fréquence multisources / CLEAR project : CLock Ensemble Algorithm Research project. Benigni, Alexis. 01 June 2018.
L'objectif de la thèse est de concevoir et développer un système numérique de combinaison de signaux d'horloges hétérogènes (PPS, horloges atomiques, quartz, ...). Le signal résultant possède une meilleure stabilité que chacune des entrées, quelle que soit la durée d'intégration, et le système peut détecter des défaillances sur l'une des entrées.
The goal of the PhD is to design and build a digital system capable of combining clock signals from various sources (PPS, atomic clocks, quartz oscillators, ...). The output signal will have better stability at every integration time than any single input signal, and the system can detect failures in the input sources.
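As a rough illustration of why an ensemble can beat every input, the sketch below (assumed noise levels, white noise only, not the thesis algorithm) combines three simulated clocks with inverse-variance weights; the combined deviation is smaller than that of the best individual clock.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmas = np.array([1.0, 0.5, 2.0])          # assumed noise levels of three clocks
clocks = rng.normal(0.0, sigmas[:, None], size=(3, 100_000))   # phase-deviation samples

weights = (1.0 / sigmas**2) / np.sum(1.0 / sigmas**2)          # inverse-variance weighting
ensemble = weights @ clocks

print([round(float(c.std()), 3) for c in clocks])   # individual stabilities
print(round(float(ensemble.std()), 3))              # ~0.44, better than the best input (0.5)
```

A real composite clock recomputes such weights as a function of the integration time (for instance from Allan-variance estimates) and monitors each input's residual against the ensemble, which is what allows failures in an input source to be detected.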
|
754 |
Hierarchical reconfiguration management for heterogeneous cognitive radio equipments / Gestion hiérarchique de la reconfiguration pour les équipements de radio intelligente fortement hétérogènes. Wu, Xiguang. 21 March 2016.
Pour supporter l'évolution constante des standards de communication numérique, du GSM vers la 5G, les équipements de communication doivent continuellement s'adapter. Face à l'utilisation croissante de l'internet, on assiste à une explosion du trafic de données, ce qui augmente la consommation d'énergie des appareils de communication sans fil et conduit donc à un impact significatif sur les émissions mondiales de CO2. De plus en plus de recherches se sont concentrées sur l'efficacité énergétique de la communication sans fil. La radio intelligente, ou Cognitive Radio (CR), est considérée comme une technologie pertinente pour les communications radio vertes en raison de sa capacité à adapter son comportement à son environnement. Sur la base de métriques fournissant suffisamment d'informations sur l'état de fonctionnement du système, une décision optimale peut être prise en vue d'une action de reconfiguration, dans le but de réduire au minimum la dissipation d'énergie tout en ne compromettant pas les performances. Par conséquent, tout équipement intelligent doit disposer d'une architecture de gestion de la reconfiguration. Nous avons retenu l'architecture HDCRAM (Hierarchical and Distributed Cognitive Radio Architecture Management), développée dans notre équipe, et nous l'avons déployée sur des plates-formes hétérogènes. L'un des objectifs est d'améliorer l'efficacité énergétique par la mise en œuvre de l'architecture HDCRAM. Nous l'avons appliquée à un système OFDM simplifié pour illustrer comment HDCRAM permet de gérer efficacement le système et son adaptation à un environnement évolutif.
As digital communication systems evolve from GSM toward 5G, the number of supported standards keeps growing. Communication equipment is required to support different standards in a single device at the same time. More and more wireless Internet services are being provided, resulting in explosive growth in data traffic, which increases the energy consumption of communication devices and thus has a significant impact on global CO2 emissions. A growing body of research has focused on the energy efficiency of wireless communication. Cognitive Radio (CR) has been considered an enabling technology for green radio communications due to its ability to adapt its behavior to the changing environment. In order to efficiently manage the sensing information and the reconfiguration of cognitive equipment, it is essential, first of all, to gather the necessary metrics so as to provide enough information about the operating conditions and thus support decision making. Then, on the basis of the metrics obtained, an optimal decision can be made, followed by a reconfiguration action whose aim is to minimize power dissipation while not compromising performance. Therefore, a management architecture must be added to the cognitive equipment, acting as the glue that realizes the CR capabilities. We introduce such a management architecture, namely Hierarchical and Distributed Cognitive Radio Architecture Management (HDCRAM), which has been proposed for CR management by our team. This work focuses on the implementation of HDCRAM on heterogeneous platforms. One of the objectives is to improve energy efficiency through the management performed by HDCRAM. An example of a simplified OFDM system is used to explain how HDCRAM efficiently manages the system and adapts it to the changing environment.
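The cognitive cycle that such a management architecture coordinates can be summarized with a small sketch (illustrative only; the metric names, thresholds and configurations are assumptions and do not reproduce the HDCRAM interfaces): metrics are gathered, a decision is made, and a reconfiguration is pushed down only when the operating point should change.

```python
# Minimal sense -> decide -> reconfigure loop (not the HDCRAM API).
CONFIGS = {"low_power": 5, "nominal": 10, "boost": 15}   # assumed transmit powers in dBm

def decide(metrics):
    """Pick the lowest transmit power that keeps the link quality acceptable."""
    if metrics["per"] < 1e-3 and metrics["snr_dB"] > 25:
        return "low_power"          # plenty of margin: cut power to save energy
    if metrics["per"] < 1e-2:
        return "nominal"
    return "boost"                  # degraded link: spend power to keep performance

def manager_step(sense, reconfigure, current):
    metrics = sense()                       # gather metrics from the sensing units
    target = decide(metrics)                # decision based on the operating conditions
    if target != current:
        reconfigure(CONFIGS[target])        # trigger the reconfiguration action
    return target

state = "nominal"
state = manager_step(lambda: {"snr_dB": 28.0, "per": 2e-4},
                     lambda dbm: print("reconfigure: tx power ->", dbm, "dBm"),
                     state)
```

In HDCRAM the management is hierarchical and distributed rather than a single loop like this, which is precisely what the thesis deploys on heterogeneous platforms.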
|
755 |
Metodologias para desenvolvimento de mapas auto-organizáveis de Kohonen executados em FPGA. / Methodology for the development of Kohonen's self-organizing maps implemented in FPGA. Sousa, Miguel Angelo de Abreu de. 21 May 2018.
Dentro do cenário de projeto de circuitos elétricos orientados para o processamento de redes neurais artificiais, este trabalho se concentra no estudo da implementação de Mapas Auto-organizáveis (SOM, do inglês, Self-Organizing Maps) em chips FPGA. A pesquisa aqui realizada busca, fundamentalmente, responder à seguinte pergunta: como devem ser projetadas as arquiteturas computacionais de cada etapa de processamento do SOM para serem adequadamente executadas em FPGA? De forma mais detalhada, o trabalho investiga as possibilidades que diferentes circuitos de computação do SOM oferecem em relação à velocidade de processamento, ao consumo de recursos do FPGA e à consistência com o formalismo teórico que fundamenta esse modelo de rede neural. Tal objetivo de pesquisa é motivado por possibilitar o desenvolvimento de sistemas de processamento neural que exibam as características positivas típicas de implementações diretas em hardware, como o processamento embarcado e a aceleração computacional.
CONTRIBUIÇÕES PRINCIPAIS
No decorrer da investigação de tais questões, o presente trabalho gerou contribuições com diferentes graus de impacto. A contribuição mais essencial do ponto de vista de estruturação do restante da pesquisa é a fundamentação teórica das propriedades de computação do SOM em hardware. Tal fundamentação é importante pois permitiu a construção dos alicerces necessários para o estudo das diferentes arquiteturas de circuitos exploradas neste trabalho, de forma que estas permanecessem consistentes com as premissas teóricas que certificam o modelo de computação neural estudado. Outra contribuição avaliada como de grande impacto, e que se consolida como um objeto gerado pela pesquisa, é a proposta de um circuito processador para SOM em FPGA que possui o estado-da-arte em velocidade de computação, medido em CUPS (Connections Updated Per Second). Tal processador permite atingir 52,67 GCUPS, durante a fase de treinamento do SOM, um ganho de aproximadamente 100% em relação aos trabalhos publicados na literatura. A aceleração possibilitada pela exploração de processamentos paralelos em FPGA, desenvolvida neste trabalho, é de três a quatro ordens de grandeza em relação a execuções em software do SOM com a mesma configuração. A última contribuição considerada como de grande impacto é a caracterização da execução do SOM em FPGA. Tal avaliação se faz necessária porque os processos de computação dos modelos neurais em hardware, embora semelhantes, não são necessariamente idênticos aos mesmos processos executados em software. Desta forma, a contribuição deste ponto de pesquisa pode ser entendida como a análise do impacto das mudanças implementadas na computação do SOM em FPGA em relação à execução tradicional do algoritmo, feita pela avaliação dos resultados produzidos pela rede neural por medidas de erros topográficos e de quantização. Este trabalho também gerou contribuições consideradas como de médio impacto, que podem ser divididas em dois grupos: aplicações práticas e aportes teóricos. A primeira contribuição de origem prática é a investigação de trabalhos publicados na literatura envolvendo SOM cujas aplicações podem ser viabilizadas por implementações em hardware.
Os trabalhos localizados nesse levantamento foram organizados em diferentes categorias, conforme a área de pesquisa - como, por exemplo, Indústria, Robótica e Medicina - e, em geral, eles utilizam o SOM em aplicações que possuem requisitos de velocidade computacional ou embarque do processamento, portanto, a continuidade de seus desenvolvimentos é beneficiada pela execução direta em hardware. As outras duas contribuições de médio impacto de origem prática são as aplicações que serviram como plataforma de teste dos circuitos desenvolvidos para a implementação do SOM. A primeira aplicação pertence à área de telecomunicações e objetiva a identificação de símbolos transmitidos por 16-QAM ou 64-QAM. Estas duas técnicas de modulação são empregadas em diversas aplicações com requisitos de mobilidade - como telefonia celular, TV digital em dispositivos portáteis e Wi-Fi - e o SOM é utilizado para identificar sinais QAM recepcionados com ruídos e distorções. Esta aplicação gerou a publicação de um artigo na revista da Springer, Neural Computing and Applications: Sousa; Pires e Del-Moral-Hernandez (2017). A segunda aplicação pertence à área de processamento de imagem e visa reconhecer ações humanas capturadas por câmeras de vídeo. O processamento autônomo de imagens executado por chips FPGA junto às câmeras de vídeo pode ser empregado em diferentes utilizações, como, por exemplo, sistemas de vigilância automática ou assistência remota em locais públicos. Esta segunda aplicação também é caracterizada por demandar arquiteturas computacionais de alto desempenho. Todas as contribuições teóricas deste trabalho avaliadas como de médio impacto estão relacionadas ao estudo das características de arquiteturas de hardware para computação do modelo SOM. A primeira destas é a proposta de uma função de vizinhança do SOM baseada em FPGA. O objetivo de tal proposta é desenvolver uma expressão computacional para ser executada no chip que constitua uma alternativa eficiente tanto à função gaussiana, tradicionalmente empregada no processo de treinamento do SOM, quanto à função retangular, utilizada de forma rudimentar nas primeiras pesquisas publicadas sobre a implementação do SOM em FPGA. A segunda destas contribuições é a descrição detalhada dos componentes básicos e dos blocos computacionais utilizados nas diferentes etapas de execução do SOM em FPGA. A apresentação dos detalhes da arquitetura de processamento, incluindo seus circuitos internos e a função computada por cada um de seus blocos, permite que trabalhos futuros utilizem os desenvolvimentos realizados nesta pesquisa. Esta descrição detalhada e funcional foi aceita para publicação no IEEE World Congress on Computational Intelligence (WCCI 2018): Sousa et al. (2018). A terceira contribuição teórica de médio impacto é a elaboração de um modelo distribuído de execução do SOM em FPGA sem o uso de uma unidade central de controle. Tal modelo permite a execução das fases de aprendizado e operação da rede neural em hardware de forma distribuída, a qual alcança um comportamento global de auto-organização dos neurônios apenas pela troca local de dados entre elementos de processamento vizinhos. A descrição do modelo distribuído, em conjunto com sua caracterização, está publicada em um artigo no International Joint Conference on Neural Networks do IEEE (IJCNN 2017): Sousa e Del-Moral-Hernandez (2017a). A última contribuição deste grupo de aporte teórico é a comparação entre diferentes modelos de execução do SOM em FPGA. 
A comparação tem a função de avaliar e contrastar três diferentes possibilidades de implementação do SOM: o modelo distribuído, o modelo centralizado e o modelo híbrido. Os testes realizados e os resultados obtidos estão publicados em um trabalho no International Symposium on Circuits and Systems do IEEE (ISCAS 2017): Sousa e Del-Moral-Hernandez (2017b). Finalmente, apresentam-se a seguir as contribuições avaliadas como de menor impacto, em comparação com as contribuições já descritas, ou ainda incipientes (e que possibilitam continuidades da pesquisa em trabalhos futuros), sendo relacionadas a seguir como contribuições complementares:
* Pesquisa de literatura científica sobre o estado-da-arte da área da Engenharia de Sistemas Neurais Artificiais.
* Identificação de grupos internacionais de pesquisa de execução do SOM em hardware, os quais foram reconhecidos por publicarem regularmente seus estudos sobre diferentes tipos de implementações e categorias de circuitos computacionais.
* Enumeração das justificativas e motivações mais frequentes na literatura para o processamento de sistemas neurais de computação em hardware.
* Comparação e contraste das características de microprocessadores, GPUs, FPGAs e ASICs (tais como custo médio do componente, paralelismo computacional oferecido e consumo típico de energia) para contextualização do tipo de aplicações que a escolha pela pesquisa com o dispositivo FPGA possibilita.
* Levantamento das propriedades de computação do SOM em hardware mais frequentemente utilizadas nas pesquisas publicadas na literatura, tais como quantidade de bits usados nos cálculos, tipo de representação de dados e arquitetura típica dos circuitos de execução das diferentes etapas de processamento do SOM.
* Comparação do consumo de área do FPGA e da velocidade de processamento entre a execução da função de vizinhança tradicional gaussiana e a função de vizinhança proposta neste trabalho (com resultados obtidos de aproximadamente 4 vezes menos área do chip e 5 vezes mais velocidade de operação).
* Caracterização do aumento dos recursos consumidos no chip e da velocidade de operação do sistema, em relação à implementação do SOM com diferentes complexidades (quantidade de estágios decrescentes do fator de aprendizado e da abertura da função de vizinhança) e comparação destas propriedades da arquitetura proposta em relação aos valores publicados na literatura.
* Proposta de uma nova métrica para caracterização do erro topográfico na configuração final do SOM após o treinamento.
In the context of designing electrical circuits for processing artificial neural networks, this work focuses on the study of Self-Organizing Maps (SOM) executed on FPGA chips. The work attempts to answer the following question: how should the computational architecture be designed to efficiently implement each of the SOM processing steps in FPGA? More specifically, this thesis investigates the distinct possibilities that different SOM computing architectures offer regarding processing speed, consumption of FPGA resources and consistency with the theory that underlies this neural network model. The motivation of the present work is enabling the development of neural processing systems that exhibit the positive features typically associated with hardware implementations, such as embedded processing and computational acceleration.
MAIN CONTRIBUTIONS
In the course of the investigation, the present work generated contributions with different degrees of impact. The most essential contribution from the point of view of structuring the research process is the theoretical basis of the hardware-oriented SOM properties. This is important because it allowed the construction of the foundations for the study of different circuit architectures, so that the developments remained consistent with the theory that underpins the neural computing model. Another major contribution is the proposal of a processor circuit for implementing the SOM in FPGA, which represents the state of the art in computational speed measured in CUPS (Connections Updated Per Second). This processor achieves 52.67 GCUPS during the training phase of the SOM, a gain of approximately 100% in relation to other published works. The acceleration enabled by the FPGA parallel processing developed in this work reaches three to four orders of magnitude compared with software implementations of the SOM with the same configuration. The last main contribution of the work is the characterization of the FPGA-based SOM. This evaluation is important because, although similar, the computing processes of neural models in hardware are not necessarily identical to the same processes implemented in software. Hence, this contribution can be described as the analysis of the impact of the implemented changes in the FPGA-based SOM compared to the traditional algorithm. The comparison was performed by evaluating the topographic and quantization errors of the outputs produced by both implementations. This work also generated medium-impact contributions, which can be divided into two groups: empirical and theoretical. The first empirical contribution is a survey of SOM applications which can be made possible by hardware implementations. The papers presented in this survey are classified according to their research area - such as Industry, Robotics and Medicine - and, in general, they use the SOM in applications that require computational speed or embedded processing. Therefore, the continuity of their developments benefits from direct hardware implementations of the neural network. The other two empirical contributions are the applications employed for testing the circuits developed. The first application is related to the reception of telecommunications signals and aims to identify 16-QAM and 64-QAM symbols. These two modulation techniques are used in a variety of applications with mobility requirements, such as cell phones, digital TV on portable devices and Wi-Fi. The SOM is used to identify distorted QAM signals received with noise. This research work was published in the Springer journal Neural Computing and Applications: Sousa; Pires e Del-Moral-Hernandez (2017). The second is an image processing application and it aims to recognize human actions captured by video cameras. Autonomous image processing performed by FPGA chips inside video cameras can be used in different scenarios, such as automatic surveillance systems or remote assistance in public areas. This second application is also characterized by demanding high performance from the computing architectures. All the theoretical contributions with medium impact are related to the study of the properties of hardware circuits for implementing the SOM model. The first of these is the proposal of an FPGA-based neighborhood function. The aim of the proposal is to develop a computational function to be implemented on chip that provides an efficient alternative to both the Gaussian function (traditionally employed in the SOM training process) and the rectangular function (used in rudimentary form in the first published works on hardware-based SOMs). The second of those contributions is the detailed description of the basic components and blocks used to compute the different steps of the SOM algorithm in hardware. The description of the processing architecture includes its internal circuits and computed functions, allowing future works to use the proposed architecture. This detailed and functional description was accepted for publication in the IEEE World Congress on Computational Intelligence (WCCI 2018): Sousa et al. (2018). The development of an FPGA distributed implementation model for the SOM constitutes the third of those contributions. Such a model allows the execution of the neural network learning and operational phases without the use of a central control unit. The proposal achieves a global self-organizing behavior only by using local data exchanges among neighboring processing elements. The description and characterization of the distributed model are published in a paper in the IEEE International Joint Conference on Neural Networks (IJCNN 2017): Sousa e Del-Moral-Hernandez (2017a). The last contribution of this group is the comparison between different FPGA architectures for implementing the SOM. This comparison evaluates and contrasts three different SOM architectures: the distributed model, the centralized model and the hybrid model. The tests performed and the results obtained are published in an article in the IEEE International Symposium on Circuits and Systems (ISCAS 2017): Sousa e Del-Moral-Hernandez (2017b). Finally, the contributions assessed as having a minor impact, compared to the contributions already described, or still incipient (and which allow the continuity of the research in possible future works), are presented as complementary contributions:
* Research in the scientific literature on the state-of-the-art works in the field of Artificial Neural Systems Engineering.
* Identification of the international research groups on hardware-based SOM, which were recognized for regularly publishing their studies on different types of implementations and categories of computational circuits.
* Enumeration of the justifications and motivations often mentioned in works on hardware developments of neural computing systems.
* Comparison and contrast of the characteristics of microprocessors, GPUs, FPGAs and ASICs (such as average cost, parallelism and typical power consumption) to contextualize the type of applications enabled by the choice of FPGA as the target device.
* Survey of the literature for the hardware properties most commonly used for computing the SOM, such as the number of bits used in the calculations, the type of data representation and the typical architectures of the FPGA circuits.
* Comparison of FPGA resource consumption and processing speed between the execution of the traditional Gaussian neighborhood function and the proposed alternative neighborhood function (with obtained results of approximately 4 times less chip area and 5 times more computational speed).
* Characterization of the increase in chip resource consumption and the decrease in system speed, according to implementations of the SOM with different complexities (such as the number of decreasing stages of the learning factor and the width of the neighborhood function). Comparison of these properties between the proposed architecture and the works published in the literature.
* Proposal of a new metric for the characterization of the topographic error in the final configuration of the SOM after the training phase.
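The kind of trade-off studied in this thesis can be illustrated with a short software sketch of one SOM training step (assumed map size, data and update rule; this is not the circuit described above). A neighborhood function restricted to powers of two replaces the Gaussian so that, in hardware, the weight update reduces to shifts and additions; the quantization error is one of the measures used to check that the map remains useful.

```python
import numpy as np

rng = np.random.default_rng(1)
grid, dim = 8, 3                                  # assumed 8x8 map of 3-D prototypes
weights = rng.random((grid, grid, dim))
coords = np.stack(np.meshgrid(np.arange(grid), np.arange(grid), indexing="ij"), axis=-1)

def train_step(x, lr=0.25, radius=2):
    d2 = np.sum((weights - x) ** 2, axis=-1)                  # competition
    bmu = np.unravel_index(np.argmin(d2), d2.shape)           # best-matching unit
    cheb = np.max(np.abs(coords - np.array(bmu)), axis=-1)    # distance on the grid
    h = np.where(cheb <= radius, 2.0 ** (-cheb), 0.0)         # power-of-two neighborhood
    weights[...] += lr * h[..., None] * (x - weights)         # cooperative update

data = rng.random((2000, dim))
for x in data:
    train_step(x)

flat = weights.reshape(-1, dim)
qe = np.mean([np.min(np.linalg.norm(flat - x, axis=1)) for x in data])
print(f"quantization error ~ {qe:.3f}")   # mean distance from each input to its BMU
```

If the learning rate is also constrained to a power of two, every multiplication in the update becomes a shift; this kind of simplification is what motivates an FPGA-oriented neighborhood function in place of the Gaussian.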
|
756 |
Frame-level redundancy scrubbing technique for SRAM-based FPGAs / Técnica de correção usando a redundância a nível de quadro para FPGAs baseados em SRAM. Seclen, Jorge Lucio Tonfat. January 2015.
Confiabilidade é um parâmetro de projeto importante para aplicações críticas tanto na Terra como também no espaço. Os FPGAs baseados em memória SRAM são atrativos para implementar aplicações críticas devido a seu alto desempenho e flexibilidade. No entanto, estes FPGAs são susceptíveis aos efeitos da radiação, tais como os erros transientes na memória de configuração. Além disso, outros efeitos como o envelhecimento (aging) ou o escalonamento da tensão de alimentação (voltage scaling) incrementam a sensibilidade à radiação dos FPGAs. Nossos resultados experimentais mostram que o envelhecimento e o escalonamento da tensão de alimentação podem aumentar ao menos duas vezes a susceptibilidade de FPGAs baseados em SRAM a erros transientes. Estes resultados são inovadores porque combinam três efeitos reais que acontecem em FPGAs baseados em SRAM. Os resultados podem guiar os projetistas a prever os efeitos dos erros transientes durante o tempo de operação do dispositivo em diferentes níveis de tensão. A correção da memória usando a técnica de scrubbing é um método efetivo para corrigir erros transientes em memórias SRAM, mas este método impõe custos adicionais em termos de área e consumo de energia. Neste trabalho, nós propomos uma nova técnica de scrubbing usando a redundância interna a nível de quadros, chamada FLR-scrubbing. Esta técnica possui mínimo consumo de energia sem comprometer a capacidade de correção. Como estudo de caso, a técnica foi implementada em um FPGA de tamanho médio Xilinx Virtex-5, ocupando 8% dos recursos disponíveis e consumindo seis vezes menos energia que um circuito corretor tradicional chamado blind scrubber. Além disso, a técnica proposta reduz o tempo de reparação porque evita o uso de uma memória externa como referência. Como outra contribuição deste trabalho, nós apresentamos os detalhes de uma plataforma de injeção de falhas múltiplas que permite emular os erros transientes na memória de configuração do FPGA usando reconfiguração parcial dinâmica. Resultados de campanhas de injeção são apresentados e comparados com experimentos de radiação acelerada. Finalmente, usando a plataforma de injeção de falhas proposta, nós conseguimos analisar a efetividade da técnica FLR-scrubbing. Nós também confirmamos estes resultados com experimentos de radiação acelerada.
Reliability is an important design constraint for critical applications at ground level and in aerospace. SRAM-based FPGAs are attractive for critical applications due to their high performance and flexibility. However, they are susceptible to radiation effects such as soft errors in the configuration memory. Furthermore, the effects of aging and voltage scaling increase the sensitivity of SRAM-based FPGAs to soft errors. Experimental results show that aging and voltage scaling can increase the Soft Error Rate (SER) susceptibility of SRAM-based FPGAs at least twofold. These findings are innovative because they combine three real effects that occur in SRAM-based FPGAs. The results can guide designers to predict soft error effects during the lifetime of devices operating at different power supply voltages. Memory scrubbing is an effective method to correct soft errors in SRAM memories, but it imposes an overhead in terms of silicon area and energy consumption. In this work, a novel scrubbing technique using internal frame redundancy, called Frame-Level Redundancy scrubbing (FLR-scrubbing), is proposed, with minimum energy consumption overhead and without compromising the correction capabilities. As a case study, the FLR-scrubbing controller was implemented on a mid-size Xilinx Virtex-5 FPGA device, occupying 8% of the available slices and consuming six times less energy per scrubbed frame than a classic blind scrubber. The technique also reduces the repair time by avoiding the use of an external golden memory for reference. As another contribution, this work presents the details of a Multiple Fault Injection Platform that emulates configuration memory upsets of an FPGA using dynamic partial reconfiguration. Results of fault injection campaigns are presented and compared with accelerated ground-level radiation experiments. Finally, the proposed fault injection platform made it possible to analyze the effectiveness of the FLR-scrubbing technique, and accelerated radiation tests confirmed these results.
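The idea of repairing a frame from redundancy that already lives inside the device, instead of reading back a golden copy from external memory, can be sketched as follows (bitwise majority voting over redundant frame copies is used here as a stand-in; the exact FLR-scrubbing correction rule is not reproduced, and the frame contents are assumed).

```python
def majority_repair(frames):
    """Rebuild a configuration frame by bitwise majority over redundant copies."""
    repaired = bytearray(len(frames[0]))
    for i in range(len(frames[0])):
        for bit in range(8):
            ones = sum((f[i] >> bit) & 1 for f in frames)
            if 2 * ones > len(frames):          # majority of the copies have this bit set
                repaired[i] |= 1 << bit
    return bytes(repaired)

golden = bytes([0xA5, 0x3C, 0xFF, 0x00])        # assumed frame content
upset = bytearray(golden)
upset[1] ^= 0x08                                # emulate a single-event upset in one copy

assert majority_repair([golden, bytes(upset), golden]) == golden
```

Because the reference information is held in the FPGA itself, the scrubber never has to fetch an external golden bitstream, which is what shortens the repair time reported above.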
|
757 |
ChipCflow - uma ferramenta para execução de algoritmos utilizando o modelo a fluxo de dados dinâmico em hardware reconfigurável / ChipCflow - a tool for executing algorithms using dynamic dataflow architecture in FPGA. Lopes, Joelmir José. 29 June 2012.
Devido à complexidade das aplicações e à demanda crescente por sistemas que usam milhões de transistores e hardware complexo, têm sido desenvolvidas ferramentas que convertem C em Linguagem de Descrição de Hardware, tais como VHDL e Verilog. Neste contexto, esta tese apresenta o projeto ChipCflow, o qual usa arquitetura a fluxo de dados para implementar lógica de alto desempenho em Field Programmable Gate Array (FPGA). Máquinas a fluxo de dados são computadores programáveis cujo hardware é otimizado para computação paralela de granularidade fina dirigida por dados. Em outras palavras, a execução de programas é determinada pela disponibilidade dos dados; assim, o paralelismo é intrínseco neste sistema. Por outro lado, com o avanço da tecnologia da microeletrônica, o FPGA tem sido utilizado principalmente devido a sua flexibilidade, facilidade para implementar sistemas complexos e paralelismo intrínseco. Um dos desafios é criar ferramentas para programadores que usam linguagem de alto nível (HLL), como a linguagem C, e produzir hardware diretamente. Essas ferramentas devem usar a máxima experiência dos programadores, o paralelismo das arquiteturas a fluxo de dados dinâmicas, a flexibilidade e o paralelismo do FPGA, para produzir um hardware eficiente, otimizado para alto desempenho e baixo consumo de energia. O projeto ChipCflow é uma ferramenta que converte os programas de aplicação escritos em linguagem C para a linguagem VHDL, baseado na arquitetura a fluxo de dados dinâmica. O principal objetivo desta tese é definir e implementar os operadores do ChipCflow, usando a arquitetura a fluxo de dados dinâmica em FPGA. Esses operadores usam tagged tokens para identificar dados com base em instâncias de operadores, e a implementação dos operadores e das instâncias usa um modelo de implementação assíncrono em FPGA para obter maior velocidade e menor consumo.
Due to the complexity of applications and the growing demand for systems using millions of transistors and increasingly complex hardware, tools that convert C into a Hardware Description Language (HDL), such as VHDL and Verilog, have been developed. In this context, this thesis presents the ChipCflow project, which uses a dataflow architecture to implement high-performance logic in Field Programmable Gate Arrays (FPGA). Dataflow machines are programmable computers whose hardware is optimized for fine-grain, data-driven parallel computation. In other words, the execution of programs is determined by data availability, so parallelism is intrinsic in these systems. On the other hand, with the advance of microelectronics technology, FPGAs have been used mainly because of their flexibility, ease of implementing complex systems and intrinsic parallelism. One of the challenges is to create tools for programmers who use a High Level Language (HLL), such as the C language, and produce hardware directly. These tools should make the most of the programmers' experience, the parallelism of dynamic dataflow architectures and the flexibility and parallelism of FPGAs, to produce efficient hardware optimized for high performance and low power consumption. The ChipCflow project is a tool that converts application programs written in C language into VHDL, based on the dynamic dataflow architecture. The main goal of this thesis is to define and implement the operators of ChipCflow using the dynamic dataflow architecture in FPGA. These operators use tagged tokens to identify data based on operator instances, and both the operators and their instances use an asynchronous implementation model in FPGA to achieve higher speed and lower power consumption.
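The firing rule behind such tagged-token operators can be sketched in a few lines (an illustration; the tag format and matching-store structure are assumptions, not the ChipCflow circuits): a token carries a tag naming the operator instance, and a two-input operator fires only when both inputs hold tokens with the same tag, so different instances can be in flight and complete out of order.

```python
class TaggedOperator:
    """Two-input dynamic dataflow operator with a tag-matching store (sketch)."""

    def __init__(self, fn):
        self.fn = fn
        self.waiting = {}                     # tag -> (port, value) of the token seen first

    def receive(self, port, tag, value):
        """Deliver a token; fire and return (tag, result) once both operands match."""
        if tag in self.waiting:
            other_port, other_value = self.waiting.pop(tag)
            left, right = (other_value, value) if other_port == 0 else (value, other_value)
            return tag, self.fn(left, right)
        self.waiting[tag] = (port, value)
        return None                           # still waiting for the partner token

add = TaggedOperator(lambda a, b: a + b)
print(add.receive(0, tag=1, value=10))        # None: only the left operand of instance 1
print(add.receive(0, tag=2, value=30))        # None: left operand of another instance
print(add.receive(1, tag=2, value=4))         # (2, 34): instance 2 fires first
print(add.receive(1, tag=1, value=7))         # (1, 17): instance 1 fires later
```

In the ChipCflow hardware the operators and their instances follow an asynchronous implementation model, so many such firings can proceed concurrently.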
|
758 |
ChipCflow - em hardware dinamicamente reconfigurável / ChipCflow - in dynamically reconfigurable hardware. Astolfi, Vitor Fiorotto. 04 December 2009.
Nos últimos anos, houve um grande avanço na computação reconfigurável, em particular em hardware que emprega Field-Programmable Gate Arrays. Porém, esse aumento de capacidade e desempenho aumentou a distância entre a capacidade de projeto e a disponibilidade de tecnologia para o desenvolvimento do projeto. As linguagens de programação imperativas de alto nível, como C, são mais apropriadas para o desenvolvimento de aplicativos complexos que as linguagens de descrição de hardware. Por isso, surgiram diversas ferramentas para o desenvolvimento de hardware a partir de código em C. A ferramenta ChipCflow, da qual faz parte este projeto, é uma delas. A execução dos programas por meio dessa ferramenta será completamente baseada em seu fluxo de dados, seguindo o modelo dinâmico encontrado nas arquiteturas de computadores a fluxo de dados, aproveitando ao máximo o paralelismo considerado natural desse modelo e as características do hardware parcialmente reconfigurável. Neste projeto em particular, o objetivo é a prova de conceito (proof of concept) para a criação de instâncias, em forma de operadores, de um algoritmo ChipCflow em hardware parcialmente reconfigurável, tendo como base a plataforma Virtex da Xilinx.
In recent years, reconfigurable computing has become increasingly more advanced, especially in hardware that uses Field-Programmable Gate Arrays. However, this increase in capacity and performance has widened the gap between design capacity and the technology available for design development. Imperative high-level programming languages such as C are more appropriate for the development of complex algorithms than hardware description languages (HDL). For this reason, many ANSI C-like programming tools for the development of hardware came into existence. The ChipCflow project, of which this project is part, is one of these tools. The execution of algorithms through this tool will be completely directed by data flow, according to the dynamic model found in dataflow architectures, taking full advantage of the naturally high levels of parallelism of this model and the characteristics of partially reconfigurable hardware. In this project, the objective is a proof of concept for the creation of instances, in the form of operators, of a ChipCflow algorithm on partially reconfigurable hardware, taking the Xilinx Virtex boards as reference.
|
759 |
Towards hardware synthesis of a flexible radio from a high-level language / Synthèse matérielle d'une radio flexible et reconfigurable depuis un langage de haut niveau dédié aux couches physiques radio. Tran, Mai-Thanh. 13 November 2018.
La radio logicielle est une technologie prometteuse pour répondre aux exigences de flexibilité des nouvelles générations de standards de communication. Elle peut être facilement reprogrammée au niveau logiciel pour implémenter différentes formes d'onde. En s'appuyant sur une technologie dite logicielle telle que les microprocesseurs, cette approche est particulièrement flexible et assez facile à mettre en œuvre. Cependant, ce type de technologie conduit généralement à une faible capacité de calcul et, par conséquent, à des débits faibles. Pour résoudre ce problème, la technologie FPGA s'avère être une bonne alternative pour la mise en œuvre de la radio logicielle. En effet, les FPGAs offrent une puissance de calcul élevée et peuvent être reconfigurés. Ainsi, inclure des FPGAs dans le concept de radio logicielle peut permettre de prendre en charge plus de formes d'onde avec des exigences plus strictes qu'une approche basée sur la technologie logicielle. Cependant, les principaux inconvénients d'une conception à base de FPGAs sont le niveau du langage de description d'entrée, qui doit typiquement être le niveau matériel, et le temps de reconfiguration, qui peut dépasser les exigences d'exécution si le FPGA est entièrement reconfiguré. Pour surmonter ces problèmes, cette thèse propose une méthodologie de conception qui exploite à la fois la synthèse de haut niveau et la reconfiguration dynamique. La méthodologie proposée donne un cadre pour construire une radio flexible pour la radio logicielle à base de FPGAs, qui peut être reconfigurée pendant l'exécution.
Software defined radio (SDR) is a promising technology to tackle the flexibility requirements of new generations of communication standards. It can be easily reprogrammed at the software level to implement different waveforms. When relying on a software-based technology such as microprocessors, this approach is clearly flexible and quite easy to design. However, it usually provides low computing capability and therefore low throughput. To tackle this issue, FPGA technology turns out to be a good alternative for implementing SDRs. Indeed, FPGAs have both high computing power and reconfiguration capacity. Thus, including FPGAs in the SDR concept may make it possible to support more waveforms, with stricter requirements, than a processor-based approach. However, the main drawbacks of FPGA design are the level of the input description language, which basically needs to be the hardware level, and the reconfiguration time, which may exceed run-time requirements if the complete FPGA is reconfigured. To overcome these issues, this PhD thesis proposes a design methodology that leverages both high-level synthesis tools and dynamic reconfiguration. The proposed methodology is a guideline for building a complete flexible radio for FPGA-based SDR, which can be reconfigured at run time.
|
760 |
Development and validation of a predictive model to ensure the long-term electromagnetic compatibility of embedded electronic systems / Développement et validation de modèle prédictif pour assurer la compatibilité électromagnétique à long terme des systèmes électroniques embarqués. Ghfiri, Chaimae. 13 December 2017.
Avec l'avancement technologique des circuits intégrés à travers la miniaturisation des tailles des transistors et leur multiplication au sein d'une même puce, l'intégration des circuits dans des systèmes embarqués complexes, principalement dans l'industrie aéronautique, spatiale et automobile, rencontre de plus en plus d'exigences en termes de respect des niveaux d'émission et d'immunité. De plus, étant donné que l'évolution des niveaux de Compatibilité Electromagnétique (CEM) des équipements électroniques doit respecter ces exigences à long terme, les marges définies par les industriels sont souvent surestimées et les systèmes de filtrage établis par les équipementiers peuvent être surdimensionnés. De ce fait, pour les circuits intégrés dédiés aux applications embarquées, il est nécessaire d'étudier les deux aspects que sont la modélisation CEM et la modélisation de la fiabilité. Ces dernières années, des standards ont été proposés qui permettent la construction de modèles CEM prédictifs tels que ICEM-CE/RE (Integrated Circuit Emission Model for Conducted and Radiated Emission) et ICIM-CI (Integrated Circuit Immunity Model for Conducted Immunity). De plus, pour intégrer l'effet du vieillissement dans les modèles CEM, il faut étudier les principaux mécanismes de dégradation intrinsèques aux circuits intégrés qui accélèrent leur vieillissement, tels que le HCI (Hot Carrier Injection), le TDDB (Time Dependent Dielectric Breakdown), l'EM (Electromigration) et le NBTI (Negative Bias Temperature Instability). Des modèles standardisés qui permettent la construction de modèles de fiabilité, tels que le standard MIL-HDBK-217 et le standard FIDES, sont utilisés dans les différents domaines industriels. Cependant, ils ne permettent de prendre en compte qu'un seul mécanisme de dégradation à la fois. Ce manuscrit de thèse introduit ces aspects de modélisation CEM et de fiabilité. Il traite également de la construction d'un modèle d'émission conduite d'un FPGA, avec la proposition de nouvelles méthodologies de modélisation. Ensuite, l'étude de la fiabilité du FPGA est décrite à travers l'utilisation d'un nouveau modèle permettant la prise en compte des différents mécanismes de dégradation ; ce modèle a été combiné au modèle CEM pour la prédiction des niveaux d'émission conduite à long terme.
With the technological evolution of integrated circuits (ICs) through transistor scaling, which multiplies the number of transistors within a chip, the requirements in terms of emission and immunity levels are becoming more restrictive in the aeronautic, space and automotive industries. Moreover, since the evolution of the Electromagnetic Compatibility (EMC) levels of electronic equipment after aging must meet the long-term EMC requirements, the EMC margins defined by the manufacturers are often overestimated and the filtering systems designed by the equipment manufacturers may be oversized. Therefore, for integrated circuits dedicated to embedded applications, it is necessary to study the different aspects of EMC modeling as well as reliability modeling. In recent years, several standards have been proposed for the construction of predictive EMC models, such as ICEM-CE/RE (Integrated Circuit Emission Model for Conducted and Radiated Emission) and ICIM-CI (Integrated Circuit Immunity Model for Conducted Immunity). On the other hand, to integrate the effect of aging in EMC models, it is important to study the main intrinsic degradation mechanisms that accelerate the aging of ICs, such as HCI (Hot Carrier Injection), TDDB (Time Dependent Dielectric Breakdown), EM (Electromigration) and NBTI (Negative Bias Temperature Instability). For this purpose, there are existing models for reliability prediction, such as the MIL-HDBK-217 standard and the FIDES standard. However, these models can take into account the activation of only one degradation mechanism at a time. The combination of several degradation mechanisms could be critical for IC performance and could contribute to the evolution of the EMC level. This thesis introduces the different aspects of EMC and reliability modeling. The work deals with the construction of a conducted emission model of an FPGA and the proposition of new modeling methodologies. Furthermore, the reliability of the tested FPGA is described using a new predictive model, which takes into account the activation of the different degradation mechanisms. The reliability model has been combined with the EMC model for long-term conducted emission level prediction.
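For orientation, the sketch below shows the usual handbook-style way of combining independent degradation mechanisms: each mechanism contributes a failure rate, accelerated in temperature by an Arrhenius factor, and the rates add. The base rates and activation energies are placeholder numbers, not values from MIL-HDBK-217, FIDES or the thesis model, which precisely aims to go beyond this one-mechanism-at-a-time view.

```python
import math

K_BOLTZMANN_EV = 8.617e-5       # Boltzmann constant in eV/K

def arrhenius(ea_ev, t_use_k, t_ref_k=328.0):
    """Temperature acceleration of a mechanism at T_use relative to a reference T_ref."""
    return math.exp((ea_ev / K_BOLTZMANN_EV) * (1.0 / t_ref_k - 1.0 / t_use_k))

# Placeholder per-mechanism base failure rates (FIT = failures per 1e9 h) and
# activation energies -- assumptions for the example, not handbook values.
mechanisms = {
    "HCI":  {"fit": 2.0, "ea_ev": -0.2},    # HCI is usually worse at low temperature
    "TDDB": {"fit": 5.0, "ea_ev": 0.7},
    "EM":   {"fit": 3.0, "ea_ev": 0.9},
    "NBTI": {"fit": 4.0, "ea_ev": 0.6},
}

def total_failure_rate(t_use_k):
    # Independent competing mechanisms: the failure rates simply add up.
    return sum(m["fit"] * arrhenius(m["ea_ev"], t_use_k) for m in mechanisms.values())

for t_c in (55, 85, 105):
    lam = total_failure_rate(t_c + 273.15)
    print(f"{t_c:>3} C : {lam:7.1f} FIT  (MTTF ~ {1e9 / lam:,.0f} h)")
```

A rate-addition model like this says nothing about how several simultaneously active mechanisms change the circuit's emission behavior, which is why the thesis couples its multi-mechanism reliability model with the conducted emission model to predict long-term emission levels.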
|