Global ETD Search

331	Méthodologie de conception d'architectures numériques complexes : du formalisme à l’implémentation en passant par l'analyse, préservation de la conformité. Application aux neuroprothèses / Design methodology for complex digital systems : from formalism to implementation through formal analysis, preservation of the compliance. Practical application to neuroprosthetics Leroux, Hélène 28 October 2014 Dans ce mémoire, la conception de systèmes numériques complexes, et notamment de systèmes embarqués critiques, est abordée au travers d'une méthodologie allant de la modélisation formelle à l'implantation sur FPGA : la méthodologie HILECOP. Celle-ci offre au concepteur la possibilité de représenter dans un modèle formel d'une part l'architecture du système selon un assemblage de composants, et d'autre part le comportement de ces composants et leur composition par réseaux de Petri temporels. Le modèle décrit est ensuite transformé automatiquement en un modèle implémentable (en langage VHDL) pour son exécution sur la cible matérielle, mais également en un modèle analysable pour permettre l'analyse formelle des propriétés du système. Les deux objectifs principaux des travaux présentés sont l'étude de la conformité d'un point de vue comportemental entre les différents modèles utilisés dans la méthodologie (modèle conçu, modèle implémentable et modèle analysable), ainsi que l'intégration d'un mécanisme de gestion efficace des exceptions. Ces travaux ont permis de fiabiliser l'implémentation du modèle et d'obtenir un modèle analysable plus pertinent par rapport au modèle conçu, dans le sens où il garantit l'inclusion du comportement du modèle conçu dans celui du modèle analysé et réduit, dans une certaine mesure, le risque d'explosion combinatoire. Les limites de la pertinence des résultats obtenus par analyse formelle sont de plus désormais connues. 
En ce qui concerne la gestion des exceptions, principalement étudiée au niveau comportemental, le mécanisme de la macro-place a été retenu et adapté aux contraintes fonctionnelles et non-fonctionnelles des systèmes embarqués critiques. L'apport de la macro-place et la conservation de la conformité ont pu être validés sur des modèles industriels relatifs à l'architecture numérique de neuroprothèses. / In this thesis, the design of complex digital systems, and notably of critical embedded systems, is addressed through a methodology that goes from formal modeling to implementation on an FPGA: the HILECOP methodology. This methodology lets the designer represent, in a formal model, on the one hand the architecture of the system as an assembly of components, and on the other hand the behavior of these components and their composition, using time Petri nets. The described model is then automatically transformed into an implementable model (in VHDL) for execution on the hardware target, and also into an analyzable model so that formal analysis of the system's properties can be performed. The two main goals of the presented work are the study of behavioral conformity between the different models used in the methodology (designed model, implementable model and analyzable model) and the integration of an efficient exception-handling mechanism. This work makes the implementation of the model more reliable and yields an analyzable model that is more relevant with respect to the designed model, in the sense that it guarantees that the behavior of the designed model is included in that of the analyzed model and reduces, to some extent, the risk of combinatorial explosion. The limits of the relevance of the results obtained by formal analysis are also now known. Exception handling was studied mainly at the behavioral level. 
The macroplace mechanism was chosen and adapted to the functional and non-functional constraints of critical embedded systems. The benefit of the macroplace and the preservation of conformity between the models were validated on industrial models of the digital architecture of neuroprostheses. Réseau de Petri Fpga Macroplace Implémentation Analyse formelle Exception Petri nets Fpga Macroplace Implementation Formal analysis Exception
332	Otimização de memória cache em tempo de execução para o processador embarcado LEON3 / Optimization of cache memory at runtime for embedded processor LEON3 Cuminato, Lucas Albers 28 April 2014 O consumo de energia é uma das questões mais importantes em sistemas embarcados. Estudos demonstram que neste tipo de sistema a cache é responsável por consumir a maior parte da energia fornecida ao processador. Na maioria dos processadores embarcados, os parâmetros de configuração da cache são fixos e não permitem mudanças após sua fabricação/síntese. Entretanto, este não é o cenário ideal, pois a configuração da cache pode não ser adequada para uma determinada aplicação, tendo como consequência menor desempenho na execução e consumo excessivo de energia. Neste contexto, este trabalho apresenta uma implementação em hardware, utilizando computação reconfigurável, capaz de reconfigurar automática, dinâmica e transparentemente a quantidade de ways e por consequência o tamanho da cache de dados do processador embarcado LEON3, de forma que a cache se adeque à aplicação em tempo de execução. Com esta técnica, espera-se melhorar o desempenho das aplicações e reduzir o consumo de energia do sistema. Os resultados dos experimentos demonstram que é possível reduzir em até 5% o consumo de energia das aplicações com degradação de apenas 0.1% de desempenho / Energy consumption is one of the most important issues in embedded systems. Studies have shown that in this type of system the cache consumes most of the power supplied to the processor. In most embedded processors, the cache configuration parameters are fixed and do not allow changes after manufacture/synthesis. However, this is not the ideal scenario, since the configuration of the cache may not be suitable for a particular application, resulting in lower performance and excessive energy consumption. 
In this context, this project proposes a hardware implementation, using reconfigurable computing, able to reconfigure the parameters of the LEON3 processor's cache at run time, improving application performance and reducing the power consumption of the system. The results of the experiments show that it is possible to reduce the processor's power consumption by up to 5% with only 0.1% degradation in performance. Cache memory Computação reconfigurável Embedded systems FPGA FPGA LEON3 LEON3 Memória cache Reconfigurable computing Sistemas embarcados
333	Aplicação de técnicas de reconfiguração dinâmica a projeto de máquina de vetor suporte (SVM). / Application of dynamic reconfiguration techniques to the project of support vector machines (SVM). Gomes Filho, Jonas 08 February 2010 As Máquinas de Vetores de Suporte (SVMs) têm sido largamente empregadas em diversas aplicações, graças à sua baixa taxa de erros na fase de testes (boa capacidade de generalização) e o fato de não dependerem das condições iniciais. Dos algoritmos desenvolvidos para o treinamento da SVM, o Sequential Minimal Optimization (SMO) é um dos mais rápidos e eficientes para a execução desta tarefa. Importantes implementações da fase de treinamento da SVM têm sido feitas em FPGAs. A maioria destas implementações tem sérias restrições na quantidade de conjunto de amostras a serem treinadas, pelo fato de implementarem soluções numéricas. De observação na literatura técnica, apenas dois trabalhos implementaram o SMO para o treinamento SVM em hardware e apenas um destes possibilita o treinamento de uma quantidade importante de amostras, porém a aplicação é restrita a apenas um benchmark específico. Na última década, com a tecnologia baseada em RAM estática, os FPGAs apresentaram um novo aspecto de flexibilidade: a capacidade de reconfiguração dinâmica, que possibilita a alteração do sistema em tempo de execução trazendo redução de área. Adicionalmente, apesar de uma potencial penalidade no tempo de processamento, a velocidade de execução continua muito superior quando comparada com soluções em software. No presente trabalho, uma solução genérica é proposta para o treinamento SVM em hardware (i.e. uma arquitetura que possibilite o treinamento para diversos tipos de amostras de entrada), e, motivado pela natureza seqüencial do algoritmo SMO, uma arquitetura dinamicamente reconfigurável é desenvolvida. Um estudo da implementação genérica com codificação em ponto fixo é apresentada, assim como os efeitos de quantização. 
A arquitetura é implementada no dispositivo Xilinx Virtex-IV XC4VLX25. Dados de tempo e área são obtidos e detalhes da síntese são explorados. É feita uma simulação da reconfiguração dinâmica através de chaves de isolação para a validação do sistema sob reconfiguração dinâmica. A arquitetura foi testada para três diferentes benchmarks, com resultados indicando que o treinamento no hardware reconfigurável foi acelerado em até 30 vezes quando comparado com a solução em software e os estudos apontaram que uma economia de até 22,38% de área útil do FPGA pode ser obtida dependendo das metodologias de síntese e implementação adotadas. / Support Vector Machines (SVMs) have been widely used in many applications, thanks to their low error rate in the test phase (good generalization capability) and the fact that they do not depend on initial conditions. Among the algorithms developed for SVM training, Sequential Minimal Optimization (SMO) is one of the fastest and most efficient. Important dedicated hardware implementations of the SVM training phase have been proposed for FPGAs. Most of them are severely restricted in the number of input samples that can be trained, because they implement numerical solutions. Only two works implementing the SMO algorithm for SVM training in hardware have been reported in the literature, and only one of them can train a significant number of input samples; however, it is restricted to a single specific benchmark. In the last decade, with technology based on static memory (SRAM), FPGAs have provided a new aspect of flexibility: dynamic reconfiguration, which allows the programmed design to be altered at run time and saves area. In addition, despite a potential penalty in processing time, execution is still much faster than with purely software solutions. 
In this work we present a fully hardware, general-purpose implementation of the SMO algorithm. In this general-purpose approach, training sets with different numbers of samples and elements can be trained, and, motivated by the sequential nature of some of the SMO tasks, a dynamically reconfigurable architecture is developed. A study of the general-purpose implementation with fixed-point encoding is presented, as well as the quantization effects. The architecture is implemented on the Xilinx Virtex-IV XC4VLX25 device, and timing and area data are provided. Synthesis details are explored. A simulation of dynamic reconfiguration using isolation switches is carried out to validate the system under dynamic reconfiguration. The architecture was tested on the training of three different benchmarks; training on the reconfigurable hardware was accelerated by up to 30 times compared with the software solution, and the studies point to an area saving of up to 22.38% depending on the synthesis and implementation methodologies adopted. Arquitetura reconfigurável Artificial intelligence Circuitos FPGA FPGA circuits Inteligência artificial Microelectronics Microeletrônica Reconfigurable architecture
334	Projeto de hardware dedicado para processamento de imagens em aplicações de navegação autônoma de robôs móveis agrícolas / Dedicated hardware design for image processing in applications of autonomous agricultural robot navigation Senni, Alexandre Padilha 05 August 2016 O emprego de veículos autônomos é uma prática comumente adotada para a melhoria da produtividade no setor agrícola. No entanto, o custo computacional é um fator limitante na implementação desses dispositivos autônomos. A alternativa apresentada neste trabalho consistiu no desenvolvimento de um dispositivo de hardware dedicado para a navegação de robôs móveis agrícolas, o qual indica áreas navegáveis e não navegáveis, além do ângulo de inclinação do veículo em relação à linha de plantio. O desenvolvimento do projeto foi baseado em um método de extração de características visuais locais por meio do processamento de imagens coloridas obtidas por uma câmera de vídeo. O circuito foi implementado por meio de uma ferramenta de desenvolvimento baseado em um FPGA de baixo custo. O circuito consiste nas etapas de classificação, processamento morfológico e extração das linhas de navegação. Na primeira etapa, os pixels são classificados a partir do modelo de cores HSL em classes que representam as áreas passíveis e não passíveis de navegação. Posteriormente, a etapa de processamento morfológico realiza as tarefas de filtragem, agrupamento e extração de bordas. O processamento morfológico é realizado por meio de um arranjo de unidades de processamento dedicadas. Cada unidade pode realizar uma operação básica de morfologia matemática. O elemento estruturante utilizado na operação, bem como a operação realizada pela unidade, é configurado por meio de parâmetros do projeto. O processo de extração das linhas de orientação é realizado por meio do método de regressão linear por mínimos quadrados. 
A arquitetura proposta no projeto permitiu o processamento em tempo real de imagens para a aplicação de navegação autônoma de robôs móveis em ambientes agrícolas. / The use of autonomous vehicles is a commonly adopted practice to improve productivity in the agricultural sector. However, computational cost is a limiting factor in implementing these autonomous devices. The alternative presented in this work is the design of dedicated hardware for the navigation of agricultural mobile robots, which indicates navigable and non-navigable areas as well as the angle of the vehicle relative to the crop row. The project was based on a method of local visual feature extraction by processing color images obtained from a video camera. The circuit was implemented using a development tool based on a low-cost FPGA. The circuit consists of classification, morphological processing and guidance line extraction stages. In the first stage, the pixels are classified using the HSL color model into classes that represent areas suitable and unsuitable for navigation. Then, the morphological processing stage performs filtering, grouping and edge extraction. The morphological processing is carried out by an array of dedicated processing units. Each unit can perform one basic operation of mathematical morphology. The structuring element used and the operation performed by each unit are configured through design parameters. The guidance lines are extracted by least-squares linear regression. The proposed architecture allowed real-time image processing for autonomous navigation of mobile robots in agricultural environments. Agricultural mobile robots Autonomous navigation Computer vision FPGA FPGA Navegação autônoma Robôs móveis agrícolas Visão computacional
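As an illustration of the guidance line extraction step named in the abstract above, the following is a minimal software sketch of a least-squares line fit. It is not the thesis's hardware circuit: the function names `fit_line` and `heading_angle_deg`, the choice to regress image column on image row, and the sample points are all assumptions made for this example.

```python
import math


def fit_line(points):
    """Least-squares fit of col = slope * row + intercept.

    `points` is a list of (row, col) pixel coordinates, for example edge
    pixels of a crop row. Regressing column on row is an assumption made
    here because crop rows tend to run roughly vertically in the image.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_row = sum(r for r, _ in points) / n
    mean_col = sum(c for _, c in points) / n
    s_rr = sum((r - mean_row) ** 2 for r, _ in points)
    if s_rr == 0:
        raise ValueError("all points share the same row")
    s_rc = sum((r - mean_row) * (c - mean_col) for r, c in points)
    slope = s_rc / s_rr
    intercept = mean_col - slope * mean_row
    return slope, intercept


def heading_angle_deg(slope):
    """Angle between the fitted line and the image's vertical axis."""
    return math.degrees(math.atan(slope))


# Hypothetical edge pixels lying exactly on col = row + 10.
slope, intercept = fit_line([(0, 10), (1, 11), (2, 12), (3, 13)])
angle = heading_angle_deg(slope)
```

With these sample points the fit recovers a slope of 1 and an intercept of 10, which corresponds to a 45 degree angle from the vertical image axis.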
335	"Implementação do barramento on-chip AMBA baseada em computação reconfigurável" / Implementation of on-chip AMBA bus based on Reconfigurable Computing Queiroz, Daniel Cruz de 04 February 2005 A computação reconfigurável está se fortalecendo cada vez mais devido ao grande avanço dos dispositivos reprogramáveis e ferramentas de projeto de hardware utilizadas atualmente. Isso possibilita que o desenvolvimento de hardware torne-se bem menos trabalhoso e complicado, facilitando assim a vida do desenvolvedor. A tecnologia utilizada atualmente em projetos de computação reconfigurável é denominada FPGA (Field Programmable Gate Array), que une algumas características tanto de software (flexibilidade), como de hardware (desempenho). Isso fornece um ambiente bastante propício para desenvolvimento de aplicações que precisam de um bom desempenho, sem que estas devam possuir uma configuração definitiva. O objetivo deste trabalho foi implementar um barramento eficiente para possibilitar a comunicação entre diferentes CORES de um robô reconfigurável, que podem estar dispersos em diferentes dispositivos FPGAs. Tal barramento seguirá o padrão AMBA (Advanced Microcontroller Bus Architecture), pertencente à ARM. Todo o desenvolvimento do core completo do AMBA foi realizado utilizando-se a linguagem VHDL (Very High Speed Integrated Circuit Hardware Description Language) e ferramentas EDAs (Electronic Design Automation) apropriadas. É importante notar que, embora o barramento tenha sido projetado para ser utilizado em um robô, o mesmo pode ser usado em qualquer sistema on-chip. / Reconfigurable computing is growing ever stronger thanks to the great advances in the reprogrammable devices and hardware design tools in current use. This makes hardware development much less laborious and complicated, making the designer's life easier. 
The technology currently used in reconfigurable computing projects is the FPGA (Field Programmable Gate Array), which combines characteristics of both software (flexibility) and hardware (performance). This provides a favorable environment for developing applications that need good performance without requiring a definitive configuration. The purpose of this work was to implement an efficient bus to enable communication among different cores of a reconfigurable robot, which may be spread across different FPGA devices. The bus follows the AMBA (Advanced Microcontroller Bus Architecture) standard, which belongs to ARM. The complete AMBA core was developed using VHDL (Very High Speed Integrated Circuit Hardware Description Language) and appropriate EDA (Electronic Design Automation) tools. It is important to note that, although the bus was designed for use in a robot, it can be used in any on-chip system. AMBA AMBA barramento on-chip computação reconfigurável FPGA FPGA on-chip bus reconfigurable computing
336	Sistema para sensoriamento e controle para aplicações em biomecatrônica. / Sensing and control system for applications in biomechatronics. Rossi, Luís Filipe Fragoso de Barros e Silva 26 January 2012 Diversos trabalhos relacionados ao desenvolvimento de dispositivos robóticos biomecatrônicos estão sendo realizados em vários laboratórios no mundo. Apesar desta crescente tendência, devido a uma falta de padronização nas tecnologias utilizadas, em especial no sistema de sensoriamento e controle, há uma grande divergência nos sistemas resultantes. De forma a se conseguir atender os requisitos dos projetos, muito tempo é despendido no desenvolvimento de sistemas de sensoriamento e controle dedicados. Dentro deste cenário, neste trabalho foi projetado e implementado um sistema de sensoriamento e controle modular específico para sistemas robóticos. Este foi desenvolvido de forma a poder ser utilizado em diversos projetos reduzindo o esforço para a sua implementação. O referido sistema foi dividido em três módulos: Processador Central, Nós e Rede de Comunicação. Foi dada uma especial atenção no aspecto relacionado à comunicação por ser um fator-chave para se conseguir manter compatibilidade entre diferentes sistemas. Uma rede de comunicação denominada R-Bone foi desenvolvida pelo fato de que os sistemas existentes não atendem aos requisitos propostos. Uma descrição conceitual do sistema projetado é apresentada e a sua implementação detalhada. Todos os aspectos técnicos relevantes foram descritos de forma a facilitar a sua replicação por outros grupos. Um driver para sistema operacional Linux foi desenvolvido em conjunto com uma camada de abstração para simplificar o seu uso. Os testes realizados demonstraram que o sistema desenvolvido atende os requisitos propostos, mantendo uma condição de estabilidade adequada em seu tempo de resposta, baixa latência e pouca defasagem entre os sinais coletados pelos sensores. 
De forma a contribuir para uma possível padronização dos sistemas utilizados na área, todos os arquivos e informações relevantes para a replicação do sistema proposto foram disponibilizados sob a licença GNU LGPL em um servidor SVN. / Several works related to the development of biomechatronic robotic devices are being carried out in laboratories around the world. Despite this growing trend, a lack of standardization in the technologies used, especially in the sensing and control system, leads to wide divergence among the resulting systems. To meet project requirements, a lot of time is spent developing dedicated sensing and control systems. In this scenario, a modular sensing and control system specifically for robotic systems was designed and implemented. It was developed so that it can be used in several projects, reducing the effort spent on its implementation. The system was divided into three modules: Central Processor, Nodes and Communication Network. Special attention was given to communication, as it is a key factor in keeping compatibility among different systems. A communication network named R-Bone was developed because existing systems do not meet the proposed requirements. A conceptual description of the designed system is presented and its implementation is detailed. All relevant technical aspects were described to facilitate replication by other groups. A driver for the Linux operating system was developed together with an abstraction layer to simplify its use. The tests demonstrated that the system meets the proposed requirements, with adequately stable response time, low latency and little skew between the signals collected by the sensors. To contribute to a possible standardization of the systems used in the biomechatronics field, all files and information relevant to replicating the proposed system were made available under the GNU LGPL license on an SVN server. 
Biomecatrônica Biomechatronics Controle distribuído Distributed control FPGA FPGA Modular system Robótica Robotics Sistema modular
337	Geração de b-splines via FPGA / B-spline generation via FPGA Silva, Luiz Marcelo Chiesse da 10 August 2012 As b-splines são utilizadas em sistemas CAD/CAM/CAE para representar e definir curvas e superfícies complexas, sendo adotada pelos principais padrões da computação gráfica devido a características como representação matemática de forma compacta, flexibilidade e transformações afins. Em sistemas de aquisição de dados 3D e sistemas CAM-CNC integrados, a utilização da b-spline na transferência de informações geométricas e na reconstrução da superfície de objetos resulta em um significativo incremento na eficiência do processo, geralmente implementado em sistemas embarcados. Nestes sistemas embarcados, integrados no auxílio a máquinas de manufatura, a utilização de FPGAs é incipiente, sem circuitos para b-splines disponibilizados em lógica reconfigurável de circuito aberto (open core), razão pela qual este projeto propõe o desenvolvimento de um circuito de geração b-spline aberto, em um sistema embarcado FPGA, utilizando algoritmos adaptados para os circuitos, elaborados em linguagem Verilog HDL, padronizada para a síntese de circuitos em lógica reconfigurável. Os circuitos foram desenvolvidos, utilizando-se um barramento de dados padronizado em circuito aberto, nas seguintes implementações para processamento paralelo das b-splines: o BFEA, o método baseado em funções base fixas, ambos projetados para circuitos integrados, e o fast Cox-de Boor, desenvolvido para FPGAs. Foram comparados o tempo de execução e o consumo de recursos disponíveis no FPGA utilizado, entre cada implementação. Os resultados evidenciaram que os circuitos de funções base fixas apresentaram o processamento mais rápido para a geração de b-splines em um FPGA, com um tempo de execução em média 20% menor em relação às outras implementações. Os circuitos BFEA apresentaram a menor utilização de elementos lógicos, em média 50% menor em relação aos outros circuitos implementados. 
O circuito fast Cox-de Boor apresentou a melhor escalabilidade, devido à modularidade da implementação, com tempos de execução similares aos circuitos de funções base fixas. / B-splines are used in CAD/CAM/CAE systems to represent and define complex curves and surfaces, and are adopted by the main computer graphics standards because of features such as compact mathematical representation, flexibility and affine transformations. In 3D data acquisition systems and integrated CAM-CNC systems, using b-splines to transfer geometric information and reconstruct object surfaces significantly increases the efficiency of the process, which is generally implemented in embedded systems. In these embedded systems, which assist manufacturing machines, the use of FPGAs is incipient, and no open-core b-spline circuits are available in reconfigurable logic. For this reason, this project proposes the development of an open-core b-spline generation circuit in an FPGA embedded system, using algorithms adapted for the circuits and written in Verilog HDL, a standardized language for circuit synthesis in reconfigurable logic. The circuits were developed, using a standardized open-core data bus, in the following implementations for parallel b-spline processing: the BFEA and the method based on fixed basis functions, both designed for integrated circuits, and the fast Cox-de Boor, developed for FPGAs. Execution time and consumption of the available FPGA resources were compared across the implementations. The results show that the fixed basis function circuits were the fastest at generating b-splines on an FPGA, with an execution time on average 20% lower than the other implementations. The BFEA circuits used the fewest logic elements, on average 50% fewer than the other implemented circuits. 
The fast Cox-de Boor circuit showed the best scalability, owing to the modularity of the implementation, with execution times similar to those of the fixed basis function circuits. B-Spline B-Spline FPGA FPGA Lógica reconfigurável Reconfigurable logic Reconstrução de superfícies Surface reconstruction
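For readers unfamiliar with the Cox-de Boor method mentioned in the abstract above, here is a minimal sketch of the textbook recursion for b-spline basis functions. It is plain software written for illustration, not the "fast Cox-de Boor" circuit or any of the other hardware implementations compared in the thesis; the names `basis` and `curve_point`, the knot vector and the control points are assumptions for this example.

```python
def basis(i, degree, u, knots):
    """Value of the i-th b-spline basis function of the given degree at u.

    Textbook Cox-de Boor recursion. Terms with a zero-width knot span are
    treated as zero, the usual convention for repeated knots.
    """
    if degree == 0:
        return 1.0 if knots[i] <= u < knots[i + 1] else 0.0
    left = 0.0
    span_left = knots[i + degree] - knots[i]
    if span_left > 0:
        left = (u - knots[i]) / span_left * basis(i, degree - 1, u, knots)
    right = 0.0
    span_right = knots[i + degree + 1] - knots[i + 1]
    if span_right > 0:
        right = ((knots[i + degree + 1] - u) / span_right
                 * basis(i + 1, degree - 1, u, knots))
    return left + right


def curve_point(u, degree, knots, control_points):
    """Point on the b-spline curve at parameter u (2D control points)."""
    x = y = 0.0
    for i, (px, py) in enumerate(control_points):
        weight = basis(i, degree, u, knots)
        x += weight * px
        y += weight * py
    return x, y


# Hypothetical quadratic curve with a clamped knot vector.
knots = [0, 0, 0, 1, 1, 1]
control_points = [(0.0, 0.0), (1.0, 2.0), (2.0, 0.0)]
weights = [basis(i, 2, 0.5, knots) for i in range(3)]
point = curve_point(0.5, 2, knots, control_points)
```

With this clamped knot vector the three basis functions at u = 0.5 evaluate to 0.25, 0.5 and 0.25, so they sum to one. Because the degree-zero case uses a half-open interval, this sketch returns zero at the last knot value; a production implementation would special-case that endpoint.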
338	Um método de otimização da relação desempenho/consumo de energia para arquiteturas multi-cores heterogêneas em FPGA / A method to optimize performance/energy consumption relation for heterogeneous multi-core architectures on FPGA Silva, Bruno de Abreu 07 March 2016 Devido às tendências de crescimento da quantidade de dados processados e a crescente necessidade por computação de alto desempenho, mudanças significativas estão acontecendo no projeto de arquiteturas de computadores. Com isso, tem-se migrado do paradigma sequencial para o paralelo, com centenas ou milhares de núcleos de processamento em um mesmo chip. Dentro desse contexto, o gerenciamento de energia torna-se cada vez mais importante, principalmente em sistemas embarcados, que geralmente são alimentados por baterias. De acordo com a Lei de Moore, o desempenho de um processador dobra a cada 18 meses, porém a capacidade das baterias dobra somente a cada 10 anos. Esta situação provoca uma enorme lacuna, que pode ser amenizada com a utilização de arquiteturas multi-cores heterogêneas. Um desafio fundamental que permanece em aberto para estas arquiteturas é realizar a integração entre desenvolvimento de código embarcado, escalonamento e hardware para gerenciamento de energia. O objetivo geral deste trabalho de doutorado é investigar técnicas para otimização da relação desempenho/consumo de energia em arquiteturas multi-cores heterogêneas single-ISA implementadas em FPGA. Nesse sentido, buscou-se por soluções que obtivessem o melhor desempenho possível a um consumo de energia ótimo. Isto foi feito por meio da combinação de mineração de dados para a análise de softwares baseados em threads aliadas às técnicas tradicionais para gerenciamento de energia, como way-shutdown dinâmico, e uma nova política de escalonamento heterogeneity-aware. 
Como principais contribuições pode-se citar a combinação de técnicas de gerenciamento de energia em diversos níveis como o nível do hardware, do escalonamento e da compilação; e uma política de escalonamento integrada com uma arquitetura multi-core heterogênea em relação ao tamanho da memória cache L1. / Due to the growing need for high-performance computing along with a higher volume of data to process, important changes are happening in computer architecture design. Parallel processors with hundreds or thousands of processing cores on a single chip are becoming a common solution, even for embedded systems. Power management is becoming increasingly important, especially for mobile systems. A key challenge that remains open for these architectures is to integrate application code, runtime scheduling and hardware control for power management. This thesis aims to present a method able to integrate these three aspects, by investigating techniques for optimizing performance versus power consumption in single-ISA heterogeneous multi-core architectures implemented on FPGA. Our approach combines a data mining technique to analyze the application source code, traditional power management techniques, and a heterogeneity-aware scheduling policy. The main contributions are the combination of power management techniques at the hardware, scheduling and compilation levels, and a new scheduling policy integrated with a multi-core architecture that is heterogeneous in its L1 cache memory size, determined offline and online. Consumo de energia Desempenho Energy consumption FPGA FPGA Heterogeneous multi-cores Multi-cores heterogêneos Performance
339	Algoritmos de tempo real para melhoramento de imagens capturadas no espectro do infravermelho projetados para síntese em FPGA / Real-time infrared images enhancement algorithms developed for FPGA synthesis Lucas Rotava 04 December 2015 Este trabalho apresenta o desenvolvimento de algoritmos de processamento de imagens para câmeras térmicas, com o objetivo de sintetizá-los em FPGA. Existem diversas aplicações para imagens térmicas nas áreas médica, de segurança e industrial, por isso o conhecimento e o desenvolvimento de câmeras térmicas são de interesse para a academia e para a indústria. Por consequência, o desenvolvimento de algoritmos que tratem as imagens também representa importante papel. Os algoritmos implementados neste trabalho são: correção de não uniformidade (NUC); substituição de pixels defeituosos, ou bad pixels, (BPR); redução da resolução de cor com realce de contraste; e filtro espacial para realçar detalhes da imagem, chamado de filtro de nitidez. Os três primeiros são algoritmos importantes devido à características dos detectores e de câmeras térmicas, já o filtro de nitidez foi proposto para melhorar a visualização de objetos nas imagens. Com os algoritmos simulados em Matlab foram feitas medidas de contraste e de MTF das imagens de saída, e os resultados obtidos para os algoritmos de realce de contraste e de nitidez mostraram que eles são adições importantes ao conjunto de algoritmos básicos para câmeras térmicas, já que, para alguns casos, o realce de contraste aumentou em mais de 50% a medida de contraste da imagem, em comparação com o algoritmo anterior, e o filtro de nitidez proporcionou valores de MTF até duas vezes maiores. Os algoritmos de NUC e BPR apresentaram os resultados esperados, corrigindo a imagem recebida do detector. 
As imagens utilizadas eram de 640×512 pixels processadas em uma taxa de 30 fps, e dessa forma optou-se pelo FPGA para a síntese dos algoritmos, sendo possível realizar os processamentos paralelamente contando com a característica de alto throughput inerente a estes componentes. Os algoritmos implementados em FPGA apresentaram desempenho superior aos requisitos mínimos de tempo para o sistema utilizado, sendo perfeitamente capazes de processar o vídeo de entrada em tempo real. / This work presents the development of image processing algorithms for thermal cameras, intended for synthesis on FPGA. There are many applications for thermal imaging in the medical, security and industrial areas; the knowledge and development of thermal cameras are therefore of interest to both academia and industry. Consequently, the development of algorithms to process the images also plays an important role. The implemented algorithms are: nonuniformity correction (NUC); bad pixel replacement (BPR); pixel depth reduction with contrast enhancement; and an emboss spatial filter that enhances image detail. The first three algorithms are important because of characteristics of infrared detectors and cameras, while the emboss filter is proposed to improve the visualization of objects in the images. With the algorithms simulated in Matlab, contrast and MTF were measured on the output images, and the results showed that the contrast enhancement and emboss filter algorithms are important additions to the basic set of image processing algorithms for infrared cameras: in some cases, contrast enhancement improved the image contrast measure by more than 50% compared with the previous algorithm, and the emboss filter yielded MTF values up to twice as high. The NUC and BPR algorithms gave the expected results, correcting the image received from the detector. 
The images used had a resolution of 640×512 pixels and were processed at 30 frames per second. For this reason an FPGA was chosen to synthesize the algorithms, since it allows them to run in parallel and benefits from the high throughput inherent to these devices. The algorithms implemented in the FPGA exceeded the minimum timing requirements of the system used and are fully capable of processing the input video in real time. Câmera térmica FPGA Imagem térmica Processamento de imagens FPGA Image processing Infrared imaging Thermal camera
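The processing chain named in this abstract can be illustrated in software. What follows is only a minimal Python sketch, not the thesis's FPGA implementation: the abstract does not say how NUC or BPR are computed, so the per-pixel gain/offset model in `nuc` and the neighbour-median rule in `replace_bad_pixels` are assumptions, and all function names are hypothetical. Only the 640×512 resolution and the 30 fps rate come from the abstract.

```python
from statistics import median

# Figures quoted in the abstract.
WIDTH, HEIGHT, FPS = 640, 512, 30


def pixel_rate(width=WIDTH, height=HEIGHT, fps=FPS):
    """Pixels per second the pipeline must sustain for real-time video."""
    return width * height * fps


def nuc(frame, gain, offset):
    """Nonuniformity correction, ASSUMING a per-pixel linear model:
    corrected = gain * raw + offset."""
    return [
        [g * p + o for p, g, o in zip(row, gain_row, offset_row)]
        for row, gain_row, offset_row in zip(frame, gain, offset)
    ]


def replace_bad_pixels(frame, bad):
    """Bad pixel replacement, ASSUMING each flagged pixel takes the
    median of its valid 8-neighbours. `bad` is a set of (row, col)."""
    height, width = len(frame), len(frame[0])
    out = [row[:] for row in frame]
    for r, c in bad:
        neighbours = [
            frame[rr][cc]
            for rr in range(max(0, r - 1), min(height, r + 2))
            for cc in range(max(0, c - 1), min(width, c + 2))
            if (rr, cc) != (r, c) and (rr, cc) not in bad
        ]
        if neighbours:  # leave the pixel untouched if no valid neighbour exists
            out[r][c] = median(neighbours)
    return out
```

Calling `pixel_rate()` multiplies 640 × 512 × 30, giving the sustained pixel throughput that the abstract's real-time requirement implies.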
340	Modélisation et gestion du trafic dans le cadre de réseaux sur puce multi-FPGA / Modeling and management of traffic in the Network-on-Chip multi-FPGA Dorai, Atef 07 July 2017 Avec la complexité croissante des systèmes sur puce, la conception de la nouvelle génération des systèmes embarqués dédiée aux applications multimédia doit intégrer des structures de communication efficaces telles que le réseau sur puce (Network-on-Chip : NoC). Vu la limitation du nombre de ressources d’un seul FPGA, les plateformes multi-FPGA sont considérées comme la solution la plus appropriée pour émuler et évaluer ces grands systèmes. Le déploiement passe souvent par le partitionnement du NoC sur plusieurs FPGAs et le remplacement des liens de communications internes par des liens de communications externes. Cette solution possède des limitations. En fait, l’évolution des FPGAs tend à rendre les IOs des ressources rares aggravant la bande passante intra-FPGA d’une génération à une autre. Actuellement, le nombre de signaux inter-FPGA est considéré comme un problème majeur pour déployer un NoC à grande échelle sur multi-FPGA. Comme il y a plus de signaux à connecter que les IOs disponibles sur FPGA, un goulot d’étranglement important a été créé, dont les concepteurs souffrent. Les contributions principales de cette thèse sont : (1) Nous avons développé deux architectures de gestions de collisions, une basée sur un accès aléatoire (Backoff) et l’autre basée sur un accès planifié (Round-Robin). Des comparaisons temporelles et des ressources ont été effectuées pour choisir la méthode d’accès la plus performante pour prototyper un NoC sur multi-FPGA. L’architecture basée sur le Backoff permet de partager efficacement le lien externe entre plusieurs routeurs avec un nombre minimum de collisions. Ainsi, cet algorithme permet de gérer le goulet d’étranglement et équilibre les accès des routeurs vers l’inter-FPGA. 
La nouvelle architecture inter-FPGA pour le Network-on-Chip basée sur l’algorithme BackOff fournit une latence plus faible avec moins de ressources par rapport à d’autres solutions comme le RR (Round-Robin) et le HRRA (Hierarchical Round-Robin Arbiter). (2) Une méthodologie de modélisation a émergé pour estimer le nombre de ressources utilisées par chaque architecture. Cette modélisation est basée sur la régression linéaire. Il y a de plus grandes surestimations avec le round-robin qu’avec le Backoff. (3) Finalement, une architecture de NoC dédiée aux applications multimédias a été proposée. L’objectif de cette architecture est de transmettre des trafics avec des niveaux de priorités différentes dans de bonnes conditions. Dans cette architecture de NoC multimédia, nous avons doublé les liens physiques au lieu d’utiliser des canaux virtuels pour permettre aux trafics de haute priorité de récupérer le retard. De plus, nous avons intégré à l’intérieur des routeurs un simple arbitre pour traiter les niveaux de priorité pour chaque paquet. Cette nouvelle architecture a été comparée avec des architectures de NoC traditionnelles avec (basée sur des canaux virtuels) ou sans (NoC Handshake) qualité de service. Plusieurs tests ont été effectués pour prouver l’efficacité de l’architecture du NoC multimédia. Finalement, une étude analytique a été proposée pour estimer le nombre d’AP nécessaires pour que cette architecture de NoC multimédia réponde aux exigences d’utilisateurs dans le contexte de multi-FPGA. / With the increasing complexity of Systems-on-Chip, the design of efficient embedded systems dedicated to multimedia applications must integrate effective communication interconnects such as the Network-on-Chip (NoC). Given the limited resources of a single FPGA, multi-FPGA platforms are considered the most appropriate means of experimenting with, emulating and evaluating such large systems. 
Deployment often involves partitioning the Network-on-Chip across several FPGAs and replacing internal communication links with external ones. The limitation of this solution stems from the fact that, as FPGAs evolve, their I/O resources become scarcer over time. This, consequently, decreases intra-FPGA bandwidth. Currently, the number of inter-FPGA signals is considered a major obstacle to prototyping a Network-on-Chip on a multi-FPGA platform. Since routers need more signals than there are available FPGA I/Os, inter-FPGA links must be shared between routers, resulting in significant bottlenecks. Because the ratio of logic capacity to the number of I/Os increases slowly with each FPGA generation, this technological bottleneck will remain in future system designs. The main contributions of this thesis are: (1) We have developed two collision management architectures, one based on random access (Backoff) and the other based on a round-robin algorithm. Timing and resource comparisons are made to evaluate the two inter-FPGA traffic management architectures. The Backoff-based sub-NoC architecture effectively shares external links between multiple routers with a minimum number of collisions and balances access between all routers. The new inter-FPGA architecture for the Network-on-Chip based on the Backoff algorithm achieves lower latency with fewer resources than other solutions such as Round-Robin and the Hierarchical Round-Robin Arbiter. (2) A modeling methodology has emerged to estimate the number of resources used by each architecture. This modeling is based on linear regression. The over-estimations are considerably larger for the round-robin than for the Backoff. (3) A NoC architecture dedicated to multimedia applications has been proposed. The objective of this architecture is to transmit traffic with different priority levels under good conditions. 
In this multimedia NoC architecture, we have doubled the physical links instead of using virtual channels, to allow high-priority traffic to recover its delay and to ensure quality of service. Additionally, we have integrated a simple arbiter within the routers to handle the priority level of each packet. This new architecture has been compared with a traditional architecture based on virtual channels using several test partitionings. Finally, an analytical study was proposed to estimate the number of APs needed for the multimedia NoC deployed in multi-FPGA systems to meet the user’s requirements. NoC Multi-FPGA Architecture de gestion de collisions NoC multimédia NoC Multi-FPGA Collision management architecture NoC multimedia
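The two ways of sharing an inter-FPGA link contrasted in this abstract, scheduled (round-robin) and random (Backoff) access, can be modelled in software. What follows is a minimal Python sketch, not the thesis's hardware architecture: the abstract gives no arbiter details, so the slotted-time model, the window-doubling rule, the `max_window` parameter and both function names are assumptions made for illustration.

```python
import random


def round_robin_grant(requests, last):
    """Scheduled access: grant the first requesting router found when
    scanning cyclically from the one after `last`. Returns None if idle."""
    n = len(requests)
    for step in range(1, n + 1):
        idx = (last + step) % n
        if requests[idx]:
            return idx
    return None


def backoff_simulation(n_routers, packets_each, max_window=8, seed=0):
    """Random access over one shared link (ASSUMED slotted model).

    Each router waits a random number of slots before trying to send.
    One sender in a slot succeeds; several senders collide, and each
    doubles its window (capped at `max_window`) before retrying.
    Returns (slots_used, collisions).
    """
    rng = random.Random(seed)
    pending = [packets_each] * n_routers
    window = [2] * n_routers
    wait = [rng.randrange(w) for w in window]
    slots = collisions = 0
    while any(pending):
        slots += 1
        ready = [i for i in range(n_routers) if pending[i] and wait[i] == 0]
        if len(ready) == 1:
            i = ready[0]
            pending[i] -= 1          # packet crosses the shared link
            window[i] = 2            # reset the window after a success
            wait[i] = rng.randrange(window[i])
        elif len(ready) > 1:
            collisions += 1
            for i in ready:
                window[i] = min(2 * window[i], max_window)
                wait[i] = rng.randrange(window[i])
        # Routers that did not attempt this slot count down their wait.
        for i in range(n_routers):
            if i not in ready and pending[i] and wait[i] > 0:
                wait[i] -= 1
    return slots, collisions
```

For example, `round_robin_grant([True, False, True], 0)` grants router 2, the next requester after router 0. The backoff model needs at least one slot per delivered packet, so `slots` is never smaller than `n_routers * packets_each`.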
