Global ETD Search

11	Interface Design and Synthesis for Structural Hybrid Microarchitectural Simulators Ruan, Zhuo 01 December 2013 Computer architects have discovered the potential of using FPGAs to accelerate software microarchitectural simulators. One type of FPGA-accelerated microarchitectural simulator, the hybrid structural microarchitectural simulator, is especially promising because it combines structural software and hardware, an organization that provides both modeling flexibility and fast simulation speed. The performance of a hybrid simulator is significantly affected by how the interface between software and hardware is constructed. This thesis creates an infrastructure, the Simulator Partitioning Research Infrastructure (SPRI), that implements the synthesis of hybrid structural microarchitectural simulators, including simulator partitioning, simulator-to-hardware synthesis, and interface synthesis. With the support of SPRI, the thesis characterizes the design space of interfaces for synthesized hybrid structural microarchitectural simulators and provides implementations of several such interfaces. The evaluation studies the important design tradeoffs and performance factors (e.g., hardware capacity, design scalability, and interface latency) involved in choosing an efficient interface. This work matters to the computer architecture research community: it not only contributes a complete synthesis infrastructure, but also provides guidelines to architects on how to organize software microarchitectural models and choose a suitable software/hardware interface so that the hybrid microarchitectural simulators synthesized from these models can achieve the desired speedup. hybrid microarchitectural simulator software codesign hardware codesign SystemC FPGA Electrical and Computer Engineering
12	RESOURCE-AWARE OPTIMIZATION TECHNIQUES FOR MACHINE LEARNING INFERENCE ON HETEROGENEOUS EMBEDDED SYSTEMS Spantidi, Ourania 01 May 2023 With the increasing adoption of Deep Neural Networks (DNNs) in modern applications, there has been a proliferation of compute- and power-hungry workloads, which has necessitated embedded systems with more sophisticated, heterogeneous designs to accommodate these requirements. One of the major solutions to these challenges has been the development of domain-specific accelerators, which are highly optimized for the computationally intensive tasks associated with DNNs. These accelerators are designed to take advantage of properties of DNNs such as parallelism and data locality to achieve high throughput and energy efficiency. Domain-specific accelerators have been shown to provide significant improvements in performance and energy efficiency over traditional general-purpose processors and are becoming increasingly popular in applications such as computer vision and speech recognition. However, designing these architectures and managing their resources can be challenging, as it requires a deep understanding of the workload and of the system's properties. Achieving a favorable balance between performance and power consumption is not always straightforward and requires careful design decisions to fully exploit the benefits of the underlying hardware. This dissertation aims to address these challenges by presenting solutions that enable low energy consumption without compromising performance on heterogeneous embedded systems.
Specifically, this dissertation focuses on three topics: (i) the use of approximate computing concepts and approximate accelerators for energy-efficient DNN inference, (ii) the integration of formal properties in the systematic employment of approximate computing concepts, and (iii) resource management techniques on heterogeneous embedded systems. In summary, this dissertation provides a comprehensive study of solutions that can improve the energy efficiency of heterogeneous embedded systems, enabling them to perform the computationally intensive tasks of modern DNN-based applications without compromising performance. The results demonstrate the effectiveness of the proposed solutions and their potential for wide-ranging practical applications. Embedded Systems Formal Properties Hardware Accelerators Hardware-Software Codesign Machine Learning Neural Networks
13	Software Performance Estimation Techniques in a Co-Design Environment Subramanian, Sriram 02 September 2003 No description available. Computer Science software performance estimation VCC SPIM MiBench hardware-software-codesign
14	Uma abordagem para a modelagem de sistemas digitais / An Approach to Digital System Modeling Oliveira, Wagner Luiz Alves de 18 December 2003 Advisor: Norian Marranghello / Doctoral thesis, Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação / Previous issue date: 2004 / Abstract: Digital system design has reached a degree of complexity that prevents its realization without CAD tools.
The starting point of such tools is a conceptual view of the intended system (given by one or more conceptual models), which is captured for computational handling by one or more specification languages. Several such languages were developed to capture as many hardware and software characteristics as possible, according to different design methodologies. Petri nets are a class of conceptual models used to model various kinds of parallel computing systems. Some Petri net extensions have been proposed to describe the characteristics of digital systems as accurately as possible. However, only two of them have nearly all the features needed to describe such systems in full. This work presents a Petri net extension developed to overcome the limitations of the other extensions in representing digital systems. It also presents a hardware/software codesign methodology in which the proposed extension can be used as the internal modeling language. The framework targets the description, simulation, analysis, validation, and high-level synthesis of embedded digital systems / Doctorate / Electronics, Microelectronics and Optoelectronics / Doctor of Electrical Engineering Petri nets Simulation (Computers) Embedded systems (Computers) Hardware description languages Petri Nets Digital Embedded Systems Digital Systems Hardware / Software Codesign
15	Nástroj pro grafické prototypování vestavěných systémů / Tool for Graphical Prototyping of the Embedded Systems Ilčík, Ondřej January 2011 This study focuses on graphical modeling of embedded systems using dialects of UML. It provides a brief description of existing profiles. Furthermore, it deals with modeling frameworks for the Eclipse platform and describes the implementation of such a modeling tool as part of the Lissom project.
16	Vícekamerový snímač biometrických vlastností lidského prstu / Multi-Camera Scanner of Biometric Features of Human Finger Trhoň, Adam January 2015 This thesis describes the conceptual design of a touchless fingerprint sensor and the design, implementation, and testing of its firmware, which combines hardware implemented in VHDL with a program implemented in C. The result of this thesis can be used as the first step toward building an industrial solution.
17	Arquitetura multi-core reconfigurável para detecção de pedestres baseada em visão / Reconfigurable Multi-core Architecture for Vision-based Pedestrian Detection Holanda, Jose Arnaldo Mascagni de 17 May 2017 Among the several Advanced Driver Assistance Systems (ADAS) technologies that have been added to modern vehicles are pedestrian detection systems. These systems use sensors such as radars, lasers, and video cameras to capture information from the environment and avoid collisions with people in traffic. Video cameras have emerged as a strong option for such systems because of their relatively low cost and the wealth of information they capture from the environment. Many techniques for vision-based pedestrian detection have appeared in recent years; they share the need for substantial computational power so that images can be processed in real time, robustly and reliably, with a low error rate. In addition, systems that implement these techniques must have low power consumption so that they can operate in an embedded environment such as an automobile. One trend in these systems is processing images from multiple cameras mounted on the vehicle, so that the system can detect potential collision hazards all around it. In this context, this work addresses the hardware/software codesign of an architecture for pedestrian detection, considering four cameras on a vehicle (one at the front, one at the rear, and two on the sides).
For this purpose, the flexibility of FPGA devices is used for design space exploration and for constructing an architecture that provides the necessary performance, keeps energy consumption at appropriate levels, and also allows adaptation to new scenarios and to the evolution of pedestrian detection techniques through programmability. The development of the architecture was based on two algorithms widely used for pedestrian detection, Histogram of Oriented Gradients (HOG) and Integral Channel Features (ICF). Both introduce techniques that serve as the basis for modern detection algorithms. The implemented architecture allowed different types of application parallelism to be explored through multiple softcore processors, and critical functions to be accelerated through hardware implementations. Its feasibility for serving a system with four video cameras was also demonstrated. Architecture Computer vision FPGA Hardware Hardware/software codesign Pedestrian detection
18	Exploring coordinated software and hardware support for hardware resource allocation Figueiredo Boneti, Carlos Santieri de 04 September 2009 Multithreaded processors are now common in the industry as they offer high performance at a low cost. Traditionally, in such processors, the assignment of hardware resources among the multiple threads is done implicitly, by hardware policies. However, a new class of multithreaded hardware allows the allocation of resources to be explicitly controlled or biased by the software. Currently, there is little or no coordination between the allocation of resources done by the hardware and the prioritization of tasks done by the software. This thesis aims to narrow the gap between software and hardware with respect to hardware resource allocation by proposing a new explicit resource allocation hardware mechanism and novel schedulers that use the currently available hardware resource allocation mechanisms. It approaches the problem in two different types of computing systems. In the high performance computing domain, we characterize the first processor to present a mechanism that allows the software to bias the allocation of hardware resources, the IBM POWER5. In addition, we propose the use of hardware resource allocation as a way to balance high performance computing applications. Finally, we propose two new scheduling mechanisms that are able to transparently and successfully balance applications in real systems using hardware resource allocation. In the soft real-time domain, we propose a hardware extension to the existing explicit resource allocation hardware and, in addition, two software schedulers that use the explicit allocation hardware to improve the schedulability of tasks in a soft real-time system. In this thesis, we demonstrate that system performance improves when the software is made aware of the mechanisms that control the amount of resources given to each running thread.
In particular, for the high performance computing domain, we show that it is possible to decrease the execution time of MPI applications by biasing the assignment of hardware resources between threads. In addition, we show that it is possible to decrease the number of missed deadlines when scheduling tasks in a soft real-time SMT system. MT CMP hardware priorities thread prioritization resource balancing load balancing powers simultaneous multithreading SMT hardware-software codesign performance characterization software-controlled prioritization 004
19	Entwurf, Methoden und Werkzeuge für komplexe Bildverarbeitungssysteme auf Rekonfigurierbaren System-on-Chip-Architekturen / Design, methodologies and tools for complex image processing systems on reconfigurable system-on-chip-architectures Mühlbauer, Felix January 2011 Image processing applications place special demands on the computing system that executes them. On the one hand, high computational power is necessary. On the other hand, high flexibility is an advantage because development tends to be an experimental and interactive process. For new applications, developers tend to choose a computing architecture they know well instead of the one that best fits the application. Image processing algorithms are inherently parallel, yet conventional embedded image processing systems are mostly based on sequentially operating processors. In contrast to this "mismatch", highly efficient systems can be built from a targeted synergy of software and hardware components. However, the construction of such systems is complex, and many solutions, such as coarse-grained architectures or application-specific programming languages, are often too academic for industrial use. The present work aims to help reduce the complexity of hardware-software systems and thus to simplify the development of high-performance on-chip systems for image processing and make it more economical. Emphasis was placed on keeping the effort for familiarization, development, and extension low.
A design flow was developed and implemented that allows the software developer to accelerate calculations with hardware components and to prototype the whole underlying embedded system. The work considers complex image processing applications that require an operating system, such as distributed camera sensor networks. The software used is based on Linux and the image processing library OpenCV. The distribution of the calculations across software and hardware components, and the resulting scheduling and generation of the computing architecture, are done automatically. The design space exploration is based on answer set programming, which brings advantages for modeling and extension. The system software is synthesized with OpenEmbedded/Bitbake, and the generated on-chip architectures are implemented on FPGAs. image processing FPGA on-chip design space exploration hardware-software-codesign answer set programming Data processing Computer science
