Global ETD Search

231	Polymorphic ASIC : For Video Decoding Adarsha Rao, S J January 2013 (has links) (PDF) Video applications are becoming ubiquitous in recent times due to an explosion in the number of devices with video capture and display capabilities. Traditionally, video applications are implemented on a variety of devices with each device targeting a speciﬁc application. However, the advances in technology have created a need to support multiple applications from a single device like a smart phone or tablet. Such convergence of applications necessitates support for interoperability among various applications, scalable performance meet the requirements of different applications and a high degree of reconﬁgurability to accommodate rapid evolution in applications features. In addition, low power consumption requirement is also very stringent for many video applications. The conventional custom hardware implementations of video applications deliver high performance at low power consumption while the recent MPSoC implementations enable high degree of interoperability and are useful to support application evolution. In this thesis, we combine the best features of custom hardware and MPSoC approaches to design a Polymorphic ASIC. A Polymorphic ASIC is an integrated circuit designed to meet the requirements of several applications belonging to a particular domain. A polymorphic ASIC consists of a fabric of computation, storage and communication resources, using which applications are composed dynamically. Although different video applications differ widely in the internal de-tails of operation, at the heart of almost every video application is a video codec (encoder and decoder). The requirements of scalability, high performance and low power consumption are very stringent for video decoding. Therefore this thesis focuses mainly on the architectural design of a Polymorphic ASIC for video decoding. We present an uniﬁed software and hardware architecture (USHA) for Polymorphic ASIC. USHA is a tiled architecture which uses loosely coupled processor and hardware tiles that are software programmable and hardware reconﬁgurable respectively. The distinctive feature of Polymorphic ASIC is the static partitioning of the application and dynamic mapping of ap-plication processes onto the computational tiles. Depending on the application scenarios, a process may be mapped onto one of the hardware or processor tiles. Polymorphic ASIC incor-porates a network–on–chip (NoC) to achieve ﬂexible communication across different tiles. Formulation of a programming framework for Polymorphic ASIC requires an implementation model that captures the structure of video decoder applications as well as the properties of the Polymorphic ASIC architecture. We derive an implementation model based on a combination of parametric polyhedral process networks, stream based functions and windowed dataﬂow models of computation. The implementation model leads to a process network oriented compilation ﬂow that achieves realization agnostic application partitioning and enables seamless migration across uniprocessor, multi–processor, semi hardware and full hardware conﬁgurations of a video decoder. The thesis also presents an application QoS aware scheduler that selects a decoder conﬁguration that best meets the application performance requirements, thereby enabling dynamic performance scaling. The memory hierarchy of Polymorphic ASIC makes use of an application speciﬁc cache. Through a combined analysis of miss rate and external memory bandwidth, we show that the degradation in decoder performance due to memory stall cycles depends on the properties of the video being decoded as well as the behavior of the external memory interface. Based on this observation, we present the design of a reconﬁgurable 2–D cache architecture which can adjust its parameters in accordance with the characteristics of the video stream being decoded. We validate the Polymorphic ASIC using a proof–of–concept implementation on an FPGA. The performance of H.264 decoder on Polymorphic ASIC is evaluated for uniprocessor, multi processor, hardware accelerated and full hardware conﬁgurations. The scaling in performance delivered by these conﬁgurations shows that the Polymorphic ASIC enables the application to achieve super linear speedups [1]. The experimental results show that different implementations of a H.264 video decoder on the Polymorphic ASIC can deliver performance comparable to a wide spectrum of devices ranging from embedded processor like ARM 9 to MPSoCs like IBM Cell. We also present the energy consumption of various conﬁgurations of video decoders on Polymorphic ASIC and an application to conﬁguration mapping aimed at minimizing the overall energy consumption of a Polymorphic ASIC. Video Decoding Polymorphic ASIC Architecture Multiprocessor System-On-Chip (MPSoC) H.264 Decoder Video Encoding H.264 Decoders H.264 Decoding Computer Science
232	Polymorphic ASIC : For Video Decoding Adarsha Rao, S J January 2013 (has links) (PDF) Video applications are becoming ubiquitous in recent times due to an explosion in the number of devices with video capture and display capabilities. Traditionally, video applications are implemented on a variety of devices with each device targeting a speciﬁc application. However, the advances in technology have created a need to support multiple applications from a single device like a smart phone or tablet. Such convergence of applications necessitates support for interoperability among various applications, scalable performance meet the requirements of different applications and a high degree of reconﬁgurability to accommodate rapid evolution in applications features. In addition, low power consumption requirement is also very stringent for many video applications. The conventional custom hardware implementations of video applications deliver high performance at low power consumption while the recent MPSoC implementations enable high degree of interoperability and are useful to support application evolution. In this thesis, we combine the best features of custom hardware and MPSoC approaches to design a Polymorphic ASIC. A Polymorphic ASIC is an integrated circuit designed to meet the requirements of several applications belonging to a particular domain. A polymorphic ASIC consists of a fabric of computation, storage and communication resources, using which applications are composed dynamically. Although different video applications differ widely in the internal de-tails of operation, at the heart of almost every video application is a video codec (encoder and decoder). The requirements of scalability, high performance and low power consumption are very stringent for video decoding. Therefore this thesis focuses mainly on the architectural design of a Polymorphic ASIC for video decoding. We present an uniﬁed software and hardware architecture (USHA) for Polymorphic ASIC. USHA is a tiled architecture which uses loosely coupled processor and hardware tiles that are software programmable and hardware reconﬁgurable respectively. The distinctive feature of Polymorphic ASIC is the static partitioning of the application and dynamic mapping of ap-plication processes onto the computational tiles. Depending on the application scenarios, a process may be mapped onto one of the hardware or processor tiles. Polymorphic ASIC incor-porates a network–on–chip (NoC) to achieve ﬂexible communication across different tiles. Formulation of a programming framework for Polymorphic ASIC requires an implementation model that captures the structure of video decoder applications as well as the properties of the Polymorphic ASIC architecture. We derive an implementation model based on a combination of parametric polyhedral process networks, stream based functions and windowed dataﬂow models of computation. The implementation model leads to a process network oriented compilation ﬂow that achieves realization agnostic application partitioning and enables seamless migration across uniprocessor, multi–processor, semi hardware and full hardware conﬁgurations of a video decoder. The thesis also presents an application QoS aware scheduler that selects a decoder conﬁguration that best meets the application performance requirements, thereby enabling dynamic performance scaling. The memory hierarchy of Polymorphic ASIC makes use of an application speciﬁc cache. Through a combined analysis of miss rate and external memory bandwidth, we show that the degradation in decoder performance due to memory stall cycles depends on the properties of the video being decoded as well as the behavior of the external memory interface. Based on this observation, we present the design of a reconﬁgurable 2–D cache architecture which can adjust its parameters in accordance with the characteristics of the video stream being decoded. We validate the Polymorphic ASIC using a proof–of–concept implementation on an FPGA. The performance of H.264 decoder on Polymorphic ASIC is evaluated for uniprocessor, multi processor, hardware accelerated and full hardware conﬁgurations. The scaling in performance delivered by these conﬁgurations shows that the Polymorphic ASIC enables the application to achieve super linear speedups [1]. The experimental results show that different implementations of a H.264 video decoder on the Polymorphic ASIC can deliver performance comparable to a wide spectrum of devices ranging from embedded processor like ARM 9 to MPSoCs like IBM Cell. We also present the energy consumption of various conﬁgurations of video decoders on Polymorphic ASIC and an application to conﬁguration mapping aimed at minimizing the overall energy consumption of a Polymorphic ASIC. Video Decoding Polymorphic ASIC Architecture Multiprocessor System-On-Chip (MPSoC) H.264 Decoder Video Encoding H.264 Decoders H.264 Decoding Computer Science
233	Evaluating Vivado High-Level Synthesis on OpenCV Functions for the Zynq-7000 FPGA Johansson, Henrik January 2015 (has links) More complex and intricate Computer Vision algorithms combined with higher resolution image streams put bigger and bigger demands on processing power. CPU clock frequencies are now pushing the limits of possible speeds, and have instead started growing in number of cores. Most Computer Vision algorithms' performance respond well to parallel solutions. Dividing the algorithm over 4-8 CPU cores can give a good speed-up, but using chips with Programmable Logic (PL) such as FPGA's can give even more. An interesting recent addition to the FPGA family is a System on Chip (SoC) that combines a CPU and an FPGA in one chip, such as the Zynq-7000 series from Xilinx. This tight integration between the Programmable Logic and Processing System (PS) opens up for designs where C programs can use the programmable logic to accelerate selected parts of the algorithm, while still behaving like a C program. On that subject, Xilinx has introduced a new High-Level Synthesis Tool (HLST) called Vivado HLS, which has the power to accelerate C code by synthesizing it to Hardware Description Language (HDL) code. This potentially bridges two otherwise very separate worlds; the ever popular OpenCV library and FPGAs. This thesis will focus on evaluating Vivado HLS from Xilinx primarily with image processing in mind for potential use on GIMME-2; a system with a Zynq-7020 SoC and two high resolution image sensors, tailored for stereo vision. FPGA OpenCV Zynq-7000 Vivado HLS Vivado High Level Synthesis HLS Computer Vision Xilinx System on Chip SoC GIMME-2 Stereo Vision Harris Robotics Robotteknik och automation Computer Engineering Datorteknik
234	Détection automatique de chutes de personnes basée sur des descripteurs spatio-temporels : définition de la méthode, évaluation des performances et implantation temps-réel / Automatic human fall detection based on spatio-temporal descriptors : definition of the method, evaluation of the performance and real-time implementation Charfi, Imen 21 October 2013 (has links) Nous proposons une méthode supervisée de détection de chutes de personnes en temps réel, robusteaux changements de point de vue et d’environnement. La première partie consiste à rendredisponible en ligne une base de vidéos DSFD enregistrées dans quatre lieux différents et qui comporteun grand nombre d’annotations manuelles propices aux comparaisons de méthodes. Nousavons aussi défini une métrique d’évaluation qui permet d’évaluer la méthode en s’adaptant à la naturedu flux vidéo et la durée d’une chute, et en tenant compte des contraintes temps réel. Dans unsecond temps, nous avons procédé à la construction et l’évaluation des descripteurs spatio-temporelsSTHF, calculés à partir des attributs géométriques de la forme en mouvement dans la scène ainsique leurs transformations, pour définir le descripteur optimisé de chute après une méthode de sélectiond’attributs. La robustesse aux changements d’environnement a été évaluée en utilisant les SVMet le Boosting. On parvient à améliorer les performances par la mise à jour de l’apprentissage parl’intégration des vidéos sans chutes enregistrées dans l’environnement définitif. Enfin, nous avonsréalisé, une implantation de ce détecteur sur un système embarqué assimilable à une caméra intelligentebasée sur un composant SoC de type Zynq. Une démarche de type Adéquation AlgorithmeArchitecture a permis d’obtenir un bon compromis performance de classification/temps de traitement / We propose a supervised approach to detect falls in home environment adapted to location andpoint of view changes. First, we maid publicly available a realistic dataset, acquired in four differentlocations, containing a large number of manual annotation suitable for methods comparison. We alsodefined a new metric, adapted to real-time tasks, allowing to evaluate fall detection performance ina continuous video stream. Then, we build the initial spatio-temporal descriptor named STHF usingseveral combinations of transformations of geometrical features and an automatically optimised setof spatio-temporal descriptors thanks to an automatic feature selection step. We propose a realisticand pragmatic protocol which enables performance to be improved by updating the training in thecurrent location with normal activities records. Finally, we implemented the fall detection in Zynqbasedhardware platform similar to smart camera. An Algorithm-Architecture Adequacy step allowsa good trade-off between performance of classification and processing time Détection de chute temps réel Descripteurs spatio-temporels Sélection d’attributs SVM Boosting Base de vidéos de chute System on Chip (SoC) No english keyword 006.3 006.6 621.39
235	Obvody s proudovou zpětnou vazbou pro zpracování analogových signálů / Current Feedback Circuits for Analog Signal Processing Stehlík, Jiří January 2009 (has links) This dissertation thesis deals with design of new functional blocks usable in area of analogue signal processing, focusing on sensor signal processing. Versatility of these circuits will find applications in programmable analogue array structures that will be possible to control and configure via a digital signal. Hereby build-up array would be fully a reconfigurable digital control system for sensor signal processing and usable for a wide range of different sensors. It offers possibility to build-up a control code for each specific sensor system, with which it would be possible to achieve optimal results of the entire system and consequently place the system on a chip. The presented programmable array is designed from configurable analogue blocks. The current feedback circuits, which in a suitable configuration can operate in voltage or current mode, are used here. This allows to achieve very good results in the systems with very low power supply, which is closely associated with mobility and autonomous behavioral (that are very important and observed parameters today) of the entire sensor-based framework. The work deals in detail with particular blocks, which are described theoretically and evaluated for using in the programmable analogue array. Design of the structure of programmable analogue array as well the use of these circuits in the part of whole system (that will be realized on a chip) are presented at the end of this thesis.
236	Analyse von Test-Pattern für SoC Multiprozessortest und -debugging mittels Test Access Port (JTAG) Vogelsang, Stefan, Köhler, Steffen, Spallek, Rainer G. 11 June 2007 (has links) Bei der Entwickelung von System-on-Chip (SoC) Debuggern ist es leider hinreichend oft erforderlich den Debugger selbst auf mögliche Fehler zu untersuchen. Da alle ernstzunehmenden Debugger konstruktionsbedingt selbst ein eingebettetes System darstellen, erwächst die Notwendigkeit eine einfache und sicher kontrollierbare Diagnose-Hardware zu entwerfen, welche den Zugang zur Funktionsweise des Debuggers über seine Ausgänge erschließt. Derzeitig ist der Test Access Port (TAP nach IEEE 1149.1-Standard) für viele Integratoren die Grundlage für den Zugriff auf ihre instanzierte Hardware. Selbst in forschungsorientierten Multi- Core System-on-Chip Architekturen wie dem ARM11MP der Firma ARM wird dieses Verfahren noch immer eingesetzt. In unserem Beitrag möchten wir ein Spezialwerkzeug zur Analyse des TAPKommunikationsprotokolles vorstellen, welches den Einsatz teurer Analysetechnik (Logik- Analysatoren) unnötig werden lässt und darüber hinaus eine komfortable, weitergehende Unterstützung für Multi-Core-Systeme bietet. Aufbauend auf der Problematik der Abtastung und Erfassung der Signalzustände am TAP mittels FPGA wird auf die verschiedenen Visualisierungs- und Analyseaspekte der TAPProtokollphasen in einer Multi-Core-Prozessor-Zielsystemumgebung eingegangen. Die hier vorgestellte Lösung ist im Rahmen eines FuE-Verbundprojektes enstanden. Das Vorhaben wird im Rahmen der Technologieförderung mit Mitteln des Europäischen Fonds für regionale Entwicklung (EFRE) 2000-2006 und mit Mitteln des Freistaates Sachsen gefördert. info:eu-repo/classification/ddc/004 ddc:004 info:eu-repo/classification/ddc/500 ddc:500 Eingebettetes System Kommunikationsprotokoll Analyse des TAP-Kommunikationsprotokolls Multi-Core-Systeme System on Chip (SoC)-Debugger
237	Évaluation de dispositifs système-sur-puce pour des applications de type simulateurs temps réel embarqués de systèmes électriques / Evaluation of system-on-chip devices for embedded real-time simulators of electrical systems Tormo Borreda, Daniel 11 July 2018 (has links) L’objectif de ce travail de Thèse est d’évaluer les capacités de composants numérique de type Système-sur-Puce (SoC en anglais) pour l’implantation de Simulateurs Temps Réel Embarqués (ERTS en anglais) de systèmes électromécaniques et d’électronique de puissance. En effet, l’utilisation de ces simulateurs n’est pas seulement limitée aux validations matériel dans la boucle (en anglais Hardware-in-the-Loop ou HIL) du système mais doivent également être embarqués avec le contrôleur afin d’assurer plusieurs fonctionnalités additionnelles comme l'observation, l'estimation, commande sans capteur (ou sensorless), le diagnostic ou la surveillance de la santé, commande tolérante aux défauts, etc.La réalisation de ces simulateurs doit néanmoins considérer plusieurs contraintes à plusieurs niveaux de développement : durant la modélisation de la partie du système à simuler en temps-réel, durant la réalisation numérique et enfin durant l’implantation sur le composant numérique utilisé. Ainsi, le travail réalisé durant cette Thèse s’est focalisé sur ce dernier niveau et l’objectif était d’évaluer les capacités temps/ressources des composants de type SoC pour l’implantation de modules ERTS. Ce type de plateformes intègrent dans un même composant de puissants processeurs, un circuit logique programmable (de type Field-Programmable Gate Array ou FPGA), et d’autres périphériques, ce qui offre plusieurs opportunités d’implantation.Afin de pallier les limitations liées au codage VHDL de la partie FPGA, il existe des outils High-Level Synthesis (HLS) qui permettent de programmer ces dispositifs en utilisant des langages à haut niveau d'abstraction comme C, C++ ou SystemC. De plus, en incluant des directives et contraintes au code source, ces outils peuvent produire des implémentations matérielles différentes (architecture totalement combinatoire, « pipeline », architecture parallélisées ou factorisées, arranger les données et leurs formats pour une meilleure utilisation des ressources de mémoire, etc.).Dans le but d’évaluer ces différentes implantations, deux cas d’études ont été choisis : le premier se compose d’un Générateur Asynchrone à Double Alimentation (GADA) et le second d’un Convertisseur Modulaire Multiniveau (ou Modular Multi-level Converter - MMC). Vu que la GADA a une dynamique basse/moyenne (dynamiques électriques et mécaniques), deux versions d’implantations ont été évaluées : (i) une implantation full-software en utilisant seulement les processeurs ARM; et (ii) une implantation full-hardware en utilisant l’outil HLS pour programmer la partie FPGA. Ces deux versions ont été évaluées avec différentes optimisations du compilateur et trois formats de données: 64/32-bit en virgule flottante, et 32-bit en virgule flottante. L’approche mixe software/hardware a également été évaluée à travers la caractérisation des transferts de données entre le processeur et l’IP ERTS implantée dans la partie FPGA. Quant au convertisseur MMC, sa complexité et sa forte dynamique (dynamique de commutation) impose une implantation exclusivement full-hardware. Celle-ci a également été réalisée à base d’outils HLS.Enfin pour la validation expérimentale de ce travail de Thèse, une maquette à base de convertisseur MMC a été construite dans le but de comparer des mesures du système réel avec les résultats fournis par l’IP ERTS. / This Doctoral Thesis is a detailed study of how suitable System-on-Chip (SoC) devices are for implementing Embedded Real-Time Simulators (ERTS) of electromechanical and power electronic systems. This emerging class of Real-Time Simulators (RTS) are not only expected for Hardware-in-the-Loop (HIL) validations of systems; but they also have to be embedded within the controller to play several roles like observers, parameter estimation, diagnostic, health monitoring, fault-tolerant and sensorless control, etc.The design of these Intellectual Properties (IP) must rigorously consider a set of constraints at different development stages: (i) during the modeling of the system to be real-time simulated; (ii) during the digital realization of the IP; and also (iii) during its final implementation in the digital platform. Thus, the conducted work of this Thesis focuses specially on this last stage and its aim is to evaluate the time/resource performances of recent SoC devices and study how suitable they are for implementing ERTSs. These kind of digital platforms combine powerful general purpose processors, a Field-Programmable Gate Array (FPGA) and other peripherals which make them very convenient for controlling and monitoring a complete system.One of the limitations of these devices is that control engineers are not particularly familiarized with FPGA programming, which needs extensive expertise in order to code these highly sophisticated algorithms using Hardware Description Languages (HDL). Notwithstanding, there exist High-Level Synthesis (HLS) tools which allow to program these devices using more generic programming languages such as C, C++ or SystemC. Moreover, by inserting directives and constraints to the source code, these tools can produce different hardware implementations (e.g. full-combinatorial design, pipelined design, parallel or factorized design, partition or arrange data for a better utilisation of memory resources, etc.).This dissertation is based on the implementation of two representative applications that are well known in our laboratory: a Doubly-fed Induction Generator (DFIG) commonly used as wind turbines; and a Modular Multi-level Converter (MMC) that can be arranged in different configurations and utilized for many different energy conversion purposes. Since the DFIG has low/medium system dynamics (electrical and mechanical ones), both a full-software implementation using solely the ARM processor and a full-hardware implementation using HLS to program the FPGA will be evaluated with different design optimizations and data formats (64/32-bit floating-point and 32-bit fixed-point). Moreover, it will also be investigated whether a system of these characteristics is interesting to be run as a hardware accelerator. Different data transfer options between the Processor System (PS) and the Programmable Logic (PL) have been studied as well for this matter. Conversely, because of its harsh dynamics (switching dynamics), the MMC will be implemented only with a full-hardware approach using HLS tools, as well.For the experimental validation of this Thesis work, a complete MMC test bench has been built from scratch in order to compare the real-world results with its SoC ERTS implementation. Simulateurs temps réel embarqués Système-Sur-Puce Synthèse de haut niveau Convertisseur Modulaire Multiniveaux Field-Programmable Gate Array Embedded Real-Time Simulator System-On-Chip High-Level Synthesis Modular Multi-Level Converter Field-Programmable Gate Array
238	High-gain metasurface in polyimide on-chip antenna based on CRLH-TL for sub-terahertz integrated circuits Alibakhshikenari, M., Virdee, B.S., See, C.H., Abd-Alhameed, Raed, Falcone, F., Limiti, E. 05 August 2020 (has links) Yes / This paper presents a novel on-chip antenna using standard CMOS-technology based on metasurface implemented on two-layers polyimide substrates with a thickness of 500 μm. The aluminium ground-plane with thickness of 3 μm is sandwiched between the two-layers. Concentric dielectric-rings are etched in the ground-plane under the radiation patches implemented on the top-layer. The radiation patches comprise concentric metal-rings that are arranged in a 3 × 3 matrix. The antennas are excited by coupling electromagnetic energy through the gaps of the concentric dielectric-rings in the ground-plane using a microstrip feedline created on the bottom polyimide-layer. The open-ended feedline is split in three-branches that are aligned under the radiation elements to couple the maximum energy. In this structure, the concentric metal-rings essentially act as series left-handed capacitances CL that extend the effective aperture area of the antenna without affecting its dimensions, and the concentric dielectric rings etched in the ground-plane act as shunt left-handed inductors LL, which suppress the surface-waves and reduce the substrates losses that leads to improved bandwidth and radiation properties. The overall structure behaves like a metasurface that is shown to exhibit a very large bandwidth of 0.350–0.385 THz with an average radiation gain and efficiency of 8.15dBi and 65.71%, respectively. It has dimensions of 6 × 6 × 1 mm3 that makes it suitable for on-chip implementation. / This work is partially supported by RTI2018-095499-B-C31, Funded by Ministerio de Ciencia, Innovación y Universidades, Gobierno de España (MCIU/AEI/FEDER,UE), and innovation programme under grant agreement H2020-MSCA-ITN-2016 SECRET-722424 and the fnancial support from the UK Engineering and Physical Sciences Research Council (EPSRC) under grant EP/E022936/1. / Research Development Fund Publication Prize Award winner, March 2020 On-chip antenna Terahertz (THz) Metasurface Metamaterials Polyimide substrate Electromagnetic coupling Wide bandwidth System-on-chip (SoC) Artificial magnetic conductor (AMC)
239	Novel Methodologies for Efficient Networks-on-Chip Implementation on Reconfigurable Devices Sethuraman, Balasubramanian January 2007 (has links) No description available. Networks-on-Chip (NoC) System-on-Chip (SoC) FPGA Reconfigurable and Platform-Based Design Light Weight Router Design Multi Local Port Router Multicast Router Power Issues and IR drop Analysis
240	Polymer-Ceramic Composites for Conformal Multilayer Antenna and RF Systems Zhou, Yijun 09 September 2009 (has links) No description available. Electrical Engineering Electromagnetism Materials Science Radiation polymer-ceramic composites carbon nanotube sheet antenna design anti-jamming GPS array cylindrically conformal microstrip array flexible electronics multilayer integration system-on-chip packaging

Search results