Global ETD Search

1	Algorithmes parallèles et architectures évolutives de faible complexité pour systèmes optiques OFDM cohérents temps réel / Low-Complexity Parallel Algorithms and Scalable Architectures for Real-Time Coherent Optical OFDM Systems Udupa, Pramod 19 June 2014
In this thesis, low-complexity algorithms and efficient parallel architectures for coherent optical OFDM (CO-OFDM) systems are explored. First, low-complexity algorithms for estimating timing and carrier frequency offset (CFO) in a dispersive channel are studied. A novel low-complexity timing synchronization algorithm, which can withstand a large amount of dispersive delay, is proposed and compared with previous proposals. Then, the problem of realizing a low-cost parallel architecture is studied. A generalized, scalable parallel architecture, which can be used to realize any auto-correlation algorithm, is proposed. It is then extended to handle multiple parallel samples from the analog-to-digital converter (ADC) and to provide outputs at a rate that matches the ADC input rate. The scalability of the architecture to a higher number of parallel outputs and to different kinds of auto-correlation algorithms is explored.
An algorithm-architecture matching approach is then applied to the entire CO-OFDM transceiver chain. At the transmitter side, a radix-2² algorithm is chosen for the IFFT and a parallel Multipath Delay Commutator (MDC) feed-forward (FF) architecture is designed, which consumes fewer resources than radix-2/4 MDC FF architectures. At the receiver side, an efficient algorithm for integer CFO estimation is adopted and realized without the use of complex multipliers. The reduction in hardware complexity is achieved through efficient architectures for timing synchronization, the FFT and integer CFO estimation. A fixed-point analysis of the entire transceiver chain explores the trade-off between computational precision and hardware complexity: it identifies the fixed-point-sensitive blocks, which affect the bit error rate (BER) significantly, and finds operating points that do not degrade the BER significantly.
The proposed algorithms are validated in two ways. Off-line optical experiments use an arbitrary waveform generator (AWG) at the transmitter and, at the receiver, a digital storage oscilloscope (DSO) at the output of the coherent detection, with processing in Matlab. The hardware implementation of the proposed synchronization algorithm is validated on a real-time transceiver based on FPGA platforms and signal converters. BER plots are used to show the validity of the system built and to report its performance.
OFDM; Coherent detection; Low-complexity algorithms; Scalable parallel architectures; FPGA; Signal converters; Fixed-point computation
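Illustrative sketch (not taken from the thesis): the abstract above refers to auto-correlation algorithms for timing synchronization. The Python example below shows one well-known metric of that family, a Schmidl-Cox style metric for a preamble made of two identical halves. The function name `autocorr_timing_metric`, the preamble structure and the `half_len` parameter are assumptions made for illustration; the algorithm proposed in the thesis is not reproduced here. The recursive update adds one new product and drops one old product per output sample, which is the property that keeps sliding auto-correlators cheap.

```python
def autocorr_timing_metric(r, half_len):
    """Return M(d) = |P(d)|^2 / R(d)^2 for every candidate start index d.

    P(d) = sum_{k=0}^{L-1} conj(r[d+k]) * r[d+k+L]   (auto-correlation)
    R(d) = sum_{k=0}^{L-1} |r[d+k+L]|^2              (energy of second half)
    with L = half_len. Both sums are updated recursively.
    """
    L = half_len
    n = len(r) - 2 * L + 1
    if L < 1 or n <= 0:
        raise ValueError("signal must hold at least one full preamble")

    # Window sums for d = 0, computed directly once.
    P = sum(r[k].conjugate() * r[k + L] for k in range(L))
    R = sum(abs(r[k + L]) ** 2 for k in range(L))

    metric = []
    for d in range(n):
        metric.append(abs(P) ** 2 / (R * R) if R > 1e-12 else 0.0)
        if d + 1 < n:
            # Slide the window by one sample: add the newest term,
            # subtract the oldest one.
            P += (r[d + L].conjugate() * r[d + 2 * L]
                  - r[d].conjugate() * r[d + L])
            R += abs(r[d + 2 * L]) ** 2 - abs(r[d + L]) ** 2
    return metric
```

For a noise-free, constant-envelope signal the metric is bounded by 1 and reaches that value where the two identical halves begin, so the index of the maximum can serve as a timing estimate. In a dispersive channel the peak broadens, which is the difficulty the abstract alludes to.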
2	Simulations des écoulements sanguins dans des réseaux vasculaires complexes / Modeling of blood flow in real vascular networks Tarabay, Ranine 26 September 2016
Over the past few decades, remarkable progress has been made in simulating blood flow in realistic anatomical models constructed from three-dimensional medical imaging data, working towards a large-scale 3D computational model of physiological hemodynamics. While accurate anatomical models are of primary importance in simulating blood flow, realistic boundary conditions are equally important, especially when computing velocity and pressure fields. The first aim of this thesis was therefore to study the convergence of the unknown fields for various types of boundary conditions, allowing for a framework that is flexible with respect to the type of input data (velocity, pressure, flow rate, ...). To deal with the associated large computational cost, which requires high-performance computing, we compared the performance of two block preconditioners: the least-squares commutator (LSC) preconditioner and the pressure convection-diffusion (PCD) preconditioner. In the context of this thesis, we implemented the latter in the Feel++ library. To handle fluid-structure interaction, we focused on the approximation of the force exerted by the fluid on the structure, a field that is essential in the continuity condition that couples the fluid model with the structure model. Finally, to assess our numerical choices, two benchmarks (the FDA benchmark and the Phantom benchmark) were carried out, and a comparison with experimental and numerical data was established and validated.
Navier-Stokes equations; Large scale simulations; Computational fluid dynamics; Scalable parallel preconditioners; Experimental validation; Feel++; 511.8; 532.5
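Illustrative sketch (not taken from the thesis): the abstract above mentions boundary conditions driven by different kinds of input data, including flow rate. One textbook way to turn a prescribed flow rate into a velocity boundary condition on a circular inlet is a parabolic (Poiseuille) profile. The Python example below assumes a rigid circular inlet and fully developed laminar flow; the function names and these assumptions are hypothetical and say nothing about how the thesis or Feel++ imposes its conditions.

```python
import math


def poiseuille_velocity(r, radius, flow_rate):
    """Axial velocity at distance r from the axis of a circular inlet.

    The profile is v(r) = 2 * Q / (pi * R^2) * (1 - (r / R)^2), so the peak
    velocity on the axis is twice the mean velocity Q / (pi * R^2).
    """
    if radius <= 0.0:
        raise ValueError("radius must be positive")
    if not 0.0 <= r <= radius:
        raise ValueError("r must lie inside the inlet")
    mean_velocity = flow_rate / (math.pi * radius ** 2)
    return 2.0 * mean_velocity * (1.0 - (r / radius) ** 2)


def integrated_flow_rate(radius, flow_rate, rings=1000):
    """Integrate the profile over the inlet with a midpoint rule on rings."""
    dr = radius / rings
    total = 0.0
    for i in range(rings):
        r = (i + 0.5) * dr
        # Each thin ring has area 2 * pi * r * dr.
        total += poiseuille_velocity(r, radius, flow_rate) * 2.0 * math.pi * r * dr
    return total
```

Integrating the profile over the cross-section should return the prescribed flow rate, which gives a quick consistency check before the profile is handed to a solver.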
3	Algorithm And Architecture Design for Real-time Face Recognition Mahale, Gopinath Vasanth January 2016
Face recognition (FR) is a field of biometrics that deals with the identification of subjects based on features present in images of their faces. The factors that make face recognition popular and favored over other biometric methods are its easier operation and its ability to identify subjects without their knowledge. With these features, face recognition has become an integral part of present-day security systems, targeting a smart and secure world. Various factors define the performance of a face recognition system. The most important among them are the recognition accuracy of the algorithm used and the time taken for recognition. The recognition accuracy of a face recognition algorithm is affected by changes in pose, facial expression and illumination, as well as by occlusions in the images. A number of algorithms have been proposed to enable recognition under these ambient changes. However, it has been hard to find a single algorithm that can efficiently recognize faces in all the above-mentioned conditions. Moreover, achieving real-time performance for most of the complex face recognition algorithms on embedded platforms has been a challenge. Real-time performance is highly preferred in critical applications such as the identification of crime suspects in public. As available software solutions for FR have significantly large latency in recognizing individuals, they are not suitable for such critical real-time applications. This thesis focuses on the real-time aspect of FR, where acceleration of the algorithms is achieved by means of parallel hardware architectures. The major contributions of this work are as follows. We aim to design a face recognition system that can identify up to 30 faces in each frame of video at 15 frames per second, which amounts to up to 450 recognitions per second.
In addition, we aim to achieve good recognition accuracy along with scalability in terms of database size and input image resolution. To design a system with these specifications, as a first step, we explore algorithms in the literature and come up with a hybrid face recognition algorithm. This hybrid algorithm shows good recognition accuracy on face images with changes in illumination, pose and expression, and also with occlusions. In addition, the computations in the algorithm are modular in nature, which makes them suitable for real-time realization through parallel processing. The face recognition system consists of a face detection module that detects faces in the input image, followed by a face recognition module that identifies the detected faces. There are well-established algorithms and architectures for face detection in the literature that can perform detection at 15 frames per second on video frames. Detected faces of different sizes need to be scaled to the size specified by the face recognition module. To meet the real-time constraints, we propose a hardware architecture for real-time bi-cubic convolution interpolation with dynamic scaling factors. To recognize the resized faces in real time, a scalable parallel pipelined architecture is designed for the hybrid algorithm, which can perform 450 recognitions per second on a database containing grayscale images of at most 450 classes on a Virtex 6 FPGA. To provide flexibility and programmability, we extend this design to REDEFINE, a multi-core massively parallel reconfigurable architecture. In this design, we come up with FR-specific programmable cores termed Scalable Unit for Region Evaluation (SURE), capable of performing the modular computations in the hybrid face recognition algorithm. We replicate SUREs in each tile of REDEFINE to construct a face recognition module termed REDEFINE for Face Recognition using SURE Homogeneous Cores (REFRESH).
Practical face recognition systems need to learn new, unseen faces on-line. Considering this, for real-time on-line learning of unseen face images, we design tiny processors termed VOP, Processor for Vector Operations. VOPs function as coprocessors to process elements under each tile of REDEFINE to accelerate micro vector operations appearing in the synaptic weight computations. We also explore deep neural networks, which operate similarly to the processing in the human brain and are capable of working on very large face databases. We explore the field of random matrix theory to come up with a solution for synaptic weight initialization in deep neural networks for better classification. In addition, we perform design space exploration of hardware architectures for deep convolutional networks and conclude with directions for future work.
Real-time Face Recognition; Biometrics; Face Recognition; Multi-Core Architecture; Random Matrix Theory; Deep Convolutional Neural Networks; Neural Networks; Image Scaling; Computer Architecture; Processor Architecture; Online Learning, Neural Networks; VOP; Vector Processors; Real-time FR; Electronic Engineering
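Illustrative sketch (not taken from the thesis): the abstract above mentions bi-cubic convolution interpolation with dynamic scaling factors for resizing detected faces. The Python example below shows the one-dimensional building block using the cubic convolution kernel of Keys with the common parameter choice a = -0.5; applying it along rows and then along columns gives a bi-cubic resize. The function names, the edge clamping and the endpoint-aligned sampling grid are assumptions made for illustration, and the hardware architecture of the thesis is not reproduced.

```python
import math


def keys_kernel(x, a=-0.5):
    """Cubic convolution kernel of Keys; non-zero only for |x| < 2."""
    x = abs(x)
    if x <= 1.0:
        return (a + 2.0) * x ** 3 - (a + 3.0) * x ** 2 + 1.0
    if x < 2.0:
        return a * x ** 3 - 5.0 * a * x ** 2 + 8.0 * a * x - 4.0 * a
    return 0.0


def cubic_sample(samples, x, a=-0.5):
    """Interpolate `samples` at fractional position x using four taps."""
    base = math.floor(x)
    value = 0.0
    for k in range(base - 1, base + 3):
        # Clamp indices at the borders (replicates the edge samples).
        idx = min(max(k, 0), len(samples) - 1)
        value += samples[idx] * keys_kernel(x - k, a)
    return value


def resize_1d(samples, out_len, a=-0.5):
    """Resample a row to out_len points; the scale factor is arbitrary."""
    if not samples or out_len < 1:
        raise ValueError("need a non-empty input and a positive output length")
    if out_len == 1:
        return [float(samples[0])]
    # Endpoint-aligned grid: first and last outputs sit on the first and
    # last inputs.
    step = (len(samples) - 1) / (out_len - 1)
    return [cubic_sample(samples, i * step, a) for i in range(out_len)]
```

Because the scale factor only enters through `step`, the same four-tap computation serves any output size, which is what makes this kind of interpolation a natural fit for faces of varying detected sizes. With a = -0.5 the kernel reproduces constant and linear signals exactly away from the clamped borders.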
