About
The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
401

Hardware and software co-design toward flexible terabits per second traffic processing

Cornevaux-Juignet, Franck 04 July 2018
The reliability and security of communication networks require efficient components to finely analyze data traffic. Service diversification and throughput increase force network operators to constantly improve analysis systems in order to handle throughputs of hundreds, even thousands, of gigabits per second. Commonly used software solutions offer flexibility and accessibility that are welcome for network operators, but they can no longer meet these strong constraints in many critical cases. This thesis studies architectural solutions based on programmable chips such as Field-Programmable Gate Arrays (FPGAs), which combine computation power and processing flexibility. Boards equipped with such chips are integrated into a common software/hardware processing flow in order to compensate for the shortcomings of each element. Network components developed with this innovative approach guarantee exhaustive processing of the packets transmitted on physical links while keeping the flexibility of usual software solutions, which is unique in the state of the art. This approach is validated by the design and implementation of a flexible packet processing architecture on FPGA. It is able to process any packet type at the cost of a slight overconsumption of resources, and it is fully customizable from software. With the proposed solution, network engineers can transparently use the processing power of a hardware accelerator without prior knowledge of digital circuit design.
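The thesis itself gives no implementation details here, but the idea of a packet pipeline that is "fully customizable from software" can be illustrated with a match-rule table that software pushes to the hardware. The sketch below is a hypothetical software model of that configuration interface; the `Rule` format, field offsets, and first-match semantics are illustrative assumptions, not the thesis design:

```python
from dataclasses import dataclass

@dataclass
class Rule:
    """One match rule as software might push it to the hardware:
    compare `width` bytes at byte `offset` of the packet to `value`."""
    offset: int
    width: int
    value: bytes
    label: str

def classify(packet: bytes, rules: list[Rule]) -> str:
    """Return the label of the first matching rule (default otherwise).
    A hardware pipeline would evaluate all rules in parallel; this loop
    is the sequential software model of the same semantics."""
    for r in rules:
        if packet[r.offset:r.offset + r.width] == r.value:
            return r.label
    return "default"

# Example: in an Ethernet/IPv4 frame, flag TCP (protocol byte 6) before
# the broader IPv4 rule (EtherType 0x0800), since the first match wins.
rules = [
    Rule(offset=23, width=1, value=b"\x06", label="tcp"),
    Rule(offset=12, width=2, value=b"\x08\x00", label="ipv4"),
]
```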
402

Development and application of an enhanced sampling molecular dynamics method to the conformational exploration of biologically relevant molecules

Alibay, Irfan January 2017
This thesis describes the development of a new swarm-enhanced sampling methodology and its application to the exploration of biologically relevant molecules. First, the development of a new multi-dimensional swarm-enhanced sampling molecular dynamics (msesMD) approach is detailed. Relative to the original swarm-enhanced sampling molecular dynamics (sesMD) methodology, the msesMD method demonstrates improved parameter transferability, resulting in more extensive sampling when scaling to larger systems such as alanine heptapeptide. The implementation and optimisation of the swarm-enhanced sampling algorithms in the AMBER software suite are also described: through the use of the newer pmemd molecular dynamics (MD) engine and asynchronous MPI routines, speedups of up to three times over the original sesMD implementation were achieved. The msesMD method is then applied to the investigation of carbohydrates, first looking at rare conformational changes in Lewis oligosaccharides. Validated against multi-microsecond unbiased MD trajectories and other enhanced sampling methods, the msesMD simulations identified rare conformational changes leading to the adoption of non-canonical unstacked core trisaccharide structures. Next, the use of msesMD as a tool to probe pyranose ring pucker events is explored. Evaluated against four benchmark monosaccharide systems, msesMD simulations accurately recover puckering details not easily obtained via multi-microsecond unbiased MD. This was followed by an exploration of the impact of ring substituents on conformation in glycosaminoglycan monosaccharides: through msesMD simulations, the influence of specific sulfation patterns was explored, finding that in some cases, such as 4-O-sulfation in N-acetyl-galactosamine, large changes in the relative stability of ring conformers can arise. Finally, the msesMD method was coupled with a thermodynamic integration scheme and used to evaluate solvation free energies for small-molecule systems. Comparison against independent-trajectory TI simulations showed that although the correct solvation free energies were obtained, the msesMD-based method did not offer an advantage over unbiased MD for these small-molecule systems. However, interesting discrepancies in free energy estimates arising from the use of hydrogen mass repartitioning were found.
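As a rough illustration of the swarm idea (replicas that feel a biasing force derived from the positions of the other replicas in collective-variable space, encouraging the swarm to spread over conformations), here is a toy sketch; the Gaussian repulsion used below is an assumption chosen for illustration and is not the actual sesMD/msesMD potential:

```python
import numpy as np

def swarm_bias_forces(cvs: np.ndarray, strength: float = 1.0,
                      width: float = 0.5) -> np.ndarray:
    """Toy inter-replica bias: each replica is pushed away from every
    other replica in collective-variable (CV) space by the gradient of
    a repulsive Gaussian, spreading the swarm over CV space.

    cvs: (n_replicas, n_cvs) array of current CV values.
    Returns an array of the same shape with the bias force per replica.
    """
    n = cvs.shape[0]
    forces = np.zeros_like(cvs)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = cvs[i] - cvs[j]                  # displacement in CV space
            r2 = np.dot(d, d)
            # Force = -grad of U = strength * exp(-r^2 / 2 width^2)
            forces[i] += strength * d / width**2 * np.exp(-r2 / (2 * width**2))
    return forces
```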
403

Distributed algorithm for multiple resource allocation in a distributed environment

Francisco Ribacionka 07 June 2013
When considering a distributed system composed of a set of servers, clients, and resources, characteristic of environments such as computational grids or clouds that offer a large number of distributed resources (CPUs or virtual machines, for example) used jointly by different types of applications, a solution for allocating these resources is needed. Such an allocation mechanism must satisfy all resource requests from the applications, allocate resources efficiently, guarantee fairness in the case of simultaneous requests from several clients, and answer requests in finite time. In this large-scale distributed context, this work proposes a distributed algorithm for resource allocation. The algorithm applies fuzzy logic whenever a server is unable to satisfy a request made by a client, forwarding the request to a remote server. It uses the concept of logical clocks to guarantee fairness in serving the requests made to all servers that share resources. The algorithm follows a distributed model: a copy of the algorithm runs on each server that shares resources with its clients, and all servers take part in the decisions regarding the allocation of these resources. The strategy aims to minimize the response time of resource allocation, acting as a load balancer in a client-server environment with a high rate of resource requests from clients. The efficiency of the algorithm was demonstrated through its implementation and comparison with traditional algorithms, showing that a single request can use resources belonging to different servers, with the guarantee that the request will be served, and in finite time.
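The abstract's combination of logical clocks and fair request ordering rests on a standard building block: Lamport timestamps with ties broken by server id. The sketch below illustrates that well-known mechanism in miniature (the class and method names are assumptions for illustration, not the thesis code):

```python
import heapq

class Server:
    """Minimal model of one server ordering resource requests fairly by
    Lamport logical-clock timestamps, with ties broken by server id."""

    def __init__(self, server_id: int):
        self.id = server_id
        self.clock = 0
        self.queue: list[tuple[int, int, str]] = []  # (timestamp, origin, request)

    def local_request(self, request: str) -> tuple[int, int, str]:
        self.clock += 1                       # local event: tick the clock
        entry = (self.clock, self.id, request)
        heapq.heappush(self.queue, entry)     # queue ordered by (time, id)
        return entry                          # would be broadcast to peers

    def receive(self, entry: tuple[int, int, str]) -> None:
        ts, _, _ = entry
        self.clock = max(self.clock, ts) + 1  # Lamport merge rule
        heapq.heappush(self.queue, entry)

    def next_to_serve(self) -> tuple[int, int, str]:
        return heapq.heappop(self.queue)      # same order on every server
```

Because every server merges timestamps the same way, all replicas agree on the service order of concurrent requests, which is what gives the fairness guarantee the abstract describes.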
404

Resilience in Distributed Workflow Systems for Numerical Optimization Applications: Design and Experiments / Collaborative platform for multidiscipline optimization

Trifan, Laurentiu 21 October 2013
This thesis aims at designing an environment for high-performance computing in a numerical optimization context. The design and optimization tools are distributed across several remote teams, both academic and industrial, which collaborate within the same projects. The tools should be federated within a common environment to give researchers and engineers easy access to them. The environment we propose, to meet the above conditions, consists of a workflow system and a distributed computing system. The former aims to ease the design of the application, while the latter is responsible for execution on distributed computing resources. Of course, communication services between the two systems must be developed. The computations must be performed efficiently, taking into account the internal parallelism of some codes, synchronous or asynchronous task execution, data transfers, and the available hardware and software resources (load balancing, for example). In addition, the environment must provide a good level of tolerance to hardware faults and software failures, to minimize their influence on the final result or on the computation time. One important requirement in particular is to implement error-recovery mechanisms such that the extra time spent handling errors remains far below the total re-execution time. In this work, our choice fell on the Yawl workflow engine, which has good characteristics in terms of i) hardware and software independence (a client-server system that can run on heterogeneous hardware) and ii) error-recovery mechanisms. For the distributed computing part, our experiments were performed on the Grid5000 platform, using up to 64 different machines spread over five geographical sites. This document details the design of this environment and the extensions and changes we had to make to Yawl to enable it to run on a distributed platform.
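The requirement that error handling cost far less than full re-execution is the classic argument for checkpointing at task granularity: on failure, only the work after the last checkpoint is redone. The sketch below illustrates that general pattern (a generic illustration, not Yawl's actual recovery mechanism; the file format is an assumption):

```python
import json
import os

def run_workflow(tasks, checkpoint_path="workflow.ckpt"):
    """Run (name, callable) tasks in order, persisting completed task
    names so that after a crash the rerun resumes at the first
    unfinished task instead of re-executing the whole workflow."""
    done = set()
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            done = set(json.load(f))          # recover prior progress
    for name, task in tasks:
        if name in done:
            continue                          # finished before the crash
        task()                                # may raise; rerun resumes here
        done.add(name)
        with open(checkpoint_path, "w") as f:
            json.dump(sorted(done), f)        # checkpoint after each task
    os.remove(checkpoint_path)                # clean up on full success
```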
405

Surface Modified Capillaries in Capillary Electrophoresis Coupled to Mass Spectrometry: Method Development and Exploration of the Potential of Capillary Electrophoresis as a Proteomic Tool

Zuberovic, Aida January 2009
The increased knowledge about the complexity of physiological processes increases the demands on the analytical techniques employed to explore them. A comprehensive analysis of the entire sample content is today the most common approach to investigate the molecular interplay behind a physiological deviation. For this purpose, a method is required that offers a number of important properties: speed and simplicity, high resolution and sensitivity, minimal sample volume requirements, cost efficiency and robustness, the possibility of automation, high throughput, and a wide application range. Capillary electrophoresis (CE) coupled to mass spectrometry (MS) has great potential and fulfils many of these criteria. However, further developments and improvements of these techniques and their combination are required to meet the challenges of complex biological samples. Protein analysis using CE is a challenging task due to protein adsorption to the negatively charged fused-silica capillary wall, which becomes more pronounced with increasing basicity and size of proteins and peptides. In this thesis, the adsorption problem was addressed by using an in-house developed, physically adsorbed polyamine coating named PolyE-323. The coating procedure is fast and simple, and it generates a coating that is stable over a wide pH range, 2-11. By coupling PolyE-323-modified capillaries to MS, using either electrospray ionisation (ESI) or matrix-assisted laser desorption/ionisation (MALDI), successful analyses of peptides, proteins, and complex samples, such as protein digests and crude human body fluids, were obtained. The possibilities of using CE-MALDI-MS/MS as a proteomic tool, combined with proper sample preparation, are further demonstrated by applying high-abundance protein depletion in combination with a peptide derivatisation step or isoelectric focusing (IEF). These approaches were applied to profiling the proteomes of human cerebrospinal fluid (CSF) and human follicular fluid (hFF), respectively. Finally, a multiplexed quantitative proteomic analysis was performed on a set of ventricular cerebrospinal fluid (vCSF) samples from a patient with traumatic brain injury (TBI) to follow relative changes in protein patterns during the recovery process. The results presented in this thesis confirm the potential of CE, in combination with MS, as a valuable choice for the analysis of complex biological samples and clinical applications.
406

A model of dynamic compilation for heterogeneous compute platforms

Kerr, Andrew 10 December 2012
Trends in computer engineering place renewed emphasis on increasing parallelism and heterogeneity. The rise of parallelism adds an additional dimension to the challenge of portability, as different processors support different notions of parallelism, whether vector parallelism executing in a few threads on multicore CPUs or large-scale thread hierarchies on GPUs. Thus, software experiences obstacles to portability and efficient execution beyond differences in instruction sets; rather, the underlying execution models of radically different architectures may not be compatible. Dynamic compilation applied to data-parallel heterogeneous architectures presents an abstraction layer decoupling program representations from optimized binaries, thus enabling portability without encumbering performance. This dissertation proposes several techniques that extend dynamic compilation to data-parallel execution models. These contributions include:
- characterization of data-parallel workloads
- machine-independent application metrics
- a framework for performance modeling and prediction
- execution model translation for vector processors
- region-based compilation and scheduling

We evaluate these claims via the development of a novel dynamic compilation framework, GPU Ocelot, with which we execute real-world GPU computing workloads. This enables GPU computing workloads to run efficiently on multicore CPUs, GPUs, and a functional simulator. We show that data-parallel workloads exhibit performance scaling, take advantage of vector instruction set extensions, and effectively exploit data locality via scheduling that attempts to maximize control locality.
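The core of "execution model translation" can be pictured as serializing a GPU's implicit thread hierarchy into explicit loops on a CPU. The toy sketch below illustrates that idea only; it is not Ocelot's actual PTX translation path, and the function names are assumptions:

```python
import numpy as np

def saxpy_kernel(tid, a, x, y, out):
    """Body written as if it were one GPU thread: `tid` plays the role
    of a flattened (blockIdx, threadIdx) global thread id."""
    out[tid] = a * x[tid] + y[tid]

def launch_on_cpu(kernel, grid_size, block_size, *args):
    """Execution-model translation in miniature: the GPU's implicit
    parallel loop over threads becomes an explicit loop nest on the
    CPU, preserving the kernel's semantics."""
    for block in range(grid_size):
        for thread in range(block_size):
            kernel(block * block_size + thread, *args)

n = 1024
x, y, out = np.arange(n, dtype=float), np.ones(n), np.empty(n)
launch_on_cpu(saxpy_kernel, n // 256, 256, 2.0, x, y, out)
```

A real translator additionally handles barriers, shared memory, and divergent control flow, which is where region-based compilation and scheduling come in.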
407

Execution of Distributed Database Queries on an HPC System

Onder, Ibrahim Seckin 01 January 2010
The increasing performance of computers and the ability to connect them with high-speed communication networks make distributed database systems an attractive research area. In this study, we evaluate the communication and data processing capabilities of an HPC machine. We calculate accurate cost formulas for high-volume data communication between processing nodes and experimentally measure sorting times. A left-deep query plan executor has been implemented and used to execute plans generated by two different genetic algorithms for a distributed database environment, using the message passing paradigm, to show that a parallel system can provide scalable performance by increasing the number of nodes used for storing database relations and for processing. We compare the performance of plans generated by the genetic algorithms with optimal plans generated by an exhaustive search algorithm. Our results verify that the optimal plans are better than those of the genetic algorithms, as expected.
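As a generic illustration of searching left-deep join orders with a genetic algorithm (the permutation encoding, crossover, and mutation below are textbook choices, not the thesis's specific algorithms, and the cost function is a toy):

```python
import random

def ga_join_order(relations, cost, generations=200, pop_size=30,
                  mutation_rate=0.2):
    """Search for a cheap left-deep join order. An individual is a
    permutation of relation names; cost(order) estimates the execution
    cost of the corresponding left-deep plan."""
    pop = [random.sample(relations, len(relations)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=cost)
        survivors = pop[: pop_size // 2]           # truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, len(relations))
            # Order crossover: prefix of a, remainder in b's order.
            child = a[:cut] + [r for r in b if r not in a[:cut]]
            if random.random() < mutation_rate:    # swap mutation
                i, j = random.sample(range(len(relations)), 2)
                child[i], child[j] = child[j], child[i]
            children.append(child)
        pop = survivors + children
    return min(pop, key=cost)

# Toy cost: weight each relation's cardinality by its position.
sizes = {"A": 1000, "B": 10, "C": 500, "D": 50}
best = ga_join_order(list(sizes),
                     cost=lambda order: sum((i + 1) * sizes[r]
                                            for i, r in enumerate(order)))
```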
408

Parallel Solution of Soil-Structure Interaction Problems on PC Clusters

Bahcecioglu, Tunc 01 February 2011
Numerical assessment of soil-structure interaction problems requires heavy computational effort because of the dynamic and iterative (nonlinear) nature of the problems. Furthermore, modeling soil-structure interaction may require
409

High-performance direct solution of finite element problems on multi-core processors

Guney, Murat Efe 04 May 2010
A direct solution procedure is proposed and developed which exploits the parallelism that exists in current symmetric multiprocessing (SMP) multi-core processors. Several algorithms are proposed and developed to improve the performance of the direct solution of FE problems. A high-performance sparse direct solver is developed which allows experimentation with the newly developed and existing algorithms. The performance of the algorithms is investigated using a large set of FE problems, and operation count estimates are developed to further assess the various algorithms. An out-of-core version of the solver is developed to reduce the memory requirements of the solution. I/O is performed asynchronously, without blocking the thread that makes the I/O request; asynchronous I/O allows overlapping factorization and triangular solution computations with I/O. The performance of the developed solver is demonstrated on a large number of test problems. A problem with nearly 10 million degrees of freedom is solved on a low-priced desktop computer using the out-of-core version of the direct solver. Furthermore, the developed solver usually outperforms a commonly used shared-memory solver.
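The overlap of factorization with I/O described above is the standard double-buffering pattern: issue the read of the next block on a background thread while computing on the current one. A minimal sketch of that generic pattern follows (an illustration of the idea, not the thesis code; `read_block` and `factorize` are assumed callables):

```python
from concurrent.futures import ThreadPoolExecutor

def process_out_of_core(read_block, factorize, n_blocks):
    """Double-buffered out-of-core loop: while block k is being
    factorized, block k+1 is being read in the background, so I/O
    latency is hidden behind computation whenever compute dominates."""
    with ThreadPoolExecutor(max_workers=1) as io:
        pending = io.submit(read_block, 0)            # prefetch first block
        for k in range(n_blocks):
            block = pending.result()                  # waits only if I/O is slower
            if k + 1 < n_blocks:
                pending = io.submit(read_block, k + 1)  # async read of next
            factorize(block)                          # overlaps with the read
```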
410

Direct numerical simulation and analysis of saturated deformable porous media

Khan, Irfan 07 July 2010
Existing numerical techniques for modeling saturated deformable porous media are based on homogenization techniques and are thus incapable of micro-mechanical investigations, such as the effect of micro-structure on the deformational characteristics of the media. In this research work, a numerical scheme is developed based on the parallelized hybrid lattice-Boltzmann finite-element method, which is capable of performing micro-mechanical investigations through direct numerical simulation. The method has been used to simulate compression of model saturated porous media made of spheres and cylinders in regular arrangements. Through these simulations it is found that, in the limit of small Reynolds number, capillary number, and strain, the deformational behaviour of a real porous medium can be recovered from model porous media when the porosity, permeability, and bulk compressive modulus are matched between the two media. This finding motivated research into using model porous geometries to represent more complex real porous geometries in order to investigate the deformation of the latter. An attempt has been made to apply this technique to the complex geometry of "felt" (a fibrous mat used in the paper industry). These investigations led to a new understanding of the effect of fiber diameter on the bulk properties of a fibrous medium and subsequently on its deformational behaviour. Further, the method has been used to investigate constitutive relationships in deformable porous media, particularly the relationship between permeability and porosity during deformation of the media. Results show the need for geometry-specific investigations.
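For background, a standard constitutive form against which permeability-porosity investigations of this kind are typically compared is the Kozeny-Carman relation; it is cited here as general context (the thesis's geometry-specific findings may depart from it):

```latex
% Kozeny-Carman relation between permeability k and porosity \phi,
% where c is the Kozeny constant and S the specific surface area of
% the solid phase.
k(\phi) = \frac{\phi^{3}}{c \, S^{2} \, (1 - \phi)^{2}}
```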
