Global ETD Search

81	A Runtime Framework for Parallel Programs Mukherjee, Joy 25 September 2006 (has links) This dissertation proposes the Weaves runtime framework for the execution of large scale parallel programs over lightweight intra-process threads. The goal of the Weaves framework is to help process-based legacy parallel programs exploit the scalability of threads without any modifications. The framework separates global variables used by identical, but independent, threads of legacy parallel programs without resorting to thread-based re-programming. At the same time, it also facilitates low-overhead collaboration among threads of a legacy parallel program through multi-granular selective sharing of global variables. Applications that follow the tenets of the Weaves framework can load multiple identical, but independent, copies of arbitrary object files within a single process. They can compose the runtime images of these object files in graph-like ways and run intra-process threads through them to realize various degrees of multi-granular selective sharing or separation of global variables among the threads. Using direct runtime control over the resolution of individual references to functions and variables, they can also manipulate program composition at fine granularities. Most importantly, the Weaves framework does not entail any modifications to either the source codes or the native codes of the object files. The framework is completely transparent. Results from experiments with a real-world process-based parallel application show that the framework can correctly execute a thousand parallel threads containing non-threadsafe global variables on a single machine - nearly twice as many as the traditional process-based approach can - without any code modifications. On increasing the number of machines, the application experiences super-linear speedup, which illustrates scalability. 
Results from another similar application, chosen from a different software area to emphasize the breadth of this research, show that the framework's facilities for low-overhead collaboration among parallel threads allow for significantly greater scales of achievable parallelism than technologies for inter-process collaboration allow. Ultimately, larger scales of parallelism enable more accurate software modeling of real-world parallel systems, such as computer networks and multi-physics natural phenomena. / Ph. D. Dynamic Adaptation Runtime Linking and Loading Lightweight Threads Component Composition Parallel Programs Legacy Procedural codes
82	An axisymmetric finite element solution for elastic wave propagation through threaded connections Land, J. George 07 November 2008 (has links) An axisymmetric finite element solution method is developed for axial wave propagation through a series of threaded connections in rock drills. A piston impacts axially on a string of rods held together by threaded joints and the wave propagates through these joints before reaching the bit. The energy lost in the joints limits the maximum effective depth of the drill. Several computational techniques are used to efficiently model the problem. Non-reflecting boundaries are used to numerically absorb the waves as they exit a joint. The stored waves are then re-initiated into the next joint eliminating modeling of the entire assembly of rods. The preload in the threads is modeled by shrinking the threaded sleeve onto the rods. A new dynamic relaxation damping scheme is used which starts with an undamped model and then increases the damping until the solution converges. This method converges more rapidly than the standard constant damping. / Master of Science threads wave propagation Finite element method non-reflecting boundaries rock drill LD5655.V855 1996.L368
83	Molecular Dynamics for Exascale Supercomputers / La dynamique moléculaire pour les machines exascale Cieren, Emmanuel 09 October 2015 (has links) In the race towards exascale, supercomputer architectures are evolving towards massively multicore nodes, on which memory accesses are non-uniform and vectorization registers are ever larger. These changes reduce the efficiency of homogeneous (MPI-only) applications and force developers to use low-level features to obtain good performance. In the context of molecular dynamics (MD) applied to condensed matter physics, studies of material behaviour under extreme conditions require the simulation of ever larger systems with increasingly complex physics. Adapting MD codes to exaflop architectures is therefore a key challenge. This thesis proposes the design and implementation of a platform dedicated to the simulation of very large MD systems on future supercomputers. Our architecture is organized around three levels of parallelism: domain decomposition with MPI, massive multithreading within each domain, and an explicit vectorization system. We also included a dynamic load balancing capability. The object-oriented design was carefully studied in order to preserve a level of programming usable by physicists without compromising performance. The first results show excellent sequential performance, as well as a quasi-linear speedup on several tens of thousands of cores. In production, we observe a speedup of up to a factor of 30 compared to the code currently used by CEA researchers.
/ In the exascale race, supercomputer architectures are evolving towards massively multicore nodes with hierarchical memory structures and equipped with larger vectorization registers. These trends tend to make MPI-only applications less effective, and now require programmers to explicitly manage low-level elements to get decent performance. In the context of Molecular Dynamics (MD) applied to condensed matter physics, the need for a better understanding of materials behaviour under extreme conditions involves simulations of ever larger systems, on tens of thousands of cores. This will put molecular dynamics codes among the software that is very likely to meet serious difficulties when it comes to fully exploiting the performance of next generation processors. This thesis proposes the design and implementation of a high-performance, flexible and scalable framework dedicated to the simulation of large scale MD systems on future supercomputers. We managed to separate numerical modules from different expressions of parallelism, allowing developers not to care about optimizations and still obtain high levels of performance. Our architecture is organized in three levels of parallelism: domain decomposition using MPI, thread parallelization within each domain, and explicit vectorization. We also included a dynamic load balancing capability in order to equally share the workload among domains. Results on simple tests show excellent sequential performance and a quasi linear speedup on several thousands of cores on various architectures. When applied to production simulations, we report an acceleration up to a factor 30 compared to the code previously used by CEA’s researchers. Dynamique Moléculaire Calcul Intensif Multi-Cœurs Message Passing Interface Threads Tbb Vectorisation Équilibrage de charge C++ Xeon Phi Molecular Dynamics High Performance Computing Manycore Message Passing Interface Threads Tbb Vectorization Load-Balancing C++ Xeon Phi
84	Frézování vnitřních závitů na tělesech vstřikovacích jednotek Bosch / Milling of internal threads in Bosch injection unit bodies Krčál, Petr January 2010 (has links) At the beginning of this diploma thesis I describe the production of different types of threads (with special emphasis on the production of internal threads). I then discuss the different ways of applying abrasion-resistant coatings by PVD and CVD and describe their main advantages and disadvantages. Further, this diploma thesis explains the particular mechanisms and forms of wear on coated tools. In the second part of this diploma thesis, the current status of the production of internal threads in the Rail (high-pressure chamber) is analysed. In the last part I compare six different thread cutters using a scanning electron microscope.
85	DSM-PM2 : une plate-forme portable pour l'implémentation de protocoles de cohérence multithreads pour systèmes à mémoire virtuellement partagée Antoniu, Gabriel 21 November 2001 (has links) (PDF) In their traditional presentation, distributed shared memory (DSM) systems allow processes to share a common address space according to a fixed consistency model: sequential consistency, release consistency, etc. The processes can usually be distributed over physically distinct nodes, and their interactions through the shared memory are implemented (transparently) by the DSM, using a communication library. In most work in this area, it is assumed that the DSM and the underlying architecture are given. The programmer must then adapt the application to this fixed framework in order to obtain efficient execution. This approach imposes static limitations and does not allow alternative approaches to be compared. The contribution of this thesis is a generic implementation and experimentation platform called DSM-PM2, which allows distributed applications and the consistency protocol(s) of the underlying DSM to be developed and optimized jointly. This platform, implemented entirely in software, is portable across several high-performance cluster architectures. It provides the basic building blocks needed to implement and evaluate a large class of multithreaded consistency protocols within a unified framework. Three consistency models are currently supported: sequential consistency, release consistency and Java consistency. Several performance studies have been carried out using multithreaded applications for all of the proposed protocols, on different platforms. 
DSM-PM2 has been validated through its use as the target of a Java compilation system for clusters called Hyperion. [INFO] Computer Science Parallelism lightweight processes threads virtually shared memory DSM PM2 iso-address migration Hyperion Java compilation
86	Continuation-Passing C : Transformations de programmes pour compiler la concurrence dans un langage impératif Kerneis, Gabriel 09 November 2012 (has links) (PDF) Most computer programs are concurrent: they must perform several tasks at the same time. Threads and events are two common techniques for implementing concurrency. Events are generally more lightweight and efficient than threads, but also harder to use. Moreover, they are often too limited; it is then necessary to write hybrid code, which is even more complex, using both preemptively scheduled threads and cooperatively scheduled events. In this thesis we show that concurrent programs written in a threaded style can be translated automatically into equivalent and efficient event-driven programs by a series of proven source-to-source transformations. We first propose Continuation-Passing C, an extension of the C language for writing concurrent systems that offers very lightweight, unified (cooperative and preemptive) threads. CPC programs are transformed by the CPC translator to produce efficient sequentialized event-driven code, using native threads for the preemptive parts. We then define these transformations for an imperative language and prove their correctness, in particular lambda lifting and CPS conversion. Finally, we validate the design and implementation of CPC by comparing it with other thread libraries and by presenting our BitTorrent seeder Hekate. We also justify our choice of lambda lifting by implementing eCPC, a variant of CPC that uses environments, and comparing its performance with that of CPC. Concurrent programming Threads Compilation Event-driven programming Lambda lifting
87	Adaptive transaction scheduling for transactional memory systems Yoo, Richard M. 01 April 2008 (has links) Transactional memory systems are expected to enable parallel programming at lower programming complexity, while delivering improved performance over traditional lock-based systems. Nonetheless, there are certain situations where transactional memory systems could actually perform worse. Transactional memory systems can outperform locks only when the executing workloads contain sufficient parallelism. When the workload lacks inherent parallelism, launching excessive transactions can adversely degrade performance. These situations will actually become dominant in future workloads when large-scale transactions are frequently executed. In this thesis, we propose a new paradigm called adaptive transaction scheduling to address this issue. Based on the parallelism feedback from applications, our adaptive transaction scheduler dynamically dispatches and controls the number of concurrently executing transactions. In our case study, we show that our low-cost mechanism not only guarantees that hardware transactional memory systems perform no worse than a single global lock, but also significantly improves performance for both hardware and software transactional memory systems. Parallelism Performance Transaction effectiveness Contention intensity Transaction systems (Computer systems) Threads (Computer programs) Parallel programming (Computer science) Synchronization
88	Design and evaluation of a technology-scalable architecture for instruction-level parallelism Nagarajan, Ramadass, January 1900 (has links) Thesis (Ph. D.)--University of Texas at Austin, 2007. / Vita. Includes bibliographical references.
89	VCluster a portable virtual computing library for cluster computing / Zhang, Hua. January 2008 (has links) Thesis (Ph.D.)--University of Central Florida, 2008. / Advisers: Ratan K. Guha, Joohan Lee. Includes bibliographical references (p. 132-143).
90	Stitched transmission lines for wearable RF devices Daniel, Isaac H. January 2017 (has links) With the rapid growth and use of wearable devices over the last decade, the advantages of using portable wearable devices are now being utilised for day to day activities. These wearable devices are designed to be flexible, low profile, light-weight and smoothly integrated into daily life. Wearable transmission lines are required to transport RF signals between various pieces of wearable communication equipment and to connect fabric based antennas to transmitters and receivers; the stitched transmission line is one of the hardware solutions developed to enhance the connectivity between these wearable devices. Textile manufacturing techniques that employ sewing machines alongside conductive textile materials can be used to fabricate the stitched transmission line. In this thesis the feasibility of using a sewing machine to fabricate a novel stitched transmission line for wearable devices, based on the idea of a braided coaxial cable, has been examined. The sewing machine used is capable of a zig-zag stitch with approximate width and length within the range of 0-6 mm and 0-4 mm respectively. The inner conductor and the tubular insulated layer of the stitched transmission lines were selected as RG 174, while the stitched shields were made up of copper wires and conductive threads from Light Stiches®. For shielding purposes, the structure is stitched onto a denim material with a conductive thread with the aid of a novel manufacturing technique using standard hardware. The Scattering Parameters of the stitched transmission line were investigated with three different stitch angles (85°, 65° and 31°) through simulation and experiments, with the results demonstrating that the stitched transmission line can work usefully and consistently from 0.04 to 4 GHz. 
The extracted Scattering Parameters indicated a decrease in DC loss with increased stitch angle and an increase in radiation losses, which tend to increase with frequency. The proposed stitched transmission line makes a viable transmission line, but a short stitch length is associated with larger losses through resistance. The DC losses observed are mainly influenced by the resistance of the conductive threads at lower frequencies, while the radiation losses are influenced by the wider apertures related to the stitch angles and by the increase in frequency along the line. The performance of the stitched transmission line with different stitch patterns, when subjected to washing cycles and when bent through curved angles of 90° and 180°, was also investigated and the results are presented. The sensitivity of the design to manufacturing tolerances was also considered. First, the behaviour of the stitched transmission line with two different substrates, Denim and Felt, was investigated, with the results indicating an insignificant increase in losses with the Denim material. Secondly, the sensitivity of the design to variations in cross section dimensions was investigated using numerical modelling techniques, and the results showed that the impedance of the stitched transmission line increases when the cross sectional dimensions are decreased by 0.40 mm and decreases when the cross sectional dimensions are increased by 0.40 mm. Equally, repeatability tests of the stitched transmission line with three different stitch angles (85°, 65° and 31°) were carried out. The results were seen to be consistent up to 2.5 GHz, with slight deviations above that, which are mainly a result of multiple reflections along the line resulting in loss ripples. 
The DC resistance of the stitched transmission line with three different stitch angles (85°, 65° and 31°), corresponding to 60, 90 and 162 stitches respectively, was computed, and a mathematical relationship was derived for computing the DC resistance of the stitched transmission line for any given number of stitches. The computed DC resistances of 25.6 Ω, 17.3 Ω and 13.1 Ω for the 31°, 65° and 85° stitch angles indicated that the DC resistance of the stitch increases as the stitch angle decreases, since a smaller angle gives rise to a larger number of stitches. The transfer impedance of the stitched transmission line was also computed at low frequency (< 1 GHz) to be ZT = (0.24 + j1.09) Ω, with the result showing the effectiveness of the shield of the stitched transmission line at low frequency (< 1 GHz). 621.319
