121. A parallel transformations framework for cluster environments. Bartels, Peer. January 2011.
In recent years program transformation technology has matured into a practical solution for many software reengineering and migration tasks. FermaT, an industrial-strength program transformation system, has demonstrated that legacy systems can be successfully transformed into efficient and maintainable structured C or COBOL code. Its core, a transformation engine, is based on mathematically proven program transformations and ensures that a transformed program is semantically equivalent to the original. The engine operates on a Wide Spectrum Language (WSL), with low-level as well as high-level constructs, to capture as much information as possible during transformation steps. However, FermaT's methodology and techniques make no provision for concurrent migration and analysis. This provision is crucial if the transformation process is to be further automated: as constraint-based program migration theory has demonstrated, searching the enormous space of generated transformation sequences and satisfying their constraints is inefficient and time-consuming. With the objective of solving these problems and extending the operating range of the FermaT transformation system, this thesis proposes a Parallel Transformations Framework which makes parallel transformations processing within the FermaT environment not only possible but also beneficial for the migration process. During a migration, many thousands of program transformations have to be applied; for example, migrating one million lines of assembler to C takes over 21 hours on a single PC. Various search and prediction techniques and a constraint-based approach already address these issues, but they solve them unsatisfactorily. To remedy this situation, this dissertation proposes a framework which extends transformation processing systems with parallel processing capabilities. The parallel system can analyse specified parallel transformation tasks and produce appropriate parallel transformation processing outlines. To underpin the automation objective, a formal language is introduced for describing and outlining parallel transformation tasks, while parallel processing constraints underpin the parallel objective. The thesis explains how transformation processing steps can be automatically parallelised within a reengineering domain, presents search and prediction tactics within this field, and outlines the decomposition and parallelisation of transformation-sequence search-spaces. Finally, the presented work is evaluated on practical case studies to demonstrate different parallel transformation processing techniques, and conclusions are drawn.
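The abstract gives no notation for the framework itself; as a minimal, hypothetical sketch of the core idea (the names `apply_sequence` and `score` are invented stand-ins, not FermaT or WSL APIs), candidate transformation sequences from the search space can be evaluated concurrently:

```python
# Illustrative sketch only: parallel evaluation of candidate transformation
# sequences, in the spirit of the framework described above. `apply_sequence`
# and `score` are hypothetical stand-ins, not FermaT functions.
from multiprocessing import Pool
from itertools import permutations

TRANSFORMATIONS = ["simplify", "remove_redundant_vars", "restructure_loops"]

def apply_sequence(program, sequence):
    # Stand-in: a real engine would apply proven WSL transformations here.
    for name in sequence:
        program = program + "|" + name
    return program

def score(program):
    # Stand-in metric: a real system would measure structuredness/size.
    return -len(program)

def evaluate(args):
    program, sequence = args
    return score(apply_sequence(program, sequence)), sequence

if __name__ == "__main__":
    program = "legacy_assembler_module"
    candidates = list(permutations(TRANSFORMATIONS, 2))
    with Pool() as pool:  # each worker explores part of the search space
        results = pool.map(evaluate, [(program, s) for s in candidates])
    best_score, best_sequence = max(results)
    print(best_sequence, best_score)
```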
122. Concurrent Telemetry Processing Techniques. Clark, Jerry. October 1996.
International Telemetering Conference Proceedings / October 28-31, 1996 / Town and Country Hotel and Convention Center, San Diego, California / Improved processing techniques, particularly with respect to parallel computing, are an underlying focus in computer science, engineering, and industry today. Semiconductor technology is fast approaching fundamental physical device limits, so further advances in computing performance in the near future will be realized by improved problem-solving approaches. An important issue in parallel processing is how to utilize parallel computers effectively. It is estimated that many modern supercomputers and parallel processors deliver only ten percent or less of their peak performance potential in a variety of applications; yet high performance is precisely why engineers build complex parallel machines. Cumulative performance losses occur due to mismatches between applications, software, and hardware. For instance, a communication system's network bandwidth may not be matched to the central processor speed or to module memory. Similarly, as Internet bandwidth is consumed by modern multimedia applications, network interconnection is becoming a major concern. Bottlenecks in a distributed environment are caused by network interconnections and can be minimized by intelligently assigning processing tasks to processing elements (PEs). Processing speeds are improved when architectures are customized for a given algorithm, but parallel processing techniques have been ineffective in most practical systems: the coupling of algorithms to architectures has generally been problematic and inefficient. Specific architectures have evolved to realize the processing improvements promised by parallel processing, and real performance gains will be realized when sequential algorithms are efficiently mapped to parallel architectures. This paper discusses one possible approach to improved algorithm/architecture symbiosis: transforming sequential algorithms into parallel representations using linear dependence vector mapping, and subsequently configuring the interconnection network of a systolic array.
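The dependence-vector mapping named at the end is a standard construction; as an illustration not drawn from the paper itself, consider matrix multiplication c_ij = Σ_k a_ik b_kj, whose three uniform dependence vectors admit a linear schedule and a projection onto a 2-D systolic array:

```latex
% Standard systolic mapping for matrix multiplication (illustrative example,
% not taken from the paper). Operands propagate along uniform dependence vectors:
\[
  d_A = (0,1,0)^{T}, \qquad d_B = (1,0,0)^{T}, \qquad d_C = (0,0,1)^{T}.
\]
% A linear schedule \pi is valid when \pi d > 0 for every dependence vector d:
\[
  \pi = (1,1,1) \;\Rightarrow\; t(i,j,k) = i + j + k,
\]
% and projecting index points along u = (0,0,1)^T (valid since \pi u \neq 0)
% allocates computation (i,j,k) to processing element (i,j) of a 2-D array:
\[
  P = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}, \qquad
  \mathrm{PE}(i,j,k) = P\,(i,j,k)^{T} = (i,j)^{T}.
\]
```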
123. Developing Communication and Data Systems for Space Station Facility Class Payloads. Hazra, Tushar K.; Sun, Charles; Mian, Arshad M.; Picinich, Louis M. November 1995.
International Telemetering Conference Proceedings / October 30-November 02, 1995 / Riviera Hotel, Las Vegas, Nevada / The driving force in modern space mission control has been the development of cost-effective and reliable communication and data systems. The objective is to maintain and ensure error-free payload commanding and data acquisition, as well as efficient processing of payload data for concurrent, real-time, and future use. While mainframe computing still accounts for the majority of commercially available communication and data systems, a significant shift can be observed towards distributed networks of workstations built on commercially available software and hardware. This shift reflects advances in modern computer technology and the trend in space mission control today and in the future. The development of these communication and data systems involves the implementation of distributed and parallel processing concepts in a network of highly powerful client-server environments. This paper addresses major issues related to developing and integrating such systems and their significance for future developments.
124. Domain decomposition algorithms and high-order discretisation methods for the solution of systems of partial differential equations, with applications to problems from fluid mechanics and electromagnetism. Dolean, Victorita. 7 July 2009.
My main research topic is the development of new domain decomposition algorithms for the solution of systems of partial differential equations, applied mainly to fluid dynamics problems (such as the compressible Euler or Stokes equations) and to electromagnetics (the time-harmonic and time-domain first-order systems of Maxwell's equations). Since the solution of large linear systems is strongly tied to the discretization method applied, I have also been interested in developing and analyzing the application of high-order methods (such as Discontinuous Galerkin methods) to Maxwell's equations, sometimes in conjunction with time-discretization schemes in the case of time-domain problems. As an active member of the NACHOS project (besides my main affiliation as an assistant professor at the University of Nice), I have had the opportunity to develop certain directions in my research by interacting with permanent and non-permanent members (post-doctoral researchers) and by participating in the supervision of PhD students. This is strongly reflected in part of my scientific contributions so far. This memoir is composed of three parts: the first on the application of Schwarz methods to fluid dynamics problems, the second on high-order methods for Maxwell's equations, and the third on domain decomposition algorithms for wave propagation problems.
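As a minimal sketch of the overlapping Schwarz idea underlying the first part (a textbook construction, not code from the memoir), the following Python script applies an alternating Schwarz iteration to the 1-D Poisson problem -u'' = f on two overlapping subdomains:

```python
# Illustrative alternating (multiplicative) Schwarz iteration for -u'' = f on
# (0,1) with u(0) = u(1) = 0, the prototype of the domain decomposition
# methods discussed above. Not taken from the memoir.
import numpy as np

n, h = 101, 1.0 / 100
f = np.ones(n)                        # right-hand side f = 1
u = np.zeros(n)                       # initial guess (satisfies the BCs)

left = slice(1, 60)                   # interior indices of subdomain 1
right = slice(45, n - 1)              # interior indices of subdomain 2 (overlap 45..59)

def subdomain_solve(u, dom):
    """Exact solve of -u'' = f on `dom`, Dirichlet data taken from current u."""
    m = dom.stop - dom.start
    A = (2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h**2
    b = f[dom].copy()
    b[0] += u[dom.start - 1] / h**2   # left interface value
    b[-1] += u[dom.stop] / h**2       # right interface value
    return np.linalg.solve(A, b)

for it in range(50):                  # sweep the subdomains alternately
    u[left] = subdomain_solve(u, left)
    u[right] = subdomain_solve(u, right)

x = np.linspace(0, 1, n)
exact = 0.5 * x * (1 - x)             # analytic solution of -u'' = 1
print("max error:", np.abs(u - exact).max())
```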
125. Artificial Intelligence Models for Large Scale Buildings Energy Consumption Analysis. Zhao, Haixiang. 28 September 2011.
The energy performance of buildings is influenced by many factors, such as ambient weather conditions, building structure and characteristics, occupancy and occupant behaviour, and the operation of sub-level components like the Heating, Ventilation and Air-Conditioning (HVAC) system. This complexity makes the prediction, analysis, and fault detection/diagnosis of building energy consumption very difficult to perform accurately and quickly. This thesis focuses on up-to-date artificial intelligence models applied to these problems. First, we review recently developed models for solving them, including detailed and simplified engineering methods, statistical methods, and artificial intelligence methods. We then simulate energy consumption profiles for single and multiple buildings and, on these datasets, train and test support vector machine models for prediction; results from extensive experiments demonstrate the high prediction accuracy and robustness of these models. Second, a Recursive Deterministic Perceptron (RDP) neural network model is used to detect and diagnose faulty building energy consumption, with abnormal consumption simulated by manually introducing performance degradation into electrical devices. In the experiments, the RDP model shows very high detection ability, and a new fault-diagnosis approach is proposed based on the evaluation of a set of RDP models, each of which is able to detect one equipment fault. Third, we investigate how the selection of feature subsets influences model performance. The optimal features are selected according to the feasibility of obtaining them and the scores they receive under the evaluation of two filter methods. Experimental results confirm the validity of the selected subset and show that the proposed feature selection method can guarantee model accuracy while reducing computational time. One challenge of predicting building energy consumption is accelerating model training when the dataset is very large. This thesis proposes an efficient parallel implementation of support vector machines based on the decomposition method for solving such problems. The parallelization is performed on the most time-consuming part of training, namely updating the gradient vector f; the inner problems are handled by a sequential minimal optimization solver. The underlying parallelism is implemented with a shared-memory version of the Map-Reduce paradigm, making the system particularly suitable for multi-core and multiprocessor systems. Experimental results show that our implementation offers a large speed increase over Libsvm and is superior to the state-of-the-art MPI implementation Pisvm in both speed and storage requirements.
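The abstract only names the parallelised step; the following Python sketch (an assumption-laden illustration, not the thesis implementation) shows why the gradient update parallelises naturally: after a working-set variable changes, each entry of f can be updated independently, so the work partitions cleanly into a map step over chunks and a trivial reduce:

```python
# Minimal sketch (assumptions, not the thesis code): in decomposition-based SVM
# training, the dominant cost is updating the dual gradient f = Q*alpha - e
# after a working-set change; f[i] += d_alpha_j * y_i * y_j * K(x_i, x_j)
# for every i, which is embarrassingly parallel over i.
import numpy as np
from multiprocessing import Pool

rng = np.random.default_rng(0)
X = rng.normal(size=(20000, 16))      # training points
y = rng.choice([-1.0, 1.0], size=20000)
gamma = 0.1

def rbf(Xc, x):                       # RBF kernel column K(Xc, x)
    return np.exp(-gamma * ((Xc - x) ** 2).sum(axis=1))

def update_chunk(args):
    """Map step: compute one chunk of the gradient update."""
    lo, hi, j, d_alpha_j = args
    return lo, d_alpha_j * y[j] * rbf(X[lo:hi], X[j]) * y[lo:hi]

if __name__ == "__main__":
    f = -np.ones(len(X))              # initial gradient: Q*0 - e
    j, d_alpha_j = 42, 0.5            # one updated variable from an SMO step
    chunks = [(lo, min(lo + 5000, len(X)), j, d_alpha_j)
              for lo in range(0, len(X), 5000)]
    with Pool(4) as pool:             # reduce step: write results back into f
        for lo, df in pool.map(update_chunk, chunks):
            f[lo:lo + len(df)] += df
    print(f[:5])
```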
126. Fast Stochastic Global Optimization Methods and Their Applications to Cluster Crystallization and Protein Folding. Zhan, Lixin. January 2005.
Two global optimization methods are proposed in this thesis: the multicanonical basin hopping (MUBH) method and the basin paving (BP) method.

The MUBH method combines the basin hopping (BH) method, which can efficiently map out an energy landscape in terms of its local minima, with the multicanonical Monte Carlo (MUCA) method, which encourages the system to move out of energy traps during the computation. It is found to be more efficient than the original BH method when applied to Lennard-Jones systems containing 150-185 particles.

The asynchronous multicanonical basin hopping (AMUBH) method, a parallelization of the MUBH method, is implemented using the message passing interface (MPI) to take full advantage of multiprocessors in either a homogeneous or a heterogeneous computational environment. AMUBH, MUBH and BH are used together to find the global minimum structures for Co nanoclusters with system size N ≤ 200.

The BP method is based on the BH method and the idea of the energy landscape paving (ELP) strategy. In comparison with the acceptance scheme of the ELP method, movement towards the low-energy region is enhanced, and no low-energy configuration can be missed during the simulation. Applications to both the pentapeptide Met-enkephalin and the villin subdomain HP-36 locate new configurations with energies lower than those determined previously.

The MUBH, BP and BH methods are further employed to search for the global minimum structures of several proteins/peptides using the ECEPP/2 and ECEPP/3 force fields. These two force fields may produce global minima with different structures; the present study indicates that global minimum determination with ECEPP/3 prefers helical structures. Also discussed in this thesis is the effect of the environment on the formation of beta hairpins.
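As a bare-bones sketch of the basin hopping loop underlying all of these methods (hop, quench, Metropolis-accept), here on a simple 2-D test surface rather than a Lennard-Jones cluster; MUBH replaces the Metropolis acceptance rule below with multicanonical weights:

```python
# Bare-bones basin hopping sketch on the 2-D Styblinski-Tang function
# (illustrative, not the thesis code). Global minimum near (-2.9035, -2.9035).
import numpy as np
from scipy.optimize import minimize

def energy(x):                        # multimodal test surface
    return np.sum(x**4 - 16 * x**2 + 5 * x) / 2

rng = np.random.default_rng(1)
x = minimize(energy, rng.uniform(-4, 4, size=2)).x   # quench the start point
best_x, best_e = x, energy(x)
T = 1.0                               # Metropolis temperature

for step in range(500):
    trial = x + rng.normal(scale=0.7, size=2)        # random hop
    trial = minimize(energy, trial).x                # quench to a local minimum
    dE = energy(trial) - energy(x)
    if dE < 0 or rng.random() < np.exp(-dE / T):     # Metropolis acceptance
        x = trial
    if energy(x) < best_e:
        best_x, best_e = x, energy(x)

print(best_x, best_e)
```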
127. A mesh transparent numerical method for large-eddy simulation of compressible turbulent flows. Tristanto, Indi Himawan. January 2004.
A large-eddy simulation code, based on a mesh-transparent algorithm for hybrid unstructured meshes, is presented to deal with the complex geometries often found in engineering flow problems. While tetrahedral elements are very effective in dealing with complex geometry, excessive numerical diffusion often affects results; prismatic or hexahedral elements are therefore preferable in regions where turbulence structures are important. A second-order reconstruction methodology is used, since an investigation of a higher-order method based upon Lele's compact scheme showed it to be impractical on general unstructured meshes. The convective fluxes are treated with the Roe scheme, modified by introducing a variable scaling to the dissipation matrix so as to obtain a nearly second-order accurate centred scheme in statistically smooth flow, whilst retaining the high-resolution TVD behaviour across a shock discontinuity. The code has been parallelised using MPI to ensure portability. The base numerical scheme has been validated for steady flow computations over complex geometries using inviscid and RANS forms of the governing equations. The extension of the numerical scheme to unsteady turbulent flows, and the complete LES code, have been validated for the interaction of a shock with a laminar mixing layer, a Mach 0.9 turbulent round jet, and a fully developed turbulent pipe flow. The mixing-layer and round-jet computations indicate that, for similar mesh resolution of the shear layer, the present code produces results comparable to previously published work using a higher-order scheme on a structured mesh; the unstructured meshes have a significantly smaller total number of nodes, since tetrahedral elements are used to fill the far-field region. The pipe flow results show that the present code is capable of producing the correct flow features. Finally, the code has been applied to the LES computation of the impingement of a highly under-expanded jet that produces plate shock oscillation. Comparison with other workers' experiments indicates good qualitative agreement for the major features of the flow, although in this preliminary computation the computed frequency is somewhat lower than the experimental measurements.
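The modification to the Roe scheme amounts to scaling its upwind dissipation term; a hedged reconstruction of the numerical flux (the precise scaling function used in the thesis is not given in the abstract) is:

```latex
% Roe flux with a variable dissipation scaling \epsilon \in [0,1] (illustrative):
% \epsilon \to 0 recovers a second-order central flux in smooth flow, while
% \epsilon \to 1 recovers the full upwind (TVD) dissipation near shocks.
\[
  F_{i+\frac{1}{2}}
  = \tfrac{1}{2}\bigl( F(U_L) + F(U_R) \bigr)
  - \tfrac{\epsilon}{2}\, \bigl|\tilde{A}(U_L,U_R)\bigr|\,(U_R - U_L),
\]
% where \tilde{A} is the Roe-averaged flux Jacobian at the interface.
```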
128. Forest aboveground biomass and carbon mapping with computational cloud. Guan, Aimin. 26 April 2017.
In the last decade, advances in sensor and computing technology have been revolutionary. The latest generation of hyperspectral and synthetic aperture radar (SAR) instruments has increased spectral, spatial, and temporal resolution; consequently, the data sets collected are growing rapidly in size and frequency of acquisition, and remote sensing applications require ever more computing resources for data analysis. High-performance computing (HPC) infrastructure, such as clusters, distributed networks, grids, clouds, and specialized hardware components, has been used to disseminate large volumes of remote sensing data and to accelerate the processing of raw images and the extraction of information from remote sensing data. In previous research we showed that the computational efficiency of a hyperspectral image denoising algorithm can be improved by parallelizing the algorithm on a distributed computing grid. In recent years, computational cloud technology has emerged, bringing more flexibility and simplicity to data processing. Hadoop MapReduce is a software framework for distributed commodity computing clusters that allows parallel processing of massive datasets. In this project, we implement a software application to map forest aboveground biomass (AGB) with a normalized difference vegetation index computed from Landsat Thematic Mapper bands 4 and 5 (ND45). We present observations and experimental results on the performance and algorithmic complexity of the implementation. Three research questions are answered in this thesis: 1) How do we implement remote sensing algorithms, such as forest AGB mapping, in a computational cloud environment? 2) What are the requirements for distributed processing of remote sensing images using the cloud programming model? 3) What is the performance increase for large-area remote sensing image processing in a cloud environment?
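The Hadoop implementation itself is not reproduced in the abstract; the following Python sketch shows the map/reduce pattern with the per-pixel normalized difference ND45 = (B4 - B5) / (B4 + B5) in the map step. The linear AGB regression coefficients are placeholders, not values from the thesis:

```python
# Sketch of the MapReduce pattern (assumed, not the thesis code): map each
# image tile to per-pixel ND45 = (B4 - B5) / (B4 + B5) and a partial AGB sum,
# then reduce the partial sums. The coefficients A and B are placeholders.
import numpy as np
from functools import reduce

A, B = 250.0, 50.0                    # hypothetical AGB = A * ND45 + B (t/ha)

def map_tile(tile):
    """Map step: one tile -> (sum of AGB estimates, pixel count)."""
    b4, b5 = tile                     # TM band 4 (NIR) and band 5 (SWIR) arrays
    nd45 = (b4 - b5) / np.maximum(b4 + b5, 1e-9)
    agb = A * nd45 + B
    return agb.sum(), agb.size

def reduce_pair(x, y):
    """Reduce step: combine partial (sum, count) results."""
    return x[0] + y[0], x[1] + y[1]

rng = np.random.default_rng(0)
tiles = [(rng.uniform(0.1, 0.5, (256, 256)),   # synthetic band-4 reflectance
          rng.uniform(0.05, 0.3, (256, 256)))  # synthetic band-5 reflectance
         for _ in range(8)]

total, count = reduce(reduce_pair, map(map_tile, tiles))
print("mean AGB estimate (t/ha):", total / count)
```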
129. Massively parallel computing for particle physics. Preston, Ian Christopher. January 2010.
This thesis presents methods to run scientific code safely on a global-scale desktop grid. Current attempts to harness the world's idle desktop computers face obstacles such as donor security, portability of code, and privilege requirements. Nereus, a Java-based architecture, is a novel framework that overcomes these obstacles and allows the creation of a globally scalable desktop grid capable of executing Java bytecode. However, most scientific code is written for the x86 architecture; to enable the safe execution of unmodified scientific code, we created JPC, a pure-Java x86 PC emulator. The Nereus framework is applied to two tasks: a trivially parallel data generation task, BlackMax, and a parallelization and fault-tolerance framework, Mycelia. Mycelia is an implementation of the Map-Reduce parallel programming paradigm. BlackMax is a microscopic black-hole event generator of direct relevance to the Large Hadron Collider (LHC). The Nereus-based BlackMax adaptation dramatically speeds up the production of data, limited only by the number of desktop machines available.
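As a minimal illustration of the trivially parallel task-farm pattern that such a grid exploits for event generation (a hypothetical sketch; `generate_events` is an invented stand-in, not BlackMax code):

```python
# Hypothetical task-farm sketch of the trivially parallel pattern described
# above. Each task carries its own seed, so workers (stand-ins for grid
# nodes) are independent and results are reproducible.
import random
from multiprocessing import Pool

def generate_events(task):
    """Worker: generate n pseudo-events from a dedicated seed."""
    seed, n = task
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

if __name__ == "__main__":
    tasks = [(seed, 1000) for seed in range(16)]   # one task per work unit
    with Pool(4) as pool:                          # stands in for grid nodes
        batches = pool.map(generate_events, tasks)
    events = [e for batch in batches for e in batch]
    print(len(events), "events generated")
```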
130. Random number generation software in OpenCL (Logiciel de génération de nombres aléatoires dans OpenCL). Kemerchou, Nabil.
clRNG and clProbDist are two application programming interfaces (APIs) that we have developed for the generation of uniform and non-uniform random numbers, respectively, on parallel computing devices in the OpenCL environment. The first interface is used to create, on the host computer, objects of type stream, which act as parallel virtual generators that can be used both on the host and on parallel devices (graphics processing units, multi-core CPUs, etc.) to generate sequences of random numbers. The second interface can likewise be used on the host or on the devices to generate random variables according to different continuous and discrete probability distributions. In this thesis, we recall the basic concepts of random number generators, describe heterogeneous systems and techniques for parallel random number generation, and present the different models composing the OpenCL architecture. We detail the structure of the developed APIs, distinguishing in clRNG the functions that create streams, the functions that generate uniform random variables, and the functions that manipulate stream states. clProbDist contains functions that generate non-uniform random variables by the inversion technique, as well as functions that return various statistics of the implemented distributions. We evaluate these APIs with two simulations: the first implements a simplified inventory model and the second estimates the value of an Asian call option. Finally, we provide experimental results on the performance of the implemented generators.