Spelling suggestions: "subject:"parallella databehandling""
1 |
Theoretical aspects on performance bounds and fault tolerance in parallel computing /Klonowska, Kamilla, January 2007
Diss. Karlskrona : Blekinge tekniska högskola, 2007.
|
2 |
Performance characterization and evaluation of parallel PDE solvers /Johansson, Henrik, January 2006
Licentiate thesis (summary) Uppsala : Uppsala universitet, 2006. / Includes 3 papers.
|
3 |
Automatic parallelization of equation-based simulation programs /Aronsson, Peter, January 2006
Diss. Linköping : Linköpings universitet, 2006.
|
4 |
Multithreaded PDE solvers on non-uniform memory architectures /Nordén, Markus, January 2006
Diss. (summary) Uppsala : Uppsala universitet, 2006. / Includes 5 papers.
|
5 |
Exhaustion dominated performance : an empirical evaluation (using real life simulation software) /Raja G.R., Karthik, January 2008
This paper aims to extend the evaluation of Exhaustion Dominated Performance, a method used to compute the impact of available memory and bandwidth on the execution time of simulation software. The method has previously been tested using High Performance Linpack, a de facto standard for benchmarking [1]. In this paper, the experiment is repeated using real-world simulation software to show that the method is applicable in practice. The study was conducted under the same experimental conditions, and the results obtained showed that the method also works well for real-world applications.
|
7 |
Performance prediction and improvement techniques for parallel programs in multiprocessors /Broberg, Magnus, January 2002
Diss. Ronneby: Tekn. högsk., 2002.
|
8 |
Iterative and adaptive PDE solvers for shared memory architectures /Löf, Henrik, January 2006
Diss. (summary) Uppsala : Uppsala universitet, 2006. / Includes 5 papers.
|
9 |
Autonomic Management of Partitioners for SAMR Grid Hierarchies /Johansson, Henrik, January 2009
Diss. (summary) Uppsala : Uppsala universitet, 2009. / Includes 6 papers.
|
10 |
Object Based Concurrency for Data Parallel Applications : Programmability and Effectiveness /Diaconescu, Roxana Elena, January 2002
Increased programmability for concurrent applications in distributed systems requires automatic support for some aspects of concurrent computing. These are: the decomposition of a program into parallel threads, the mapping of threads to processors, the communication between threads, and synchronization among threads.
Thus, a highly usable programming environment for data parallel applications strives to conceal data decomposition, data mapping, data communication, and data access synchronization.
This work investigates the problem of programmability and effectiveness for scientific, data parallel applications with irregular data layout. The complicating factor for such applications is the recursive or indirection-based representation of data structures. An efficient parallel execution requires a data distribution and mapping that ensure data locality, but recursive and indirect representations yield poor physical data locality. We examine techniques for efficient, load-balanced data partitioning and mapping for irregular data layouts. Moreover, in the presence of non-trivial parallelism and data dependences, a general data partitioning procedure complicates the task of locating arbitrary distributed data across address spaces. We formulate the general data partitioning and mapping problems and show how a general data layout can be used to access data across address spaces in a location-transparent manner.
Traditional data parallel models promote instruction-level or loop-level parallelism. Compiler transformations and optimizations for discovering and/or increasing parallelism in Fortran programs apply to regular applications. However, many data-intensive applications are irregular (sparse matrix problems, applications that use general meshes, etc.). Discovering and exploiting fine-grain parallelism for applications that use indirection structures (e.g. indirection arrays, pointers) is very hard, or even impossible.
The work in this thesis explores a concurrent programming model that enables coarse-grain parallelism in a highly usable, efficient manner. Hence, it explores the issues of implicit parallelism in the context of objects as a means of encapsulating distributed data. The computation model results in a trivial SPMD (Single Program Multiple Data) model, where the non-trivial parallelism aspects are solved automatically.
This thesis makes the following contributions:
- It formulates the general data partitioning and mapping problems for data parallel applications. Based on these formulations, it describes an efficient distributed data consistency algorithm.
- It describes a data parallel object model suitable for regular and irregular data parallel applications. Moreover, it describes an original technique for mapping data to processors so as to preserve locality. It also presents an inter-object consistency scheme that tries to minimize communication.
- It provides evidence of the efficiency of the data partitioning and consistency schemes. It describes a prototype implementation of a system supporting implicit data parallelism through distributed objects. Finally, it presents results showing that the approach is scalable on various architectures (e.g. Linux clusters, SGI Origin 3800).
|