  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Multithreaded PDE Solvers on Non-Uniform Memory Architectures

Nordén, Markus January 2006
A trend in parallel computer architecture is that systems with a large shared memory are becoming increasingly popular. A shared memory system can be either a uniform memory architecture (UMA) or a cache-coherent non-uniform memory architecture (cc-NUMA). In the present thesis, the performance of parallel PDE solvers on cc-NUMA computers is studied. In particular, we consider the shared namespace programming model, represented by OpenMP. Since the main memory is physically, or geographically, distributed over several multi-processor nodes, the latency for local memory accesses is smaller than for remote accesses. Therefore, the geographical locality of the data becomes important. The focus of the present thesis is to study multithreaded PDE solvers on cc-NUMA systems, in particular their memory access pattern with respect to geographical locality. The questions posed are: (1) How large is the influence on performance of the non-uniformity of the memory system? (2) How should a program be written in order to reduce this influence? (3) Is it possible to introduce optimizations in the computer system for this purpose? The main conclusion is that geographical locality is important for performance on cc-NUMA systems. This is shown experimentally for a broad range of PDE solvers as well as theoretically using a model involving characteristics of computer systems and applications. Geographical locality can be achieved through migration directives that are inserted by the programmer or, possibly in the future, automatically by the compiler. On some systems, it can also be accomplished by means of transparent, hardware-initiated migration and replication. However, a necessary condition for migration to be effective is that the memory access pattern must not be "speckled", i.e., as few threads as possible should access each memory page.
We also conclude that OpenMP is competitive with MPI on cc-NUMA systems if care is taken to get a favourable data distribution.
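The "speckled" condition in the abstract above can be illustrated with a small sketch. It is not taken from the thesis: the function names, the array size, the thread count, and the page size (expressed in array elements) are all hypothetical, and the two index-to-thread mappings are only simple models of a blocked and a round-robin loop distribution. The sketch counts how many distinct threads touch each memory page under each mapping.

```python
from collections import defaultdict


def threads_per_page(n_elements, n_threads, elems_per_page, owner):
    """Return a dict mapping each page index to the set of threads touching it."""
    pages = defaultdict(set)
    for i in range(n_elements):
        pages[i // elems_per_page].add(owner(i, n_elements, n_threads))
    return pages


def block_owner(i, n, t):
    # Blocked distribution: thread k handles one contiguous chunk of indices.
    chunk = -(-n // t)  # ceiling division
    return i // chunk


def cyclic_owner(i, n, t):
    # Round-robin distribution: consecutive indices go to different threads.
    return i % t


def max_sharers(pages):
    # Largest number of distinct threads that access any single page.
    return max(len(threads) for threads in pages.values())


# Hypothetical sizes, chosen only for illustration.
N, T, PAGE = 4096, 4, 512
block = threads_per_page(N, T, PAGE, block_owner)
cyclic = threads_per_page(N, T, PAGE, cyclic_owner)
print(max_sharers(block), max_sharers(cyclic))  # prints "1 4"
```

With these sizes, every page in the blocked mapping is touched by a single thread, so a page could in principle be placed on that thread's node. In the round-robin mapping every page is touched by all four threads, which is the speckled case: no placement of the page is local to all of its users.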
2

Design of High Performance Computing Software for Genericity and Variability

Ljungberg, Malin January 2007
Computer simulations have emerged as a cost-efficient complement to laboratory experiments, as computers have become increasingly powerful. The aim of the present work is to explore the ideas of some state-of-the-art software development practices, and ways in which these can be useful for developing high performance research codes. The introduction of these practices, and the modular designs that they give rise to, raises issues regarding a potential conflict between runtime efficiency on the one hand and development efficiency on the other. Flexible software modules, based on mathematical abstractions, will provide support for convenient implementation and modification of numerical operators. Questions still remain about whether such modules will provide the efficiency that is required for high performance applications. To answer these questions, investigations were performed within two different problem domains. The first domain consisted of modular frameworks for the numerical solution of partial differential equations. Such frameworks proved a suitable setting, since several of my research questions revolved around the issue of modularity. The second problem domain was that of symmetry-exploiting algorithms. These algorithms are based on group theory, and make ample use of mathematical abstractions from that field. The domain of symmetry-exploiting algorithms gave us opportunities to investigate difficulties in combining modularity based on high-level abstractions with low-level optimizations using data layout and parallelization. In conclusion, my investigation of software development practices for the area of high performance computing has proved very fruitful. I have found that none of the concerns that were raised should lead us to refrain from using the practices that I have considered. On the contrary, in the two case studies presented here, these practices led to designs that perform well in terms of usability as well as runtime efficiency.
