41

Parallele Genetische Algorithmen mit Anwendungen [Parallel Genetic Algorithms with Applications]

Riedel, Marion. January 2002.
Chemnitz, Technische Universität, diploma thesis, 2002.
42

Entwicklung eines effizienten Verfahrens zur Simulation kompressibler Strömungen in 3D auf Parallelrechnern [Development of an Efficient Method for Simulating Compressible 3D Flows on Parallel Computers]

Schupp, Bernhard. Unknown date.
Universität Freiburg (Breisgau), dissertation, 1999.
43

Parallel implementation of surface reconstruction from noisy samples

Randrianarivony, Maharavo; Brunnett, Guido. 06 April 2006.
We consider the problem of reconstructing a surface from noisy samples by approximating the point set with non-uniform rational B-spline (NURBS) surfaces. We emphasize that the knot sequences should be part of the unknown variables, alongside the control points and the weights, so that their optimal positions can be found. We show how to set up the free-knot problem so that constrained nonlinear optimization can be applied efficiently. We describe in detail a parallel implementation of our approach that gives an almost linear speedup. Finally, we provide numerical results obtained on the Chemnitzer Linux Cluster supercomputer.
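
The inner step of such a free-knot formulation — computing the control points for a fixed pair of knot vectors — is a linear least-squares problem. The following NumPy/SciPy sketch illustrates only this step, for the simplified case of gridded samples and a plain (non-rational) tensor-product B-spline; the knot and weight optimization of the paper is not shown, and BSpline.design_matrix requires SciPy 1.8 or newer.

```python
import numpy as np
from scipy.interpolate import BSpline

def surface_control_points(u, v, Z, tu, tv, k=3):
    # Least-squares control points of a tensor-product B-spline surface
    # for FIXED knot vectors tu, tv and gridded samples Z[i, j] ~ S(u[i], v[j]).
    Bu = BSpline.design_matrix(u, tu, k).toarray()   # (len(u), #ctrl-u)
    Bv = BSpline.design_matrix(v, tv, k).toarray()   # (len(v), #ctrl-v)
    # Solve Bu @ C @ Bv.T ~= Z by two one-dimensional least-squares sweeps.
    C1, *_ = np.linalg.lstsq(Bu, Z, rcond=None)      # fit along u
    Ct, *_ = np.linalg.lstsq(Bv, C1.T, rcond=None)   # fit along v
    return Ct.T

# Noisy samples of a test surface on a 50 x 50 grid.
rng = np.random.default_rng(0)
u = np.linspace(0.0, 1.0, 50)
v = np.linspace(0.0, 1.0, 50)
Z = np.sin(2 * np.pi * u)[:, None] * np.cos(np.pi * v)[None, :]
Z += 0.01 * rng.standard_normal(Z.shape)

k = 3
t = np.r_[[0.0] * (k + 1), np.linspace(0.1, 0.9, 9), [1.0] * (k + 1)]
C = surface_control_points(u, v, Z, t, t, k)
print(C.shape)   # (13, 13) control net
```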
44

Task Pool Teams for Implementing Irregular Algorithms on Clusters of SMPs

Hippold, Judith; Rünger, Gudula. 06 April 2006.
The characteristics of irregular algorithms make a parallel implementation difficult, especially for PC clusters or clusters of SMPs. These characteristics may include unpredictable accesses to dynamically changing data structures or strong irregular coupling of computations. The resulting problems are an unknown load distribution and expensive, irregular communication patterns for data accesses and redistributions. Thus the parallel implementation of irregular algorithms on distributed memory machines and clusters requires a special organizational mechanism that provides dynamic load balancing while keeping the communication and administration overhead low. We propose task pool teams for implementing irregular algorithms on clusters of PCs or SMPs. A task pool team combines multithreaded programming using task pools on single nodes with explicit message passing between different nodes. The dynamic load balancing mechanism of task pools is thereby generalized to a dynamic load balancing scheme across all distributed nodes. We have implemented and compared several versions of task pool teams. As an application example, we use the hierarchical radiosity algorithm, which is based on dynamically growing quadtree data structures annotated with varying interaction lists that express the irregular coupling between the quadtrees. Experiments are performed on a PC cluster and a cluster of SMPs.
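
A minimal sketch of the task pool team idea — a node-local, thread-shared task pool combined with explicit message passing between nodes — using Python threads and mpi4py. The task format, tags, and termination scheme below are illustrative assumptions, not the authors' implementation; all MPI calls are funneled through one communication thread, and at least two MPI processes are assumed.

```python
import queue
import threading
import time
from mpi4py import MPI   # run with e.g.: mpiexec -n 4 python taskpoolteam.py

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

local_pool = queue.Queue()   # node-local task pool, shared by worker threads
outgoing = queue.Queue()     # tasks whose data is "homed" on a remote node
TAG = 7

def work(msg):
    print(f"[rank {rank}] {msg}", flush=True)

def worker():
    # Worker thread: execute local tasks; defer remote ones to the
    # communication thread (MPI is only ever called from that thread).
    while True:
        task = local_pool.get()
        if task is None:
            return
        func, args, home = task
        if home == rank:
            func(*args)
        else:
            outgoing.put(task)   # irregular access to remote data

def communicator(expect):
    # Communication thread: send deferred tasks to their home node and
    # feed incoming remote tasks into the local pool.
    sent = received = 0
    while sent < expect or received < expect:
        while not outgoing.empty():
            task = outgoing.get()
            comm.send(task, dest=task[2], tag=TAG)
            sent += 1
        if comm.iprobe(source=MPI.ANY_SOURCE, tag=TAG):
            local_pool.put(comm.recv(source=MPI.ANY_SOURCE, tag=TAG))
            received += 1
        time.sleep(0.001)

for i in range(3):
    local_pool.put((work, (f"local task {i}",), rank))
local_pool.put((work, ("migrated task",), (rank + 1) % size))

workers = [threading.Thread(target=worker) for _ in range(2)]
ct = threading.Thread(target=communicator, args=(1,))
for t in workers + [ct]:
    t.start()
ct.join()
for _ in workers:
    local_pool.put(None)   # sentinel: shut the team down
for t in workers:
    t.join()
```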
45

Solving Linear Matrix Equations via Rational Iterative Schemes

Benner, Peter; Quintana-Ortí, Enrique; Quintana-Ortí, Gregorio. 01 September 2006.
We investigate the numerical solution of stable Sylvester equations via iterative schemes proposed for computing the sign function of a matrix. In particular, we discuss how the rational iterations for the matrix sign function can be adapted efficiently to the special structure implied by the Sylvester equation. For Sylvester equations with a factored constant term, such as those arising in model reduction or image restoration, we derive an algorithm that computes the solution directly in factored form. We also suggest convergence criteria for the resulting iterations and compare the accuracy and performance of the resulting methods with existing Sylvester solvers. The algorithms proposed here are easy to parallelize. We report on their parallelization and demonstrate their high efficiency and scalability using experimental results obtained on a cluster of Intel Pentium Xeon processors.
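
The structure exploitation described here can be sketched in a few lines of NumPy: for AX + XB = C with (Hurwitz-)stable A and B, the Newton sign iteration Z_(k+1) = (Z_k + Z_k^(-1))/2 applied to Z = [[A, -C], [0, -B]] decouples into independent block updates, and X is recovered from the limit of the off-diagonal block. This is a simplified, unscaled variant without the factored-form arithmetic of the paper.

```python
import numpy as np
from scipy.linalg import solve_sylvester   # reference solver for the check

def sylvester_sign(A, B, C, tol=1e-12, maxit=50):
    # Newton sign iteration on Z = [[A, -C], [0, -B]], written blockwise.
    # Requires A and B stable (all eigenvalues in the open left half-plane).
    A, B, C = A.copy(), B.copy(), -C.copy()
    for _ in range(maxit):
        Ai, Bi = np.linalg.inv(A), np.linalg.inv(B)
        A_new = 0.5 * (A + Ai)
        B_new = 0.5 * (B + Bi)
        C = 0.5 * (C + Ai @ C @ Bi)
        err = np.linalg.norm(A_new - A, 1) + np.linalg.norm(B_new - B, 1)
        A, B = A_new, B_new
        if err < tol:
            break
    return 0.5 * C   # the off-diagonal block of sign(Z) equals 2X

rng = np.random.default_rng(1)
n = 6
A = -3.0 * np.eye(n) + 0.3 * rng.standard_normal((n, n))   # stable
B = -2.0 * np.eye(n) + 0.3 * rng.standard_normal((n, n))   # stable
C = rng.standard_normal((n, n))
X = sylvester_sign(A, B, C)
print(np.allclose(A @ X + X @ B, C))              # True
print(np.allclose(X, solve_sylvester(A, B, C)))   # True
```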
46

Solving Linear-Quadratic Optimal Control Problems on Parallel Computers

Benner, Peter; Quintana-Ortí, Enrique S.; Quintana-Ortí, Gregorio. 11 September 2006.
We discuss a parallel library of efficient algorithms for the solution of linear-quadratic optimal control problems involving large-scale systems with state-space dimension up to O(10^4). We survey the numerical algorithms underlying the implementation of the chosen optimal control methods. The approaches considered here are based on invariant and deflating subspace techniques and avoid the explicit solution of the associated algebraic Riccati equations, which may be ill-conditioned; our algorithms can nevertheless compute the Riccati solution on request. The major computational task — finding spectral projectors onto the required invariant or deflating subspaces — is implemented using iterative schemes for the sign and disk functions. Experimental results demonstrate the numerical accuracy and the parallel performance of our approach on a cluster of Intel Itanium-2 processors.
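
As a small illustration of the spectral-projector computation mentioned above: for a matrix H with no purely imaginary eigenvalues, P = (I - sign(H))/2 projects onto the invariant subspace belonging to the eigenvalues with negative real part. A plain, unscaled Newton-iteration sketch in NumPy (not the library's implementation, which adds scaling and disk-function variants):

```python
import numpy as np

def stable_projector(H, tol=1e-12, maxit=50):
    # P = (I - sign(H)) / 2 via the Newton iteration Z <- (Z + Z^{-1}) / 2.
    # H must have no purely imaginary eigenvalues.
    Z = np.asarray(H, dtype=float).copy()
    for _ in range(maxit):
        Z_new = 0.5 * (Z + np.linalg.inv(Z))
        done = np.linalg.norm(Z_new - Z, 1) < tol * np.linalg.norm(Z_new, 1)
        Z = Z_new
        if done:
            break
    return 0.5 * (np.eye(Z.shape[0]) - Z)

H = np.diag([-2.0, -0.5, 1.0, 3.0]) \
    + 0.1 * np.random.default_rng(2).standard_normal((4, 4))
P = stable_projector(H)
r = int(round(np.trace(P)))      # trace(P) = number of stable eigenvalues
V = np.linalg.svd(P)[0][:, :r]   # orthonormal basis of range(P)
# Residual ~ 0 confirms span(V) is H-invariant.
print(r, np.linalg.norm(H @ V - V @ (V.T @ H @ V)))
```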
47

Optimizing MPI Collective Communication by Orthogonal Structures

Kühnemann, Matthias; Rauber, Thomas; Rünger, Gudula. 28 June 2007.
Many parallel applications from scientific computing use MPI collective communication operations to collect or distribute data. Since the execution times of these communication operations increase with the number of participating processors, scalability problems can arise. In this article, we show for different MPI implementations how the execution time of collective communication operations can be improved significantly by a restructuring based on orthogonal processor structures with two or more levels. As platforms, we consider a dual Xeon cluster, a Beowulf cluster, and a Cray T3E with different MPI implementations. We show that the execution time of operations like MPI_Bcast or MPI_Allgather can be reduced by 40% and 70% on the dual Xeon cluster and the Beowulf cluster, respectively. On the Cray T3E, too, a significant improvement can be obtained by a careful selection of the processor groups. We demonstrate that the optimized communication operations can be used to reduce the execution time of data-parallel implementations of complex application programs without any other change to the computation and communication structure. Furthermore, we investigate how the execution time of the orthogonal realizations can be modeled using runtime functions. In particular, we consider the modeling of two-phase realizations of communication operations. We present runtime functions for this modeling and verify that they can predict the execution time both for communication operations in isolation and in the context of application programs.
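
The restructuring idea can be sketched with mpi4py: arrange the processes in a ROWS x COLS grid via communicator splitting and replace one flat MPI_Bcast by a broadcast along the root's row followed by broadcasts down each column. The grid shape below is an illustrative assumption (size divisible by ROWS), and the actual gains are platform- and implementation-dependent.

```python
import numpy as np
from mpi4py import MPI   # run with e.g.: mpiexec -n 8 python ortho_bcast.py

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
ROWS = 2                        # illustrative grid; assumes ROWS divides size
COLS = size // ROWS
row, col = divmod(rank, COLS)

row_comm = comm.Split(color=row, key=col)   # processes sharing a row
col_comm = comm.Split(color=col, key=row)   # processes sharing a column

data = np.arange(4, dtype="i") if rank == 0 else np.empty(4, dtype="i")
if row == 0:                    # phase 1: broadcast along the root's row
    row_comm.Bcast(data, root=0)
col_comm.Bcast(data, root=0)    # phase 2: broadcast down every column
print(rank, data)               # every rank now holds the root's buffer
```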
48

Parallel implementation of curve reconstruction from noisy samples

Randrianarivony, Maharavo; Brunnett, Guido. 06 April 2006.
This paper is concerned with approximating noisy samples by non-uniform rational B-spline curves, with special emphasis on free knots. We show how to set up the problem so that nonlinear optimization methods can be applied efficiently. This involves the introduction of penalty terms to avoid undesired knot positions. We report on our implementation of the nonlinear optimization and show how the program can be implemented in parallel. Our experiments on a parallel computer show that the program achieves a linear speedup and an efficiency close to unity.
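
A minimal free-knot fitting sketch in SciPy in the spirit of this setup, though not the authors' code: for each trial set of interior knots, the control points are obtained by linear least squares, and a penalty term discourages knots from coalescing. NURBS weights and the parallelization are omitted, and BSpline.design_matrix requires SciPy 1.8 or newer.

```python
import numpy as np
from scipy.interpolate import BSpline
from scipy.optimize import minimize

def fit_free_knots(x, y, n_interior=6, k=3, eps=0.02):
    # Fit a B-spline curve with FREE interior knots to noisy samples (x, y).
    a, b = x.min(), x.max()

    def objective(interior):
        interior = np.sort(interior)
        gaps = np.diff(np.r_[a, interior, b])
        # Penalize knot gaps smaller than eps (undesired knot positions).
        penalty = 1e3 * np.sum(np.maximum(eps - gaps, 0.0) ** 2)
        if np.any(gaps <= 0):            # knot left the valid range
            return 1e9 + penalty
        t = np.r_[[a] * (k + 1), interior, [b] * (k + 1)]
        Bm = BSpline.design_matrix(x, t, k).toarray()
        c, *_ = np.linalg.lstsq(Bm, y, rcond=None)   # inner linear step
        return np.sum((Bm @ c - y) ** 2) + penalty

    t0 = np.linspace(a, b, n_interior + 2)[1:-1]     # uniform start
    res = minimize(objective, t0, method="Nelder-Mead",
                   options={"maxiter": 2000, "xatol": 1e-6})
    interior = np.sort(res.x)
    t = np.r_[[a] * (k + 1), interior, [b] * (k + 1)]
    Bm = BSpline.design_matrix(x, t, k).toarray()
    c, *_ = np.linalg.lstsq(Bm, y, rcond=None)
    return BSpline(t, c, k)

rng = np.random.default_rng(3)
x = np.linspace(0.0, 1.0, 200)
y = np.sin(2 * np.pi * x ** 2) + 0.05 * rng.standard_normal(x.size)
spl = fit_free_knots(x, y)
print(np.sqrt(np.mean((spl(x) - y) ** 2)))   # RMS fit error
```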
49

Entwicklung, Untersuchung und Implementierung von parallelen evolutionären Algorithmen für die Modellpartitionierungskomponente parallelMAP [Development, Analysis, and Implementation of Parallel Evolutionary Algorithms for the Model Partitioning Component parallelMAP]

Schulze, Hendrik. 16 November 2017.
This thesis investigates approaches to parallelizing evolutionary algorithms, which are used here to partition data for parallel logic simulation. Alongside a general introduction to the basic concepts and methods of evolutionary algorithms, parallel processing, logic simulation, and data partitioning, the thesis presents the program package pga developed as part of this diploma work and discusses the parallelization methods and communication structures used within it.
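
A generic island-model sketch with mpi4py, purely illustrative and not the pga package from the thesis: each process evolves its own subpopulation and periodically migrates its best individual to the next process in a ring, one of the classic communication structures for parallel evolutionary algorithms. The objective, population size, and migration interval are invented for the example.

```python
import numpy as np
from mpi4py import MPI   # run with e.g.: mpiexec -n 4 python islands.py

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
rng = np.random.default_rng(rank)

def fitness(x):               # toy objective: minimize the sphere function
    return np.sum(x * x)

pop = rng.standard_normal((20, 8))          # 20 individuals, 8 genes each
for gen in range(200):
    # (mu + lambda)-style step: mutate everyone, keep the best half.
    children = pop + 0.1 * rng.standard_normal(pop.shape)
    both = np.vstack([pop, children])
    pop = both[np.argsort([fitness(x) for x in both])[:20]]
    if gen % 25 == 24 and size > 1:         # ring-migration phase
        best = pop[0].copy()
        incoming = comm.sendrecv(best, dest=(rank + 1) % size,
                                 source=(rank - 1) % size)
        pop[-1] = incoming                  # immigrant replaces the worst
print(rank, fitness(pop[0]))
```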
50

Entwicklung effizienter gemischt paralleler Anwendungen [Development of Efficient Mixed-Parallel Applications]

Dümmler, Jörg. 08 June 2010.
Mixed-parallel programming models based on parallel tasks often lead to more efficient and more flexible implementations than pure data parallelism or pure task parallelism. This thesis proposes the CM-task programming model, which extends standard parallel tasks so that communication phases between concurrently executed parallel tasks can be modeled, allowing a better structuring of parallel applications that require a frequent data exchange between different program parts. Based on the CM-task programming model, the CM-task scheduling problem is defined and a corresponding scheduling algorithm is proposed. Application development in the CM-task programming model is supported by the CM-task compiler framework, which stepwise transforms a given platform-independent specification of a parallel algorithm into a platform-specific coordination program. The coordination program contains code for creating and managing the required processor groups, for executing the user-provided CM-tasks on these processor groups, and for realizing the necessary data redistribution operations between the processor groups. The architecture and the interfaces of the CM-task compiler framework are described in detail. The applicability of the CM-task programming model and the CM-task compiler framework is demonstrated with several applications from scientific computing.
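
As a hand-written illustration of the kind of coordination program described (the CM-task compiler framework generates such code automatically; the group split, task bodies, and exchange pattern below are invented for this sketch, and an even MPI process count is assumed): two processor groups execute their tasks concurrently and exchange intermediate data while both are running.

```python
import numpy as np
from mpi4py import MPI   # run with e.g.: mpiexec -n 4 python coord.py

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
group = 0 if rank < size // 2 else 1         # two processor groups
gcomm = comm.Split(color=group, key=rank)    # intra-group communicator

partner = (rank + size // 2) % size          # peer process in the other group
x = np.full(4, float(rank))

for step in range(3):
    # Group-internal "parallel task" work on the group's communicator.
    x = x + gcomm.allreduce(x.sum())
    # CM-task-style communication between the two concurrently executed
    # tasks: pairwise exchange with the partner in the other group.
    x = comm.sendrecv(x, dest=partner, source=partner)
print(rank, x)
```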
