101
Massively Parallel Fast Fourier Transforms and Particle-Mesh Methods / Massiv parallele schnelle Fourier-Transformationen und Teilchen-Gitter-Methoden. Pippig, Michael. 13 October 2015.
This thesis provides a modular view of the structure of fast numerical methods for computing Coulomb interactions between charged particles in three-dimensional space. The common structure is expressed in terms of three self-contained algorithmic frameworks built on top of each other: the fast Fourier transform (FFT), the nonequispaced fast Fourier transform (NFFT), and NFFT-based particle-mesh methods (P²NFFT). For each of these frameworks, algorithmic enhancements and parallel implementations are presented, with special emphasis on scalability up to hundreds of thousands of parallel processes.
In the context of the FFT, massively parallel algorithms are composed from hardware-adaptive low-level modules provided by the FFTW software library. The new algorithmic NFFT concepts include pruned NFFT, interlacing, analytic differentiation, and deconvolution in Fourier space optimized with respect to the mean square aliasing error. Enabled by these generalized concepts, it is shown that the NFFT provides unified access to particle-mesh methods. In particular, mixed-periodic boundary conditions are handled in a consistent way and interlacing can be incorporated more efficiently. Heuristic approaches for parameter tuning are presented on the basis of thorough error estimates.
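To illustrate the transform underlying all three frameworks, the following Python sketch evaluates the nonequispaced discrete Fourier transform (NDFT) directly; an NFFT computes the same sums approximately but fast. The function name, grid size, and node count are illustrative assumptions, not taken from the thesis or the FFTW/NFFT libraries.

```python
import numpy as np

def ndft_3d(f_hat, nodes):
    """Direct nonequispaced DFT: f(x_j) = sum_k f_hat[k] * exp(-2*pi*i*<k, x_j>).

    f_hat : complex array of shape (N, N, N), Fourier coefficients on a full grid
    nodes : real array of shape (M, 3), sampling nodes in [-1/2, 1/2)^3
    Complexity is O(N^3 * M); an NFFT evaluates the same sums approximately
    in O(N^3 log N + M).
    """
    N = f_hat.shape[0]
    k = np.arange(-N // 2, N // 2)                         # frequencies per dimension
    kx, ky, kz = np.meshgrid(k, k, k, indexing="ij")
    freqs = np.stack([kx, ky, kz], axis=-1).reshape(-1, 3)  # (N^3, 3)
    phase = np.exp(-2j * np.pi * nodes @ freqs.T)           # (M, N^3)
    return phase @ f_hat.reshape(-1)

rng = np.random.default_rng(0)
f_hat = rng.standard_normal((8, 8, 8)) + 1j * rng.standard_normal((8, 8, 8))
nodes = rng.uniform(-0.5, 0.5, size=(100, 3))
print(ndft_3d(f_hat, nodes).shape)   # (100,)
```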
102
Optimal Control Problems in Finite-Strain Elasticity by Inner Pressure and Fiber Tension. Günnel, Andreas; Herzog, Roland. 01 September 2016.
Optimal control problems for finite-strain elasticity are considered, in which an inner pressure or an inner fiber tension acts as the driving force. Such internal forces are typical, for instance, of the motion of heliotropic plants and of muscle tissue. Non-standard objective functions relevant for elasticity problems are introduced. Optimality conditions are derived on a formal basis, and a limited-memory quasi-Newton algorithm for their solution is formulated in function space. Numerical experiments confirm the expected mesh-independent performance.
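As an illustration of the reduced-gradient plus limited-memory quasi-Newton structure, here is a Python sketch on a finite-dimensional toy problem: a linear state equation stands in for the nonlinear finite-strain elasticity, and scipy's L-BFGS-B replaces the function-space quasi-Newton method of the paper. All names and the state operator are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

# Toy reduced objective: a 1D "fiber" y(c) responding linearly to a control
# (tension) c, with a tracking term toward a target shape and a Tikhonov term.
n = 50
A = np.diag(2 * np.ones(n)) - np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1)
A *= (n + 1) ** 2                          # 1D Laplacian stiffness (toy state operator)
y_target = np.sin(np.pi * np.linspace(0, 1, n))
alpha = 1e-4

def reduced_objective(c):
    y = np.linalg.solve(A, c)              # state equation A y = c
    r = y - y_target
    J = 0.5 * r @ r + 0.5 * alpha * c @ c
    p = np.linalg.solve(A.T, r)            # adjoint equation A^T p = y - y_target
    return J, p + alpha * c                # objective and reduced gradient

res = minimize(reduced_objective, np.zeros(n), jac=True, method="L-BFGS-B")
print(res.nit, res.fun)
```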
103
Parameter Tuning for the NFFT Based Fast Ewald Summation. Nestler, Franziska. 14 September 2016.
The computation of Coulomb potentials and forces in charged-particle systems under 3d-periodic boundary conditions can be performed efficiently by utilizing the Ewald summation formulas and applying the fast Fourier transform (FFT). In this paper we consider the particle-particle NFFT (P²NFFT) approach, which is based on the fast Fourier transform for nonequispaced data (NFFT), and compare the error behavior of different window functions, which are used to approximate the given continuous charge distribution by a mesh-based charge density. Typically, B-splines are applied within particle-mesh methods, for instance in the well-known particle-particle particle-mesh (P³M) algorithm. The publicly available P²NFFT algorithm allows the application of an oversampled FFT as well as the use of different window functions. For the first time we also consider an approximation by Bessel functions and show how the resulting root mean square errors in the forces can be predicted precisely and efficiently. The results show that, if the parameters are tuned appropriately, the Bessel window function is in many cases the better choice in terms of computational cost. Moreover, the results indicate that it is often advantageous in terms of efficiency to spend some oversampling within the NFFT while using a window function with a smaller support.
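A hedged sketch of the two window-function families discussed here, assuming one common Kaiser-Bessel form; the exact windows, parameters, and normalizations used by P²NFFT may differ.

```python
import numpy as np
from scipy.special import i0

def bspline_window(x, m):
    """Centered cardinal B-spline of order 2m via the Cox-de Boor recursion;
    support is [-m, m]. This is the window classically used in P3M-type
    particle-mesh methods."""
    p = 2 * m
    def B(t, k):                            # B-spline of order k on [0, k]
        if k == 1:
            return np.where((0 <= t) & (t < 1), 1.0, 0.0)
        return (t * B(t, k - 1) + (k - t) * B(t - 1, k - 1)) / (k - 1)
    return B(x + m, p)

def kaiser_bessel_window(x, m, b):
    """One common Kaiser-Bessel form, I0(b*sqrt(m^2 - x^2)) inside the support
    and 0 outside; an illustrative stand-in for the Bessel window."""
    inside = np.abs(x) < m
    arg = np.where(inside, m * m - x * x, 0.0)
    return np.where(inside, i0(b * np.sqrt(arg)), 0.0)

x = np.linspace(-3, 3, 7)
print(bspline_window(x, m=3))
print(kaiser_bessel_window(x, m=3, b=2.5))
```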
104
Taylor and rank-1 lattice based nonequispaced fast Fourier transform. Volkmer, Toni. 25 February 2013.
The nonequispaced fast Fourier transform (NFFT) allows the fast approximate evaluation of trigonometric polynomials with frequencies supported on full box-shaped grids at arbitrary sampling nodes. Due to the curse of dimensionality, the total number of frequencies, and thus the total arithmetic complexity, can already be very large for small refinements in medium dimensions. In this paper, we present an approach for the fast approximate evaluation of trigonometric polynomials with frequencies supported on an arbitrary subset of the full grid at arbitrary sampling nodes, based on Taylor expansion and rank-1 lattice methods. For the special case of symmetric hyperbolic cross index sets in the frequency domain, we present error estimates and numerical results.
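The symmetric hyperbolic cross mentioned above can be generated directly; the following Python sketch (illustrative, not part of the paper's code) enumerates such an index set and contrasts its cardinality with the full grid.

```python
import numpy as np
from itertools import product

def symmetric_hyperbolic_cross(N, d):
    """All integer frequency vectors k in Z^d with prod_i max(1, |k_i|) <= N,
    i.e., a symmetric hyperbolic cross of refinement N. Its cardinality grows
    far slower than the (2N+1)^d frequencies of a full box-shaped grid."""
    rng = range(-N, N + 1)
    return np.array([k for k in product(rng, repeat=d)
                     if np.prod(np.maximum(1, np.abs(k))) <= N])

I = symmetric_hyperbolic_cross(N=16, d=3)
print(len(I), (2 * 16 + 1) ** 3)   # sparse index set vs. full grid size
```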
105
OpenMP parallelization in the NFFT software library. Volkmer, Toni. 29 August 2012.
We describe an implementation of a multi-threaded NFFT (nonequispaced fast Fourier transform) software library and present the parallelization approaches used. Besides the NFFT kernel, the NFFT on the two-sphere and the fast summation based on the NFFT are also parallelized. The parallelization is based on OpenMP and the multi-threaded FFTW library. Furthermore, benchmarks for various cases are performed. The results show that a parallel efficiency above 0.50, and up to 0.79, can still be achieved with 12 threads.
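The parallel efficiency reported here is speedup divided by thread count. The sketch below reproduces that metric using scipy's worker-threaded FFT as a stand-in; the NFFT library itself is C code parallelized with OpenMP on top of the multi-threaded FFTW, and the array sizes here are illustrative.

```python
import time
import numpy as np
from scipy import fft

# Parallel efficiency = (serial time) / (threads * parallel time).
x = (np.random.standard_normal((256, 256, 64))
     + 1j * np.random.standard_normal((256, 256, 64)))

def bench(workers, reps=5):
    t = time.perf_counter()
    for _ in range(reps):
        fft.fftn(x, workers=workers)          # multi-threaded 3D FFT
    return (time.perf_counter() - t) / reps

t1 = bench(1)
for w in (2, 4, 12):
    tw = bench(w)
    print(f"{w:2d} threads: speedup {t1 / tw:.2f}, efficiency {t1 / (w * tw):.2f}")
```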
106
Parallel Three-Dimensional Nonequispaced Fast Fourier Transforms and Their Application to Particle Simulation. Pippig, Michael; Potts, Daniel. 31 August 2012.
In this paper we describe a parallel algorithm for calculating nonequispaced fast Fourier transforms on massively parallel distributed-memory architectures. The algorithm is implemented in an open source software library called PNFFT. Furthermore, we derive a parallel fast algorithm, based on the parallel nonequispaced fast Fourier transform, for the computation of the Coulomb potentials and forces in a charged-particle system. To demonstrate the high scalability of our algorithms we provide performance results on a BlueGene/P system using up to 65536 cores.
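For reference, here is a direct O(N²) evaluation of open-boundary Coulomb potentials, the computation that the PNFFT-based fast algorithm accelerates. A minimal Python sketch with illustrative sizes, useful as a correctness check for any fast solver.

```python
import numpy as np

def coulomb_direct(q, x):
    """O(N^2) direct evaluation of phi_j = sum_{l != j} q_l / |x_j - x_l|.
    A fast NFFT-based method replaces this quadratic cost."""
    d = x[:, None, :] - x[None, :, :]          # pairwise difference vectors
    r = np.sqrt((d * d).sum(-1))
    np.fill_diagonal(r, np.inf)                # exclude self-interaction
    return (q[None, :] / r).sum(-1)

rng = np.random.default_rng(1)
q = rng.choice([-1.0, 1.0], size=200)          # charges
x = rng.uniform(0, 1, size=(200, 3))           # positions
print(coulomb_direct(q, x)[:3])
```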
107
Numerische Simulation des viskoplastischen Verhaltens metallischer Werkstoffe bei endlichen Deformationen / Numerical simulation of viscoplastic behaviour of metallic materials at finite strains. Shutov, Alexey. 14 October 2014.
In recent decades, a quiet revolution has taken place in the field of phenomenological metal plasticity. Thanks to increased computational power combined with mature numerical algorithms, many technically relevant problems have become amenable to reliable numerical analysis. For example, metal forming simulation, the most common application of plasticity theory, enables the analysis of residual stresses and springback in plastically formed semi-finished products and components. Such simulations are of great importance for planning energy- and resource-efficient manufacturing processes and for exploiting plastic load-bearing reserves. Crash test simulation is the second most common application, used in the automotive industry and increasingly in aircraft construction. The need to describe the behaviour of metallic materials at the component level with sufficient accuracy motivates a broad study of material modelling, and the considerable number of different phenomena and effects that must be taken into account leads to a large variety of material models.
Since the solution of complex practical problems entails a very large numerical effort, the advantageous phenomenological approach is preferred. The following aspects must be considered in the design of new phenomenological material models: the accuracy of the description of the material behaviour; the stability and robustness of the associated numerical algorithms; the numerical efficiency; reliable parameter identification over as large a range of applicability as possible; and the clarity and simplicity of the material model. In general, these requirements on a "good material model" contradict each other to some extent, but they nevertheless form the framework for a systematic study. Although this work is aimed primarily at experienced specialists in continuum mechanics, the models and algorithms presented are also of interest to practicing computational engineers.
108
Adaptive FEM for fibre-reinforced 3D structures and laminates / Adaptive FEM für faserverstärkte 3D-Strukturen und Laminate. Weise, Michael. 18 August 2014.
The topic of this thesis is the numerical simulation of transversely isotropic 3D structures and laminates by means of the adaptive finite element method. To achieve this goal, the theoretical background of elastic deformation problems, transverse isotropy, plate theory, and the classical laminate theory is recapitulated. The classical laminate theory implies a combination of the membrane problem and the plate problem with additional coupling terms. The focus of this work is the adjustment of two integral parts of the adaptive FE algorithm according to the classical laminate theory.
One of these parts is the solution of the FE system; a good preconditioner is needed in order to use the conjugate gradient method efficiently. It is shown via a spectral equivalence bound that combining existing preconditioners for the membrane and plate problems yields an effective preconditioner for the coupled laminate problem.
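A sketch of this idea in Python, under strong simplifying assumptions: two 1D model operators stand in for the membrane and plate blocks, a weak random matrix for the coupling terms, and exact block solves for the existing preconditioners; scipy's conjugate gradient method then uses the block preconditioner while ignoring the coupling.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import LinearOperator, cg, splu

# Illustrative stand-ins, not the thesis' FE matrices: a second-order
# "membrane" block, a fourth-order "plate" block, and a weak coupling C.
n = 200
A_m = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n), format="csc") * (n + 1) ** 2
A_p = sp.diags([1, -4, 6, -4, 1], [-2, -1, 0, 1, 2], shape=(n, n), format="csc") * (n + 1) ** 4
C = sp.random(n, n, density=0.01, format="csc", random_state=0)
A = sp.bmat([[A_m, C], [C.T, A_p]], format="csc")   # coupled laminate system

lu_m, lu_p = splu(A_m), splu(A_p)
def precond(r):
    # apply the membrane and plate solvers blockwise, ignoring the coupling
    return np.concatenate([lu_m.solve(r[:n]), lu_p.solve(r[n:])])
M = LinearOperator(A.shape, matvec=precond)

b = np.ones(2 * n)
iters = []
u, info = cg(A, b, M=M, callback=lambda xk: iters.append(1))
print(info, len(iters))   # 0 means converged; few iterations expected
```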
The other part is the error estimation process; the error estimator determines where the current mesh has to be refined for the next step. Existing results on residual error estimators for the elasticity problem, the biharmonic problem, and the plate problem are combined and extended to obtain a posteriori local residual error indicators for the classical laminate theory problem.
The effectiveness of both results is demonstrated by numerical examples.
109
Efficient Algorithms for the Computation of Optimal Quadrature Points on Riemannian Manifolds. Gräf, Manuel. 30 May 2013.
We consider the problem of numerical integration, where one aims to approximate the integral of a given continuous function from its values at given sampling points, also known as quadrature points. A useful framework for such an approximation process is provided by the theory of reproducing kernel Hilbert spaces and the concept of the worst case quadrature error. However, the computation of optimal quadrature points, which minimize the worst case quadrature error, is in general a challenging task and requires efficient algorithms, in particular for large numbers of points.
The focus of this thesis is the efficient computation of optimal quadrature points on the torus T^d, the sphere S^d, and the rotation group SO(3). To this end we present a general framework for the minimization of the worst case quadrature error on Riemannian manifolds, which allows such quadrature points to be constructed numerically. For M quadrature points on a manifold we regard the worst case quadrature error as a function defined on the M-fold product manifold. For the optimization on such high dimensional manifolds we make use of the method of steepest descent, the Newton method, and the conjugate gradient method, for which we propose two efficient evaluation approaches for the worst case quadrature error and its derivatives. The first evaluation approach follows ideas from computational physics: interpreting the quadrature error as a pairwise potential energy allows us to reduce, for certain instances, the complexity of the evaluations from O(M^2) to O(M log(M)). For the second evaluation approach we express the worst case quadrature error in the Fourier domain. This enables us to utilize the nonequispaced fast Fourier transforms for the torus T^d, the sphere S^2, and the rotation group SO(3), which reduce the computational complexity of the worst case quadrature error for polynomial spaces of degree N from O(N^k M) to O(N^k log^2(N) + M), where k is the dimension of the corresponding manifold. For the usual choice N^k ~ M we achieve the complexity O(M log^2(M)) instead of O(M^2). In conjunction with the proposed conjugate gradient method on Riemannian manifolds we arrive at a particularly efficient optimization approach for the computation of optimal quadrature points on the torus T^d, the sphere S^d, and the rotation group SO(3).
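A minimal sketch of the pairwise-potential viewpoint, with the Coulomb (Riesz s = 1) energy on S² standing in for the worst case quadrature error and plain steepest descent with a projection/retraction step standing in for the thesis' manifold optimization methods; step size and point count are illustrative.

```python
import numpy as np

def minimize_sphere_energy(N, steps=500, lr=0.05, seed=0):
    """Distribute N points on S^2 by steepest descent on the pairwise energy
    sum_{j<l} 1/|x_j - x_l|: project the gradient onto the tangent planes,
    take a scaled step, then renormalize back onto the sphere."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((N, 3))
    x /= np.linalg.norm(x, axis=1, keepdims=True)
    for _ in range(steps):
        d = x[:, None, :] - x[None, :, :]
        r = np.linalg.norm(d, axis=-1)
        np.fill_diagonal(r, np.inf)
        g = -(d / r[..., None] ** 3).sum(axis=1)        # Euclidean gradient
        g -= (g * x).sum(1, keepdims=True) * x          # tangential projection
        x -= (lr / N) * g                               # scaled descent step
        x /= np.linalg.norm(x, axis=1, keepdims=True)   # retract onto S^2
    return x

x = minimize_sphere_energy(50)
r = np.linalg.norm(x[:, None] - x[None, :], axis=-1)
np.fill_diagonal(r, np.inf)
print("energy:", (1 / r).sum() / 2)
```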
Finally, with the proposed optimization methods we are able to provide new lists of quadrature formulas for high polynomial degrees N on the sphere S^2 and the rotation group SO(3). Further applications of the proposed optimization framework arise from the interesting connections between worst case quadrature errors, discrepancies, and potential energies. In particular, discrepancies provide an intuitive notion for describing the uniformity of point distributions and are of particular importance for high dimensional integration in quasi-Monte Carlo methods. A generalized form of uniform point distribution arises in applications of image processing and computer graphics, where one is concerned with the problem of distributing points in an optimal way according to a prescribed density function. We show that such problems can be naturally described by the notion of discrepancy and thus fit perfectly into the proposed framework. A typical application is the halftoning of images, where nonuniform distributions of black dots create the illusion of gray-toned images. We show that the proposed optimization methods compete with state-of-the-art halftoning methods.
110
High Dimensional Fast Fourier Transform Based on Rank-1 Lattice Sampling. Kämmerer, Lutz. 21 November 2014.
We consider multivariate trigonometric polynomials with frequencies supported on a fixed but arbitrary frequency index set I, which is a finite set of integer vectors of length d. Naturally, one is interested in spatial discretizations of the d-dimensional torus such that
- the sampling values of the trigonometric polynomial at the nodes of this spatial discretization uniquely determine the trigonometric polynomial,
- the corresponding discrete Fourier transform can be realized fast, and
- the corresponding fast Fourier transform is stable.
An algorithm that computes the discrete Fourier transform with a computational complexity bounded from above, up to logarithmic factors, by a term linear in the maximum of the number of input and output data is called a fast Fourier transform. We call the fast Fourier transform stable if the Fourier matrix of the discrete Fourier transform has a condition number near one and the fast algorithm does not corrupt this theoretical stability.
We suggest using rank-1 lattices and a generalization thereof as spatial discretizations in order to sample multivariate trigonometric polynomials, and we develop construction methods for determining reconstructing sampling sets, i.e., sets of sampling nodes that allow for the unique, fast, and stable reconstruction of trigonometric polynomials. The methods for determining reconstructing rank-1 lattices are component-by-component constructions, similar to the seminal methods developed in the field of numerical integration. In this thesis we identify a component-by-component construction of reconstructing rank-1 lattices that yields an estimate of the number of sampling nodes M,
|I| \le M \le \max\left(\frac{2}{3}|I|^2, \max\{3\|\mathbf{k}\|_\infty \colon \mathbf{k}\in I\}\right),
that is sufficient in order to uniquely reconstruct each multivariate trigonometric polynomial with frequencies supported on the frequency index set I. We observe that the bounds on the number M depend only on the number of frequency indices contained in I and on the expansion of I, but not on the spatial dimension d. Hence, rank-1 lattices are suitable spatial discretizations for arbitrarily high dimensional problems.
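The fast transform behind these discretizations reduces, at rank-1 lattice nodes, to a single one-dimensional FFT: frequencies k collapse onto the residues k·z mod M. A Python sketch with an illustrative generating vector and lattice size (not a reconstructing lattice constructed as in the thesis):

```python
import numpy as np

def lattice_evaluate(f_hat, freqs, z, M):
    """Evaluate f(x_j) = sum_{k in I} f_hat[k] * exp(2*pi*i*<k, x_j>) at the
    rank-1 lattice nodes x_j = (j*z mod M)/M, j = 0..M-1, via one 1D FFT."""
    g_hat = np.zeros(M, dtype=complex)
    np.add.at(g_hat, (freqs @ z) % M, f_hat)   # aggregate aliasing coefficients
    return M * np.fft.ifft(g_hat)              # 1D inverse FFT of length M

# Example: a random trigonometric polynomial on a small 3D frequency set
rng = np.random.default_rng(2)
freqs = rng.integers(-4, 5, size=(20, 3))
f_hat = rng.standard_normal(20) + 1j * rng.standard_normal(20)
z, M = np.array([1, 33, 579]), 1009
x = (np.arange(M)[:, None] * z % M) / M          # lattice nodes in [0,1)^3
f_direct = np.exp(2j * np.pi * x @ freqs.T) @ f_hat
f_fast = lattice_evaluate(f_hat, freqs, z, M)
print(np.max(np.abs(f_direct - f_fast)))         # agrees up to roundoff
```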
Furthermore, we consider a generalization of the concept of rank-1 lattices, which we call generated sets. We use a quite different approach in order to determine suitable reconstructing generated sets. The corresponding construction method is based on a continuous optimization method.
Besides the theoretical considerations, we focus on the practicability of the presented algorithms and illustrate the theoretical findings by means of several examples.
In addition, we investigate the approximation properties of the considered sampling schemes. We apply the results to the most important structures of frequency indices in higher dimensions, so-called hyperbolic crosses, and demonstrate the approximation properties by means of several examples, including the solution of Poisson's equation as one representative of partial differential equations.