61 |
CUDA performance analyzer
Dasgupta, Aniruddha (05 April 2011)
GPGPU computing with CUDA is rapidly gaining ground. CUDA's ease of use and the ubiquity of graphics cards that support it have brought GPGPU to the masses. Although CUDA has a gentle learning curve for programmers familiar with standard languages such as C, extracting optimum performance from it through optimization and hand tuning is not trivial. This is because, on a GPU, an optimization strategy rarely acts in isolation. Many optimizations affect several aspects of execution at once, for better or worse, creating trade-offs that must be handled carefully to achieve good performance. Optimizing an application for CUDA is therefore hard, and the performance gain may not be commensurate with the coding effort invested.
I propose to simplify the process of optimizing CUDA programs with a CUDA Performance Analyzer. The analyzer is based on analytical modeling of CUDA-compatible GPUs. The model characterizes the different aspects of the GPU's compute unified architecture and can predict the expected performance of a CUDA program. It also gives insight into the performance bottlenecks of the CUDA implementation, which hints at which optimizations should be applied to improve performance. Based on the model, one can also predict the performance of the application if those optimizations were applied to the CUDA implementation. This enables a CUDA programmer to try out different optimization strategies without investing a lot of coding effort.
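As a rough illustration of what an analytical performance prediction can look like, the sketch below uses a roofline-style bound. This is a stand-in technique chosen for brevity, not the model developed in the thesis; the function name, its parameters, and the hardware figures are all hypothetical.

```python
def estimate_kernel_time(flops, bytes_moved, peak_flops, peak_bandwidth):
    """Roofline-style lower bound on kernel run time, in seconds.

    Assumes compute and memory traffic overlap perfectly, so the slower
    of the two sets the run time. Real GPUs deviate from this.
    """
    compute_time = flops / peak_flops
    memory_time = bytes_moved / peak_bandwidth
    bottleneck = "compute" if compute_time >= memory_time else "memory"
    return max(compute_time, memory_time), bottleneck


# Hypothetical SAXPY over 2**20 floats: 2 flops and 12 bytes per element.
n = 2 ** 20
time_s, bottleneck = estimate_kernel_time(
    flops=2 * n,
    bytes_moved=12 * n,
    peak_flops=1.0e12,      # assumed peak: 1 TFLOP/s
    peak_bandwidth=1.0e11,  # assumed peak: 100 GB/s
)
print(bottleneck, time_s)
```

With these assumed figures the estimate is limited by memory traffic, which is the kind of bottleneck hint the abstract describes: an optimization that reduces bytes moved would be expected to help, while one that only reduces arithmetic would not.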
|
62 |
A distributed kernel summation framework for machine learning and scientific applications
Lee, Dong Ryeol (11 May 2012)
The computational problems I consider in
this thesis share the common trait of requiring
consideration of pairs (or higher-order tuples)
of data points. I focus on the problem of kernel
summation operations ubiquitous in many data
mining and scientific algorithms.
In machine learning, kernel summations appear in
popular kernel methods which can model nonlinear
structures in data. Kernel methods include many
non-parametric methods such as kernel density
estimation, kernel regression, Gaussian process
regression, kernel PCA, and kernel support vector
machines (SVM). In computational physics,
kernel summations occur inside the classical
N-body problem for simulating positions of a set
of celestial bodies or atoms.
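
To make the term concrete, here is a minimal brute-force sketch of a Gaussian kernel summation, the quantity that appears (up to normalization) in kernel density estimation. The function name and the choice of a Gaussian kernel are assumptions made for illustration; this is the naive all-pairs baseline, not the tree-based or distributed algorithm of the thesis.

```python
import math


def gaussian_kernel_sums(queries, references, bandwidth):
    """For each query q, return the sum over references r of
    exp(-||q - r||^2 / (2 * bandwidth^2)).

    Brute force: evaluates every (query, reference) pair, so the cost
    grows as len(queries) * len(references).
    """
    sums = []
    for q in queries:
        total = 0.0
        for r in references:
            sq_dist = sum((qi - ri) ** 2 for qi, ri in zip(q, r))
            total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
        sums.append(total)
    return sums


references = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
queries = [(0.0, 0.0)]
sums = gaussian_kernel_sums(queries, references, bandwidth=1.0)
print(sums)
```

The nested loop over all pairs is what makes the direct computation expensive for large data sets, and it is this pairwise cost that approximation schemes aim to avoid.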
This thesis attempts to marry, for the first
time, the best relevant techniques in parallel
computing, where kernel summations are in low
dimensions, with the best general-dimension
algorithms from the machine learning literature.
We provide a unified, efficient parallel
kernel summation framework that can utilize:
(1) various types of deterministic and
probabilistic approximations that may be
suitable for both low and high-dimensional
problems with a large number of data points;
(2) indexing the data using any multi-dimensional
binary tree with both distributed memory (MPI)
and shared memory (OpenMP/Intel TBB) parallelism;
(3) a dynamic load balancing scheme to adjust
work imbalances during the computation.
I will first summarize my previous research in
serial kernel summation algorithms. This work
started from Greengard/Rokhlin's earlier work on
fast multipole methods for the purpose of
approximating potential sums of many particles.
The contributions of this part of the thesis
include the following: (1) a reinterpretation of
Greengard/Rokhlin's work for the computer science
community; (2) the extension of the algorithms to
use a larger class of approximation strategies,
i.e. probabilistic error bounds via Monte Carlo
techniques; (3) the multibody series expansion:
the generalization of the theory of fast
multipole methods to handle interactions of more
than two entities; (4) the first O(N) proof of
the batch approximate kernel summation using a
notion of intrinsic dimensionality. I then move
on to the parallelization of kernel summations
and to the scaling of two other
kernel methods, Gaussian process regression
(kernel matrix inversion) and kernel PCA (kernel
matrix eigendecomposition).
The software artifact of this thesis has
contributed to an open-source machine learning
package called MLPACK, which was first
demonstrated at NIPS 2008 and subsequently at the
NIPS 2011 Big Learning Workshop. Completing a
portion of this thesis involved the use of
high-performance computing resources at XSEDE
(eXtreme Science and Engineering Discovery
Environment) and NERSC (National Energy Research
Scientific Computing Center).
|
63 |
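Numerische Berechnung elektromagnetischer Felder - Erweiterung einer Hybridmethode aus Momentenmethode und Einheitlicher Geometrischer Beugungstheorie um die Verallgemeinerte Multipoltechnik / Numerical computation of electromagnetic fields: extension of a hybrid method combining the Method of Moments and the Uniform Geometrical Theory of Diffraction with the Generalized Multipole Technique
Balling, Stefan (30 October 2007)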
Three numerical field computation methods, the Method of Moments, the Uniform Geometrical Theory of Diffraction, and the Generalized Multipole Technique, are combined step by step into an Extended Hybrid Method (EHM). Each combination step is illustrated with examples that demonstrate the advantage of the EHM: with this method, certain configurations can be analyzed extremely effectively.
|
64 |
Modelling visco-elastic seismic wave propagation: a fast-multipole boundary element method and its coupling with finite elements
Grasso, Eva (13 June 2012)
The numerical simulation of elastic wave propagation in unbounded media is a topical issue. The need arises in a variety of real-life engineering problems, from the modelling of railway- or machinery-induced vibrations to the analysis of seismic wave propagation and soil-structure interaction. Because of the complexity of the geometries and material behaviour involved, modelling such situations requires sophisticated numerical methods. The Boundary Element Method (BEM) is a very effective approach for dynamical problems in spatially extended regions (idealized as unbounded), especially since the advent of fast BEMs such as those based on the Fast Multipole Method (FMM) used in this work. The BEM is based on a boundary integral formulation that requires discretizing only the domain boundary (i.e. a surface in 3-D) and accounts implicitly for the radiation conditions at infinity. Its main disadvantage is that it leads a priori to a fully populated and (with the collocation approach) non-symmetric coefficient matrix, which makes the traditional implementation of the method prohibitive for large problems (say O(10^6) boundary DoFs). Applied to the BEM, the Multi-Level Fast Multipole Method (ML-FMM) strongly lowers the computational and memory complexity that hinders the classical formulation, making the ML-FMBEM very competitive for modelling elastic wave propagation. The elastodynamic version of the Fast Multipole BEM (FMBEM), in a form that handles piecewise-homogeneous media, has for instance been used successfully to solve seismic wave propagation problems in previous work (doctoral thesis of S. Chaillat, ENPC, 2008). This thesis aims to extend the capabilities of the existing frequency-domain elastodynamic FMBEM in two directions. Firstly, the time-harmonic elastodynamic ML-FMBEM formulation has been extended to the case of weakly dissipative viscoelastic media.
Secondly, the FMBEM and the Finite Element Method (FEM) have been coupled to take advantage of the versatility of the FEM in modelling complex geometries and non-linearities, while the FMBEM accounts for wave propagation in the surrounding unbounded medium. We consider two strategies for coupling the FMBEM and the FEM to solve three-dimensional time-harmonic wave propagation problems in unbounded domains. The main idea is to separate one or more bounded subdomains (modelled by the FEM) from the complementary semi-infinite viscoelastic propagation medium (modelled by the FMBEM) through a non-overlapping domain decomposition. Both coupling strategies have been implemented, and their performance has been assessed and compared on several examples.
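The abstract's remark that a fully populated matrix becomes prohibitive at around a million boundary DoFs can be checked with simple arithmetic. The sketch below assumes double-precision complex entries (16 bytes each), which seems plausible for a frequency-domain formulation but is not stated in the abstract; the function name is hypothetical.

```python
def dense_matrix_bytes(num_dofs, bytes_per_entry=16):
    """Storage needed for a fully populated num_dofs x num_dofs matrix.

    bytes_per_entry=16 assumes double-precision complex entries.
    """
    return num_dofs * num_dofs * bytes_per_entry


# One million boundary DoFs, as in the abstract's order-of-magnitude example.
terabytes = dense_matrix_bytes(10 ** 6) / 1e12
print(terabytes)  # 16.0
```

Under that assumption, storing the matrix alone takes about 16 TB, before any factorization or solve, which is why methods that never form the full matrix are attractive at this scale.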
|
65 |
Fast Numerical Techniques for Electromagnetic Problems in Frequency Domain
Nilsson, Martin (January 2003)
The Method of Moments is a numerical technique for solving electromagnetic problems formulated as integral equations. The method discretizes a surface in three dimensions, which reduces the dimension of the problem by one. A drawback is that it yields a dense system of linear equations, which effectively prohibits the solution of large-scale problems. Papers I-III describe the Fast Multipole Method, which reduces the cost of a dense matrix-vector multiplication, so that large-scale problems can be solved on personal computers. Paper I analyzes the error introduced by the Fast Multipole Method. Papers II and III describe its implementation. Computing the monostatic Radar Cross Section involves many right-hand sides. Since the Fast Multipole Method computes a matrix times a vector, iterative techniques are used to solve the linear systems. The solution time for each system must be as low as possible; otherwise the total solution time becomes too large. Papers IV-VI describe different techniques for reducing the work in the iterative solver. Paper IV describes a block Quasi-Minimal Residual method for several right-hand sides and a Sparse Approximate Inverse preconditioner that reduce the number of iterations significantly. Papers V and VI describe a method based on linear algebra, the Minimal Residual Interpolation method, which reduces the work in an iterative solver by accurately computing an initial guess. Paper VII describes a hybrid of Physical Optics and the Fast Multipole Method that can handle large problems out of reach for the Fast Multipole Method alone.
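The point that a fast matrix-vector product pairs naturally with an iterative solver can be sketched as follows. The solver below only ever calls a `matvec` function and never sees the matrix itself. Conjugate gradient is used here purely as a compact stand-in: it requires a symmetric positive-definite matrix, whereas Method of Moments systems are generally complex and non-symmetric, which is why the thesis works with Quasi-Minimal Residual methods instead. All names and the 2x2 test system are hypothetical.

```python
def conjugate_gradient(matvec, rhs, tol=1e-10, max_iter=100):
    """Solve A x = b using only products A @ v supplied by `matvec`.

    Valid for symmetric positive-definite A. Returns (x, iterations).
    """
    x = [0.0] * len(rhs)
    r = list(rhs)  # residual b - A x for the initial guess x = 0
    p = list(r)    # first search direction
    rs_old = sum(ri * ri for ri in r)
    for iteration in range(1, max_iter + 1):
        ap = matvec(p)
        alpha = rs_old / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = sum(ri * ri for ri in r)
        if rs_new ** 0.5 < tol:
            return x, iteration
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x, max_iter


def dense_matvec(v):
    # Stand-in for a fast multipole product: a plain dense 2x2 multiply.
    a = [[4.0, 1.0], [1.0, 3.0]]
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]


solution, iterations = conjugate_gradient(dense_matvec, [1.0, 2.0])
print(solution, iterations)
```

Because the solver depends only on `matvec`, the dense multiply could be swapped for a faster approximate product without changing the solver, and each iteration saved (for example through a better initial guess or a preconditioner) saves one such product per right-hand side.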
|
66 |
Correlacao angular direcional gama-gama no nucleo de ^76Se / Gamma-gamma directional angular correlation in the ^76Se nucleus
CAMARGO, SONIA P. de (09 October 2014)
Doctoral thesis / IPEN/T / Instituto de Pesquisas Energeticas e Nucleares - IPEN/CNEN-SP
|
67 |
Expansão de campos eletromagnéticos arbitrários em termos de funções de onda vetoriais / Expansion of arbitrary electromagnetic fields in terms of vector spherical wave functions
Moreira, Wendel Lopes (11 December 2010)
Advisor: Carlos Lenz Cesar / Doctoral thesis - Universidade Estadual de Campinas, Instituto de Física "Gleb Wataghin"
Previous issue date: 2010 / Abstract: Since 1908, when Mie reported analytical expressions for the fields scattered by a spherical particle upon incidence of an electromagnetic plane wave, generalizing his analysis to the case of an arbitrary incident wave has proved elusive. This is due to the presence of certain radially dependent terms in the equation for the beam-shape coefficients of the expansion of the electromagnetic fields in terms of vector spherical wave functions. Here we show for the first time how these terms can be canceled out, allowing analytical expressions for the beam-shape coefficients to be found for a completely arbitrary incident field. We give several examples of how this new method, which is well suited to numerical calculation, can be used. Analytical expressions are found for Bessel beams and the modes of rectangular and cylindrical metallic waveguides. The results are highly relevant for speeding up calculation of the radiation forces acting on spherical particles placed in an arbitrary electromagnetic field, such as in optical tweezers. / Doctorate / Physics / Doctor of Science
|
69 |
Estimation of electromagnetic material properties with application to high-voltage power cables
Ivanenko, Yevhen (January 2017)
Efficient design of high-voltage power cables is important for the economical delivery of electric power from wind farms and power plants over very long distances as well as overseas. The main focus of this thesis is the investigation of electromagnetic losses in the components of high-voltage power cables. The objective of the ongoing research is to develop theory and optimization techniques as tools for choosing materials and designing geometries that minimize high-frequency attenuation and dispersion in HVDC power cables and the power losses associated with HVAC cables. Physical limitations, dispersion relations, and the application of sum rules and convex optimization will be investigated to obtain adequate physical insight and a priori modeling information for these problems. For HVAC power cables, the objectives are addressed by measuring and estimating the complex-valued permeability of cable armour steel in Papers I and II. Efficient analytical solutions for the electromagnetic field generated by helical structures, with applications to HVAC power cables, are obtained in Paper III. For HVDC power cables, the estimation of insulation characteristics from dielectric spectroscopy data using Herglotz functions, convex optimization, and B-splines is investigated in Papers V and VI. The uniqueness requirements for solutions of waveguide problems are reviewed in Paper IV.
|
70 |
Development of a reference method based on the fast multipole boundary element method for sound propagation problems in urban environments : formalism, improvements & applications / Développement d’une méthode de référence basée sur la méthode par éléments de frontières multipolaires pour la propagation sonore en environnement urbain : formalisme, optimisations & applications
Vuylsteke, Xavier (10 December 2014)
Described as one of the ten best algorithms of the 20th century, the fast multipole formalism applied to the boundary element method makes it possible to handle large problems that were inconceivable only a few years ago. The motivation of the present work is to assess the ability of this formalism, and its benefits in terms of computational resources, to solve sound propagation problems and provide reference solutions in three-dimensional dense urban environments, with the aim of assessing or improving fast engineering tools. We first introduce the mathematical background required to derive the boundary integral equation for solving sound propagation problems in unbounded domains. We discuss the conventional and hyper-singular boundary integral equations, used to overcome the numerical artifact of fictitious eigenfrequencies when solving exterior problems. We then give a brief historical and technical overview of the fast multipole principle, introduce the mathematical tools required to expand the elementary solution of the Helmholtz equation, and describe the main steps, from a numerical viewpoint, of fast multipole calculations.
A sound propagation problem in a city block of 5 buildings allows us to highlight instabilities in the recursive computation of translation matrices, which result in discontinuities of the surface pressure and non-convergence of the iterative solver. This observation leads us to consider the very recent work of Gumerov & Duraiswami on a "stable" recursive computation of rotation matrix coefficients in the RCR decomposition. This improved algorithm was then assessed successfully on a multiple-scattering problem up to a dimensionless domain size of 207 wavelengths. Finally, we compare a BEM algorithm, Micado3D, the FMBEM algorithm, and a ray-tracing algorithm, Icare, for the calculation of averaged pressure levels in an open and a closed courtyard. The fast multipole algorithm validated the results computed with Icare in the open courtyard up to 300 Hz (i.e. 100 wavelengths), while in the closed courtyard, a very sensitive area without direct or reflected fields, further investigation of preconditioning seems required to ensure the reliability of solutions provided by algorithms based on iterative solvers.
|