31 |
Rapid response in psychological treatments for binge-eating disorder. Hilbert, Anja; Hildebrandt, Thomas; Agras, W. Stewart; Wilfley, Denise E.; Wilson, G. Terence. 12 April 2017.
Objective: Analysis of short- and long-term effects of rapid response across three different treatments for binge-eating disorder (BED). Method: In a randomized clinical study comparing interpersonal psychotherapy (IPT), cognitive-behavioral guided self-help (CBTgsh), and behavioral weight loss (BWL) treatment in 205 adults meeting DSM-IV criteria for BED, the predictive value of rapid response, defined as ≥ 70% reduction in binge-eating by week four, was determined for remission from binge-eating and global eating disorder psychopathology at posttreatment, 6-, 12-, 18-, and 24-month follow-up. Results: Rapid responders in CBTgsh, but not in IPT or BWL, showed significantly greater rates of remission from binge-eating than non-rapid responders, which was sustained over the long term. Rapid and non-rapid responders in IPT and rapid responders in CBTgsh showed a greater remission from binge-eating than non-rapid responders in CBTgsh and BWL. Rapid responders in CBTgsh showed greater remission from binge-eating than rapid responders in BWL. Although rapid responders in all treatments had lower global eating disorder psychopathology than non-rapid responders in the short term, rapid responders in CBTgsh and IPT were more improved than those in BWL and non-rapid responders in each treatment. Rapid responders in BWL did not differ from non-rapid responders in CBTgsh and IPT. Conclusions: Rapid response is a treatment-specific positive prognostic indicator of sustained remission from binge-eating in CBTgsh. Regarding an evidence-based stepped care model, IPT, equally efficacious for rapid and non-rapid responders, could be investigated as a second-line treatment in case of non-rapid response to first-line CBTgsh.
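As a hedged illustration of the rapid-response criterion used in the study (a reduction of at least 70% in binge-eating frequency by week four), the following Python sketch classifies hypothetical participants. The weekly counts and the handling of a zero baseline are assumptions for illustration, not data or rules from the trial.

```python
# Illustrative sketch (not study data): classify rapid response as a >= 70%
# reduction in binge-eating episodes by week four, relative to baseline.
baseline_binges = {"p01": 12, "p02": 8, "p03": 15}   # hypothetical weekly counts
week4_binges = {"p01": 2, "p02": 7, "p03": 4}

def is_rapid_responder(baseline: float, week4: float, threshold: float = 0.70) -> bool:
    """Return True if binge frequency dropped by at least `threshold` (70%)."""
    if baseline == 0:
        return True  # no binges at baseline; treated as responder here (assumption)
    reduction = (baseline - week4) / baseline
    return reduction >= threshold

for pid in baseline_binges:
    print(pid, is_rapid_responder(baseline_binges[pid], week4_binges[pid]))
```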
|
32 |
Advancing the capabilities of Rapid Acquisition with Relaxation Enhancement magnetic resonance imaging. Paul, Katharina. 01 December 2015.
This thesis presents novel fast imaging techniques for high-field and ultrahigh-field magnetic resonance imaging. First, the basic principles of fast spin-echo techniques are examined; these physical considerations form the foundation for the development of modified techniques. In a first stage of development, a new variant of fast spin-echo imaging is introduced. This technique generates anatomical and functional image contrast within a single acquisition. The decisive advantage of the proposed approach is a substantial reduction of the measurement time; in addition, it markedly reduces image artifacts that, in the conventional case, are frequently caused by motion. The second stage of development addresses the implementation of a fast spin-echo technique for imaging the physical phenomenon of Brownian molecular motion. Diffusion measurements of molecular motion are very demanding because they are superimposed by macroscopic motion. This difficulty is overcome methodically in this thesis by implementing a diffusion-weighted fast spin-echo technique. The third stage of development concentrates on susceptibility-weighted fast spin-echo imaging. Conventional susceptibility-weighted imaging techniques are prone to artifacts that manifest as signal voids. To meet this challenge methodically, this thesis investigates the potential of a susceptibility-weighted fast spin-echo technique for characterizing the microstructure of the heart muscle at 7.0 T. One goal of the fast spin-echo methods newly developed in this thesis is to overcome the limitations of existing techniques, thereby laying the groundwork, beyond basic research, for clinical applications of the developed physical insights and methods. / This thesis presents novel fast imaging techniques for magnetic resonance imaging. Rapid Acquisition with Relaxation Enhancement (RARE) is a fast imaging technique. An ever-growing number of clinical applications renders clinically and physically motivated advancement of RARE imaging necessary. This thesis focuses on the advancement of RARE imaging. For this purpose, the basic principle of RARE imaging is examined. The first part proposes a novel RARE variant which provides simultaneous anatomical and functional contrast within one acquisition. This approach provides an alternative to conventional RARE variants, where sequential acquisitions are used to achieve different image contrasts. With the speed gain of the proposed approach, scanning time can be shortened substantially, together with a reduced propensity for motion artifacts. The second part focuses on diffusion weighted MRI. Probing diffusion on a micrometer scale is challenging because of MRI's sensitivity to bulk motion. Unfortunately, conventional rapid diffusion weighted imaging techniques are prone to severe image distortions. Realizing this constraint, a diffusion weighted RARE technique that affords the generation of diffusion weighted images free of distortion is implemented. The third part is formed around susceptibility weighted MRI. The underlying biophysical mechanisms allow the assessment of tissue microstructure. Common susceptibility weighted imaging techniques are prone to image artifacts. Recognizing the opportunities of susceptibility weighted MRI, the potential of a susceptibility weighted RARE technique is investigated with the goal of assessing myocardial microstructure. The goal of the novel RARE developments is to overcome constraints of existing imaging techniques. The physical considerations and the novel methodology introduced in this thesis reach beyond the scope of basic research and lay the foundation for clinical applicability.
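The speed advantage of RARE-type fast spin-echo imaging comes from acquiring an echo train of several phase-encoded echoes per excitation, which divides the number of excitations needed to fill k-space. The sketch below illustrates only this generic relationship; the repetition time, matrix size, and echo train length are illustrative assumptions, not parameters from the thesis.

```python
# Generic fast spin-echo (RARE) scan-time estimate: acquiring ETL phase-encoded
# echoes per excitation reduces the number of excitations, and hence scan time,
# roughly by the echo train length. All values below are illustrative.
import math

def scan_time_seconds(tr_s, n_phase_encodes, etl):
    shots = math.ceil(n_phase_encodes / etl)   # excitations needed to fill k-space
    return tr_s * shots

tr = 3.0      # repetition time in seconds (T2-weighted protocol, assumed)
n_pe = 256    # phase-encoding lines (assumed)
print("conventional spin echo:", scan_time_seconds(tr, n_pe, etl=1), "s")
print("RARE, echo train 16:  ", scan_time_seconds(tr, n_pe, etl=16), "s")
```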
|
33 |
The role of interneuronal networks in hippocampal ripple oscillations. Leiva, José Ramón Donoso. 05 December 2016.
Hippocampal sharp wave-ripples (SWRs) are electrographic events that play a role in the consolidation of memories. An SWR is characterized by a fast oscillation (>90 Hz, the ''ripple'') superimposed on the slower (<30 Hz) ''sharp wave''. / Hippocampal sharp wave-ripples (SWRs) are electrographic events that have been implicated in memory consolidation. An SWR is characterized by a fast (>90 Hz) oscillation, the ripple, superimposed on a slow (<30 Hz) sharp wave. In vivo, the fast component can express frequencies either in the ripple range (140-200 Hz) or the fast-gamma range (90-140 Hz). Episodes in both bands exhibit intra-ripple frequency accommodation (IFA). In vitro, ripples are frequency-resistant to GABA modulators. These features constrain the type of mechanisms underlying the generation of the fast component. A prominent hypothesis proposes that a recurrent network of parvalbumin-immunoreactive basket cells (PV+BC) is responsible for setting the ripple frequency. The focus of the present thesis is on testing to what extent the PV+BC network can account for the aforementioned features of SWRs, which remain unexplained. Here, I simulated and analyzed a physiologically constrained in silico model of the PV+BC network in CA1 under different conditions of excitatory drive. The response of the network to transient excitation exhibits both IFA in the ripple band and frequency resistance to GABA modulators. The expression of IFA in the fast-gamma band requires the involvement of pyramidal cells in a closed loop with the PV+BC network. The model predicts a peculiar relationship between the instantaneous frequency of ripples and the time course of the excitatory input to CA1. This prediction was confirmed in an in vitro model of SWRs. Additionally, I study the involvement of oriens lacunosum-moleculare (O-LM) interneurons during SWRs in vitro. I characterize the excitatory currents received by O-LM cells during SWRs and investigate the factors that determine their recruitment.
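Intra-ripple frequency accommodation is typically quantified from the instantaneous frequency of the band-passed field potential. The sketch below shows one common way to do this with the Hilbert transform; the synthetic signal, sampling rate, and filter band are assumptions for illustration, not the analysis pipeline of the thesis.

```python
# Hedged analysis sketch: estimate the instantaneous frequency of a ripple-band
# oscillation via the Hilbert transform, the usual way IFA is quantified.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

fs = 5000.0                                   # sampling rate in Hz (assumed)
t = np.arange(0, 0.1, 1 / fs)                 # 100 ms episode
freq = 180 - 300 * t                          # frequency drifts 180 -> 150 Hz (IFA-like)
lfp = np.sin(2 * np.pi * np.cumsum(freq) / fs) + 0.2 * np.random.randn(t.size)

b, a = butter(4, [90, 250], btype="bandpass", fs=fs)   # ripple / fast-gamma band
ripple = filtfilt(b, a, lfp)

phase = np.unwrap(np.angle(hilbert(ripple)))
inst_freq = np.diff(phase) * fs / (2 * np.pi)          # instantaneous frequency in Hz
print("mean instantaneous frequency: %.1f Hz" % inst_freq[10:-10].mean())
```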
|
34 |
Interpretable Approximation of High-Dimensional Data based on the ANOVA Decomposition. Schmischke, Michael. 08 July 2022.
The thesis is dedicated to the approximation of high-dimensional functions from scattered data nodes. Many methods in this area lack the property of interpretability in the context of explainable artificial intelligence. The idea is to address this shortcoming by proposing a new method that is intrinsically designed around interpretability. The multivariate analysis of variance (ANOVA) decomposition is the main tool to achieve this purpose. We study the connection between the ANOVA decomposition and orthonormal bases to obtain a powerful basis representation. Moreover, we focus on functions that are mostly explained by low-order interactions to circumvent the curse of dimensionality in its exponential form. Through the connection with grouped index sets, we can propose a least-squares approximation idea via iterative LSQR. Here, the proposed grouped transformations provide fast algorithms for multiplication with the appearing matrices. Through global sensitivity indices we are then able to analyze the approximation, which can in turn be used to improve it further. The method is also well-suited for the approximation of real data sets, where the sparsity-of-effects principle ensures a low-dimensional structure. We demonstrate the applicability of the method in multiple numerical experiments with real and synthetic data.
1 Introduction
2 The Classical ANOVA Decomposition
3 Fast Multiplication with Grouped Transformations
4 High-Dimensional Explainable ANOVA Approximation
5 Numerical Experiments with Synthetic Data
6 Numerical Experiments with Real Data
7 Conclusion
Bibliography / The thesis is devoted to the approximation of high-dimensional functions from scattered data points. Many methods in this area suffer from a lack of interpretability, which is of particular importance in the context of explainable artificial intelligence. To address this problem, we propose a new method that is built around the concept of interpretability. Our most important tool for this is the analysis of variance (ANOVA) decomposition. In particular, we consider the connection of the ANOVA decomposition to orthonormal bases and obtain an important series representation. In addition, we focus on functions that are mainly explained by low-dimensional variable interactions, which helps us to overcome the curse of dimensionality in its exponential form. Via the connection to grouped index sets, we then propose a least-squares approximation with the iterative LSQR algorithm, where the proposed grouped transformations provide fast multiplication with the corresponding matrices. With the help of global sensitivity indices, we can analyze the approximation and improve it further. The method is also well suited to approximating real data sets, where the sparsity-of-effects principle ensures that we work with low-dimensional structures. We demonstrate the applicability of the method in various numerical experiments with real and synthetic data.
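The following Python sketch illustrates, in a deliberately simplified form, the core idea described in this entry: approximate a function by basis terms indexed by variable subsets of order at most two, fit the coefficients by least squares with LSQR, and derive crude global sensitivity indices per subset. The basis choice, the synthetic test function, and the sensitivity heuristic are my own assumptions, not the thesis's algorithm or its grouped transformations.

```python
# Minimal sketch of a low-order ANOVA-type approximation (simplified, assumed setup).
import itertools
import numpy as np
from scipy.sparse.linalg import lsqr

rng = np.random.default_rng(0)
d, n = 5, 2000
X = rng.uniform(-1, 1, size=(n, d))
# Synthetic test function dominated by low-order interactions.
y = np.sin(np.pi * X[:, 0]) + X[:, 1] ** 2 + 0.5 * X[:, 2] * X[:, 3] \
    + 0.01 * rng.standard_normal(n)

def legendre_features(x, degree):
    """Legendre-type features P_1, P_2 per variable (orthogonal on [-1, 1])."""
    return [x, 0.5 * (3 * x ** 2 - 1)][:degree]

# Variable subsets of order <= 2 (the ANOVA terms we keep).
subsets = [(i,) for i in range(d)] + list(itertools.combinations(range(d), 2))

columns, labels = [np.ones(n)], [()]          # constant term first
for u in subsets:
    per_var = [legendre_features(X[:, i], 2) for i in u]
    for combo in itertools.product(*per_var): # tensor products of 1D features
        columns.append(np.prod(combo, axis=0))
        labels.append(u)
A = np.column_stack(columns)

coef = lsqr(A, y, atol=1e-10, btol=1e-10)[0]  # least-squares fit via LSQR

# Crude sensitivity index per subset: share of explained variance of its columns.
contrib = {}
for u in set(labels) - {()}:
    idx = [j for j, lab in enumerate(labels) if lab == u]
    contrib[u] = np.var(A[:, idx] @ coef[idx])
total = sum(contrib.values())
for u, v in sorted(contrib.items(), key=lambda kv: -kv[1])[:5]:
    print(u, round(v / total, 3))
```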
|
35 |
Integral Equation Methods for Rough Surface Scattering Problems in Three Dimensions / Integralgleichungsmethoden für Streuprobleme an rauhen Oberflächen in drei Dimensionen. Heinemeyer, Eric. 10 January 2008.
No description available.
|
36 |
Energie- und Ausführungszeitmodelle zur effizienten Ausführung wissenschaftlicher Simulationen / Energy and execution time models for an efficient execution of scientific simulations. Lang, Jens. 15 January 2015.
Scientific computing with computer simulation has established itself as the third pillar of scientific methodology, alongside theory and experiment. The task of computer science in scientific computing is to develop efficient simulation algorithms as well as to implement them efficiently.
This thesis focuses on the efficient implementation of two important methods of scientific computing: the Fast Multipole Method (FMM) for particle simulations and the Finite Element Method (FEM), which is used, for example, to compute the deformation of solids. The efficiency of the implementation here refers to the execution time of the simulations and the energy consumption of the computing systems required for their execution.
The increase in efficiency was achieved through model-based autotuning. In model-based autotuning, a model is set up for the essential parts of the algorithm that describes their execution time or energy consumption. This model depends on properties of the computing system used, on the input data and on various parameters of the algorithm. The properties of the computing system are determined by executing the actually used code for different implementation variants. These comprise a CPU implementation and a graphics processor implementation for the FEM, and implementations of the near-field and far-field interaction computations for the FMM. Based on these models, the execution costs are predicted for each variant. The optimal algorithm parameters can thus be determined analytically in order to minimize the desired target quantity, i.e. execution time or energy consumption. When the simulation is executed, the most efficient implementation variants are used according to the prediction. While for the FMM the performance measurements are carried out independently of the execution of the simulation, for the FEM a method for dynamically distributing the computational load between CPU and GPU is presented that reacts to execution time measurements taken while the simulation is running. By measuring the actual execution times, it is possible to react dynamically to conditions that change during runtime and to adapt the distribution of the computational load accordingly.
The results of this thesis show that model-based autotuning makes it possible to increase the efficiency of scientific computing applications with respect to execution time and energy consumption. In particular, taking the energy consumption of alternative execution paths into account, i.e. energy adaptivity, will be of great importance in scientific computing in the near future. / Computer simulation, as part of scientific computing, has established itself as the third pillar of scientific methodology, besides theory and experiment. The task of computer science in the field of scientific computing is the development of efficient simulation algorithms as well as their efficient implementation.
The thesis focuses on the efficient implementation of two important methods in scientific computing: the Fast Multipole Method (FMM) for particle simulations, and the Finite Element Method (FEM), which is, e.g., used for deformation problems of solids. The efficiency of the implementation considers the execution time of the simulations and the energy consumption of the computing systems needed for the execution.
The method used for increasing the efficiency is model-based autotuning. For model-based autotuning, a model for the substantial parts of the algorithm is set up which estimates the execution time or energy consumption. This model depends on properties of the computer used, of the input data and of parameters of the algorithm. The properties of the computer are determined by executing the real code for different implementation variants. These implementation variants comprise a CPU and a graphics processor implementation for the FEM, and implementations of near-field and far-field interaction calculations for the FMM. Using the models, the execution costs for each variant are predicted. Thus, the optimal algorithm parameters can be determined analytically for a minimisation of the desired target value, i.e. execution time or energy consumption. When the simulation is executed, the most efficient implementation variants are used depending on the prediction of the model. While for the FMM the performance measurement takes place independently from the execution of the simulation, for the FEM a method for dynamically distributing the workload to the CPU and the GPU is presented, which takes into account execution times measured at runtime. By measuring the real execution times, it is possible to respond to changing conditions and to adapt the distribution of the workload accordingly.
The results of the thesis show that model-based autotuning makes it possible to increase the efficiency of applications in scientific computing regarding execution time and energy consumption. In particular, the consideration of the energy consumption of alternative execution paths, i.e. the energy adaptivity, will be of great importance in scientific computing in the near future.
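As a hedged illustration of the model-based autotuning idea summarized above, the sketch below fits a simple linear cost model per implementation variant from a few benchmark runs and selects the variant with the lowest predicted cost for a given problem size. The cost model form and all numbers are assumptions for illustration, not the models or measurements of the thesis.

```python
# Hedged sketch of model-based variant selection: fit cost ~ a + b * size per
# variant from benchmark runs, then pick the cheapest predicted variant.
import numpy as np

sizes = np.array([1e4, 5e4, 1e5, 5e5])
measured = {
    "cpu": np.array([0.08, 0.41, 0.83, 4.20]),   # seconds (hypothetical)
    "gpu": np.array([0.30, 0.35, 0.45, 1.10]),   # includes transfer overhead (hypothetical)
}

def fit_linear_model(x, y):
    """Least-squares fit of cost ~ a + b * size."""
    A = np.column_stack([np.ones_like(x), x])
    a, b = np.linalg.lstsq(A, y, rcond=None)[0]
    return lambda n: a + b * n

models = {name: fit_linear_model(sizes, cost) for name, cost in measured.items()}

def choose_variant(problem_size):
    predictions = {name: model(problem_size) for name, model in models.items()}
    return min(predictions, key=predictions.get), predictions

print(choose_variant(2e4))   # small problem: CPU predicted cheaper
print(choose_variant(8e5))   # large problem: GPU predicted cheaper
```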
|
37 |
PFFT - An Extension of FFTW to Massively Parallel Architectures. Pippig, Michael. January 2012.
We present an MPI-based software library for computing fast Fourier transforms on massively parallel, distributed memory architectures. Similar to established transpose FFT algorithms, we propose a parallel FFT framework that is based on a combination of local FFTs, local data permutations and global data transpositions. This framework can be generalized to arbitrary multi-dimensional data and process meshes. All performance-relevant building blocks can be implemented with the help of the FFTW software library. Therefore, our library offers great flexibility and portable performance. Like FFTW, we are able to compute FFTs of complex data, real data and even- or odd-symmetric real data. All the transforms can be performed completely in place. Furthermore, we propose an algorithm to calculate pruned FFTs more efficiently on distributed memory architectures.
For example, we provide performance measurements of FFTs of size 512^3 and 1024^3 up to 262144 cores on a BlueGene/P architecture.
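The combination of local FFTs and global transpositions described above can be illustrated on a single process: a multi-dimensional FFT is computed as one-dimensional FFTs along the locally contiguous axis, interleaved with axis permutations that stand in for the global data transpositions. The sketch below is a conceptual illustration in Python, not the PFFT or FFTW API.

```python
# Single-process sketch of the transpose-FFT idea: 1D FFTs along local axes,
# interleaved with "transposes" (here plain axis moves; in the parallel case
# these become global all-to-all communication steps).
import numpy as np

def fft3d_by_transposes(x):
    for _ in range(3):
        x = np.fft.fft(x, axis=-1)    # local 1D FFTs along the contiguous axis
        x = np.moveaxis(x, -1, 0)     # transpose: make the next axis contiguous
    return x

rng = np.random.default_rng(1)
a = rng.standard_normal((8, 8, 8)) + 1j * rng.standard_normal((8, 8, 8))
print(np.allclose(fft3d_by_transposes(a), np.fft.fftn(a)))   # True
```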
|
38 |
Organization of Smallholder Tree Growers, Support Organizations, Linkages and Implications for Woodlots Performance: The Case of Mufindi District, Tanzania. Hingi Simon, Ombeni. 02 May 2019.
Woodlots have become the most important investment opportunity among smallholders of Mufindi district in the southern highlands of Tanzania. Smallholder woodlots are also a major source of wood supply, helping to narrow the supply gap, which in 2015 was reported to be 19.5 million m3 per year, with construction and domestic heating energy being the main wood-consuming sectors. However, inadequate information about smallholder woodlots, supporting organizations, their linkages and their impacts on woodlot performance derails their sustainable development and potential contribution to wood supply and poverty alleviation. The present study therefore explored tree growers' motivations, knowledge base and challenges in woodlot farming; assessed woodlot tree species, products and performance; analyzed the linkages of tree growers with support organizations; and evaluated their impacts on the performance of woodlots. Both survey and case study approaches were used to collect data in three villages, namely Igowole, Mninga and Nundwe, all in Mufindi district, Tanzania. Mufindi district was purposively selected because of its advanced smallholder tree growing. Across the three villages, a total of 93 actors were approached by snowball sampling, including 72 tree grower households (24 from each village), 14 nursery operators and 9 support organizations. In-depth interviews were then conducted with all 72 sampled households. Of these, 48 woodlots (in Igowole and Nundwe, 12 from organized and 12 from unorganized tree growers per village) were assessed by a rapid appraisal (RA) approach and their performance compared, while 24 woodlots, all from unorganized tree growers, were assessed in Mninga village. Quantitative data were analyzed using SPSS version 20 and the results summarized in tables and graphs in Excel. Woodlot performance and social network data were analyzed using R software. Based on the study respondents, the results revealed that tree growers were motivated to plant and manage trees mainly for economic reasons (48%, 45% and 51%) and land security reasons (37%, 30% and 31%) for Igowole, Mninga and Nundwe, respectively. Regarding the knowledge base, most tree growers (75% to 100%) in all three villages had knowledge of land preparation, nursery management, planting, weeding, pruning and fire protection. However, in all the villages, respondents lacked knowledge of forest growth principles and dynamics, of objectives for the products of the plantations, and of the influence of tree spacing on such desired products. Further analyses revealed that fire, inadequate knowledge, inadequate capital, lack of improved seeds and low timber/tree prices were the main challenges constraining farmers from planting and managing trees in woodlots in the three study villages. The main tree species in the study area were Pinus patula and Eucalyptus sp. Organized tree growers were supported by organizations much more than unorganized ones. Logistic regression analysis performed in R (P = 0.05) revealed significant differences in woodlot performance for organized farmers based on the gaps (P = 0.00216), growth condition (P = 0.04478) and planting space (P = 0.02013) criteria; that is, woodlots from organized farmers generally performed better than those from unorganized farmers. The better performance of woodlots owned by organized tree growers was attributed to social capital built through networks and to the collective action of farmers in tree grower associations (TGAs).
Nursery operator farmers were the main source of tree seedlings for unorganized tree growers, while organized tree growers obtained most resources, including knowledge, seeds, planting materials and funds, from TGAs, which were in turn supported by organizations. For future planning, nursery operator farmers should therefore be supported with improved seeds and planting materials so that unorganized farmers also benefit. Nursery operator farmers should be encouraged to join TGAs, and TGAs should generally be adopted as an effective support platform for smallholder tree growers in the study area.
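As a purely illustrative sketch of the kind of logistic regression analysis reported above, the code below fits a logistic model of woodlot performance on organization status and two assessment criteria. All data are synthetic and all variable names are hypothetical; they are not the study's dataset, which was analyzed in R and SPSS.

```python
# Illustrative only: logistic regression of woodlot performance (good = 1) on
# grower organization status and assessment criteria, with synthetic data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 96
organized = rng.integers(0, 2, n)                        # 1 = member of a TGA (hypothetical)
gaps = rng.normal(0.2 - 0.08 * organized, 0.05, n)       # fraction of missing trees
spacing_ok = rng.binomial(1, 0.5 + 0.2 * organized, n)   # correct planting space
logit = -0.5 + 1.2 * organized - 6.0 * gaps + 0.8 * spacing_ok
good_performance = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X = sm.add_constant(np.column_stack([organized, gaps, spacing_ok]))
model = sm.Logit(good_performance, X).fit(disp=False)
print(model.summary(xname=["const", "organized", "gaps", "spacing_ok"]))
```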
|
39 |
Energie- und Ausführungszeitmodelle zur effizienten Ausführung wissenschaftlicher Simulationen / Energy and execution time models for an efficient execution of scientific simulations. Lang, Jens. 09 December 2014.
Scientific computing with computer simulation has established itself as the third pillar of scientific methodology, alongside theory and experiment. The task of computer science in scientific computing is to develop efficient simulation algorithms as well as to implement them efficiently.
This thesis focuses on the efficient implementation of two important methods of scientific computing: the Fast Multipole Method (FMM) for particle simulations and the Finite Element Method (FEM), which is used, for example, to compute the deformation of solids. The efficiency of the implementation here refers to the execution time of the simulations and the energy consumption of the computing systems required for their execution.
The increase in efficiency was achieved through model-based autotuning. In model-based autotuning, a model is set up for the essential parts of the algorithm that describes their execution time or energy consumption. This model depends on properties of the computing system used, on the input data and on various parameters of the algorithm. The properties of the computing system are determined by executing the actually used code for different implementation variants. These comprise a CPU implementation and a graphics processor implementation for the FEM, and implementations of the near-field and far-field interaction computations for the FMM. Based on these models, the execution costs are predicted for each variant. The optimal algorithm parameters can thus be determined analytically in order to minimize the desired target quantity, i.e. execution time or energy consumption. When the simulation is executed, the most efficient implementation variants are used according to the prediction. While for the FMM the performance measurements are carried out independently of the execution of the simulation, for the FEM a method for dynamically distributing the computational load between CPU and GPU is presented that reacts to execution time measurements taken while the simulation is running. By measuring the actual execution times, it is possible to react dynamically to conditions that change during runtime and to adapt the distribution of the computational load accordingly.
The results of this thesis show that model-based autotuning makes it possible to increase the efficiency of scientific computing applications with respect to execution time and energy consumption. In particular, taking the energy consumption of alternative execution paths into account, i.e. energy adaptivity, will be of great importance in scientific computing in the near future. / Computer simulation, as part of scientific computing, has established itself as the third pillar of scientific methodology, besides theory and experiment. The task of computer science in the field of scientific computing is the development of efficient simulation algorithms as well as their efficient implementation.
The thesis focuses on the efficient implementation of two important methods in scientific computing: the Fast Multipole Method (FMM) for particle simulations, and the Finite Element Method (FEM), which is, e.g., used for deformation problems of solids. The efficiency of the implementation considers the execution time of the simulations and the energy consumption of the computing systems needed for the execution.
The method used for increasing the efficiency is model-based autotuning. For model-based autotuning, a model for the substantial parts of the algorithm is set up which estimates the execution time or energy consumption. This model depends on properties of the computer used, of the input data and of parameters of the algorithm. The properties of the computer are determined by executing the real code for different implementation variants. These implementation variants comprise a CPU and a graphics processor implementation for the FEM, and implementations of near-field and far-field interaction calculations for the FMM. Using the models, the execution costs for each variant are predicted. Thus, the optimal algorithm parameters can be determined analytically for a minimisation of the desired target value, i.e. execution time or energy consumption. When the simulation is executed, the most efficient implementation variants are used depending on the prediction of the model. While for the FMM the performance measurement takes place independently from the execution of the simulation, for the FEM a method for dynamically distributing the workload to the CPU and the GPU is presented, which takes into account execution times measured at runtime. By measuring the real execution times, it is possible to respond to changing conditions and to adapt the distribution of the workload accordingly.
The results of the thesis show that model-based autotuning makes it possible to increase the efficiency of applications in scientific computing regarding execution time and energy consumption. In particular, the consideration of the energy consumption of alternative execution paths, i.e. the energy adaptivity, will be of great importance in scientific computing in the near future.
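As a hedged illustration of the dynamic CPU/GPU load distribution described above, the sketch below re-adjusts the share of work given to the GPU from execution times measured in the previous step, so the split adapts when runtime conditions change. The throughput rates and the rebalancing rule are illustrative assumptions, not the scheme implemented in the thesis.

```python
# Hedged sketch of measurement-driven CPU/GPU load balancing across time steps.
import random

def simulated_step(cpu_items, gpu_items, cpu_rate, gpu_rate):
    """Pretend to run one step; return per-device wall times (hypothetical rates)."""
    return cpu_items / cpu_rate, gpu_items / gpu_rate

total_items = 100_000
gpu_fraction = 0.5                                   # initial 50/50 split
for step in range(5):
    cpu_rate = 20_000 * random.uniform(0.8, 1.2)     # items/s, drifts at runtime
    gpu_rate = 60_000 * random.uniform(0.8, 1.2)
    gpu_items = int(total_items * gpu_fraction)
    t_cpu, t_gpu = simulated_step(total_items - gpu_items, gpu_items, cpu_rate, gpu_rate)
    # Rebalance: give the GPU a share proportional to its measured throughput.
    cpu_speed = (total_items - gpu_items) / t_cpu
    gpu_speed = gpu_items / t_gpu
    gpu_fraction = gpu_speed / (cpu_speed + gpu_speed)
    print(f"step {step}: t_cpu={t_cpu:.2f}s t_gpu={t_gpu:.2f}s -> gpu share {gpu_fraction:.2f}")
```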
|
40 |
Parameter tuning for the NFFT based fast Ewald summation. Nestler, Franziska. 23 March 2015.
The computation of the Coulomb potentials and forces in charged particle systems under 3d-periodic boundary conditions is possible in an efficient way by utilizing the Ewald summation formulas and applying the fast Fourier transform (FFT). In this paper we consider the particle-particle NFFT (P2NFFT) approach, which is based on the fast Fourier transform for nonequispaced data (NFFT), and compare the error behavior for different window functions, which are used in order to approximate the given continuous charge distribution by a mesh-based charge density. While typically B-splines are applied in the scope of particle mesh methods, we consider for the first time also an approximation by Bessel functions. We show how the resulting root mean square errors in the forces can be predicted precisely and efficiently. The results show that, if the parameters are tuned appropriately, the Bessel window function can keep up with the B-spline window and is in many cases even the better choice with respect to computational costs.
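For context, the sketch below implements plain Ewald summation for a tiny charge-neutral periodic system, showing the real-space, reciprocal-space, and self-energy terms that the splitting parameter alpha balances. It is a reference-style illustration in Python, not the NFFT-based P2NFFT method of the paper, and the box, charges, and parameters are illustrative assumptions.

```python
# Reference-style sketch of plain Ewald summation (Gaussian units, tiny system).
import itertools
import numpy as np
from scipy.special import erfc

L = 10.0                                         # cubic box edge (assumed)
pos = np.array([[1.0, 1.0, 1.0], [3.5, 1.0, 1.0]])
q = np.array([1.0, -1.0])                        # charge-neutral pair
alpha, r_cut, k_max = 0.8, 5.0, 8                # splitting parameter and cutoffs

# Real-space part: short-range erfc-screened interactions, minimum image convention.
E_real = 0.0
for i, j in itertools.combinations(range(len(q)), 2):
    d = pos[i] - pos[j]
    d -= L * np.round(d / L)
    r = np.linalg.norm(d)
    if r < r_cut:
        E_real += q[i] * q[j] * erfc(alpha * r) / r

# Reciprocal-space part: smooth long-range contribution via structure factors.
E_recip = 0.0
V = L ** 3
for n in itertools.product(range(-k_max, k_max + 1), repeat=3):
    if n == (0, 0, 0):
        continue
    k = 2 * np.pi * np.array(n) / L
    k2 = k @ k
    S = np.sum(q * np.exp(1j * (pos @ k)))
    E_recip += (2 * np.pi / V) * np.exp(-k2 / (4 * alpha ** 2)) / k2 * abs(S) ** 2

E_self = -alpha / np.sqrt(np.pi) * np.sum(q ** 2)   # self-interaction correction
print("Ewald energy:", E_real + E_recip + E_self)
```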
|