About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.

Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Computational approaches to and comparisons of design methods for linear controllers

Boz, Ali Fuat January 1999 (has links)
No description available.
2

A Framework for Automated Generation of Specialized Function Variants

Chaimov, Nicholas January 2012 (has links)
Efficient large-scale scientific computing requires efficient code, yet optimizing code for efficiency makes it less readable, less maintainable, and less portable, and demands detailed knowledge of low-level computer architecture that the developers of scientific applications may lack. The necessary knowledge also changes over time as new architectures become more prominent, such as GPGPU platforms like CUDA, which require very different optimizations than CPU-targeted code. The rise of scientific cloud computing means that developers may not even know what machine their code will run on while they are developing it. This work takes steps towards automating the generation of code variants which are automatically optimized for both the execution environment and the input dataset. We demonstrate that augmenting an autotuning framework with a performance database that captures metadata about environment and input, and performing decision tree learning over that data, can help more fully automate the process of enhancing software performance.
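As a rough illustration of the decision-tree idea (the feature set and variant names below are hypothetical, not the thesis's actual schema), a tuner can learn from recorded runs which code variant to pick for a new environment and input:

```python
# Minimal sketch: train a decision tree on hypothetical performance-database
# records (environment and input metadata) to predict the fastest variant.
from sklearn.tree import DecisionTreeClassifier

# Hypothetical records: [num_cores, gpu_present (0/1), input_size],
# labelled with the variant that performed best for that configuration.
X = [
    [16, 0, 1_000],
    [16, 0, 1_000_000],
    [8,  1, 1_000],
    [8,  1, 1_000_000],
]
y = ["cpu_serial", "cpu_parallel", "cpu_serial", "cuda"]

model = DecisionTreeClassifier().fit(X, y)

# At run time, query the tree with the current environment and input.
print(model.predict([[8, 1, 500_000]])[0])  # e.g. "cuda"
```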
3

Autotuning wavefront patterns for heterogeneous architectures

Mohanty, Siddharth January 2015 (has links)
Manual tuning of applications for heterogeneous parallel systems is tedious and complex. Optimizations are often not portable, and the whole process must be repeated when moving to a new system, or sometimes even to a different problem size. Pattern-based parallel programming models were originally designed to provide programmers with an abstract layer, hiding tedious parallel boilerplate code and allowing a focus on application-specific issues only. However, the constrained algorithmic model associated with each pattern also enables the creation of pattern-specific optimization strategies. These can capture more complex variations than would be accessible by analysis of equivalent unstructured source code. Such variations create complex optimization spaces, and machine learning offers well-established techniques for exploring them. In this thesis we use machine learning to create autotuning strategies for heterogeneous parallel implementations of applications which follow the wavefront pattern. In a wavefront, computation starts from one corner of the problem grid and proceeds diagonally, like a wave, to the opposite corner in either two or three dimensions. Our framework partitions and optimizes the work created by these applications across systems comprising multicore CPUs and multiple GPU accelerators. The tuning opportunities for a wavefront include controlling the amount of computation to be offloaded onto GPU accelerators, choosing the number of CPU and GPU threads to process tasks, tiling for both CPU and GPU memory structures, and trading redundant halo computation against communication for multiple GPUs. Our exhaustive search of the problem space shows that these parameters are very sensitive to the combination of architecture, wavefront instance and problem size. We design and investigate a family of autotuning strategies, targeting single and multiple CPU + GPU systems, and both two- and three-dimensional wavefront instances. These yield an average of 87% of the performance found by offline exhaustive search, with up to 99% in some cases.
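To make the dependence structure concrete, here is a minimal sketch of a 2D wavefront sweep (the stencil and boundary values are assumed for illustration, not taken from the thesis): all cells on an anti-diagonal are mutually independent, which is what allows the work to be partitioned across CPU and GPU threads.

```python
# Minimal 2D wavefront sweep: each cell depends on its north and west
# neighbours, so cells on the same anti-diagonal can run in parallel.
import numpy as np

n = 6
grid = np.zeros((n, n))
grid[0, :] = 1.0  # hypothetical boundary values
grid[:, 0] = 1.0

for d in range(2, 2 * n - 1):                    # anti-diagonals, corner to corner
    for i in range(max(1, d - n + 1), min(d, n)):
        j = d - i                                # (i, j) lies on diagonal d
        grid[i, j] = 0.5 * (grid[i - 1, j] + grid[i, j - 1])
```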
4

Code optimization based on source to source transformations using profile guided metrics

Lebras, Youenn 03 July 2019 (has links)
Our goal is to develop a framework allowing the definition of source code transformations based on dynamic metrics. This framework is integrated into the MAQAO tool suite developed at UVSQ/ECR. We present a set of source-to-source transformations that can be guided by the end user and by the dynamic metrics coming from the various MAQAO analysis tools, making it possible to work at both the source and binary levels. The framework can also serve as a pre-processor that simplifies development by performing, cleanly and automatically, some simple but time-consuming and error-prone transformations (e.g. loop or function specialization).
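As a rough illustration of what such a pre-processing transformation does (a naive textual sketch, not MAQAO's implementation), function specialization freezes a runtime parameter into a constant so that a compiler can unroll or vectorize the loop:

```python
# Hypothetical source-to-source specialization: emit a variant of a generic
# kernel with the loop bound fixed to a compile-time constant.
GENERIC = """
def saxpy(a, x, y, n):
    for i in range(n):
        y[i] += a * x[i]
"""

def specialize(src: str, n: int) -> str:
    # Naive textual transformation: freeze the runtime bound `n`.
    return (src.replace("def saxpy(a, x, y, n):", f"def saxpy_{n}(a, x, y):")
               .replace("range(n)", f"range({n})"))

print(specialize(GENERIC, 4))  # emits saxpy_4 with a constant trip count
```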
5

Phase/amplitude estimation for tuning and monitoring

Gyongy, Istvan January 2008 (has links)
The benefits of good loop tuning in the process industries have long been recognized. Ensuring that controllers are kept well-configured despite changes in process dynamics can bring energy and material savings, improved product quality, and reduced downtime. A number of loop tuning packages therefore exist that can, on demand, check the state of a loop and adjust the controller as necessary. These methods generally apply some form of upset to the process to identify the current plant dynamics, against which the controller can then be evaluated. A simple approach to the automatic tuning of PI controllers injects variable-frequency sinewaves into the loop under normal plant operation. The method employs a phase-locked loop-based device called a phase/frequency estimator and uses 'design-point' rules, where the aim is for the Nyquist locus of the loop to pass through a particular point on the complex plane. The scheme offers a number of advantages: it can carry out both 'one shot' tuning and continuous adaptation, the latter even with the test signal set to a lower amplitude than that of the noise. A published article is included here that extends the approach to PID controllers, with simulation studies and real-life tests showing the method to work consistently well for a wide range of typical process dynamics, the closed loop having a response that compares well with that produced by standard tuning rules. The associated signal processing tools are tested by applying them to the transmitter of a Coriolis mass-flow meter. Schemes are devised for the tracking and control of the second mode of measurement-tube oscillation alongside the so-called 'driven mode', at which the tubes are usually vibrated, leading to useful information being made available for measurement correction purposes. Once a loop has been tuned, it is important to assess it periodically and to detect any performance losses resulting from events such as changes in process or disturbance dynamics and equipment malfunction such as faulty sensors and actuators. Motivated by the effective behaviour of the controller tuners, a loop monitor is developed here, also using probing sinewaves coupled with 'design-point' ideas. In this application the effect on the process must be minimal, so the device must work with still lower SNRs. It is thus practical to use a fixed-frequency probing signal, together with a different tool set for tracking it. An extensive mathematical framework is developed describing the statistical properties of the signal parameter estimates, and of the indices derived from these estimates to indicate the state of the loop. The result is specific practical guidelines for the application of the monitor (e.g. for the choice of test signal amplitude and test duration). Loop monitoring itself has traditionally been carried out by passive methods that calculate various performance indicators from routine operating data. Playing a central role amongst these metrics is the Harris Index (HI) and its variants, which compare the output variance to a 'minimum achievable' figure. A key advantage of the active monitor proposed here is that it is able not only to detect suboptimal control but also to suggest how the controller should be adjusted. Moreover, the monitor's index provides a strong indication of changes in damping factor.
Through simple adjustments to the algorithm (raising the amplitude of the test signal or adding high-frequency dither to the control signal), the method can be applied even in the presence of actuator non-linearity, allowing it to identify the cause of performance losses. This is confirmed by real-life trials on a non-linear flow rig.
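For illustration, one standard way to estimate the amplitude and phase of a known-frequency probing sinewave, even below the noise floor, is least-squares projection onto sine and cosine components (a generic sketch with assumed parameters, not the thesis's phase-locked-loop estimator):

```python
# Estimate amplitude and phase of a probing sinewave buried in noise by
# correlating with sin/cos at the known probe frequency; averaging over
# many samples recovers the parameters even at sub-noise amplitude.
import numpy as np

fs, f0, n = 1000.0, 5.0, 2000          # sample rate [Hz], probe frequency [Hz], samples
t = np.arange(n) / fs
rng = np.random.default_rng(0)
y = 0.2 * np.sin(2 * np.pi * f0 * t + 0.7) + rng.normal(0, 1.0, n)

c = (2 / n) * np.sum(y * np.cos(2 * np.pi * f0 * t))   # A*sin(phi)
s = (2 / n) * np.sum(y * np.sin(2 * np.pi * f0 * t))   # A*cos(phi)
amplitude = np.hypot(c, s)             # ~0.2 despite noise std of 1.0
phase = np.arctan2(c, s)               # ~0.7 rad
```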
6

MPI Performance Engineering with the MPI Tools Information Interface

Ramesh, Srinivasan 06 September 2018 (has links)
The desire for high performance on scalable parallel systems is increasing the complexity of MPI implementations and the need to tune them. The MPI Tools Information Interface (MPI_T), introduced in the MPI 3.0 standard, provides an opportunity for performance tools and external software to introspect and understand MPI runtime behavior at a deeper level to detect scalability issues. The interface also provides a mechanism to fine-tune the performance of the MPI library dynamically at runtime. This thesis describes the motivation, design, and challenges involved in developing an MPI performance engineering infrastructure using MPI_T for two performance toolkits: the TAU Performance System and Caliper. I validate the design of the infrastructure for TAU by developing optimizations for production and synthetic applications. I show that the MPI_T runtime introspection mechanism in Caliper enables a meaningful analysis of performance data. This thesis includes previously published co-authored material.
7

Automatic tuning of drive system control from MATLAB

Köhlström, Jonas January 2007 (has links)
This master's thesis covers the development of an automatic tuning process for the existing speed controller for drive systems. The drive systems are resonant two-mass systems in which a motor drives a load connected by a shaft. The developed method relies heavily on system identification and the construction of a complete mechanical model of the process. With this approach, the common problem of poor load speed control that derives from measuring only the motor speed can be addressed and solved for a majority of such processes. The automatic tuning method has, along with general test functions, been implemented in a complete tool for automatic tuning, testing, performance evaluation and reporting for drive systems.
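For context, a minimal model of such a resonant two-mass system (hypothetical parameters, not the thesis implementation) shows the shaft resonance that makes motor-side-only measurement problematic:

```python
# Two-mass drive: motor and load inertias coupled by a flexible shaft.
# Transfer function from motor torque to motor speed:
#   w_m(s)/T(s) = (Jl*s^2 + Ds*s + Ks) / (s*(Jm*Jl*s^2 + (Jm+Jl)*Ds*s + (Jm+Jl)*Ks))
import numpy as np
from scipy import signal

Jm, Jl = 0.01, 0.05      # motor / load inertia [kg m^2]
Ks, Ds = 100.0, 0.05     # shaft stiffness [Nm/rad] and damping [Nms/rad]

plant = signal.TransferFunction(
    [Jl, Ds, Ks],
    [Jm * Jl, (Jm + Jl) * Ds, (Jm + Jl) * Ks, 0.0],
)

w_res = np.sqrt(Ks * (Jm + Jl) / (Jm * Jl))   # resonance [rad/s]
w_ares = np.sqrt(Ks / Jl)                     # anti-resonance [rad/s]
print(f"resonance {w_res:.1f} rad/s, anti-resonance {w_ares:.1f} rad/s")
```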
8

Modernizing and Evaluating the Autotuning Framework of SkePU 3

Nsralla, Basel January 2022 (has links)
Autotuning is a method which enables a program to automatically choose the parameters that optimize it for a certain goal, e.g. speed or cost. In this work, autotuning is implemented in the context of the SkePU framework in order to choose the backend (CUDA, CPU, OpenCL, Hybrid) that optimizes a skeleton execution in terms of performance. SkePU is a framework that provides different algorithmic skeletons with implementations for the different backends (OpenCL, CUDA, OpenMP, CPU). Skeletons are parameterised with a user-provided per-element function which will run in parallel. This thesis shows how the autotuning of SkePU's automatic backend selection for skeleton calls is implemented with respect to all the different parameters that a SkePU skeleton can have. The autotuning is built upon a sampling technique: different combinations of sizes for the vector and matrix parameters are applied, and the resulting measurements are used to generate an execution plan, which serves as a lookup table when running the skeleton. For each sample, the execution plan records the backend estimated to perform best. The work finally evaluates the implementation by comparing the results of running different SkePU programs with the autotuning against running the same programs without it.
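As a rough sketch of the sampling idea (the names and structure below are assumptions, not SkePU's actual tuner), the plan maps sampled input sizes to the backend that timed best, and later skeleton calls look up the nearest entry:

```python
# Build an execution plan by timing each backend at sampled input sizes,
# then use it as a lookup table at skeleton-call time.
import bisect
import time

BACKENDS = ["CPU", "OpenMP", "OpenCL", "CUDA"]

def run_skeleton(backend: str, size: int) -> None:
    time.sleep(0.0)  # stand-in; a real tuner would invoke the skeleton here

def build_plan(sample_sizes):
    plan = []                                   # [(size, best_backend), ...]
    for size in sample_sizes:
        timings = {}
        for backend in BACKENDS:
            t0 = time.perf_counter()
            run_skeleton(backend, size)
            timings[backend] = time.perf_counter() - t0
        plan.append((size, min(timings, key=timings.get)))
    return plan

def choose_backend(plan, size):
    idx = bisect.bisect_left([s for s, _ in plan], size)
    return plan[min(idx, len(plan) - 1)][1]     # nearest sampled size upward

plan = build_plan([1_000, 100_000, 10_000_000])
print(choose_backend(plan, 500_000))
```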
9

Energy and execution time models for an efficient execution of scientific simulations

Lang, Jens 15 January 2015 (has links) (PDF)
Computer simulation, as part of scientific computing, has established itself as the third pillar of scientific methodology, besides theory and experiment. The task of computer science in the field of scientific computing is the development of efficient simulation algorithms as well as their efficient implementation. This thesis focuses on the efficient implementation of two important methods in scientific computing: the Fast Multipole Method (FMM) for particle simulations, and the Finite Element Method (FEM), which is used, e.g., for computing the deformation of solids. Efficiency here refers to the execution time of the simulations and to the energy consumption of the computing systems needed for their execution.
The method used for increasing the efficiency is model-based autotuning. In model-based autotuning, a model is set up for the substantial parts of the algorithm which describes their execution time or energy consumption. This model depends on properties of the computer system used, on the input data, and on various parameters of the algorithm. The properties of the computer system are determined by executing the actual code for different implementation variants. These variants comprise a CPU implementation and a graphics-processor implementation for the FEM, and implementations of the near-field and far-field interaction calculations for the FMM. Using the models, the execution costs for each variant are predicted. The optimal algorithm parameters can thus be determined analytically so as to minimise the desired target value, i.e. execution time or energy consumption. When the simulation is executed, the most efficient implementation variants are used according to the prediction. While for the FMM the performance measurements take place independently of the execution of the simulation, for the FEM a method for dynamically distributing the workload between the CPU and the GPU is presented, which reacts to execution times measured at runtime. By measuring the actual execution times, it is possible to respond to conditions that change during runtime and to adapt the distribution of the workload accordingly. The results of this thesis show that model-based autotuning makes it possible to increase the efficiency of scientific computing applications with respect to execution time and energy consumption. In particular, the consideration of the energy consumption of alternative execution paths, i.e. energy adaptivity, will be of great importance in scientific computing in the near future.
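As a rough sketch of model-based autotuning (with hypothetical cost coefficients, not the thesis's calibrated models), each implementation variant gets a cost model and the tuner picks the variant minimising the chosen objective. The example also illustrates energy adaptivity: the fastest variant need not be the most energy-efficient one.

```python
# Predict time and energy per variant from simple cost models, then select
# the variant that minimises the chosen objective.

MODELS = {
    # variant: (time per element [s], power draw [W]); hypothetical values,
    # calibrated on a real machine by timing the actual code.
    "cpu": (2.0e-8, 90.0),
    "gpu": (1.0e-8, 220.0),
}

def predict(variant: str, n: int, objective: str = "time") -> float:
    t_per_elem, power = MODELS[variant]
    t = t_per_elem * n
    return t if objective == "time" else t * power   # energy = time * power

def best_variant(n: int, objective: str = "time") -> str:
    return min(MODELS, key=lambda v: predict(v, n, objective))

n = 10_000_000
print(best_variant(n, "time"))    # gpu: 0.1 s vs 0.2 s on the cpu
print(best_variant(n, "energy"))  # cpu: 18 J vs 22 J on the gpu
```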
