Parallel paradigms in optimal structural design
Van Huyssteen, Salomon Stephanus
Thesis (MScEng)--Stellenbosch University, 2011.

ENGLISH ABSTRACT: Modern-day processors are not getting any faster. Due to the power consumption limit of frequency
scaling, parallel processing is increasingly being used to decrease computation time. In
this thesis, several parallel paradigms are used to improve the performance of typically serial
sequential approximate optimization (SAO) programs. Four novelties are discussed:
First, double precision solvers are replaced with single precision solvers, in an attempt to
exploit the anticipated factor-2 speed increase that single precision computations have over
double precision computations. However, single precision routines exhibit unpredictable
performance characteristics and struggle to converge to the required accuracies, which is
unfavourable for optimization solvers.
Second, QP and dual statements are pitted against one another in a parallel environment,
because it is not always easy to see a priori which will perform best. Both are therefore
started in parallel, and the competing threads are cancelled as soon as one returns a valid
point. Parallel QP vs. dual statements prove very attractive, converging within the minimum
number of outer iterations, since the most appropriate solver is selected as the problem
properties change during the iteration steps. Thread cancellation poses problems, however:
threads must wait to arrive at appropriate checkpoints, so the winning routine suffers
unnecessarily long wait times because of struggling competing routines.
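The racing scheme can be sketched with cooperative cancellation (the solver bodies are placeholders, not the thesis's QP or dual code): each competitor polls a shared stop flag at its checkpoints, and the first valid result cancels the rest.

```python
import threading
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

def make_solver(name, n_iters, step_time):
    """Hypothetical solver: `step_time` stands in for one subproblem iteration."""
    def solve(stop):
        for _ in range(n_iters):
            if stop.is_set():           # checkpoint: obey a cancellation request
                return None
            time.sleep(step_time)       # placeholder for real work
        return (name, 1.0)              # pretend this is a valid point
    return solve

def race(*solvers):
    """Start competing solvers in parallel; the first valid point wins."""
    stop = threading.Event()
    with ThreadPoolExecutor(max_workers=len(solvers)) as pool:
        pending = {pool.submit(s, stop) for s in solvers}
        while pending:
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for fut in done:
                result = fut.result()
                if result is not None:
                    stop.set()          # losers exit at their next checkpoint
                    return result
    return None

# The "dual" competitor finishes its iterations first, so it wins the race
# and the slower "qp" competitor is cancelled at a checkpoint.
winner = race(make_solver("qp", 200, 0.01), make_solver("dual", 5, 0.01))
print(winner[0])
```

The downside observed in the thesis appears here too: cancellation is only honoured at checkpoints, so a struggling loser can keep a core busy until it reaches one.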
Third, multiple global searches are started in parallel on a shared-memory system. All
problems see a speed increase of nearly 4x. Dynamically scheduled threads remove the need
for fixed thread counts, as required in message passing implementations.
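The multistart pattern with dynamic scheduling can be sketched as below (the stochastic hill climb and the Rastrigin objective are stand-ins for the thesis's global searches): a pool hands restarts to whichever worker is free, so no per-thread workload has to be fixed in advance. For CPU-bound pure-Python work one would use processes or a GIL-releasing objective to see the real speedup.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def rastrigin(x):
    """Classic multimodal test function; global minimum f(0) = 0."""
    return 10 * x.size + float(np.sum(x**2 - 10 * np.cos(2 * np.pi * x)))

def local_search(seed, dim=2, iters=1000, step=0.1):
    """Crude stochastic hill climb from one random start (a stand-in for the
    local optimizer inside a single global-search instance)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5.12, 5.12, dim)
    fx = rastrigin(x)
    for _ in range(iters):
        cand = x + rng.normal(0.0, step, dim)
        fc = rastrigin(cand)
        if fc < fx:
            x, fx = cand, fc
    return fx, x

# Dynamically scheduled multistart: the pool assigns the 32 restarts to the
# 4 workers as they become free, instead of fixing 8 restarts per thread
# up front as a static message-passing decomposition would.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(local_search, range(32)))

best_f, best_x = min(results, key=lambda r: r[0])
print(best_f)   # the best restart lands in a low-lying basin
```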
Lastly, the replacement of existing matrix-vector multiplication routines with optimized BLAS
routines, especially BLAS routines targeted at GPGPU technologies (general-purpose computation
on graphics processing units), proves superior when solving large matrix-vector products in an
iterative environment. These problems scale well within the hardware capabilities, and speedups
of up to 36x are recorded.

AFRIKAANSE OPSOMMING (translated from Afrikaans): Modern-day processors are not getting any
faster because of the power consumption limit of frequency scaling. Parallel processing is
therefore used more often to reduce computation time. Several parallel paradigms are used to
improve the performance of typically sequential optimization programs. Four developments are
discussed:

First is the replacement of double precision routines with single precision routines,
attempting to exploit the factor-2 speed improvement that single precision computations have
over double precision computations. Single precision routines are unpredictable and in most
cases struggle to reach the required accuracy.

Second, QP and dual algorithms are used against each other in a parallel environment. Because
it is not always easy to see beforehand which one will perform best, all are started in
parallel and the competitors are cancelled as soon as one returns with a valid KKT point.
Parallel QP versus dual algorithms prove very attractive; in all cases convergence occurs
within the minimum number of iterations. The most appropriate algorithm is used at every
iteration as the problem properties change during the iteration steps. Thread cancellation
poses problems, caused by threads that must wait to reach checkpoints; the best routines
therefore suffer unnecessarily because of struggling competing routines.

Third, multiple global optimizations are started in parallel on a shared-memory system.
Problems obtain a speed increase of nearly four times in all cases. Dynamically scheduled
threads remove the need for predetermined thread counts, as used in message passing
implementations.

Lastly, the existing matrix-vector multiplication routines are replaced with optimized BLAS
routines, especially BLAS routines targeted at GPGPU technologies. The GPU routines prove
superior when solving large matrix-vector products in an iterative environment. These problems
also scale well within the hardware's capabilities; for the largest problems tested, a speedup
of 36 times is achieved.
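The last point reduces to a one-line swap in practice. A minimal sketch (illustrative sizes, not the thesis's problems): a textbook two-loop matrix-vector product versus NumPy's `@` operator, which dispatches to an optimized BLAS `gemv`; a GPU array library such as CuPy exposes the same operator, so the identical line can run on a GPGPU.

```python
import numpy as np

def naive_matvec(A, x):
    """Textbook O(n*m) loops; what hand-rolled multiply routines look like."""
    n, m = A.shape
    y = np.zeros(n, dtype=A.dtype)
    for i in range(n):
        s = 0.0
        for j in range(m):
            s += A[i, j] * x[j]
        y[i] = s
    return y

rng = np.random.default_rng(1)
A = rng.standard_normal((300, 300))
x = rng.standard_normal(300)

y_naive = naive_matvec(A, x)
y_blas = A @ x                        # dispatches to the BLAS gemv routine

# Same answer up to round-off; the BLAS call is dramatically faster, and in
# an iterative solver this product is exactly the per-iteration bottleneck.
print(np.allclose(y_naive, y_blas))   # True
```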
Data Driven Surrogate Based Optimization in the Problem Solving Environment WBCSim
Deshpande, Shubhangi (14 December 2009)
Large scale, multidisciplinary engineering designs are always difficult due to the complexity and dimensionality of these problems. Direct coupling between the analysis codes and the optimization routines can be prohibitively time consuming. One way of tackling this problem is by constructing computationally cheap(er) approximations of the expensive simulations that mimic the behavior of the simulation model as closely as possible. This thesis presents a data driven, surrogate based optimization algorithm that uses a trust region based sequential approximate optimization (SAO) framework and a statistical sampling approach based on design of experiment (DOE) arrays. The algorithm is implemented using techniques from the two packages SURFPACK and SHEPPACK, which provide a collection of approximation algorithms for building the surrogates; the surrogates are trained with three different DOE techniques: full factorial (FF), Latin hypercube sampling (LHS), and central composite design (CCD). The biggest concern in using the proposed methodology is the generation of the required database. This thesis proposes a data driven approach in which an expensive simulation run is required if and only if a nearby data point does not exist in the cumulatively growing database. Over time the database matures and is enriched as more and more optimizations are performed. Results show that the response surface approximations constructed using design of experiments can be effectively managed by an SAO framework based on a trust region strategy. An interesting result is the significant reduction in the number of simulations for subsequent runs of the optimization algorithm with a cumulatively growing simulation database.

Master of Science
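The data driven rule (simulate only when no nearby stored point exists) can be sketched as follows; the class name, the linear-scan lookup, and the toy simulate function are illustrative assumptions, not the WBCSim implementation.

```python
import numpy as np

class SimulationDatabase:
    """Cumulative store of (design point, response) pairs. An expensive
    simulation runs only when no stored point lies within `tol` of the
    query; otherwise the nearby stored response is reused."""

    def __init__(self, simulate, tol=1e-2):
        self.simulate = simulate        # the expensive analysis code
        self.tol = tol
        self.points = []                # grows across optimization runs
        self.values = []
        self.n_sim_calls = 0

    def evaluate(self, x):
        x = np.asarray(x, dtype=float)
        if self.points:
            d = np.linalg.norm(np.array(self.points) - x, axis=1)
            i = int(np.argmin(d))
            if d[i] <= self.tol:
                return self.values[i]   # reuse: no new simulation needed
        y = self.simulate(x)            # cache miss: pay the full cost
        self.n_sim_calls += 1
        self.points.append(x)
        self.values.append(y)
        return y

# Toy stand-in for the expensive simulation.
db = SimulationDatabase(simulate=lambda x: float(np.sum(x**2)), tol=0.05)

db.evaluate([0.50, 0.50])    # first query: simulation runs
db.evaluate([0.51, 0.50])    # within tol of the first point: reused
db.evaluate([2.00, 2.00])    # far away: simulation runs again
print(db.n_sim_calls)        # 2
```

Across repeated optimization runs the database fills in, so later runs hit the cache more often, which matches the reported drop in simulation counts for subsequent runs.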