Global ETD Search

111	Scalable Computation of Long-Range Potentials for Molecular Dynamics / Skalerbar beräkning av potentialer med lång räckvidd i molekulärdynamiska simulationer
Rachinger, Christoph. January 2013.
To calculate long-range potentials in a molecular dynamics simulation, a naive approach using direct particle interactions requires computational work of order O(N²). This is infeasible for larger simulations. To reduce this complexity and thus allow larger simulations, several algorithms have been proposed in recent decades. This thesis first gives an overview of these algorithms and examines their advantages and disadvantages with respect to high-performance computing, i.e., how well they scale on a many-processor system. Two algorithms that seem well suited for this task, the Multilevel Summation Method and the Meshed Continuum Method, both of which are based on a hierarchy of multiple grids, are implemented and optimized for a massively parallel environment. The mathematical foundation as well as the implementation steps to improve the performance and scalability of the algorithms are explained in detail. Finally, the algorithms were tested with up to 8192 processors at PDC. The results of these runs are presented together with an explanation of possible performance bottlenecks and a final comparison of both algorithms.
Ett naivt sätt att beräkna potentialer med lång räckvidd i molekylärdynamiska simulationer vore att använda direkta partikelinteraktioner, som behöver ett beräkningsarbete av ordo O(N²). Det är inte genomförbart i större simulationer. Flera algoritmer föreslogs under de förra decennierna för att reducera komplexiteten och tillåta större beräkningar. I rapporten ges en översyn över dessa algoritmer och utreds för- och nackdelar med hänsyn till högprestandaberäkningar, d.v.s. skalerbarhet på system med flerkärnprocessorer. Två algoritmer, den s.k. Multilevel Summation Method och den s.k. Meshed Continuum Method, tycks passa bra. Båda metoderna baseras på en hierarki av flera rutnät. Båda kan implementeras och optimeras för massivt parallella system. De matematiska fundamenten och implementeringsstegen för att förbättra algoritmernas prestanda och skalerbarhet förklaras detaljerat. Algoritmerna testades med upp till 8192 processorer vid PDC. Resultaten av dessa tester, förklaringar av möjliga orsaker för prestandaproblem samt en slutlig jämförelse av algoritmerna presenteras.
Computational Mathematics Beräkningsmatematik
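To make the O(N²) cost of the naive approach in the abstract above concrete, here is a minimal sketch of a direct pairwise sum. It is not taken from the thesis: the function name `direct_potentials`, the pure Coulomb 1/r kernel, and units in which the Coulomb constant equals 1 are all assumptions made for illustration.

```python
import math


def direct_potentials(positions, charges):
    """Return the potential at each particle caused by all other particles.

    Hypothetical helper: every pair (i, j) is visited once, so the work
    grows as O(N^2). Units are chosen so the Coulomb constant is 1.
    """
    n = len(positions)
    phi = [0.0] * n
    for i in range(n):
        for j in range(i + 1, n):
            r = math.dist(positions[i], positions[j])
            # Each pair contributes to both particles' potentials.
            phi[i] += charges[j] / r
            phi[j] += charges[i] / r
    return phi


positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
charges = [1.0, -1.0, 2.0]
print(direct_potentials(positions, charges))
```

Doubling the number of particles roughly quadruples the number of pair evaluations, which is the scaling that grid-based methods such as those named in the abstract aim to avoid.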
112	High performance adaptive finite element modeling of complex CAD geometry / Adaptiv finita-element-modellering av complex CAD-geometri
Strunk, Stefanie. January 2013.
CAD (Computer Aided Design) and finite element analysis are of fundamental importance for numerical simulations. The general approach is to design a model using CAD software, create a mesh of a domain that includes this model, and use finite element analysis to perform simulations on that mesh. With more advanced simulation techniques, such as adaptive finite element methods, it is increasingly desirable to use CAD information not only for the creation of the initial mesh but also during the simulation. This thesis presents an approach for using CAD data during adaptive mesh refinement in a finite element simulation. An error indicator is presented to find the elements in a mesh that need to be improved for a better geometric approximation, and it is shown how to integrate the different approaches into an existing high-performance finite element solver.
CAD (Computer Aided Design) och finita-element-analys är grundläggande för numerisk simulering. Man konstruerar en modell med CAD-program, skapar ett beräkningsnät på en domän som innehåller modellen, och använder finita-elementanalys för beräkningar på nätet. I mer avancerade simuleringar, som för adaptiva finita-element-metoder, är det önskvärt att använda CAD-information inte bara för att skapa det första nätet utan under nätförfiningarna i adaptionen under simuleringen. I detta arbete presenteras ett sätt att använda CAD-data för adaptiv nätförfining i en finita-element-simulering. En fel-indikator ges för att hitta de element som ska förfinas för att förbättra geometrisk approximation och vi beskriver hur de olika angreppssätten kan integreras i ett finita-element-programpaket för högpresterande datorer.
Computational Mathematics Beräkningsmatematik
113	Large-scale time parallelization for molecular dynamics problems / Storskalig tidsparallellisering för molekyldynamik
Bulin, Johannes. January 2013.
As modern supercomputers draw their power from the sheer number of cores, an efficient parallelization of programs is crucial for achieving good performance. When one tries to solve differential equations in parallel, this is usually done by parallelizing the computation of one single time step. As the speedup of such parallelization schemes is usually limited, e.g. by the spatial size of the problem, additional parallelization in time may be useful to achieve better scalability. This thesis introduces two well-known schemes for time parallelization, namely the waveform relaxation method and the parareal algorithm. These methods are then applied to a molecular dynamics problem, which is a useful test example because the number of required time steps is high while the number of unknowns is relatively low. Afterwards it is investigated how these methods can be adapted to large-scale computations.
Moderna superdatorer använder ett stort antal processorer för att uppnå hög prestanda. Därför är det nödvändigt att parallellisera sina program på ett effektivt sätt. När man löser differentialekvationer så brukar man parallellisera beräkningen av en enda tidspunkt. Speedupen av sådana program är ofta begränsad, till exempel av problemets storlek. Genom att använda ytterligare parallellisering i tid kan man uppnå bättre skalbarhet. Denna avhandling presenterar två välkända algoritmer för tidsparallellisering: waveform relaxation och parareal. Dessa metoder används för att lösa ett molekyldynamikproblem där tidsdomänen är stor jämförd med antalet obekanta. Slutligen undersöks några förbättringar för att möjliggöra storskaliga beräkningar.
Computational Mathematics Beräkningsmatematik
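The parareal algorithm named in the abstract above can be illustrated with a minimal sketch on a scalar test equation. This is not the thesis code: the explicit Euler propagators, the function names `euler` and `parareal`, and the test problem u' = -u are assumptions chosen for brevity. The update used is the standard parareal correction, U[n+1] = G(new U[n]) + F(old U[n]) - G(old U[n]), where G is a cheap coarse propagator and F an accurate fine one.

```python
def euler(f, u, t0, t1, steps):
    """Advance u from t0 to t1 with `steps` explicit Euler steps."""
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        u = u + h * f(t, u)
        t += h
    return u


def parareal(f, u0, t_end, slices, iterations, fine_steps=100):
    """Return the approximate solution at the boundaries of all time slices."""
    ts = [t_end * n / slices for n in range(slices + 1)]

    def coarse(u, n):  # G: one Euler step per slice (cheap, inaccurate)
        return euler(f, u, ts[n], ts[n + 1], 1)

    def fine(u, n):  # F: many Euler steps per slice (expensive, accurate)
        return euler(f, u, ts[n], ts[n + 1], fine_steps)

    # Initial guess from a serial coarse sweep.
    u = [u0]
    for n in range(slices):
        u.append(coarse(u[n], n))

    for _ in range(iterations):
        # The fine solves are independent of each other; in a real
        # implementation they would run in parallel, one slice per worker.
        f_old = [fine(u[n], n) for n in range(slices)]
        g_old = [coarse(u[n], n) for n in range(slices)]
        new = [u0]
        for n in range(slices):
            # Serial coarse sweep plus the fine-minus-coarse correction.
            new.append(coarse(new[n], n) + f_old[n] - g_old[n])
        u = new
    return u


solution = parareal(lambda t, u: -u, 1.0, 2.0, slices=4, iterations=4)
print(solution[-1])
```

After as many iterations as there are time slices, the result matches the serial fine solution up to rounding, so a speedup is only possible when far fewer iterations suffice.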
114	Simulering av luftflöden genom ventilationsdon / Simulation of Air Flow Through Ventilation Ducts
Dalsryd, Erik. January 2013.
I rapporten undersöks i vad mån det går att simulera luftflödet genom ett ventilationsdon och därigenom bestämma en k-faktor, vilken anger förhållandet mellan luftflödet och kvadratroten ur tryckfallet över en strypning. En tvådimensionell axelsymmetrisk modell har använts för att simulera ett irisspjäll utan böjar på luftkanalen. En tredimensionell modell har använts för att simulera en 180°-böj. Som jämförelser till simuleringarna har praktiska uppmätningar använts. Även jämförelser med andra simuleringar har gjorts. Slutsatsen är att en simulering av den här 180°-böjen ger för stora fel (lokalt omkring 20 %, men oftast mindre än 10 %) för att vara en användbar metod i samband med ”obligatorisk ventilationskontroll” (OVK). Simuleringen med irisspjäll och raka kanaler ger ett maximalt fel på mellan 3 % och 10 %. Efter ytterligare studier är det tänkbart att metoden kan bli praktiskt användbar.
In this report I study the airflow through ventilation ducts. By numerical simulation, the so-called k-factor has been estimated. The k-factor is the quotient of the airflow volume and the square root of the pressure drop over the duct. A two-dimensional axisymmetric model has been used to simulate an iris damper connected to a straight pipe. A three-dimensional model has been used to simulate a pipe with a 180° bend. The simulations have been compared to practical measurements and also to other simulations. The conclusion is that a simulation of this 180° bend gives errors that are too large (locally around 20 %, but usually less than 10 %) to be useful in the context of "obligatorisk ventilationskontroll" (OVK, the mandatory ventilation inspection). The simulation of the iris damper with a straight pipe gives a maximum error between 3 % and 10 %. After further investigation, the method may become useful in practice.
Computational Mathematics Beräkningsmatematik
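The k-factor definition in the abstract above (airflow divided by the square root of the pressure drop) can be written as a short sketch. The function names and the choice of units (flow in l/s, pressure drop in Pa) are assumptions for illustration and are not taken from the report.

```python
import math


def k_factor(flow, pressure_drop):
    """k = q / sqrt(dp); assumed units: q in l/s, dp in Pa."""
    return flow / math.sqrt(pressure_drop)


def flow_from_pressure(k, pressure_drop):
    """Invert the definition: q = k * sqrt(dp)."""
    return k * math.sqrt(pressure_drop)


# Illustrative numbers only, not measurements from the report.
k = k_factor(30.0, 25.0)
print(k, flow_from_pressure(k, 100.0))
```

Once k is known for a device, a single pressure-drop reading gives the flow, which is why the accuracy of a simulated k-factor matters for inspection work.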
115	CFD Simulation of Jet Cooling and Implementation of Flow Solvers in GPU
Hosain, Md. Lokman. January 2013.
In rolling of steel into thin sheets, the final step is the cooling of the finished product on the runout table. In this thesis, the heat transfer into a water jet impinging on a hot flat steel plate was studied as the key cooling process on the runout table. The temperature of the plate was kept under the boiling point. Heat transfer due to a single axisymmetric jet with different water flow rates was compared to cases of a single jet and two jets in 3D. The RANS model in ANSYS Fluent was used with the k-ε model in transient simulation of the axisymmetric model and steady flow for the 3D cases. Two different boundary conditions, constant temperature and constant heat flux, were applied at the surface of the steel plate. The numerical results were consistent between 2D and 3D and compared well to literature data. The time-dependent simulation for the 3D model requires very large computational power, which motivated an investigation of simpler flow solvers running on a GPU platform. A simple 2D Navier-Stokes solver based on the Finite Volume Method, which can simulate flow and heat convection, was written using OpenCL. A standard CFD problem named "Lid Driven Cavity" in 2D was chosen as the validation case and for performance measurement and tuning of the solver.
När stål valsas till plåt är det sista steget att kyla den färdiga produkten på utrullningsbordet (ROT). I detta arbete studeras värmetransporten i en vattenstråle som faller in mot en varm plan platta som är den viktigaste kylprocessen på utrullningsbordet. Plattans temperatur hölls under kokpunkten. Värmeövergång i en ensam rotationssymmetrisk stråle med olika hastighet jämförs med en och två strålar i 3D-modeller. RANS-modellering i ANSYS Fluent med k-ε-turbulensmodell används för transientberäkning för rotationssymmetri och för stationär beräkning för 3D-fallen. Två olika randvillkor, konstant temperatur och konstant värmeflöde, används vid plattan. De numeriska resultaten är konsistenta mellan rotationssymmetri och 3D och jämförbara med litteratur-data. Transient simulering av 3D-modellerna kräver stora datorresurser vilket motiverar en undersökning om enklare strömningsmodeller som kan köra på GPU-plattform. En enkel 2D Navier-Stokes-lösare baserad på Finita Volym-metoden implementerades i OpenCL för simulering av konvektiv värmetransport. Lid Driven Cavity-problemet i 2D valdes för verifiering och tidtagning.
Computational Mathematics Beräkningsmatematik
116	Analysis and implementation of an efficient solver for large-scale simulations of neuronal systems
The, Matthew. January 2013.
Numerical integration methods exploiting the characteristics of neuronal equation systems were investigated. The main observations were a high stiffness and a quasi-linearity of the system. The latter allowed for decomposition into two smaller systems by using a block diagonal Jacobian approximation. The popular backward differentiation formula (BDF) methods showed performance degradation with this approximation during first experiments. Linearly implicit peer methods (PeerLI), a new class of methods, did not show this degradation. Parameters for PeerLI were optimized by experimental means and the methods were then compared in performance to BDF. Models were simulated in both Matlab and NEURON, a neuron modelling package. For small models PeerLI was competitive with BDF, especially with a block diagonal Jacobian. In NEURON, the block diagonal Jacobian no longer degraded performance for BDF, but instead degraded it for PeerLI, especially for large models. With a full Jacobian PeerLI was competitive with BDF, but with a block diagonal Jacobian an increase of ca. 50% was seen in simulation time. Overall, PeerLI methods were competitive for certain problems, but did not give the desired performance gain with a block diagonal Jacobian for large problems. There is, however, still a lot of room for improvement, since parameters were only determined experimentally and tuned to small problems.
Undersökningen gäller numeriska integrationsmetoder som utnyttjar egenskaper hos de ekvationer som beskriver neuronsystem, huvudsakligen utpräglad styvhet och kvasi-linjäritet. Den senare tillåter uppdelning i två mindre system med block-diagonal Jacobian-approximation. De populära bakåtderiveringsmetoderna (BDF) påverkades negativt av detta i de inledande experimenten. Linjärt implicita peer-metoder (PeerLI), en ny metodklass, påverkades inte. Parametrarna i PeerLI optimerades experimentellt och metoderna jämfördes sedan med BDF. Modeller simulerades både i Matlab och neuron-modelleringsprogrammet NEURON. För små system var BDF och PeerLI likvärdiga, särskilt med block-diagonal Jacobian. I NEURON försämrades inte BDF av block-diagonal Jacobian, utan i stället PeerLI, särskilt för större modeller. Med full Jacobian var PeerLI och BDF lika bra, men med block-diagonal Jacobian ökade tiden med 50%. Översiktligt var PeerLI likvärdig för vissa problem men gav inte önskvärd uppsnabbning för block-diagonal Jacobian för stora system. Men förbättringsmöjligheterna är många eftersom parameterinställningen gjordes experimentellt för små modeller.
Computational Mathematics Beräkningsmatematik
117	Computational methods to estimate error rates for peptide identifications in mass spectrometry-based proteomics / Beräkningsmetoder för att uppskatta felfrekvensen hos peptididentifikationer inom masspektrometri-baserad proteomik
Liang, Xiao. January 2013.
In the field of proteomics, tandem mass spectrometry is the core technology which promises to identify peptide components within complex mixtures on a large scale. Currently the bottleneck is to reduce the error rates and assign accurate statistical estimates to peptide identifications. In this work, we introduce techniques for identifying chimeric spectra, where two or more precursor ions with similar mass and retention time are co-fragmented and sequenced by the MS/MS instrument. Based on this, we try to analyze the factor which leads to the high error rate of identifications. We show that chimeric spectra have high correlations with the ranking scores and can reduce the number of positive identifications. Additionally, we address the problem of assigning a posterior error probability (PEP) to the individual peptide-spectrum matches (PSMs) that are obtained via search engines. This problem is computationally more difficult than estimating the error rate associated with a large collection of PSMs, such as the false discovery rate (FDR). Existing methods rely on parametric or semiparametric models of the underlying score distribution as a prior assumption. We provide a so-called kernel logistic regression procedure without any explicit assumptions about the score distribution. Based on an appropriate positive definite Gaussian kernel, the resulting PEP estimate is shown to be robust by achieving a close correspondence between the PEP-derived q-values and FDR-derived q-values. Furthermore, we accept at least 200 more significant PSMs when setting a threshold based on PEP-derived q-values compared to FDR-derived q-values. Finally, we show that this kernel logistic regression method is well established in the statistics literature and that it can produce accurate PEP estimates for different types of PSM score functions and data.
Tandemmasspektrometri (MS/MS) är kärnan i proteomikstudier som försöker att identifiera peptider inom komplexa proteinlösningar i stor skala. För närvarande är flaskhalsen att minska felprocenten av peptididentifikationerna, samt att tilldela noggranna statistiska skattningar av dessa. I detta arbete presenterar vi metoder för att identifiera chimära spektra, där två eller flera produktjoner med liknande massa och retentionstid är samfragmenterade och sekvenserade i ett MS/MS-instrument. Hypotesen är att dessa sam-fragmenterade joner är en anledning till den höga felfrekvensen hos peptididentifikationer. Vi visar att chimära spektra korrelerar med identifikationskvalitén och kan minska antalet positiva identifikationer. Dessutom undersöker vi problemet med att tilldela en posteriori felsannolikhet (posterior error probability, PEP) till individuella peptid-spektrum-matcher (PSM) som erhålls genom sökmotorer. Detta problem är beräkningsmässigt svårare än att uppskatta felfrekvensen med en stor samling av PSM, såsom false discovery rate (FDR). Befintliga metoder förlitar sig på parametriska eller delvis-parametriska modeller av den underliggande fördelningen av poäng till identifikationer. Vi tillhandahåller en kernel-logistisk regressionsmodell utan några explicita antaganden av fördelningen. Baserat på en lämpligt positiv definit Gausskärna, har den resulterande PEP-uppskattningen visat sig vara robust genom att uppnå ett nära samband mellan PEP-härledda q-värden och FDR-härledda q-värden. Slutligen visar vi att denna icke-parametriska kernel-logistiska regressionsmetod är väl etablerad i den statistiska litteraturen och kan producera noggranna PEP-uppskattningar för olika typer av PSM-värderingar.
Computational Mathematics Beräkningsmatematik
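The abstract above refers to PEP-derived q-values without saying how they are obtained. One common convention, assumed here and not confirmed as the thesis's procedure, is to sort the PSMs by increasing PEP and take the running mean of the PEPs as the q-value at each rank. The function name `q_values_from_peps` is hypothetical.

```python
def q_values_from_peps(peps):
    """Return one q-value per PSM, in the original input order.

    Assumed convention: the q-value of a PSM is the mean PEP of all PSMs
    with an equal or smaller PEP (the expected fraction of errors among
    the PSMs accepted at that threshold).
    """
    order = sorted(range(len(peps)), key=lambda i: peps[i])
    q = [0.0] * len(peps)
    running = 0.0
    for rank, i in enumerate(order, start=1):
        running += peps[i]
        q[i] = running / rank
    return q


print(q_values_from_peps([0.5, 0.0, 0.25]))  # -> [0.25, 0.0, 0.125]
```

Because the PEPs are sorted first, the running mean never decreases, so accepting every PSM below a q-value threshold is well defined.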
118	A comparison between finite difference and binomial methods for solving American single-stock options
Eriksson, Alexander. January 2013.
In this thesis, we compare four different finite-difference solvers with a binomial solver for pricing American options, with a special emphasis on achievable accuracy under computational time constraints. The four finite-difference solvers are: an operator splitting method suggested by S. Ikonen and J. Toivanen, a boundary projection method suggested by M. Brennan and E. Schwartz, projected successive overrelaxation, and a second-order accurate operator splitting method known as Peaceman-Rachford. The binomial method is a modified variant employing an analytical final step as suggested by M. Broadie and J. Detemple. The model problem is an American put option, and we empirically examine the effects of the relevant numerical parameters on the quality of the solutions. For the finite-difference methods we utilize both a Crank-Nicolson discretization and a fully implicit second-order-in-time discretization. We conclude that the operator splitting method suggested by S. Ikonen and J. Toivanen is the Alternating Direction Implicit algorithm known as the Douglas-Rachford algorithm. We also conclude that the accuracy of the Peaceman-Rachford algorithm degrades to first order for the American option problem. Of the finite-difference methods tried, the Douglas-Rachford algorithm has the highest performance in terms of accuracy under computational time constraints. We conclude that it does not, however, outperform the modified binomial model.
Computational Mathematics Beräkningsmatematik
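For readers unfamiliar with binomial pricing of the model problem above, here is a minimal sketch of an American put on a plain Cox-Ross-Rubinstein tree. This is the textbook method, not the modified variant with an analytical final step that the thesis uses; the function name, parameter names, and example values are assumptions.

```python
import math


def american_put_binomial(s0, strike, rate, sigma, maturity, steps):
    """Price an American put on a standard CRR binomial tree."""
    dt = maturity / steps
    up = math.exp(sigma * math.sqrt(dt))
    down = 1.0 / up
    p = (math.exp(rate * dt) - down) / (up - down)  # risk-neutral up-probability
    disc = math.exp(-rate * dt)

    # Payoffs at maturity; node j has had j up-moves.
    values = [max(strike - s0 * up**j * down**(steps - j), 0.0)
              for j in range(steps + 1)]

    # Roll back through the tree, allowing early exercise at every node.
    for n in range(steps - 1, -1, -1):
        for j in range(n + 1):
            continuation = disc * (p * values[j + 1] + (1.0 - p) * values[j])
            exercise = strike - s0 * up**j * down**(n - j)
            values[j] = max(continuation, exercise)
    return values[0]


print(american_put_binomial(100.0, 100.0, 0.05, 0.2, 1.0, 200))
```

The `max(continuation, exercise)` at each node is what separates the American from the European option; the finite-difference solvers compared in the thesis enforce the same constraint on a grid.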
119	Numerical and experimental investigation of the effect of geometry modification on the aerodynamic characteristics of a NACA 64(2)-415 wing / Numerisk och experimentell undersökning av effekten av geometrimodifikationer på NACA-profil på dess aerodynamiska egenskaper
Ramesh, Pradeep. January 2013.
The objective of the thesis is to study the effect of geometry modifications on the aerodynamic characteristics of a standard airfoil (NACA series). The airfoil was chosen for a high aspect ratio and Reynolds numbers in the range 10⁶ to 10⁷ (realistic conditions for flight and naval applications). Experimental and numerical investigations were carried out in collaboration with KTH – CTL and Schlumberger. The experimental investigations were conducted at NTNU and funded by Schlumberger. The numerical investigation was carried out with the massively parallel unified continuum adaptive finite element method solver "Unicorn" and the computing resources at KTH – CTL. The numerical results are validated against the experiments and against experimental results in the literature, and possible discrepancies are analyzed and discussed based on the numerical method. In addition, this will help us to expand our horizon and get acquainted with the numerical methods and the computational framework. The further scope of this thesis is to develop and implement new modules for the Unicorn solver suitable for aerodynamic applications.
I arbetet studeras effekten av geometri-modifikationer på aerodynamiska egenskaper hos en standard-vingprofil ur NACA-serien. Profilen valdes för en slank vinge och Reynoldstal mellan en och tio miljoner vilket kan vara realistiskt för flygplan och marina tillämpningar. Experiment och numeriska beräkningar utförs i samarbete mellan KTH/CTL och Schlumberger. Experimenten utfördes på NTNU med stöd av Schlumberger. Beräkningarna gjordes med finita-element-paketet "Unicorn" på KTH/CTL:s datorer. Nya Unicorn-moduler för aerodynamiska beräkningar utvecklas vilket ger erfarenhet av de numeriska metoderna och beräkningsmiljön. Numeriska resultat valideras mot experimenten och resultat i litteraturen, och avvikelserna för den aktuella numeriska metoden analyseras.
Computational Mathematics Beräkningsmatematik
120	Development, Implementation, Optimization and Performance Analysis of Matrix-Vector Multiplication on Eight-Core Digital Signal Processor
Muradov, Feruz. January 2013.
This thesis work aims at implementing sparse matrix-vector multiplication on an eight-core Digital Signal Processor (DSP) and giving insights on how to optimize matrix multiplication on a DSP to achieve high energy efficiency. We used two sparse matrix formats: the Compressed Sparse Row (CSR) and the Block Compressed Sparse Row (BCSR) formats. We carried out loop unrolling optimization of the naive algorithm. In addition, we implemented the Register-blocked and the Cache-blocked sparse matrix-vector multiplications to optimize the naive algorithm. The computational performance improvement with the loop unrolling technique was promising (≈ 12%). With this optimization, we observed a decrease in power usage (0.3 W) when using a matrix size of 600 and an increase in power usage (1.2 W) when using larger matrices. The Register-blocked algorithm proved to be the most efficient technique on the DSP. With this algorithm, we were able to increase performance by a factor of six compared to the naive algorithm, while still retaining low power consumption (≈ 14 W). The Cache-blocked sparse matrix-vector multiplication is known to be most convenient for a large number of architectures with coherent caches. However, because the DSP does not support coherency between caches, this method did not show a large improvement in computational performance. In fact, we confirm that power consumption for the Cache-blocked method was higher than for other effective algorithms such as Register-blocked sparse matrix-vector multiplication and loop unrolling of the naive algorithm. In conclusion, we found that the DSP delivers low power consumption, excellent computational performance and energy efficiency when the Register-blocked sparse matrix-vector multiplication technique is used.
Computational Mathematics Beräkningsmatematik
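As a reference point for the naive algorithm mentioned in the abstract above, here is a minimal sketch of sparse matrix-vector multiplication in the Compressed Sparse Row (CSR) format. It is written in Python for readability rather than in the DSP code of the thesis, and the array names `values`, `col_idx` and `row_ptr` are conventional but assumed.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    values  : nonzero entries, row by row
    col_idx : column index of each nonzero
    row_ptr : row_ptr[r]..row_ptr[r + 1] delimits the nonzeros of row r
    """
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# A = [[4, 0, 1],
#      [0, 3, 0],
#      [2, 0, 5]]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
print(csr_matvec(values, col_idx, row_ptr, [1.0, 2.0, 3.0]))  # -> [7.0, 6.0, 17.0]
```

The indirect access `x[col_idx[k]]` is the irregular memory pattern that techniques such as loop unrolling and register blocking (the latter on the BCSR format) try to mitigate.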

Search results