281 |
Contributions to Formal Specification and Modular Verification of Parallel and Sequential Software
Weide, Alan January 2021
No description available.
|
282 |
Automatic and Explicit Parallelization Approaches for Mathematical Simulation Models
Gebremedhin, Mahder January 2015
The move from single-core and single-processor systems to multi-core and many-processor systems comes with the requirement of implementing computations in a way that can utilize these multiple units efficiently. This task of writing efficient multi-threaded algorithms will not be possible without improving programming languages and compilers to provide the mechanisms to do so. Computer-aided mathematical modeling and simulation is one of the most computationally intensive areas of computer science. Even simplified models of physical systems can impose a considerable computational load on the processors at hand. Being able to take advantage of the potential computation power provided by multi-core systems is vital in this area of application. This thesis tries to address how we can take advantage of the potential computation power provided by these modern processors to improve the performance of simulations. The work presents improvements for the Modelica modeling language and the OpenModelica compiler. Two approaches to utilizing the computational power provided by modern multi-core architectures are presented in this thesis: automatic and explicit parallelization. The first approach presents the process of extracting and utilizing potential parallelism from equation systems in an automatic way, without any need for extra effort from the modeler's or programmer's side. The thesis explains improvements made to the OpenModelica compiler and presents the accompanying task systems library for efficient representation, clustering, scheduling, profiling, and execution of complex equation/task systems with heavy dependencies. The explicit parallelization approach explains the process of utilizing parallelism with the help of the modeler or programmer. New programming constructs have been introduced to the Modelica language in order to enable modelers to write parallelized code. The OpenModelica compiler has been improved accordingly to recognize and utilize the information from these new algorithmic constructs and generate parallel code to improve the performance of computations. / The series name Linköping Studies in Science and Technology Licentiate Thesis is incorrect. The correct series name is Linköping Studies in Science and Technology Thesis.
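The automatic approach described above depends on finding which equations (tasks) can run at the same time given their dependencies. The following is a minimal sketch of one generic way to do that, level scheduling of a dependency graph. It is not the algorithm used by the OpenModelica task systems library; the function name `parallel_levels` and the example task names are assumptions made for illustration.

```python
def parallel_levels(deps):
    """Group tasks into levels; tasks within one level have no mutual dependencies.

    `deps` maps each task name to the set of tasks it depends on.
    Every dependency is assumed to appear as a key in `deps`.
    """
    remaining = {task: set(d) for task, d in deps.items()}
    levels = []
    while remaining:
        # Tasks whose dependencies have all been scheduled are ready to run.
        ready = sorted(task for task, d in remaining.items() if not d)
        if not ready:
            raise ValueError("cyclic dependency between tasks")
        levels.append(ready)
        for task in ready:
            del remaining[task]
        # Mark the scheduled tasks as satisfied for everything still waiting.
        for d in remaining.values():
            d.difference_update(ready)
    return levels


# Hypothetical equation system: c needs a and b, d needs c, e needs a.
deps = {"a": set(), "b": set(), "c": {"a", "b"}, "d": {"c"}, "e": {"a"}}
levels = parallel_levels(deps)
```

Each inner list can be handed to parallel workers, with a barrier between consecutive levels. A production scheduler would also weigh task cost and cluster small tasks, which this sketch leaves out.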
|
283 |
Program structures and computer architectures for parallel processing
Montagne, Euripides. January 1985
No description available.
|
284 |
GPU-Assisted Collision Avoidance for Trajectory Optimization : Parallelization of Lookup Table Computations for Robotic Motion Planners Based on Optimal Control
Bishnoi, Abhiraj January 2021
One of the biggest challenges associated with optimization-based methods for robotic motion planning is their extreme sensitivity to a good initial guess, especially in the presence of local minima in the cost function landscape. Additional challenges may also arise due to operational constraints: robot controllers sometimes have very little time to plan a trajectory to perform a desired function. To work around these limitations, a common solution is to split the motion planner into an offline phase and an online phase. The offline phase entails computing reference trajectories for varying parameterizations of the task space in the form of a lookup table. During the online phase, a stripped-down version of the optimizer is supplied with a suitable initial guess from the lookup table using the current state estimate of the robot and its surrounding bodies. This method helps in alleviating problems related to both local minima and operational time constraints, by seeding the optimizer with a suitable initial guess that allows it to converge to the global minimum much faster. The problem, however, shifts to the computational complexity of computing a lookup table of reference trajectories for a fine enough discretization of the input state space. For many robotic scenarios of interest, it is often impractical and sometimes computationally infeasible to compute a lookup table using a serial, single-core implementation of the offline phase of a motion planner. The main contribution of this work is to develop and evaluate a method for reducing the time spent on computing a lookup table of reference trajectories during the offline phase of motion planners based on optimal control. 
We implement a method to offload the computation of collision avoidance constraints during trajectory optimization onto a Graphics Processing Unit (GPU), while simultaneously benefiting from a task-based approach to distribute lookup table computations for independent subsets of the input state space across multiple processes on a cluster of machines. We demonstrate the efficacy of the proposed method in a practical setting by implementing and evaluating it within a representative motion planner based on optimal control. We observe that the implemented method is 115x faster than the original serial version of the planner, using 86 processes on 5 machines with standard server-grade hardware and 5 Graphics Processing Units in total. Additionally, we observe that the implemented method results in solutions identical to the original serial version in 96.6% of cases, lending credibility to its use in robotic motion planning. / One of the biggest challenges with optimization-based methods for motion planning in robotics is their extreme sensitivity to a good initial guess, especially in the presence of local minima in the cost function landscape. Further challenges can also arise from operational constraints. Robot controllers sometimes have very little time to plan a path to perform a desired function. To work around these limitations, a common solution is to split the motion planner into an offline phase and an online phase. The offline phase involves computing reference paths for different points in the input state space in the form of a lookup table. During the online phase, a stripped-down version of the optimizer is supplied with a suitable initial guess from the lookup table using the current estimate of the robot and its surrounding bodies. This method helps alleviate problems related to both local minima and operational time constraints by seeding the optimizer with a suitable initial guess that lets it converge to the global minimum much faster. The problem, however, now shifts to the computational complexity of computing a lookup table of reference paths for a sufficiently fine discretization of the input state space. For many robotic scenarios of interest, it is often impractical and sometimes computationally impossible to compute a lookup table using a serial, single-core implementation of the offline phase of a motion planner. The main contribution of this work is to develop and evaluate a method for reducing the time spent computing a lookup table of reference paths during the offline phase of motion planners based on optimal control. We implement a method to perform collision avoidance on a graphics processing unit (GPU), while using a task-based approach to distribute lookup table computations for independent subsets of the input space across multiple processes in a cluster of machines. We demonstrate the effectiveness of the proposed method in a practical setting by implementing and evaluating it within a representative motion planner based on optimal control. We note that the implemented method is 115 times faster than the original serial version of the planner, with 86 processes on 5 machines with standard hardware and 5 GPUs in total. In addition, we observe that the implemented method results in solutions identical to the original serial version in more than 96.6% of cases, which lends credibility to its use in robotic motion planning.
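The offline/online split described in this abstract can be illustrated with a small sketch. It is only a toy under stated assumptions: `solve_reference` is a hypothetical stand-in for the offline optimizer, the 3x3 grid is an invented discretization, and a thread pool stands in for the processes, cluster, and GPU offloading used in the thesis.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def solve_reference(cell):
    """Hypothetical offline solve: a straight-line 'trajectory' from the origin to the cell."""
    x, y = cell
    return [(x * t / 4, y * t / 4) for t in range(5)]


def build_lookup_table(cells, workers=4):
    """Offline phase: each cell is independent, so the solves can run concurrently."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        trajectories = list(pool.map(solve_reference, cells))
    return dict(zip(cells, trajectories))


def initial_guess(table, state):
    """Online phase: seed the optimizer with the trajectory of the nearest grid cell."""
    nearest = min(
        table,
        key=lambda c: (c[0] - state[0]) ** 2 + (c[1] - state[1]) ** 2,
    )
    return table[nearest]


cells = list(product(range(3), range(3)))  # assumed 3x3 discretization of a 2D state space
table = build_lookup_table(cells)
guess = initial_guess(table, (1.9, 0.2))   # nearest cell is (2, 0)
```

Because no cell depends on another, the table can be split into arbitrary subsets and distributed. A finer grid gives better initial guesses but multiplies the number of offline solves, which is the cost the thesis targets.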
|
285 |
Threaded WARPED : An Optimistic Parallel Discrete Event Simulator for Cluster of Multi-Core Machines
Muthalagu, Karthikeyan January 2012
No description available.
|
286 |
CUDA Accelerated 3D Non-rigid Diffeomorphic Registration / CUDA-accelererad icke-rigid diffeomorf registrering i 3D
Qu, An January 2017
Advances in magnetic resonance imaging (MRI) techniques enable visual guidance to identify the anatomical target of interest during image-guided intervention (IGI). Non-rigid image registration is one of the crucial techniques, aligning the target tissue with the preoperative MRI image volumes. With the growing demand for real-time interaction in IGI, the time used for intraoperative registration is increasingly important. This work implements the 3D diffeomorphic demons algorithm on an Nvidia GeForce GTX 1070 GPU in C++, based on the CUDA 8.0.61 programming environment, with which the average registration time is reduced to 5 s. We have also extensively evaluated the GPU-accelerated 3D diffeomorphic registration against both a CPU implementation and Matlab code, and the results show that the GPU implementation achieves much better algorithmic efficiency.
|
287 |
Parallel ILU Preconditioning for Structured Grid Matrices
Eisenlohr, John Merrick 20 May 2015
No description available.
|
288 |
Modeling Performance of Tensor Transpose using Regression Techniques
Srivastava, Rohit Kumar 15 August 2018
No description available.
|
289 |
Scalable Task Parallel Programming in the Partitioned Global Address Space
Dinan, James S. 02 September 2010
No description available.
|
290 |
Design and Implementation of a Performance Visualization Tool for the High-Level Parallel Programming Framework SkePU / Design och implementation av ett prestandavisualiseringsverktyg för parallellprogrammeringsramverket SkePU
Frankell, Elin January 2024
The rise of parallel programming languages, a result of processors' flattening clock frequencies, has led to wider use of high-level, pattern-based parallel programming frameworks such as SkePU. SkePU provides a sequential high-level interface connected to different back-ends, or a hybrid of several, which reduces the time required to learn several new programming languages. However, this high-level interface introduces a new problem: an abstraction layer between the user and the code that is actually executed. To make SkePU programs easier to analyze, this thesis implements a performance visualization tool for SkePU. The literature study conducted found that there currently exist very few performance visualization tools that cater specifically to skeleton programming languages. This thesis evaluates the usability of the implemented tool by conducting a survey and a user study with participants who are very familiar with SkePU. The choice of evaluation method is itself critically evaluated, as are the design choices made, and results are presented with an accompanying discussion of how those results were derived.
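To make the idea of a skeleton with interchangeable back-ends concrete, here is a minimal sketch. It is not SkePU's API (SkePU is a C++ framework); the class `MapSkeleton`, its `backend` parameter, and the `trace` list are hypothetical names chosen for illustration. The trace shows the kind of per-call record a performance visualization tool could consume.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class MapSkeleton:
    """Hypothetical map skeleton: one call interface, selectable back-end."""

    def __init__(self, func, backend="sequential"):
        if backend not in ("sequential", "threads"):
            raise ValueError(f"unknown backend: {backend}")
        self.func = func
        self.backend = backend
        self.trace = []  # (backend, element count, elapsed seconds) per call

    def __call__(self, data):
        start = time.perf_counter()
        if self.backend == "threads":
            with ThreadPoolExecutor() as pool:
                result = list(pool.map(self.func, data))
        else:
            result = [self.func(x) for x in data]
        # Record what actually ran, which the abstraction otherwise hides.
        self.trace.append((self.backend, len(data), time.perf_counter() - start))
        return result


def square(x):
    return x * x


seq = MapSkeleton(square)
par = MapSkeleton(square, backend="threads")
seq_out = seq([1, 2, 3])
par_out = par([1, 2, 3])
```

The caller's code is identical for both back-ends, which is the convenience the abstract describes. The recorded trace is one simple way to expose what the abstraction layer conceals.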
|