Spelling suggestions: "subject:"atructured parallelism"" "subject:"atructured arallelism""
1 |
Dynamic Data Race Detection for Structured ParallelismRaman, Raghavan 24 July 2013 (has links)
With the advent of multicore processors and an increased emphasis on parallel computing, parallel programming has become a fundamental requirement for achieving available performance. Parallel programming is inherently hard because, to reason about the correctness of a parallel program, programmers have to consider large numbers of interleavings of statements in different threads in the program. Though structured parallelism imposes some restrictions on the programmer, it is an attractive approach because it provides useful guarantees such as deadlock-freedom. However, data races remain a challenging source of bugs in parallel programs. Data races may occur only in few of the possible schedules of a parallel program, thereby making them extremely hard to detect, reproduce, and correct. In the past, dynamic data race detection algorithms have suffered from at least one of the following limitations: some algorithms have a worst-case linear space and time overhead, some algorithms are dependent on a specific scheduling technique, some algorithms generate false positives and false negatives, some have no empirical evaluation as yet, and some require sequential execution of the parallel program.
In this thesis, we introduce dynamic data race detection algorithms for structured parallel programs that overcome past limitations. We present a race detection algorithm called ESP-bags that requires the input program to be executed sequentially and another algorithm called SPD3 that can execute the program in parallel. While the ESP-bags algorithm addresses all the above mentioned limitations except sequential execution, the SPD3 algorithm addresses the issue of sequential execution by scaling well across highly parallel shared memory multiprocessors. Our algorithms incur constant space overhead per memory location and time overhead that is independent of the number of processors on which the programs execute. Our race detection algorithms support a rich set of parallel constructs (including async, finish, isolated, and future) that are found in languages such as HJ, X10, and Cilk. Our algorithms for async, finish, and future are precise and sound for a given input. In the presence of isolated, our algorithms are precise but not sound. Our experiments show that our algorithms (for async, finish, and isolated) perform well in practice, incurring an average slowdown of under 3x over the original execution time on a suite of 15 benchmarks. SPD3 is the first practical dynamic race detection algorithm for async-finish parallel programs that can execute the input program in parallel and use constant space per memory location. This takes us closer to our goal of building dynamic data race detectors that can be "always-on" when developing parallel applications.
|
2 |
Language Constructs for Safe Parallel Programming on Multi-CoresÖstlund, Johan January 2016 (has links)
The last decade has seen the transition from single-core processors to multi-cores and many-cores. This move has by and large shifted the responsibility from chip manufacturers to programmers to keep up with ever-increasing expectations on performance. In the single-core era, improvements in hardware capacity could immediately be leveraged by an application: faster machine - faster program. In the age of the multi-cores, this is no longer the case. Programs must be written in specific ways to utilize available parallel hardware resources. Programming language support for concurrent and parallel programming is poor in most popular object-oriented programming languages. Shared memory, threads and locks is the most common concurrency model provided. Threads and locks are hard to understand, error-prone and inflexible; they break encapsulation - the very foundation of the object-oriented approach. This makes it hard to break large complex problems into smaller pieces which can be solved independently and composed to make a whole. Ubiquitous parallelism and object-orientation, seemingly, do not match. Actors, or active objects, have been proposed as a concurrency model better fit for object-oriented programming than threads and locks. Asynchronous message passing between actors each with a logical thread of control preserves encapsulation as objects themselves decide when messages are executed. Unfortunately most implementations of active objects do not prevent sharing of mutable objects across actors. Sharing, whether on purpose or by accident, exposes objects to multiple threads of control, destroying object encapsulation. In this thesis we show techniques for compiler-enforced isolation of active objects, while allowing sharing and zero-copy communication of mutable data in the cases where it is safe to do so. We also show how the same techniques that enforce isolation can be utilized internal to an active object to allow data race-free parallel message processing and data race-free structured parallel computations. This overcomes the coarse-grained nature of active object parallelism without compromising safety. / UPMARC
|
Page generated in 0.0676 seconds