  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
341

Branch-level scheduling in Aurora : the Dharma scheduler

Sindaha, Raed Yousef Saba January 1995 (has links)
No description available.
342

Improving Performance and Quality-of-Service through the Task-Parallel Model : Optimizations and Future Directions for OpenMP

Podobas, Artur January 2015 (has links)
With the end of Dennard scaling, which held that transistors could be shrunk without increasing power density, computer hardware has become highly divergent. Initially the change only concerned the number of processors on a chip (multicores), but it has since escalated into complex heterogeneous systems with non-intuitive properties: properties that can improve performance and power consumption, but that also strain the programmers expected to develop for them. Answering these challenges is the OpenMP task-parallel model, a programming model that simplifies the writing of parallel software. The focus of this thesis is to explore performance and quality-of-service directions of the OpenMP task-parallel model, particularly by taking architectural features into account. The first question tackled is: what capabilities do existing state-of-the-art runtime systems have, and how do they perform? We empirically evaluated the performance of several modern task-parallel runtime systems. Performance and power consumption were measured using benchmarks, and we show that the two primary causes of bottlenecks in modern runtime systems lie either in task-management overheads or in how tasks are distributed across processors. Next, we consider quality-of-service improvements in task-parallel runtime systems. Striving to improve execution performance, current state-of-the-art runtime systems seldom take dynamic architectural features such as temperature into account when deciding how work should be distributed across processors, which can lead to overheating. We developed and evaluated two strategies for thermal awareness in task-parallel runtime systems. The first improves performance when the computer system is constrained by temperature, while the second strives to reduce temperature while meeting soft real-time objectives. We end the thesis by focusing on performance.
Here we introduce our original contribution, BLYSK, a prototype OpenMP framework created exclusively for performance research. We found that overheads in current runtime systems can be expensive and often lead to performance degradation. We introduce a novel way of preserving task graphs across application runs: task graphs are recorded, identified, and optimized the first time an OpenMP application is executed, and are re-used in subsequent executions, removing unnecessary overheads. Our proposed solution can nearly double performance compared with other state-of-the-art runtime systems. Performance can also be improved through heterogeneity. Today, manufacturers are placing processors with different capabilities on the same chip; because the processors differ, so do their power-consumption characteristics and performance. Heterogeneity adds another dimension to the multiprocessing problem: how should work be distributed across the heterogeneous processors? We evaluated the performance of existing, homogeneous scheduling algorithms and found them to be an ill match for heterogeneous systems. We propose a novel scheduling algorithm that dynamically adjusts itself to the heterogeneous system in order to improve performance. The thesis ends with a high-level synthesis approach to improving performance in task-parallel applications. Rather than limiting ourselves to off-the-shelf processors, which often contain a large amount of unused logic, our approach is to generate the processors ourselves automatically. Our method allows us to generate application-specific hardware from OpenMP task-parallel source code. Evaluated on FPGAs, our System-on-Chips outperformed other soft cores such as the NiosII processor and were comparable in performance with modern state-of-the-art processors such as the Xeon Phi and the AMD Opteron.
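The task-parallel model this abstract evaluates can be sketched with a minimal OpenMP example in C (a hypothetical illustration, not code from the thesis). Each spawned task pays a per-task runtime overhead, which is one of the bottleneck sources the thesis measures. Compile with `-fopenmp`; without it the pragmas are ignored and the code runs serially but still computes the same result.

```c
/* Naive task-parallel Fibonacci: each recursive call becomes an OpenMP
 * task that the runtime system is free to schedule on any core. */
long fib(int n) {
    long a, b;
    if (n < 2) return n;
    #pragma omp task shared(a)      /* child task writes parent's `a` */
    a = fib(n - 1);
    #pragma omp task shared(b)      /* child task writes parent's `b` */
    b = fib(n - 2);
    #pragma omp taskwait            /* join both child tasks */
    return a + b;
}

long fib_parallel(int n) {
    long result = 0;
    #pragma omp parallel            /* spin up a team of worker threads */
    #pragma omp single              /* one thread seeds the task graph */
    result = fib(n);
    return result;
}
```

The `taskwait` barrier is what makes the parent safe to read `a` and `b`; dropping it is a classic task-parallel bug.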
343

Supporting fault-tolerant parallel programming in Linda.

Bakken, David Edward January 1994 (has links)
As people are becoming increasingly dependent on computerized systems, the need for these systems to be dependable is also increasing. However, programming dependable systems is difficult, especially when parallelism is involved. This is due in part to the fact that very few high-level programming languages support both fault-tolerance and parallel programming. This dissertation addresses this problem by presenting FT-Linda, a high-level language for programming fault-tolerant parallel programs. FT-Linda is based on Linda, a language for programming parallel applications whose most notable feature is a distributed shared memory called tuple space. FT-Linda extends Linda by providing support to allow a program to tolerate failures in the underlying computing platform. The distinguishing features of FT-Linda are stable tuple spaces and atomic execution of multiple tuple space operations. The former is a type of stable storage in which tuple values are guaranteed to persist across failures, while the latter allows collections of tuple operations to be executed in an all-or-nothing fashion despite failures and concurrency. Example FT-Linda programs are given for both dependable systems and parallel applications. The design and implementation of FT-Linda are presented in detail. The key technique used is the replicated state machine approach to constructing fault-tolerant distributed programs. Here, tuple space is replicated to provide failure resilience, and the replicas are sent a message describing the atomic sequence of tuple space operations to perform. This strategy allows an efficient implementation in which only a single multicast message is needed for each atomic sequence of tuple space operations. An implementation of FT-Linda for a network of workstations is also described. FT-Linda is being implemented using Consul, a communication substrate that supports fault-tolerant distributed programming. 
Consul is built in turn with the x-kernel, an operating system kernel that provides support for composing network protocols. Each of the components of the implementation has been built and tested.
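The tuple-space operations at the heart of Linda and FT-Linda can be sketched in C. This is a toy, single-process model for illustration only: `ts_out` and `ts_in` are hypothetical names, and real tuple spaces add blocking semantics, matching on multiple fields, and, in FT-Linda, replication and atomic multi-operation sequences.

```c
#include <string.h>

/* Toy model of a Linda tuple space: ts_out() deposits a (name, value)
 * tuple, ts_in() withdraws a matching tuple. */
#define MAX_TUPLES 64

struct tuple { char name[32]; int value; int live; };
static struct tuple space[MAX_TUPLES];

/* Deposit a tuple; returns 0 on success, -1 if the space is full. */
int ts_out(const char *name, int value) {
    for (int i = 0; i < MAX_TUPLES; i++) {
        if (!space[i].live) {
            strncpy(space[i].name, name, sizeof space[i].name - 1);
            space[i].name[sizeof space[i].name - 1] = '\0';
            space[i].value = value;
            space[i].live = 1;
            return 0;
        }
    }
    return -1;
}

/* Withdraw a tuple matching `name`; returns 1 and fills *value if one
 * is found, 0 otherwise. A real in() would block rather than fail. */
int ts_in(const char *name, int *value) {
    for (int i = 0; i < MAX_TUPLES; i++) {
        if (space[i].live && strcmp(space[i].name, name) == 0) {
            *value = space[i].value;
            space[i].live = 0;
            return 1;
        }
    }
    return 0;
}
```

FT-Linda's contribution is precisely what this sketch lacks: the tuples persist across failures (stable tuple spaces), and a sequence of such operations can be executed atomically on replicated copies of the space.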
344

PDDS : a parallel deductive database system

Cao, Hua January 1995 (has links)
No description available.
345

New approaches to static task graph scheduling for control

Sandnes, Frode Eika January 1997 (has links)
No description available.
346

Integer performance evaluation of the dynamically trace scheduled VLIW

De Souza, Alberto Ferreira January 1999 (has links)
No description available.
347

Efficient scheduling of parallel applications on workstation clusters

Dantas, Mario A. R. January 1996 (has links)
No description available.
348

New data synchronization & mapping strategies for PACE - VLSI processor architecture

Xu, Yifan January 1995 (has links)
No description available.
349

FADI : a fault-tolerant environment for distributed processing systems

Osman, Taha Mohammed January 1998 (has links)
No description available.
350

An Automated Micromanipulation System for 3D Parallel Microassembly

Chu, Henry Kar Hang 05 January 2012 (has links)
The introduction of microassembly technologies has opened up new avenues for the fabrication of sophisticated, three-dimensional Microelectromechanical System (MEMS) devices. This thesis presents the development of a robotic micromanipulation system and its controller algorithms for conventional pick-and-place microassembly processes. The work incorporated parallel assembly and automation to improve the overall productivity and reduce the operating costs of the process. A parallel set of three microgrippers was designed and implemented to grasp and assemble three microparts simultaneously. The complete microassembly process was automated through a vision-based control approach, with visual images from two vision systems used for precise position evaluation and alignment. Precise alignment between the micropart and the microgripper is critical to the microassembly process. Because of the vision systems' limited fields of view, the micropart could drift out of the microscope's field of view during the re-orientation process. In this work, a tracking algorithm was developed to constrain the micropart within the camera view: the unwanted translational motions of the micropart were estimated, and the algorithm continuously manipulated and repositioned the micropart for vision-based assembly. In addition, the limited fields of view of the vision systems are not sufficient to monitor the assembly operations of all three grippers concurrently. This work presents a strategy that uses visual information from only one gripper set for all the necessary alignment and positioning processes. Through proper system calibration and the alignment algorithms developed, grippers that were not visually monitored could also perform assembly operations. When visual images from a single camera are used for 3D positioning, the missing dimension between the 2D image and the 3D workspace introduces errors in position evaluation.
Hence, a novel approach is presented to utilize image reflection of the micropart for online evaluation of the Jacobian matrix. The relative 3D position between the slot and micropart was evaluated with high precision. The developed algorithms were integrated onto the micromanipulation system. Automated parallel microassemblies were conducted successfully.
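The image-Jacobian correction step this kind of vision-based positioning relies on can be sketched numerically. This is a simplified, hypothetical 2D example: given a Jacobian J mapping stage motion to pixel motion, solve J·d = e for the stage correction d that cancels the pixel error e. The thesis's actual method goes further, estimating the Jacobian online from image reflections to recover the third dimension.

```c
/* Solve the 2x2 linear system J*d = e by Cramer's rule; returns 0 on
 * success, -1 if J is near-singular (degenerate camera geometry). */
int solve2x2(double J[2][2], const double e[2], double d[2]) {
    double det = J[0][0] * J[1][1] - J[0][1] * J[1][0];
    if (det < 1e-12 && det > -1e-12) return -1;
    d[0] = (e[0] * J[1][1] - e[1] * J[0][1]) / det;  /* stage dx */
    d[1] = (J[0][0] * e[1] - J[1][0] * e[0]) / det;  /* stage dy */
    return 0;
}
```

In a visual-servoing loop this solve runs once per frame: measure the pixel error between the micropart and its target, compute d, command the stage, and repeat until the error falls below a threshold.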
