  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
81

Algorithm design for structured matrix computations

Huang, Yuguang January 1999
No description available.
82

Partial dynamic reconfiguration of FPGAs for systolic circuits

Cadenas Medina, Oswaldo January 2002
No description available.
83

Task assignment in parallel processor systems

Manoharan, Sathiamoorthy January 1993
A generic object-oriented simulation platform is developed in order to conduct experiments on the performance of assignment schemes. The simulation platform, called Genesis, is generic in the sense that it can model the key parameters that describe a parallel system: the architecture, the program, the assignment scheme and the message routing strategy. Genesis uses as its basis a sound architectural representation scheme developed in the thesis. The thesis reports results from a number of experiments assessing the performance of assignment schemes using Genesis. The results indicate that the new assignment scheme proposed in this thesis is a promising alternative to the work-greedy assignment schemes: it has a lower time complexity than the work-greedy schemes while achieving an average performance better than, or comparable to, theirs. Generating an assignment requires estimates of some parameters describing the program model, and in many cases accurate estimation of these parameters is hard. One might expect such inaccuracies to lead to poor assignments. The thesis investigates this speculation and presents experimental evidence showing that the inaccuracies do not greatly affect the quality of the assignments.
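The work-greedy baseline that the thesis compares against can be sketched as a simple list scheduler that always hands the next task to the currently least-loaded processor. This is a minimal illustration, not the thesis's actual Genesis code; the task names and costs are invented:

```python
import heapq

def work_greedy_assign(task_costs, num_procs):
    """Assign each task to the currently least-loaded processor
    (a simple work-greedy list-scheduling baseline)."""
    # Min-heap of (accumulated_work, processor_id) pairs.
    heap = [(0.0, p) for p in range(num_procs)]
    heapq.heapify(heap)
    assignment = {}
    for task, cost in task_costs.items():
        load, proc = heapq.heappop(heap)   # least-loaded processor
        assignment[task] = proc
        heapq.heappush(heap, (load + cost, proc))
    return assignment

tasks = {"t1": 4.0, "t2": 2.0, "t3": 3.0, "t4": 1.0}
print(work_greedy_assign(tasks, 2))  # {'t1': 0, 't2': 1, 't3': 1, 't4': 0}
```

Schemes of this family achieve good balance but pay a per-task heap operation, which is the time-complexity cost the proposed alternative reduces.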
84

Adaptive techniques for BSP Time Warp

Low, Malcolm Yoke Hean January 2002
Parallel simulation is a well-developed technique for executing large and complex simulation models so that simulation output can be obtained for analysis within an acceptable time frame. The main contribution of this thesis is the development of adaptive techniques that improve the consistency, performance and resilience of BSP Time Warp as a general-purpose parallel simulation protocol. We first study the problem of risk hazards in the optimistic BSP Time Warp protocol. Successive refinements to the protocol eliminate errors in simulation execution due to different risk hazards, and we show that these refinements can be incorporated with minimal performance degradation. We next propose an adaptive scheme for the BSP Time Warp algorithm that automatically throttles the number of events executed per superstep. Operating in a shared-memory environment, the scheme minimizes computation load imbalance and rollback overhead at the expense of higher synchronization cost. The next contribution of this thesis is a study of techniques for dynamic load-balancing and process migration for Time Warp on a cluster of workstations. We propose dynamic load-balancing algorithms for BSP Time Warp that balance both computation and communication workload, optimize lookaheads between processors, and manage interruption from external workload. Finally, we propose an adaptive technique for BSP Time Warp that automatically varies the number of processors used for parallel computation based on the characteristics of the underlying parallel computing platform and the simulation workload.
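The event-throttling idea can be illustrated with a small sketch: shrink the per-superstep event quota when rollbacks dominate, and grow it when they are rare. The multiplicative factors and the threshold here are assumptions for illustration, not the thesis's actual control scheme:

```python
def adapt_event_quota(quota, rollbacks, committed,
                      target_rollback_ratio=0.1,
                      min_quota=1, max_quota=1000):
    """Adaptively throttle the number of optimistic events executed
    per BSP superstep, based on how many of last superstep's events
    were rolled back versus committed."""
    total = rollbacks + committed
    if total == 0:
        return quota                          # no information: keep quota
    ratio = rollbacks / total
    if ratio > target_rollback_ratio:
        quota = max(min_quota, quota // 2)    # too many rollbacks: back off
    else:
        quota = min(max_quota, quota * 2)     # cheap supersteps: be bolder
    return quota

q = 64
q = adapt_event_quota(q, rollbacks=30, committed=70)  # ratio 0.3 -> halve
print(q)  # 32
q = adapt_event_quota(q, rollbacks=2, committed=98)   # ratio 0.02 -> double
print(q)  # 64
```

A multiplicative increase/decrease rule like this reacts quickly to bursts of rollbacks while still probing for more optimism when execution is stable.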
85

Low communication cost parallel system using PCs.

January 1996
by Yiu Sau Yan Vincent. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1996. / Includes bibliographical references (leaves 86-88).
Contents:
1 Introduction (p. 1)
2 Related Works (p. 3)
  2.1 Tightly-coupled Parallel Systems (p. 5)
  2.2 Loosely-coupled Parallel Systems (p. 6)
3 Communication Protocol (p. 11)
  3.1 Terminology (p. 12)
  3.2 CUP Model (p. 14)
  3.3 Message Format (p. 15)
  3.4 Message Header (p. 16)
  3.5 Message Content: Control Message (p. 17)
  3.6 Message Transfer Functions (p. 18)
  3.7 Application Development (p. 22)
4 Multiple Computer Infrastructure (p. 28)
  4.1 Application Support (p. 32)
    4.1.1 Send and Receive (p. 34)
    4.1.2 Multicast (p. 35)
    4.1.3 Barrier Synchronization (p. 36)
    4.1.4 Start and Delete Process (p. 37)
  4.2 Local Message Routing (p. 39)
    4.2.1 Berkeley Socket (p. 40)
    4.2.2 System V Message Queue (p. 45)
    4.2.3 Shared Memory Queue (SMQ) (p. 47)
  4.3 Network Message Routing (p. 49)
    4.3.1 Ethernet & TCP Socket (p. 51)
    4.3.2 SCSI Link (p. 52)
5 System Supporting Facilities (p. 54)
  5.1 Kernel Message Support (p. 54)
  5.2 SCSI Hardware & Device Driver (p. 60)
    5.2.1 SCSI Bus Operations (p. 61)
    5.2.2 Device Driver Internals (p. 65)
6 Performance (p. 73)
7 Conclusion (p. 83)
  7.1 Summary of Our Research (p. 83)
  7.2 Future Research (p. 84)
86

Special Cases of Carry Propagation

Izsak, Alexander 01 May 2007
The average time necessary to add numbers by local parallel computation is directly related to the length of the longest carry propagation chain in the sum. The mean length of the longest carry propagation chain when adding two independent uniform random n-bit numbers is a well-studied topic, and useful approximations as well as an exact expression for this value have been found. My thesis searches for similar formulas for the mean length of the longest carry propagation chain in the sums that arise when a random n-digit number is multiplied by a number of the form 1 + 2^d. Letting C_{n,d} represent the desired mean, my thesis details how to find formulas for C_{n,d} using probability, generating functions and linear algebra arguments. I also find bounds on C_{n,d} that prove C_{n,d} = log2 n + O(1), and show work towards an even more exact approximation for C_{n,d}.
87

Master/slave parallel processing

Larsen, Steen K. 13 January 1999
An 8-bit microcontroller slave unit was designed, constructed, and tested to demonstrate the advantages and feasibility of master/slave parallel processing using conventional processors and relatively slow inter-processor communication. An 8-bit ISA bus controlled by an 80X86 is interfaced to a logic block that controls data flow to and from the slave processors. The slave processors retrieve tasks sent by the master processor and, once these are completed, return results that are buffered for the master's retrieval. The task message sent to the slave processors carries a task description and task parameters. The master has access to the bi-directional buffer and a status byte for each slave processor. Considerable effort is made to keep the hardware and software architecture expandable, so that the general design can be used with different master/slave targets. Attention is also given to cost-effective solutions so that development and possible market production can be considered. / Graduation date: 1999
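A task message of the kind described (a task description followed by task parameters) might be laid out on the wire roughly as follows; the field names and byte sizes are invented for illustration, not the thesis's actual format:

```python
import struct

def pack_task(task_id, params):
    """Pack a hypothetical master-to-slave task message:
    1-byte task id, 1-byte parameter count, then 8-bit parameters."""
    return struct.pack(f"BB{len(params)}B", task_id, len(params), *params)

def unpack_task(msg):
    """Inverse of pack_task: recover (task_id, [params])."""
    task_id, count = struct.unpack_from("BB", msg)
    params = list(struct.unpack_from(f"{count}B", msg, 2))
    return task_id, params

msg = pack_task(0x02, [10, 20, 30])   # e.g. task 2 with three byte operands
print(unpack_task(msg))               # (2, [10, 20, 30])
```

Keeping the header to a fixed two bytes lets a slow 8-bit slave parse the message with minimal buffering, which matches the low-bandwidth link the design targets.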
88

Towards architecture-adaptable parallel programming

Kumaran, Santhosh 26 July 1996
There is a software gap in parallel processing. The short lifespan and small installation base of parallel architectures have made it economically infeasible to develop platform-specific parallel programming environments that deliver both performance and programmability. One obvious solution is to build architecture-independent programming environments, but architecture independence usually comes at the expense of performance, since the most efficient parallel algorithm for solving a problem often depends on the target platform. Thus, unless a parallel programming system has the ability to adapt the algorithm to the architecture, it will not be effectively machine-independent. This research develops a new methodology for architecture-adaptable parallel programming. The methodology is built on three key ideas: (1) the use of a database of parameterized algorithmic templates to represent computable functions; (2) frame-based representation of processing environments; and (3) the use of an analytical performance prediction tool for automatic algorithm design. This methodology pursues a problem-oriented approach to parallel processing as opposed to the traditional algorithm-oriented approach, enabling the development of software environments with a high level of abstraction. Users state the problem to be solved using a high-level notation; they are freed from the esoteric tasks of parallel algorithm design and implementation. The methodology has been validated in the form of a prototype system capable of automatically generating an efficient parallel program when presented with a well-defined problem and the description of a target platform. The use of object technology has made the system easily extensible. The templates are designed using a parallel adaptation of the well-known divide-and-conquer paradigm. The prototype system has been used to solve several numerical problems efficiently on a wide spectrum of architectures.
The target platforms include multicomputers (Thinking Machines CM-5 and Meiko CS-2), networks of workstations (IBM RS/6000s connected by FDDI), multiprocessors (Sequent Symmetry, SGI Power Challenge, and Sun SPARCServer), and a hierarchical system consisting of a cluster of multiprocessors on Myrinet. / Graduation date: 1997
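The selection step at the heart of such a system can be sketched as follows: each algorithmic template carries an analytical cost model, the platform is described by a few frame-like parameters, and the template with the lowest predicted runtime wins. The cost models and platform fields below are illustrative assumptions, not the thesis's actual models:

```python
# Frame-like description of a target platform (fields are assumptions).
platform = {"procs": 8, "flop_time": 1e-9, "msg_latency": 1e-4}

# Each template maps (problem size, machine) -> predicted runtime in seconds.
templates = {
    "divide_and_conquer": lambda n, m: (n * m["flop_time"] / m["procs"]
                                        + m["procs"] * m["msg_latency"]),
    "sequential":         lambda n, m: n * m["flop_time"],
}

def select_template(n, machine):
    """Pick the template whose cost model predicts the lowest runtime."""
    return min(templates, key=lambda t: templates[t](n, machine))

print(select_template(10**9, platform))  # large problem: parallelism pays off
print(select_template(10**3, platform))  # tiny problem: communication dominates
```

Even this toy version captures the key point: the crossover between templates depends on the machine's latency-to-compute ratio, which is exactly why the choice cannot be fixed independently of the architecture.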
89

Analytical performance prediction of data-parallel programs

Clement, Mark J. 25 July 1994
Graduation date: 1995
90

Solving Multiple Classes of Problems in Parallel with MATLAB*P

Choy, Ron, Edelman, Alan 01 1900
MATLAB [7] is one of the most widely used environments in technical computing. It is an interactive environment that provides high-performance computational routines and an easy-to-use, C-like scripting language. Mathworks, the company that develops MATLAB, currently does not provide a version of MATLAB that can utilize parallel computing [9]. This has led to academic and commercial efforts outside Mathworks to build a parallel MATLAB, using a variety of approaches. MATLAB*P is a parallel MATLAB that focuses on enhancing productivity by providing an easy-to-use parallel computing tool. Using syntax identical to regular MATLAB, it can be used to solve large-scale algebraic problems as well as multiple small problems in parallel. This paper describes how the innovative combination of '*p mode' and 'MultiMATLAB/MultiOctave mode' in MATLAB*P can be used to solve a large range of real-world problems. / Singapore-MIT Alliance (SMA)
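Since MATLAB*P's own syntax is MATLAB, here is a toy Python illustration of the '*p' idea: tagging a dimension marks arrays built with that size as distributed while the calling syntax stays unchanged. All class and function names here are invented for the sketch and do not reflect MATLAB*P's implementation:

```python
class PSize(int):
    """An integer that remembers it was tagged with *p."""
    pass

class PTag:
    """The 'p' object: multiplying a dimension by it yields a PSize."""
    def __rmul__(self, n):          # enables `8 * p`
        return PSize(n)

p = PTag()

def zeros(n):
    """Build an array; a PSize dimension silently makes it 'distributed'."""
    kind = "distributed" if isinstance(n, PSize) else "local"
    return {"kind": kind, "data": [0.0] * int(n)}

a = zeros(8)        # ordinary local array
b = zeros(8 * p)    # identical call shape, but marked distributed
print(a["kind"], b["kind"])   # local distributed
```

The point of the design is that user code never branches on where the data lives; the tag propagates through dimension arithmetic, so the same script runs serially or in parallel.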
