101. Multiplexed pipelining : a cost effective loop transformation technique / Pai, Satish. 01 January 1992
Parallel processing has gained increasing importance over the last few years. A key aim of parallel processing is to improve the execution times of scientific programs by mapping them onto many processors. Loops form an important part of most computational programs and must be processed efficiently to achieve good execution times. Important examples of such programs include graphics algorithms, matrix operations (used in signal processing and image processing applications), particle simulation, and other scientific applications. Pipelining uses overlapped parallelism to reduce execution time efficiently.
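As an illustration of the overlapped parallelism the abstract describes, below is a minimal C sketch of software pipelining for a two-stage loop; the stage functions and data are hypothetical and are not taken from the thesis:

```c
#include <stdio.h>

#define N 8

/* two hypothetical pipeline stages */
static int stage1(int x) { return x * 2; }  /* e.g., a load/compute stage  */
static int stage2(int x) { return x + 1; }  /* e.g., a combine/store stage */

int main(void) {
    int in[N], out[N];
    for (int i = 0; i < N; i++) in[i] = i;

    /* Pipelined schedule: while stage2 finishes iteration i, stage1 has
       already started iteration i+1, overlapping the two stages. */
    int carry = stage1(in[0]);            /* prologue: fill the pipeline  */
    for (int i = 0; i < N - 1; i++) {
        int next = stage1(in[i + 1]);     /* overlapped with ...          */
        out[i] = stage2(carry);           /* ... completing iteration i   */
        carry = next;
    }
    out[N - 1] = stage2(carry);           /* epilogue: drain the pipeline */

    for (int i = 0; i < N; i++) printf("%d ", out[i]);
    printf("\n");
    return 0;
}
```

On a machine with independent functional units (or across processors), the stage1 and stage2 calls inside the loop body can proceed concurrently because they touch different iterations.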
102. Optimization and enhancement strategies for data flow systems / Dunkelman, Laurence William. January 1984
No description available.
103. Multiport memory as a medium for interprocessor communication in multiprocessors / Asgari, Nasser. January 2003
"February 2003" / Includes bibliography (leaves 192-203) / xix, 203 leaves : ill. ; 30 cm. / Title page, contents and abstract only. The complete thesis in print form is available from the University Library. / Thesis (Ph.D.)--University of Adelaide, Dept. of Electrical and Electronic Engineering, 2003
104. A study of hardware/software multithreading / Carlson, Ryan L. 04 June 1998
As computer design advances, two important trends have surfaced: the exploitation of parallelism and the need to tolerate memory latency. At the intersection of these two trends lies the Multithreaded Virtual Processor (MVP). Built on a standard superscalar core, the MVP exploits Instruction Level Parallelism (ILP) and, using multithreading, further exploits Thread Level Parallelism (TLP) in program code. By combining hardware and software multithreading techniques into a new hybrid model, the MVP can use fast hardware context switching together with both hardware and software scheduling. The resulting hybrid processor exploits long-latency memory operations to increase parallelism while requiring only minimal software overhead and hardware design changes.

This thesis explores the MVP model and simulator and presents results that illustrate MVP's effectiveness and support its inclusion in future processor designs. The thesis also shows that MVP's effectiveness is governed by four main factors: (1) the size of the data set relative to the cache, (2) the number of hardware contexts/threads supported, (3) the amount of locality within the data sets, and (4) the amount of exploitable parallelism within the algorithms. / Graduation date: 1999
105. Automatic program restructuring for distributed memory multicomputers / Ikei, Mitsuru
M.S. / Computer Science and Engineering / To compile a Single Program Multiple Data (SPMD) program for a Distributed Memory Multicomputer (DMMC), we need to find data that can be processed in parallel and to distribute that data among processors so that interprocessor communication remains reasonably small. Loop restructuring is needed to find parallelism in imperative programs, and array alignment is one effective step toward reducing the interprocessor communication caused by array references. Automatic conversion of imperative programs using these two restructuring steps has been implemented in the Tiny loop restructuring tool. The restructuring strategy is derived by translating the approach the compiler uses for the functional language Crystal to the imperative language Tiny. Although an imperative language admits more varied loop structures than a functional language, which makes selecting the optimal one harder, we can obtain a loop structure comparable to Crystal's. We can also find array alignment preference relations (temporal + spatial) in a Tiny source program, and we add a new construct, the align statement, to Tiny to express these preferences. In this thesis, we discuss the program restructuring strategies used for Tiny and compare them with Crystal.
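As a rough illustration of why alignment matters, the C sketch below counts off-processor references for a block-distributed array under two alignments; the distribution, sizes, and reference pattern are assumptions for the example, not Tiny's actual mechanism:

```c
#include <stdio.h>

#define N   16          /* array elements (assumed) */
#define P    4          /* processors (assumed)     */
#define BLK (N / P)     /* block size               */

/* owner of logical index i under a block distribution, shifted by an
   alignment offset: which processor stores that element */
static int owner(int i, int offset) { return (i + offset) / BLK; }

int main(void) {
    int msgs_aligned = 0, msgs_misaligned = 0;
    /* reference pattern: A[i] = f(B[i]) for all i */
    for (int i = 0; i < N; i++) {
        if (owner(i, 0) != owner(i, 0)) msgs_aligned++;    /* A, B aligned     */
        if (owner(i, 0) != owner(i, 1)) msgs_misaligned++; /* B shifted by one */
    }
    printf("off-processor references: aligned=%d, misaligned=%d\n",
           msgs_aligned, msgs_misaligned);
    return 0;
}
```

Aligning A and B identically makes every A[i] = f(B[i]) reference local (zero messages here), while shifting B's alignment by one element forces a message at each block boundary.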
106. Program allocation for hypercube based dataflow systems / Freytag, Vincent R. 18 March 1993
The dataflow model of computation differs from the traditional control-flow model in that it does not use a program counter to sequence the instructions in a program. Instead, the execution of an instruction is based solely on the availability of its operands: an instruction executes in a dataflow computer as soon as all of its operands are available. This asynchronous nature of the dataflow model allows the exploitation of the fine-grain parallelism inherent in programs.

Although the dataflow model exploits parallelism, optimally allocating a program to processors is an NP-complete problem. One of the major issues facing designers of dataflow multiprocessors is therefore the proper allocation of programs to processors. The allocation problem lies in maximizing parallelism while minimizing interprocessor communication costs. This research culminates in a proposed method, the Balanced Layered Allocation Scheme, which uses heuristic rules to strike a balance between computation time and communication costs in dataflow multiprocessors. Specifically, the proposed scheme uses Critical Path and Longest Directed Path heuristics when allocating instructions to processors. Simulation studies indicate that the scheme is effective in reducing the overall execution time of a program by considering the effects of communication costs on computation times. / Graduation date: 1993
107. MPSoC simulation and implementation of KPN applications / Cheung, Chun Shing. January 2009
Thesis (Ph. D.)--University of California, Riverside, 2009. / Includes abstract. Title from first page of PDF file (viewed March 8, 2010). Available via ProQuest Digital Dissertations. Includes bibliographical references (p. 123-137). Also issued in print.
108. Fair and high performance shared memory resource management / Ebrahimi, Eiman. 31 January 2012
Chip multiprocessors (CMPs) commonly share a large portion of memory system resources among different cores. Since memory requests from different threads executing on different cores significantly interfere with one another in these shared resources, the design of the shared memory subsystem is crucial for achieving high performance and fairness.

Inter-thread memory system interference has different implications depending on the type of workload running on a CMP. In multi-programmed workloads, different applications can experience significantly different slowdowns. If left uncontrolled, large disparities in slowdowns result in low system performance and make the system software's priority-based thread scheduling policies ineffective. In a single multi-threaded application, memory system interference between threads of the same application can slow each thread down significantly. Most importantly, the critical path of execution can also be significantly slowed down, resulting in increased application execution time.

This dissertation proposes three mechanisms that address different shortcomings of current shared resource management techniques targeted at multi-programmed workloads, and one mechanism that speeds up a single multi-threaded application by managing main-memory-related interference between its threads.

With multi-programmed workloads, the key idea is that both demand- and prefetch-caused inter-application interference should be taken into account in shared resource management techniques across the entire shared memory system. Our evaluations demonstrate that doing so significantly improves both system performance and fairness compared to the state of the art. When executing a single multi-threaded application on a CMP, the key idea is to take into account the inter-dependence of threads in memory scheduling decisions. Our evaluation shows that doing so significantly reduces the execution time of the multi-threaded application compared to using state-of-the-art memory schedulers designed for multi-programmed workloads.

This dissertation concludes that the performance and fairness of CMPs can be significantly improved by better management of inter-thread interference in the shared memory resources, both for multi-programmed workloads and multi-threaded applications.
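As a sketch of the fairness angle, the C fragment below estimates each core's slowdown (execution time under sharing over an estimated alone time) and prioritizes the most slowed-down application; the numbers are invented and the policy is a simplified stand-in for the dissertation's mechanisms:

```c
#include <stdio.h>

#define CORES 4

typedef struct {
    double t_shared; /* cycles measured with interference                */
    double t_alone;  /* cycles estimated without interference (assumed)  */
} core_stats_t;

/* pick the application suffering the largest slowdown */
static int most_slowed_core(const core_stats_t s[CORES]) {
    int worst = 0;
    double worst_slowdown = 0.0;
    for (int c = 0; c < CORES; c++) {
        double slowdown = s[c].t_shared / s[c].t_alone;
        if (slowdown > worst_slowdown) { worst_slowdown = slowdown; worst = c; }
    }
    return worst;
}

int main(void) {
    /* invented sample: core 2 is being hurt the most by interference */
    core_stats_t s[CORES] = {
        {1200.0, 1000.0}, {1500.0, 1000.0}, {2600.0, 1000.0}, {1100.0, 1000.0},
    };
    int c = most_slowed_core(s);
    printf("prioritize memory requests from core %d (slowdown %.2f)\n",
           c, s[c].t_shared / s[c].t_alone);
    return 0;
}
```

Boosting the most slowed-down application's requests narrows the disparity in slowdowns, which is the fairness objective the abstract identifies.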
109. A technology-scalable composable architecture / Kim, Changkyu. 28 August 2008
Not available
110. Fault-tolerant real-time multiprocessor scheduling / Srinivasan, Anand. 26 August 2015
Graduate