61. Expressing mobility in process algebras: first-order and higher-order paradigms. Sangiorgi, Davide, January 1993.
We study mobile systems, i.e. systems with a dynamically changing communication topology, from a process-algebra point of view. Mobility can be introduced into process algebras by allowing names or terms to be transmitted; we distinguish these two approaches as first-order and higher-order. The major target of the thesis is the comparison between them. The prototypical calculus in the first-order paradigm is the π-calculus. By generalising its sort discipline we derive an ω-order extension called the Higher-Order π-calculus (HOπ). We show that such an extension does not add expressiveness to the π-calculus: higher-order processes can be faithfully compiled down to first-order ones, respecting the behavioural equivalence we adopted in the calculi. Such an equivalence is based on the notion of bisimulation, a fundamental concept of process algebras. Unfortunately, the standard definition of bisimulation is unsatisfactory in a higher-order calculus because it is over-discriminating. To overcome the problem, we propose barbed bisimulation. Its advantage is that it can be defined uniformly in different calculi, because it only requires that the calculus possess an interaction or reduction relation. As a test for barbed bisimulation, we show that in CCS and the π-calculus it allows us to recover the familiar bisimulation-based equivalences. We also give simpler characterisations of the equivalences utilised in HOπ. For this we exploit a special kind of agents called triggers, with which it is possible to reason fairly efficiently in a higher-order calculus notwithstanding the complexity of its transitions. Finally, we use the compilation from HOπ to π-calculus to investigate Milner's
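To make the first-order notion concrete, here is a minimal Python sketch of checking strong bisimilarity on a finite labelled transition system by naive partition refinement; the states and transitions are invented for illustration, and barbed bisimulation as developed in the thesis is a different, reduction-based notion not captured here.

```python
def bisimulation_classes(states, trans):
    """trans maps (state, label) -> set of successor states.
    Refines a partition of `states` until related states can match
    each other's labelled transitions (coarsest strong bisimulation)."""
    labels = {lab for (_, lab) in trans}
    blocks = [set(states)]          # start with one block, split until stable
    changed = True
    while changed:
        changed = False
        def signature(s):
            # For each label, the set of current blocks reachable via that label.
            return tuple(
                frozenset(i for i, b in enumerate(blocks)
                          if trans.get((s, lab), set()) & b)
                for lab in sorted(labels)
            )
        new_blocks = []
        for b in blocks:
            groups = {}
            for s in b:
                groups.setdefault(signature(s), set()).add(s)
            new_blocks.extend(groups.values())
        if len(new_blocks) != len(blocks):
            changed = True
        blocks = new_blocks
    return blocks

def bisimilar(s, t, states, trans):
    return any(s in b and t in b for b in bisimulation_classes(states, trans))

# Two processes that each perform `a` once and stop are bisimilar.
states = {"p", "q", "stop"}
trans = {("p", "a"): {"stop"}, ("q", "a"): {"stop"}}
```

The refinement loop terminates because blocks can only split, never merge, so the partition stabilises after at most as many rounds as there are states.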
62. Real-time sound synthesis on a multi-processor platform. Itagaki, Takebumi, January 1998.
Real-time sound synthesis means that the calculation and output of each sound sample for a channel of audio information must be completed within a sample period. At a broadcasting-standard sampling rate of 32,000 Hz, the maximum period available is 31.25 μs. Such requirements demand a large amount of data-processing power. An effective solution to this problem is a multi-processor platform: a parallel and distributed processing system. The suitability of the MIDI (Musical Instrument Digital Interface) standard, published in 1983, as a controller for real-time applications is examined. Many musicians have expressed doubts about the decade-old standard's suitability for real-time performance. These doubts have been investigated by measuring timing in various musical gestures, and by comparing the measurements with the subjective characteristics of human perception. An implementation, and its optimisation, of real-time additive synthesis programs on a multi-transputer network is described. A prototype 81-polyphonic-note organ configuration was implemented. By devising and deploying monitoring processes, the network's performance was measured and enhanced, leading to a more efficient usage: the 88-note configuration. Since 88 simultaneous notes are rarely necessary in most performances, a scheduling program for dynamic note allocation was then introduced to achieve further efficiency gains. Considering calculation redundancies still further, a multi-sampling-rate approach was applied as a further step towards optimal performance. The theories underlying sound granulation, as a means of constructing complex sounds from grains, and the real-time implementation of this technique are outlined. The idea of sound granulation is closely related to the quantum-wave theory of "acoustic quanta". Despite the conceptual simplicity, the signal-processing requirements set tough demands, providing a challenge for this audio synthesis engine.
Three issues arising from the results of the implementations above are discussed: the efficiency of the applications implemented, provisions for new processors, and an optimal network architecture for sound synthesis.
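The arithmetic of the per-sample budget, and the additive model itself, can be sketched briefly; the partial amplitudes and frequencies below are invented for illustration and bear no relation to the thesis's transputer implementation.

```python
import math

SAMPLE_RATE = 32_000                  # broadcasting-standard rate from the abstract
SAMPLE_PERIOD = 1.0 / SAMPLE_RATE     # 31.25 microseconds per sample

def additive_sample(t, partials):
    """One output sample as a sum of sinusoidal partials.
    `partials` is a list of (amplitude, frequency_hz) pairs; in real-time
    synthesis this whole sum must be computed within SAMPLE_PERIOD."""
    return sum(a * math.sin(2 * math.pi * f * t) for a, f in partials)

# A toy "note": fundamental plus two harmonics (illustrative values).
note = [(1.0, 440.0), (0.5, 880.0), (0.25, 1320.0)]
frame = [additive_sample(n * SAMPLE_PERIOD, note) for n in range(64)]
```

Each extra partial adds one sine evaluation per sample, which is why polyphonic additive synthesis scales so quickly past what a single processor can deliver in 31.25 μs.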
63. A survey of data flow machine architectures. Mead, David Anthony, January 2010.
Typescript (photocopy). / Digitized by Kansas Correctional Industries
64. Exploiting parallelism in centralized reduced-ported register files. Sirsi, Sandeep, January 2006.
Thesis (M.S.)--State University of New York at Binghamton, Electrical and Computer Engineering Dept., 2006. / Includes bibliographical references.
65. Architectural support for multithreading on a 4-way multiprocessor. Kim, Gwang-Myung, 10 December 1999.
Microprocessors will have more than a billion logic transistors on a single chip in the near future. Several alternatives have been suggested for obtaining the highest performance from billion-transistor chips. To achieve the highest performance possible, an on-chip multiprocessor is one promising alternative to the current superscalar microprocessor. It can execute multiple threads effectively on multiple processors in parallel if the application program is parallelized properly. This increases processor utilization and provides tolerance of the latency caused by data dependencies and cache misses.
The Electronics and Telecommunications Research Institute (ETRI) in South Korea developed an on-chip multiprocessor simulator for RAPTOR, "RapSim", which contains four SPARC microprocessor cores. To support this 4-way multiprocessor simulator, a Multithreaded Mini Operating System (MMOS) was developed by the OSU MMOS group. RapSim runs multiple threads on multiple processor cores concurrently. POSIX threads were used to build a Symmetric Multiprocessor (SMP)-safe Pthreads package, called MMOS. Benchmarks must be properly parallelized by the programmer to run multiple threads across the multiple processors simultaneously. Performance simulation results show that RAPTOR can exploit thread-level parallelism effectively and offers a promising architecture for future on-chip multiprocessor designs. / Graduation date: 2000
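RapSim and MMOS are not publicly available, so the following is only a generic sketch, in Python rather than on SPARC cores, of the kind of programmer-parallelized benchmark the abstract describes: the work is split into chunks and each chunk is handled by its own thread.

```python
import threading

def parallel_sum(data, n_threads=4):
    """Split `data` into n_threads chunks and sum them concurrently,
    one worker thread per (simulated) processor core."""
    chunk = (len(data) + n_threads - 1) // n_threads
    partial = [0] * n_threads                 # one slot per worker, no sharing

    def worker(i):
        partial[i] = sum(data[i * chunk:(i + 1) * chunk])

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                              # barrier before combining results
    return sum(partial)
```

Giving each worker a private result slot and combining only after the join is the usual way to keep such a decomposition free of data races.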
66. Interface design and system impact analysis of a message-handling processor for fine-grain multithreading. Metz, David, 28 April 1995.
There appears to be broad agreement that high-performance computers of the future will be Massively Parallel Architectures (MPAs), in which all processors are interconnected by a high-speed network. One of the major problems with MPAs is the latency observed for remote operations. One technique to hide this latency is multithreading: whenever an instruction accesses a remote location, the processor switches to the next available thread waiting for execution. A number of architectures have been proposed to implement multithreading. One such architecture is the Threaded Abstract Machine (TAM), which supports fine-grain multithreading through an appropriate compilation strategy rather than through elaborate hardware. Experiments on TAM have already shown that fine-grain multithreading on conventional architectures can achieve reasonable performance.
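The switch-on-remote-access idea can be sketched with coroutines; the scheduler below is a toy invented for illustration, not TAM's compilation strategy. Each "thread" yields when it issues a remote load, and the scheduler resumes another thread instead of stalling on the network round trip.

```python
from collections import deque

def thread(name, results):
    # Issue a remote load and switch out until the reply arrives.
    value = yield ("load", "x")
    results.append((name, value))

def run(threads, remote_memory):
    ready = deque()
    for t in threads:
        request = next(t)                  # run until the first remote access
        ready.append((t, request))
    while ready:
        t, (op, addr) = ready.popleft()
        try:
            t.send(remote_memory[addr])    # deliver the reply and resume
        except StopIteration:
            pass                           # thread ran to completion
    # (A fuller scheduler would re-enqueue threads that issue further loads.)

results = []
memory = {"x": 42}
run([thread("t0", results), thread("t1", results)], memory)
```

While one thread waits for its reply, the scheduler is free to run another, which is exactly the latency hiding the paragraph describes.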
However, a significant deficiency of the conventional design in the context of fine-grain program execution is that message handling is viewed as an appendix rather than as an integral, essential part of the architecture. Considering that message handling in TAM can constitute as much as one fifth to one half of the total instructions executed, special effort must be made to support it in the underlying hardware.
This thesis presents the design modifications required to support message handling efficiently for fine-grain parallelism on stock processors. The idea of having a separate message-handling processor is proposed and extended to reduce the overhead due to messages. Detailed hardware is designed to establish the interface between the conventional processor and the message-handling processor, while the cycle cost required to guarantee atomicity between the two processors is minimized. The hardware modifications are kept to a minimum, however, so as not to disturb the original functionality of a conventional RISC processor. Finally, the effectiveness of the proposed architecture is analyzed in terms of its impact on the system. The distribution of the workload between the two processors is estimated to indicate the potential speed-up that can be achieved with a separate processor to handle messages. / Graduation date: 1995
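The "one fifth to one half" figure supports a quick bound on the attainable speed-up. Assuming (my assumption, not the thesis's analysis) that the offloaded message-handling instructions overlap perfectly with the main processor's remaining work:

```python
def ideal_speedup(message_fraction):
    """Upper bound on speed-up when a fraction of the instruction stream
    is offloaded to a second processor and overlaps perfectly: execution
    time drops to whichever processor now has the longer stream."""
    main = 1.0 - message_fraction
    return 1.0 / max(main, message_fraction)

# Message handling in TAM: between one fifth and one half of instructions.
low, high = ideal_speedup(1 / 5), ideal_speedup(1 / 2)
```

Under this idealisation the offload buys between 1.25x (one fifth) and 2x (one half); interface and atomicity costs, which the thesis quantifies, would eat into both bounds.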
67. Instruction history management for high-performance microprocessors. Bhargava, Ravindra Nath, January 2003.
Thesis (Ph. D.)--University of Texas at Austin, 2003. / Vita. Includes bibliographical references. Available also from UMI Company.
68. Solutions for some problems in star graphs. Au Yeung, Chun-kan (歐陽春根), January 2005.
Computer Science / Doctoral / Doctor of Philosophy
69. Atlas: a dynamically parallelizing chip-multiprocessor for gigascale integration. Codrescu, Lucian, 05 1900.
No description available.
70. Global optimisation of communication protocols for bulk synchronous parallel computation. Donaldson, Stephen Richard, January 1999.
In the Bulk Synchronous Parallel (BSP) model of parallel computation represented by BSPlib, the relaxed coupling of the global computation, communication and synchronisation, whilst providing a definite semantics, does not prescribe exactly when and where communication is to be carried out during the computation. It merely states that communication cannot happen before it is requested by the application, and that at certain points local computation cannot proceed unless updates have been applied from the other participating processors. The nature of the computation within this framework is open to exploitation by the implementation of the runtime system, and can be made to suit particular physical environments without requiring application program changes. This bulk and global view of parallel computation can be used to implement protocols that both maintain and take into account global state for optimising performance. Such global protocols can provide performance improvements which are not easily achieved with local, greedy strategies, and may in turn be locally sub-optimal. This global perspective and the exploitable nature of BSP computation are applied to congestion avoidance, transport-layer protocols suitable for BSP computation, global stable checkpointing, and work process placement and migration, to achieve a better overall performance.
An important consideration for the compositionality of parallel computer systems into larger systems is that, for the composite to exhibit good performance, the individual components must also do so. However, it is not obvious how the individual components contribute to the global performance. As already mentioned, non-locally-optimal strategies might lead to globally optimal performance; also of importance is that variance observed at the local level influences performance.
A number of decisions in the transport-protocol design and implementation were made so that the observed variance in the protocol's behaviour is minimised; the BSP model is used to demonstrate why this is required. The analysis also suggests a regression technique which can be applied to sampled global performance data.
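The standard BSP cost model makes this "bulk and global" view concrete: a superstep costs w + h·g + l, where w is the maximum local computation, h the maximum number of words any processor sends or receives, g the network's per-word cost, and l the barrier synchronisation cost. A short sketch with invented machine parameters (the specific numbers are mine, not from the thesis):

```python
def superstep_cost(local_work, words_sent, words_received, g, l):
    """Standard BSP cost of one superstep: the maximum local computation,
    plus g times the maximum h-relation, plus the barrier cost l."""
    w = max(local_work)
    h = max(max(s, r) for s, r in zip(words_sent, words_received))
    return w + h * g + l

# Four processors with uneven work and communication (illustrative values).
cost = superstep_cost(local_work=[100, 120, 90, 110],
                      words_sent=[10, 5, 8, 12],
                      words_received=[7, 12, 10, 6],
                      g=4, l=50)
```

Because the cost is driven by per-superstep maxima, a runtime with a global view can rebalance work or reschedule communication to shrink w and h, which is precisely the kind of globally informed optimisation the abstract argues local greedy strategies miss. The variance point follows the same logic: one slow processor sets the whole superstep's cost.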