91 |
Routing Statistics for Unqueued Banyan Networks / Knight, Thomas F., Jr., Sobalvarro, Patrick G. 01 September 1990
Banyan networks comprise a large class of networks that have been used for interconnection in large-scale multiprocessors and telephone switching systems. Regular variants of Banyan networks, such as delta and butterfly networks, have been used in multiprocessors such as the IBM RP3 and the BBN Butterfly. Analysis of the performance of Banyan networks has typically focused on these regular variants. We present a methodology for performance analysis of unbuffered Banyan multistage interconnection networks. The methodology has two novel features: it allows analysis of networks where some inputs are more likely to be active than others, and allows analysis of Banyan networks of arbitrary topology.
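As a point of reference, the classical stage-by-stage estimate for an unbuffered network can be sketched as below. The sketch assumes that the inputs of a switch are statistically independent, that each message picks its output uniformly at random, and that a blocked message is simply dropped. The function names are illustrative. This is not the authors' methodology, which additionally covers arbitrary topologies.

```python
def switch_output_load(input_loads, fanout):
    """Probability that one given output of an unbuffered crossbar carries a message.

    input_loads[i] is the probability that input i holds a message (assumed
    independent of the other inputs); each message chooses one of `fanout`
    outputs uniformly at random.
    """
    idle = 1.0
    for p in input_loads:
        # Input i leaves this output idle unless it is active AND targets it.
        idle *= 1.0 - p / fanout
    return 1.0 - idle


def uniform_network_load(p_in, k, stages):
    """Per-link load after `stages` stages of k-by-k switches, equal input loads."""
    p = p_in
    for _ in range(stages):
        p = switch_output_load([p] * k, k)
    return p


# Unequal input activity at a single 2x2 switch: one busy input, one idle input.
skewed = switch_output_load([1.0, 0.0], 2)
# Fully loaded network of 2x2 switches, three stages deep (8 inputs).
uniform = uniform_network_load(1.0, 2, 3)
```

Passing a list of unequal probabilities to `switch_output_load` is the simplest way to see how non-uniform input activity changes the load on a link.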
|
92 |
Rendering Animations with Distributed Applets over the Internet / McMahon, Adam 01 January 2009
High-quality 3D rendering requires massive computing resources. In order to render animations within a reasonable amount of time, the rendering process is often distributed among a cluster of computers, typically called a rendering farm. However, most individuals and small studios do not have the resources to purchase or lease a rendering farm. In the late 1990s, Java technology brought a hope that distributed applets could be utilized as an alternative to traditional network rendering models. Yet, this hope was never realized, nor was it fully implemented. Taking into account new developments in web application technology and the Sunflow renderer, this thesis will reexamine the possibility of distributed rendering applets. This thesis will suggest that distributed Java applets can effectively render projects across a collection of heterogeneous and geographically dispersed computers over the Internet. Moreover, this paper will present a prototype web application, called RenderWeb, that uses distributed applets to quickly render projects created in popular animation programs, such as Blender, 3D Studio MAX, and Softimage.
|
93 |
Adaptation of a large-scale computational chemistry program to the iPSC concurrent computer / Larrabee, Alan Roger 08 1900
M.S. / Computer Science & Engineering / A study was made of some of the characteristics, capabilities, and limitations of the iPSC concurrent computer manufactured by the Intel Corporation. Initial experiments with test programs measured the large amount of time required to send and receive messages between nodes and between the cube manager and the nodes. Programs adapted to run concurrently will have the greatest speedup over the same program executed serially if the computational time is large relative to the time spent passing messages. A large-scale computational chemistry program (named ECEPP83) that calculates the global minimum energy of peptide structures (a peptide is a small protein) was ported and adapted to execute on the iPSC computer. The data entry and checking portion of the original code was ported to the 286/310 Intel computer that serves as a manager of the 32 to 128 CPUs (nodes) of the iPSC. The data for each structure is sent by the manager to a separate node, which reports its results back to the host or system manager and is then assigned another structure. This adaptation is able to concurrently minimize the energy for 32 chemical structures a maximum of approximately 17 times faster than the same data can be processed serially on a VAX 11-780 computer. A user manual was written to assist the user in assembling the input data file.
|
94 |
RF Pulse Design for Parallel Excitation in Magnetic Resonance Imaging / Liu, Yinan 2012 May 1900
Parallel excitation is an emerging technique to improve or accelerate multi-dimensional spatially selective excitations in magnetic resonance imaging (MRI) using multi-channel transmit arrays. The technique has potential in many applications, such as accelerating imaging speed, mitigating field inhomogeneity in high-field MRI, and alleviating the susceptibility artifact in functional MRI (fMRI). In these applications, keeping radiofrequency (RF) power deposition (quantified by Specific Absorption Rate, or SAR) under safe limits is a critical issue, particularly in high-field MRI. This dissertation will start with a review of multidimensional spatially selective excitation in MRI and current parallel excitation techniques. Then it will present two new RF pulse design methods to achieve reduced local/global SAR for parallel excitation while preserving the time duration and excitation pattern quality. Simulations incorporating human-model based tissue density and dielectric properties were performed. Results show that the proposed methods can achieve significant SAR reductions without lengthening the pulse duration at high fields.
|
95 |
A parallel adaptive method for pseudo-arclength continuation / Dubitski, Alexander 01 August 2011
We parallelize the pseudo-arclength continuation method for solving nonlinear systems
of equations. Pseudo-arclength continuation is a predictor-corrector method where the
correction step consists of solving a linear system of algebraic equations. Our algorithm
parallelizes adaptive step-length selection and inexact prediction. Prior attempts to parallelize
pseudo-arclength continuation are typically based on parallelization of the linear
solver which leads to completely solver-dependent software. In contrast, our method is
completely independent of the internal solver and therefore applicable to a large domain
of problems. Our software is easy to use and does not require the user to have extensive
prior experience with High Performance Computing; all the user needs to provide is the
implementation of the corrector step. When corrector steps are costly or continuation
curves are complicated, we observe up to sixty percent speedup with moderate numbers
of processors. We present results for a synthetic problem and a problem in turbulence. / UOIT
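The predictor-corrector structure described above can be illustrated with a minimal serial sketch. It is not the parallel algorithm of the thesis: the test problem (the unit circle, which has folds at lam = +1 and lam = -1), the fixed step length, and all function names are assumptions chosen for illustration, and the 2-by-2 linear system of the corrector is solved directly with Cramer's rule.

```python
import math


# Hypothetical test problem: F(x, lam) = x^2 + lam^2 - 1 (the unit circle).
def residual(x, lam):
    return x * x + lam * lam - 1.0


def jacobian(x, lam):
    """Partial derivatives (dF/dx, dF/dlam) of the residual."""
    return 2.0 * x, 2.0 * lam


def unit_tangent(x, lam, previous):
    """Unit vector in the null space of the Jacobian, oriented like `previous`."""
    fx, fl = jacobian(x, lam)
    tx, tl = -fl, fx
    norm = math.hypot(tx, tl)
    tx, tl = tx / norm, tl / norm
    if tx * previous[0] + tl * previous[1] < 0.0:
        tx, tl = -tx, -tl
    return tx, tl


def continuation_step(x0, lam0, tangent, ds, tol=1e-12, max_iter=20):
    """Tangent predictor followed by a Newton corrector on the augmented system."""
    tx, tl = tangent
    x, lam = x0 + ds * tx, lam0 + ds * tl  # predictor
    for _ in range(max_iter):
        g1 = residual(x, lam)
        g2 = tx * (x - x0) + tl * (lam - lam0) - ds  # arclength constraint
        if abs(g1) < tol and abs(g2) < tol:
            return x, lam
        fx, fl = jacobian(x, lam)
        det = fx * tl - fl * tx
        # Newton update: solve [[fx, fl], [tx, tl]] * delta = -(g1, g2).
        x += (-g1 * tl + fl * g2) / det
        lam += (-fx * g2 + g1 * tx) / det
    raise RuntimeError("corrector did not converge")


def trace_curve(x, lam, ds, steps, direction=(0.0, 1.0)):
    """Follow the solution curve for a fixed number of steps of length ds."""
    points = [(x, lam)]
    tangent = direction
    for _ in range(steps):
        tangent = unit_tangent(x, lam, tangent)
        x, lam = continuation_step(x, lam, tangent, ds)
        points.append((x, lam))
    return points


# Start at (x, lam) = (1, 0) and move toward increasing lam, through the fold.
path = trace_curve(1.0, 0.0, ds=0.1, steps=20)
```

Because the arclength constraint replaces lam as the continuation parameter, the augmented system stays nonsingular at the fold, where stepping in lam directly would fail.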
|
96 |
Performance analysis of the Parallel Community Atmosphere Model (CAM) application / Shawky Sharkawi, Sameh Sherif 02 June 2009
Efficient execution of parallel applications requires insight into how the parallel
system features impact the performance of the application. Significant experimental
analysis and the development of performance models enhance the understanding of such
an impact. Deep understanding of an application’s major kernels and their design leads to
a better understanding of the application’s performance, and hence, leads to development
of better performance models. The Community Atmosphere Model (CAM) is the latest in
a series of global atmospheric models developed at the National Center for Atmospheric
Research (NCAR) as a community tool for NCAR and the university research community.
This work focuses on analyzing CAM and understanding the impact of different
architectures on this application. The analysis of CAM uses kernel coupling, which
quantifies the interaction between adjacent kernels and chains of kernels in an application.
All experiments are conducted on four parallel platforms: NERSC (National Energy
Research Scientific Computing Center) Seaborg, SDSC (San Diego Supercomputer
Center) DataStar P655, DataStar P690 and PSC (Pittsburgh Supercomputing Center)
Lemieux. Experimental results indicate that kernel coupling gives insight into many of
the application’s characteristics. One important characteristic of CAM is that its
performance depends heavily on the parallel platform’s memory hierarchy; different
cache sizes and cache policies had a major effect on CAM’s performance.
Also, coupling values showed that although CAM’s kernels share many data structures,
most of the coupling values are still destructive (i.e., the kernels interfere with each other
and adversely affect performance). The kernel coupling results help developers identify
the memory-usage bottlenecks in CAM. The results obtained from processor
partitioning help CAM users choose the right platform on which to run
CAM.
|
97 |
Research and Development for DSP-based ON-Line Uninterruptible Power Supply with Parallel Operation / Tseng, Kuo-Tung 12 July 2002
This thesis implements two DSP-based on-line UPS units that use voltage and current control to operate in parallel. Each inverter in the parallel system uses the same control method. With an inner current loop under P control and an outer voltage loop under PI control, the system reduces the zero-crossover distortion caused by SPWM and the influence of load variation. Given identical system parameters, the two units are synchronized in phase and frequency by a digital PLL circuit, which eliminates the circulating current.
|
98 |
Array combination for parallel imaging in Magnetic Resonance Imaging / Spence, Dan Kenrick 17 September 2007
In Magnetic Resonance Imaging, the time required to generate an image is
proportional to the number of steps used to encode the spatial information. In rapid
imaging, an array of coil elements and receivers is used to reduce the number of
encoding steps required to generate an image. This is done using knowledge of the
spatial sensitivity of the array and receiver channels. Recently, these arrays have begun
to include a large number of coil elements. Ideally, each coil element would have its
own receiver channel to acquire the image data. In practice, this is not always possible
due to economic or other constraints. In this dissertation, methods are explored to
combine a large array to a limited number of receivers so as to optimize the performance
for parallel imaging; this dissertation focuses on SENSE in particular. Simple
combinations that represent larger coils that might be constructed are discussed. More
complex solutions form current sheets. One solution uses Roemer's method to optimize
image SNR at a set of points. In this dissertation, Roemer's solution is generalized to
give the weighting coefficients that optimize SNR over regions. Also, solutions fitted to
ideal profiles that minimize noise amplification are shown. These fitted profiles can
allow the SENSE algorithm to function at optimal reduction factors. Finally, the
construction of the combiner in hardware is described.
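For context, the point-wise SNR-optimal combination usually attributed to Roemer can be written as follows. The symbols are assumptions introduced here rather than the dissertation's notation: $\mathbf{b}$ is the vector of complex coil sensitivities at the point of interest, $\Psi$ is the noise covariance matrix of the channels, and $\mathbf{w}$ holds the combination weights.

```latex
\mathrm{SNR}(\mathbf{w}) \propto
  \frac{\lvert \mathbf{w}^{H}\mathbf{b} \rvert}{\sqrt{\mathbf{w}^{H}\Psi\,\mathbf{w}}},
\qquad
\mathbf{w}_{\mathrm{opt}} \propto \Psi^{-1}\mathbf{b},
\qquad
\mathrm{SNR}_{\max} \propto \sqrt{\mathbf{b}^{H}\Psi^{-1}\mathbf{b}}
```

Substituting $\mathbf{w}_{\mathrm{opt}}$ into the first expression gives the third, since $\Psi$ is Hermitian. Optimizing over a region instead of a single point, as the abstract describes, requires replacing the single vector $\mathbf{b}$ with sensitivities sampled across that region.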
|
99 |
Non-blocking array-based algorithms for stacks and queues / Shafiei, Niloufar. January 2007
Thesis (M.Sc.)--York University, 2007. Graduate Programme in Computer Science and Engineering. / Typescript. Includes bibliographical references (leaves 170-173). Also available on the Internet. MODE OF ACCESS via web browser by entering the following URL: http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&res_dat=xri:pqdiss&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&rft_dat=xri:pqdiss:MR38826
|
100 |
Efficient and portable parallel algorithms for Cholesky decomposition / Chu, Pei Yue Liu. January 2003
Thesis (Ph. D.)--Lehigh University, 2003. / Includes vita. Includes bibliographical references (leaves 90-103).
|