391 |
IPPM: Interactive parallel program monitor. Brandis, Robert Craig. 08 1900 (has links) (PDF)
M.S. / Computer Science & Engineering / The tasks associated with designing and implementing parallel programs involve effectively partitioning the problem, defining an efficient control strategy, and mapping the design to a particular system. The task then becomes one of analyzing the program for correctness and stepwise refinement of its performance. New tools are needed to assist the programmer with these last two stages. Metrics and methods of instrumentation are needed to help with behavior analysis (debugging) and performance analysis. First, current tools and analysis methods are reviewed, and then a set of models is proposed for analyzing parallel programs. The design of IPPM, based on these models, is then presented. IPPM is an interactive parallel program monitor for the Intel iPSC. It gives a post-mortem view of an iPSC program based on a script of events collected during execution. A user can observe changes in program state and synchronization, select statistics, interactively filter events, and time critical sequences.
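The post-mortem, script-driven analysis the abstract describes can be sketched as replaying a recorded event trace. This is a hypothetical illustration, not IPPM's actual trace format: the record layout and event names are invented for the example.

```python
# Replay a script of (timestamp, node, event) records collected during
# execution, filter them, and time a critical sequence. The record format
# and event names are illustrative only.

trace = [
    (0.00, 0, "send"),
    (0.05, 1, "recv"),
    (0.05, 1, "compute_start"),
    (0.30, 1, "compute_end"),
    (0.31, 1, "send"),
    (0.36, 0, "recv"),
]

def filter_events(trace, node=None, kind=None):
    """Keep only events matching the given node and/or event kind."""
    return [(t, n, e) for (t, n, e) in trace
            if (node is None or n == node) and (kind is None or e == kind)]

def time_sequence(trace, start_kind, end_kind):
    """Elapsed time between the first start event and the first end event."""
    t0 = next(t for (t, _, e) in trace if e == start_kind)
    t1 = next(t for (t, _, e) in trace if e == end_kind)
    return t1 - t0

sends = filter_events(trace, kind="send")
compute_time = time_sequence(trace, "compute_start", "compute_end")
print(len(sends), round(compute_time, 2))  # 2 events, 0.25 time units
```

Interactive filtering and statistics selection, as in the abstract, would layer a user interface over exactly this kind of query.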
|
392 |
Multi-area power system state estimation utilizing boundary measurements and phasor measurement units (PMUs). Freeman, Matthew A. 30 October 2006 (has links)
The objective of this thesis is to demonstrate the validity of a multi-area state estimator and to investigate the advantages it provides over a serial state estimator, using the IEEE 118 Bus Test System as a sample system. These advantages take the form chiefly of increased accuracy and decreased processing time. First, the theory behind power system state estimation is explained for a simple serial estimator. Then the thesis shows how conventional measurements and newer, more accurate PMU measurements fit within the framework of weighted least squares estimation. Next, the multi-area state estimator is examined closely, and the additional measurements provided by PMUs are used to increase accuracy and computational efficiency. Finally, the multi-area state estimator is tested for accuracy, its ability to detect bad data, and computation time.
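The weighted least squares framework mentioned above can be sketched for a linear measurement model z = Hx + e. Real power system estimators iterate over a nonlinear measurement function h(x); the matrix and numbers here are purely illustrative, with the heavier weight standing in for a more accurate PMU measurement.

```python
# Minimal weighted least squares (WLS) state estimation sketch.
# Weights are inverse error variances, so the accurate "PMU" measurement
# (row 3) dominates the solution. Values are illustrative only.
import numpy as np

H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])          # measurement matrix: 3 measurements, 2 states
z = np.array([1.02, 0.98, 2.01])    # measurement vector
W = np.diag([100.0, 100.0, 10000.0])  # the third (PMU-like) row is trusted most

# WLS solution of the normal equations: (H^T W H) x = H^T W z
x = np.linalg.solve(H.T @ W @ H, H.T @ W @ z)
print(np.round(x, 3))
```

The estimate lands close to the highly weighted third measurement's constraint (x1 + x2 near 2.01), which is the mechanism by which PMU data improves accuracy in a WLS estimator.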
|
393 |
Improved design of three-degree of freedom hip exoskeleton based on biomimetic parallel structure. Pan, Min. 01 July 2011 (has links)
Exoskeletons, external skeletons worn by a user, are not a new research area. They are widely used to enhance the wearer's strength, endurance, and speed while walking. Most exoskeletons are designed for the whole body and are powered, owing to their applications and high performance needs.
This thesis introduces a novel design for a hip exoskeleton with a three-degree-of-freedom parallel robotic structure, which is quite different from existing exoskeletons. An exoskeleton unit for walking is typically designed as a serial mechanism for the entire leg or the entire body. This thesis presents a partial manipulator for the hip only, which offers advantages for marketing the product: light weight, ease of wearing, and low cost. Furthermore, most exoskeletons designed for the lower body are serial manipulators, which occupy a large workspace because of their own volume. The design introduced in this thesis is a parallel mechanism, which is more stable, stronger, and more accurate. These advantages benefit the wearers who choose this product.
This thesis focuses on analyzing the structure of this design and verifying that the structure is reasonable and reliable. A series of analyses supports this: the mobility analysis and inverse kinematic solution are derived, and the Jacobian matrix is obtained analytically. The performance of the CAD model is checked by finite element analysis in Ansys under applied forces and moments. The comparison of the test results clearly illustrates the stability and practicability of the design. At the end of this thesis, an optimization of the hip exoskeleton is provided, which yields a better structure for the design. / UOIT
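The abstract mentions deriving inverse kinematics and an analytic Jacobian. As a generic illustration only (a planar two-link serial arm, not the thesis's 3-DOF parallel hip mechanism), an analytic Jacobian can be checked against a finite-difference approximation:

```python
# Compare an analytic Jacobian with a central-difference numeric Jacobian
# for a planar 2R arm. Link lengths are illustrative values.
import math

L1, L2 = 0.30, 0.25  # link lengths in metres

def fk(q1, q2):
    """Forward kinematics: joint angles -> end-effector position (x, y)."""
    x = L1 * math.cos(q1) + L2 * math.cos(q1 + q2)
    y = L1 * math.sin(q1) + L2 * math.sin(q1 + q2)
    return x, y

def jacobian_analytic(q1, q2):
    """Partial derivatives of (x, y) with respect to (q1, q2)."""
    return [[-L1*math.sin(q1) - L2*math.sin(q1+q2), -L2*math.sin(q1+q2)],
            [ L1*math.cos(q1) + L2*math.cos(q1+q2),  L2*math.cos(q1+q2)]]

def jacobian_numeric(q1, q2, h=1e-6):
    """Central differences of the forward kinematics."""
    J = [[0.0, 0.0], [0.0, 0.0]]
    for j, (d1, d2) in enumerate([(h, 0.0), (0.0, h)]):
        xp, yp = fk(q1 + d1, q2 + d2)
        xm, ym = fk(q1 - d1, q2 - d2)
        J[0][j] = (xp - xm) / (2 * h)
        J[1][j] = (yp - ym) / (2 * h)
    return J

Ja = jacobian_analytic(0.4, 0.9)
Jn = jacobian_numeric(0.4, 0.9)
err = max(abs(Ja[i][j] - Jn[i][j]) for i in range(2) for j in range(2))
print(err < 1e-6)  # the two Jacobians agree to numerical precision
```

The same cross-check is a standard way to validate an analytically derived Jacobian before using it in workspace or singularity analysis.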
|
394 |
Scalable parallel architecture for biological neural simulation on hardware platforms. Pourhaj, Peyman. 04 October 2010
The difficulty and danger of experimenting on living systems, and the need to provide a testbed for theorists, make biologically detailed neural simulation an essential part of neurobiology. Owing to the complexity of neural systems and the dynamic properties of neurons, simulation of biologically realistic models is a very challenging area. Currently, all general-purpose simulators are software based, and the limits of available processing power leave a huge gap between the maximum practical simulation size and simulation of the human brain, the most complex neural system. This thesis aims to provide a hardware-friendly parallel architecture to accelerate the simulation process.
This thesis presents a scalable hierarchical architecture for accelerating simulations of
large-scale biological neural systems on field-programmable gate arrays (FPGAs). The
architecture provides a high degree of flexibility to optimize the parallelization ratio
based on available hardware resources and model specifications such as complexity of
dendritic trees. The whole design is based on three types of customized processors and a
switching module. An addressing scheme is developed that allows flexible integration of various combinations of processors. The proposed addressing scheme, design modularity, and localized data processing allow the whole system to extend over multiple FPGA platforms to simulate a very large biological neural system.
In this research the Hodgkin-Huxley model is adopted for cell excitability, and a passive compartmental approach is used to model dendritic trees of any level of complexity. The whole architecture is verified in MATLAB, and all processor modules and the switching unit are implemented in Verilog HDL and schematic capture. A prototype simulator is integrated and synthesized for a Xilinx V5-330t-1 as the target FPGA. While not dependent on particular IP (Intellectual Property) cores, the implementation is based on Xilinx IP cores, including IEEE-754 64-bit floating-point adder and multiplier cores. Synthesis results and performance analyses are provided.
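The Hodgkin-Huxley cell-excitability model the architecture implements in hardware can be stated as a small software reference. This is a single compartment with standard HH parameters and forward-Euler integration; the hardware pipeline, dendritic compartments, and floating-point cores themselves are not reproduced.

```python
# Single-compartment Hodgkin-Huxley neuron, forward-Euler integrated.
# Units: mV, ms, uA/cm^2, mS/cm^2, uF/cm^2. Standard HH parameter fits.
import math

def hh_simulate(i_inj=10.0, dt=0.01, t_end=50.0):
    c_m, g_na, g_k, g_l = 1.0, 120.0, 36.0, 0.3
    e_na, e_k, e_l = 50.0, -77.0, -54.4
    v, m, h, n = -65.0, 0.05, 0.6, 0.32   # approximate resting state
    vs = []
    for _ in range(int(t_end / dt)):
        # Gating-variable rate functions (1/ms), standard HH fits
        a_m = 0.1 * (v + 40.0) / (1.0 - math.exp(-(v + 40.0) / 10.0))
        b_m = 4.0 * math.exp(-(v + 65.0) / 18.0)
        a_h = 0.07 * math.exp(-(v + 65.0) / 20.0)
        b_h = 1.0 / (1.0 + math.exp(-(v + 35.0) / 10.0))
        a_n = 0.01 * (v + 55.0) / (1.0 - math.exp(-(v + 55.0) / 10.0))
        b_n = 0.125 * math.exp(-(v + 65.0) / 80.0)
        # Total ionic current: sodium, potassium, leak
        i_ion = (g_na * m**3 * h * (v - e_na)
                 + g_k * n**4 * (v - e_k)
                 + g_l * (v - e_l))
        # Forward-Euler updates of membrane voltage and gates
        v += dt * (i_inj - i_ion) / c_m
        m += dt * (a_m * (1.0 - m) - b_m * m)
        h += dt * (a_h * (1.0 - h) - b_h * h)
        n += dt * (a_n * (1.0 - n) - b_n * n)
        vs.append(v)
    return vs

vs = hh_simulate()
print(max(vs) > 0.0)  # the injected current elicits action potentials
```

Each state variable here maps naturally onto an arithmetic pipeline stage, which is why HH-type models parallelize well across customized processors.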
|
395 |
Parallel algorithms for real-time peptide-spectrum matching. Zhang, Jian. 16 December 2010
Tandem mass spectrometry is a powerful experimental tool used in molecular biology to determine the composition of protein mixtures, and it has become a standard technique for protein identification. Due to the rapid development of mass spectrometry technology, instruments can now produce a large number of mass spectra for peptide identification. The increasing data size demands efficient software tools to perform peptide identification.
In a tandem mass experiment, peptide ion selection algorithms generally select only the most abundant peptide ions for further fragmentation, so the low-abundance proteins in a sample rarely get identified. To address this problem, researchers developed the notion of a "dynamic exclusion list", which maintains a list of newly selected peptide ions and ensures these peptide ions do not get selected again for a certain time. In this way, other peptide ions get more opportunity to be selected and identified, allowing for identification of peptides of lower abundance.
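The dynamic exclusion list described above is a small, concrete data structure: a map from a selected ion to the time it was chosen, consulted before each new selection. A minimal sketch, with illustrative m/z values and window length:

```python
# Sketch of a dynamic exclusion list: a recently selected peptide ion is
# excluded from reselection until a fixed time window expires.

class DynamicExclusionList:
    def __init__(self, window=30.0):
        self.window = window      # exclusion time window, e.g. seconds
        self.entries = {}         # ion m/z -> time it was last selected

    def select(self, mz, now):
        """Return True (and start the ion's exclusion window) if the ion
        is not currently excluded; False if it is still excluded."""
        last = self.entries.get(mz)
        if last is not None and now - last < self.window:
            return False          # still inside the window: skip this ion
        self.entries[mz] = now
        return True

excl = DynamicExclusionList(window=30.0)
print(excl.select(523.8, now=0.0))    # True: first time seen, selected
print(excl.select(523.8, now=10.0))   # False: still inside the 30 s window
print(excl.select(523.8, now=45.0))   # True: window expired, selectable again
```

Feeding identification results back into this structure, as the next paragraph proposes, would amount to extending the exclusion window for ions that were already confidently identified.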
However, a better method is to also include the identification results in the "dynamic exclusion list" approach. In order to do this, a real-time peptide identification algorithm is required.
In this thesis, we introduce methods to improve the speed of peptide identification so that the "dynamic exclusion list" approach can use the peptide identification results without affecting the throughput of the instrument. Our work is based on RT-PSM, a real-time program for peptide-spectrum matching with statistical significance. We profile RT-PSM and find that the peptide-spectrum scoring module is the most time-consuming portion.
Guided by the profiling results, we parallelize the peptide-spectrum scoring algorithm, proposing two parallel algorithms using different technologies. First, we introduce parallel peptide-spectrum matching using SIMD instructions, implemented and tested on the Intel SSE architecture; the test results show an 18-fold speedup on the entire process. The second parallel algorithm is developed using NVIDIA CUDA technology. We describe two CUDA kernels based on different algorithms and compare their performance; the more efficient algorithm is integrated into RT-PSM. The time measurements show a 190-fold speedup on the scoring module and a 26-fold speedup on the entire process. Profiling the CUDA version again shows that the scoring module has been optimized to the point where it is no longer the most time-consuming module in the CUDA version of RT-PSM.
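The scoring kernel being parallelized can be illustrated, in deliberately simplified form, as a dot product between a binned experimental spectrum and many binned theoretical spectra. NumPy's array vectorization here stands in for the SIMD/CUDA data parallelism; RT-PSM's actual scoring function is more elaborate, and the data below is random filler.

```python
# Score every candidate peptide against one experimental spectrum at once:
# a single matrix-vector product replaces a Python loop over peptides.
import numpy as np

rng = np.random.default_rng(0)
n_bins, n_peptides = 2000, 500

spectrum = rng.random(n_bins)                    # binned experimental spectrum
theoretical = rng.random((n_peptides, n_bins))   # one row per candidate peptide

scores = theoretical @ spectrum                  # vectorized scoring of all candidates
best = int(np.argmax(scores))                    # best-matching candidate index
print(scores.shape)
```

The same structure (many independent candidate scores over shared input) is what makes the problem map cleanly onto SSE lanes or CUDA threads.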
In addition, we evaluate the feasibility of creating a metric index to reduce the number of candidate peptides. We describe evaluation methods, and show that general indexing methods are not likely feasible for RT-PSM.
|
396 |
Application of Parallel Imaging to Murine Magnetic Resonance Imaging. Chang, Chieh-Wei, 1980-. 14 March 2013 (has links)
The use of parallel imaging techniques for image acceleration is now common in clinical magnetic resonance imaging (MRI). There has been limited work, however, in translating the parallel imaging techniques to routine animal imaging. This dissertation describes foundational level work to enable parallel imaging of mice on a 4.7 Tesla/40 cm bore research scanner.
Reducing the size of the hardware setup associated with typical parallel imaging was an integral part of achieving the work, as animal scanners are typically small-bore systems. To that end, an array element design is described that inherently decouples from a homogeneous transmit field, potentially allowing for elimination of the typically necessary active detuning switches. The unbalanced feed of this "dual-plane pair" element also eliminates the need for baluns in this case. The use of the element design in a 10-channel adjustable array coil for mouse imaging is presented, styled as a human cardiac top-bottom half-rack design. The design and construction of the homogeneous transmit birdcage coil used is also described, one of the components necessary to eliminate the active detuning networks on the array elements. In addition, the design of a compact, modular multi-channel isolation preamplifier board is described, removing the preamplifiers from the elements and saving space in the bore. Several additions/improvements to existing laboratory infrastructure needed for parallel imaging of live mice are also described, including readying an animal preparation area and developing the ability to maintain isoflurane anesthesia delivery during scanning. In addition, the ability to trigger the MRI scanner to the ECG and respiratory signals from the mouse in order to achieve images free from physiological motion artifacts is described. The imaging results from the compact 10-channel mouse array coils are presented, and the challenges associated with the work are described, including difficulty achieving sample-loss dominance and signal-to-noise ratio (SNR) limitations. In conclusion, in vivo imaging of mice with cardiac and respiratory gating has been demonstrated. Compact array coils tailored for mice have been studied, and potential future work and design improvements for our lab in this area are discussed.
|
398 |
A frequency-translating hybrid architecture for wideband analog-to-digital converters. Jalali Mazlouman, Shahrzad. 05 1900 (has links)
Many emerging applications call for wideband analog-to-digital converters (ADCs), and some require medium-to-high resolution. Incorporating such ADCs allows as much of the signal processing as possible to shift to the digital domain, where more flexible and programmable circuits are available. However, realizing such ADCs with the existing single-stage architectures is very challenging, so parallel ADC architectures such as time-interleaved structures are used. Unfortunately, such architectures require high-speed, high-precision sample-and-hold (S/H) stages that are challenging to implement.
In this thesis, a parallel ADC architecture, namely, the frequency-translating hybrid ADC (FTH-ADC) is proposed to increase the conversion speed of the ADCs, which is also suitable for applications requiring medium-to-high resolution ADCs. This architecture addresses the sampling problem by sampling on narrowband baseband subchannels, i.e., sampling is accomplished after splitting the wideband input signals into narrower subbands and frequency-translating them into baseband where identical narrowband baseband S/Hs can be used. Therefore, lower-speed, lower-precision S/Hs are required and single-chip CMOS implementation of the entire ADC is possible.
A proof of concept board-level implementation of the FTH-ADC is used to analyze the effects of major analog non-idealities and errors. Error measurement and compensation methods are presented. Using four 8-bit, 100 MHz subband ADCs, four 25 MHz Butterworth filters, two 64-tap FIR reconstruction filters, and four 10-tap FIR compensation filters, a total system with an effective sample rate of 200 MHz is implemented with an effective number of bits of at least 7 bits over the entire 100 MHz input bandwidth.
In addition, one path of an 8-GHz, 4-bit, FTH-ADC system, including a highly-linear mixer and a 5th-order, 1 GHz, Butterworth Gm-C filter, is implemented in a 90 nm CMOS technology. Followed by a 4-bit, 4-GHz subband ADC, the blocks consume a total power of 52 mW from a 1.2 V supply, and occupy an area of 0.05 mm². The mixer-filter has a THD ≤ 5% (26 dB) over its full 1 GHz bandwidth and provides a signal with a voltage swing of 350 mVpp for the subsequent ADC stage.
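The frequency translation at the heart of the FTH-ADC can be shown with an ideal tone: mixing a subband with a local oscillator at the subband centre moves its content to baseband, where a slower S/H and ADC can digitize it. This sketch omits the filters, quantization, and reconstruction stages of the real architecture.

```python
# Mixing a tone at the subband centre f0 with a local oscillator at f0:
# cos(a)*cos(a) = 0.5 + 0.5*cos(2a), so half the energy lands at DC
# (baseband) and the rest at 2*f0, which the subband lowpass would remove.
import numpy as np

fs = 1000.0                       # sample rate used only to model the analog signal
t = np.arange(0, 1.0, 1.0 / fs)
f0 = 200.0                        # centre frequency of the subband of interest

x = np.cos(2 * np.pi * f0 * t)            # a tone sitting at the subband centre
mixed = x * np.cos(2 * np.pi * f0 * t)    # multiply by the local oscillator

baseband = np.mean(mixed)                 # DC term: the frequency-translated content
print(round(float(baseband), 3))          # -> 0.5
```

In the full system, one such mixer-filter path exists per subband, and the digital reconstruction filters recombine the subband outputs into the wideband result.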
|
399 |
A model of dynamic compilation for heterogeneous compute platforms. Kerr, Andrew. 10 December 2012 (links)
Trends in computer engineering place renewed emphasis on increasing parallelism and heterogeneity.
The rise of parallelism adds an additional dimension to the challenge of portability, as
different processors support different notions of parallelism, whether vector parallelism executing
in a few threads on multicore CPUs or large-scale thread hierarchies on GPUs. Thus, software
experiences obstacles to portability and efficient execution beyond differences in instruction sets;
rather, the underlying execution models of radically different architectures may not be compatible.
Dynamic compilation applied to data-parallel heterogeneous architectures presents an abstraction
layer decoupling program representations from optimized binaries, thus enabling portability without
encumbering performance. This dissertation proposes several techniques that extend dynamic
compilation to data-parallel execution models. These contributions include:
- characterization of data-parallel workloads
- machine-independent application metrics
- framework for performance modeling and prediction
- execution model translation for vector processors
- region-based compilation and scheduling
We evaluate these claims via the development of a novel dynamic compilation framework,
GPU Ocelot, with which we execute real-world GPU computing workloads efficiently on multicore CPUs, GPUs, and a functional simulator. We show data-parallel workloads exhibit performance scaling, take advantage
of vector instruction set extensions, and effectively exploit data locality via scheduling which
attempts to maximize control locality.
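The decoupling the dissertation attributes to dynamic compilation (one program representation, several execution targets) can be sketched as a kernel description dispatched to interchangeable backends. The names here are hypothetical illustration; GPU Ocelot's real program representation is NVIDIA's PTX, and its backends are actual code generators, not Python loops.

```python
# One device-independent kernel description, two interchangeable "backends".

def saxpy_ir(a, x, y):
    """'IR' for the kernel y[i] = a * x[i] + y[i], independent of any target."""
    return {"op": "saxpy", "a": a, "x": x, "y": y}

def run_serial(kernel):
    """Reference backend: a plain sequential loop."""
    a, x, y = kernel["a"], kernel["x"], kernel["y"]
    return [a * xi + yi for xi, yi in zip(x, y)]

def run_chunked(kernel, chunk=4):
    """'Parallel' backend: independent chunks, a stand-in for worker threads."""
    a, x, y = kernel["a"], kernel["x"], kernel["y"]
    out = []
    for start in range(0, len(x), chunk):
        out.extend(a * xi + yi for xi, yi in
                   zip(x[start:start + chunk], y[start:start + chunk]))
    return out

backends = {"serial": run_serial, "chunked": run_chunked}
kernel = saxpy_ir(2.0, [1.0, 2.0, 3.0], [10.0, 10.0, 10.0])
results = {name: run(kernel) for name, run in backends.items()}
print(results["serial"] == results["chunked"])  # same IR, same answer on both
```

The portability claim of the dissertation is exactly this property at scale: the program representation never changes, only the backend that lowers it to a target.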
|
400 |
Wildland Fire Prediction based on Statistical Analysis of Multiple Solutions. Bianchini, Germán. 21 July 2006 (links)
In many different scientific areas, the use of models to represent physical systems has become a common strategy. These models receive input parameters representing particular conditions and provide an output representing the evolution of the system. Usually, these models are integrated in simulation tools that can be executed on a computer. A particular case where models are very useful is the prediction of forest fire propagation. Forest fire is a very significant hazard that every year provokes huge losses from the environmental, economical, social, and human points of view. Dry and hot seasons in particular seriously increase the risk of forest fires in the Mediterranean area. Therefore, the use of models is very relevant to estimate fire risk and predict fire behavior. However, in many cases models present a series of limitations, usually due to the need for a large number of input parameters. In many cases such parameters present some uncertainty, because it is impossible to measure all of them in real time, and they must be estimated from indirect measurements. Moreover, in most cases these models cannot be solved analytically and must be solved by applying numerical methods that are only an approximation of reality (even before considering the further limitations introduced when those solutions are carried out on computers). Several methods based on data assimilation have been developed to optimize the input parameters.
In general, these methods operate over a large number of input parameters and, by means of some kind of optimization, focus on finding a unique parameter set that best describes the previous behavior, in the hope that the same set of values can be used to describe the immediate future. However, this kind of prediction is based on a single set of parameter values and, as noted above, for parameters with dynamic behavior the optimized values may not be adequate for the next step. The objective of this work is to propose an alternative method. Our method, called the Statistical System for Forest Fire Management, is based on statistical concepts. Its goal is to find a pattern of forest fire behavior, independent of the parameter values. In this method, each parameter is represented by a range of values with a particular cardinality. All possible scenarios arising from all combinations of input parameter values are generated, and the propagation for each scenario is evaluated. All results are statistically aggregated to determine the burning probability of each area, and this aggregation is used to predict the burned area in the next step. To validate our method, we use a set of real prescribed burns. Furthermore, we compare our method against two other methods. One of them, GLUE (Generalized Likelihood Uncertainty Estimation), an adaptation of a hydrological method, was implemented by us for this work. The other (the Evolutionary method) is a genetic algorithm previously developed and implemented by our research team. The proposed system requires a large number of simulations, which is why we decided to use a parallel scheme to implement them. This way of working differs from the traditional scheme of theory and experiment, which is the common form of science and engineering.
The scientific computing approach is in continuous expansion, mainly through the analysis of mathematical models implemented on computers. Scientists and engineers develop computer programs that model the systems under study. This methodology is creating a new, fast-growing branch of science based on computational methods, called Computational Science.
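The statistical scheme described above (enumerate every combination of the input-parameter values, simulate each scenario, and aggregate the burned maps into a per-cell burn probability) can be sketched with a toy one-dimensional spread model. The spread rule and parameter ranges are illustrative inventions, not the thesis's fire simulator.

```python
# Enumerate all parameter combinations, simulate each scenario, and
# aggregate the results into a per-cell probability of burning.
from itertools import product

def simulate_burn(wind, moisture, n_cells=10):
    """Toy model: fire starting at cell 0 reaches further with more wind
    and less fuel moisture."""
    reach = int(n_cells * wind * (1.0 - moisture))
    return [1 if i <= reach else 0 for i in range(n_cells)]

wind_values = [0.2, 0.5, 0.8]        # a range plus cardinality per parameter
moisture_values = [0.1, 0.4, 0.7]

scenarios = list(product(wind_values, moisture_values))
burn_counts = [0] * 10
for wind, moisture in scenarios:
    for i, burned in enumerate(simulate_burn(wind, moisture)):
        burn_counts[i] += burned

# Statistical aggregation: fraction of scenarios in which each cell burned.
burn_prob = [c / len(scenarios) for c in burn_counts]
print(burn_prob[0], burn_prob[-1])  # the ignition cell always burns; far cells rarely do
```

Because every scenario is independent, the simulations parallelize trivially, which is the motivation for the parallel scheme mentioned in the abstract.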
|