171

Performance Projections of HPC Applications on Chip Multiprocessor (CMP) Based Systems

Shawky Sharkawi, Sameh Sh 2011 May 1900 (has links)
Performance projections of High Performance Computing (HPC) applications onto various hardware platforms are important for hardware vendors and HPC users. The projections aid hardware vendors in the design of future systems and help HPC users with system procurement and application refinements. In this dissertation, we present an efficient method to project the performance of HPC applications onto Chip Multiprocessor (CMP) based systems using widely available standard benchmark data. The main advantage of this method is the use of published data about the target machine; the target machine need not be available. With the current trend in HPC platforms shifting towards cluster systems with chip multiprocessors (CMPs), efficient and accurate performance projection becomes a challenging task. Typically, CMP-based systems are configured hierarchically, which significantly impacts the performance of HPC applications. The goal of this research is to develop an efficient method to project the performance of HPC applications onto systems that utilize CMPs. To provide for efficiency, our projection methodology is automated (projections are done using a tool) and fast (with small overhead). Our method, called the surrogate-based workload application projection method, utilizes surrogate benchmarks to project an HPC application's performance on target systems, where the computation component of an HPC application is projected separately from the communication component. Our methodology was validated on a variety of systems utilizing different processor and interconnect architectures with high accuracy and efficiency. The average projection error on three target systems was 11.22 percent with a standard deviation of 1.18 percent for twelve HPC workloads.
172

Independent Operation of Parallel Three-phase Converters for Motor Drive Applications

Fingas, William Daniel 18 January 2010 (has links)
A motor drive consisting of two parallel voltage-sourced converters was developed and implemented. A parallel converter arrangement allows the system to be constructed in a modular fashion to gain economies of scale and redundancy. The converters are connected to common ac- and dc-buses without isolation and are controlled without inter-converter communication or a master/slave arrangement. The system was simulated and the results validated against an experimental setup. Both steady-state and dynamic load sharing were achieved through the use of drooped PI speed regulators. PI controllers were used to regulate the quadrature currents provided by each converter. Circulating 0-sequence current was regulated using P controllers. A linearized state-space model of the system was developed and an eigenvalue analysis was performed, showing system stability. Speed steps in simulation and in the laboratory demonstrated good response. The loss of one converter’s gating was emulated. The system continued to operate, showing an advantage of system redundancy.
174

A Compiler and Symbolic Debugger for Occam

Chelliah, M 08 1900 (has links)
We have implemented Occam, a parallel programming language, on a uniprocessor machine (MC-68020 based HORIZON III running UNIX System V.2) with simulated concurrency. Occam is a descendant of CSP with a few convenient modifications, such as channels used for communication and procedures. Two additions to the original language, namely output guards and recursion, have been proposed. The front end of the compiler was developed using LEX and YACC. An innovative code-generator generator based on tree pattern matching has been used to generate the back end of the compiler, which generates efficient MC-68020 assembly code. A kernel for process administration is the runtime support provided. It has been developed entirely in C and made available as a library. This is linked with the assembly module to generate the executable version of the input Occam program. We have also interfaced our Occam compiler with the UNIX System V.2 source-level debugger 'Sdb' so as to provide debugging support for Occam programmers. Issues involved in parallel debugging have been investigated, and those demanding minimum effort have been incorporated in the Occam debugger by modifying the runtime support of the uniprocessor implementation. Modifications to the uniprocessor implementation so as to make it run on a shared memory multiprocessor machine (HCL MAGNUM-P with four MC-68030 processors) are also discussed. The support provided by MAGNUM-P at the architecture and operating system levels is explained in detail. Our Occam compiler for the multiprocessor generates code, but the generated code has not been tested since the machine is not yet ready.
175

Communication-efficient bulk synchronous parallel algorithms

Huang, Chun-Hsi. January 2001 (has links)
Thesis (Ph. D.)--State University of New York at Buffalo, 2001. / Includes bibliographical references (leaves 126-136). Also available in print.
176

Achieving robust performance in parallel programming languages /

Lewis, E Christopher, January 2001 (has links)
Thesis (Ph. D.)--University of Washington, 2001. / Vita. Includes bibliographical references (p. 104-113).
177

A descriptive performance model of small, low cost, diskless Beowulf clusters /

Nielson, Curtis R., January 2003 (has links) (PDF)
Thesis (M.S.)--Brigham Young University. School of Technology, 2003. / Includes bibliographical references (p. 93-96).
178

Parallelizing an interactive theorem prover : functional programming and proofs with ACL2

Rager, David Lawrence 15 February 2013 (has links)
Multi-core systems have become commonplace; however, theorem provers often do not take advantage of the additional computing resources in an interactive setting. This research explores automatically using these additional resources to lessen the delay between when users submit conjectures to the theorem prover and when they receive feedback from the prover that is useful in discovering how to successfully complete the proof of a particular theorem. This research contributes mechanisms that permit applicative programs to execute in parallel while simultaneously preparing these programs for verification by a semi-automatic reasoning system. It also contributes a parallel version of an automated theorem prover, with management of user interaction issues, such as output and how inherently single-threaded, user-level proof features can be configured for use with parallel computation. Finally, this dissertation investigates the types of proofs that are amenable to parallel execution. This investigation yields the result that almost all proof attempts that require a non-trivial amount of time can benefit from parallel execution. Proof attempts executed in parallel almost always provide the aforementioned feedback sooner than if they executed serially, and their execution time is often significantly reduced. / text
179

Using parallel computation to apply the singular value decomposition (SVD) in solving for large Earth gravity fields based on satellite data

Hinga, Mark Brandon 28 August 2008 (has links)
Not available / text
180

Productivity with performance: property/behavior-based automated composition of parallel programs from self-describing components / Property/behavior-based automated composition of parallel programs from self-describing components

Mahmood, Nasim, 1976- 28 August 2008 (has links)
Development of efficient and correct parallel programs is a complex task. These parallel codes have strong requirements for performance and correctness and must operate robustly and efficiently across a wide spectrum of application parameters and on a wide spectrum of execution environments. Scientific and engineering programs increasingly use adaptive algorithms whose behavior can change dramatically at runtime. Performance properties are often not known until programs are tested, and performance may degrade during execution. Many errors in parallel programs arise from incorrect programming of interactions and synchronizations. Testing has proven to be inadequate. Formal proofs of correctness are needed. This research is based on systematic application of software engineering methods to effective development of efficiently executing families of high performance parallel programs. We have developed a framework (P-COM²) for development of parallel program families which addresses many of the problems cited above. The conceptual innovations underlying P-COM² are a software architecture specification language based on self-describing components, a timing and sequencing algorithm which enables execution of programs with both concrete and abstract components, and a formal semantics for the architecture specification language. The description of each component incorporates compiler-useable specifications for the properties and behaviors of the components, the functionality a component implements, preconditions and postconditions on the inputs and outputs, and state-machine-based sequencing control for invocations of the component.
The P-COM² compiler and runtime system implement these concepts to enable: (a) evolutionary development where a program instance is evolved from a performance model to a complete application with performance known at each step of evolution, (b) automated composition of program instances targeting specific application instances and/or execution environments from self-describing components including generation of all parallel structuring, (c) runtime adaptation of programs on a component-by-component basis, (d) runtime validation of preconditions and postconditions and sequencing of interactions, and (e) formal proofs of correctness for interactions among components based on model checking of the interaction and synchronization properties of the program. The concepts and their integration are defined, the implementation is described, and the capabilities of the system are illustrated through several examples.
