91

Real-Time Prediction-Driven Dynamics Simulation to Mitigate Frame Time Variation

Buck, Mackinnon A 01 December 2021
Real-time physics engines have seen recent performance improvements through techniques like hardware acceleration and artificial intelligence. However, state-of-the-art physics simulation technology fails to account for the variation in simulation complexity over time. Sudden increases in contact frequency between simulated bodies can momentarily increase the processing time per frame. To solve this, we present a prediction-driven real-time dynamics method that uses a memory-efficient graph-based state buffer to minimize the cost of mispredictions. This buffer, which is generated by a separate thread running the physics pipeline, allows physics computation to temporarily run slower than real-time without affecting the frame rate of the host application. The main thread, whose role in dynamics computation is limited to querying the simulation state and regenerating mispredicted state, sees a significant reduction in time spent per frame on dynamics computation when our multi-threaded prediction pipeline is enabled. Thus, our technique enables interactive multimedia applications to increase the computational budget for graphics at no cost perceptible to the end user. Furthermore, our method guarantees determinism and low input latency, making it suitable for competitive games and other real-time interactive applications. We also provide a C++ API to integrate custom game logic with the prediction engine to further minimize the frequency of mispredictions.
92

Epithelium: The lightweight, customizable epithelial tissue simulator

Drag, Melvyn I. 29 May 2015
No description available.
93

A High Productivity Framework for Parallel Data Intensive Computing in MATLAB

Panuganti, Rajkiran 26 June 2009
No description available.
94

High-Performance Multi-Transport MPI Design for Ultra-Scale InfiniBand Clusters

Koop, Matthew J. 03 September 2009
No description available.
95

Optimization Frameworks for Discrete Composite Laminate Stacking Sequences

Adams, David Bruce 23 August 2005
Composite panel structure optimization is commonly decomposed into panel optimization subproblems, with specified local loads, resulting in manufacturing incompatibilities between adjacent panel designs. Using genetic algorithms to optimize local panel stacking sequences allows panel populations of stacking sequences to evolve in parallel and send migrants to adjacent panels, so as to blend the local panel designs globally. The blending process is accomplished using the edit distance between individuals of a population and the set of migrants from adjacent panels. The objective function evaluating the fitness of designs is modified according to the severity of mismatches detected between neighboring populations. This lays the groundwork for natural evolution to a blended global solution without leaving the paradigm of genetic algorithms. An additional method applied here for constructing globally blended panel designs uses a parallel decomposition antithetical to that of earlier work. Rather than performing concurrent panel genetic optimizations, a single genetic optimization is conducted for the entire structure with the parallelism solely within the fitness evaluations. A guide-based genetic algorithm approach is introduced to exclusively generate and evaluate valid globally blended designs, utilizing a simple master-slave parallel implementation, implicitly reducing the size of the problem design space and increasing the quality of discovered local optima. / Ph. D.
96

High Performance Computing Issues in Large-Scale Molecular Statics Simulations

Pulla, Gautam 02 June 1999
Successful application of parallel high performance computing to practical problems requires overcoming several challenges. These range from the need to make sequential and parallel improvements in programs to the implementation of software tools which create an environment that aids sharing of high performance hardware resources and limits losses caused by hardware and software failures. In this thesis we describe our approach to meeting these challenges in the context of a Molecular Statics code. We describe sequential and parallel optimizations made to the code and also a suite of tools constructed to facilitate the execution of the Molecular Statics program on a network of parallel machines with the aim of increasing resource sharing, fault tolerance and availability. / Master of Science
97

Towards Enhancing Performance, Programmability, and Portability in Heterogeneous Computing

Krommydas, Konstantinos 03 May 2017
The proliferation of a diverse set of heterogeneous computing platforms, in conjunction with the plethora of programming languages and optimization techniques on each language for each underlying architecture, hinders the widespread adoption of such platforms. This is especially true for novice programmers and the non-technical-savvy masses that are largely precluded from enjoying the advantages of high-performance computing. Moreover, different groups within the heterogeneous computing community (e.g., hardware architects, tool developers, and programmers) are presented with new challenges with respect to performance, programmability, and portability (or the three P's) of heterogeneous computing. In this work we discuss such challenges and identify benchmarking techniques based on computation and communication patterns as an appropriate means for the systematic evaluation of heterogeneous computing with respect to the three P's. Our proposed approach is based on OpenCL implementations of the Berkeley dwarfs. We use our benchmark suite (OpenDwarfs) in characterizing performance of state-of-the-art parallel architectures, and as the main component of a methodology (Telescoping Architectures) for identifying trends in future heterogeneous architectures. Furthermore, we employ OpenDwarfs in a multi-faceted study on the gaps between the three P's in the context of the modern heterogeneous computing landscape. Our case study spans a variety of compilers, languages, optimizations, and target architectures, including the CPU, GPU, MIC, and FPGA. Based on our insights, and extending aspects of prior research (e.g., in compilers, programming languages, and auto-tuning), we propose the introduction of grid-based data structures as the basis of programming frameworks and present a prototype unified framework (GLAF) that encompasses a novel visual programming environment with code generation, auto-parallelization, and auto-tuning capabilities.
Our results, which span scientific domains, indicate that our holistic approach constitutes a viable alternative towards enhancing the three P's and further democratizing heterogeneous, parallel computing for non-programming-savvy audiences, and especially domain scientists. / Ph. D.
98

Scalability Analysis of Synchronous Data-Parallel Artificial Neural Network (ANN) Learners

Sun, Chang 14 September 2018
Artificial Neural Networks (ANNs) have been established as one of the most important algorithmic tools in the Machine Learning (ML) toolbox over the past few decades. ANNs' recent rise to widespread acceptance can be attributed to two developments: (1) the availability of large-scale training and testing datasets; and (2) the availability of new computer architectures for which ANN implementations are orders of magnitude more efficient. In this thesis, I present research on two aspects of the second development. First, I present a portable, open source implementation of ANNs in OpenCL and MPI. Second, I present performance and scaling models for ANN algorithms on state-of-the-art Graphics Processing Unit (GPU) based parallel compute clusters. / Master of Science
99

Structural Modeling and Optimization of Aircraft Wings having Curvilinear Spars and Ribs (SpaRibs)

De, Shuvodeep 22 September 2017
The aviation industry is growing at a steady rate, but at present it is highly dependent on fossil fuels. Because the world is running out of fossil fuels and climate change due to carbon emissions is widely accepted, both governments and industry are spending significant resources on research to reduce the weight, and hence the fuel consumption, of commercial aircraft. The wing of a commercial fixed-wing aircraft consists of spars, which are beams running in the span-wise direction and carrying the flight loads, and ribs, which are panels with holes attached to the spars to preserve the outer airfoil shape of the wing. Kapania et al. at Virginia Tech proposed reducing the weight of the aircraft wing using an unconventional design of the internal structure consisting of curvilinear spars and ribs (known as SpaRibs) for enhanced performance. A research code, EBF3GLWingOpt, was developed by the Kapania Group at Virginia Tech to find the best configuration of SpaRibs in terms of weight saving for given flight conditions. However, this software had a number of limitations and could create and analyze only a limited number of SpaRibs configurations. In this work, the limitations of the EBF3GLWingOpt code have been identified and new algorithms have been developed to make it robust and able to analyze a larger number of SpaRibs configurations. The code also has the capability to create cut-outs in the SpaRibs for the passage of fuel pipes and wiring. This new version of the code can be used to find the best SpaRibs configuration for multiple objectives such as reducing weight and increasing flutter velocity. The code is developed in Python and has parallel computational capabilities. The wing is modeled using the commercial FEA software MSC.PATRAN and analyzed using MSC.NASTRAN, both of which are invoked from within EBF3GLWingOpt. Using this code, a significant weight reduction for a transport aircraft wing has been achieved. / PHD
100

An Efficient Parallel Three-Level Preconditioner for Linear Partial Differential Equations

Yao, Aixiang I Song 26 February 1998
The primary motivation of this research is to develop and investigate parallel preconditioners for linear elliptic partial differential equations. Three preconditioners are studied: a block-Jacobi preconditioner (BJ), a two-level tangential preconditioner (D0), and a three-level preconditioner (D1). Performance and scalability on a distributed memory parallel computer are considered. Communication cost and redundancy are explored as well. After experiments and analysis, we find that the three-level preconditioner D1 is the most efficient and scalable parallel preconditioner, compared to BJ and D0. The D1 preconditioner substantially reduces both the number of iterations and the computational time. A new hybrid preconditioner is suggested which may combine the best features of D0 and D1. / Master of Science
