Global ETD Search

91	Real-Time Prediction-Driven Dynamics Simulation to Mitigate Frame Time Variation Buck, Mackinnon A 01 December 2021 (has links) (PDF) Real-time physics engines have seen recent performance improvements through techniques like hardware acceleration and artificial intelligence. However, state of the art physics simulation technology fails to account for the variation in simulation complexity over time. Sudden increases in contact frequency between simulated bodies can momentarily increase the processing time per frame. To solve this, we present a prediction-driven real-time dynamics method that uses a memory-efficient graph-based state buffer to minimize the cost of mispredictions. This buffer, which is generated by a separate thread running the physics pipeline, allows physics computation to temporarily run slower than real-time without affecting the frame rate of the host application. The main thread, whose role in dynamics computation gets limited to querying the simulation state and regenerating mispredicted state, sees a significant reduction in time spent per frame on dynamics computation when our multi-threaded prediction pipeline is enabled. Thus, our technique enables interactive multimedia applications to increase the computational budget for graphics at no cost perceptible to the end user. Furthermore, our method guarantees determinism and low input latency, making it suitable in competitive games and other real-time interactive applications. We also provide a C++ API to integrate custom game logic with the prediction engine to further minimize the frequency of mispredictions. Real-Time Physics Engine Physics Simulation Dynamics Parallel Computing
92	Epithelium: The lightweight, customizable epithelial tissue simulator Drag, Melvyn I. 29 May 2015 (has links) No description available. Biology Computer Science Mathematics parallel computing epithelial tissue
93	A High Productivity Framework for Parallel Data Intensive Computing in MATLAB Panuganti, Rajkiran 26 June 2009 (has links) No description available. Computer Science High Performance High Productivity Matlab Parallel Computing Frameworks
94	High-Performance Multi-Transport MPI Design for Ultra-Scale InfiniBand Clusters Koop, Matthew J. 03 September 2009 (has links) No description available. Computer Science MPI InfiniBand Message Passing Cluster Parallel Computing
95	Optimization Frameworks for Discrete Composite Laminate Stacking Sequences Adams, David Bruce 23 August 2005 (has links) Composite panel structure optimization is commonly decomposed into panel optimization subproblems, with specified local loads, resulting in manufacturing incompatibilities between adjacent panel designs. Using genetic algorithms to optimize local panel stacking sequences allows panel populations of stacking sequences to evolve in parallel and send migrants to adjacent panels, so as to blend the local panel designs globally. The blending process is accomplished using the edit distance between individuals of a population and the set of migrants from adjacent panels. The objective function evaluating the fitness of designs is modified according to the severity of mismatches detected between neighboring populations. This lays the ground work for natural evolution to a blended global solution without leaving the paradigm of genetic algorithms. An additional method applied here for constructing globally blended panel designs uses a parallel decomposition antithetical to that of earlier work. Rather than performing concurrent panel genetic optimizations, a single genetic optimization is conducted for the entire structure with the parallelism solely within the fitness evaluations. A guide based genetic algorithm approach is introduced to exclusively generate and evaluate valid globally blended designs, utilizing a simple master-slave parallel implementation, implicitly reducing the size of the problem design space and increasing the quality of discovered local optima. / Ph. D. Blending Decomposition Genetic Algorithms Composite Laminates Combinatorial Optimization Parallel Computing
96	Structural Modeling and Optimization of Aircraft Wings having Curvilinear Spars and Ribs (SpaRibs) De, Shuvodeep 22 September 2017 (has links) The aviation industry is growing at a steady rate, but at present it is highly dependent on fossil fuel. As the world runs out of fossil fuels and climate change due to carbon emissions gains widespread acceptance, both governments and industry are spending a significant amount of resources on research to reduce the weight, and hence the fuel consumption, of commercial aircraft. A commercial fixed-wing aircraft wing consists of spars, which are beams running in the span-wise direction and carrying the flight loads, and ribs, which are panels with holes attached to the spars to preserve the outer airfoil shape of the wing. Kapania et al. at Virginia Tech proposed the concept of reducing the weight of an aircraft wing using an unconventional design of the internal structure consisting of curvilinear spars and ribs (known as SpaRibs) for enhanced performance. A research code, EBF3GLWingOpt, was developed by the Kapania Group at Virginia Tech to find the best configuration of SpaRibs in terms of weight saving for given flight conditions. However, this software had a number of limitations and could only create and analyze a limited number of SpaRibs configurations. In this work, the limitations of the EBF3GLWingOpt code have been identified and new algorithms have been developed to make it robust and able to analyze a larger number of SpaRibs configurations. The code also has the capability to create cut-outs in the SpaRibs for passage of fuel pipes and wiring. This new version of the code can be used to find the best SpaRibs configuration for multiple objectives such as reducing weight and increasing flutter velocity. The code is developed in Python and has parallel computational capabilities. The wing is modeled using the commercial FEA software MSC.PATRAN and analyzed using MSC.NASTRAN, both of which are run from within EBF3GLWingOpt. 
Using this code a significant weight reduction for a transport aircraft wing has been achieved. / PHD Multidisciplinary Optimization Finite Element Methods Aeroelastic Analysis Buckling Parallel Computing
97	High Performance Computing Issues in Large-Scale Molecular Statics Simulations Pulla, Gautam 02 June 1999 (has links) Successful application of parallel high performance computing to practical problems requires overcoming several challenges. These range from the need to make sequential and parallel improvements in programs to the implementation of software tools which create an environment that aids sharing of high performance hardware resources and limits losses caused by hardware and software failures. In this thesis we describe our approach to meeting these challenges in the context of a Molecular Statics code. We describe sequential and parallel optimizations made to the code and also a suite of tools constructed to facilitate the execution of the Molecular Statics program on a network of parallel machines with the aim of increasing resource sharing, fault tolerance and availability. / Master of Science distributed computing high performance computing computational environments parallel computing
98	Scalability Analysis of Synchronous Data-Parallel Artificial Neural Network (ANN) Learners Sun, Chang 14 September 2018 (has links) Artificial Neural Networks (ANNs) have been established as one of the most important algorithmic tools in the Machine Learning (ML) toolbox over the past few decades. ANNs' recent rise to widespread acceptance can be attributed to two developments: (1) the availability of large-scale training and testing datasets; and (2) the availability of new computer architectures for which ANN implementations are orders of magnitude more efficient. In this thesis, I present research on two aspects of the second development. First, I present a portable, open source implementation of ANNs in OpenCL and MPI. Second, I present performance and scaling models for ANN algorithms on state-of-the-art Graphics Processing Unit (GPU) based parallel compute clusters. / Master of Science artificial neural networks Machine learning heterogenous computing parallel computing
99	Towards Enhancing Performance, Programmability, and Portability in Heterogeneous Computing Krommydas, Konstantinos 03 May 2017 (has links) The proliferation of a diverse set of heterogeneous computing platforms, in conjunction with the plethora of programming languages and optimization techniques for each language and underlying architecture, hinders widespread adoption of such platforms. This is especially true for novice programmers and the non-technical-savvy masses that are largely precluded from enjoying the advantages of high-performance computing. Moreover, different groups within the heterogeneous computing community (e.g., hardware architects, tool developers, and programmers) are presented with new challenges with respect to performance, programmability, and portability (or the three P's) of heterogeneous computing. In this work we discuss such challenges and identify benchmarking techniques based on computation and communication patterns as an appropriate means for the systematic evaluation of heterogeneous computing with respect to the three P's. Our proposed approach is based on OpenCL implementations of the Berkeley dwarfs. We use our benchmark suite (OpenDwarfs) in characterizing performance of state-of-the-art parallel architectures, and as the main component of a methodology (Telescoping Architectures) for identifying trends in future heterogeneous architectures. Furthermore, we employ OpenDwarfs in a multi-faceted study on the gaps between the three P's in the context of the modern heterogeneous computing landscape. Our case-study spans a variety of compilers, languages, optimizations, and target architectures, including the CPU, GPU, MIC, and FPGA. 
Based on our insights, and extending aspects of prior research (e.g., in compilers, programming languages, and auto-tuning), we propose the introduction of grid-based data structures as the basis of programming frameworks and present a prototype unified framework (GLAF) that encompasses a novel visual programming environment with code generation, auto-parallelization, and auto-tuning capabilities. Our results, which span scientific domains, indicate that our holistic approach constitutes a viable alternative towards enhancing the three P's and further democratizing heterogeneous, parallel computing for non-programming-savvy audiences, and especially domain scientists. / Ph. D. / In the past decade computing has moved from single-core machines, that is, machines with a CPU that can execute code in a serial manner, to multi-core ones, i.e., machines with CPUs that can execute code in a parallel fashion. Another paradigm shift that has manifested in the past years entails computing that utilizes heterogeneous processing, as opposed to homogeneous processing. In the latter case a single type of processor (CPU) is responsible for executing a given program, whereas in the former case different types of processors (such as CPUs, graphics processors or other accelerators) collaborate in an effort to tackle computationally difficult problems in a fast, parallel manner. The shift to multi-core, parallel, heterogeneous computing described above is accompanied by an associated shift in programming languages for such platforms, as well as techniques to optimize programs for high performance (i.e., execution speed). The unique complexities of parallel and heterogeneous computing hinder widespread adoption of such platforms. This is especially true for novice programmers and the non-technical-savvy masses that are largely precluded from the advantages of high-performance computing. 
Challenges include obtaining fast execution speeds (i.e., performance), ease of programming (i.e., programmability), and the ability to execute programs across different heterogeneous platforms (i.e., portability). Performance, programmability, and portability constitute the 3 P’s of heterogeneous computing. In this work we discuss the above challenges in detail and provide insights and solutions for different interest groups within the computing community, such as computer architects, tool developers and programmers. We propose an approach for evaluating existing heterogeneous computing platforms based on the concept of dwarf-based benchmarks (i.e., applications that are characterized by certain computation and communication patterns). Furthermore, we propose a methodology for utilizing the dwarf concept for evaluating potential future heterogeneous platforms. In our research we attempt to quantify the trade-offs between performance, programmability, and portability in a wide set of modern heterogeneous platforms. Based on the above, we seek to bridge the 3 P’s by introducing a programming framework that democratizes parallel algorithm development on heterogeneous architectures for novice programmers and domain scientists. Specifically, our framework produces parallel, optimized code implementations in multiple languages with the potential of executing across different heterogeneous platforms. Performance Programmability Heterogeneous Architectures Parallel Computing Programming Framework
100	An Efficient Parallel Three-Level Preconditioner for Linear Partial Differential Equations Yao, Aixiang I Song 26 February 1998 (has links) The primary motivation of this research is to develop and investigate parallel preconditioners for linear elliptic partial differential equations. Three preconditioners are studied: block-Jacobi preconditioner (BJ), a two-level tangential preconditioner (D0), and a three-level preconditioner (D1). Performance and scalability on a distributed memory parallel computer are considered. Communication cost and redundancy are explored as well. After experiments and analysis, we find that the three-level preconditioner D1 is the most efficient and scalable parallel preconditioner, compared to BJ and D0. The D1 preconditioner reduces both the number of iterations and computational time substantially. A new hybrid preconditioner is suggested which may combine the best features of D0 and D1. / Master of Science domain decomposition preconditioner parallel computing PDE distributed systems
