  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
251

Parallel Go on CUDA with Monte Carlo Tree Search

Zhou, Jun 11 October 2013 (has links)
No description available.
252

Applying Computational Resources to the Down-Arrow Problem

Koch, Johnathan 28 April 2023 (has links)
No description available.
253

Modernizing and Evaluating the Autotuning Framework of SkePU 3

Nsralla, Basel January 2022 (has links)
Autotuning is a method which enables a program to automatically choose the most suitable parameters that optimize it for a certain goal, e.g. speed or cost. In this work autotuning is implemented in the context of the SkePU framework, in order to choose the best backend (CUDA, CPU, OpenCL, Hybrid) that would optimize a skeleton execution in terms of performance. SkePU is a framework that provides different algorithmic skeletons with implementations for the different backends (OpenCL, CUDA, OpenMP, CPU). Skeletons are parameterised with a user-provided per-element function which will run in parallel. This thesis shows how the autotuning of SkePU’s automatic backend selection for skeleton calls is implemented with respect to all the different parameters that a SkePU skeleton can have. The autotuning is built on a sampling technique: different combinations of sizes for the vector and matrix parameters are executed on all backends to generate an execution plan, which is then used as a lookup table that estimates the best-performing backend for a given skeleton call. The work finally evaluates the implementation by comparing the results of running different SkePU programs with and without the autotuning.
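Not part of the abstract: the sketch below illustrates, under loose assumptions, the execution-plan idea described above, i.e. a lookup table built during a sampling phase that maps measured input sizes to the best-performing backend. All names (ExecutionPlan, the backend labels, the timing values) are hypothetical; SkePU's actual autotuner is part of its C++ runtime and differs in detail.

```python
class ExecutionPlan:
    """Hypothetical lookup table: sampled input sizes -> best-measured backend."""

    def __init__(self):
        self._samples = []  # (size, best_backend) pairs collected during sampling

    def record_sample(self, size, timings):
        """timings: dict mapping backend name -> measured runtime at this size."""
        self._samples.append((size, min(timings, key=timings.get)))

    def best_backend(self, size):
        """Estimate the best backend for an unseen size from the nearest sampled size."""
        if not self._samples:
            raise RuntimeError("execution plan is empty; run the sampling phase first")
        _, backend = min(self._samples, key=lambda s: abs(s[0] - size))
        return backend


# Sampling phase (timing values are invented purely for illustration):
plan = ExecutionPlan()
plan.record_sample(1_000, {"CPU": 0.2, "OpenMP": 0.3, "OpenCL": 1.1, "CUDA": 1.3})
plan.record_sample(1_000_000, {"CPU": 90.0, "OpenMP": 25.0, "OpenCL": 4.0, "CUDA": 3.1})

print(plan.best_backend(5_000))      # nearest sample is 1 000 -> "CPU"
print(plan.best_backend(2_000_000))  # nearest sample is 1 000 000 -> "CUDA"
```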
254

Automated Adaptive Data Center Generation For Meshless Methods

Mitteff, Eric 01 January 2006 (has links)
Meshless methods have recently received much attention but are yet to reach their full potential, as the required problem setup (i.e. collocation point distribution) is still significant and far from automated. The distribution of points still closely resembles the nodes of finite volume-type meshes, and the free parameter, c, of the radial-basis expansion functions (RBF) must still be tailored specifically to a problem. The localized meshless collocation method investigated requires a local influence region, or topology, used as the expansion medium to produce the required field derivatives. Tests have shown that a regular Cartesian point distribution produces optimal results; however, in order to maintain a locally Cartesian point distribution, a recursive quadtree scheme is herein proposed. The quadtree method allows modeling of irregular geometries and refinement of regions of interest, and it lends itself to full automation, thus reducing problem setup efforts. Furthermore, the construction of the localized expansion regions is closely tied to the point distribution process and, hence, incorporated into the automated sequence. This also allows for the optimization of the RBF free parameter on a local basis to achieve a desired level of accuracy in the expansion. In addition, an optimized auto-segmentation process is adopted to distribute and balance the problem loads throughout a parallel computational environment while minimizing communication requirements.
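For readers unfamiliar with the free parameter c mentioned above, a typical local RBF expansion in meshless collocation has the form below. The abstract does not state which basis the thesis uses; Hardy multiquadrics are shown only as a common example in this literature.

```latex
u(\mathbf{x}) \;\approx\; \sum_{j=1}^{N_t} \alpha_j \,\chi_j(\mathbf{x}),
\qquad
\chi_j(\mathbf{x}) \;=\; \sqrt{\lVert \mathbf{x}-\mathbf{x}_j\rVert^{2} + c^{2}},
```

where the x_j are the N_t influence points of the local topology; field derivatives follow by differentiating the basis functions, and the shape parameter c controls the flatness of each basis function and hence the conditioning and accuracy of the local expansion.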
255

Optimising 3D object destruction tools for improved performance and designer efficiency in video game development

Forslund, Elliot January 2023 (has links)
Background. In video game development, efficient destruction tools and workflows were crucial for creating engaging gaming environments. This study delved into the fundamental principles of 3D object properties and interactions, reviewed existing destruction techniques, and offered insights into their practical application, with a specific focus on Embark Studios’ destruction tool. Objectives. This study focused on the optimisation of an existing destruction tool to enhance efficiency and integration within a gaming company’s pipeline. The key objectives included reducing execution time, and improving designer workflow. The study utilised performance counters and Unreal Insights profiling to identify and optimise hotspots in the destruction tool. Additionally, the performance of the optimised tool was measured and compared to the existing one to quantify efficiency improvements. An expert evaluation with designers at Embark Studios was conducted to assess the impact of the optimised tool on their workflow. Methods. The existing destruction tool was optimised primarily through parallelisation. The efficiency of the optimised tool was evaluated both empirically, by measuring the execution time, and subjectively, through an expert evaluation involving three professional level designers. Results. The optimisation significantly reduced the execution time of the destruction tool. Feedback from the expert evaluation indicated that the optimised tool could enhance designer efficiency, particularly in rebuilding the destruction graphs. However, the performance of the optimised tool was found to be hardware-dependent, with varying execution times observed across different hardware configurations. Conclusions. This study presented an optimised destruction tool which demonstrated improved performance and efficiency, validating its suitability for integration into the pipeline of game development. It was proposed that future work could further optimise this tool and explore its performance across diverse hardware configurations.
256

An Investigation of the Behavior of Structural Systems with Modeling Uncertainties

Hardyniec, Andrew B. 24 March 2014 (has links)
Recent advancements in earthquake engineering have caused a movement toward a probabilistic quantification of the behavior of structural systems. Analysis characteristics, such as ground motion records, material properties, and structural component behavior are defined by probabilistic distributions. The response is also characterized probabilistically, with distributions fitted to analysis results at intensity levels ranging from the maximum considered earthquake ground motion to collapse. Despite the progress toward a probabilistic framework, the variability in structural analysis results due to modeling techniques has not been considered. This work investigates the uncertainty associated with modeling geometric nonlinearities and Rayleigh damping models on the response of planar frames at multiple ground motion intensity levels. First, an investigation is presented on geometric nonlinearity approaches for planar frames, followed by a critical review of current damping models. Three frames, a four-story buckling restrained braced frame, a four-story steel moment resisting frame, and an eight-story steel moment resisting frame, are compared using two geometric nonlinearity approaches and five Rayleigh damping models. Static pushover analyses are performed on the models in the geometric nonlinearities study, and incremental dynamic analyses are performed on all models to compare the response at the design based earthquake ground motion (DBE), maximum considered earthquake ground motion (MCE), and collapse intensity levels. The results indicate noticeable differences in the responses at the DBE and MCE levels and significant differences in the responses at the collapse level. Analysis of the sidesway collapse mechanisms indicates a shift in the behavior corresponding to the different modeling assumptions, though the effects were specific to each frame. The FEMA P-695 Methodology provided a framework that defined the static and dynamic analyses performed during the modeling uncertainties studies. However, the Methodology is complex and the analyses are computationally expensive. To expedite the analyses and manage the results, a toolkit was created that streamlines the process using a set of interconnected modules. The toolkit provides a program that organizes data and reduces mistakes for those familiar with the process while providing an educational tool for novices of the Methodology by stepping new users through the intricacies of the process. The collapse margin ratio (CMR), calculated in the Methodology, was used to compare the collapse behavior of the models in the modeling uncertainties study. Though it provides a simple scalar quantity for comparison, calculation of the CMR typically requires determination of the full set of incremental dynamic analysis curves, which require prohibitively large analysis time for complex models. To reduce the computational cost of calculating the CMR, a new parallel computing method, referred to as the fragility search method, was devised that uses approximate collapse fragility curves to quickly converge on the median collapse intensity value. The new method is shown to have favorable attributes compared to other parallel computing methods for determining the CMR. / Ph. D.
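As background not quoted from the abstract: in the FEMA P-695 Methodology the collapse margin ratio compares the median collapse intensity obtained from incremental dynamic analysis with the MCE intensity at the structure's fundamental period,

```latex
\mathrm{CMR} \;=\; \frac{\hat{S}_{CT}}{S_{MT}},
```

where \hat{S}_{CT} is the median spectral acceleration at which half of the ground motion records cause collapse and S_{MT} is the MCE spectral acceleration at the fundamental period. The fragility search method described above aims to locate \hat{S}_{CT} directly, without tracing the full set of incremental dynamic analysis curves.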
257

Aircraft Multidisciplinary Design Optimization using Design of Experiments Theory and Response Surface Modeling Methods

Giunta, Anthony A. 01 May 1997 (has links)
Design engineers often employ numerical optimization techniques to assist in the evaluation and comparison of new aircraft configurations. While the use of numerical optimization methods is largely successful, the presence of numerical noise in realistic engineering optimization problems often inhibits the use of many gradient-based optimization techniques. Numerical noise causes inaccurate gradient calculations which in turn slows or prevents convergence during optimization. The problems created by numerical noise are particularly acute in aircraft design applications where a single aerodynamic or structural analysis of a realistic aircraft configuration may require tens of CPU hours on a supercomputer. The computational expense of the analyses coupled with the convergence difficulties created by numerical noise are significant obstacles to performing aircraft multidisciplinary design optimization. To address these issues, a procedure has been developed to create two types of noise-free mathematical models for use in aircraft optimization studies. These two methods use elements of statistical analysis and the overall procedure for using the methods is made computationally affordable by the application of parallel computing techniques. The first modeling method, which has been the primary focus of this work, employs classical statistical techniques in response surface modeling and least squares surface fitting to yield polynomial approximation models. The second method, in which only a preliminary investigation has been performed, uses Bayesian statistics and an adaptation of the Kriging process in Geostatistics to create exponential function-based interpolating models. The particular application of this research involves modeling the subsonic and supersonic aerodynamic performance of high-speed civil transport (HSCT) aircraft configurations. The aerodynamic models created using the two methods outlined above are employed in HSCT optimization studies so that the detrimental effects of numerical noise are reduced or eliminated during optimization. Results from sample HSCT optimization studies involving five and ten variables are presented here to demonstrate the utility of the two modeling methods. / Ph. D.
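For context, and not taken from the abstract itself, polynomial response surface models of the kind described above are commonly quadratic in the n design variables and fitted by least squares,

```latex
\hat{y}(\mathbf{x}) \;=\; \beta_0 \;+\; \sum_{i=1}^{n}\beta_i x_i \;+\; \sum_{i=1}^{n}\sum_{j\ge i}^{n}\beta_{ij}\,x_i x_j ,
```

with the coefficients \beta obtained by minimising \lVert X\boldsymbol{\beta}-\mathbf{y}\rVert_2^{2} over the sample points selected by the design of experiments; because the fitted surface is smooth, it filters out the numerical noise that hampers gradient-based optimization.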
258

Hardware Accelerated Particle Filter for Lane Detection and Tracking in OpenCL

Madduri, Nikhil January 2014 (has links)
A road lane detection and tracking algorithm is developed, especially tailored to run on high-performance heterogeneous hardware like GPUs and FPGAs in autonomous road vehicles. The algorithm was initially developed in C/C++ and was ported to OpenCL, which supports computation on heterogeneous hardware. A novel road lane detection algorithm is proposed using random sampling of particles modeled as straight lines. Weights are assigned to these particles based on their location in the gradient image. To improve the computation efficiency of the lane detection algorithm, lane tracking is introduced in the form of a Particle Filter. Creation of the particles in the lane detection step and the prediction and measurement updates in the lane tracking step are computed in parallel on the GPU/FPGA using OpenCL code, while the rest of the code runs on a host CPU. The software was tested on two GPUs, an NVIDIA GeForce GTX 660 Ti and an NVIDIA GeForce GTX 285, and an FPGA, an Altera Stratix-V, which gave computational frame rates of up to 104 Hz, 79 Hz and 27 Hz respectively. The code was tested on video streams from five different datasets with different scenarios of varying lighting conditions on the road, strong shadows and the presence of light to moderate traffic, and was found to be robust in all situations for detecting a single lane.
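The NumPy sketch below is not from the thesis (which offloads this step to the GPU/FPGA via OpenCL); it only illustrates the weighting idea in the abstract, scoring each straight-line particle by the gradient magnitude it covers. The line parameterisation and scoring rule are assumptions made for the example.

```python
import numpy as np

def line_particle_weights(gradient_img, particles, samples_per_line=50):
    """Score each straight-line particle by the gradient magnitude accumulated
    along the line, then normalise the scores to particle-filter weights.
    gradient_img: 2-D array (H x W) of gradient magnitudes.
    particles:    array of shape (N, 4) with line endpoints (x0, y0, x1, y1).
    """
    h, w = gradient_img.shape
    t = np.linspace(0.0, 1.0, samples_per_line)
    weights = np.empty(len(particles))
    for i, (x0, y0, x1, y1) in enumerate(particles):
        xs = np.clip((x0 + t * (x1 - x0)).astype(int), 0, w - 1)
        ys = np.clip((y0 + t * (y1 - y0)).astype(int), 0, h - 1)
        weights[i] = gradient_img[ys, xs].sum()
    total = weights.sum()
    return weights / total if total > 0 else np.full(len(particles), 1.0 / len(particles))


# Toy usage on random data; a real pipeline would pass an edge/gradient image:
rng = np.random.default_rng(0)
grad = rng.random((480, 640))
lines = rng.uniform(0, 480, size=(100, 4))
w = line_particle_weights(grad, lines)
print(w.argmax())  # index of the strongest lane-boundary candidate
```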
259

Evaluating the applicability of Deep Learning techniques in agricultural systems modeling

Saravi, Babak 13 December 2019 (has links)
A rapidly expanding world population and extreme climate change have made food production a crucial challenge in the twenty-first century. Improving crop production through agricultural management could be an effective solution to this challenge. However, due to the cost and time associated with field work, researchers rely widely on agricultural system models to examine the impacts of different crop management scenarios, yet the complexity of these models limits their use in producing practical knowledge for producers. Concurrently, deep learning techniques have been recognized as a preferred method when dealing with large datasets. This study was performed in three phases. First, a deep learning network was trained on a large number of datasets produced by the Decision Support System for Agrotechnology Transfer (DSSAT) model. To the best of our knowledge, no prior research has modeled a cropping system with deep learning. A model accuracy level of around 98% was obtained, and the model was 770 times faster than the classical DSSAT crop model in calculating 900,000 different crop growth scenarios. The second phase of the study examined the robustness of the deep learning model under a wider range of environmental factors (e.g., different irrigation and climatological conditions), while a more compact deep learning structure was sought compared to the first study. To optimize the deep learning structure, three variable reduction methods were used (Bayesian, Spearman, and Principal Component Analysis). The results showed that a deep learning structure could be developed with a similar accuracy level to the original model while the structural size was reduced by up to 80 times. In the third phase of the study, three techniques (L1 regularization, L2 regularization, and neuron dropout) were used to address the overfitting problem in some deep learning models. L2 regularization was identified as the most effective method, increasing model generalization and reducing overfitting. The overall results from this study demonstrate the effectiveness of the proposed deep learning technique in replicating the yield results from crop modeling under different climatological and management conditions.
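For reference, and not quoted from the thesis, the L2 regularization mentioned above augments the training loss with a penalty on the weight magnitudes, which discourages the large weights associated with overfitting:

```latex
\mathcal{L}(\mathbf{w}) \;=\; \frac{1}{N}\sum_{i=1}^{N}\big(y_i - \hat{y}_i(\mathbf{w})\big)^2 \;+\; \lambda\,\lVert \mathbf{w}\rVert_2^{2},
```

where \lambda sets the strength of the penalty (the exact loss function used in the thesis is not given in the abstract).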
260

The role of Reynolds number in the fluid-elastic instability of cylinder arrays

Ghasemi, Ali 05 1900 (has links)
The onset of fluid-elastic instability in cylinder arrays is usually thought to depend primarily on the mean flow velocity, the Scruton number and the natural frequency of the cylinders. Currently, there is considerable evidence from experimental measurements and computational fluid dynamics (CFD) simulations that the Reynolds number is also an important parameter. However, the available data are not sufficient to understand or quantify this effect. In this study we use a high-resolution pseudo-spectral scheme to solve the 2-D penalized Navier-Stokes equations in order to accurately model turbulent flow past a cylinder array. To uncover the Reynolds number effect we perform simulations that vary the Reynolds number independently of the flow velocity at a fixed Scruton number, and then analyze the cylinder responses. The computational complexity of our algorithm is a function of Reynolds number; we therefore developed a high-performance parallel code which allows us to simulate high Reynolds numbers at a reasonable computational cost. The simulations reveal that increasing Reynolds number has a strong destabilizing effect for staggered arrays. For the in-line array, the Reynolds number still affects the instability threshold, but the effect is not monotonic with increasing Reynolds number. In addition, our findings suggest that geometry is also an important factor, since at low Reynolds numbers the critical flow velocity in the staggered array is considerably higher than in the in-line case. This study helps to better predict how the onset of fluid-elastic instability depends on Reynolds number and reduces uncertainties in experimental data, which usually do not account for the effect of Reynolds number. / Thesis / Master of Science (MSc)
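As background not stated in the abstract, the "penalized Navier-Stokes equations" most likely refer to volume penalization, in which a Brinkman-type forcing drives the fluid velocity toward the cylinder velocity inside a mask \chi marking the solid regions:

```latex
\frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u}\cdot\nabla)\mathbf{u}
\;=\; -\frac{1}{\rho}\nabla p \;+\; \nu\,\nabla^{2}\mathbf{u}
\;-\; \frac{\chi(\mathbf{x},t)}{\eta}\,\big(\mathbf{u}-\mathbf{u}_{s}\big),
\qquad \nabla\cdot\mathbf{u} = 0,
```

with \chi = 1 inside the cylinders and 0 in the fluid, \mathbf{u}_{s} the local cylinder velocity, and \eta \ll 1 the penalization parameter; this formulation keeps the geometry on a fixed grid, which suits pseudo-spectral discretizations.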