1 |
Cloth Modelling on the GPU
Dencker, Kjartan. January 2006
<p>This project explores the possibility of using general-purpose programming on the GPU to simulate cloth in 3D. The goal is to implement a faster version of the implicit Euler method given in 'Large Steps in Cloth Simulation' by Baraff and Witkin.</p>
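<p>To illustrate why an implicit (backward) Euler integrator allows large time steps, here is a minimal sketch for a single stiff 1D spring. It is not the thesis's GPU implementation: the function names, the spring constant and the step size are assumptions chosen only for illustration.</p>

```python
def implicit_euler_step(x, v, k, m, h):
    """One backward Euler step for m*x'' = -k*x (1D spring, rest position 0)."""
    # Backward Euler uses the *new* state on the right-hand side:
    #   v_new = v - h*(k/m)*x_new,  x_new = x + h*v_new
    # Substituting x_new gives a linear equation in v_new.
    v_new = (v - h * (k / m) * x) / (1.0 + h * h * k / m)
    x_new = x + h * v_new
    return x_new, v_new


def explicit_euler_step(x, v, k, m, h):
    """One forward Euler step, shown only for comparison."""
    return x + h * v, v - h * (k / m) * x


def simulate(step, steps=100, x=1.0, v=0.0, k=1000.0, m=1.0, h=0.1):
    """Advance the spring `steps` times with the given step function."""
    for _ in range(steps):
        x, v = step(x, v, k, m, h)
    return x, v
```

<p>With these assumed values the explicit integrator diverges, while the implicit one stays bounded at the cost of numerical damping. A cloth solver faces the same trade-off, but has to solve a large sparse linear system per step instead of a scalar equation.</p>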
|
2 |
Physically Based Simulation and Visualization of Fire in Real-Time using the GPU
Rødal, Knut Erik Samuel; Storli, Geir. January 2006
<p>Fire is a powerful natural effect which can greatly enhance the immersion of virtual environments and games. In this thesis we describe the theory and GPU implementation of a physically based approach for simulating and visualizing 3D fire in real-time. Previous approaches are generally lacking either in visual quality, turbulence and flickering, or flexibility and extensibility. We attempt to address all these issues by using an underlying fluid simulation, modeling the mass and heat transfer aspects of the physics of fire, in combination with an explicit combustion process. The fluid simulation is used to control the behavior of a velocity field governing the motion of fuel gas, hot exhaust gas, and temperature fields, and the combustion process models the conversion of fuel gas to exhaust gas when the temperature is above the ignition temperature of the fuel gas. The velocity field is affected by, among other things, vorticity confinement, which causes a more turbulent and flickering fire, and a buoyancy force modeling upward motion. We perform the fire simulation both in 3D and in a set of 2D slices using volumetric extrusion to define an implicit 3D domain. To achieve satisfactory visual quality, we visualize the fire using a particle system of textured particles guided by the results from the fire simulation. The particle colors are based on black-body radiation from the hot exhaust gas, and the particles move according to the velocity field from the fluid simulation. A similar particle system is used to visualize the cooled exhaust gas or smoke. As an alternative to particle systems we have also implemented a volume rendering approach for visualizing fire, but it falls short both in performance and visual quality. Finally, we model dynamic illumination, approximating the illumination from the fire on the surrounding scene by a set of point lights, whose intensities are computed in a similar fashion to the fire particle colors.
The point lights are either stationary, positioned near the center of the fire, or set to follow the velocity field just like the particles of the fire and smoke particle systems. Both the simulation and visualization of fire are implemented completely on the GPU, ensuring high frame rates without sacrificing visual quality. We have achieved a flickering and turbulent fire which compares favorably to previous approaches and works well in virtual environments, especially due to the dynamic illumination. The fire visualization also has realistic colors and intensity, and thus captures important elements of real fire. Our underlying physically based simulation enables us to efficiently simulate a variety of different kinds of small-scale fires by altering a set of simulation parameters. One of our main contributions is implementing the explicit combustion process with fluid simulation on the GPU, as well as using it in combination with vorticity confinement and volumetric extrusion. Our contributions also include the dynamic illumination already mentioned, simulation domain advection, a novel method for modeling the behavior of fire as it is moved, and using time-dependent noise curves to model dynamic wind affecting the fire.</p>
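<p>The explicit combustion process described above (fuel gas becomes exhaust gas where the temperature exceeds the ignition temperature) can be sketched per grid cell as follows. This is a CPU sketch under stated assumptions, not the thesis's GPU shader: the constant burn rate, the heat released per unit of fuel, and all parameter names are hypothetical.</p>

```python
def combustion_step(fuel, exhaust, temperature,
                    ignition_temp, burn_rate, heat_per_unit, dt):
    """Convert fuel to exhaust in every cell that is hotter than ignition_temp.

    fuel, exhaust, temperature: equal-length lists, one value per grid cell.
    burn_rate and heat_per_unit are assumed model constants.
    """
    new_fuel, new_exhaust, new_temp = [], [], []
    for f, e, t in zip(fuel, exhaust, temperature):
        # Burn at most the fuel that is present, and only above ignition.
        burned = min(f, burn_rate * dt) if t > ignition_temp else 0.0
        new_fuel.append(f - burned)
        new_exhaust.append(e + burned)                # mass is conserved
        new_temp.append(t + heat_per_unit * burned)   # burning releases heat
    return new_fuel, new_exhaust, new_temp
```

<p>Each cell is updated independently of its neighbours, which is what makes a step of this kind a natural fit for a fragment program running over the simulation grid.</p>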
|
3 |
Parallel Methods for Real-Time Visualization of Snow
Saltvik, Ingar. January 2006
<p>Computer-generated imagery is becoming more and more popular in areas such as computer gaming, the movie industry and simulation. A familiar scene in the winter months for most of us in the Nordic countries is snow. This thesis discusses some of the complex numerical algorithms behind snow simulations. Previous methods for snow simulation have either covered only a very limited aspect of snow, or have been unsuitable for real-time performance. In this thesis, some of these methods are combined into a model for real-time snow simulation that handles snowflake motion through the air, wind simulation, and accumulation of snow on objects and the ground. With the goal of achieving real-time performance at more than 25 frames per second, some new parallel methods for the snow model are introduced. The focus is on efficient parallelization on new SMP and multi-core computer systems. The algorithms are first parallelized in a pure data-parallel manner by dividing the data structures among threads. This scheme is then improved by overlapping inherently sequential algorithms with computations for the following frame, to eliminate processor idle time. A speedup of 1.9 on modern dual-CPU workstations is achieved, while displaying a visually satisfying result in real-time. By utilizing Hyper-Threading-enabled dual-CPU systems, the speedup is further improved to 2.0.</p>
|
4 |
Neighborhood Mining in Biological Networks
Stenersen, Kristoffer; Sundsdal, Sverre. January 2006
<p>Biologists are constantly looking for new knowledge about biological properties and processes. Bio-molecular interaction networks model dependencies among proteins and the processes they participate in. By studying patterns of interaction in these networks, it may be possible to discover implicit information embedded in the network topology. In this thesis we improve existing methods and develop new ones for investigating similarities between proteins and for discovering protein interaction sub-patterns. Cytoscape (Shannon et al., 2003) is a tool for visualization and analysis of interaction networks used by biologists. We have developed an extension to Cytoscape that lets biologists perform the following tasks: compare proteins based on neighborhood information; find an interaction sub-pattern in an interaction network; and discover sub-patterns in one or several networks. Our main contributions are improvements to the graph mining algorithms gSpan by Yan and Han (2002) and Apriori by Inokuchi et al. (2003), whose original task was the discovery of frequent sub-patterns in a very large set of networks. We have enabled mining of a single network and enabled less exact matches. The graph mining algorithms run on labeled graphs, and we have used various clustering techniques to produce the labels. The clustering is done through similarity measures between proteins, which we have based on Gene Ontology annotations and experimental data obtained from a ChIP-chip experiment. Our plug-in may easily be extended by adding other clustering techniques or similarity measures. We verify the results of our implementations and test them for speed. We find that, of the two mining algorithms, gSpan shows the most promise for mining biological graphs.</p>
|
5 |
Dynamic Selection of MPI Intra-copy Routines Based on Program Characteristics
Borg, Øystein Lauen. January 2006
<p>The Message Passing Interface (MPI) has become a de facto standard for parallel programming. The ultimate goal of parallel processing is high performance, which motivates a highly optimized MPI implementation. When an application calls an MPI communication routine, data is copied between user memory and the memory areas managed by the MPI library. The speed of this transfer depends on a multitude of factors, including the architecture, the amount of data, the data layout and whether the data is referenced right before or after a transfer. There are numerous ways to copy data from one location to another, and their characteristics combined with the data properties yield different efficiency. The information needed to select the best way to copy data is only available during application execution. In this Master's thesis, we present and implement a method to improve the performance of parallel applications by dynamically performing a close-to-optimal selection of intra-copy routines within an MPI implementation. Our method detects loops of MPI calls and exploits loop predictability to time their performance while varying the routine selections. In order to obtain a good routine selection reasonably fast, a global optimization heuristic, simulated annealing, is used. In particular, our solution method is employed within Scali MPI Connect (SMC), an MPI implementation providing 35 different intra-copy routines. Through various benchmarks, it is observed that our method introduces low overhead and finds a good selection fast, thus reducing the execution time of the given benchmark. In benchmarks where the difference between an optimal routine selection and the standard selection within SMC allows it, a bandwidth improvement of 40% is observed.</p>
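<p>As a rough illustration of selecting copy routines with simulated annealing, the sketch below searches over one routine index per MPI call in a loop. It is an assumption-laden toy, not the thesis's method as implemented in SMC: in the thesis the cost is a measured loop time, whereas <code>toy_cost</code> here is a made-up stand-in, and the cooling schedule and parameter names are hypothetical.</p>

```python
import math
import random


def toy_cost(selection):
    """Hypothetical stand-in for a measured loop time: routine 7 is 'fastest'."""
    return sum(abs(routine - 7) for routine in selection)


def anneal_selection(cost, n_calls, n_routines,
                     steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Search for a low-cost routine index per call using simulated annealing."""
    rng = random.Random(seed)
    current = [rng.randrange(n_routines) for _ in range(n_calls)]
    current_cost = cost(current)
    best, best_cost = list(current), current_cost
    temp = t0
    for _ in range(steps):
        # Neighbour move: change the routine used for one randomly chosen call.
        candidate = list(current)
        candidate[rng.randrange(n_calls)] = rng.randrange(n_routines)
        cand_cost = cost(candidate)
        delta = cand_cost - current_cost
        # Always accept improvements; accept regressions with a probability
        # that shrinks as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = candidate, cand_cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
        temp *= cooling
    return best, best_cost
```

<p>A call such as <code>anneal_selection(toy_cost, n_calls=4, n_routines=35)</code> returns the best selection seen and its cost. Accepting occasional regressions early on is what lets the search escape selections that are only locally good.</p>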
|
6 |
Bandwidth-Aware Prefetching in Chip Multiprocessors
Grannæs, Marius. January 2006
<p>Chip Multiprocessors (CMPs) are an increasingly popular architecture, and a growing number of vendors now offer CMP solutions. The shift from uniprocessors to CMP architectures is driven by the increasing complexity of cores, the processor-memory performance gap, limitations in ILP and increasing power requirements. Prefetching is a successful technique commonly used in high-performance processors to hide latency. In a CMP, prefetching offers new opportunities and challenges, as current uniprocessor heuristics will need adaptation or redesign to integrate with CMPs. In this thesis, I look at the state of the art in prefetching and CMP architecture. I conduct experiments on how unmodified uniprocessor prefetching heuristics perform in a CMP. In addition, I propose a new prefetching scheme based on bandwidth monitoring and prediction through performance counters, suited for embedded CMP systems. This new prefetching scheme has been simulated with SimpleScalar. It reduces bandwidth usage by up to 47.8%, while retaining most of the performance gains from prefetching for low-accuracy prefetching heuristics.</p>
|
7 |
Visualization of water surface using GPU
Gustavsen, Jostein; Harkestad, Dan Lewi. January 2006
<p>Several methods for simulating a body of water and a water surface have been investigated. A method by Layton & van de Panne based on a simplification of the Navier-Stokes equations was selected. A number of simplifications were made to increase the performance of the method, and it was implemented on the programmable graphics processing unit (GPU) using the Jacobi method to solve the linear equations. A conjugate gradient solver was also implemented on the GPU. The performance of the methods was measured and recorded.</p>
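<p>For readers unfamiliar with the Jacobi method mentioned above, a minimal CPU sketch follows. It is a generic dense-matrix version written for illustration, not the thesis's GPU solver; the function name and the fixed iteration count are assumptions.</p>

```python
def jacobi(A, b, iterations=100):
    """Approximate the solution of A x = b with Jacobi iteration.

    A is a square matrix (list of rows) with non-zero diagonal entries.
    Convergence is guaranteed when A is strictly diagonally dominant.
    """
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        # Every component is computed from the *previous* iterate only,
        # so all n updates are independent of one another.
        x = [
            (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)
        ]
    return x
```

<p>Because each component update reads only the previous iterate, the updates can run in parallel, one per fragment or thread. That independence is the usual reason Jacobi is chosen on GPUs even though it converges more slowly than conjugate gradient.</p>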
|
8 |
Weighted Pattern Matching with PWMs on FPGAs
Krutådal, Lars Karsten. January 2006
<p>This paper has presented a solution to an FPGA-based PWM matcher in the form of the so-called FPWM Prototype, using the hardware facilities on the Cray XD1 Supercomputer. The prototype implementation currently runs as a single core on a single node of the Cray, and provides a theoretical PWM matching capability roughly 15 times greater than a contemporary Pentium M general-purpose CPU. Theoretical and empirical data regarding performance and resource consumption for this implementation have been provided. A method for increasing the speedup to a theoretical maximum of 480x has also been described, using a multi-core implementation on a single chip. This theoretical limit could potentially be attained with today's hardware, but would require certain compromises with regard to bit resolution and PWM length in order to fit on the FPGA. A full-scale implementation providing the capabilities required by many of today's algorithms would most likely not reach this speed, but as the FPGA currently installed on the Cray is also available in a larger variant (the Virtex-4 family), it is reasonable to assume that such an implementation could indeed be feasible on contemporary hardware. A method for using several nodes on the Cray XD1 transparently for the user application, in order to further increase the performance, has also been described. However, as theoretical performance estimation on such hardware is a highly inexact science, and empirical measurements could not be performed at this time due to the state of the prototype, no estimates have been provided for this method. While some of the original goals were attained, other parts of the project could be considered a failure. Due to a number of implementation problems, a working FPWM was not available in time for use with the two other projects mentioned in the introduction, involving hardware acceleration of the Gibbs Sampling and MEME algorithms. 
The main problem with the cooperation between these projects was that it relied on the FPWM being finished and working before the work involving it could begin, which turned out to be much harder and to take much longer than first envisioned. The planned empirical measurements of the performance boost for these algorithms are therefore not yet available.</p>
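<p>The operation that the FPWM prototype accelerates, matching a position weight matrix (PWM) against a sequence, can be sketched in software as a sliding-window sum. This is a plain reference version for illustration only: the function name, the dictionary-based matrix layout and the integer weights are assumptions, not details taken from the prototype.</p>

```python
def pwm_scan(sequence, pwm):
    """Score every window of `sequence` against `pwm`.

    pwm is a list with one dict per motif position, mapping a base to its
    weight. Returns one score per window start position.
    """
    width = len(pwm)
    return [
        sum(pwm[k][sequence[i + k]] for k in range(width))
        for i in range(len(sequence) - width + 1)
    ]


# Hypothetical 3-position matrix that favours the motif "TAT".
TOY_PWM = [
    {"A": -1, "C": -1, "G": -1, "T": 2},
    {"A": 2, "C": -1, "G": -1, "T": -1},
    {"A": -1, "C": -1, "G": -1, "T": 2},
]
```

<p>Each window score is independent of the others and needs only table lookups and additions. That is why the problem maps well onto parallel hardware such as an FPGA.</p>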
|
9 |
Discovery of approximate composite motifs in biological sequences
Valebjørg, Vetle Søraas. January 2006
<p>Mapping the regulatory system in living organisms is a great challenge, and many methods have been created during the last 15 years to solve this problem. The biological processes are, however, more flexible and complex than first thought, and many of the methods lack the ability to model this exactly. The new method devised here is not a complete solution to this situation, but poses an innovative solution for finding approximate composite patterns in a set of sequences. Motifs are read from any third-party tool, represented as either {A,C,G,T} strings, IUPAC strings, or PWMs, and weighted with significance and support as an estimate of how important the patterns are. Finding combinations with both high significance and high support can reveal important properties preserved in the sequences. Based on this, the algorithm uses a branch-and-bound approach to traverse every combination while preserving the best solutions to this multi-objective optimization problem in a Pareto front. The best patterns found are investigated further by applying different statistical and experimental methods to better support their significance. The three most important tests done on the TransCompel dataset were: (i) measuring the predicted patterns against known sites based on nucleotide correlation; (ii) finding the frequency of motifs participating in the combinations, so that the best could be studied manually; and (iii) comparing different tests when the significance was based on real background sequences instead of the uniform distribution. Some of the results were low, but still similar to the accuracy provided by other known methods that have been tested in the same way. The test results can be biased by the parameters used, by a too simple and restrictive test set, or by faulty predictions made on the dataset tested. More testing and tuning of parameters might result in better predictions.
However, the different tests still proved this method to be a valuable tool in composite motif discovery.</p>
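<p>Keeping the best solutions of a two-objective problem in a Pareto front, as described above for significance and support, can be sketched as follows. This is a generic illustration, not the thesis's implementation: it assumes both objectives are maximized and represents each solution as a hypothetical <code>(significance, support)</code> tuple.</p>

```python
def dominates(a, b):
    """True if `a` is at least as good as `b` everywhere and better somewhere."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def update_front(front, candidate):
    """Return the Pareto front after offering `candidate` to it."""
    # Reject candidates that are dominated by, or equal to, a front member.
    if any(dominates(p, candidate) or p == candidate for p in front):
        return front
    # Otherwise drop every member the candidate dominates and add it.
    return [p for p in front if not dominates(candidate, p)] + [candidate]
```

<p>In a branch-and-bound search, a front like this can also serve as the bound: a branch whose best achievable scores are already dominated by a front member cannot contribute a new solution.</p>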
|
10 |
BSPlab - experiment manager (BEM)
Klepaker, Erlend Søreide. January 2006
<p>This document describes the development of a graphical experiment environment for BSPlab. BSPlab is a parallel computer simulator that makes it possible to simulate runs of programs written for the BSP (Bulk Synchronous Parallel) model on different computer architectures. The goal of the thesis is to develop a graphical environment for this simulator that lets the user set up simulations using a range of parameters, lets the user run the simulation and receive information from the BSP program while it is running, and provides tools for visually processing the result data from the simulation after the run. The development of this graphical experiment environment was done mainly in the Python programming language.</p>
|