Global ETD Search

451	Rendering for Microlithography on GPU Hardware Iwaniec, Michel January 2008 Over the last decades, integrated circuits have changed our everyday lives in a number of ways. Many common devices that are taken for granted today would not have been possible without this industrial revolution. Central to the manufacturing of integrated circuits is the photomask used to expose the wafers. Such photomasks are also used in the manufacturing of flat-screen displays. Microlithography, the technique used to manufacture such photomasks, requires complex electronic equipment that excels in both speed and fidelity. Manufacturing such equipment requires competence in virtually all engineering disciplines, and the conversion of geometry into pixels is only one of them. Nevertheless, this single step in the photomask drawing process has a major impact on the throughput and quality of a photomask writer. Current high-end semiconductor writers from Micronic use a cluster of field-programmable gate arrays (FPGAs). For many years, FPGAs have been able to replace application-specific integrated circuits (ASICs) thanks to their flexibility and low initial development cost. For parallel computation, an FPGA can achieve throughput not possible with microprocessors alone. However, high-performance FPGAs are expensive devices, and upgrading from one generation to the next often requires a major redesign. During the last decade, the computer games industry has taken the lead in parallel computation with graphics cards for 3D gaming. Although they are essentially designed to render 3D polygons and lack the flexibility of an FPGA, graphics cards have started to rival FPGAs as the main workhorse of many parallel computing applications. This thesis investigates the use of graphics cards for rendering geometry into photomask patterns.
It describes different strategies that were tried and the throughput and fidelity achieved with them, along with the problems encountered. It also describes the development of a suitable evaluation framework that was critical to the process. Microlithography GPU area sampling anti-aliasing OpenGL CUDA Signal processing Signalbehandling
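As an illustration of the area sampling named in the keywords of this record, the following minimal Python sketch computes the exact fraction of each pixel covered by an axis-aligned rectangle. The function name `rasterize_rect`, the unit-square pixel layout, and the restriction to rectangles are assumptions made for illustration; this is not the thesis's implementation.

```python
def rasterize_rect(x0, y0, x1, y1, width, height):
    """Return a height x width grid of coverage fractions in [0, 1].

    Assumption: pixel (col, row) covers the unit square
    [col, col + 1] x [row, row + 1], and the rectangle spans
    [x0, x1] x [y0, y1] in the same coordinate system.
    """
    grid = []
    for row in range(height):
        # Length of the overlap between the rectangle and this pixel row.
        dy = max(0.0, min(y1, row + 1) - max(y0, row))
        line = []
        for col in range(width):
            # Length of the overlap between the rectangle and this pixel column.
            dx = max(0.0, min(x1, col + 1) - max(x0, col))
            # Covered area of a unit pixel equals its coverage fraction.
            line.append(dx * dy)
        grid.append(line)
    return grid


# A 1.5 x 1.0 rectangle placed off the pixel grid on a 3 x 2 image.
coverage = rasterize_rect(0.5, 0.5, 2.0, 1.5, 3, 2)
```

Partially covered pixels receive intermediate grey values, and the coverage values sum to the rectangle's area, which is the property that distinguishes area sampling from point sampling.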
452	An Embedded Shading Language Qin, Zheng January 2004 Modern graphics accelerators have embedded programmable components in the form of vertex and fragment shading units. Current APIs permit specification of the programs for these components using an assembly-language level interface. Compilers for high-level shading languages are available, but these read in an external string specification, which can be inconvenient. It is possible, using standard C++, to define an embedded high-level shading language. Such a language can be nearly indistinguishable from a special-purpose shading language, yet permits more direct interaction with the specification of textures and parameters, simplifies implementation, and enables on-the-fly generation, manipulation, and specification of shader programs. An embedded shading language also permits the lifting of C++ host language type, modularity, and scoping constructs into the shading language without any additional implementation effort. Computer Science Shading GPU Shading Language shader vertex shader fragment shader
453	Rendering Antialiased Shadows using Warped Variance Shadow Maps Lauritzen, Andrew Timothy January 2008 Shadows contribute significantly to the perceived realism of an image and provide an important depth cue. Rendering high-quality, antialiased shadows efficiently is a difficult problem. To antialias shadows, it is necessary to compute partial visibilities, but computing these visibilities using existing approaches is often too slow for interactive applications. Shadow maps are a widely used technique for real-time shadow rendering. One major drawback of shadow maps is aliasing, because the shadow map data cannot be filtered in the same way as colour textures. In this thesis, I present variance shadow maps (VSMs). Variance shadow maps use a linear representation of the depth distributions in the shadow map, which enables the use of standard linear texture filtering algorithms. Thus VSMs can address the problem of shadow aliasing using the same highly tuned mechanisms that are available for colour images. Given the mean and variance of the depth distribution, Chebyshev's inequality provides an upper bound on the fraction of a shaded fragment that is occluded, and I show that this bound often provides a good approximation to the true partial occlusion. For more difficult cases, I show that warping the depth distribution can produce multiple bounds, some tighter than others. Based on this insight, I present layered variance shadow maps, a scalable generalization of variance shadow maps that partitions the depth distribution into multiple segments. This reduces or eliminates an artifact ("light bleeding") that can appear when using the simpler version of variance shadow maps. Additionally, I demonstrate exponential variance shadow maps, which combine moments computed from two exponentially warped depth distributions. Using this approach, high-quality results are produced at a fraction of the storage cost of layered variance shadow maps.
These algorithms are easy to implement on current graphics hardware and provide efficient, scalable solutions to the problem of shadow map aliasing. computer science programming graphics GPU shadows shadow map texture filtering antialiasing Computer Science
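The Chebyshev bound mentioned in this abstract can be sketched in a few lines of Python. The sketch uses the one-sided Chebyshev (Cantelli) inequality, P(x ≥ t) ≤ σ² / (σ² + (t − μ)²) for t > μ, applied to the first two moments of a set of depths. The function name, the `min_variance` clamp, and the sample depths are assumptions for illustration, not code from the thesis.

```python
def chebyshev_upper_bound(mean, mean_sq, t, min_variance=1e-6):
    """Upper bound on the fraction of depths that are >= t.

    mean and mean_sq are the filtered first and second moments,
    E[x] and E[x^2]. min_variance is a hypothetical clamp that
    guards against division by zero and rounding error.
    """
    variance = max(mean_sq - mean * mean, min_variance)
    if t <= mean:
        # The inequality only applies for t > mean; the bound is trivial here.
        return 1.0
    d = t - mean
    return variance / (variance + d * d)


# Two depth layers, half of the samples on each (made-up values).
depths = [0.2, 0.2, 0.6, 0.6]
mean = sum(depths) / len(depths)
mean_sq = sum(d * d for d in depths) / len(depths)

bound = chebyshev_upper_bound(mean, mean_sq, 0.6)
true_fraction = sum(1 for d in depths if d >= 0.6) / len(depths)
```

For this two-layer distribution the bound coincides with the true fraction. With other distributions it can be loose, which is consistent with the abstract's remark that some cases are more difficult.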
454	Automatic Parallelization for Graphics Processing Units in JikesRVM Leung, Alan Chun Wai January 2008 Accelerated graphics cards, or Graphics Processing Units (GPUs), have become ubiquitous in recent years. On the right kinds of problems, GPUs greatly surpass CPUs in terms of raw performance. However, GPUs are currently used only for a narrow class of special-purpose applications; the raw processing power available in a typical desktop PC is unused most of the time. The goal of this work is to present an extension to JikesRVM that automatically executes suitable code on the GPU instead of the CPU. Both static and dynamic features are used to decide whether it is feasible and beneficial to off-load a piece of code to the GPU. Feasible code is discovered by an implementation of data dependence analysis. A cost model that balances the speedup available from the GPU against the cost of transferring input and output data between main memory and GPU memory is used to determine whether a feasible parallelization is indeed beneficial. The cost model is parameterized so that it can be applied to different hardware combinations. We also present ways to overcome several obstacles to parallelization inherent in the design of the Java bytecode language: unstructured control flow, the lack of multi-dimensional arrays, the precise exception semantics, and the proliferation of indirect references. Compiler GPU Automatic Parallelization Just In Time Virtual Machine JikesRVM Java Optimization Computer Science
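The trade-off that this abstract's cost model weighs can be illustrated with a deliberately simple Python sketch: off-loading pays off only if the GPU compute time plus the transfer time is less than the CPU time. The function name, every parameter, and the linear latency-plus-bandwidth transfer model are assumptions for illustration; the abstract does not specify the actual model.

```python
def offload_is_beneficial(cpu_time_s, gpu_time_s, bytes_in, bytes_out,
                          bandwidth_bytes_per_s, latency_s=0.0):
    """Return True if running on the GPU is estimated to be faster.

    Hypothetical model: one transfer in each direction, each paying a
    fixed latency, plus the data volume divided by the bus bandwidth.
    """
    transfer_s = 2 * latency_s + (bytes_in + bytes_out) / bandwidth_bytes_per_s
    return gpu_time_s + transfer_s < cpu_time_s


# Large kernel: the 10x compute speedup outweighs 0.2 s of transfers.
big = offload_is_beneficial(1.0, 0.1, 10**8, 10**8, 10**9)

# Small kernel on the same data: the transfers dominate.
small = offload_is_beneficial(0.01, 0.001, 10**8, 10**8, 10**9)
```

Keeping bandwidth and latency as parameters mirrors the abstract's point that the model is parameterized for different hardware combinations.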
455	A Study of Efficiency, Accuracy, and Robustness in Intensity-Based Rigid Image Registration Xu, Lin January 2008 Image registration is widely used in many areas. Applications are usually concerned with the efficiency, accuracy, and robustness of the registration process. This thesis studies these issues by presenting an efficient intensity-based mono-modality rigid 2D-3D image registration method and constructing a novel mathematical model for intensity-based multi-modality rigid image registration. For mono-modality image registration, an algorithm is developed using the RapidMind Multi-core Development Platform (RapidMind) to exploit the highly parallel multi-core architecture of graphics processing units (GPUs). A parallel ray casting algorithm is used to generate the digitally reconstructed radiographs (DRRs), which reduces the complexity of DRR construction. The optimization problem in the registration process is solved by the Gauss-Newton method. To fully exploit the multi-core parallelism, almost the entire registration process is implemented in parallel by RapidMind on GPUs. The implementation of the major computation steps is discussed. Numerical results are presented to demonstrate the efficiency of the new method. For multi-modality image registration, a new model for computing mutual information functions is devised in order to remove the artifacts in the functions and in turn smooth the functions, so that optimization methods can converge to the optimal solutions accurately and efficiently. Motivated by the goal of reconciling the discrepancy between the image representation and the mutual information definition in previous models, the new model computes the mutual information function using both the continuous image function representation and the mutual information definition for continuous random variables. Its implementation and complexity are discussed and compared with other models.
The mutual information computed using the new model appears quite smooth compared with the functions computed by others. Numerical experiments demonstrate the accuracy and efficiency of optimization methods in the case that the new model is used. Furthermore, the robustness of the new model is also verified. Image registration Image processing GPU Mutual information Mathematical model Optimization Computer Science
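For readers unfamiliar with the similarity measure in this record, the following Python sketch computes the standard discrete (joint-histogram) estimate of mutual information between two equally sized intensity sequences. It is not the continuous model proposed in the thesis; the function name and the toy inputs are assumptions for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(a, b):
    """Mutual information in bits between two equal-length sequences.

    Uses the discrete definition
    I(A; B) = sum over (x, y) of p(x, y) * log2(p(x, y) / (p(x) * p(y))),
    with probabilities estimated from counts.
    """
    n = len(a)
    joint = Counter(zip(a, b))   # joint histogram of intensity pairs
    count_a = Counter(a)         # marginal histogram of a
    count_b = Counter(b)         # marginal histogram of b
    mi = 0.0
    for (x, y), c in joint.items():
        # (c / n) / ((count_a / n) * (count_b / n)) simplifies to c * n / (count_a * count_b).
        mi += (c / n) * log2(c * n / (count_a[x] * count_b[y]))
    return mi


# One image fully determines the other: 1 bit of shared information.
dependent = mutual_information([0, 0, 1, 1], [1, 1, 0, 0])

# Intensities are statistically independent: no shared information.
independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

Because this estimate changes in jumps as samples move between histogram bins, it hints at why histogram-based mutual information functions can show the kind of artifacts the abstract sets out to remove.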
456	Towards High Speed Aerial Tracking of Agile Targets Rizwan, Yassir January 2011 In order to provide a novel perspective for videography of high-speed sporting events, a highly capable trajectory tracking control methodology is developed for a custom-designed Kadet Senior Unmanned Aerial Vehicle (UAV). The accompanying high-fidelity system identification ensures that accurate flight models are used to design the control laws. A parallel vision-based target tracking technique is also demonstrated and implemented on a Graphics Processing Unit (GPU) to assist in real-time tracking of the target. Nonlinear control techniques such as feedback linearization require a detailed and accurate system model. This thesis discusses techniques used for estimating these models using data collected during planned test flights. A class of methods known as Output Error Methods is discussed, with extensions for dealing with wind turbulence. Implementation of these methods on the Kadet Senior, including data acquisition details, is also discussed. Results for this UAV are provided. For comparison, additional results using data from a BAC-221 simulation are provided, as well as typical results from the work done at the Dryden Flight Research Center. The proposed controller combines feedback linearization with linear tracking control using the internal model approach, and relies on a trajectory-generating exosystem. Three different aircraft models of increasing complexity are presented, in an effort to identify the simplest controller that yields acceptable performance. The dynamic inversion and linear tracking control laws are derived for each model, and simulation results are presented for tracking of elliptical and periodic trajectories on the Kadet Senior. Mechanical Engineering
457	Design of a Multi-Core Multi-thread Floating-Point Processor and Its Application in Computer Graphics Yeh, Chia-Yu 06 September 2011 Graphics processing unit (GPU) designs usually adopt various computer architecture techniques to boost computation speed, including single-instruction multiple-data (SIMD), very-long-instruction-word (VLIW), multi-threading, and/or multi-core. In OpenGL ES 2.0, a user-programmable vertex shader (VS) hardware unit can be designed using a vector SIMD computation unit so that it can efficiently compute matrix-vector multiplication, one of the key operations in vertex transformation. Recently, high-performance GPUs, such as the Tesla series from NVIDIA, have been designed with many-core architectures in which each core is responsible for scalar operations. The intention is to allow efficient execution of general-purpose computations in addition to the specialized graphics computations. In this thesis, we design a scalar-based multi-threaded GPU that is composed of four scalar processors and one special-function unit and can execute multi-threaded instructions. We use the example of vertex transformation to demonstrate the execution efficiency of the scalar-based multi-threaded GPU. We also compare it with the vector-based SIMD GPU. multi-threading graphics processing unit (GPU) vertex shader SIMD matrix-vector multiplication OpenGL ES 2.0
458	GPU Based Digital Coherent Receiver for Optical transmission system Hsiao, Hsiang-Hung 18 July 2012 Coherent optical fiber communication technology is attracting significant attention worldwide because it can realize spectrally efficient transmission systems. One major difference between the coherent technology of the 1980s and the latest coherent technology is the use of digital signal processing (DSP). In the 1980s, an optical phase-locked loop (OPLL) was required to realize homodyne detection, and it was very difficult to implement. The latest coherent technology uses DSP in place of the OPLL to realize homodyne detection, which is much easier. Real-time realization of the DSP is still a problem. Because the DSP processes the signal in software, it needs extreme computational power for a high-speed communication system. A field-programmable gate array (FPGA) is commonly used to realize real-time DSP, but at present the FPGA is too expensive for a commercial system. This master's thesis intends to use a commercially available personal computer (PC) containing a GPU computation board to replace the FPGA, which can reduce the cost of the coherent receiver. In addition, the receiver is defined by software rather than hardware, which makes it flexible. Phase estimation Digital signal processing Coherent detection Digital coherent receiver QPSK GPU
459	Real-time Water Waves with Wave Particles Yuksel, Cem August 2010 This dissertation describes the wave particles technique for simulating water surface waves and two-way fluid-object interactions for real-time applications, such as video games. Water exists in various forms in our environment, and it is important to develop the technologies needed to incorporate all these forms in real-time virtual environments. Handling the behavior of large bodies of water, such as an ocean, lake, or pool, has been computationally expensive with traditional techniques even for offline graphics applications, because of the high resolution requirements of these simulations. A significant portion of water behavior for large bodies of water is the surface wave phenomenon. This dissertation discusses how water surface waves can be simulated efficiently and effectively at real-time frame rates using a simple particle system that we call "wave particles." This approach offers a simple, fast, and unconditionally stable solution to wave simulation. Unlike traditional techniques that try to simulate the water body (or its surface) as a whole with numerical techniques, wave particles merely track the deviations of the surface due to waves, forming an analytical solution. This allows simulation of seemingly infinite water surfaces, like an open ocean. Both the theory and implementation of wave particles are discussed in great detail. Two-way interactions of floating objects with water are explained, including generation of waves due to object interaction and proper simulation of the effect of water on the object motion. Timing studies show that the method is scalable, allowing simulation of wave interaction with several hundred objects at real-time rates. wave particles waves real-time simulation fluid-object interaction GPU algorithms
460	Accelerated Ray Tracing Using Programmable Graphics Pipelines Es, S. Alphan 01 January 2008 Graphics hardware has evolved from simple feed-forward triangle rasterization devices to flexible, programmable, and powerful parallel processors. This evolution allows researchers to use graphics processing units (GPUs) for both general-purpose computations and advanced graphics rendering. Sophisticated GPUs hold great opportunities for the acceleration of computationally expensive photorealistic rendering methods. Rendering photorealistic images in real time is a challenge. In this work, we investigate efficient ways to utilize GPUs for real-time photorealistic rendering. Specifically, we study uniform-grid-based ray tracing acceleration methods and GPU-friendly traversal algorithms. We show that our method is faster than or competitive with other GPU-based ray tracing acceleration techniques. The proposed approach is also applicable to the fast rendering of volumetric data. Additionally, we devise GPU-based solutions for real-time stereoscopic image generation that can be used in conjunction with GPU-based ray tracers. QA Computer Software 76.75-76.765
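To make the idea of uniform grid traversal in this record concrete, the following Python sketch steps a 2D ray through a grid of unit cells with a classic DDA-style incremental traversal. It is a generic CPU illustration, not the GPU-friendly algorithm developed in the thesis. The function name, the 2D setting, the unit cell size, and the requirement that the ray starts inside the grid are all assumptions.

```python
import math


def traverse_grid(origin, direction, grid_size, max_steps=64):
    """Return the (ix, iy) cells visited by a 2D ray, in order.

    Assumptions: cells are unit squares, the grid spans
    [0, nx] x [0, ny], and origin lies inside the grid.
    """
    ox, oy = origin
    dx, dy = direction
    nx, ny = grid_size
    ix, iy = math.floor(ox), math.floor(oy)
    step_x = 1 if dx > 0 else -1
    step_y = 1 if dy > 0 else -1
    # Ray parameter t at which the ray crosses the next boundary on each axis.
    next_x = ix + 1 if dx > 0 else ix
    next_y = iy + 1 if dy > 0 else iy
    t_max_x = (next_x - ox) / dx if dx != 0 else math.inf
    t_max_y = (next_y - oy) / dy if dy != 0 else math.inf
    # Increase in t needed to cross one whole cell along each axis.
    t_delta_x = abs(1.0 / dx) if dx != 0 else math.inf
    t_delta_y = abs(1.0 / dy) if dy != 0 else math.inf

    cells = []
    for _ in range(max_steps):
        if not (0 <= ix < nx and 0 <= iy < ny):
            break  # the ray has left the grid
        cells.append((ix, iy))
        # Advance across whichever cell boundary the ray reaches first.
        if t_max_x < t_max_y:
            ix += step_x
            t_max_x += t_delta_x
        else:
            iy += step_y
            t_max_y += t_delta_y
    return cells


horizontal = traverse_grid((0.5, 0.5), (1.0, 0.0), (3, 3))
diagonal = traverse_grid((0.5, 0.5), (1.0, 2.0), (3, 3))
```

In a ray tracer, each visited cell would be tested against the primitives it stores, and traversal would stop at the first hit. The loop needs only a comparison and an addition per step, which is one reason uniform grids are attractive for parallel hardware.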
