Global ETD Search

11	A Seamless Journey Neuner, Stefanie 01 January 2006 (has links) My quilts serve as a visual journal of some of my travels and experiences. Through my quilts, I remember the rich colors and memories of my European adventures. The methodical process of stitching quilts line by line is an important process for my recollection of the many steps taken during my trip abroad. Stitching fabric is the method that communicates the opportunities and experiences of my travel that I want to share with others. stamp stencil decorative art thread sew assemblage applique Art and Design Arts and Humanities Interdisciplinary Arts and Media
12	Performance optimization of geophysics stencils on HPC architectures / Optimização de desempenho de estênceis geofísicos sobre arquiteturas HPC Abaunza, Víctor Eduardo Martínez January 2018 (has links) Wave modeling is a crucial tool in geophysics for efficient strong-motion (earthquake) analysis, risk mitigation, and oil and gas exploration. Owing to its simplicity and numerical efficiency, the finite-difference method is one of the standard techniques used to solve the wave propagation equations. Applications of this kind are known as stencils because they consist of a pattern that replicates the same computation across a multidimensional domain. High Performance Computing is required to solve this class of problems because of the large number of grid points involved in three-dimensional simulations of the subsurface. Optimizing the performance of stencil computations is challenging and depends strongly on the underlying architecture. This work has two parts. First, on multicore architectures, we analyzed the standard OpenMP implementation of numerical kernels from the 3D heat transfer model (a 7-point Jacobi stencil) and from the Ondes3D code (a full-fledged seismic simulator developed by the French Geological Survey, the Bureau de Recherches Géologiques et Minières). We considered two well-known implementations (naïve and space blocking) to find correlations between the input configuration parameters at runtime and the computing performance; we then proposed a Machine Learning-based approach to evaluate, predict, and improve the performance of these stencil models on the underlying architecture. We also used an acoustic wave propagation model provided by Petrobras and predicted its performance with high accuracy (up to 99%) on multicore architectures.
Second, on heterogeneous architectures, we analyzed the standard CUDA implementation of the seismic wave propagation model to find which factors affect performance as the number of accelerators increases; we then proposed a task-based implementation to improve performance according to a set of runtime configuration parameters (scheduling algorithm, size, and number of tasks), and we compared the performance of the classical CPU-only and GPU-only versions with the results obtained on heterogeneous architectures. Our results show a significant speedup (up to 25) compared with the best implementation available for multicore architectures. Geoinformatics HPC Machine Learning Performance improvement Performance Simulation Stencil Computations Heterogeneous Architectures Multicore
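The two implementations named in the abstract above (naïve and space blocking) can be illustrated with a small sketch. This is not code from the thesis: the equal-weight average, the flat-list grid layout, and the names `index`, `update_point`, `jacobi_naive`, and `jacobi_blocked` are assumptions made for illustration, and pure Python stands in for the OpenMP kernels the abstract refers to.

```python
def index(x, y, z, n):
    """Map 3D coordinates to a position in a flat list holding an n*n*n grid."""
    return (x * n + y) * n + z


def update_point(src, dst, x, y, z, n):
    """7-point update: the cell and its six face neighbours (equal weights assumed)."""
    dst[index(x, y, z, n)] = (
        src[index(x, y, z, n)]
        + src[index(x - 1, y, z, n)] + src[index(x + 1, y, z, n)]
        + src[index(x, y - 1, z, n)] + src[index(x, y + 1, z, n)]
        + src[index(x, y, z - 1, n)] + src[index(x, y, z + 1, n)]
    ) / 7.0


def jacobi_naive(src, n):
    """One Jacobi sweep over the interior, visiting cells in plain x, y, z order."""
    dst = list(src)  # boundary cells are copied unchanged
    for x in range(1, n - 1):
        for y in range(1, n - 1):
            for z in range(1, n - 1):
                update_point(src, dst, x, y, z, n)
    return dst


def jacobi_blocked(src, n, block):
    """Same sweep, but the interior is visited one block*block*block tile at a time."""
    dst = list(src)
    starts = range(1, n - 1, block)  # first interior coordinate of each tile
    for xb in starts:
        for yb in starts:
            for zb in starts:
                for x in range(xb, min(xb + block, n - 1)):
                    for y in range(yb, min(yb + block, n - 1)):
                        for z in range(zb, min(zb + block, n - 1)):
                            update_point(src, dst, x, y, z, n)
    return dst


if __name__ == "__main__":
    n = 6
    grid = [float((i * 37) % 11) for i in range(n ** 3)]
    print(jacobi_naive(grid, n) == jacobi_blocked(grid, n, block=2))
```

Both functions read only from `src` and write only to `dst`, so the visiting order does not change the result; blocking changes only which cells are touched close together in time, which is what affects cache behaviour in a compiled implementation.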
13	The Study of Metal Diffusion on Si(001) using a Nanostencil Shadow Mask To, Nelson 25 August 2011 (has links) A self-aligning nanostencil mask is used to fabricate circular features of tin, indium and silver on an atomically clean Si(001) substrate. The shadow mask limits deposited material to areas under openings in the mask, leaving adjacent clean areas for material to diffuse. STM, SEM and AFM have been used to study the surface diffusion of these metals in UHV. The diffusion of tin is relatively limited in comparison to the other metals. Indium forms metal islands that dissolve over time and contribute to the spreading of a surrounding single layer film. Lastly, silver forms a film that spreads even in the absence of metal islands. Surface diffusion metal films nanostencil shadow mask stencil thin films lithography 0794
16	MICRO/NANOSTRUCTURED SURFACES THROUGH THIN FILM STENCIL LIFT-OFF: APPLICATIONS TO PATTERNING AND SENSING Zhu, Yujie January 2017 (has links) The rapid development of micro/nanofabrication techniques has enabled the engineering of material interfacial properties. Micro/nanostructures with unique electrical, mechanical, thermal, magnetic, optical, and biological properties have found applications in a wide range of fields such as electronics, photonics, biological/chemical sensing, tissue engineering, and diagnostics. As such, numerous strategies have been developed for structuring materials at the micro/nanoscale. However, challenges remain in the high cost, low throughput, fabrication complexity, and difficulty of scaling up. This thesis aims to explore fabrication strategies for micro/nanostructured surfaces that are versatile, simple, and inexpensive. The thin film stencil lift-off technique with both Parylene and self-adhesive vinyl has been explored for this purpose. Further applications of the resulting micro/nanostructured surfaces are also presented in this thesis. Through an improved Parylene stencil fabrication process, both spontaneously phase-segregated and arbitrary binary supported lipid bilayer (SLB) patterns have been achieved. The microstructured Parylene surfaces have also been demonstrated for patterning stacked SLBs that are either homogeneous or phase-segregated. Without any lithography technique involved, vinyl stencil lift-off offers a facile and inexpensive benchtop method for patterning thin films such as metal and glassy films. Combined with the thermal shrinking of a shape memory polymer, the patterned feature sizes are further decreased by 60% in both the x and y dimensions, pushing the patterning resolution down to the sub-100 μm range. In addition, the shrinking process induces micro/nanostructures on the deposited thin film, and the structure sizes are easily tunable through the thickness of the deposited film.
Further applications of such patterned micro/nanostructured surfaces have also been explored. The structured gold films have served as high-surface-area electrodes for electrochemical sensing. By introducing photoresist as a sacrificial layer, the structured gold thin films can be lifted off and transferred onto an elastomeric substrate, where they serve as stretchable and flexible sensors. Such sensing devices exhibit great stability and reproducibility even when working under external strain. Finally, the micro/nanostructured glassy surfaces have been employed as substrates for cell growth to study the effect of topography on cell morphology. It has been concluded that rougher surfaces lead to cell elongation, and finer structures promote filopodia generation. These results underscore the strength and suitability of thin film stencil lift-off as a powerful technique for creating micro- and nanostructured surfaces. These structured surfaces could find applications in many other areas, owing to properties such as tunable structure size, high surface area, flexibility, and long-term stability. / Thesis / Doctor of Philosophy (PhD) Micro/nano- structured surfaces Polymer stencil lift-off Supported lipid bilayer Micropatterning Electrochemical sensing
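The 60% shrink figure quoted in the abstract above can be turned into a short worked calculation. This is only an arithmetic sketch: the function names and the 250 μm input are hypothetical, and the abstract itself gives no specific starting feature sizes.

```python
from fractions import Fraction


def shrunk_size(size_um, shrink_percent=60):
    """Linear feature size after shrinking by shrink_percent in one dimension."""
    return size_um * Fraction(100 - shrink_percent, 100)


def area_ratio(shrink_percent=60):
    """Remaining area fraction when x and y both shrink by shrink_percent."""
    return Fraction(100 - shrink_percent, 100) ** 2


if __name__ == "__main__":
    # Hypothetical 250 um feature: a 60% reduction leaves 40% of the length.
    print(float(shrunk_size(250)))
    # Shrinking both dimensions leaves 0.4 * 0.4 of the original area.
    print(float(area_ratio()))
```

Under this reading, any pattern drawn below 250 μm would land in the sub-100 μm range after shrinking, and the patterned area falls to 16% of its original value.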
17	Optimization of Stencil Computations on GPUs Rawat, Prashant Singh 10 August 2018 (has links) No description available. Computer Science Stencil Computations GPGPU Register Pressure Fusion Tiling Instruction Reordering
18	Performance Optimization of Memory-Bound Programs on Data Parallel Accelerators Sedaghati Mokhtari, Naseraddin 08 June 2016 (has links) No description available. Computer Science Computer Engineering Engineering
19	On the Programmability and Performance of OpenCL Designs for FPGA Verma, Anshuman 09 February 2018 (has links) Field programmable gate arrays (FPGAs) have been emerging as a promising platform for several types of accelerators that span domains such as finance, web search, and data center networking, among others. Research interest in developing accelerators on FPGAs is increasing significantly, in particular because of their effectiveness across a variety of applications, their flexibility, and their high performance per watt. However, several key challenges remain that hinder their large-scale deployment. Overcoming these challenges would enable them to match the pervasiveness of graphics processing units (GPUs), their principal competitors in this arena. One of the primary reasons for slow adoption by programmers has been the programming model, which uses a low-level hardware description language (HDL). Using HDLs requires a detailed understanding of logic design and significant effort to implement and verify the behavioral models, with the verification effort growing with the complexity of the design. Recent advancements in high-level synthesis (HLS) tools have addressed this challenge to a considerable extent by allowing programmers to write their applications in a high-level language named OpenCL. These applications are then compiled and synthesized to create a bitstream that configures the FPGA. This thesis characterizes the efficacy of HLS compiler optimizations that can be employed to improve the performance of these applications. The hardware synthesized from OpenCL kernels is fundamentally different from traditional hardware such as CPUs and GPUs, which exploit instruction level parallelism (ILP), thread level parallelism (TLP), or data level parallelism (DLP) for performance gains. FPGAs typically use deep pipelining (i.e., ILP) for performance.
A stall in this pipeline may severely undermine the performance of applications. Thus, it is imperative to identify and remove any such bottlenecks. To this end, this thesis presents and discusses a software-centric framework to debug and profile the synthesized designs generated using HLS tools. This thesis proposes basic code patterns, including a timestamp and a scalable framework, which can be plugged easily into OpenCL kernels to collect and process run-time information dynamically. This scalable framework has a small overhead in area utilization and frequency but provides fine-grained information about the bottlenecks and latencies in the design. Additionally, although HLS tools have improved programmability, this may come at the cost of performance or area utilization. This thesis addresses this design trade-off via a comparative study of a hand-coded design in HDL and an architecturally similar, tool-generated design using an OpenCL compiler in the application area of 3D-stencil (i.e., structured grid) computation. Experiments in this thesis show that the OpenCL approach can achieve 95% of the peak attainable performance of a microkernel for multiple problem sizes. In comparison to the OpenCL approach, the HDL approach results in approximately 50% less memory usage and only 2% better performance on average. / MS / A hardware chip consists of switches, or transistors, and a modern chip can have a few billion of them. Specifying the interconnection among these transistors and their placement on a chip is a complex problem. To simplify this, the chip-design flow uses automated tools and abstraction at the different levels of the flow, such as architecture, design, synthesis, and placement, among others. During design, an engineer specifies the behavioral model in a hardware description language (HDL), which is later used by the automated tools for further processing.
Using an HDL requires a detailed understanding of logic design and significant effort to implement and verify the behavioral models, with the verification effort growing with the complexity of the design. Recent advancements in high-level synthesis tools have addressed this challenge to a considerable extent by allowing programmers to write their applications in a high-level language. This thesis characterizes the efficacy of such a tool and the available optimizations that can be employed to improve the performance of these applications. Additionally, this thesis presents and discusses a framework to debug and profile the designs generated using high-level synthesis tools, which can be plugged easily into an application to collect and process run-time information dynamically. This scalable framework has a small overhead but provides fine-grained information about the bottlenecks in the design. Furthermore, the experiments in this work show that a design generated by a high-level synthesis tool has performance similar to that of a manual design in HDL, at the expense of area utilization. Field programmable gate arrays OpenCL HDL HLS AOCL Verilog Accelerator GEM Stencil
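The abstract above notes that FPGAs rely on deep pipelining and that stalls can severely undermine performance. A rough way to see why is a simplified cycle-count model, sketched below. The model and the names `pipeline_cycles`, `depth`, `initiation_interval`, and `stall_cycles` are illustrative assumptions, not the profiling framework or measurements described in the thesis.

```python
def pipeline_cycles(items, depth, initiation_interval=1, stall_cycles=0):
    """Cycles needed to push `items` work-items through a pipeline.

    The first item needs `depth` cycles to fill the pipeline; after that,
    one item completes every `initiation_interval` cycles. Any extra
    `stall_cycles` (for example, waiting on memory) are added on top.
    """
    if items <= 0:
        return 0
    return depth + (items - 1) * initiation_interval + stall_cycles


if __name__ == "__main__":
    ideal = pipeline_cycles(items=1000, depth=100)
    # Assume every item after the first is delayed by one extra cycle.
    stalled = pipeline_cycles(items=1000, depth=100, initiation_interval=2)
    print(ideal, stalled, stalled / ideal)
```

In this model the pipeline depth is paid once, while a per-item stall is paid for every item, so even a one-cycle recurring stall roughly doubles the run time for large inputs. This is the kind of effect that fine-grained timestamps inside a kernel are meant to expose.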
20	Tiling Stencil Computations To Maximize Parallelism Bandishti, Vinayaka Prakasha 12 1900 (has links) (PDF) Stencil computations are iterative kernels often used to simulate the change in a discretized spatial domain over time (e.g., computational fluid dynamics) or to solve for unknowns in a discretized space by converging to a steady state (i.e., partial differential equations). They are commonly found in many scientific and engineering applications. Most stencil computations allow tile-wise concurrent start, i.e., there exists a face of the iteration space and a set of tiling hyperplanes such that all tiles along that face can be started concurrently. This provides load balance and maximizes parallelism. Loop tiling is a key transformation used to exploit both data locality and parallelism from stencils simultaneously. Numerous works target improving locality, controlling the frequency of synchronization, and reducing the volume of communication wherever applicable. However, concurrent start-up of tiles, which translates into perfect load balance and often a lower synchronization frequency, has been completely ignored. Existing automatic tiling frameworks often choose hyperplanes that lead to pipelined start-up and load imbalance. We address this issue with a new tiling technique that ensures concurrent start-up as well as perfect load balance whenever possible. We first provide necessary and sufficient conditions on tiling hyperplanes to enable concurrent start for programs with affine data accesses. We then discuss an iterative approach to find such hyperplanes. It is not possible to apply automatic tiling techniques directly to periodic stencils because of their wrap-around dependences. To overcome this, we use iteration space folding as a pre-processing stage, after which our technique can be applied without any further change. We have implemented our techniques on top of Pluto, a source-level automatic parallelizer.
Experimental evaluation on a 12-core Intel Westmere shows that our code outperforms a tuned domain-specific stencil code generator by 4% to 2x, and previous compiler techniques by a factor of 1.5x to 15x. For the swim benchmark from SPECFP2000, we achieve an improvement of 5.12x on a 12-core Intel Westmere machine and 2.5x on a 16-core AMD Magny-Cours machine over the auto-parallelizer of the Intel C Compiler. Stencil Computations Concurrent Start-Up Tiling Hyperplanes Periodic Stencils Compilers (Computer Programs) Multiprocessors Computer Architecture Parallelism (Computer Architecture) Tiling Stencil Computations Automatic Parallelizers Computer Science
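The difference between pipelined and concurrent start-up described in the abstract above can be made concrete with a small experiment on a 1D, 3-point time-iterated stencil. The sketch below is an illustration under assumptions, not the thesis's algorithm or the Pluto implementation: it picks two sets of hyperplanes by hand, a time-skewed pair `(1, 0), (1, 1)` and a diamond-shaped pair `(1, 1), (1, -1)`, and counts the tiles that have no predecessor tile and can therefore start immediately. The function name `count_start_tiles` and the problem sizes are hypothetical.

```python
def count_start_tiles(h1, h2, steps, width, tile):
    """Count tiles with no incoming inter-tile dependence.

    The iteration space is (t, i) with 0 <= t < steps and 0 <= i < width.
    Point (t, i) reads (t-1, i-1), (t-1, i), and (t-1, i+1).
    A point belongs to the tile (h1.(t, i) // tile, h2.(t, i) // tile).
    """
    def tile_of(t, i):
        return ((h1[0] * t + h1[1] * i) // tile,
                (h2[0] * t + h2[1] * i) // tile)

    predecessors = {}  # tile -> set of other tiles it must wait for
    for t in range(steps):
        for i in range(width):
            current = tile_of(t, i)
            waits_for = predecessors.setdefault(current, set())
            if t == 0:
                continue  # the first time step reads only initial data
            for offset in (-1, 0, 1):
                j = i + offset
                if 0 <= j < width:
                    source = tile_of(t - 1, j)
                    if source != current:
                        waits_for.add(source)
    return sum(1 for waits_for in predecessors.values() if not waits_for)


if __name__ == "__main__":
    skewed = count_start_tiles((1, 0), (1, 1), steps=16, width=32, tile=4)
    diamond = count_start_tiles((1, 1), (1, -1), steps=16, width=32, tile=4)
    print("time-skewed start tiles:", skewed)
    print("diamond start tiles:", diamond)
```

With the time-skewed hyperplanes only one tile can begin at the start, and the rest become ready in a pipeline; with the diamond-shaped hyperplanes a whole row of tiles along the first time step is ready at once, which is the concurrent start property the abstract describes.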

Search results