  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Exploration of water-based inks in fine art screenprinting

Adams, Irena Zdena January 1998
No description available.
2

Wall and ceiling stenciling in American Victorian homes

Nissen, Diana Jo January 2010 (has links)
Includes nineteenth-century axioms and rules of color, and illustrated stenciling procedure. / Digitized by Kansas Correctional Industries
3

The complexity of simplicity

Whitely, Jette. January 1983
Thesis (M.F.A.)--Rochester Institute of Technology, 1983. / Typescript. Includes bibliographical references (leaf 27).
4

Tiling and Asynchronous Communication Optimizations for Stencil Computations

Malas, Tareq Majed Yasin 07 December 2015
The importance of stencil-based algorithms in computational science has focused attention on optimized parallel implementations for multilevel cache-based processors. Temporal blocking schemes leverage the large bandwidth and low latency of caches to accelerate stencil updates and approach theoretical peak performance. A key ingredient is the reduction of data traffic across slow data paths, especially the main memory interface. Most of the established work concentrates on updating separate cache blocks per thread, which works on all types of shared memory systems, regardless of whether there is a shared cache among the cores. This approach is memory-bandwidth limited in several situations, where the cache space for each thread can be too small to provide sufficient in-cache data reuse. We introduce a generalized multi-dimensional intra-tile parallelization scheme for shared-cache multicore processors that results in a significant reduction of cache size requirements and shows a large saving in memory bandwidth usage compared to existing approaches. It also provides data access patterns that allow efficient hardware prefetching. Our parameterized thread groups concept provides a controllable trade-off between concurrency and memory usage, shifting the pressure between the memory interface and the Central Processing Unit (CPU). We also introduce an efficient diamond tiling structure for both shared-memory cache blocking and distributed-memory relaxed-synchronization communication, demonstrated using one-dimensional domain decomposition. We describe the approach and the implementation details of our open-source testbed (called Girih), present performance results on contemporary Intel processors, and apply advanced performance modeling techniques to reconcile the observed performance with hardware capabilities. Furthermore, we conduct a comparison with the state-of-the-art stencil frameworks PLUTO and Pochoir in shared memory, using corner-case stencil operators.
We study the impact of the diamond tile size on computational intensity, cache block size, and energy consumption. The impact of computational intensity on power dissipation in the CPU and in the DRAM is investigated, showing that DRAM power is a decisive factor for energy consumption on the Intel Ivy Bridge processor and that it is strongly influenced by the computational intensity. Moreover, we show that the highest performance does not necessarily lead to the lowest energy, even when the clock speed is fixed. We apply our approach to an electromagnetic simulation application for solar cell development, demonstrating several-fold speedup compared to an efficient spatially blocked variant. Finally, we discuss the integration of our approach with other techniques for future High Performance Computing (HPC) systems, which are expected to be even more memory-bandwidth-starved, with a deeper memory hierarchy.
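The cache-blocking idea in the abstract above can be sketched in miniature: a toy 1D 3-point stencil where each tile, padded with a halo as wide as the number of fused time steps, is advanced several steps before moving on, instead of streaming the whole array once per step. This is an illustrative redundant-halo variant, not the thesis's diamond tiling or the Girih implementation; all names here are hypothetical.

```python
import numpy as np

def sweep(u):
    # One 3-point Jacobi-style update on the interior; the two endpoints
    # act as fixed boundary values.
    v = u.copy()
    v[1:-1] = (u[:-2] + u[1:-1] + u[2:]) / 3.0
    return v

def naive(u, t):
    # Reference: each time step streams the whole array through memory.
    for _ in range(t):
        u = sweep(u)
    return u

def temporally_blocked(u, t, block=32):
    # Redundant-halo temporal blocking: each tile (plus a halo of width t)
    # is loaded once and advanced t steps before moving on. Cells within
    # distance t of a stale tile edge are polluted, so only the core
    # [start, stop) is written back -- that region is always exact.
    n = len(u)
    out = u.copy()
    for start in range(1, n - 1, block):
        stop = min(start + block, n - 1)
        lo, hi = max(start - t, 0), min(stop + t, n)
        tile = u[lo:hi].copy()
        for _ in range(t):
            tile = sweep(tile)
        out[start:stop] = tile[start - lo:stop - lo]
    return out
```

Both functions produce identical results; the blocked variant trades a small amount of redundant halo computation for far fewer passes over main memory.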
5

Exploring Performance Portability for Accelerators via High-level Parallel Patterns

Hou, Kaixi 27 August 2018
Nowadays, parallel accelerators have become prominent and ubiquitous, e.g., multi-core CPUs, many-core GPUs (Graphics Processing Units), and Intel Xeon Phi. The performance gains from them can be as high as many orders of magnitude, attracting extensive interest from many scientific domains. However, the gains are closely followed by two main problems: (1) a complete redesign of existing codes might be required if a new parallel platform is used, leading to a nightmare for developers, and (2) parallel codes that execute efficiently on one platform might be inefficient or even non-executable on another, causing portability issues. To handle these problems, this dissertation proposes a general approach using parallel patterns, an effective abstraction layer that eases the generation of efficient parallel codes for given algorithms and across architectures. From algorithms to parallel patterns, we exploit domain expertise to analyze the computational and communication patterns in the core computations and represent them in a DSL (Domain Specific Language) or as algorithmic skeletons. This preserves the essential information, such as data dependencies and types, for subsequent parallelization and optimization. From parallel patterns to actual codes, we use a series of automation frameworks and transformations to determine which levels of parallelism can be used, what the optimal instruction sequences are, how the implementation changes to match different architectures, and so on. Experiments on several important computational kernels, including sort (and segmented sort), sequence alignment, and stencils, across various parallel platforms (CPUs, GPUs, Intel Xeon Phi), demonstrate the effectiveness of our approach. / Ph. D.
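The "parallel pattern" layer described above can be illustrated with a toy algorithmic skeleton: the pattern (an elementwise map) is written once, and only the backend binding changes per platform. This is a generic sketch under assumed names, not the dissertation's actual frameworks.

```python
from concurrent.futures import ThreadPoolExecutor

def map_skeleton(f, xs, backend="serial"):
    # Toy algorithmic skeleton: the *what* (apply f to every element) is
    # fixed; the *how* (serial loop vs. thread pool) is chosen per platform.
    if backend == "serial":
        return [f(x) for x in xs]
    if backend == "threads":
        with ThreadPoolExecutor() as ex:
            return list(ex.map(f, xs))
    raise ValueError(f"unknown backend: {backend}")
```

A real skeleton framework would add further backends (e.g., GPU kernels) behind the same interface, which is precisely what makes the pattern portable.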
6

VLSI physical design automation for double patterning and emerging lithography

Yuan, Kun, 1983- 07 February 2011
Due to aggressive scaling in the semiconductor industry, the traditional optical lithography system faces great challenges in printing 32 nm and below circuit layouts. Various promising nanolithography techniques have been developed as alternative solutions for patterning sub-32 nm feature sizes. This dissertation studies physical-design-related optimization problems for these emerging methodologies, focusing mainly on double patterning and electron beam lithography. Double Patterning Lithography (DPL) decomposes a single layout into two masks and patterns the chip in two exposure steps. As a benefit, the pitch size is doubled, which enhances the resolution. However, the decomposition process is not a trivial task; conflicts and stitches are its two main manufacturing challenges. First, a post-routing layout decomposer has been developed to perform simultaneous conflict and stitch minimization, making use of integer linear programming and efficient graph reduction techniques. Compared to previous work, which optimizes conflicts and stitches separately, the proposed method produces significantly better results. Redundant via insertion, another key yield-improvement technique, may increase the complexity of DPL compliance: it can easily introduce unmanufacturable conflicts if not carefully planned and inserted. Two algorithms have been developed to handle this redundant-via DPL-compliance problem on the design side. When the design itself is not DPL-friendly, post-routing decomposition may not achieve a satisfactory solution quality. An efficient framework, WISDOM, has been further proposed to perform wire spreading for better conflict and stitch elimination. It improves the solution quality to a great extent, with only small extra layout perturbations. As another promising solution for sub-22 nm, Electron Beam Lithography (EBL) is a maskless technology that writes desired patterns directly onto a silicon wafer with a charged particle beam. EBL overcomes the diffraction limit of light in the current optical lithography system; however, its low throughput remains the key technical hurdle. The last work of this dissertation formulates and investigates a bin-packing problem for reducing the processing time of EBL.
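The abstract does not spell out its EBL bin-packing formulation; as a generic illustration of the problem class, the classic first-fit-decreasing heuristic packs items into as few capacity-limited bins as it can:

```python
def first_fit_decreasing(sizes, capacity):
    # FFD heuristic: consider items largest-first and place each into the
    # first open bin with enough room, opening a new bin when none fits.
    bins = []
    for s in sorted(sizes, reverse=True):
        for b in bins:
            if sum(b) + s <= capacity:
                b.append(s)
                break
        else:  # no existing bin could hold the item
            bins.append([s])
    return bins
```

FFD is a well-known approximation (never worse than roughly 11/9 of the optimal bin count); the thesis's actual formulation for EBL shot scheduling is likely more specialized.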
7

Pigments and pianos: painter and varnisher Lyman White

Garcia, Rebecca J. January 2007
Thesis (M.A.)--University of Delaware, 2007. / Principal faculty advisor: Brock W. Jobe, Winterthur Program in Early American Culture. Includes bibliographical references.
8

ACCTuner: OpenACC Auto-Tuner For Accelerated Scientific Applications

Alzayer, Fatemah 17 May 2015
We optimize parameters in OpenACC clauses for a stencil evaluation kernel executed on Graphical Processing Units (GPUs) using a variety of machine learning and optimization search algorithms, individually and in hybrid combinations, and compare execution time performance to the best possible obtained from brute force search. Several auto-tuning techniques – historic learning, random walk, simulated annealing, Nelder-Mead, and genetic algorithms – are evaluated over a large two-dimensional parameter space, consisting of gang size and vector length, not satisfactorily addressed to date by OpenACC compilers. A hybrid of historic learning and Nelder-Mead delivers the best balance of high performance and low tuning effort. GPUs are employed over an increasing range of applications due to the performance available from their large number of cores, as well as their energy efficiency. However, writing code that takes advantage of their massive fine-grained parallelism requires deep knowledge of the hardware, and is generally a complex task involving program transformation and the selection of many parameters. To improve programmer productivity, the directive-based programming model OpenACC was announced as an industry standard in 2011. Various compilers have been developed to support this model, the most notable being those by Cray, CAPS, and PGI. While the architecture and number of cores have evolved rapidly, the compilers have failed to keep up in configuring the parallel program to run most efficiently on the hardware. Following successful approaches to obtain high performance in kernels for cache-based processors using auto-tuning, we approach this compiler-hardware gap in GPUs by employing auto-tuning for the key parameters "gang" and "vector" in OpenACC clauses.
We demonstrate results for a stencil evaluation kernel typical of seismic imaging over a variety of realistically sized three-dimensional grid configurations, with different truncation error orders in the spatial dimensions. Apart from random walk and historic learning based on nearest neighbor in grid size, most of our heuristics, including the one that proves best, appear to be applied in this context for the first time. This work is a stepping-stone towards an OpenACC auto-tuning framework for more general high-performance numerical kernels optimized for GPU computations.
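As an illustration of the tuning loop described above, here is a minimal random-walk search over a hypothetical (gang, vector) grid, with a mock cost function standing in for a timed kernel run; the thesis's historic-learning and Nelder-Mead hybrids are more sophisticated, and all names here are assumptions.

```python
import random

def autotune(cost, gangs, vectors, iters=500, seed=0):
    # Random-walk auto-tuning: repeatedly sample a (gang, vector) pair from
    # the grid, evaluate its cost (e.g., measured runtime), and keep the
    # best configuration seen so far.
    rng = random.Random(seed)
    best = (rng.choice(gangs), rng.choice(vectors))
    best_cost = cost(*best)
    for _ in range(iters):
        cand = (rng.choice(gangs), rng.choice(vectors))
        c = cost(*cand)
        if c < best_cost:
            best, best_cost = cand, c
    return best, best_cost
```

Random walk needs no model of the search space, which makes it a common baseline; smarter heuristics (Nelder-Mead, simulated annealing) aim to reach the same optimum with far fewer kernel timings.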
9

Development of the charge pumping technique for the in-situ characterization of Si nanocrystals synthesized locally in SiO2 by ultra-low-energy ion-beam synthesis and stencil lithography

Diaz, Regis 04 November 2011
Renewed industrial interest in non-volatile memories incorporating nanocrystals, illustrated by the market introduction of Freescale's Flexmemory in 90 nm technology, motivates further study of such systems. To that end, we have developed elementary memory cells, namely MOS transistors whose gate oxide contains a granular gate formed by a plane of silicon nanocrystals (Si-ncs) that stores the electric charge. This work presents the main results of these studies, ranging from the fabrication process to the fine characterization of the memory devices. Full control of the fabrication of the Si-nc granular gate by ultra-low-energy ion implantation (ULE-IBS) is accompanied by memory characteristics meeting industrial endurance standards and by a discrimination of the traps responsible for the charging. Storage predominantly in the Si-ncs is demonstrated, which is essential for charge retention. We developed an electrical technique to extract both the amount of charge stored by the Si-ncs and their main structural characteristics (size, density, position in the oxide). This extension of the non-destructive, in-situ "charge pumping" electrical technique makes it possible to monitor the device in operation and, for the first time, to characterize traps (e.g. the Si-ncs) deeper than 3 nm in the oxide. These results were validated by TEM observations. Since the resolution of charge pumping is the single trap, we then coupled ULE-IBS with "stencil" lithography to laterally reduce the number of synthesized Si-ncs. For now, this technique lets us control the local synthesis, at the desired position in the oxide, of 400 nm "pockets" of Si-ncs. The synthesis of "a few" Si-ncs is envisaged in the very short term. We will then be able to fabricate memories with a chosen number of nanocrystals (by SM-ULE-IBS), whose structural properties (size, density, position) and electrical properties (amount of stored charge) will be verified by charge pumping, thus providing powerful tools for the fabrication and characterization of memories with a reduced number of nanocrystals, in particular for gate lengths below 90 nm. / The aim of this thesis has been to fabricate and electrically characterize elementary memory cells containing silicon nanocrystals (Si-ncs), in other words MOSFETs whose insulating layer (SiO2) contains a Si-nc array storing the electrical charge. We have shown that we perfectly control the synthesis of a 2D array of 3-4 nm Si-ncs embedded in the MOSFET oxide by low-energy ion implantation (1-3 keV). Reaching this goal implied two key steps: on the one hand, developing a reliable MOSFET fabrication process incorporating the Si-nc synthesis steps, and on the other hand, developing tools and methods for characterizing both the memory window and the Si-nc array itself. We have developed an in-situ characterization technique based on the well-known charge pumping technique, allowing for the first time the extraction of trap depth (e.g. the Si-nc array) further than 3 nm into the oxide layer, leading to the characterization of both the position of these Si-ncs in the SiO2 matrix and their structural properties (diameter, density). These results have been confirmed by EF-TEM measurements. Finally, we have worked on improving the controlled local synthesis of Si-nc pockets by combining low-energy ion implantation and stencil lithography. We reduced the size of these pockets down to about 400 nm using this parallel, low-cost, and reliable technique, and identified the limiting effect for pocket size reduction. These results pave the way for memory cells containing a few Si-ncs with a well-defined position in the oxide and a well-controlled number of ncs.
10

High-order numerical schemes for turbulent flow simulation on structured and unstructured grids

Cayot, Pierre 26 April 2016
This thesis concerns the development and implementation of high-order Finite Volume numerical schemes for unstructured meshes. The goal is to put in place the numerical ingredients needed to perform Large Eddy Simulations with the elsA solver. The proposed numerical schemes are based on a directional approach, in order to limit the CPU cost and reduce the stencil. The convective part of the scheme must be high order, which is obtained by using different gradients on a predefined four-cell stencil. Two gradients are used for the convective part: the Green-Gauss gradient and the "UIG" gradient. For the diffusive part, the "UIG" gradient is used. This gradient was developed during the thesis and provides a second-order-accurate average gradient at each interface. It was studied and validated on several test cases. The high-order numerical schemes were analyzed theoretically through order and stability analyses, showing that they can reach fifth order on hexahedra and third order on equilateral triangles. Following this analysis, the schemes were first tested in 1D on a classical advection case, then validated on the convection of an isentropic vortex. / This study presents the development and results of high-order Finite Volume schemes for unstructured grids. The goal is to prepare numerical tools to perform Large Eddy Simulations with the industrial solver elsA. These numerical schemes are based on a directional approach in order to limit the CPU cost and reduce the stencil. The convective part of the scheme needs to be high order, which is obtained by the use of gradients on a four-cell stencil. Two gradients are used for the convective part, the Green-Gauss gradient and the "UIG" gradient. For the diffusive part, the "UIG" gradient is used. It was developed during this study and yields a second-order-accurate scheme. This gradient was validated theoretically and numerically on several test cases. The high-order numerical schemes were studied theoretically with order and frequency analyses. It was shown that these schemes are fifth-order accurate on regular hexahedral elements and third-order accurate on equilateral triangles. Following this analysis, these schemes were tested in 1D on an advection test case and were then validated on the convection of an isentropic vortex.
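The 1D advection validation case mentioned in the abstract above can be sketched with a first-order upwind finite-volume update, a far simpler scheme than the high-order ones studied in the thesis, shown only to illustrate the test setup:

```python
import numpy as np

def upwind_advection(u, c, dx, dt, steps):
    # First-order upwind finite-volume update for u_t + c*u_x = 0 (c > 0)
    # on a periodic domain; stable for CFL = c*dt/dx <= 1.
    nu = c * dt / dx
    for _ in range(steps):
        # Flux difference uses the upwind (left) neighbor via np.roll.
        u = u - nu * (u - np.roll(u, 1))
    return u
```

At CFL = 1 the update reduces to an exact one-cell shift per step; at smaller CFL the first-order scheme is stable but diffusive, which is exactly the deficiency high-order schemes such as those in the thesis are designed to reduce.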
