About: The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations, with metadata collected from universities around the world.
41

Climate change impact on probable maximum precipitation and probable maximum flood in Québec / Les effets du changement climatique sur la pluie maximale probable et la crue maximale probable au Québec

Rouhani, Hassan January 2016
Abstract: As atmospheric temperatures at the Earth's surface increase due to global warming, the capacity of the lower atmosphere to hold water vapor rises, and precipitation and floods will be influenced accordingly. In particular, the extreme events known as Probable Maximum Precipitation (PMP) and Probable Maximum Flood (PMF) are subject to potential modification under climate change. This research analyzes the influence of climate change on PMP and PMF in three watersheds with different climatic conditions across the province of Québec, Canada. The watersheds are located in the south, center and north of the province and were selected to reflect the climate diversity across Québec. To study climate change conditions, output from the Canadian Regional Climate Model (CRCM) was used. This database covers a time horizon from 1961 to 2100 and includes daily precipitation, temperature, specific humidity and Convective Available Potential Energy (CAPE). These data were used to estimate PMP. The World Meteorological Organization (WMO) method was adapted to estimate PMP values under climate change conditions. The 100-year return period precipitable water (W100) was selected as the upper limit of precipitable water when establishing the maximization ratio. The time series for estimating W100 was built from annual maximum precipitable water values associated with atmospheric variables similar to those of the event to be maximized; the atmospheric variables used were atmospheric temperature at the Earth's surface and CAPE. This method does not require setting an upper bound on the maximization ratio and is therefore better suited to calculating PMP in a climate change context. The PMP was then used to drive a distributed hydrological model to estimate PMF. PMP and PMF values were estimated over three 45-year time horizons: recent past (centered on 1985), near future (2030) and far future (2070).
In regions where snowmelt plays a key role in the annual hydrological cycle, winter-spring flooding may produce the annual maximum discharge. Consequently, PMP and PMF were analyzed separately for two seasons: summer-fall (snow-free) and winter-spring (snow accumulation and melt). The larger of the two values was identified as the all-season PMP/PMF. The summer-fall PMF was estimated by inserting the PMP into each day of the simulated time horizon, so that all soil moisture conditions prior to PMP occurrence were included; this yields a distribution of PMF values based on different initial conditions (soil wetness levels). The winter-spring PMF was estimated by inserting the PMP value at the end of a warm melting period combined with an extreme snow accumulation. Our results show that the PMF of all three watersheds would occur in the winter-spring season under both current and future climate projections. Furthermore, the all-season PMP and PMF in southern Québec would decrease, while the trend in central and northern Québec would be reversed, with PMP and PMF increasing under projected climate conditions. In central and northern Québec, the PMF would increase by 25% and 23%, respectively, by the end of the 21st century; over the same period, the PMF in southern Québec would decrease by 25%. In all three watersheds, the PMF always occurs at the end of the winter-spring season, when snow accumulation is greatest.
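The moisture-maximization step at the heart of the adapted WMO method can be sketched as follows; the sample values and helper names are illustrative assumptions, not data from the thesis.

```python
# Illustrative sketch of the moisture-maximization step used in WMO-style
# PMP estimation (values and helper names are hypothetical, not thesis data).

def maximization_ratio(w100_mm, w_storm_mm):
    """Ratio of the 100-year return period precipitable water (W100, the
    upper limit adopted here) to the storm's precipitable water."""
    return w100_mm / w_storm_mm

def maximized_depth(p_storm_mm, w100_mm, w_storm_mm):
    """Storm depth scaled by the maximization ratio; the PMP is the largest
    such value over candidate storms. Note no cap is placed on the ratio."""
    return p_storm_mm * maximization_ratio(w100_mm, w_storm_mm)

# Candidate storms: (observed depth, storm precipitable water, W100 from the
# series of annual maxima with similar surface temperature and CAPE), all mm.
storms = [(95.0, 28.0, 42.0), (110.0, 35.0, 40.0), (80.0, 22.0, 38.0)]
pmp = max(maximized_depth(p, w100, w) for p, w, w100 in storms)
print(pmp)  # 142.5
```

The absence of a cap on the ratio is what the abstract highlights: W100 itself serves as the physical upper limit, which keeps the procedure applicable when the climate (and thus precipitable water) changes.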
42

A Multi-core Testbed on Desktop Computer for Research on Power/Thermal Aware Resource Management

Dierivot, Ashley 06 June 2014
Our goal is to develop a flexible, customizable, and practical multi-core testbed based on an Intel desktop computer that can be used to support theoretical research on power/thermal aware resource management in the design of computer systems. By integrating different modules, namely thread mapping/scheduling, processor/core frequency and voltage variation, temperature/power measurement, and run-time performance collection, into a systematic and unified framework, our testbed can bridge the gap between theoretical study and practical implementation. The effectiveness of our system was validated using appropriately selected benchmarks. This research complements current theoretical work by validating theoretical results in practical scenarios that are closer to those in the real world. In addition, by studying discrepancies between theoretical results and their application in the real world, it also helps identify new research problems and directions.
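Two of the modules listed, thread mapping and frequency readout, can be sketched on Linux with standard interfaces; the module structure below is a hypothetical simplification, though the sysfs path is the standard cpufreq location.

```python
# Minimal sketch of a "thread mapping" and "frequency readout" module on
# Linux (module layout is hypothetical; the cpufreq sysfs path is standard).
import os

def map_to_core(pid, core):
    """Pin a process/thread to a single core, as a mapping module might.
    pid 0 means the calling process."""
    os.sched_setaffinity(pid, {core})
    return os.sched_getaffinity(pid)

def read_core_khz(core):
    """Read the current frequency of a core from cpufreq sysfs, if exposed."""
    path = f"/sys/devices/system/cpu/cpu{core}/cpufreq/scaling_cur_freq"
    try:
        with open(path) as f:
            return int(f.read())
    except OSError:
        return None  # cpufreq interface not available (e.g. in a VM)

if __name__ == "__main__":
    print(map_to_core(0, 0))
    print(read_core_khz(0))
```

A full testbed would pair this with temperature sensors and performance counters; the value of the framework is in driving all four interfaces from one scheduler loop.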
43

InP/Si Template for Photonic Application

Larsson, Niklas January 2015
In this work an epitaxial layer of indium phosphide (InP) has been grown on top of a silicon substrate using the Corrugated Epitaxial Lateral Overgrowth (CELOG) technique. The grown InP CELOG top layer typically has poor surface roughness and planarity, and before it can be used for any processing it has to be smooth and planarized. For this purpose a two-step Chemical Mechanical Polishing (CMP) technique has been investigated and developed. In the first step, commercially available Chemlox is used to planarize the sample. In the second step, citric acid (CA) and sodium hypochlorite (NaClO) are mixed to form an abrasive-free polishing slurry; this step removes defects introduced by the first step. The surface is prepared to demonstrate that a photonic device such as a quantum well can be realized in a Photonic Integrated Circuit (PIC). A quantum well was grown on the polished CELOG InP/Si sample and characterized with Atomic Force Microscopy (AFM), Scanning Electron Microscopy (SEM), X-ray Diffraction (XRD) and Photoluminescence (PL). CMP improved the roughness from 33.2 nm to 12.4 nm. However, the quantum well did not give any response in the PL measurements.
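Roughness figures like the 33.2 nm and 12.4 nm quoted above are typically RMS values computed from an AFM height map. A minimal sketch of that computation, using synthetic heights rather than the thesis measurements:

```python
# RMS surface roughness from height samples, as computed from an AFM scan.
# The height lists below are synthetic illustrations, not thesis data.
import math

def rms_roughness(heights_nm):
    """Root-mean-square deviation of surface heights from their mean."""
    n = len(heights_nm)
    mean = sum(heights_nm) / n
    return math.sqrt(sum((h - mean) ** 2 for h in heights_nm) / n)

rough = [0.0, 60.0, -20.0, 80.0, -40.0]   # e.g. before polishing (nm)
smooth = [0.0, 10.0, -5.0, 12.0, -8.0]    # e.g. after two-step CMP (nm)
print(rms_roughness(rough) > rms_roughness(smooth))  # True
```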
44

Aufklärung der Wechselwirkung von Abrasivteilchen einer Poliersuspension mit Oberflächen mittels direkter Kraft- und rheologischer Untersuchungen / Elucidation of the interaction of abrasive particles in a polishing suspension with surfaces by direct force and rheological measurements

Hempel, Steffi 09 January 2012
Chemical mechanical planarization (CMP) in the semiconductor industry is a process with very many influencing variables; the polishing result depends, among other things, on the properties of the interacting components: wafer, polishing slurry and polishing pad. When new circuit designs are developed, the structural dependencies of the post-CMP topography are frequently uncovered and corrected in time- and cost-intensive learning cycles. To reduce the duration and cost of developing new circuits, a comprehensive overall model describing the polishing process in detail was to be developed within a BMBF project. Realizing this goal requires a thorough qualitative and quantitative understanding of the mechanical-hydrodynamic and physico-chemical mechanisms that govern material removal and planarization in CMP. One aim of the present work was to use direct force measurement with the AFM to investigate the interaction forces between the solid surfaces of abrasive particle and wafer, and between the abrasive particles themselves, in CMP-relevant liquids, and to assess their significance for CMP. Determining these interaction forces with the AFM first required the development of a suitable experimental setup. To corroborate the results of the force measurements, a method was developed to determine the interparticle interactions indirectly by rheological measurements. Rheological measurements were also carried out to investigate the flow properties of the polishing slurries, including the influence of application-relevant hydrodynamic forces on slurry stability. Commercially available slurries as well as a model slurry were used. In addition to systems with dispersed silica particles, a slurry with ceria particles as the disperse phase was also considered. The continuous phase of a polishing slurry is a multi-component system containing a wide variety of additives. The influence of pH and electrolyte concentration on the interaction forces, the flow behavior and the material removal was investigated.
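The effect of electrolyte concentration on interparticle forces is commonly rationalized through the electrical double layer: added salt compresses the double layer (shorter Debye length) and weakens electrostatic repulsion. A back-of-the-envelope sketch using standard DLVO-style relations; the parameter values are illustrative, not taken from this thesis:

```python
import math

# Debye screening length for a 1:1 electrolyte in water at 25 °C,
# using the standard approximation kappa^-1 [nm] ≈ 0.304 / sqrt(I [mol/L]).
def debye_length_nm(ionic_strength_mol_l):
    return 0.304 / math.sqrt(ionic_strength_mol_l)

# Screened electrostatic repulsion between a particle and a flat wafer in
# the weak-overlap approximation: W(D) ∝ exp(-D / lambda_D) (prefactor
# omitted, so only relative values are meaningful).
def relative_repulsion(separation_nm, ionic_strength_mol_l):
    return math.exp(-separation_nm / debye_length_nm(ionic_strength_mol_l))

# More salt -> shorter Debye length -> weaker repulsion at fixed separation,
# which favors particle agglomeration and deposition in the slurry.
print(debye_length_nm(0.001))  # ≈ 9.6 nm
print(debye_length_nm(0.1))    # ≈ 0.96 nm
```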
45

Scheduling Tasks on Heterogeneous Chip Multiprocessors with Reconfigurable Hardware

Teller, Justin Stevenson 31 July 2008
No description available.
46

Fabrication par lithographie hybride et procédé damascène de transistors monoélectroniques à grille auto-alignée / Fabrication of self-aligned-gate single-electron transistors by hybrid lithography and a damascene process

Morissette, Jean-François January 2010
This thesis is the result of a project to fabricate single-electron transistors (SETs). These devices, first fabricated in the late 1980s, make it possible to observe the passage of a discrete number of electrons between two electrodes. Operation at room temperature is not guaranteed and generally requires nanometre-scale components. Once seen as potential replacements for MOSFET transistors in integrated circuits, SETs have seen the consensus on their application shift toward niche applications and hybrid SET-CMOS integration. We present a fabrication method based on a damascene process developed by Dubuc et al. [10][11]. Earlier results showed that transistors fabricated in this way reach maximum operating temperatures of 433 K. However, that fabrication relies exclusively on electron-beam lithography. While this technique can define very small features, it is relatively slow for writing larger patterns such as electrical contact leads. The patterns are written directly into SiO2, which is a very high resolution electron-beam resist but requires very high exposure doses, further slowing the process. Moreover, those transistors use the back of the sample as the control gate, so individual transistors cannot be controlled separately. This research project proposes a platform for fabricating damascene SETs by hybrid lithography, taking advantage both of the speed and batch production of photolithography and of the ability of electron-beam lithography to write submicron features.
We also propose the addition of a self-aligned individual gate and a migration to plasma etching of the SiO2 dielectric using a PMMA resist mask. These changes require the design of a photomask containing the parts of the devices that are large enough to be fabricated by photolithography. The design of two test devices is also proposed; they serve to characterize the metal layers used, the electrical characteristics of the transistors and the fabrication parameters. Building the platform accelerated the rate of device production while establishing a starting point for future improvements. A fabrication process including a self-aligned surface gate was also successfully demonstrated. Problems with polishing and with lift-off deposition of metal layers prevented the realization of complete, electrically functional devices within the duration of the project.
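The 433 K operating temperature mentioned above follows from the Coulomb blockade condition: the charging energy E_C = e²/2C of the island must exceed the thermal energy k_BT by a comfortable margin, which forces attofarad-scale (hence nanometre-scale) islands. An illustrative check; the margin factor of 10 and the capacitance values are assumptions, not figures from the thesis:

```python
E = 1.602176634e-19      # elementary charge (C)
KB = 1.380649e-23        # Boltzmann constant (J/K)

def charging_energy_j(capacitance_f):
    """Single-electron charging energy E_C = e^2 / (2C)."""
    return E**2 / (2.0 * capacitance_f)

def max_operating_temp_k(capacitance_f, margin=10.0):
    """Temperature at which E_C = margin * kB * T (rule-of-thumb margin)."""
    return charging_energy_j(capacitance_f) / (margin * KB)

# A ~0.2 aF island clears 433 K under this criterion; a 1 aF island does not.
print(max_operating_temp_k(0.2e-18))
print(max_operating_temp_k(1e-18))
```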
47

Porous Ultra Low-k Material Integration Through An Extended Dual Damascene Approach: Pre-/ Post-CMP Curing Comparison

Calvo, Jesús, Koch, Johannes, Thrun, Xaver, Seidel, Robert, Uhlig, Benjamin 22 July 2016
Integration of dielectrics with increased porosity is required to reduce the capacitance of interconnects. However, the conventional dual damascene integration approach damages these materials, preventing their immediate implementation. A post-CMP curing approach could solve some of these issues, but materials whose porogens remain stable at the temperatures of the barrier-seed deposition process are not common, hindering this approach. Here, we report on an extended dual damascene integration approach that permits post-CMP curing.
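The motivation in the first sentence can be made concrete with a simple effective-medium estimate: pores (k = 1) dilute the permittivity of the dense matrix, and interconnect capacitance scales with k. A crude parallel mixing rule, shown purely for illustration; the model choice and values are assumptions, not from the paper:

```python
# Upper-bound (parallel) mixing rule for a porous low-k film: air pores
# have k = 1, so effective permittivity drops linearly with porosity.
# Illustrative model only; real porogen-derived films need better models.
def k_eff_parallel(k_dense, porosity):
    return k_dense * (1.0 - porosity) + 1.0 * porosity

# A dense ULK matrix around k = 2.7 with 30% porosity:
print(k_eff_parallel(2.7, 0.3))  # ≈ 2.19
```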
48

Electro-Acoustic and Electronic Applications Utilizing Thin Film Aluminium Nitride

Martin, David Michael January 2009
In recent years there has been a huge increase in the growth of communication systems such as mobile phones, wireless local area networks (WLAN), satellite navigation and various other forms of wireless data communication that have made analogue frequency control a key issue. Increasing frequency spectrum crowding and the move into the microwave region, along with the need for miniaturisation and capacity improvement, have driven the development of high performance, miniature, on-chip filters operating in the low to medium GHz frequency range. This has hastened the search for alternatives to ceramic resonators, whose device size and performance are limited, which in turn has led to the development of the thin film electro-acoustics industry, with surface acoustic wave (SAW) and bulk acoustic wave (BAW) filters now fabricated in their millions. Further, this new technology opens the way to integrating the traditionally incompatible integrated circuit (IC) and electro-acoustic (EA) technologies, bringing substantial economic and performance benefits. In this thesis the compatibility of aluminium nitride (AlN) with IC fabrication is explored as a means of furthering integration. Several issues have been explored where either tailoring thin film bulk acoustic resonator (FBAR) design, such as development of an improved solidly mounted resonator (SMR) technology, or use of IC technology, such as chemical mechanical polishing (CMP) or nickel silicide (NiSi), has yielded improvements beneficial for resonator fabrication or enabled IC integration. The former has resulted in major improvements to quality factor, power handling and encapsulation; the latter has provided alternative methods to reduce electro- or acoustomigration, reduced device size for plate waves, supplied a novel low acoustic impedance material for high power applications, and provided alternative electrodes for use in high temperature sensors.
Another route to integration, using the piezoelectric material AlN on the IC side, has also been explored. Methods for analysing AlN film contamination and stoichiometry have been used to assess AlN as a high-k dielectric material. This work has also yielded knowledge of film composition useful for passivation on SiC substrates, investigated for high power, high frequency applications. Lastly, AlN has been used as a buried insulator material in new silicon-on-insulator (SOI) substrates for increased heat conduction. These new substrates have been analysed, and further development for improved performance is indicated.
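For a simple FBAR, the fundamental thickness-mode resonance is set by the film thickness: f ≈ v/2t, with v the longitudinal acoustic velocity in AlN. The round velocity figure and thicknesses below are illustrative assumptions, not values from the thesis:

```python
V_ALN = 11_000.0  # m/s, assumed round value for AlN longitudinal velocity

def fbar_resonance_hz(thickness_m):
    """Fundamental thickness-extensional resonance of a free-standing
    piezoelectric plate: half an acoustic wavelength fits the thickness."""
    return V_ALN / (2.0 * thickness_m)

# A 1 um AlN film resonates around 5.5 GHz, squarely in the low-to-medium
# GHz filter range; doubling the thickness halves the frequency.
print(fbar_resonance_hz(1e-6) / 1e9)  # 5.5
```

This inverse dependence on thickness is why film-thickness uniformity (and hence CMP) matters so much for on-chip filter fabrication.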
49

Fault-tolerant Cache Coherence Protocols for CMPs

Fernández Pascual, Ricardo 23 July 2007
We propose a way to deal with transient faults in the interconnection network of many-core CMPs that differs from the classic approach of building a fault-tolerant interconnection network. In particular, we provide fault tolerance mechanisms at the level of the cache coherence protocol, so that it guarantees the correct execution of programs even when the underlying interconnection network does not deliver all messages correctly. This way, we can take advantage of the different meaning of each message to achieve fault tolerance with lower overhead than at the level of the interconnection network, which has to treat all messages alike with respect to reliability. We design several fault-tolerant cache coherence protocols using these techniques and evaluate them. This evaluation shows that, in the absence of faults, our techniques do not significantly increase the execution time of applications; their main cost is an increase in network traffic due to acknowledgment messages that ensure the reliable transfer of ownership between coherence nodes, and these are sent off the critical path of cache misses. In addition, a system using our protocols degrades gracefully when transient faults actually happen and can sustain fault rates much higher than those expected in the real world with only a small performance degradation.
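The key mechanism described, acknowledged ownership transfer over an unreliable network, can be illustrated with a toy retransmission loop; the protocol details are heavily simplified and are not the thesis protocols:

```python
import random

def transfer_ownership(drop_rate, rng, max_tries=50):
    """Resend an ownership-transfer message until its ack arrives. Each
    leg (message, ack) may be dropped independently. Retransmissions must
    be idempotent at the receiver, since the message may arrive twice
    when only the ack was lost."""
    for attempt in range(1, max_tries + 1):
        msg_arrives = rng.random() >= drop_rate
        ack_arrives = msg_arrives and rng.random() >= drop_rate
        if ack_arrives:
            return attempt  # old owner may now safely discard its copy
    raise RuntimeError("fault rate beyond design limits")

# Even at an (unrealistically high) 20% drop rate per leg, every transfer
# completes within a handful of retries.
rng = random.Random(42)
tries = [transfer_ownership(0.2, rng) for _ in range(1000)]
print(max(tries))
```

Because these acks sit off the critical path of cache misses, their cost shows up as extra traffic rather than extra latency, matching the evaluation summarized above.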
50

Improving Cache Behavior in CMP Architectures through Cache Partitioning Techniques

Moretó Planas, Miquel 19 March 2010
Microprocessor design has changed significantly over the last few decades, moving from simple in-order single-core architectures to superscalar and vector architectures in order to extract the maximum available instruction-level parallelism. Executing several instructions from the same thread in parallel can significantly improve the performance of an application, but only a limited amount of parallelism is available in each thread because of data and control dependences. Furthermore, designing a high-performance single monolithic processor has become very complex due to power and wire latency constraints. These limitations have motivated the use of thread-level parallelism (TLP) as a common strategy for improving processor performance. Multithreaded processors execute different threads at the same time, sharing some hardware resources. Several flavors of multithreaded processors exploit TLP, such as chip multiprocessors (CMP), coarse-grain multithreading, fine-grain multithreading, simultaneous multithreading (SMT), and combinations of them.
To improve cost and power efficiency, the computer industry has adopted multicore chips. In particular, CMP architectures have become the most common design choice (sometimes combined with multithreaded cores). First, CMPs reduce design costs and average power consumption by promoting design re-use and simpler processor cores: it is less complex to design a chip with many small, simple cores than a chip with fewer, larger, monolithic cores, and simpler cores have fewer power-hungry centralized hardware structures. Second, CMPs reduce costs by improving hardware resource utilization: on a multicore chip, co-scheduled threads can share costly microarchitectural resources that would otherwise be underutilized. Higher resource utilization improves aggregate performance and enables lower-cost design alternatives.
One of the resources with the greatest impact on final application performance is the cache hierarchy. Caches store data recently used by applications in order to take advantage of temporal and spatial locality, providing fast access to data and improving application performance. Caches with low latencies have to be small, which prompts a cache hierarchy organized into several levels. In CMPs, the cache hierarchy is normally organized as a first level (L1) of instruction and data caches private to each core, with a last level of cache (LLC) shared among the cores (L2, L3 or both). Shared caches increase resource utilization and system performance: a large shared cache raises the probability that each application can access its data from a closer level of the hierarchy, and it allows an application to make use of the entire cache if needed.
A second advantage of a shared cache in a CMP design concerns cache coherence. In parallel applications, different threads share the same data and keep local copies in their caches. With multiple processors, one processor may change the data, leaving another processor's cache with stale data. A cache coherence protocol monitors changes to data and ensures that all processor caches have the most recent copy. When the parallel application executes on a single physical chip, the coherence circuitry can operate at the speed of on-chip communication rather than the much slower chip-to-chip communication required with discrete processors on separate chips, and the protocols are simpler to design with a unified, shared on-chip level of cache.
Because of these advantages, chip vendors use CMP architectures in current high-performance, network, real-time and embedded systems, and several commercial processors share a level of the cache hierarchy among cores. For example, the Sun UltraSPARC T2 has a 16-way, 4MB L2 cache shared by 8 cores, each of them up to 8-way SMT. Processors in the Intel Core 2 family share up to a 12MB, 24-way L2 cache. In contrast, the AMD K10 family has a private L2 cache per core and a shared L3 cache of up to 6MB and 64 ways.
As the long-term trend of increasing integration continues, the number of cores per chip is projected to increase with each successive technology generation, and significant studies project processors with hundreds of cores per chip in the coming years. The manycore era has already begun. Although this era provides many opportunities, it also presents many challenges. In particular, greater sharing of hardware resources among concurrently executing threads can make an individual thread's performance unpredictable and may lead to violations of individual applications' performance requirements. Current resource management mechanisms and policies are no longer adequate for future multicore systems.
Some applications show little re-use of their data and pollute caches with data streams (for example multimedia, communications or streaming applications), or have many compulsory misses that cannot be removed by assigning more cache space to the application. Traditional eviction policies such as Least Recently Used (LRU), pseudo-LRU or random are demand-driven; that is, they tend to give more space to the application that accesses the cache hierarchy more often. When no direct control is exercised over shared resources (here, the last-level cache), a particular thread may allocate most of them, degrading the performance of the other threads. As a consequence, high resource sharing and utilization can make systems unstable and violate individual applications' requirements. To provide Quality of Service (QoS) to applications, we need stronger control over shared resources and a richer collaboration between the OS and the architecture.
In this thesis, we propose software and hardware mechanisms to improve cache sharing in CMP architectures. We take a holistic approach, coordinating software and hardware targets to improve aggregate system performance and provide QoS to applications, and we use explicit resource allocation techniques to control the shared cache in a CMP architecture, with allocation targets driven by hardware and software mechanisms. The main contributions of this thesis are the following:
- We have characterized different single- and multithreaded applications and classified workloads with a systematic method to better understand and explain the effects of cache sharing in a CMP architecture. We made a special effort to study previous cache partitioning techniques for CMP architectures in order to gain the insight needed to propose improved mechanisms.
- In CMP architectures with out-of-order processors, cache misses can be served in parallel and share the miss penalty of accessing main memory. We exploit this fact to propose new cache partitioning algorithms guided by the memory-level parallelism (MLP) of each application. With these algorithms, system performance is improved (in terms of throughput and fairness) without significantly increasing the hardware required by previous proposals.
- Driving cache partition decisions with indirect indicators of performance such as misses, MLP or data re-use may lead to suboptimal cache partitions. Ideally, the metric driving cache partitions should be the target metric to optimize, which is normally related to IPC. We have therefore developed a hardware mechanism, OPACU, that obtains accurate run-time predictions of an application's performance under different cache assignments.
- Using these performance predictions, we have introduced a new framework to manage shared caches in CMP architectures, FlexDCP, which allows the OS to optimize different IPC-related target metrics such as throughput or fairness and to provide QoS to applications. FlexDCP enables enhanced coordination between the hardware and software layers, leading to improved system performance and flexibility.
- Next, we have used performance estimations to reduce the load imbalance problem in parallel applications. We have built a run-time mechanism that detects parallel applications sensitive to cache allocation and, in those situations, reduces load imbalance by assigning more cache space to the slowest threads. This mechanism helps reduce the long optimization time, in man-years of effort, devoted to large-scale parallel applications.
- Finally, we have stated the main characteristics that future multicore processors with thousands of cores should have, and proposed enhanced coordination between the software and hardware layers to better manage the shared resources in these architectures.
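A common baseline for the partitioning ideas discussed in this abstract is utility-based way allocation: each extra cache way goes to the application whose miss count would drop most. The sketch below uses made-up miss curves; proposals like the MLP-aware algorithms above refine the utility signal, not this greedy skeleton:

```python
def partition_ways(miss_curves, total_ways):
    """Greedy utility-based way partitioning: repeatedly grant one way to
    the app with the largest marginal miss reduction. Each app gets at
    least one way; miss_curves[i][w-1] = misses of app i with w ways."""
    n = len(miss_curves)
    alloc = [1] * n
    for _ in range(total_ways - n):
        gains = [miss_curves[i][alloc[i] - 1] - miss_curves[i][alloc[i]]
                 for i in range(n)]
        winner = max(range(n), key=lambda i: gains[i])
        alloc[winner] += 1
    return alloc

# Illustrative miss curves for an 8-way shared LLC:
streaming = [100, 99, 98, 97, 96, 95, 94, 93]   # cache-insensitive
reuse     = [100, 60, 35, 20, 12,  8,  6,  5]   # cache-sensitive
print(partition_ways([streaming, reuse], 8))  # [1, 7]
```

Note how the demand-driven behavior criticized above is inverted: the streaming application, despite generating many accesses, earns only its minimum share because extra ways barely reduce its misses.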
