  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
131

Learning, probabilistic, and asynchronous technologies for an ultra efficient datapath

Marr, Bo 17 November 2009
A novel microarchitecture and circuit design techniques are presented for an asynchronous datapath that delivers very high performance while remaining energy efficient. A 0.5 µm chip containing test circuits for the asynchronous datapath was fabricated and tested. Results show an adder and multiplier design that, owing to 2-dimensional bit pipelining, speculative completion, dynamic asynchronous circuits, and bit-level reservation stations and reorder buffers, can commit 16-bit additions and multiplications at 1 giga operation per second (GOPS). A synchronicity simulator is also presented that simulates the same architecture at more modern transistor nodes, showing adder and multiplier performance of up to 11.1 GOPS in a commercially available 65 nm process. Compared with other designs and results, these are among the fastest adders and multipliers to date, if not the fastest. The chip was also tested at supply voltages below threshold, making it extremely energy efficient. The asynchronous architecture also accommodates more exotic technologies, which are presented. Learning digital circuits are presented in which the current supplied to a digital gate can be dynamically updated with floating-gate technology. Probabilistic digital signal processing is also presented, where the probabilistic operation arises from the statistical delay through the asynchronous circuits. Results show successful image processing with probabilistic operation in the least significant bits of the datapath, yielding large performance and energy gains.
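The idea of confining probabilistic operation to the least significant bits can be illustrated with a small sketch. This is not the circuit described in the abstract: random bit flips stand in for the statistical delay of the asynchronous circuits, and the function names, the flip probability `p`, and the number of affected bits `k` are all assumptions made for illustration.

```python
import random

WIDTH = 16  # the abstract reports 16-bit additions


def exact_add(a, b):
    """Exact 16-bit addition (wraps on overflow)."""
    return (a + b) & ((1 << WIDTH) - 1)


def probabilistic_add(a, b, k, p, rng):
    """Add a and b, then flip each of the k low result bits with probability p.

    Assumption: independent bit flips model the timing-induced errors;
    the thesis attributes them to statistical circuit delay instead.
    """
    result = exact_add(a, b)
    for bit in range(k):
        if rng.random() < p:
            result ^= 1 << bit
    return result


def max_abs_error(k, p, trials=1000, seed=0):
    """Largest absolute error observed over random operand pairs."""
    rng = random.Random(seed)
    worst = 0
    for _ in range(trials):
        a = rng.randrange(1 << WIDTH)
        b = rng.randrange(1 << WIDTH)
        err = abs(probabilistic_add(a, b, k, p, rng) - exact_add(a, b))
        worst = max(worst, err)
    return worst
```

Because only the `k` low bits can change, the error of any single result is bounded by `2**k - 1`, which is why errors of this kind tend to be tolerable in image data.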
132

Housing projekt Pattaya Thailand

Lindberg, Karin, Nordlander, Anna January 2006
This report examines the problems and possibilities of building a luxurious modern residence in Pattaya, Thailand, incorporating the traditional building styles of wooden houses into an ecological house with a low demand for technology.
The client, B. Grimm Group, has recently set up a polo club in the vicinity of Pattaya and has requested a complete set of layouts for a planned housing area on the premises. The project includes a structure plan of the village area, perspectives, facades, building layouts and axonometric views of all house types, as well as garden plans. The written report complements the designs and explains the background to the final proposal.
The report also covers the building technology and construction process of a traditional Thai house and briefly investigates the ecological aspects of building in Thailand.
133

Passivhusen på Oxtorget

Brandt, Fredrik, Jonsson, Mathilda January 2008
The main purpose of our diploma work is to examine how passive houses, or so-called zero-energy houses, differ in function and construction from conventional houses. In our research we discuss how design, orientation, materials and a well-functioning climate shell affect energy consumption.
We have looked more closely at factors that save energy and at how a typical passive house is constructed. To see how theory works in practice, we studied existing passive houses, namely those on Oxtorget in Värnamo.
We have concluded that passive houses work, and we believe it is very important to continue this work and to inform people about its importance to the environment. Passive houses are receiving more and more attention. They are somewhat more expensive to build, but over time the extra cost is recovered. Profitability is not the most important thing, however, but rather the feeling of doing something good for the environment.
134

Energy efficient operation strategy design for the combined cooling, heating and power system

Liu, Mingxi 05 June 2012
Combined cooling, heating and power (CCHP) systems, also known as trigeneration systems, are designed to provide electricity, cooling and heating simultaneously. In recent years the CCHP system has become a hot topic because of its high system efficiency, high economic efficiency and lower greenhouse gas (GHG) emissions. The efficiency of a CCHP system depends on appropriate system configuration, operation strategy and facility size. Because the traditional operation strategies, i.e., following the electric load (FEL) and following the thermal load (FTL), inherently and inevitably waste energy, a more efficient operation strategy should be designed. To achieve the highest system efficiency, the facilities in the system should be sized to match the corresponding operation strategy. To reduce the energy waste of traditional operation strategies and improve system efficiency, two operation strategy design methods and their sizing problems are studied (in Chapter 2 and Chapter 3). Most of the improved operation strategies in the literature are based on the "balance" plane, which implies a match between the electric demands and the thermal demands. However, in more than 95% of energy demand patterns, the demands cannot match each other exactly at this "balance" plane. To keep using the "balance" concept, Chapter 2 modifies the system configuration from one with a single absorption chiller to one with hybrid chillers, and expands the "balance" plane into a "balance" space by tuning the electric cooling to cool load ratio. With this new "balance" space, an operation strategy is designed and the power generation unit (PGU) capacity is optimized according to the proposed operation strategy to reduce energy waste and improve system efficiency. A case study is conducted to verify the feasibility and effectiveness of the proposed operation strategy.
In Chapter 3, a more mathematical approach to scheduling the energy input and power flow is proposed. Using the concept of the energy hub, the CCHP system is modelled in matrix form, so that the whole CCHP system becomes an input-output model. With the objective function set to a weighted sum of primary energy savings (PESs), hourly total cost savings (HTCs) and carbon dioxide emissions reduction (CDER), the optimization problem, subject to equality and inequality constraints, is solved by sequential quadratic programming (SQP). The PGU capacity is also sized under the proposed optimal operation strategy. In the case study, compared to FEL and FTL, the proposed optimal operation strategy saves more primary energy and annual total cost, and can be more environmentally friendly. Finally, the conclusions of this thesis are summarized and some future work is discussed. / Graduate
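The energy waste attributed to FEL and FTL can be made concrete with a toy dispatch model. The sketch below is only illustrative: the two efficiency constants, the `Dispatch` record, and the function names are assumptions, not values or methods taken from the thesis, and cooling is ignored for simplicity.

```python
from dataclasses import dataclass

ETA_E = 0.30  # hypothetical PGU electric efficiency (fuel -> electricity)
ETA_H = 0.50  # hypothetical fraction of fuel energy recovered as heat


@dataclass
class Dispatch:
    fuel: float       # fuel energy fed to the PGU
    excess: float     # energy produced by the PGU but not needed
    shortfall: float  # energy that must come from the grid or a boiler


def follow_electric_load(e_demand, q_demand):
    """FEL: run the PGU to meet the electric demand exactly."""
    fuel = e_demand / ETA_E
    heat = fuel * ETA_H
    return Dispatch(fuel, max(heat - q_demand, 0.0), max(q_demand - heat, 0.0))


def follow_thermal_load(e_demand, q_demand):
    """FTL: run the PGU to meet the thermal demand exactly."""
    fuel = q_demand / ETA_H
    power = fuel * ETA_E
    return Dispatch(fuel, max(power - e_demand, 0.0), max(e_demand - power, 0.0))
```

In this model the two strategies coincide, with no excess and no shortfall, only when the demand ratio `e_demand / q_demand` equals `ETA_E / ETA_H`. That single operating ratio plays the role of the "balance" condition, which is why most real demand patterns fall off it.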
135

Towards a Sustainable Future: Courtyard in Contemporary Beijing

Zhu, Ningxin January 2013
China has become one of the world’s economic engines. One major driving force is the rapid urbanization. Such fast development results in resource and energy depletion, pollution and environmental deterioration. The government has recently endorsed green buildings and urged ministries to work out a national action plan. It is predicted that green building will be the next big thing in China. But before importing any foreign green technology and green designs, is there something to be learned from the Chinese ancestors? In the long history of China, the Chinese have always employed a system of construction with the influences of geography, climate, culture, philosophy, economy and politics deeply rooted in China, making the Chinese traditional architecture distinct. Embedded in the formation of the city, siheyuan 四合院, the courtyard house in Beijing was one exceptional dwelling example that inherited the quintessence of the thousand years of building experiences and knowledge of the ancestors. This traditional urban type not only celebrated the rich and unique cultural heritage of China, it also played an important role in maximizing the natural forces to create a pleasant and comfortable environment for living. Population growth, political and economic reforms over time however have drastically changed the fate of this historical heritage. Especially under the pressure of the fast development and economic boom after the introduction of the Open Door Policy in 1978, the traditional courtyards were the first to be demolished due to the lack of modern facilities and the inability to accommodate the growing population. They were often replaced by apartment blocks and high-rise towers – imported types based on planning regulations developed in the West, outside the cultural and environmental milieu of Beijing. As a result, the city is now filled with many energy intensive buildings that eat away both the “city’s essence” and the valuable natural resources. 
Given China's current policy and ambition, the torn-down courtyard sites within the old city wall that are still awaiting development offer the potential to address the remediation and reinterpretation of the traditional typology in a contemporary city. The thesis investigates the essences of the traditional courtyard house and explores how to apply such qualities to the design of a new courtyard typology in contemporary Beijing. The proposal takes a holistic approach across environmental, social, cultural and economic levels, so as to carry out preservation that manifests in experience rather than physical restoration, and to create a project that is truly sustainable.
136

Path Selection Based Branching for Coarse Grained Reconfigurable Arrays

January 2014
abstract: Coarse Grain Reconfigurable Arrays (CGRAs) are promising accelerators capable of achieving high performance at low power consumption. While CGRAs can efficiently accelerate loop kernels, accelerating loops with control flow (loops with if-then-else structures) is quite challenging. Techniques that handle control flow execution in CGRAs generally use predication. Such techniques execute both branches of an if-then-else structure and select the outcome of one branch to commit based on the result of the conditional. This results in poor utilization of the CGRA's computational resources. The dual-issue scheme, which is the state-of-the-art technique for control flow, fetches instructions from both paths of the branch and selects one to execute at runtime based on the result of the conditional. This technique incurs an overhead in instruction fetch bandwidth. In this thesis, to improve the performance of control flow execution in CGRAs, I propose a solution in which the result of the conditional expression that decides the branch outcome is communicated to the instruction fetch unit, which then selectively issues instructions from the path taken by the branch at run time. Experimental results show that my solution achieves, on average, 34.6% better performance and a 52.1% improvement in energy efficiency compared to the state-of-the-art dual-issue scheme, without imposing any overhead in instruction fetch bandwidth. / Dissertation/Thesis / Masters Thesis Electrical Engineering 2014
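The difference between the three control-flow schemes described above can be sketched as a simple operation count. This is a toy cost model under stated assumptions, not the evaluation method of the thesis: the function name `branch_cost` and the idea of counting one unit per operation are hypothetical, and real CGRA mappings involve scheduling effects that are ignored here.

```python
def branch_cost(then_ops, else_ops, taken_then):
    """Count fetched and executed operations for one if-then-else.

    then_ops / else_ops: number of operations on each path.
    taken_then: True if the conditional selects the 'then' path.
    """
    taken = then_ops if taken_then else else_ops
    both = then_ops + else_ops
    return {
        # Predication: both paths are fetched and executed, one result commits.
        "predication": {"fetched": both, "executed": both},
        # Dual-issue: both paths are fetched, only the selected one executes.
        "dual_issue": {"fetched": both, "executed": taken},
        # Path selection: the fetch unit issues only the taken path.
        "path_selection": {"fetched": taken, "executed": taken},
    }
```

For example, with a 4-operation `then` path and a 2-operation `else` path where `then` is taken, predication executes 6 operations, dual-issue fetches 6 but executes 4, and path selection fetches and executes 4.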
137

Proposição de diretrizes para a implementação de soluções energeticamente eficientes e sustentáveis em edifícios da Universidade Federal de Viçosa/MG / Proposing guidelines for the implementation of sustainable and energy efficient solutions in buildings of the Federal University of Viçosa/MG

Domingos, Cintia Ataliba 30 September 2010
The new buildings resulting from the physical expansion of the UFV can cause impacts, particularly regarding energy demand. From this perspective, it is important for the UFV to adapt its projects so that the buildings are energy efficient when put into use. It is therefore necessary to evaluate and rate the energy efficiency of buildings while they are still in the design stage, so that the necessary adjustments are made before construction. Thus, the aim of this work is to propose guidelines for the implementation of sustainable and energy-efficient solutions in buildings on the UFV/Viçosa campus. The object of study was the Health Building (Edifício da Saúde), part of the REUNI/UFV works, which has been under construction on the UFV/Viçosa campus since September 2009. The evaluation and classification of this building were performed using the Prescriptive Labeling Method, which classifies performance requirements for three parts of the building: the envelope, the lighting system and the air conditioning system. This work shows that the labeling and classification of buildings is an efficient technique for evaluating the energy consumption of buildings and that, if incorporated into the routine of preparing construction projects for the UFV campuses, it will contribute to minimizing energy costs and to a more sustainable performance of this institution's infrastructure, as well as constituting an important benchmark for society.
138

Increasing energy efficiency of processor caches via line usage predictors / Aumentando a eficiência energética da memória cache de processadores através de preditores de uso de linhas da cache

Alves, Marco Antonio Zanata January 2014
Energy consumption is becoming more important for processor architectures, where the number of cores inside the chip is increasing while the total power budget is kept at the same level or even reduced. Thus, energy-saving techniques such as frequency scaling and automatic shutdown of sub-systems are being used to maintain the trade-off between power and performance. To deliver high performance, current Chip Multiprocessors (CMPs) integrate large caches in order to reduce the average memory access latency by keeping the applications' working set on-chip. These cache memories have traditionally been designed to exploit temporal locality by using smart replacement policies, and spatial locality by fetching entire cache lines from memory on a cache miss. However, recent studies have shown that the number of sub-blocks within a line that are actually used is often low, and those sub-blocks that are used are accessed only a few times before becoming dead (that is, never accessed again). Additionally, many cache lines remain powered for long periods even if their data is not used again or is invalid. For modified cache lines, the cache waits until the line is evicted to perform the write-back to the next memory level. These write-backs compete with read requests (processor demand and cache prefetch), increasing the pressure on the memory controller. For these reasons, the energy efficiency and performance of cache memories are not ideal. This thesis introduces cache line usage predictors to increase the energy efficiency of cache memories.
We propose the Dead Sub-Block Predictor (DSBP) and Dead Line and Early Write-Back Predictor (DEWP) mechanisms to enable energy savings without performance degradation. DSBP predicts which sub-blocks of a cache line will actually be accessed and how many times they will be used, in order to bring into the cache only those sub-blocks that are necessary and to power them off after they have been accessed the predicted number of times. DEWP predicts dead lines as soon as they receive their last access, and turns these lines off. Dirty lines are scheduled for write-back after the last write operation occurs, increasing the energy savings potential and also reducing the pressure on the memory controller. Both proposed mechanisms also reduce pollution in cache memories by prioritizing dead lines for eviction in the existing replacement policy. Although each mechanism can operate on its own inside a system, the two can also be mixed in the same cache hierarchy. This mixed implementation is interesting because sub-block granularity is more suitable for cache levels closer to the processor, where cache lines are quickly evicted, while the Last-Level Cache (LLC) tends to use the whole cache line before its eviction. In order to evaluate our proposed mechanisms, we introduce the Simulator of Non-Uniform Cache Architectures (SiNUCA). This cycle-accurate microarchitecture simulator is validated in terms of performance and energy consumption by comparing it to a real processor. Our performance results were obtained by executing single-threaded applications from SPEC-CPU2006 and multi-threaded applications from the SPEC-OMP2001 and NAS-NPB benchmark suites. The energy-related results were obtained by integrating SiNUCA with the Multi-core Power, Area, and Timing (McPAT) framework and the CACTI power modeling tool.
When applying our mechanisms on all the cache levels, we observe on average a 36% energy reduction for DSBP, a 25% energy reduction using DEWP, and an average reduction of 37% in energy consumption when applying DSBP on L1 and L2 and DEWP on the LLC. All these reductions caused a negligible performance loss of less than 4% on average.
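The DSBP behaviour summarized above (fetch only the sub-blocks expected to be used, then power each one off after its predicted number of accesses) can be sketched with a toy model. This is not the predictor design from the thesis, whose tables and training rules are not given in the abstract. The class names, the use of a `signature` key (for instance the address of the instruction that caused the fill), and the "repeat the last observed counts" training rule are all assumptions.

```python
class ToyLine:
    """One cache line split into sub-blocks, each with an on/off power state."""

    def __init__(self, predicted_counts, n_sub_blocks):
        self.observed = [0] * n_sub_blocks
        if predicted_counts is None:
            # No history yet: behave like a conventional cache line.
            self.remaining = [None] * n_sub_blocks
            self.powered = [True] * n_sub_blocks
        else:
            # Fetch only the sub-blocks predicted to be used at least once.
            self.remaining = list(predicted_counts)
            self.powered = [count > 0 for count in predicted_counts]

    def access(self, idx):
        """Return True on a sub-block hit, False if the sub-block was off."""
        hit = self.powered[idx]
        self.observed[idx] += 1
        if not hit:
            # Misprediction: bring the sub-block back and stop predicting it.
            self.powered[idx] = True
            self.remaining[idx] = None
        elif self.remaining[idx] is not None:
            self.remaining[idx] -= 1
            if self.remaining[idx] == 0:
                self.powered[idx] = False  # predicted dead: power it off
        return hit


class ToyDSBP:
    """Remembers, per signature, the access counts seen at the last eviction."""

    def __init__(self, n_sub_blocks=4):
        self.n_sub_blocks = n_sub_blocks
        self.table = {}

    def fill(self, signature):
        return ToyLine(self.table.get(signature), self.n_sub_blocks)

    def evict(self, signature, line):
        self.table[signature] = list(line.observed)
```

A first fill with an unknown signature keeps every sub-block powered. After that line is evicted, a second fill with the same signature powers only the sub-blocks that were touched before, and each of them is switched off once it has received as many accesses as last time.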
139

EPMOSt: an energy-efficient passive monitoring system for Wireless Sensor Networks / EPMOSt: um sistema de monitoramento passivo energeticamente eficiente para Redes de Sensores Sem Fio.

Fernando Parente Garcia 26 November 2014
Monitoring systems are important for debugging and analyzing Wireless Sensor Networks (WSN). In passive monitoring, a monitoring network needs to be deployed in addition to the network to be monitored, called the target network. The monitoring network captures and analyzes packets sent by the target network. An energy-efficient passive monitoring system is necessary when a WSN must be monitored in a real scenario, because the lifetime of the monitoring network is extended and, consequently, the target network benefits from the monitoring for a longer time. In this thesis, the main passive monitoring systems proposed for WSN are first identified, analyzed and compared. During the literature review, no passive monitoring system for WSN that aims to reduce the energy consumption of the monitoring network was identified. Therefore, this thesis proposes an Energy-efficient Passive MOnitoring System for WSN (EPMOSt) that extends the lifetime of the monitoring network. EPMOSt uses two mechanisms to reduce the energy consumption of the monitoring network: sniffer election and aggregation of headers. With sniffer election, in general only one sniffer (a node of the monitoring network) captures packets sent by a given node of the target network, thereby reducing the transmission of redundant captured packets through the monitoring network and, thus, considerably reducing the energy consumption of this network. With aggregation of headers, only the information present in the headers of captured packets is sent through the monitoring network. Thus, the headers of several packets may be sent in the same monitoring message, reducing the transmission overhead and consequently the energy consumption of the monitoring network. Experiments with real sensors and with a WSN simulator are conducted in various scenarios to evaluate the proposed monitoring system.
The results obtained show the energy efficiency of EPMOSt and the viability of using it to monitor WSN in real scenarios.
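The two energy-saving mechanisms, sniffer election and header aggregation, can be sketched as follows. This is a minimal illustration, not the EPMOSt protocol itself: the abstract does not state the election criterion, so choosing the sniffer with the strongest received signal (RSSI) is an assumption, as are the function names and the fixed payload budget.

```python
def elect_sniffers(rssi):
    """Pick one sniffer per target node.

    rssi: {target_node: {sniffer: rssi_dbm}}
    Assumption: the sniffer with the highest RSSI wins the election.
    """
    return {
        target: max(candidates, key=candidates.get)
        for target, candidates in rssi.items()
        if candidates
    }


def aggregate_headers(headers, payload_bytes):
    """Pack captured packet headers into as few monitoring messages as fit.

    headers: list of bytes objects, one per captured packet header.
    payload_bytes: hypothetical payload size of one monitoring message.
    """
    messages = []
    current = b""
    for header in headers:
        if len(header) > payload_bytes:
            raise ValueError("header does not fit in one monitoring message")
        if len(current) + len(header) > payload_bytes:
            messages.append(current)  # message is full: send it
            current = b""
        current += header
    if current:
        messages.append(current)
    return messages
```

With five 10-byte headers and a 25-byte payload, `aggregate_headers` produces three monitoring messages instead of five, which is the kind of transmission saving the aggregation mechanism relies on.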
140

Increasing energy efficiency of processor caches via line usage predictors / Aumentando a eficiência energética da memória cache de processadores através de preditores de uso de linhas da cache

Alves, Marco Antonio Zanata January 2014
Energy consumption is becoming more important for processor architectures, where the number of cores inside the chip is increasing while the total power budget is kept at the same level or even reduced. Thus, energy-saving techniques such as frequency scaling options and automatic shutdown of sub-systems are being used to maintain the trade-off between power and performance. To deliver high performance, current Chip Multiprocessors (CMPs) integrate large caches in order to reduce the average memory access latency by allocating the applications’ working set on-chip. These cache memories have traditionally been designed to exploit temporal locality by using smart replacement policies, and spatial locality by fetching entire cache lines from memory on a cache miss. However, recent studies have shown that the number of sub-blocks within a line that are actually used is often low, and those sub-blocks that are used are accessed only a few times before becoming dead (that is, never accessed again). Additionally, many cache lines remain powered for long periods of time even if their data is not used again or is invalid. For modified cache lines, the cache memory waits until the line is evicted to perform the write-back to the next memory level. These write-backs compete with read requests (processor demand and cache prefetch), increasing the pressure on the memory controller. For these reasons, the energy efficiency and performance of cache memories are not ideal. This thesis introduces cache line usage predictors to increase the energy efficiency of cache memories. 
We propose the Dead Sub-Block Predictor (DSBP) and Dead Line and Early Write-Back Predictor (DEWP) mechanisms to enable energy savings without performance degradation. DSBP predicts which sub-blocks of a cache line will actually be accessed and how many times they will be used, in order to bring into the cache only those sub-blocks that are necessary and to power them off after they have been accessed the predicted number of times. DEWP predicts dead lines as soon as they receive their last access, and turns off these lines. Dirty lines are scheduled for write-back after the last write operation occurs, increasing the energy savings potential and also reducing the pressure on the memory controller. Both proposed mechanisms also reduce pollution in cache memories by prioritizing dead lines for eviction in the existing replacement policy. Although each mechanism is capable of operating separately inside a system, both mechanisms can also be mixed in the same cache hierarchy. This mixed implementation is interesting because the sub-block granularity is more suitable for cache levels closer to the processor, where the cache lines are quickly evicted, while the Last-Level Cache (LLC) tends to use the whole cache line before its eviction. In order to evaluate our proposed mechanisms, we introduce the Simulator of Non-Uniform Cache Architectures (SiNUCA). This cycle-accurate microarchitecture simulator is validated in terms of performance and energy consumption by comparing it to a real processor. Our performance results were obtained by executing single-threaded applications from SPEC-CPU2006 and multi-threaded applications from the SPEC-OMP2001 and NAS-NPB benchmark suites. The energy-related results were obtained by integrating SiNUCA with the Multi-core Power, Area, and Timing (McPAT) framework and the CACTI power modeling tool. 
When applying our mechanisms on all the cache levels, we observe on average a 36% energy reduction for DSBP, a 25% energy reduction for DEWP, and a 37% energy reduction when applying DSBP on L1 and L2 and DEWP on the last-level cache. All these reductions caused a negligible performance loss of less than 4% on average.
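
The sub-block behaviour that the abstract attributes to DSBP can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the thesis's implementation: the class names `SubBlock` and `CacheLine`, the `fill`, `access`, and `is_dead` methods, and the idea of passing the predicted use counts in directly (rather than producing them with a trained predictor) are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SubBlock:
    """One sub-block of a cache line (hypothetical toy model)."""
    predicted_uses: int    # 0 means "predicted unused": never fetched
    uses: int = 0          # accesses observed so far
    powered: bool = False  # whether the sub-block is currently powered on


@dataclass
class CacheLine:
    sub_blocks: List[SubBlock]

    @classmethod
    def fill(cls, predicted: List[int]) -> "CacheLine":
        # On a fill, fetch and power on only the sub-blocks
        # that are predicted to be used at least once.
        return cls([SubBlock(p, powered=(p > 0)) for p in predicted])

    def access(self, index: int) -> bool:
        """Return True on a sub-block hit, False on a sub-block miss."""
        sb = self.sub_blocks[index]
        if not sb.powered:
            # Either never fetched or already turned off.
            return False
        sb.uses += 1
        if sb.uses >= sb.predicted_uses:
            # Predicted number of accesses reached: power the sub-block off.
            sb.powered = False
        return True

    def is_dead(self) -> bool:
        # A line with no powered sub-blocks is a preferred eviction candidate.
        return not any(sb.powered for sb in self.sub_blocks)


line = CacheLine.fill([2, 0, 1, 0])  # assumed predictions for 4 sub-blocks
hits = [line.access(i) for i in (0, 1, 2, 0)]
print(hits, line.is_dead())
```

In this toy model, an access to a sub-block that was never fetched or has already been powered off counts as a miss, which is where a misprediction would cost performance in a real design; the line-level `is_dead` check mirrors the idea of prioritizing dead lines for eviction.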
