Spelling suggestions: "subject:"lowerpower design"" "subject:"compower design""
31 |
CDMA Channel Selection Using Switched Capacitor TechniqueNejadmalayeri, Amir Hossein January 2001 (has links)
CDMA channel selection requires sharp as well as wide-band Filtering. SAW Filters which have been used for this purpose are only available in IF range. In direct conversion receivers this has to be done at low frequencies. Switched Capacitor technique has been employed to design a low power, highly selective low-pass channel select Filter for CDMA wireless receivers. The topology which has been chosen ensures the low sensitivity of the Filter response. The circuit has been designed in a mixed-mode 0. 18u CMOS technology working with a single supply of 1. 8 V while its current consumption is less than 10 mA.
|
32 |
Algorithm/architecture codesign of low power and high performance linear algebra compute fabricsPedram, Ardavan 27 September 2013 (has links)
In the past, we could rely on technology scaling and new micro-architectural techniques to improve the performance of processors. Nowadays, both of these methods are reaching their limits. The primary concern in future architectures with billions of transistors on a chip and limited power budgets is power/energy efficiency. Full-custom design of application-specific cores can yield up to two orders of magnitude better power efficiency over conventional general-purpose cores. However, a tremendous design effort is required in integrating a new accelerator for each new application. In this dissertation, we present the design of specialized compute fabrics that maintain the efficiency of full custom hardware while providing enough flexibility to execute a whole class of coarse-grain operations. The broad vision is to develop integrated and specialized hardware/software solutions that are co-optimized and co-designed across all layers ranging from the basic hardware foundations all the way to the application programming support through standard linear algebra libraries. We try to address these issues specifically in the context of dense linear algebra applications. In the process, we pursue the main questions that architects will face while designing such accelerators. How broad is this class of applications that the accelerator can support? What are the limiting factors that prevent utilization of these accelerators on the chip? What is the maximum achievable performance/efficiency? Answering these questions requires expertise and careful codesign of the algorithms and the architecture to select the best possible components, datapaths, and data movement patterns resulting in a more efficient hardware-software codesign. In some cases, codesign reduces complexities that are imposed on the algorithm side due to the initial limitations in the architectures. We design a specialized Linear Algebra Processor (LAP) architecture and discuss the details of mapping of matrix-matrix multiplication onto it. We further verify the flexibility of our design for computing a broad class of linear algebra kernels. We conclude that this architecture can perform a broad range of matrix-matrix operations as complex as matrix factorizations, and even Fast Fourier Transforms (FFTs), while maintaining its ASIC level efficiency. We present a power-performance model that compares state-of-the-art CPUs and GPUs with our design. Our power-performance model reveals sources of inefficiencies in CPUs and GPUs. We demonstrate how to overcome such inefficiencies in the process of designing our LAP. As we progress through this dissertation, we introduce modifications of the original matrix-matrix multiplication engine to facilitate the mapping of more complex operations. We observe the resulting performance and efficiencies on the modified engine using our power estimation methodology. When compared to other conventional architectures for linear algebra applications and FFT, our LAP is over an order of magnitude better in terms of power efficiency. Based on our estimations, up to 55 and 25 GFLOPS/W single- and double-precision efficiencies are achievable on a single chip in standard 45nm technology. / text
|
33 |
Energy-efficient memory hierarchy for motion and disparity estimation in multiview video codingSampaio, Felipe Martin January 2013 (has links)
Esta dissertação de mestrado propõe uma hierarquia de memória para a Estimação de Movimento e de Disparidade (ME/DE) centrada nas referências da codificação, estratégia chamada de Reference-Centered Data Reuse (RCDR), com foco em redução de energia em codificadores de vídeo multivistas (MVC - Multiview Video Coding). Nos codificadores MVC, a ME/DE é responsável por praticamente 98% do consumo total de energia. Além disso, até 90% desta energia está relacionada com a memória do codificador: (a) acessos à memória externa para a busca das referências da ME/DE (45%) e (b) memória interna (cache) para manter armazenadas as amostras da área de busca e enviá-las para serem processadas pela ME/DE (45%). O principal objetivo deste trabalho é minimizar de maneira conjunta a energia consumida pelo módulo de ME/DE com relação às memórias externa e interna necessárias para a codificação MVC. A hierarquia de memória é composta por uma memória interna (a qual armazena a área de busca inteira), um controle dinâmico para a estratégia de power-gating da memória interna e um compressor de resultados parciais. Um controle de buscas foi proposto para explorar o comportamento da busca com o objetivo de atingir ainda mais reduções de energia. Além disso, este trabalho também agrega à hierarquia de memória um compressor de quadros de referência de baixa complexidade. A estratégia RCDR provê reduções de até 68% no consumo de energia quando comparada com estratégias estadoda- arte que são centradas no bloco atual da codificação. O compressor de resultados parciais é capaz de reduzir em 52% a comunicação com memória externa necessária para o armazenamento desses elementos. Quando comparada a técnicas de reuso de dados que não acessam toda área de busca, a estratégia RCDR também atinge os melhores resultados em consumo de energia, visto que acessos regulares a memórias externas DDR são energeticamente mais eficientes. O compressor de quadros de referência reduz ainda mais o número de acessos a memória externa (2,6 vezes menos acessos), aliando isso a perdas insignificantes na eficiência da codificação MVC. A memória interna requerida pela estratégia RCDR é até 74% menor do que estratégias centradas no bloco atual, como Level C. Além disso, o controle dinâmico para a técnica de power-gating provê reduções de até 82% na energia estática, o que é o melhor resultado entre os trabalho relacionados. A energia dinâmica é tratada pela técnica de união dos blocos candidatos, atingindo ganhos de mais de 65%. Considerando as reduções de consumo de energia atingidas pelas técnicas propostas neste trabalho, conclui-se que o sistema de hierarquia de memória proposto nesta dissertação atinge seu objetivo de atender às restrições impostas pela codificação MVC, no que se refere ao processamento do módulo de ME/DE. / This Master Thesis proposes a memory hierarchy for the Motion and Disparity Estimation (ME/DE) centered on the encoding references, called Reference-Centered Data Reuse (RCDR), focusing on energy reduction in the Multiview Video Coding (MVC). In the MVC encoders the ME/DE represents more than 98% of the overall energy consumption. Moreover, in the overall ME/DE energy, up to 90% is related to the memory issues, and only 10% is related to effective computation. The two items to be concerned with: (1) off-chip memory communication to fetch the reference samples (45%) and (2) on-chip memory to keep stored the search window samples and to send them to the ME/DE processing core (45%). The main goal of this work is to jointly minimize the on-chip and off-chip energy consumption in order to reduce the overall energy related to the ME/DE on MVC. The memory hierarchy is composed of an onchip video memory (which stores the entire search window), an on-chip memory gating control, and a partial results compressor. A search control unit is also proposed to exploit the search behavior to achieve further energy reduction. This work also aggregates to the memory hierarchy a low-complexity reference frame compressor. The experimental results proved that the proposed system accomplished the goal of the work of jointly minimizing the on-chip and off-chip energies. The RCDR provides off-chip energy savings of up to 68% when compared to state-of-the-art. the traditional MBcentered approach. The partial results compressor is able to reduce by 52% the off-chip memory communication to handle this RCDR penalty. When compared to techniques that do not access the entire search window, the proposed RCDR also achieve the best results in off-chip energy consumption due to the regular access pattern that allows lots of DDR burst reads (30% less off-chip energy consumption). Besides, the reference frame compressor is capable to improve by 2.6x the off-chip memory communication savings, along with negligible losses on MVC encoding performance. The on-chip video memory size required for the RCDR is up to 74% smaller than the MB-centered Level C approaches. On top of that, the power-gating control is capable to save 82% of leakage energy. The dynamic energy is treated due to the candidate merging technique, with savings of more than 65%. Due to the jointly off-chip communication and on-chip storage energy savings, the proposed memory hierarchy system is able to meet the MVC constraints for the ME/DE processing.
|
34 |
Energy-efficient memory hierarchy for motion and disparity estimation in multiview video codingSampaio, Felipe Martin January 2013 (has links)
Esta dissertação de mestrado propõe uma hierarquia de memória para a Estimação de Movimento e de Disparidade (ME/DE) centrada nas referências da codificação, estratégia chamada de Reference-Centered Data Reuse (RCDR), com foco em redução de energia em codificadores de vídeo multivistas (MVC - Multiview Video Coding). Nos codificadores MVC, a ME/DE é responsável por praticamente 98% do consumo total de energia. Além disso, até 90% desta energia está relacionada com a memória do codificador: (a) acessos à memória externa para a busca das referências da ME/DE (45%) e (b) memória interna (cache) para manter armazenadas as amostras da área de busca e enviá-las para serem processadas pela ME/DE (45%). O principal objetivo deste trabalho é minimizar de maneira conjunta a energia consumida pelo módulo de ME/DE com relação às memórias externa e interna necessárias para a codificação MVC. A hierarquia de memória é composta por uma memória interna (a qual armazena a área de busca inteira), um controle dinâmico para a estratégia de power-gating da memória interna e um compressor de resultados parciais. Um controle de buscas foi proposto para explorar o comportamento da busca com o objetivo de atingir ainda mais reduções de energia. Além disso, este trabalho também agrega à hierarquia de memória um compressor de quadros de referência de baixa complexidade. A estratégia RCDR provê reduções de até 68% no consumo de energia quando comparada com estratégias estadoda- arte que são centradas no bloco atual da codificação. O compressor de resultados parciais é capaz de reduzir em 52% a comunicação com memória externa necessária para o armazenamento desses elementos. Quando comparada a técnicas de reuso de dados que não acessam toda área de busca, a estratégia RCDR também atinge os melhores resultados em consumo de energia, visto que acessos regulares a memórias externas DDR são energeticamente mais eficientes. O compressor de quadros de referência reduz ainda mais o número de acessos a memória externa (2,6 vezes menos acessos), aliando isso a perdas insignificantes na eficiência da codificação MVC. A memória interna requerida pela estratégia RCDR é até 74% menor do que estratégias centradas no bloco atual, como Level C. Além disso, o controle dinâmico para a técnica de power-gating provê reduções de até 82% na energia estática, o que é o melhor resultado entre os trabalho relacionados. A energia dinâmica é tratada pela técnica de união dos blocos candidatos, atingindo ganhos de mais de 65%. Considerando as reduções de consumo de energia atingidas pelas técnicas propostas neste trabalho, conclui-se que o sistema de hierarquia de memória proposto nesta dissertação atinge seu objetivo de atender às restrições impostas pela codificação MVC, no que se refere ao processamento do módulo de ME/DE. / This Master Thesis proposes a memory hierarchy for the Motion and Disparity Estimation (ME/DE) centered on the encoding references, called Reference-Centered Data Reuse (RCDR), focusing on energy reduction in the Multiview Video Coding (MVC). In the MVC encoders the ME/DE represents more than 98% of the overall energy consumption. Moreover, in the overall ME/DE energy, up to 90% is related to the memory issues, and only 10% is related to effective computation. The two items to be concerned with: (1) off-chip memory communication to fetch the reference samples (45%) and (2) on-chip memory to keep stored the search window samples and to send them to the ME/DE processing core (45%). The main goal of this work is to jointly minimize the on-chip and off-chip energy consumption in order to reduce the overall energy related to the ME/DE on MVC. The memory hierarchy is composed of an onchip video memory (which stores the entire search window), an on-chip memory gating control, and a partial results compressor. A search control unit is also proposed to exploit the search behavior to achieve further energy reduction. This work also aggregates to the memory hierarchy a low-complexity reference frame compressor. The experimental results proved that the proposed system accomplished the goal of the work of jointly minimizing the on-chip and off-chip energies. The RCDR provides off-chip energy savings of up to 68% when compared to state-of-the-art. the traditional MBcentered approach. The partial results compressor is able to reduce by 52% the off-chip memory communication to handle this RCDR penalty. When compared to techniques that do not access the entire search window, the proposed RCDR also achieve the best results in off-chip energy consumption due to the regular access pattern that allows lots of DDR burst reads (30% less off-chip energy consumption). Besides, the reference frame compressor is capable to improve by 2.6x the off-chip memory communication savings, along with negligible losses on MVC encoding performance. The on-chip video memory size required for the RCDR is up to 74% smaller than the MB-centered Level C approaches. On top of that, the power-gating control is capable to save 82% of leakage energy. The dynamic energy is treated due to the candidate merging technique, with savings of more than 65%. Due to the jointly off-chip communication and on-chip storage energy savings, the proposed memory hierarchy system is able to meet the MVC constraints for the ME/DE processing.
|
35 |
Mixed RTL and gate-level power estimation with low power design iteration / Lågeffektsestimering på kombinerad RTL- och grind-nivå med lågeffekts design iterationNilsson, Jesper January 2003 (has links)
In the last three decades we have witnessed a remarkable development in the area of integrated circuits. From small logic devices containing some hundred transistors to modern processors containing several tens of million transistors. However, power consumption has become a real problem and may very well be the limiting factor of future development. Designing for low power is therefore increasingly important. To accomplice an efficient low power design, accurate power estimation at early design stage is essential. The aim of this thesis was to set up a power estimation flow to estimate the power consumption at early design stage. The developed flow spans over both RTL- and gate-level incorporating Mentor Graphics Modelsim (RTL-level simulator), Cadence PKS (gate- level synthesizer) and own developed power estimation tools. The power consumption is calculated based on gate-level physical information and RTL- level toggle information. To achieve high estimation accuracy, real node annotations is used together with an own developed on-chip wire model to estimate node voltage swing. Since the power estimation may be very time consuming, the flow also includes support for low power design iteration. This gives efficient power estimation speedup when concentrating on smaller sub- parts of the design.
|
36 |
Validation of Power Dissipation of SerDes IPsKas, Adem January 2021 (has links)
Post-Silicon validation of a designed ASIC is an essential step in the product development process. During the validation process, all specifications of the ASICs have to be controlled in a lab environment. Serializer/Deserialiser(SerDes) blocks in an ASIC are used to perform high-speed serial data communication between distinct integrated circuits. The goal of the thesis is to validate the power consumption of SerDes IP blocks provided by different vendors in an ASIC. To validate power consumption, current and voltage values are read from power supply lines. Then these values are digitized and stored on a Raspberry Pi. To perform these operations, the initial firmware provided by vendors is improved to control SerDes operations, and software is developed to control the Raspberry Pi. Power measured operation is performed for every possible data rate for each SerDes modules. Power measurement is also performed for different temperature range in industry standards with the highest possible data rate for each SerDes IP block. As a final step, measured power consumption values are compared to vendors’ data. / Validering av en designad ASIC efter kisel är ett viktigt steg i produktutvecklingsprocessen. Under valideringsprocessen måste alla specifikationer för ASIC kontrolleras i en laboratoriemiljö. Serializer / Deserialiser (SerDes) -block i en ASIC används för att utföra höghastighets seriell datakommunikation mellan distinkta integrerade kretsar. Målet med avhandlingen är att validera strömförbrukningen för SerDes IP-block som tillhandahålls av olika leverantörer i en ASIC. För att validera strömförbrukningen läses strömoch spänningsvärden från strömförsörjningsledningarna. Sedan digitaliseras dessa värden och lagras på en Raspberry Pi. För att utföra dessa operationer förbättras den inledande firmware som tillhandahålls av leverantörer för att styra SerDesoperationer och programvara utvecklas för att styra Raspberry Pi. Effektmätt operation utförs för varje möjlig datahastighet för varje SerDes-modul. Mätoperationer utförs också för olika temperaturintervall i branschstandarder med högsta möjliga datahastighet för varje SerDes IP-block. Som ett sista steg jämförs uppmätta energiförbrukningsvärden med leverantörens data.
|
37 |
Analytical and Experimental Performance Analysis of Enhanced Wake-Up Receivers Based on Low-Power Base-Band AmplifiersSchott, Lydia, Fromm, Robert, Bouattour, Ghada, Kanoun, Olfa, Derbel, Faouzi 09 June 2023 (has links)
With the introduction of Internet of Things (IoT) technology in several sectors, wireless,
reliable, and energy-saving communication in distributed sensor networks are more important than
ever. Thereby, wake-up technologies are becoming increasingly important as they significantly
contribute to reducing the energy consumption of wireless sensor nodes. In an indoor environment,
the use of wireless sensors, in general, is more challenging due to signal fading and reflections and
needs, therefore, to be critically investigated. This paper discusses the performance analysis of wakeup
receiver (WuRx) architectures based on two low frequency (LF) amplifier approaches with regard
to sensitivity, power consumption, and package error rate (PER). Factors that affect systems were
compared and analyzed by analytical modeling, simulation results, and experimental studies with
both architectures. The developedWuRx operates in the 868MHz band using on-off-keying (OOK)
signals while supporting address detection to wake up only the targeted network node. By using
an indoor setup, the signal strength and PER of received signal strength indicator (RSSI) in different
rooms and distances were determined to build a wireless sensor network. The results show a wake-up
packets (WuPts) detection probability of about 90% for an interior distance of up to 34 m.
|
38 |
The Thermal-Constrained Real-Time Systems Design on Multi-Core Platforms -- An Analytical ApproachSHA, SHI 21 March 2018 (has links)
Over the past decades, the shrinking transistor size enabled more transistors to be integrated into an IC chip, to achieve higher and higher computing performances. However, the semiconductor industry is now reaching a saturation point of Moore’s Law largely due to soaring power consumption and heat dissipation, among other factors. High chip temperature not only significantly increases packing/cooling cost, degrades system performance and reliability, but also increases the energy consumption and even damages the chip permanently. Although designing 2D and even 3D multi-core processors helps to lower the power/thermal barrier for single-core architectures by exploring the thread/process level parallelism, the higher power density and longer heat removal path has made the thermal problem substantially more challenging, surpassing the heat dissipation capability of traditional cooling mechanisms such as cooling fan, heat sink, heat spread, etc., in the design of new generations of computing systems. As a result, dynamic thermal management (DTM), i.e. to control the thermal behavior by dynamically varying computing performance and workload allocation on an IC chip, has been well-recognized as an effective strategy to deal with the thermal challenges.
Over the past decades, the shrinking transistor size, benefited from the advancement of IC technology, enabled more transistors to be integrated into an IC chip, to achieve higher and higher computing performances. However, the semiconductor industry is now reaching a saturation point of Moore’s Law largely due to soaring power consumption and heat dissipation, among other factors. High chip temperature not only significantly increases packing/cooling cost, degrades system performance and reliability, but also increases the energy consumption and even damages the chip permanently. Although designing 2D and even 3D multi-core processors helps to lower the power/thermal barrier for single-core architectures by exploring the thread/process level parallelism, the higher power density and longer heat removal path has made the thermal problem substantially more challenging, surpassing the heat dissipation capability of traditional cooling mechanisms such as cooling fan, heat sink, heat spread, etc., in the design of new generations of computing systems. As a result, dynamic thermal management (DTM), i.e. to control the thermal behavior by dynamically varying computing performance and workload allocation on an IC chip, has been well-recognized as an effective strategy to deal with the thermal challenges.
Different from many existing DTM heuristics that are based on simple intuitions, we seek to address the thermal problems through a rigorous analytical approach, to achieve the high predictability requirement in real-time system design. In this regard, we have made a number of important contributions. First, we develop a series of lemmas and theorems that are general enough to uncover the fundamental principles and characteristics with regard to the thermal model, peak temperature identification and peak temperature reduction, which are key to thermal-constrained real-time computer system design. Second, we develop a design-time frequency and voltage oscillating approach on multi-core platforms, which can greatly enhance the system throughput and its service capacity. Third, different from the traditional workload balancing approach, we develop a thermal-balancing approach that can substantially improve the energy efficiency and task partitioning feasibility, especially when the system utilization is high or with a tight temperature constraint. The significance of our research is that, not only can our proposed algorithms on throughput maximization and energy conservation outperform existing work significantly as demonstrated in our extensive experimental results, the theoretical results in our research are very general and can greatly benefit other thermal-related research.
|
39 |
Power and Energy Efficiency Evaluation for HW and SW Implementation of nxn Matrix Multiplication on Altera FPGAsRenbi, Abdelghani January 2009 (has links)
<p>In addition to the performance, low power design became an important issue in the design process of mobile embedded systems. Mobile electronics with rich features most often involve complex computation and intensive processing, which result in short battery lifetime and particularly when low power design is not taken in consideration. In addition to mobile computers, thermal design is also calling for low power techniques to avoid components overheat especially with VLSI technology. Low power design has traced a new era. In this thesis we examined several techniques to achieve low power design for FPGAs, ASICs and Processors where ASICs were more flexible to exploit the HW oriented techniques for low power consumption. We surveyed several power estimation methodologies where all of them were prone to at least one disadvantage. We also compared and analyzed the power and energy consumption in three different designs, which perform matrix multiplication within Altera platform and using state-of-the-art FPGA device. We concluded that NIOS II\e is not an energy efficient alternative to multiply nxn matrices compared to HW matrix multipliers on FPGAs and configware is an enormous potential to reduce the energy consumption costs.</p>
|
40 |
Conception de dispositifs de contrôle asynchrones et distribués pour la gestion de l’énergie / Design of control devices for distributed power managementAl Khatib, Chadi 01 March 2016 (has links)
Les systèmes intégrés sont aujourd’hui de plus en plus fréquemment confrontés à des contraintes de faible consommation ou d’efficacité énergétique. Ces problématiques se doivent d’être intégrées le plus en amont possible dans le flot de conception afin de réduire les temps de design et d’éviter de nombreuses itérations dans le flot. Dans ce contexte, le projet collaboratif HiCool, partenariat entre les laboratoires LIRMM et TIMA, les sociétés Defacto, Docea et ST Microelectronics, a mis en place une stratégie et un flot de conception pour concevoir des systèmes intégrés faible consommation tout en facilitant la réutilisation de blocks matériels (IPs) existants. L’approche proposée dans cette thèse s’intègre dans cette stratégie en apportant une petite dose d’asynchronisme dans des systèmes complètement synchrones. En effet, la réduction de la consommation est basée sur le constat que l’activation permanente de la totalité du circuit est inutile dans bien des cas. Néanmoins, contrôler l’activité avec des techniques de « clock gating » ou de « power gating » nécessitent usuellement d’effectuer un re-design du système et d’ajouter un organe de commande pour contrôler l’activation des zones effectuant un traitement. Le travail présenté dans ce manuscrit définit une stratégie basée sur des contrôleurs d’horloge et de domaine d’alimentation, asynchrones, distribués et facilement insérables dans un circuit avec un coût de re-design des plus réduit. / Today integrated systems are increasingly faced with the constraints of low consumption or energy efficiency. These issues need to be integrated as far upstream as possible in the design flow to reduce design time and avoid much iteration in the flow. In this context, the collaborative project HiCool, between LIRMM and TIMA laboratories, Defacto, Docea and ST Microelectronics companies, has set up a strategy and design flow to design integrated low power systems while facilitating the reuse of existing hardware blocks (IPs). The approach proposed in this thesis fits into this strategy by bringing a small dose of asynchrony in completely synchronous systems. Indeed, the reduction in consumption is based on the observation that permanent activation of the entire circuit is unnecessary in many cases. However, controlling the activity with techniques of "clock gating" or "power gating" usually need to perform a re-design of the system and to add a control device for controlling activation of areas effecting treatment. The work presented in this manuscript provides a strategy based clock controllers and power domain, asynchronous, distributed and easily insertable into a circuit with a low cost design.
|
Page generated in 0.0567 seconds