Global ETD Search

411	Reducing Subthreshold Leakage Power Through Hybrid MOSFET-NEMS Power Gating Kindel, David Garret 01 September 2016 Modern devices such as smartphones and smartwatches spend much of their life idle, waiting for external events. During this time they still expend energy, draining the battery. Rising power consumption is a growing concern for users and researchers alike. Power gating, which turns off blocks of hardware when they are idle, reduces static power consumption. The Metal-Oxide-Semiconductor Field-Effect Transistors (MOSFETs) currently employed in processors leak current, so even in power-gated circuits, MOSFET power gating may save only 60-80% of power. A different type of switch, the Nanoelectromechanical Systems (NEMS) switch, presents an air gap between the source and drain in the off state, eliminating subthreshold leakage current. The NEMS switch is slower to operate, however, and survives only a finite number of switching cycles before breaking, so it should be switched sparingly. This thesis proposes a hybrid power gating model in which a MOSFET is placed in series with a NEMS switch. Power gating of a processor's Floating Point Unit (FPU) is studied using modern open-source computer architecture simulators. Each switch type is used to model power gating in order to observe energy savings and performance costs. The hybrid power gating model is more flexible across a variety of applications. For applications with low FPU activity, its energy savings are comparable to those of power gating with a single NEMS switch, and any performance loss remains low, matching that of MOSFETs. Processor energy costs are heavily reduced while devices continue to operate at near-optimal speed. / Master of Science / Modern devices such as smartphones and smartwatches spend much of their life idle, waiting for external input. During this time they still expend energy, draining the battery. The transistors inside them, the minuscule electronics that make these devices work, are not perfect and “leak” current even when not in use. Another type of switch, a mechanical one, has been under development over the last decade. This mechanical switch is slower to operate and not as reliable as current transistors, yet it yields a complete disconnection when turned off, so no energy is wasted while a device sits idle. Although this saves more energy, a mechanical switch can also degrade a device’s performance because of its slow operation. This thesis analyzes the effectiveness of combining the two types of switches in one design. The fast switching of today's transistors can be used when it is difficult to determine whether shutting down a piece of hardware is a good decision. Once it has been determined that the circuit may be put to sleep for a long time, the slower but more energy-efficient mechanical switch can be used. With this hybrid operation, each switch is used only in the mode that suits it best. NEMS Power Gating Low Power Simulator Computer Architecture
412	Hardware Control Unit For Trusted Program Verification System Alt, Jake Owen 01 October 2024 Trust in the underlying hardware is the foundational step toward trusting the correctness and integrity of a software application. However, verifying that today's extremely complex processors work exactly as intended has not been feasible, as evidenced by several recent hardware bugs. Trustworthy, formally verified processors currently forgo intricate performance enhancements such as out-of-order execution, which hampers them substantially relative to their less secure counterparts. The Containment Architecture with Verified Output (CAVO) system solves this problem by isolating the host system and requiring the result of each instruction to be validated by a small, trusted hardware module called the Sentry. Any transmission to the outside world must pass through the Sentry, which ensures that all prior instructions have been computed correctly. The first version of CAVO was centered on a customized host CPU with hardware modifications to manage the Sentry with minimal overhead, while the second used compiler tooling and a software version of the Sentry controller, incurring a significant performance penalty on checked programs. This paper proposes a novel hardware-based Sentry control system that serves as a first step toward fast checking of native programs while greatly reducing modifications to the host, all without expanding the root of trust. We implement a proof-of-concept hardware design and verify its correctness using two SPECINT2006 benchmarks, demonstrating steady-state performance of 1 instruction per clock and an average overhead of 45 clocks per cache miss. Computer Architecture Security Memory Integrity Hardware Security Computer and Systems Architecture
413	Investigation of the Effect of Functional Units/Connectivity Arrangement on Energy Consumption of Reconfigurable Architectures Using an Interactive Design Framework Bhargava, Arpita 08 1900 Allocation of expensive resources (such as multipliers) onto a coarse-grained reconfigurable architecture (CGRA) has been of interest for quite some time. For these architectural solutions to fulfill designers' requirements, the design must offer high performance, low power consumption, and effective area utilization. The allocation problem is studied using the UntangledII gaming environment, which was developed at the Reconfigurable Computing Lab at UNT to discover designs of custom domain-specific architectures. This thesis explores several case studies that investigate how functional units and interconnects can be arranged to achieve low-power, high-performance, flexible heterogeneous designs that fit a suite of applications. The later part presents the mapping strategies that top- and bottom-ranked players used to design a custom domain-specific architecture, and discusses common trends observed while analyzing those strategies. Custom domain-specific architectures UntangledII Adaptive computing systems. Computer architecture.
414	Arquitetura modular de processador multicore, flexível, segura e tolerante a falhas, para sistemas embarcados ciberfísicos. / Modular multicore processor architecture, flexible, secure and fault tolerant, for embedded cyber-physical systems. Cesar Giacomini Penteado 08 December 2010 Cyber-physical systems (CPS) are systems in which computing and physics are united. CPS will be used in a wide range of areas, forming a new era of products, and will be everywhere, used by anyone and for any task. Applications for CPS include highly reliable medical systems and devices, traffic control and security, advanced automotive systems, process control, energy conservation, environmental control, aviation, instrumentation, control of critical infrastructure, defense systems, manufacturing, and smart structures. The CPS scenario will require embedded processors to improve in characteristics beyond I/O processing, power consumption, and communication: future processor architectures must also offer security, fault tolerance, and the architectural flexibility to adapt to the various CPS target scenarios. In this context, this doctoral thesis proposes a modular multicore architecture (AMM) aimed at CPS, composed of multicore processors, dedicated hardware, or both. A processor for the AMM architecture is proposed, and its correct operation is evaluated through simulations in Modelsim and in integrated-circuit simulation tools. A prototype of a first version of the AMM architecture is presented, along with programs written specifically to demonstrate the main features of the architecture. The thesis also presents functional tests on FPGA for the base processor of the AMM prototype, implementation data such as FPGA utilization and silicon area, and an ASIC prototype of the AMM core. The AMM prototype was analyzed with critical applications used in CPS, such as security, redundancy, and fault tolerance, and these tests suggest that future CPS processors must have these characteristics. The thesis shows that these requirements can be included in multicore embedded systems dedicated to the applications and needs of CPS. Arquitetura de computadores Confiabilidade Sistemas ciberfísicos Sistemas embarcados Computer architecture Computer architecture and organization Embedded systems Reliability
416	Multi-core processors and the future of parallelism in software Youngman, Ryan Christopher 01 January 2007 The purpose of this thesis is to examine multi-core technology. Multi-core architecture provides benefits such as lower power consumption, scalability, and improved application performance enabled by thread-level parallelism. Computer architecture High performance processors Simultaneous multithreading processors Systems Architecture
417	Low-cost and efficient architectural support for correctness and performance debugging Venkataramani, Guru Prasadh V. 15 July 2009 With rapid growth in computer hardware technologies and architectures, software programs have become increasingly complex and error-prone. This software complexity has resulted in program crashes and even security threats. Correctness debugging ensures that a program does not exhibit any unintended behavior at runtime. A fully correct program that performs poorly, however, brings no commercial success to the software product, so performance debugging ensures good performance on hardware platforms. Many prior debugging solutions either suffer from huge performance overheads or incur high implementation costs. We propose low-cost and efficient hardware solutions that target three specific correctness and performance problems: memory debugging, taint propagation, and comprehensive cache miss classification. Experiments show that our mechanisms incur low performance overheads and can be designed with minimal changes to existing processor hardware. While architects invest time and resources in designing high-end architectures, we show that it is equally important to incorporate useful debugging features into these processors to make them easier for programmers to use. Scalability in multi-cores Debugging Computer architecture Computer programs Correctness Debugging in computer science Computer systems Reliability
418	Design and evaluation of a technology-scalable architecture for instruction-level parallelism Nagarajan, Ramadass, January 1900 Thesis (Ph. D.)--University of Texas at Austin, 2007. / Vita. Includes bibliographical references.
419	A semi-formal comparison between the Common Object Request Broker Architecture (CORBA) and the Distributed Component Object Model (DCOM) Conradie, Pieter Wynand 06 1900 The way in which application systems and software are built has changed dramatically over the past few years. This is mainly due to advances in hardware technology and programming languages, as well as the requirement to build better software application systems in less time. The importance of mondial (worldwide) communication between systems is also growing exponentially. People use network-based applications daily, communicating not only locally but also globally. The Internet, the global network, therefore plays a significant role in the development of new software. Distributed object computing is one of the computing paradigms that promise to meet the need to develop client/server application systems that communicate over heterogeneous environments. This study, of limited scope, concentrates on one crucial element without which distributed object computing cannot be implemented. This element is the communication software, also called middleware, which allows objects situated on different hardware platforms to communicate over a network. Two of the most important middleware standards for distributed object computing today are the Common Object Request Broker Architecture (CORBA) from the Object Management Group and the Distributed Component Object Model (DCOM) from Microsoft Corporation. Each of these standards is implemented in commercially available products, allowing distributed objects to communicate over heterogeneous networks. In studying each of the middleware standards, a formal way of comparing CORBA and DCOM is presented, namely meta-modelling. Meta-models are constructed for each of these two distributed object infrastructures (middleware). Based on this uniform and unbiased approach, a comparison of the two distributed object infrastructures is then performed. 
The results are given as a set of tables in which the differences and similarities of each distributed object infrastructure are exhibited. By adopting this approach, errors caused by misunderstanding or misinterpretation are minimised. Consequently, an accurate and unbiased comparison between CORBA and DCOM is made possible, which constitutes the main aim of this dissertation. / Computing / M. Sc. (Computer Science) 004.36 CORBA (Computer architecture) DCOM (Computer architecture) Client/server computing Internet programming
420	RETHROTTLE : Execution Throttling In The REDEFINE SoC Architecture Satrawala, Amar Nath 06 1900 REDEFINE is a reconfigurable SoC architecture that provides a unique platform for high-performance, low-power computing by exploiting the synergy between a coarse-grain dynamic dataflow model of computation (which exposes abundant parallelism in applications) and runtime composition of efficient compute structures (on the reconfigurable computation resources). A computer architecture based on the dynamic dataflow model of computation would need infinite resources to exploit all available parallelism in all applications, which is not feasible for any real implementation. With limited resources, performance may be lost because the available parallelism cannot be exploited efficiently. In this thesis, we study the throttling of execution in the REDEFINE architecture to maximize architectural efficiency. We formulate this as a design space exploration problem at two levels: architectural configurations and throttling schemes. Reduced-feature (high-level) simulation and feature-specific analytical approaches are very useful for selectively exploring architectures and systems early in the design phase. Our approach is similar to that of the SEASAME Framework, which is used to study MPSoC (Multiprocessor SoC) architectures. We use abstraction (feature reduction) at the levels of architecture and model of computation to make the problem approachable and practically feasible. A feature-specific, fast hybrid (mixed-level) simulation framework for early-design-phase study is developed and implemented to explore the huge design space (1284 throttling schemes, 128 architectural configurations and 10 applications, i.e. about 1.6 million executions). Our performance modeling covers the selection of important performance criteria, the ranking of the explored throttling schemes, and an investigation of the effectiveness of the design space exploration using statistical hypothesis testing. Some of the results are intuitive and others counterintuitive. Two performance criteria, Exec.T and Avg.TU, were found sufficient to represent the performance and resource usage characteristics of the architecture, independent of the throttling schemes, the architectural configurations and the applications. The ranking of the throttling schemes based on the selected performance criteria is statistically highly significant. The intuitive throttling schemes span the range of performance from best to worst. We found no trade-off among the performance criteria. The best throttling schemes simultaneously give appreciable gains in overall performance (25%) and resource usage (37%) in the throttling unit. The design space exploration of the throttling schemes is found to be fine and uniform. SoC Architecture Computer Architecture Semiconductor-on-Chip Architecture Dataflow Models Throttling Computer Simulation REDEFINE Architecture Computer Architecture - Modeling Hybrid Computer Simulation Von Neumann Architecture Coarse Grain Computer Science
