291

A VLSI architecture for a neurocomputer using higher-order predicates

Geller, Ronnie Dee
M.S., Computer Science & Engineering. Some biological aspects of neural interactions are presented and used as the basis for a computational model in the development of a new type of computer architecture. A VLSI microarchitecture is proposed that efficiently implements the neural-based computing methods. An analysis of the microarchitecture shows that it is feasible using currently available VLSI technology. The performance expectations of the proposed system are analyzed and compared to those of conventional computer systems executing similar algorithms; the proposed system is shown to have comparatively attractive performance and cost/performance characteristics. Some discussion is given of system-level characteristics, including initialization and learning.
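The abstract does not specify the neural computation the architecture implements, so the following is only a generic weighted-sum-and-threshold neuron sketch, illustrating the class of operation that neural-based hardware of this kind typically accelerates; all names and values are illustrative.

```python
# A generic weighted-sum-and-threshold neuron update -- illustrative only;
# the thesis's higher-order-predicate model is not specified in this abstract.

def neuron_output(inputs, weights, threshold):
    """Fire (1) when the weighted sum of inputs crosses the threshold."""
    activation = sum(x * w for x, w in zip(inputs, weights))
    return 1 if activation >= threshold else 0

# Example: a 3-input neuron.
print(neuron_output([1, 0, 1], [0.5, -0.2, 0.7], threshold=1.0))  # -> 1
```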
292

Simulation and performance evaluation of a graph reduction machine architecture

Sarangi, Ananda G.
M.S., Computer Science & Engineering. The Graph Reduction Machine (G-Machine) is an architecture intended to achieve high performance in executing functional language programs. The success or failure of this novel architecture can only be determined by its performance in executing "real" programs. The G-Machine simulator described in this thesis makes detailed studies of the architecture's performance possible even though the hardware implementation of a G-Machine is not complete.
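As a reading aid, here is a minimal sketch of graph reduction, the evaluation model the G-Machine implements: the program is a graph, and reducing an expression overwrites its root node in place, so shared subexpressions are evaluated only once. This shows the core idea only; the actual G-Machine compiles supercombinators to instruction sequences, which this sketch does not attempt.

```python
# Minimal graph reduction sketch (illustrative; not the G-Machine's compiled
# instruction sequences). An expression is a graph of nodes; reducing a node
# overwrites it in place, so shared subexpressions are evaluated only once.

class Node:
    def __init__(self, tag, *fields):
        self.tag, self.fields = tag, list(fields)

def reduce(node):
    if node.tag == "num":
        return node.fields[0]
    if node.tag == "add":
        left, right = node.fields
        value = reduce(left) + reduce(right)
        node.tag, node.fields = "num", [value]   # update the root: sharing pays off
        return value

# (shared + shared) where shared = 2 + 3: 'shared' is reduced once, reused once.
shared = Node("add", Node("num", 2), Node("num", 3))
root = Node("add", shared, shared)
print(reduce(root))               # -> 10
print(shared.tag, shared.fields)  # -> num [5]: overwritten in place
```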
293

Modeling and Optimization of Delay and Power for Key Components of Modern High-performance Processors

Safi, Elham, 13 April 2010
In designing a new processor, computer architects consider a myriad of possible organizations and designs to decide which best meets the constraints on performance, power, and cost for each particular processor. To identify practical designs, architects need insight into the physical-level characteristics (delay, power, and area) of various components of modern processors implemented in recent fabrication technologies. During early stages of design exploration, however, developing physical-level implementations for the various design options (often on the order of thousands) is impractical or undesirable due to time and/or cost constraints. In lieu of actual measurements, analytical and/or empirical models can offer reasonable estimates of these physical-level characteristics. However, existing models tend to be outdated for three reasons: (i) they were developed from old circuits in old fabrication technologies; (ii) the high-level designs of the components have evolved, and older designs may no longer be representative; and (iii) the overall architecture of processors has changed significantly, and new components for which no models exist have been introduced or are being considered. This thesis studies three key components of modern high-performance processors: Counting Bloom Filters (CBFs), Checkpointed Register Alias Tables (RATs), and Compacted Matrix Schedulers (CMSs). CBFs optimize membership tests (e.g., whether a block is cached). The RAT and the CMS increase the opportunities for exploiting instruction-level parallelism: the RAT is the core of the renaming stage, and the CMS is an implementation of the instruction scheduler. Physical-level studies or models for these components have been limited or non-existent. In addition to investigating these components at the physical level, this thesis (i) proposes a novel speed- and energy-efficient CBF implementation; (ii) studies how the number of RAT checkpoints affects its latency and energy, and overall processor performance; and (iii) studies the CMS and its accompanying logic at the physical level. This thesis also develops empirical and analytical latency and energy models that can be adapted for newer fabrication technologies. Additionally, it proposes physical-level latency and energy optimizations for these components, motivated by design inefficiencies exposed during the physical-level study phase.
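For context, the textbook counting Bloom filter works as sketched below: k hash functions index an array of counters, inserts increment, deletes decrement, and a membership test reports "possibly present" only if all k counters are nonzero. This is the baseline data structure only; the thesis's speed- and energy-optimized circuit implementation is not reproduced here, and the hashing scheme below is an arbitrary choice.

```python
# Textbook counting Bloom filter (the data structure the thesis optimizes at
# the circuit level; this sketch says nothing about that implementation).
import hashlib

class CountingBloomFilter:
    def __init__(self, m=64, k=3):
        self.counts = [0] * m
        self.m, self.k = m, k

    def _indexes(self, item):
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.m

    def insert(self, item):
        for idx in self._indexes(item):
            self.counts[idx] += 1

    def remove(self, item):          # counters (unlike plain bit vectors) allow deletion
        for idx in self._indexes(item):
            self.counts[idx] -= 1

    def maybe_contains(self, item):  # False is definite; True may be a false positive
        return all(self.counts[idx] > 0 for idx in self._indexes(item))

cbf = CountingBloomFilter()
cbf.insert("block:0x1A2B")
print(cbf.maybe_contains("block:0x1A2B"))  # -> True
cbf.remove("block:0x1A2B")
print(cbf.maybe_contains("block:0x1A2B"))  # -> False
```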
294

A new RISC architecture for high speed data acquisition

Gribble, Donald L., 12 November 1991
This thesis describes the design of a RISC architecture for high-speed data acquisition. The structure of existing data acquisition systems is first examined. An instruction set is created to allow the data acquisition system to serve a wide variety of applications, and the architecture is designed to execute one instruction each clock cycle. The utility of the RISC system is illustrated by implementing several representative applications. Performance of the system is analyzed and future enhancements are discussed. Graduation date: 1992.
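The one-instruction-per-clock-cycle discipline can be pictured with a toy simulator loop, sketched below. The opcodes are invented for illustration; the thesis's actual data acquisition instruction set is not given in this abstract.

```python
# One-instruction-per-clock-cycle execution, sketched as a simulator loop.
# The opcodes below are invented for illustration; the thesis's data
# acquisition ISA is not reproduced in this abstract.

def run(program, samples):
    regs, acquired, pc, cycle = [0] * 4, [], 0, 0
    while pc < len(program):
        op, *args = program[pc]
        if op == "SAMPLE":                    # read the next value from the input channel
            regs[args[0]] = samples.pop(0)
        elif op == "STORE":                   # record an acquired value
            acquired.append(regs[args[0]])
        elif op == "JNZ":                     # loop while data remains
            pc = args[1] - 1 if regs[args[0]] != 0 else pc
        pc += 1
        cycle += 1                            # exactly one instruction per cycle
    return acquired, cycle

data, cycles = run([("SAMPLE", 0), ("STORE", 0), ("JNZ", 0, 0)], [7, 3, 0])
print(data, cycles)   # -> [7, 3, 0] 9
```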
296

Verification-Aware Processor Design

Lungu, Anita, January 2009
As technological advances enable computers to permeate many of our society's critical application domains (such as medicine, finance, and transportation), the requirement that computers always behave correctly becomes critical as well. Currently, ensuring that processor designs are correct represents a major challenge for the computing industry, consuming the majority (up to 70%) of the resources allocated to the creation of a new processor. Looking towards the future, each new processor generation fits even more transistors on the same chip area and makes more complex designs possible, so the difficulty of the design verification problem is unlikely to decrease by itself.

We believe that the difficulty of the design verification problem is compounded by the current processor design flow. In most design cycles, a design's verifiability is not explicitly considered at an early stage, when decisions are most influential, because the initial focus is exclusively on improving the design on more traditional metrics like performance, power, and area. The resulting design can thus be very difficult to verify in the end, precisely because its verifiability was not ranked high on the priority list in the beginning.

In this thesis we propose to view verifiability as a critical design constraint to be considered, together with established metrics like performance and power, from the initial stages of design. Our high-level goal is for this approach to make designs more verifiable, which would both decrease the resources invested in the verification step and lead to more robust designs.

More specifically, we make five main contributions in this thesis. The first is our proposal for a change in design perspective towards considering verifiability as a first-class constraint. Second, we use formal verification (through a combination of theorem proving, model checking, and probabilistic model checking) to quantitatively evaluate the impact on verifiability of various design choices like the organization of caches, TLBs, the pipeline, the operand bypass network, and dynamic power management mechanisms. Our third contribution is to evaluate design trade-offs between verifiability and other established metrics, like performance and power, in the context of multi-core dynamic power management schemes. Fourth, we re-design several components to increase their verifiability. Finally, we propose design guidelines for increasing verifiability. In the context of single-core processors our guidelines refer to the organization of caches and translation lookaside buffers (TLBs), the depth of the core's pipeline, and the type of ALUs used, while for multi-core processors we refer to dynamic power management schemes (DPMs) for power capping.

Our results confirm that making design choices with verifiability as a first-class design constraint has the capacity to decrease the verification effort. Furthermore, making explicit trade-offs between verifiability, performance, and power helps identify better design points for given verification, performance, and power goals. Dissertation.
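For readers unfamiliar with the model-checking side of this methodology, the sketch below shows the essential mechanism: an exhaustive breadth-first search over reachable states, checking an invariant in each. It is a generic illustration under the assumption of a finite, explicitly enumerable state space, not the models or tools used in the dissertation.

```python
# Generic explicit-state model checking sketch: breadth-first search over all
# reachable states, checking an invariant in each. Purely illustrative of the
# technique; the dissertation's actual models and tools are not reproduced here.
from collections import deque

def check_invariant(initial, successors, invariant):
    """Return a violating state if one is reachable, else None."""
    seen, frontier = {initial}, deque([initial])
    while frontier:
        state = frontier.popleft()
        if not invariant(state):
            return state
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return None

# Toy model: a 2-bit counter that must never reach 3.
succ = lambda s: [(s + 1) % 4]
print(check_invariant(0, succ, lambda s: s != 3))  # -> 3 (violation found)
```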
297

SoftCache Architecture

Fryman, Joshua Bruce, 19 July 2005
Multiple trends in computer architecture are beginning to collide as process technology reaches ever smaller feature sizes. Problems with managing power, access times across a die, and increasing complexity to sustain growth are now blocking commercial products like the Pentium 4. These problems also occur in the embedded system space, albeit in a slightly different form. However, as process technology marches on, today's high-performance space is becoming tomorrow's embedded space. New techniques are needed to overcome these problems. In this thesis, we propose a novel architecture called SoftCache to address these emerging issues for embedded systems. We reduce the on-die memory controller infrastructure, which reduces both power and space requirements, using the ubiquitous network device arena as a proving ground of viability. In addition, the SoftCache achieves further power and area savings by converting on-die cache structures into directly addressable SRAM and reducing or eliminating the external DRAM. To spare the application developer the programming complexity this approach presents, we provide a transparent client-server dynamic binary translation system that runs arbitrary ELF executables on a stripped-down embedded target. One drawback of such a scheme lies in the overhead of the additional instructions required to effect cache behavior, particularly with respect to data caching. Another drawback is the power used when fetching from remote memory over the network. The SoftCache comprises a dynamic client-server translation system on simplified hardware, targeted at Intel XScale client devices controlled from servers over the network. Reliance upon a network server as a "backing store" introduces new levels of complexity, yet also allows for more efficient use of local space. The explicitly software-managed aspects create a cache of variable line size, full associativity, and high flexibility. This thesis explores these particular issues, while approaching everything from the perspective of feasibility and actual architectural changes.
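A rough picture of the software-managed caching idea: on a miss, the client fetches the line from the server over the network, while hits are pure local lookups. The sketch below assumes a fixed capacity and LRU eviction for brevity, whereas the actual SoftCache manages variable-size lines through dynamic binary translation; all names are illustrative.

```python
# Software-managed cache sketch: a fully associative, LRU-evicted map backed
# by a remote "server" fetch on miss. Illustrative only -- the real SoftCache
# works via dynamic binary translation, with variable-size lines.
from collections import OrderedDict

def fetch_from_server(addr):
    # Stand-in for a network fetch from the backing-store server.
    return f"line@{addr:#x}"

class SoftCache:
    def __init__(self, capacity=4):
        self.lines = OrderedDict()         # full associativity: any line anywhere
        self.capacity = capacity

    def load(self, addr):
        if addr in self.lines:             # hit: pure software lookup
            self.lines.move_to_end(addr)
            return self.lines[addr]
        data = fetch_from_server(addr)     # miss: costly remote fetch
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False) # evict the least recently used line
        self.lines[addr] = data
        return data

cache = SoftCache()
print(cache.load(0x1000))  # miss -> remote fetch
print(cache.load(0x1000))  # hit  -> locally resident copy
```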
298

Temporal and spatial modeling of analog memristors

Greenlee, Jordan, 08 July 2011
As silicon meets its performance limits, new materials and methods for advancing computing and electronics as a whole are being intensely researched, as described in Chapter 1. Memristors are a fusion of these two research areas, with new materials being pursued concurrently with the development of novel architectures that take advantage of these new devices. A background on memristors and an overview of memristive developments in the field are given in Chapter 2. Chapter 3 delves into the physical mechanisms of analog memristors. To investigate and understand the operation of analog memristors, a finite element method model has been developed. The first simulated device is a simple memristor in which the lithium ions (dopants) are confined to the device but allowed to move in response to a voltage applied across it. To model a more physical memristor, charge-carrier mobility dependence on dopant levels was then added, resulting in a simulated device that operates similarly to the first. Thereafter, the effect of varying geometries was modeled, and it was determined that both the speed and the resistance change of the device improve as the ratio of the top and bottom metal contact lengths increases in a restrictive-flow geometry. Finally, the effect of dopant removal was investigated; it was determined that if the greatest change in resistance is required, dopant removal is the optimal operating regime for an analog memristor. Through the greater understanding of analog memristors developed by the simulation described herein, researchers will be able to better harness their power and implement them in bio-inspired systems and architectures.
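The thesis's finite element simulations are far richer than any compact model, but the classic linear ion-drift memristor model (Strukov et al., 2008) captures the core behavior described here: resistance depends on the position w of the dopant front, and current drives the front's drift. A forward-Euler sketch, with textbook parameter values assumed:

```python
# Linear ion-drift memristor model (Strukov et al., 2008) integrated with
# forward Euler -- a far coarser sketch than the thesis's finite element
# simulations, but it shows resistance depending on dopant-front position w.
import math

R_ON, R_OFF = 100.0, 16e3        # fully doped / undoped resistance (ohms)
D, MU = 10e-9, 1e-14             # device thickness (m), ion mobility (m^2/(V*s))
dt, w = 1e-4, 0.1 * D            # Euler time step (s), initial dopant-front position

def memristance(w):
    # Series combination of the doped and undoped regions.
    return R_ON * (w / D) + R_OFF * (1 - w / D)

print(f"initial memristance: {memristance(w):.0f} ohms")
for step in range(5000):                     # one positive half-cycle at 1 Hz
    v = math.sin(2 * math.pi * 1.0 * step * dt)
    i = v / memristance(w)
    w += MU * (R_ON / D) * i * dt            # linear ion drift moves the front
    w = min(max(w, 0.0), D)                  # dopants stay confined to the device
print(f"final memristance:   {memristance(w):.0f} ohms")
```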
299

Performance enhancing software loop transformations for embedded VLIW/EPIC processors

Akturan, Cagdas, January 2001
Ph.D. thesis, University of Texas at Austin, 2001. Vita. Includes bibliographical references. Also available from UMI/Dissertation Abstracts International.
300

Architectural techniques to accelerate multimedia applications on general-purpose processors

Talla, Deependra, January 2001
Ph.D. thesis, University of Texas at Austin, 2001. Vita. Includes bibliographical references. Also available from UMI/Dissertation Abstracts International.
