Spelling suggestions: "subject:"computer architecture -- design"" "subject:"computer architecture -- 1design""
1 |
A methodology for automated design of computer instruction setsBennett, J. P. January 1987 (has links)
No description available.
|
2 |
Complete Design Methodology of a Massively Parallel and Pipelined Memristive Stateful IMPLY Logic Based Reconfigurable ArchitectureRahman, Kamela Choudhury 06 June 2016 (has links)
Continued dimensional scaling of CMOS processes is approaching fundamental limits and therefore, alternate new devices and microarchitectures are explored to address the growing need of area scaling and performance gain. New nanotechnologies, such as memristors, emerge. Memristors can be used to perform stateful logic with nanowire crossbars, which allows for implementation of very large binary networks that can be easily reconfigured. This research involves the design of a memristor-based massively parallel datapath for various applications, specifically SIMD (Single Instruction Multiple Data) like architecture, and parallel pipelines. The dissertation develops a new model of massively parallel memristor-CMOS hybrid datapath architectures at a systems level, as well as a complete methodology to design them. One innovation of the proposed approach is that the datapath design is based on space-time diagrams that use stateful IMPLY gates built from binary memristors. This notation aids in the circuit minimization in logic design, calculations of delay and memristor costs, and sneak-path avoidance. Another innovation of the proposed methodology is a general, new, architecture model, MsFSMD (Memristive stateful Finite State Machine with Datapath) that has two interacting sub-systems: 1) a controller composed of a memristive RAM, MsRAM, to act as a pulse generator, along with a finite state machine realized in CMOS, a CMOS counter, CMOS multiplexers and CMOS decoders, 2) massively parallel, pipelined, datapath realized with a new variant of a CMOL-like nanowire crossbar array, MsCMOL (Memristive stateful CMOL), with binary stateful memristor-based IMPLY gates. Next contribution of the dissertation is the new type of FPGA. In contrast to the previous memristor-based FPGA (mrFPGA), the proposed MsFPGA (Memristive stateful logic Field Programmable Gate Array) uses memristors for memory, connections programming, and combinational logic implementation. With a regular structure of square abutting blocks of memristive nanowire crossbars and their short connections, proposed architecture is highly reconfigurable. As an example of using the proposed new FPGA to realize biologically inspired systems, the detailed design of a pipelined Euclidean Distance processor was presented and its various applications are mentioned. Euclidean Distance calculation is widely used by many neural network and associative memory based algorithms.
|
3 |
The Design of a Simple, Spiking Sparse Coding Algorithm for Memristive HardwareWoods, Walt 11 March 2016 (has links)
Calculating a sparse code for signals with high dimensionality, such as high-resolution images, takes substantial time to compute on a traditional computer architecture. Memristors present the opportunity to combine storage and computing elements into a single, compact device, drastically reducing the area required to perform these calculations. This work focused on the analysis of two existing sparse coding architectures, one of which utilizes memristors, as well as the design of a new, third architecture that employs a memristive crossbar. These architectures implement either a non-spiking or spiking variety of sparse coding based on the Locally Competitive Algorithm (LCA) introduced by Rozell et al. in 2008. Each architecture receives an arbitrary number of input lines and drives an arbitrary number of output lines. Training of the dictionary used for the sparse code was implemented through external control signals that approximate Oja's rule. The resulting designs were capable of representing input in real-time: no resets would be needed between frames of a video, for instance, though some settle time would be needed. The spiking architecture proposed is novel, emphasizing simplicity to achieve lower power than existing designs.
The architectures presented were tested for their ability to encode and reconstruct 8 x 8 patches of natural images. The proposed network reconstructed patches with a normalized, root-mean-square error of 0.13, while a more complicated CMOS-only approach yielded 0.095, and a non-spiking approach yielded 0.074. Several outputs competing for representation of the input was shown to improve reconstruction quality and preserve more subtle components in the final encoding; the proposed algorithm lacks this feature. Steps to address this were proposed for future work by scaling input spikes according to the current expected residual, without adding much complexity. The architectures were also tested with the MNIST digit database, passing a sparse code onto a basic classifier. The proposed architecture scored 81% on this test, a CMOS-only spiking variant scored 76%, and the non-spiking algorithm scored 85%. Power calculations were made for each design and compared against other publications. The overall findings showed great promise for spiking memristor-based ASICs, consuming only 28% of the power used by non-spiking architectures and 6.6% as much power as a CMOS-only spiking architecture on this task. The spike-based nature of the novel design was also parameterized into several intuitive parameters that could be adjusted to prefer either performance or power efficiency.
The design and analysis of architectures for sparse coding should greatly reduce the amount of future work needed to implement an end-to-end classification pipeline for images or other signal data. When lower power is a primary concern, the proposed architecture should be considered as it surpassed other published algorithms. These pipelines could be used to provide low-power visual assistance, highlighting objects within high-definition video frames in real-time. The technology could also be used to help self-driving cars identify hazards more quickly and efficiently.
|
4 |
Characterization and Avoidance of Critical Pipeline Structures in Aggressive Superscalar ProcessorsSassone, Peter G. 20 July 2005 (has links)
In recent years, with only small fractions of modern processors now accessible in a single cycle, computer architects constantly fight against propagation issues across the die. Unfortunately this trend continues to shift inward, and now the even most internal features of the pipeline are designed around communication, not computation. To address the inward creep of this constraint, this work focuses on the characterization of communication within the pipeline itself, architectural techniques to avoid it when possible, and layout co-design for early detection of problems.
I present work in creating a novel detection tool for common case operand movement which can rapidly characterize an applications dataflow patterns. The results produced are suitable for exploitation as a small number of patterns can describe a significant portion of modern applications.
Work on dynamic dependence collapsing takes the observations from the pattern results and shows how certain groups of operations can be dynamically grouped, avoiding unnecessary communication between individual instructions. This technique also amplifies the efficiency of pipeline data structures such as the reorder buffer, increasing both IPC and frequency.
I also identify the same sets of collapsible instructions at compile time, producing the same benefits with minimal hardware complexity. This technique is also done in a backward compatible manner as the groups are exposed by simple reordering of the binarys instructions.
I present aggressive pipelining approaches for these resources which avoids the critical timing often presumed necessary in aggressive superscalar processors. As these structures are designed for the worst case, pipelining them can produce greater frequency benefit than IPC loss. I also use the observation that the dynamic issue order for instructions in aggressive superscalar processors is predictable. Thus, a hardware mechanism is introduced for caching the wakeup order for groups of instructions efficiently. These wakeup vectors are then used to speculatively schedule instructions, avoiding the dynamic scheduling when it is not necessary.
Finally, I present a novel approach to fast and high-quality chip layout. By allowing architects to quickly evaluate what if scenarios during early high-level design, chip designs are less likely to encounter implementation problems later in the process.
|
5 |
Spare Block Cache Architecture to Enable Low-Voltage OperationSiddique, Nafiul Alam 01 January 2011 (has links)
Power consumption is a major concern for modern processors. Voltage scaling is one of the most effective mechanisms to reduce power consumption. However, voltage scaling is limited by large memory structures, such as caches, where many cells can fail at low voltage operation. As a result, voltage scaling is limited by a minimum voltage (Vccmin), below which the processor may not operate reliably. Researchers have proposed architectural mechanisms, error detection and correction techniques, and circuit solutions to allow the cache to operate reliably at low voltages. Architectural solutions reduce cache capacity at low voltages at the expense of logic complexity. Circuit solutions change the SRAM cell organization and have the disadvantage of reducing the cache capacity (for the same area) even when the system runs at a high voltage. Error detection and correction mechanisms use Error Correction Codes (ECC) codes to keep the cache operation reliable at low voltage, but have the disadvantage of increasing cache access time. In this thesis, we propose a novel architectural technique that uses spare cache blocks to back up a set-associative cache at low voltage. In our mechanism, we perform memory tests at low voltage to detect errors in all cache lines and tag them as faulty or fault-free. We have designed shifter and adder circuits for our architecture, and evaluated our design using the SimpleScalar simulator. We constructed a fault model for our design to find the cache set failure probability at low voltage. Our evaluation shows that, at 485mV, our designed cache operates with an equivalent bit failure probability to a conventional cache operating at 782mV. We have compared instructions per cycle (IPC), miss rates, and cache accesses of our design with a conventional cache operating at nominal voltage. We have also compared our cache performance with a cache using the previously proposed Bit-Fix mechanism. Our result show that our designed spare cache mechanism is 15% more area efficient compared to Bit-Fix. Our proposed approach provides a significant improvement in power and EPI (energy per instruction) over a conventional cache and Bit-Fix, at the expense of having lower performance at high voltage.
|
6 |
Design and evaluation of a technology-scalable architecture for instruction-level parallelismNagarajan, Ramadass, 1977- 28 August 2008 (has links)
Not available
|
7 |
Cultural and linguistic localization of the virtual shop owner interfaces of e commerce platforms for rural developmentDyakalashe, Siyabulela January 2009 (has links)
The introduction of Information and Communication Technologies (ICTs) for rural development in rural marginalized societies is vastly growing. However, the success of developing and deploying ICT related services is still in question as influential factors such as adaptability, scalability, sustainability, and usability have great effect on the rate of growth of ICTs in rural environments. The problem is that these ICT services should be maintained and sustained by the targeted communities. The main cause for rural marginalization is the fact that some communities situated in rural settings are educationally challenged and computer illiterate or semiliterate in comparison with urban communities. An ICT for development (ICT4D) intervention in the form of an e-Commerce platform that targets the social and economic growth of rural marginalized communities has been developed and field tested at Dwesa, a rural community located on the Wild Coast of the former homeland of Transkei in the Eastern Cape Province. The e-Commerce platform is known as “buy at Dwesa” and can be visited at this URL, http://www.dwesa.com. The aim of the e-Commerce platform is to motivate small entrepreneurs in rural areas to market their products and themselves to the global market as they lack the skills and resources for marketing their art and crafts. Virtual stores are created for a small group of entrepreneurs who will maintain and sustain the stores on their own. These entrepreneurs are often elderly women with limited education and little to no computer literacy - meaning that sustaining the stores may prove difficult for them. In this research we discuss the re-design and re-development of the virtual shop-owner interfaces of the e-Commerce platform to make them more culturally and linguistically localized. The virtual shops allow shop-owners to upload their artifacts to advertise and sell on the customer’s end of the e-Commerce platform. For multilingual and multicultural communities, adoption of the software interfaces to the user’s cultural and linguistic needs and modes of expression is important as failure to do so may reduce the level of benefits of e-Commerce initiatives.
|
Page generated in 0.1034 seconds