Global ETD Search

261	Design Space Exploration of Domain Specific CGRAs Using Crowd-sourcing Sistla, Anil Kumar 08 1900 CGRAs (coarse-grained reconfigurable array architectures) try to fill the gap between FPGAs and ASICs. Over three decades, research on CGRA design has produced a number of architectures. Each of these designs lies at a different point on a line drawn between FPGAs and ASICs, depending on the tradeoffs and design choices made during the design of the architecture. Thus, design space exploration (DSE) plays a very important role in the circuit design process. In this work I propose that the design space exploration of CGRAs can be done quickly and efficiently through crowd-sourcing and a game-driven approach based on an interactive mapping game, UNTANGLED, and a design environment called SmartBricks. Both UNTANGLED and SmartBricks have been developed by our research team at the Reconfigurable Computing Lab, UNT. I present the results of design space exploration of domain-specific reconfigurable architectures and compare stripe vs. mesh styles and heterogeneous vs. homogeneous designs. I also compare the results obtained from different interconnection topologies in mesh. These results show that this approach offers quick DSE for designers and also provides low-power architectures for a suite of benchmarks. All results were obtained using standard cell ASICs with a 90 nm process. ASIC FPGA CGRA Computer architecture.
262	Efficient, scalable, and fair read-modify-writes Rajaram, Bharghava January 2015 Read-Modify-Write (RMW) operations, or atomics, have widespread application in (a) synchronization, where they are used as building blocks of various synchronization constructs like locks, barriers, and lock-free data structures; (b) supervised memory systems, where every memory operation is effectively an RMW that reads and modifies metadata associated with memory addresses; and (c) profiling, where RMW instructions are used to increment shared counters to convey meaningful statistics about a program. In each of these scenarios, the RMWs pose a bottleneck to performance and scalability. We observed that the cost of RMWs depends on two major factors: the memory ordering enforced by the RMW, and contention amongst processors performing RMWs to the same memory address. In the case of both synchronization and supervised memory systems, the RMWs are expensive because of the memory ordering enforced by the atomic RMW operation. Performance overhead due to contention is more prevalent in parallel programs which frequently make use of RMWs to update concurrent data structures in a non-blocking manner. Such programs also suffer from a degradation in fairness amongst concurrent processors. In this thesis, we study the cost of RMWs in the above applications, and present solutions to obtain better performance and scalability from RMW operations. Firstly, this thesis tackles the large overhead of RMW instructions when used for synchronization in the widely used x86 processor architectures, like in Intel, AMD, and Sun processors. The x86 processor architecture implements a variation of the Total-Store-Order (TSO) memory consistency model. RMW instructions in existing TSO architectures (we call them type-1 RMW) are ordered like memory fences, which makes them expensive. 
The strong fence-like ordering of type-1 RMWs is unnecessary for the memory ordering required by synchronization. We propose weaker RMW instructions for TSO consistency; we consider two weaker definitions: type-2 and type-3, each causing subtle ordering differences. Type-2 and type-3 RMWs avoid the fence-like ordering of type-1 RMWs, thereby reducing their overhead. Recent work has shown that the new C/C++11 memory consistency model can be realized by generating type-1 RMWs for SC-atomic-writes and/or SC-atomic-reads. We formally prove that this is equally valid for the proposed type-2 RMWs, and partially for type-3 RMWs. We also propose efficient implementations for type-2 (type-3) RMWs. Simulation results show that our implementation reduces the cost of an RMW by up to 58.9% (64.3%), which translates into an overall performance improvement of up to 9.0% (9.2%) for the programs considered. Next, we argue the case for an efficient and correct supervised memory system for the TSO memory consistency model. Supervised memory systems make use of RMW-like supervised memory instructions (SMIs) to atomically update metadata associated with every memory address used by an application program. Such a system is used to help increase reliability, security and accuracy of parallel programs by offering debugging/monitoring features. Most existing supervised memory systems assume a sequentially consistent memory. For weaker consistency models, like TSO, correctness issues (like imprecise exceptions) arise if the ordering requirement of SMIs is neglected. In this thesis, we show that it is sufficient for supervised instructions to only read and process their metadata in order to ensure correctness. We propose SuperCoP, a supervised memory system for relaxed memory models in which SMIs read and process metadata before retirement, while allowing data and metadata writes to retire into the write-buffer. 
Our experimental results show that SuperCoP performs better than the existing state-of-the-art correct supervision system by 16.8%. Finally, we address the issue of contention and contention-based failure of RMWs in non-blocking synchronization mechanisms. We leverage the fact that most existing lock-free programs make use of compare-and-swap (CAS) loops to access the concurrent data structure. We propose DyFCoM (Dynamic Fairness and Contention Management), a holistic scheme which addresses both throughput and fairness under increased contention. DyFCoM monitors the number of successful and failed RMWs in each thread, and uses this information to implement a dynamic backoff scheme to optimize throughput. We also use this information to throttle faster threads and give slower threads a higher chance of performing their lock-free operations, to increase fairness among threads. Our experimental results show that our contention management scheme alone performs better than the existing state-of-the-art CAS contention management scheme by an average of 7.9%. When fairness management is included, our scheme provides an average of 3.4% performance improvement over the constant backoff scheme, while showing increased fairness values in all cases (up to 43.6%). 004
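The compare-and-swap (CAS) retry loop that the abstract of entry 262 refers to can be sketched as follows. This is only an illustration under stated assumptions, not the DyFCoM scheme itself: Python has no hardware CAS, so the hypothetical `AtomicCell` class emulates one with a lock, and the names `cas_increment` and `run_workers` and the delay parameters are invented for this example. The backoff shown is a plain randomized exponential backoff; each thread's failed-CAS count is returned because that is the kind of per-thread information the abstract says a dynamic scheme would monitor.

```python
import random
import threading
import time


class AtomicCell:
    """Software stand-in (assumption) for a memory word that supports CAS."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()  # emulates hardware atomicity only

    def load(self):
        with self._lock:
            return self._value

    def compare_and_swap(self, expected, new):
        """Store `new` only if the cell still holds `expected`."""
        with self._lock:
            if self._value != expected:
                return False
            self._value = new
            return True


def cas_increment(cell, base_delay=1e-6, max_delay=1e-4):
    """Increment `cell` with a CAS retry loop; return the number of failed CASes."""
    failures = 0
    delay = base_delay
    while True:
        seen = cell.load()
        if cell.compare_and_swap(seen, seen + 1):
            return failures
        # Another thread won the race: back off for a random time, then retry.
        failures += 1
        time.sleep(random.uniform(0, delay))
        delay = min(2 * delay, max_delay)


def run_workers(n_threads=4, n_increments=200):
    """Run several threads against one cell; return final value and per-thread failures."""
    cell = AtomicCell()
    failed = [0] * n_threads

    def worker(idx):
        for _ in range(n_increments):
            failed[idx] += cas_increment(cell)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cell.load(), failed
```

Calling `run_workers(4, 200)` always yields a final value of 800, because every increment eventually succeeds; the per-thread failure counts vary from run to run with scheduling.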
263	A specialised architecture for embedding trust evaluation capabilities in intelligent mobile agents 24 February 2010 M.Sc.(Computer Science) / The dissertation investigates trust and reputation as a specialisation of agent technology. The research aims to establish and demonstrate how it is possible for one rational agent to trust another entity. It further aims to determine the extent of the limitations of trust and reputation models, and of the demonstrable solution in particular. To this end, the dissertation investigates theoretical aspects of trust. The dissertation investigates several existing trust models and establishes criteria for a qualitative analysis. Supplementary techniques aimed at enhancing trust evaluation are also investigated. The research also identifies architectural abstractions suitable for developing agents capable of intelligent trust evaluation. The main focus of the research is enhancing agent protection through a trust-based approach. A particular problem is the threat posed to mobile agents by malicious agent hosts. Therefore, a solution is sought that can be used to augment existing mechanisms aimed at mobile agent protection and agent protection in general. Thus, the research also examines mobile agents and mobile agent systems in an effort to produce a general trust-based solution that can be applied in most mobile agent systems. The solution presented in the dissertation proposes the concept of an evaluator agent as an add-on to existing mobile agent systems. The evaluator agent is presented as a rational agent with an embedded intelligent trust evaluation capability. The intelligent trust evaluation capability is provided via a set of reusable components. The solution demonstrates how a rational agent may evaluate the trustworthiness of other entities. The dissertation further analyses the strengths and limitations of the approach. 
The dissertation provides results that quantitatively demonstrate the extent of the limitations of the trust-based approach. The contribution of the dissertation partly lies in the service orientation of the evaluator agent approach. The service orientation of the solution provides an abstraction and a degree of heterogeneity suitable for handling the challenges of open environments. The solution can be deployed in most mobile agent systems to provide a trust evaluation service without the need to redesign existing mobile agent systems. More broadly, the research is another step towards the development of cognitive social agents. Intelligent agents (Computer software) Computer architecture Mobile agent systems
264	Proposta e simulação de uma arquitetura RISC / Design and simulation of a RISC architecture Valente, Fredy Joao 12 April 1991 RISC: a new trend in computer architecture. This work presents a study of how this new architecture emerged and of the basic characteristics that differentiate it from conventional architectures. A RISC microprocessor is proposed, with its data path described in full detail. A simulator for the RISC architecture was then built to test this microprocessor. To validate the simulator, which is the main idea of this work, and to evaluate the architecture of the proposed microprocessor, the Dhrystone benchmark was used and the results were compared with commercial machines. Architecture simulation Computer architecture RISC
265	Parallel Instruction Decoding for DSP Controllers with Decoupled Execution Units Pettersson, Andreas January 2019 Applications run on embedded processors are constantly evolving. They are for the most part growing more complex, and the processors have to increase their performance to keep up. In this thesis, an embedded DSP SIMT processor with decoupled execution units is under investigation. A SIMT processor exploits the parallelism gained from issuing instructions to functional units or to decoupled execution units. In its basic form only a single instruction is issued per cycle. If the control of the decoupled execution units becomes too fine-grained, or if the control burden of the master core becomes sufficiently high, the fetching and decoding of instructions can become a bottleneck of the system. This thesis investigates how to parallelize the instruction fetch, decode and issue process. Traditional parallel fetch and decode methods in superscalar and VLIW architectures are investigated. Benefits and drawbacks of the two are presented and discussed. One superscalar design and one VLIW design are implemented in RTL, and their costs and performance are compared using a benchmark program and synthesis. It is found that both the superscalar and the VLIW designs outperform a baseline scalar processor, as expected, with the VLIW design performing slightly better than the superscalar design. The VLIW design is found to be able to achieve a higher clock frequency, with an area comparable to that of the superscalar design. This thesis also investigates how instructions can be encoded to lower the decode complexity and increase the speed of issue to decoupled execution units. A number of possible encodings are proposed and discussed. Simulations show that the encodings have the potential to considerably lower the time spent issuing to decoupled execution units. superscalar VLIW SIMT computer architecture DSP Computer Engineering Datorteknik
266	Very large register file for BLAS-3 operations. January 1995 (has links) by Aylwin Chung-Fai, Yu. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1995. / Includes bibliographical references (leaves 117-118). / Abstract --- p.i / Acknowledgement --- p.iii / List of Tables --- p.v / List of Figures --- p.vi / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- BLAS-3 Operations --- p.2 / Chapter 1.2 --- Organization of Thesis --- p.2 / Chapter 1.3 --- Contribution --- p.3 / Chapter 2 --- Background Studies --- p.4 / Chapter 2.1 --- Registers & Cache Memory --- p.4 / Chapter 2.2 --- Previous Research --- p.6 / Chapter 2.3 --- Problem of Register & Cache --- p.8 / Chapter 2.4 --- BLAS-3 Operations On RISC Microprocessor --- p.10 / Chapter 3 --- Compiler Optimization Techniques for BLAS-3 Operations --- p.12 / Chapter 3.1 --- One-Dimensional Q-Way J-Loop Unrolling --- p.13 / Chapter 3.2 --- Two-Dimensional P×Q -Ways I×J-Loops Unrolling --- p.15 / Chapter 3.3 --- Addition of Code to Remove Redundant Code --- p.17 / Chapter 3.4 --- Simulation Result --- p.17 / Chapter 3.5 --- Summary --- p.23 / Chapter 4 --- Architectural Model of Very Large Register File --- p.25 / Chapter 4.1 --- Architectural Model --- p.26 / Chapter 4.2 --- Traditional Register File vs. 
Very Large Register File --- p.32 / Chapter 5 --- Ideal Case Study of Very Large Register File --- p.35 / Chapter 5.1 --- Matrix Multiply --- p.36 / Chapter 5.2 --- LU Decomposition --- p.41 / Chapter 5.3 --- Convolution --- p.50 / Chapter 6 --- Worst Case Study of Very Large Register File --- p.58 / Chapter 6.1 --- Matrix Multiply --- p.59 / Chapter 6.2 --- LU Decomposition --- p.65 / Chapter 6.3 --- Convolution --- p.74 / Chapter 7 --- Proposed Case Study of Very Large Register File --- p.81 / Chapter 7.1 --- Matrix Multiply --- p.82 / Chapter 7.2 --- LU Decomposition --- p.91 / Chapter 7.3 --- Convolution --- p.102 / Chapter 7.4 --- Comparison --- p.111 / Chapter 8 --- Conclusion & Future Work --- p.114 / Chapter 8.1 --- Summary --- p.114 / Chapter 8.2 --- Future Work --- p.115 / Bibliography --- p.117 Registers (Computers) Reduced instruction set computers Computer architecture
267	Investigations in CPU design: a triple-instruction computer. January 1994 (has links) Wai-Tung Chung. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1994. / Includes bibliographical references (leaves 102-104). / Chapter 1. --- Introduction --- p.1 / Chapter 1.1 --- Central Processing Unit innovation / Chapter 1.2 --- Long Instruction Word computer / Chapter 1.3 --- Prior attempts / Chapter 2. --- The new architecture --- p.11 / Chapter 2.1 --- The triple-instruction word / Chapter 2.2 --- Functional view of the architecture / Chapter 2.3 --- Inter-functional units synchronization / Chapter 2.4 --- Instruction set design / Chapter 2.5 --- Special features / Chapter 3. --- Simulation of the architecture --- p.39 / Chapter 3.1 --- Computer architecture simulation / Chapter 3.2 --- The simulation language used: APL / Chapter 3.3 --- Simulation environment / Chapter 3.4 --- Simulation design / Chapter 3.5 --- The micro-architecture / Chapter 3.6 --- Implementation details / Chapter 4. --- The supporting environment --- p.53 / Chapter 4.1 --- The environment / Chapter 4.2 --- The Pseudo-machine configuration / Chapter 4.3 --- Assembly language description / Chapter 4.4 --- Details of the utilities / Chapter 5. --- Evaluation --- p.53 / Chapter 5.1 --- Case Study / Chapter 5.2 --- Results and comparison / Chapter 5.3 --- Summary / Chapter 6. --- Discussion and conclusion --- p.96 / Chapter 6.1 --- The triple-instruction computer / Chapter 6.2 --- Use of APL for architectural simulation / Chapter 6.3 --- Further considerations / Chapter 7. --- References --- p.81 / Chapter 8. --- Appendix I: Program listing for the TIC simulator / Chapter 9. --- Appendix II: Screen dump of the simulation runs Computer architecture System design APL (Computer program language)
268	Procedure graphs and computer optimizations. January 1992 (has links) by Ho Kei Shiu Edward. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1992. / Includes bibliographical references (leaves 199-202). / Acknowledgement / Abstract / Chapter Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Initial Motivations --- p.1 / Chapter 1.2 --- Objectives of Our Study --- p.2 / Chapter 1.3 --- Outline of the Thesis --- p.3 / Chapter Chapter 2 --- Basics of the Procedure Graph Theory --- p.6 / Chapter 2.1 --- Introducing Procedure Graph Theory --- p.6 / Chapter 2.1.1 --- "Nodes, Arcs and Pseudo-time Labels" --- p.7 / Chapter 2.2 --- Examples --- p.12 / Chapter 2.3 --- Exploring the Meanings of the Pseudo-time Labels --- p.13 / Chapter 2.4 --- Equivalence and Transformation --- p.16 / Chapter 2.4.1 --- Equivalence --- p.16 / Chapter 2.4.2 --- Transmission Track and Causality Preservation --- p.16 / Chapter 2.4.3 --- Transformation --- p.17 / Chapter 2.4.3.1 --- Serial-to-Parallel Transformations (SP) --- p.18 / Chapter 2.4.3.2 --- Parallel-to-Serial Transformations (PS) --- p.20 / Chapter 2.4.3.3 --- Store-Store Cancellations (SSC) --- p.21 / Chapter 2.4.3.4 --- Normalization of Pseudo-time Labels --- p.23 / Chapter 2.4.3.5 --- Boundary Conditions and Multi-level Pseudo-time Labels --- p.24 / Chapter 2.5 --- Procedure Graph Optimizations --- p.28 / Chapter 2.5.1 --- Representing Dependencies --- p.28 / Chapter 2.5.2 --- Eliminating Unnecessary Dependencies --- p.32 / Chapter 2.6 --- Simulation Program --- p.36 / Chapter 2.6.1 --- Preliminary Study Using the Simulation Program --- p.36 / Chapter 2.6.2 --- Economic Factors --- p.37 / Chapter 2.6.3 --- Combinatorial Explosion of Procedure Graphs --- p.38 / Chapter Chapter 3 --- Extending the Procedure Graph Theory --- p.45 / Chapter 3.1 --- The T-Operator and the F-Operator --- p.45 / Chapter 3.2 --- Modifying the Firing Rule --- p.46 / Chapter 3.3 --- Procedure Graph Representation for Different Branch Strategies 
--- p.49 / Chapter 3.3.1 --- Multiple-Path Execution --- p.49 / Chapter 3.3.2 --- Conditional Execution with Delayed Commitment of Results --- p.51 / Chapter 3.3.3 --- Speculative Execution with Register Backup and Branch Repair --- p.52 / Chapter 3.4 --- Procedure Graph Representation for a Stack --- p.56 / Chapter 3.5 --- Vector Forwarding --- p.58 / Chapter 3.5.1 --- An Example of Vector Chaining in Cray-1 --- p.58 / Chapter 3.5.2 --- "Vector SP, PS and SSC" --- p.59 / Chapter 3.5.3 --- A Note Concerning the Use of Algorithmic Time Labels --- p.61 / Chapter 3.5.4 --- Further Consideration of Vector Forwarding --- p.62 / Chapter Chapter 4 --- Hardware Realization of Procedure Graph Optimizations --- p.64 / Chapter 4.1 --- Node-Oriented Versus Arc-Oriented Representation --- p.64 / Chapter 4.2 --- Backward Pointers Versus Forward Pointers --- p.65 / Chapter 4.3 --- Backward Pointers as Hardware Tags --- p.69 / Chapter 4.4 --- Pointer Algebra --- p.72 / Chapter 4.4.1 --- Serial-to-Parallel Transformations --- p.72 / Chapter 4.4.2 --- Store-Store Cancellations --- p.73 / Chapter 4.4.3 --- Parallel-to-Serial Transformations --- p.74 / Chapter 4.5 --- Drawbacks of Using Backward Pointers --- p.75 / Chapter 4.6 --- Multiple Tags --- p.76 / Chapter Chapter 5 --- A Backward-Pointer Representation Scheme :The T-Architecture --- p.82 / Chapter 5.1 --- The T-Architecture --- p.82 / Chapter 5.2 --- Local Addressing Space Within the CPU --- p.83 / Chapter 5.3 --- Why Reservation Stations --- p.84 / Chapter 5.4 --- Memory Data Forwarding --- p.89 / Chapter 5.4.1 --- The Updating Buffer --- p.90 / Chapter 5.4.2 --- Ordering and Consistency --- p.96 / Chapter 5.4.2.1 --- Store After Store --- p.96 / Chapter 5.4.2.2 --- Store After Load --- p.97 / Chapter 5.5 --- Speculative Execution --- p.97 / Chapter 5.5.1 --- Procedural Dependencies --- p.97 / Chapter 5.5.2 --- Branch Instruction Format --- p.98 / Chapter 5.5.3 --- Branch Prediction --- p.99 / Chapter 5.5.4 --- Branch 
Instruction Unit --- p.99 / Chapter 5.5.5 --- Register Backups --- p.100 / Chapter 5.5.5.1 --- Branch is Correctly Predicted --- p.101 / Chapter 5.5.5.2 --- Branch Repair --- p.102 / Chapter 5.5.5.3 --- Example --- p.102 / Chapter 5.5.6 --- Total Ordering Memory Stores --- p.110 / Chapter 5.5.7 --- Simplifying the Checkpoint Repair Mechanism --- p.112 / Chapter 5.6 --- A Simulator for the T-Architecture --- p.113 / Chapter 5.6.1 --- Basic Configuration of the Simulator --- p.114 / Chapter 5.6.2 --- Parameters of the Simulator --- p.115 / Chapter 5.6.3 --- Benchmark Programs --- p.116 / Chapter 5.7 --- Experiments --- p.118 / Chapter 5.7.1 --- Experiment1 --- p.119 / Chapter 5.7.2 --- Experiment2 --- p.121 / Chapter 5.7.3 --- Experiment3 --- p.123 / Chapter 5.7.4 --- Experiment4 --- p.127 / Chapter Chapter 6 --- Predictive Procedure Graph Optimizations in the S-Prototype --- p.137 / Chapter 6.1 --- Keys to Higher Performance --- p.138 / Chapter 6.2 --- The Superscalar Approach --- p.139 / Chapter 6.3 --- Processor Architecture of the S-Prototype --- p.139 / Chapter 6.4 --- Design Strategies of the S-Prototype --- p.141 / Chapter 6.4.1 --- Fetching Multiple Instructions --- p.142 / Chapter 6.4.2 --- Handling Procedural Dependencies : Branching Instructions --- p.142 / Chapter 6.4.2.1 --- Branch Unit and Branch Predicting Buffer --- p.143 / Chapter 6.4.2.2 --- Branch Repairing - Recovering Machine State --- p.144 / Chapter 6.4.3 --- Extensive Tagging and Result Forwarding --- p.147 / Chapter 6.4.4 --- Static and Dynamic Data Dependencies --- p.148 / Chapter 6.4.4.1 --- Handling Static Dependencies by using the Multitag Pool --- p.149 / Chapter 6.4.4.2 --- Handling Dynamic Dependencies by using the Reservation Stations --- p.150 / Chapter 6.4.5 --- Extracting Parallelism --- p.152 / Chapter 6.4.5.1 --- Representing Data Dependency in the Multitag Pool --- p.153 / Chapter 6.4.5.2 --- Implementing Transformation Rules --- p.156 / Chapter 6.4.6 --- Out-of-order Issue and 
Execution --- p.157 / Chapter 6.4.7 --- Memory Accesses --- p.158 / Chapter 6.4.8 --- Bus Contention and Arbitration --- p.160 / Chapter Chapter 7 --- An Attempt To Simulate Procedure Graphs Using Graph Grammar --- p.161 / Chapter 7.1 --- Introducing Graph Grammar --- p.161 / Chapter 7.2 --- Basic Concepts in Sequential Graph Grammar --- p.161 / Chapter 7.2.1 --- Production Rules and Interface Graph --- p.162 / Chapter 7.2.2 --- Gluing Constructions and Pushouts --- p.162 / Chapter 7.2.3 --- Gluing Conditions --- p.163 / Chapter 7.3 --- Initial Considerations to Simulate Procedure Graphs --- p.165 / Chapter 7.4 --- Example --- p.165 / Chapter 7.5 --- Problems Encountered --- p.167 / Chapter 7.6 --- Some Insights into the Unsolved Problem --- p.168 / Chapter 7.7 --- "Parallelism, Concurrency and New Transformation Rules" --- p.171 / Chapter Chapter 8 --- Representing Causality Using Petri Nets --- p.175 / Chapter 8.1 --- Defining Petri Nets --- p.175 / Chapter 8.1.1 --- Petri Nets as a Tool for System Modeling --- p.176 / Chapter 8.1.2 --- The Characteristics of a Petri Net --- p.177 / Chapter 8.1.3 --- Useful Extensions --- p.178 / Chapter 8.2 --- Program Analysis and Modeling Computer Operations --- p.179 / Chapter 8.2.1 --- Representing Causality Relationships --- p.180 / Chapter 8.2.2 --- Representing the Total Ordering of Instructions in a Sequential Program --- p.184 / Chapter 8.3 --- Extending the Model --- p.186 / Chapter 8.4 --- Comparing Procedure Graphs and Petri Nets --- p.188 / Chapter Chapter 9 --- Conclusion and Future Research Directions --- p.190 / Chapter 9.1 --- Formalizing the Procedure Graph Theory --- p.190 / Chapter 9.2 --- Mathematical Properties of Procedure Graphs --- p.191 / Chapter 9.3 --- Register Abuses --- p.192 / Chapter 9.4 --- Hardware Representation of Procedure Graphs --- p.194 / Chapter 9.5 --- Tags Describing Tags --- p.196 / Chapter 9.6 --- Software Optimizations --- p.197 / Chapter 9.7 --- Simulation Programs --- p.198 / 
References --- p.199 Computer architecture Graph theory--Data processing
269	An ALU design using a novel asynchronous pipeline architecture. January 2000 (has links) Tang, Tin-Yau. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2000. / Includes bibliographical references (leaves 122-123). / Abstracts in English and Chinese. / Table of Content --- p.2 / List of Figures --- p.4 / List of Tables --- p.6 / Acknowledgements --- p.7 / Abstract --- p.8 / Chapter I. --- Introduction --- p.11 / Chapter 1.1 --- Asynchronous Design --- p.12 / Chapter 1.1.1 --- What is asynchronous design? --- p.12 / Chapter 1.1.2 --- Potential advantages of asynchronous design --- p.12 / Chapter 1.1.3 --- Design methodology for asynchronous circuit --- p.15 / Chapter 1.1.4 --- Difficulty and limitation of asynchronous design --- p.19 / Chapter 1.2 --- Pipeline and Asynchronous Pipeline --- p.21 / Chapter 1.2.1 --- What is pipeline? --- p.21 / Chapter 1.2.2 --- Property of pipeline system --- p.21 / Chapter 1.2.3 --- Asynchronous pipeline --- p.23 / Chapter 1.3 --- Design Motivation --- p.26 / Chapter II. --- Design Theory --- p.27 / Chapter 2.1 --- A Novel Asynchronous Pipeline Architecture --- p.28 / Chapter 2.1.1 --- The problem of classical asynchronous pipeline --- p.28 / Chapter 2.1.2 --- The new handshake cell --- p.28 / Chapter 2.1.3 --- The modified asynchronous pipeline architecture --- p.29 / Chapter 2.2 --- Design of the ALU --- p.36 / Chapter 2.2.1 --- The functionality of ALU --- p.36 / Chapter 2.2.2 --- The choice of the adder and the BLC adder --- p.37 / Chapter III. --- Implementation --- p.41 / Chapter 3.1 --- ALU Detail --- p.42 / Chapter 3.1.1 --- Global arrangement --- p.42 / Chapter 3.1.2 --- Shift and Rotate --- p.46 / Chapter 3.1.3 --- Flags generation --- p.49 / Chapter 3.2 --- Application of the Pipeline Architecture --- p.53 / Chapter 3.2.1 --- The reset network for the pipeline architecture --- p.53 / Chapter 3.2.2 --- Handshake simplification for splitting and joining of datapath. --- p.55 / Chapter IV. 
--- Result --- p.59 / Chapter 4.1 --- Measurement and Simulation Result --- p.60 / Chapter 4.2 --- Global Routing Parasites --- p.63 / Chapter 4.3 --- Low Power Application --- p.65 / Chapter V. --- Conclusion --- p.67 / Chapter VI. --- Appendixes --- p.69 / Chapter 6.1 --- The Small Micro-coded Processor --- p.69 / Chapter 6.2 --- The Instruction Table of the ALU --- p.70 / Chapter 6.3 --- Measurement and Simulation Result --- p.71 / Chapter 6.4 --- "VHDLs, Schematics and Layout" --- p.87 / Chapter 6.5 --- Pinout of the Test Chip --- p.120 / Chapter 6.6 --- The Chip Photo --- p.121 / Chapter VII. --- Reference --- p.122 Pipelining (Electronics) Computer architecture
270	Parameterized complexity of graph contraction problems. / CUHK electronic theses & dissertations collection January 2013 The Π-Contraction problem is to determine whether an input graph G can be modified into a graph belonging to a desired class Π by contracting at most k edges. It is closely related to graph minors and graph modification problems, and has been studied in the literature for several classes Π. / In this thesis, we study the parameterized complexity of Π-Contraction problems in terms of graph classes Π. We present FPT algorithms for Clique Contraction, which is the dual of a well-known FPT problem: Maximum Clique Minor. In contrast to the fixed-parameter tractability of Chordal Deletion and Chordal Completion, we prove the W[2]-hardness of Chordal Contraction. We also show that H-Free Contraction is W[2]-hard whenever H is a fixed graph that is a cycle C_l for l ≥ 4, an odd path P_l for l ≥ 5, a star K_{1,r} for r ≥ 4 or a diamond, and completely characterize the parameterized complexity of H-Free Contraction for every fixed 3-connected H. Moreover, we prove that K_t-Free Contraction for every fixed t ≥ 3, Biclique Contraction and Split Contraction are FPT but have no polynomial kernels unless NP ⊆ coNP/poly. / Furthermore, we study the parameterized complexity of some classical NP-complete problems such as Vertex Coloring and Dominating Set on F↑ke graphs, which are graphs that can be made into F-graphs by contracting at most k edges. 
/ We believe that the thesis has made an important contribution to the study of graph contraction problems, and our ideas and techniques will be useful to future studies of graph contraction problems. / Detailed summary in vernacular field only. / Guo, Chengwei. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2013. / Includes bibliographical references (leaves 109-117). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstracts also in Chinese. / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Background --- p.3 / Chapter 1.1.1 --- Graph minors --- p.3 / Chapter 1.1.2 --- Graph modification problems --- p.4 / Chapter 1.1.3 --- Previous results for edge contraction --- p.5 / Chapter 1.2 --- Organization of the thesis --- p.6 / Chapter 2 --- Notation and definitions --- p.9 / Chapter 2.1 --- Basic notations on graphs --- p.9 / Chapter 2.2 --- Parameterized complexity --- p.12 / Chapter 3 --- FPT algorithms for Π-Contraction --- p.16 / Chapter 3.1 --- General results --- p.17 / Chapter 3.1.1 --- FPT algorithm for bounded-degree graphs --- p.17 / Chapter 3.1.2 --- Kernelization of special graph families --- p.18 / Chapter 3.1.3 --- Component theorem for H-Free Contraction
--- p.19 / Chapter 3.2 --- Contraction to clique-free graphs --- p.21 / Chapter 3.3 --- Contraction to cliques --- p.23 / Chapter 3.3.1 --- FPT algorithm for Clique Contraction --- p.24 / Chapter 3.3.2 --- Kernelization of Clique Contraction --- p.29 / Chapter 3.4 --- Contraction to bicliques --- p.32 / Chapter 3.4.1 --- Kernelization of Biclique Contraction --- p.33 / Chapter 3.5 --- Contraction to split graphs --- p.34 / Chapter 3.5.1 --- FPT algorithm for Split Contraction --- p.35 / Chapter 4 --- Hardness of H-Free Contraction --- p.40 / Chapter 4.1 --- Contraction to cycle-free graphs --- p.42 / Chapter 4.1.1 --- C_l-Free Contraction --- p.42 / Chapter 4.1.2 --- Chordal Contraction --- p.44 / Chapter 4.2 --- Contraction to path-free graphs --- p.45 / Chapter 4.2.1 --- Odd path --- p.45 / Chapter 4.2.2 --- Even path --- p.47 / Chapter 4.3 --- Contraction to star-free graphs --- p.49 / Chapter 4.4 --- Contraction to diamond-free graphs --- p.54 / Chapter 4.5 --- 3-connected H --- p.56 / Chapter 5 --- Incompressibility --- p.62 / Chapter 5.1 --- General techniques --- p.63 / Chapter 5.2 --- Incompressibility of clique-free contraction problems --- p.64 / Chapter 5.3 --- Incompressible of other Π-Contraction problems --- p.72 / Chapter 5.4 --- Conjecture for Clique Contraction --- p.77 / Chapter 6 --- Parameterized graph families --- p.81 / Chapter 6.1 --- Background and general results --- p.82 / Chapter 6.2 --- Clique ↑ ke graphs --- p.84 / Chapter 6.3 --- Other graph classes --- p.89 / Chapter 7 --- Concluding remarks --- p.93 / Chapter 7.1 --- Summary --- p.93 / Chapter 7.2 --- Open problems on edge contraction --- p.95 / Chapter 7.3 --- Related graph operations --- p.97 / Chapter 7.3.1 --- Vertex merging --- p.97 / Chapter 7.3.2 --- Neighborhood contraction --- p.99 / Chapter A --- Graph classes --- p.102 / Bibliography --- p.109 Computer architecture Graph theory--Data processing
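The edge-contraction operation behind entry 270 can be illustrated with a short sketch. It is not the FPT algorithm from the thesis: `can_contract_to_clique` is a deliberately naive exhaustive search that tries every edge at every step, and all function names and the dict-of-sets graph representation are assumptions made for this example. It only shows what the decision problem "can at most k edge contractions turn G into a clique?" asks.

```python
def contract_edge(adj, u, v):
    """Return a new simple graph in which edge {u, v} is contracted into u."""
    if v not in adj[u]:
        raise ValueError("u and v must be adjacent")
    # The merged vertex inherits both neighbourhoods, without self-loops.
    merged = (adj[u] | adj[v]) - {u, v}
    new_adj = {}
    for w, nbrs in adj.items():
        if w == v:
            continue  # v disappears from the graph
        if w == u:
            new_adj[u] = set(merged)
        else:
            # Redirect edges that pointed at v so they point at u.
            new_adj[w] = {u if x == v else x for x in nbrs}
    return new_adj


def is_clique(adj):
    """A graph is a clique when every vertex is adjacent to all the others."""
    return all(len(nbrs) == len(adj) - 1 for nbrs in adj.values())


def can_contract_to_clique(adj, k):
    """Brute force: can at most k edge contractions turn the graph into a clique?"""
    if is_clique(adj):
        return True
    if k == 0:
        return False
    for u in adj:
        for v in adj[u]:
            if u < v and can_contract_to_clique(contract_edge(adj, u, v), k - 1):
                return True
    return False


# Hypothetical test graphs: a 4-vertex path and a 4-vertex cycle.
path4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
```

On these inputs, one contraction turns the 4-cycle into a triangle, while the 4-vertex path needs two contractions. The search explores on the order of m^k contraction sequences for m edges, which is exactly the kind of running time that parameterized algorithms aim to improve on.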
