Global ETD Search

11	Microarchitecture and FPGA Implementation of the Multi-level Computing Architecture Capalija, Davor 30 July 2008 We design the microarchitecture of the Multi-Level Computing Architecture (MLCA), focusing on its Control Processor (CP). The design of the CP microarchitecture presents both opportunities and challenges that stem from the coarse granularity of the tasks and the large number of inputs and outputs for each task instruction. Thus, we explore changes to standard superscalar microarchitectural techniques. We design the entire CP microarchitecture and implement it on an FPGA using SystemVerilog. We synthesize and evaluate the MLCA system based on a 4-processor shared-memory multiprocessor. The performance of realistic applications shows scalable speedups that are comparable to those of simulation. We believe that our implementation achieves low complexity in terms of FPGA resource usage and operating frequency. In addition, we argue that our design methodology allows the scalability of the CP as the entire system grows. Computer architecture FPGA applications Microarchitecture Parallelism Embedded systems Multi-core systems 0984
12	New abstractions and mechanisms for virtualizing future many-core systems Kumar, Sanjay 08 July 2008 To abstract physical into virtual computing infrastructures is a longstanding goal. Efforts in the computing industry started with early work on virtual machines in IBM's VM370 operating system and architecture, continued with extensive developments in distributed systems in the context of grid computing, and now involve investments by key hardware and software vendors to efficiently virtualize common hardware platforms. Recent efforts in virtualization technology are driven by two facts: (i) technology push -- new hardware support for virtualization in multi- and many-core hardware platforms and in the interconnects and networks used to connect them, and (ii) technology pull -- the need to efficiently manage large-scale data-centers used for utility computing and extending from there, to also manage more loosely coupled virtual execution environments like those used in cloud computing. Concerning (i), platform virtualization is proving to be an effective way to partition and then efficiently use the ever-increasing number of cores in many-core chips. Further, I/O Virtualization enables I/O device sharing with increased device throughput, providing required I/O functionality to the many virtual machines (VMs) sharing a single platform. Concerning (ii), through server consolidation and VM migration, for instance, virtualization increases the flexibility of modern enterprise systems and creates opportunities for improvements in operational efficiency, power consumption, and the ability to meet time-varying application needs.
This thesis (i) contributes new technologies that further increase system flexibility by addressing some key problems of existing virtualization infrastructures, and (ii) directly addresses the issue of how to exploit the resulting increased levels of flexibility to improve data-center operations, e.g., power management, by providing lightweight, efficient management technologies and techniques that operate across the range of individual many-core platforms to data-center systems. Concerning (i), the thesis contributes, for large many-core systems, insights into how to better structure virtual machine monitors (VMMs) to provide more efficient utilization of cores, by implementing and evaluating the novel Sidecore approach that permits VMMs to exploit the computational power of parallel cores to improve overall VMM and I/O performance. Further, I/O virtualization still lacks the ability to provide complete transparency between virtual and physical devices, thereby limiting VM mobility and flexibility in accessing devices. In response, this thesis defines and implements the novel Netchannel abstraction that provides complete location transparency between virtual and physical I/O devices, thereby decoupling device access from device location and enabling live VM migration and device hot-swapping. Concerning (ii), the vManage set of abstractions, mechanisms, and methods developed in this work is shown to substantially improve system manageability, by providing a lightweight, system-level architecture for implementing and running the management applications required in data-center and cloud computing environments. vManage simplifies management by making it easier to coordinate the management actions taken by the many management applications and subsystems present in data-center and cloud computing systems.
Experimental evaluations of the Sidecore approach to VMM structure, Netchannel, and of vManage are conducted on representative platforms and server systems, with consequent improvements in flexibility, in I/O performance, and in management efficiency, including power management. Virtualization Many-core systems Management Data-centers I/O Virtual computer systems
13	Design of a Distributed Transactional Memory for Many-core systems Trigonakis, Vasileios January 2011 The emergence of multi-/many-core systems signified an increasing need for parallel programming. Transactional Memory (TM) is a promising programming paradigm for creating concurrent applications. To date, the design of Distributed TM (DTM) tailored for non-coherent many-core architectures is largely unexplored. This thesis addresses this topic by analysing, designing, and implementing a DTM system suitable for low-latency message-passing platforms. The resulting system, named SC-TM, the Single-Chip Cloud TM, is a fully decentralized and scalable DTM, implemented on Intel’s SCC processor, a 48-core ‘concept vehicle’ created by Intel Labs as a platform for many-core software research. SC-TM is one of the first fully decentralized DTMs that guarantees starvation-freedom and the first to use an actual pluggable Contention Manager (CM) to ensure liveness. Finally, this thesis introduces three completely decentralized CMs: Offset-Greedy, a decentralized version of Greedy; Wholly, which relies on the number of completed transactions; and FairCM, which makes use of the effective transactional time. The evaluation showed that the last of these, FairCM, performed best. Engineering and Technology
14	Coordinated system level resource management for heterogeneous many-core platforms Gupta, Vishakha 24 August 2011 A challenge posed by future computer architectures is the efficient exploitation of their many and sometimes heterogeneous computational cores. This challenge is exacerbated by the multiple facilities for data movement and sharing across cores resident on such platforms. To answer the question of how systems software should treat heterogeneous resources, this dissertation describes an approach that (1) creates a common manageable pool for all the resources present in the platform, and then (2) provides virtual machines (VMs) with multiple 'personalities', flexibly mapped to and efficiently run on the heterogeneous underlying hardware. A VM's personality is its execution context on the different types of available processing resources usable by the VM. We provide mechanisms for making such platforms manageable and evaluate coordinated scheduling policies for mapping different VM personalities on heterogeneous hardware. Towards that end, this dissertation contributes technologies that include (1) restructuring hypervisor and system functions to create high performance environments that enable flexibility of execution and data sharing, (2) scheduling and other resource management infrastructure for supporting diverse application needs and heterogeneous platform characteristics, and (3) hypervisor level policies to permit efficient and coordinated resource usage and sharing. Experimental evaluations on multiple heterogeneous platforms, such as one composed of x86-based cores with attached NVIDIA accelerators and others with asymmetric elements on chip, demonstrate the utility of the approach and its ability to efficiently host diverse applications and resource management methods.
Coordinated scheduling Heterogeneous many-core systems Asymmetric multi-cores Virtualization Kinship model Performance points Virtual computer systems Computing platforms Computer architecture Heterogeneous computing High performance computing
15	Jack Rabbit: an effective Cell BE programming system for high performance parallelism Ellis, Apollo Isaac Orion 08 July 2011 The Cell processor is an example of the trade-offs made when designing a mass-market, power-efficient multi-core machine, but the machine-exposing architecture and raw communication mechanisms of Cell are hard for a programmer to manage. Cell's design is simple, which pushes complexity into software in the areas of achieving low threading overhead, good bandwidth efficiency, and load balance. Several attempts have been made to produce efficient and effective programming systems for Cell, but the attempts have been too specialized and thus fall short. We present Jack Rabbit, an efficient thread pool work queue implementation, with load balancing mechanisms and double buffering. Our system incurs low threading overhead, gets good load balance, and achieves bandwidth efficiency. Our system represents a step towards an effective way to program Cell and any similar current or future processors. Cell processor Multi-core systems High performance computing Runtime Barnes Hut LU factorization Mandelbrot Double buffering Thread pool Work queue Load balance
