Global ETD Search

41	PERFORMANCE EVALUATION OF MEMORY AND COMPUTATIONALLY BOUND CHEMISTRY APPLICATIONS ON STREAMING GPGPUS AND MULTI-CORE X86 CPUS Weber III, Frederick E 01 May 2010 (has links) In recent years, multi-core processors have come to dominate the field in desktop and high performance computing. Graphics processors traditionally used in CAD, video games, and other 3-d applications, have become more programmable and are now suitable for general purpose computing. This thesis explores multi-core processors and GPU performance and limitations in two computational chemistry applications: a memory bound component of ab-initio modeling and a computationally bound Monte Carlo simulation. For the applications presented in this thesis, exploiting multiple processors is done using a variety of tools and languages including OpenMP and MKL. Brook+ and the Compute Abstraction Layer streaming environments are used to accelerate applications on AMD GPUs. This thesis gives qualitative assertions about these languages and tools regarding ease of use and optimization in addition to quantitative analyses of performance. GPUs can yield modest performance improvements with little effort in some applications and even larger speedups with simple optimizations. GPU multi-core Monte Carlo parallel computing Computer and Systems Architecture
42	A virtual reality modeling tool for students of architecture / Wong, Wai-sang. January 2000 (has links) Thesis (M. Phil.)--University of Hong Kong, 2000. / Includes bibliographical references (leaves 59-62).
43	Fuzzy Modeling and Control Based Virtual Machine Resource Management Wang, Lixi 06 March 2015 (has links) Virtual machines (VMs) are powerful platforms for building agile datacenters and emerging cloud systems. However, resource management for a VM-based system is still a challenging task. First, the complexity of application workloads as well as the interference among competing workloads makes it difficult to understand their VMs’ resource demands for meeting their Quality of Service (QoS) targets; Second, the dynamics in the applications and system makes it also difficult to maintain the desired QoS target while the environment changes; Third, the transparency of virtualization presents a hurdle for guest-layer application and host-layer VM scheduler to cooperate and improve application QoS and system efficiency. This dissertation proposes to address the above challenges through fuzzy modeling and control theory based VM resource management. First, a fuzzy-logic-based nonlinear modeling approach is proposed to accurately capture a VM’s complex demands of multiple types of resources automatically online based on the observed workload and resource usages. Second, to enable fast adaption for resource management, the fuzzy modeling approach is integrated with a predictive-control-based controller to form a new Fuzzy Modeling Predictive Control (FMPC) approach which can quickly track the applications’ QoS targets and optimize the resource allocations under dynamic changes in the system. Finally, to address the limitations of black-box-based resource management solutions, a cross-layer optimization approach is proposed to enable cooperation between a VM’s host and guest layers and further improve the application QoS and resource usage efficiency. The above proposed approaches are prototyped and evaluated on a Xen-based virtualized system and evaluated with representative benchmarks including TPC-H, RUBiS, and TerraFly. The results demonstrate that the fuzzy-modeling-based approach improves the accuracy in resource prediction by up to 31.4% compared to conventional regression approaches. The FMPC approach substantially outperforms the traditional linear-model-based predictive control approach in meeting application QoS targets for an oversubscribed system. It is able to manage dynamic VM resource allocations and migrations for over 100 concurrent VMs across multiple hosts with good efficiency. Finally, the cross-layer optimization approach further improves the performance of a virtualized application by up to 40% when the resources are contended by dynamic workloads. Autonomic Computing Resource Management Performance Fuzzy Modeling Computer and Systems Architecture
44	Assisting Children Action Association Through Visual Queues and Wearable Technology Young, Anthony 14 October 2016 (has links) Autism Spectrum Disorder makes it difficult to for a child communicate, have social interactions and go through daily life. Visual cues are often used to help a child associate an image with an event. With technology becoming more and more advanced, we now have a way to remind a child of an event with wearable technology, such as a watch. This new technology can help a child directly with the Visual Scheduling Application and various other applications. These applications allow children and their families to be easily able to keep track of the events on their schedule and notify them when an event occurs. With the Autism Management Platform and related website, a parent can easily create events to help a child throughout the day. The child can associate an image with events, allowing for a clearer understanding of what to do when an event occurs. Wearable technology has become a new way to interact with the user in a very unobtrusive manner. With this new technology, we can help associate a visual event to a child’s schedule and interrupt when needed to help make the child’s life easier on a daily basis. Autism Visual Schedules Computing Computational Engineering Computer and Systems Architecture
45	A Regression Approach to Execution Time Estimation for Programs Running on Multicore Systems Alshamlan, Mohammad 21 March 2014 (has links) Execution time estimation plays an important role in computer system design. It is particularly critical in real-time system design, where to meet a deadline can be as important as to ensure the logical correctness of a program. To accurately estimate the execution time of a program can be extremely challenging, since the execution time of a program varies with inputs, the underlying computer architectures, and run-time dynamics, among other factors. The problem becomes even more challenging as computing systems moving from single core to multi-core platforms, with more hardware resources shared by multiple processing cores. The goal of this research is to investigate the relationship between the execution time of a program and the underlying architecture features (e.g. cache size, associativity, memory latency), as well as its run-time characteristics (e.g. cache miss ratios), and based on which, to estimate its execution time on a multi-core platform based on a regression approach. We developed our test platform based on GEM5, an open-source multi-core cycle-accurate simulation tool set. Our experimental results show clearly the strong relationship of the program execution time to architecture features and run-time characteristics. Moreover, we developed different execution time estimation algorithms using the regression approach for different programs with different software characteristics to improve the estimation accuracy. real-time system Computer and Systems Architecture Other Computer Engineering
46	Design and implemetation of internet mail servers with embedded data compression Nand, Alka 01 January 1997 (has links) No description available. Internet programming Embedded computer systems Computer and Systems Architecture
47	ENERGY EFFICIENCY EXPLORATION OF COARSE-GRAIN RECONFIGURABLE ARCHITECTURE WITH EMERGING NONVOLATILE MEMORY Liu, Xiaobin 18 March 2015 (has links) With the rapid growth in consumer electronics, people expect thin, smart and powerful devices, e.g. Google Glass and other wearable devices. However, as portable electronic products become smaller, energy consumption becomes an issue that limits the development of portable systems due to battery lifetime. In general, simply reducing device size cannot fully address the energy issue. To tackle this problem, we propose an on-chip interconnect infrastructure and pro- gram storage structure for a coarse-grained reconfigurable architecture (CGRA) with emerging non-volatile embedded memory (MRAM). The interconnect is composed of a matrix of time-multiplexed switchboxes which can be dynamically reconfigured with the goal of energy reduction. The number of processors performing computation can also be adapted. The use of MRAM provides access to high-density storage and lower memory energy consumption versus more standard SRAM technologies. The combination of CGRA, MRAM, and flexible on-chip interconnection is considered for signal processing. This application domain is of interest based on its time-varying computing demands. To evaluate CGRA architectural features, prototype architectures have been pro- totyped in a field-programmable gate array (FPGA). Measurements of energy, power, instruction count, and execution time performance are considered for a scalable num- ber of processors. Applications such as adaptive Viterbi decoding and Reed Solomon coding are used for evaluation. To complete this thesis, a time-scheduled switchbox was integrated into our CGRA model. This model was prototyped on an FPGA. It is shown that energy consumption can be reduced by about 30% if dynamic design reconfiguration is performed. CGRA MRAM time-scheduled interconnect Computer and Systems Architecture Hardware Systems
48	Modifying Instruction Sets In The Gem5 Simulator To Support Fault Tolerant Designs Zhang, Chuan 23 November 2015 (has links) Traditional fault tolerant techniques such as hardware or time redundancy incur high overhead and are inefficient for checking arithmetic operations. Our objective is to study an alternative approach of adding new instructions to check arithmetic operations. These checking instructions either rely on error detecting code or calculate approximate results and consequently, consume much less execution time. To evaluate the effectiveness of such an approach we wish to modify several benchmarks to use checking instructions and run simulation experiments to find out their execution time and memory usage. However, the checking instructions are not included in the instruction set and as a result, are not supported by current architecture simulators. Therefore, another objective of this thesis is to develop a method for inserting new instructions in the Gem5 simulator and cross compiler. The insertion process is integrated into a software tool called Gtool. Gtool can add an error checking capability to C programs by using the new instructions. Gem5 compiler error checking ISA modification Computer and Systems Architecture
49	Compacting Loads and Stores for Code Size Reduction Asay, Isaac 01 March 2014 (has links) It is important for compilers to generate executable code that is as small as possible, particularly when generating code for embedded systems. One method of reducing code size is to use instruction set architectures (ISAs) that support combining multiple operations into single operations. The ARM ISA allows for combining multiple memory operations to contiguous memory addresses into a single operation. The LLVM compiler contains a specific memory optimization to perform this combining of memory operations, called ARMLoadStoreOpt. This optimization, however, relies on another optimization (ARMPreAllocLoadStoreOpt) to move eligible memory operations into proximity in order to perform properly. This mover optimization occurs before register allocation, while ARMLoadStoreOpt occurs after register allocation. This thesis implements a similar mover optimization (called MagnetPass) after register allocation is performed, and compares this implementation with the existing optimization. While in most cases the two optimizations provide comparable results, our implementation in its current state requires some improvements before it will be a viable alternative to the existing optimization. Specifically, the algorithm will need to be modified to reduce computational complexity, and our implementation will need to take care not to interfere with other LLVM optimizations. compiler optimization loads stores memory arm Computer and Systems Architecture
50	Virtual Reality Engine Development Varahamurthy, Varun 01 June 2014 (has links) With the advent of modern graphics and computing hardware and cheaper sensor and display technologies, virtual reality is becoming increasingly popular in the fields of gaming, therapy, training and visualization. Earlier attempts at popularizing VR technology were plagued by issues of cost, portability and marketability to the general public. Modern screen technologies make it possible to produce cheap, light head-mounted displays (HMDs) like the Oculus Rift, and modern GPUs make it possible to create and deliver a seamless real-time 3D experience to the user. 3D sensing has found an application in virtual and augmented reality as well, allowing for a higher level of interaction between the real and the simulated. There are still issues that persist, however. Many modern graphics/game engines still do not provide developers with an intuitive or adaptable interface to incorporate these new technologies. Those that do, tend to think of VR as a novelty afterthought, and even then only provide tailor-made extensions for specific hardware. The goal of this paper is to design and implement a functional, general-purpose VR engine using abstract interfaces for much of the hardware components involved to allow for easy extensibility for the developer. Computer and Systems Architecture Graphics and Human Computer Interfaces Signal Processing

Search results