  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
231

Normally-Off Computing Design Methodology Using Spintronics: From Devices to Architectures

Roohi, Arman 01 May 2019 (has links)
Energy-harvesting-powered computing offers intriguing and vast opportunities to dramatically transform the landscape of Internet of Things (IoT) devices and wireless sensor networks by utilizing ambient sources of light, thermal, kinetic, and electromagnetic energy to achieve battery-free computing. To operate within the restricted energy capacity and intermittency profile of battery-free operation, this work proposes Elastic Intermittent Computation (EIC), a new duty-cycle-variable computing approach that leverages the non-volatility inherent in post-CMOS switching devices. The foundations of EIC are advanced from the ground up by extending Spin Hall Effect Magnetic Tunnel Junction (SHE-MTJ) device models to realize SHE-MTJ-based Majority Gate (MG) and Polymorphic Gate (PG) logic approaches and libraries that leverage intrinsic non-volatility to realize middleware-coherent, intermittent computation without checkpointing, micro-tasking, or the software bloat and energy overheads that burden IoT devices. Device-level EIC research concentrates on encapsulating SHE-MTJ behavior in a compact model to leverage the non-volatility of the device for intrinsic provision of intermittent computation and lifetime energy reduction. Based on this model, the circuit-level EIC contributions entail the design, simulation, and analysis of PG-based spintronic logic that is adaptable at the gate level to support variable duty-cycle execution robust to brief and extended supply outages or unscheduled dropouts, along with the development of spin-based synthesis and optimization routines compatible with existing commercial toolchains. These tools are employed to design a hybrid post-CMOS processing unit utilizing pipelining and power-gating through state-holding properties within the datapath itself, thus eliminating checkpointing and data transfer operations.
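At the logic level, the MG approach described above computes an ordinary Boolean majority function; the device physics is what makes the gate non-volatile. As an illustrative sketch only (the SHE-MTJ compact model itself cannot be reconstructed from this abstract), here is the three-input majority function and how fixing one input recovers AND and OR:

```python
def majority(a: int, b: int, c: int) -> int:
    """Three-input majority: output is 1 iff at least two inputs are 1."""
    return (a & b) | (b & c) | (a & c)

# Fixing one input specializes the majority gate, which is why
# MG-based logic libraries can cover standard Boolean primitives:
def and2(a: int, b: int) -> int:
    return majority(a, b, 0)  # third input tied to 0 -> AND

def or2(a: int, b: int) -> int:
    return majority(a, b, 1)  # third input tied to 1 -> OR
```

Because the gate's state is held in the MTJ rather than in charge, its output survives a supply outage, which is the property the EIC approach exploits.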
232

Simulation, Analysis, and Optimization of Heterogeneous CPU-GPU Systems

Giles, Christopher 01 January 2019 (has links)
With the computing industry's recent adoption of the Heterogeneous System Architecture (HSA) standard, we have seen a rapid change in heterogeneous CPU-GPU processor designs. State-of-the-art heterogeneous CPU-GPU processors tightly integrate multicore CPUs and multi-compute-unit GPUs on a single die. This brings the MIMD processing capabilities of the CPU and the SIMD processing capabilities of the GPU together into a single cohesive package, with new HSA features comprising better programmability, coherency between the CPU and GPU, a shared Last Level Cache (LLC), and shared virtual memory address spaces. These advancements can potentially bring marked gains in heterogeneous processor performance and have piqued the interest of researchers who wish to unlock them. Therefore, in this dissertation I explore the heterogeneous CPU-GPU processor and application design space with the goal of answering two research questions: (1) what are the architectural design trade-offs in heterogeneous CPU-GPU processors, and (2) how do we best maximize heterogeneous CPU-GPU application performance on a given system? To enable this exploration, I introduce a novel discrete event-driven simulation library called KnightSim and a novel computer architectural simulator called M2S-CGM. M2S-CGM includes all of the simulation elements necessary to simulate coherent execution between a CPU and GPU with a shared LLC and shared virtual memory address spaces. I then utilize M2S-CGM to conduct three architectural studies. First, I study the architectural effects of shared LLC and CPU-GPU coherence on the overall performance of non-collaborative GPU-only applications. Second, I profile and analyze a set of collaborative CPU-GPU applications to determine how to best optimize them for maximum collaborative performance. Third, I study the impact of four key architectural parameters on collaborative CPU-GPU performance by varying GPU compute unit coalesce size, GPU-to-memory-controller bandwidth, GPU frequency, and system-wide switching fabric latency.
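KnightSim's internals are not described in the abstract, but the core of any discrete event-driven simulation library follows the same pattern: a time-ordered event queue drained by a scheduler. A minimal sketch, with class and event names that are hypothetical rather than KnightSim's actual API:

```python
import heapq

class EventSim:
    """Minimal discrete event-driven simulation core: a heap-ordered
    event list popped in timestamp order."""
    def __init__(self):
        self.now = 0.0
        self._queue = []  # entries: (fire_time, seq, callback)
        self._seq = 0     # tie-breaker for events at the same timestamp

    def schedule(self, delay, callback):
        heapq.heappush(self._queue, (self.now + delay, self._seq, callback))
        self._seq += 1

    def run(self):
        while self._queue:
            self.now, _, cb = heapq.heappop(self._queue)
            cb(self)

log = []
sim = EventSim()
sim.schedule(2.0, lambda s: log.append(("gpu_done", s.now)))
sim.schedule(1.0, lambda s: log.append(("cpu_done", s.now)))
sim.run()
# events fire in timestamp order regardless of scheduling order
```

In an architectural simulator, the callbacks would model memory responses, cache transitions, and compute-unit completions; the heap discipline is what keeps CPU-side and GPU-side events globally ordered in simulated time.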
233

Expression Morphing Between Different Orientations

Fu, Tao 01 January 2004 (has links)
How to generate new views from given reference images has been an important and interesting topic in the area of image-based rendering. Two important algorithms are field morphing and view morphing. Field morphing, an image morphing algorithm, generates new views from two reference images taken at the same viewpoint; its most celebrated result is morphing one person's face into another's. View morphing, a view synthesis algorithm, generates in-between views from two reference views taken at different viewpoints of the same object; its result is typically an animation that moves an object from the viewpoint of one reference image to the viewpoint of the other. In this thesis, we propose a new framework that integrates field morphing and view morphing to solve the problem of expression morphing. Based on four reference images, we successfully generate a morph from one viewpoint with one expression to another viewpoint with a different expression. We also propose a new approach to eliminate the artifacts that frequently occur in view morphing due to occlusions, and in field morphing due to unforeseen combinations of feature lines, by relaxing the monotonicity assumption to piece-wise monotonicity along the epipolar lines. Our experimental results demonstrate the effectiveness of this approach in handling occlusions for more realistic synthesis of novel views.
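The four-reference-image setup can be pictured as a bilinear blend over two axes: viewpoint and expression. The sketch below interpolates a single corresponding feature point; the thesis itself operates on feature lines and epipolar geometry, so this is only a schematic of the interpolation structure, not the actual warping algorithm:

```python
def interp(p0, p1, t):
    """Linear interpolation between two 2D points."""
    return tuple((1 - t) * a + t * b for a, b in zip(p0, p1))

def morph4(p00, p01, p10, p11, s, t):
    """Blend a corresponding feature point across four reference images:
    t = expression axis (field morphing), s = viewpoint axis (view morphing)."""
    top = interp(p00, p01, t)      # expression blend at viewpoint 0
    bottom = interp(p10, p11, t)   # expression blend at viewpoint 1
    return interp(top, bottom, s)  # blend across viewpoints
```

Setting s = 0 reduces the scheme to pure field morphing at one viewpoint, and t = 0 reduces it to pure view morphing of one expression, which is the sense in which the framework integrates the two algorithms.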
234

Scalable Map Information Dissemination for Connected and Automated Vehicle Systems

Gani, S. M. Osman 01 January 2019 (has links)
Situational awareness in connected and automated vehicle (CAV) systems becomes particularly challenging in the presence of non-line-of-sight objects and/or objects beyond the sensing range of local onboard sensors. Although fully autonomous driving requires multiple redundant sensor systems, primarily camera, radar, and LiDAR, the non-line-of-sight object detection problem still persists due to the inherent limitations of those sensing techniques. To tackle this challenge, an inter-vehicle communication system is envisioned that allows vehicles to exchange self-status updates, aiming to extend their effective field of view and thus compensate for the limitations of a vehicle tracking subsystem that relies substantially on onboard sensing devices. Tracking capability in such systems can be further improved through the cooperative sharing of locally created map data instead of transmitting only self-update messages containing core basic safety message (BSM) data. In the cooperative sharing of safety messages, it is imperative to have a scalable communication protocol to ensure optimal use of the communication channel. This dissertation contributes to the analysis of the scalability issue in vehicle-to-everything (V2X) communication and then addresses the range issue of situational awareness in CAV systems by proposing a content-adaptive V2X communication architecture. To that end, we first analyze the BSM scheduling protocol standardized in SAE J2945/1 and present large-scale scalability results obtained from a high-fidelity simulation platform to demonstrate the protocol's efficacy in addressing the scalability issues in V2X communication. By employing a distributed opportunistic approach, the SAE J2945/1 congestion control algorithm keeps the overall offered channel load within an optimal operating range while meeting the minimum tracking requirements set forth by upper-layer applications. This scheduling protocol allows event-triggered and vehicle-dynamics-driven message transmissions that further situational awareness in a cooperative V2X context. The presented validation results for the congestion control algorithm use position tracking error as the performance measure and the age of communicated information as the evaluation measure. In addition, we examine the optimality of the default settings of the congestion control parameters. Comprehensive analysis and a trade-off study of the control parameters reveal some areas of improvement that further the algorithm's efficacy. Motivated by the effectiveness of the channel congestion control mechanism, we further investigate message content and length adaptations together with transmit rate control. Reasonably, the content of the exchanged information has a significant impact on map accuracy in cooperative driving systems. We investigate different content control schemes for a communication architecture aimed at map sharing and evaluate their performance in terms of position tracking error. This dissertation determines that message content should be concentrated on mapped objects located farther from the sender, toward the edge of the local sensor range. It also finds that an optimized combination of message length and transmit rate ensures optimal channel utilization for cooperative vehicular communication, which in turn improves the situational awareness of the whole system.
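The core congestion control idea, stretching the transmit interval as measured channel load rises, can be sketched with a toy linear controller. The parameter values below are invented for illustration and are not the SAE J2945/1 settings:

```python
def next_interval(cbr, target=0.65, min_i=0.1, max_i=0.6):
    """Toy rate control: return the next BSM transmit interval (seconds)
    given the measured channel busy ratio `cbr` in [0, 1].
    Below the target load, transmit at the fastest rate; above it,
    stretch the interval linearly up to a cap. All constants are
    hypothetical, not standardized values."""
    if cbr <= target:
        return min_i
    excess = (cbr - target) / (1.0 - target)  # fraction of headroom consumed
    return min(max_i, min_i + excess * (max_i - min_i))
```

Because every vehicle applies the same rule to its own local load measurement, the control is distributed and opportunistic in spirit: the aggregate offered load settles near the target without any central coordinator.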
235

Automated Synthesis of Unconventional Computing Systems

Ul Hassen, Amad 01 January 2019 (has links)
Despite decades of advancements, modern computing systems which are based on the von Neumann architecture still carry its shortcomings. Moore's law, which had substantially masked the effects of the inherent memory-processor bottleneck of the von Neumann architecture, has slowed down due to transistor dimensions nearing atomic sizes. On the other hand, modern computational requirements, driven by machine learning, pattern recognition, artificial intelligence, data mining, and IoT, are growing at the fastest pace ever. By their inherent nature, these applications are particularly affected by communication-bottlenecks, because processing them requires a large number of simple operations involving data retrieval and storage. The need to address the problems associated with conventional computing systems at the fundamental level has given rise to several unconventional computing paradigms. In this dissertation, we have made advancements for automated syntheses of two types of unconventional computing paradigms: in-memory computing and stochastic computing. In-memory computing circumvents the problem of limited communication bandwidth by unifying processing and storage at the same physical locations. The advent of nanoelectronic devices in the last decade has made in-memory computing an energy-, area-, and cost-effective alternative to conventional computing. We have used Binary Decision Diagrams (BDDs) for in-memory computing on memristor crossbars. Specifically, we have used Free-BDDs, a special class of binary decision diagrams, for synthesizing crossbars for flow-based in-memory computing. Stochastic computing is a re-emerging discipline with several times smaller area/power requirements as compared to conventional computing systems. It is especially suited for fault-tolerant applications like image processing, artificial intelligence, pattern recognition, etc. 
We have proposed a decision-procedure-based iterative algorithm to synthesize Linear Finite State Machines (LFSMs) for stochastically computing non-linear functions such as polynomials, exponentials, and hyperbolic functions.
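Stochastic computing encodes a value as the ones-density of a bitstream, so arithmetic collapses into tiny logic: ANDing two independent streams multiplies their encoded probabilities. The sketch below shows this textbook building block of the paradigm, not the LFSM synthesis procedure proposed in the dissertation:

```python
import random

def to_bitstream(p, n, rng):
    """Encode probability p in [0, 1] as a random bitstream of length n."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

def sc_multiply(p, q, n=10000, seed=0):
    """Stochastic multiplication: the ones-density of the AND of two
    independent bitstreams approximates p * q, with error shrinking
    as the stream length n grows."""
    rng = random.Random(seed)
    a = to_bitstream(p, n, rng)
    b = to_bitstream(q, n, rng)
    return sum(x & y for x, y in zip(a, b)) / n

# sc_multiply(0.5, 0.4) is close to 0.2 for large n
```

A single AND gate thus replaces a full binary multiplier, which is where the area/power savings cited above come from; the trade-off is precision, which scales only with the square root of the stream length, hence the suitability for fault-tolerant applications.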
236

Rethinking Routing and Peering in the era of Vertical Integration of Network Functions

Dey, Prasun Kanti 01 January 2019 (has links)
Content providers typically control digital content consumption services and earn most of their revenue by implementing an "all-you-can-eat" model via subscriptions or hyper-targeted advertisements. Revamping the existing Internet architecture and design, a vertical integration in which a content provider and access ISP act as a single body, in a "sugarcane" form, appears to be the recent trend. As this vertical integration trend emerges in the ISP market, it is questionable whether the existing routing architecture will suffice in terms of sustainable economics, peering, and scalability. Current routing will likely need careful modifications and smart innovations to ensure effective and reliable end-to-end packet delivery. This involves developing new features for handling traffic with reduced latency, tackling routing scalability issues more securely, and offering new services at lower cost. Considering that the prices of DRAM and TCAM in legacy routers are not necessarily decreasing at the desired pace, cloud computing can be a great solution for managing the increasing computation and memory complexity of routing functions in a centralized manner with optimized expenses. Focusing on the attributes associated with existing routing cost models, and by exploring a hybrid approach to SDN, we also compare recent trends in cloud pricing (for both storage and service) to evaluate whether it would be economically beneficial to integrate cloud services with legacy routing for improved cost-efficiency. In terms of peering, using the US as a case study, we show the overlaps between access ISPs and content providers to explore the viability of peering between the newly emerging content-dominated sugarcane ISPs and the health of Internet economics. To this end, we introduce meta-peering, a term that encompasses automation efforts related to peering, from identifying a list of ISPs likely to peer, to injecting control-plane rules, to continuously monitoring for and notifying of any violation. Meta-peering is one of the many outgrowths of the vertical integration process and could be offered to ISPs as a standalone service.
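The economic comparison between legacy router hardware and cloud-hosted routing can be framed as a simple break-even calculation. All quantities below are hypothetical placeholders, not figures from the dissertation's cost models:

```python
def break_even_months(router_capex, router_opex_pm, cloud_pm):
    """Months until up-front router spending (capex plus a lower monthly
    opex) is overtaken by a pay-as-you-go cloud routing service billed
    monthly. Returns 0 if the cloud option is cheaper from day one.
    All inputs are hypothetical prices, not measured data."""
    if cloud_pm <= router_opex_pm:
        return 0.0
    return router_capex / (cloud_pm - router_opex_pm)

# e.g., a $5000 TCAM-heavy line card with $100/month opex versus a
# $300/month cloud service (all numbers invented for illustration):
months = break_even_months(5000.0, 100.0, 300.0)  # -> 25.0
```

The interesting cases are exactly the ones the dissertation studies: when routing-table growth keeps pushing capex up while cloud prices fall, the break-even horizon shortens and offloading routing functions becomes attractive.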
237

Data-Driven Modeling and Optimization of Building Energy Consumption

Grover, Divas 01 January 2019 (has links)
Sustainability and reducing energy consumption are targets for building operations. The installation of smart sensors and Building Automation Systems (BAS) makes it possible to study facility operations under different circumstances, and these technologies generate large amounts of data that can be scraped and used for analysis. In this thesis, we focus on the process of data-driven modeling and decision making, from scraping the data to simulating the building and optimizing its operation. The City of Orlando has similar goals of sustainability and reduced energy consumption, so it provided us access to its BAS to collect data and study the operation of its facilities. The data scraped from the City's BAS servers can be used to develop statistical/machine learning methods for decision making. We selected a mid-size pilot building to apply these techniques. The process begins with the collection of data from the BAS. An Application Programming Interface (API) client is developed to log in to the servers, scrape data for all data points, and store it on the local machine. The data is then cleaned for analysis and modeling. The dataset contains various data points, ranging from indoor and outdoor temperature to the speed of fans inside the Air Handling Units (AHUs), which are operated by Variable Frequency Drives (VFDs). The whole dataset is a time series and is handled accordingly. The cleaned dataset is analyzed to find patterns and investigate relations between data points; this analysis guides the choice of parameters for the models developed in the next step. Different statistical models are developed to simulate building and equipment behavior. Finally, the models, along with the data, are used to optimize building operation subject to equipment constraints, yielding operational decisions that reduce energy consumption while maintaining temperature and pressure inside the building.
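The scrape-then-clean step can be sketched with stdlib Python: group raw (timestamp, value) samples into hourly averages and drop sensor dropouts. The sample values are hypothetical, and the real pipeline's authentication against the City's BAS servers is omitted here:

```python
from collections import defaultdict
from datetime import datetime

def hourly_means(readings):
    """Aggregate raw BAS samples (timestamp, value) into hourly averages,
    skipping missing (None) readings -- a minimal stand-in for the
    cleaning step between scraping and modeling."""
    buckets = defaultdict(list)
    for ts, val in readings:
        if val is None:
            continue  # drop sensor dropouts instead of propagating them
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(val)
    return {h: sum(v) / len(v) for h, v in sorted(buckets.items())}

# hypothetical indoor-temperature samples, including one dropout:
samples = [
    (datetime(2019, 7, 1, 14, 5), 72.0),
    (datetime(2019, 7, 1, 14, 35), 74.0),
    (datetime(2019, 7, 1, 15, 10), None),
    (datetime(2019, 7, 1, 15, 40), 71.0),
]
```

Resampling to a fixed interval like this is what lets the later statistical models treat every data point (temperatures, fan speeds, VFD setpoints) as an aligned time series.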
238

Predicting software effort for a new project using data from a casebase of previously completed projects

Chan, Wai Lun 01 January 1997 (has links)
No description available.
239

Neural algorithms for EMI based landmine detection

Draper, Matthew C. 01 January 2003 (has links)
Landmines are a major problem facing the world today. There are millions of these deadly weapons still buried in various countries around the world. Humanitarian organizations dedicate an immeasurable amount of time, effort, and money to find and remove as many of these mines as possible. Over the past decade the US Government has become involved and has encouraged much research into improving landmine sensor technologies such as Ground Penetrating Radar, infrared cameras, electromagnetic induction sensors, and a variety of other technologies. The major goal of this research has been two-fold: to improve the probability of detection of landmines and, equally important, to reduce the probability of false alarms. The major cost of de-mining is incurred in the effort to safely remove suspected landmines from the ground. Technicians have to carefully dig up each object, treating it as a live mine or a piece of unexploded ordnance. Unfortunately, landmines can be made of fairly common materials such as metal, wood, and plastic, which can confuse the sensor and cause it to erroneously report ordinary material in the field as mines. In an effort to reduce the number of false alarms, researchers have investigated the use of computers to analyze the raw data coming from the sensor: such computers could process the raw data and decide whether or not a certain location contains a mine. One popular avenue in this field of research is the use of neural networks. This thesis examines a variety of neural network approaches to mine detection and looks specifically at the use of an artificial neural network (ANN) with data that has been pre-processed with the 8-technique and S-Statistic. It is shown that an ANN that uses the 8-technique and S-Statistic as inputs achieves an acceptably high probability of detection with a low probability of false alarms.
It is also shown that the pre-processing is responsible for most of the performance gain, as the Back Propagation Neural Network (BPNN) and Random Neural Network (RNN) models achieve similar probabilities of detection. The BPNN, however, does consistently perform better than the RNN by a small margin.
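As a schematic of the classification stage only, not the BPNN or RNN models evaluated in the thesis, a single logistic neuron trained by gradient descent can separate a toy "mine vs. clutter" set built from two pre-processed features; all feature values below are invented for illustration:

```python
import math
import random

def train_neuron(data, epochs=2000, lr=0.5, seed=1):
    """Train one logistic neuron (2 weights + bias) by gradient descent
    on cross-entropy loss. `data` is a list of ((x1, x2), label) pairs."""
    rng = random.Random(seed)
    w = [rng.uniform(-0.5, 0.5) for _ in range(2)]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), y in data:
            z = w[0] * x1 + w[1] * x2 + b
            p = 1.0 / (1.0 + math.exp(-z))  # sigmoid activation
            g = p - y                        # dLoss/dz for cross-entropy
            w[0] -= lr * g * x1
            w[1] -= lr * g * x2
            b -= lr * g
    return w, b

def predict(w, b, x):
    """Hard decision: 1 = mine, 0 = clutter."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

# hypothetical pre-processed feature vectors: high values of both
# statistics correspond to a buried mine in this toy set
toy = [((0.9, 0.8), 1), ((0.8, 0.9), 1), ((0.1, 0.2), 0), ((0.2, 0.1), 0)]
w, b = train_neuron(toy)
```

The thesis's finding that pre-processing drives most of the gain is visible even at this scale: with well-separated input features, almost any classifier draws the right boundary.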
240

Modeling and optimization of emerging on-chip cooling technologies via machine learning

Yuan, Zihao 30 August 2022 (has links)
Over the last few decades, processor performance has continued to grow due to the down-scaling of transistor dimensions. This performance boost has translated into high power densities and localized hot spots, which decrease the lifetime of processors and increase transistor delays and leakage power. Conventional on-chip cooling solutions are often insufficient to efficiently mitigate such high-power-density hot spots. Emerging cooling technologies such as liquid cooling via microchannels, thermoelectric coolers (TECs), two-phase vapor chambers (VCs), and hybrid cooling options (e.g., of liquid cooling via microchannels and TECs) have the potential to provide better cooling performance compared to conventional cooling solutions. However, these potential solutions’ cooling performance and cooling power vary significantly based on their design and operational parameters (such as liquid flow velocity, evaporator design, TEC current, etc.) and the chip specifications. In addition, the cooling models of such emerging cooling technologies may require additional Computational Fluid Dynamics (CFD) simulations (e.g., two-phase cooling), which are time-consuming and have large memory requirements. Given the vast solution space of possible cooling solutions (including possible hybrids) and cooling subsystem parameters, the optimal solution search time is also prohibitively time-consuming. To minimize the cooling power overhead while satisfying chip thermal constraints, there is a need for an optimization flow that enables rapid and accurate thermal simulation and selection of the best cooling solution and the associated cooling parameters for a given chip design and workload profile. This thesis claims that combining the compact thermal modeling methodology with machine learning (ML) models enables rapidly and accurately carrying out thermal simulations and predicting the optimal cooling solution and its cooling parameters for arbitrary chip designs. 
The thesis aims to realize this optimization flow on three fronts. First, it proposes a parallel compact thermal simulator, PACT, which enables fast and accurate standard-cell-level to architecture-level thermal analysis for processors. PACT is highly extensible and applicable, and it models and evaluates the thermal behavior of emerging integration technologies (e.g., monolithic 3D) and cooling technologies (e.g., two-phase VCs). Second, it proposes an ML-based temperature-dependent simulation framework designed for two-phase cooling methods to enable fast and accurate thermal simulations; this framework can also be applied to other emerging cooling technologies. Third, the thesis proposes a systematic way to create novel deep learning (DL) models that predict the optimal cooling methods and cooling parameters for a given chip design. Through experiments based on real-world high-power-density chips and their floorplans, this thesis aims to demonstrate that using ML models substantially reduces the simulation time of emerging cooling technologies (e.g., by up to 21x) and improves the optimization time of emerging cooling solutions (e.g., by up to 140x) while achieving the same optimization accuracy as brute-force methods.
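The surrogate-modeling idea, replacing expensive CFD runs with a learned map from cooling parameters to temperature, can be shown at its absolute simplest with a one-variable least-squares fit. The sample points below are invented, and the dissertation's DL models are far richer than this:

```python
def fit_linear(xs, ys):
    """Ordinary least squares for y = a*x + b -- the simplest possible
    surrogate mapping one cooling parameter to a peak temperature,
    standing in for the ML models described in the text."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx

# hypothetical simulator samples: higher coolant velocity -> lower peak temp
velocity = [0.5, 1.0, 1.5, 2.0]       # m/s (invented)
peak_t   = [95.0, 88.0, 81.0, 74.0]   # degrees C (invented)
a, b = fit_linear(velocity, peak_t)

# surrogate prediction at an operating point that was never simulated:
t_pred = a * 1.25 + b
```

Once trained, evaluating the surrogate costs microseconds instead of a full CFD run, which is precisely how an optimization loop over cooling solutions and parameters becomes tractable.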
