31 |
Opportunities for near data computing in MapReduce workloads
Pugsley, Seth Hintze 25 June 2015
<p> In-memory big data applications are growing in popularity, including in-memory versions of the MapReduce framework. The move away from disk-based datasets shifts the performance bottleneck from slow disk accesses to memory bandwidth. MapReduce is a data-parallel application, and is therefore amenable to being executed on as many parallel processors as possible, with each processor requiring high amounts of memory bandwidth. We propose using Near Data Computing (NDC) as a means to develop systems that are optimized for in-memory MapReduce workloads, offering high compute parallelism and even higher memory bandwidth. This dissertation explores three different implementations and styles of NDC to improve MapReduce execution. First, we use 3D-stacked memory+logic devices to process the Map phase on compute elements in close proximity to database splits. Second, we attempt to replicate the performance characteristics of the 3D-stacked NDC using only commodity memory and inexpensive processors to improve performance of both Map and Reduce phases. Finally, we incorporate fixed-function hardware accelerators to improve sorting performance within the Map phase. This dissertation shows that it is possible to improve in-memory MapReduce performance by potentially two orders of magnitude by designing system and memory architectures that are specifically tailored to that end.</p>
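A minimal, single-process sketch of the data-parallel Map and Reduce phases mentioned above. Word counting is only an illustrative workload, and the names `map_split` and `reduce_counts` are hypothetical; nothing here is taken from the dissertation's implementation.

```python
from collections import Counter
from functools import reduce

def map_split(split):
    """Map phase: count words within one split, independently of all others."""
    return Counter(split.split())

def reduce_counts(partials):
    """Reduce phase: merge the per-split partial counts into one result."""
    return reduce(lambda a, b: a + b, partials, Counter())

# Hypothetical in-memory splits; each map_split call could run on its own processor.
splits = ["near data computing", "data parallel computing", "near memory"]
partials = [map_split(s) for s in splits]
totals = reduce_counts(partials)
```

Because each split is mapped without reference to any other, the Map phase is the part that benefits most directly from placing compute next to the memory holding that split.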
|
32 |
Wearable human activity recognition systems
Ameri-Daragheh, Alireza 12 September 2015
<p> In this thesis, we focused on designing wearable human activity recognition (WHAR) systems. As a first step, we conducted a thorough survey of the publications in this area from the past ten years. We then proposed an all-purpose architecture for designing the software of WHAR systems. Among the various applications of these wearable systems, we chose to work on a wearable virtual fitness coach device that can recognize the types and intensities of the warm-up exercises an athlete performs. We first proposed a basic hardware platform for implementing the WHAR software. The software design was then done in two phases. In the first phase, we focused on four simple activities to be recognized by the wearable device. We used the Weka machine learning tool to build a mathematical model that could recognize the four activities with an accuracy of 99.32%. Moreover, we proposed an algorithm to measure the intensity of the activities with an accuracy of 93%. In the second phase, we focused on eight complex warm-up exercises. After building the mathematical model, the WHAR system could recognize the eight activities with an accuracy of 95.60%.</p>
|
33 |
Monitoring vehicle entry and departure through location-based services
Chopra, Varun Nikhil G. 10 December 2015
<p> Shipping ports and terminals are usually very busy, with long periods of high traffic. The highly trafficked areas of shipping terminals at ports often contribute to high-density traffic, which affects the community and the port by possibly extending commuters' travel time, delaying shipments of goods, and potentially creating a safety hazard. Location-based services such as geofencing could measure the times at which a vehicle enters and exits the terminals at the port. These services would be used in the hope of finding an efficient way to reduce traffic by notifying terminals of the entry and departure times of vehicles. By gathering vehicle travel times, representatives of the terminals at the port could develop a process to operate more efficiently. A system consisting of two applications is built to gather adequate travel times. The first is a server-side application that exposes REST endpoints, and the second is a client-side application that consumes those endpoints. This study provides an analysis and implementation of how location-based services establish a means to measure the entry and exit times of vehicles moving through geofenced gates.</p>
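One way to picture geofence-based entry and exit timing is the sketch below. It assumes a circular geofence and a time-ordered list of GPS fixes; the function names, the coordinates, and the 200 m radius are all hypothetical and are not taken from the thesis.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def entry_exit_times(fixes, center, radius_m):
    """Return (entry_time, exit_time) pairs for a circular geofence.

    fixes: iterable of (timestamp_s, lat, lon), sorted by timestamp.
    """
    visits = []
    entered_at = None
    for t, lat, lon in fixes:
        inside = haversine_m(lat, lon, center[0], center[1]) <= radius_m
        if inside and entered_at is None:
            entered_at = t                 # first fix inside the fence
        elif not inside and entered_at is not None:
            visits.append((entered_at, t)) # first fix back outside
            entered_at = None
    return visits

# Hypothetical gate location and vehicle track (one fix per minute).
gate = (33.75, -118.25)
track = [(0, 33.76, -118.25), (60, 33.7505, -118.25),
         (120, 33.7501, -118.25), (180, 33.76, -118.25)]
visits = entry_exit_times(track, gate, radius_m=200.0)
```

In a client/server split like the one described, the client would report fixes and the server would run a check of this kind; the accuracy of the measured times is bounded by how often fixes arrive.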
|
34 |
Beneath the Attack Surface
Mowery, Keaton 18 August 2015
<p> Computer systems are often analyzed as purely virtual artifacts, a collection of software operating on a Platonic ideal of a computer. When software is executed, it runs on actual hardware: an increasingly complex web of analog physical components and processes, cleverly strung together to present an illusion of pure computation. When an abstract software system is combined with individual hardware instances to form functioning systems, the overall behavior varies subtly with the hardware. These minor variations can change the security and privacy guarantees of the entire system, in both beneficial and harmful ways. We examine several such security effects in this dissertation. </p><p> First, we look at the fingerprinting capability of JavaScript and HTML5: when invoking existing features of modern browsers, such as JavaScript execution and 3-D graphics, how are the results affected by underlying hardware, and how distinctive is the resulting fingerprint?</p><p> Second, we discuss AES side channel timing attacks, a technique to extract information from AES encryption running on hardware. We present several reasons why we were unable to reproduce this attack against modern hardware and a modern browser.</p><p> Third, we examine positive uses of hardware variance: namely, seeding Linux's pseudorandom number generator at kernel initialization time with true entropy gathered during early boot. We examine the utility of these techniques on a variety of embedded devices, and give estimates for the amount of entropy each can generate.</p><p> Lastly, we evaluate a cyberphysical system: one which combines physical processes and analog sensors with software control and interpretation. Specifically, we examine the Rapiscan Secure 1000 backscatter X-ray full-body scanner, a device for looking under a scan subject's clothing, discovering any contraband secreted about their person. 
We present a full security analysis of this system, including its hardware, software, and underlying physics, and show how an adaptive, motivated adversary can completely subvert the scan to smuggle contraband, such as knives, firearms, and plastic explosives, past a Secure 1000 checkpoint. These attacks are entirely based upon understanding the physical processes and sensors which underlie this cyberphysical system, and involve adjusting the contraband's location and shape until it simply disappears.</p>
|
35 |
Post-silicon Functional Validation with Virtual Prototypes
Cong, Kai 27 August 2015
<p> Post-silicon validation has become a critical stage in the system-on-chip (SoC) development cycle, driven by increasing design complexity, higher levels of integration and decreasing time-to-market. According to recent reports, post-silicon validation effort comprises more than 50% of the overall development effort of a 65nm SoC. Though post-silicon validation covers many aspects ranging from electronic properties of hardware to performance and power consumption of whole systems, a central task remains validating functional correctness of both hardware and its integration with software. There are several key challenges to achieving accelerated and low-cost post-silicon functional validation. First, there is only limited silicon observability and controllability; second, there is no good test coverage estimation over a silicon device; third, it is difficult to generate good post-silicon tests before a silicon device is available; fourth, there are no effective software robustness testing approaches to ensure the quality of hardware/software integration.</p><p> We propose a systematic approach to accelerating post-silicon functional validation with virtual prototypes. Post-silicon test coverage is estimated in the pre-silicon stage by evaluating the test cases on the virtual prototypes. Such analysis is first conducted on the initial test suite assembled by the user and subsequently on the expanded test suite which includes test cases that are automatically generated. Based on the coverage statistics of the initial test suite on the virtual prototypes, test cases are automatically generated to improve the test coverage. In the post-silicon stage, our approach supports coverage evaluation of test cases on silicon devices to ensure fidelity of early coverage evaluation. The generated test cases are issued to silicon devices to detect inconsistencies between virtual prototypes and silicon devices using conformance checking. 
We further extend the test case generation framework to generate and inject fault scenarios with virtual prototypes for driver robustness testing. Besides virtual prototype-based fault injection, an automatic driver fault injection approach is developed to support runtime fault generation and injection for driver robustness testing. Since virtual prototypes enable early driver development, our automatic driver fault injection approach can be applied to driver testing in both pre-silicon and post-silicon stages. </p><p> For preliminary evaluation, we have applied our coverage evaluation and test generation to several network adapters and their virtual prototypes. We have conducted coverage analysis for a suite of common tests on both the virtual prototypes and silicon devices. The results show that our approach can estimate the test coverage with high fidelity. Based on the coverage estimation, we have employed our automatic test generation approach to generate additional tests. When the generated test cases were issued to both virtual prototypes and silicon devices, we observed significant coverage improvement. We also detected 20 inconsistencies between virtual prototypes and silicon devices, each of which reveals a virtual prototype or silicon device defect. After we applied the virtual prototype-based fault injection approach to virtual prototypes for three widely used network adapters, we generated and injected thousands of fault scenarios and found 2 driver bugs. For automatic driver fault injection, we have applied our approach to 12 widely used drivers with either virtual prototypes or silicon devices. After testing all these drivers, we found 28 distinct bugs.</p>
|
36 |
Document and natural image applications of deep learning
Kang, Le 31 October 2015
<p> A tremendous amount of digital visual data is being collected every day, and we need efficient and effective algorithms to extract useful information from that data. Considering the complexity of visual data and the expense of human labor, we expect algorithms to have enhanced generalization capability and depend less on domain knowledge. While many topics in computer vision have benefited from machine learning, some document analysis and image quality assessment problems still have not found the best way to utilize it. In the context of document images, a compelling need exists for reliable methods to categorize and extract key information from captured images. In natural image content analysis, accurate quality assessment has become a critical component for many applications. Most current approaches, however, rely on the heuristics designed by human observations on severely limited data. These approaches typically work only on specific types of images and are hard to generalize on complex data from real applications. </p><p> This dissertation looks to address the challenges of processing heterogeneous visual data by applying effective learning methods that directly model the data with minimal preprocessing and feature engineering. We focus on three important problems - text line detection, document image categorization, and image quality assessment. The data we work on typically contains unconstrained layouts, styles, or noise, which resemble the real data from applications. First, we present a graph-based method, learning the line structure from training data for text line segmentation in handwritten document images, and a general framework to detect multi-oriented scene text lines using Higher-Order Correlation Clustering. Our method depends less on domain knowledge and is robust to variations in fonts or languages. Second, we introduce a general approach for document image genre classification using Convolutional Neural Networks (CNN). 
The introduction of CNNs for document image genre classification largely reduces the need for hand-crafted features or domain knowledge. Third, we present our CNN-based methods for general-purpose No-Reference Image Quality Assessment (NR-IQA). Our methods bridge the gap between NR-IQA and CNNs and open the door to a broad range of deep learning methods. With excellent local quality estimation ability, our methods demonstrate state-of-the-art performance on both distortion identification and quality estimation.</p>
|
37 |
Efficient ray tracing architectures
Spjut, Josef Bo 22 October 2015
<p> This dissertation presents computer architecture designs that are efficient for ray tracing based rendering algorithms. The primary observation is that ray tracing maps better to independent thread issue hardware designs than it does to dependent thread and data designs used in most commercial architectures. While the independent thread issue causes extra overhead in the fetch and issue parts of the pipeline, the number of computation resources required can be reduced through the sharing of less frequently used execution units. Furthermore, since all the threads run a single program on multiple data (SPMD), thread processors can share instruction and data caches. Ray tracing needs read-only access to the scene data during each frame, so caches can be optimized for reading, and traditional cache coherence protocols are unnecessary for maintaining coherent memory access. The resultant image exists as a write only frame buffer, allowing memory writes to avoid the cache entirely, preventing cache pollution and increasing the performance of smaller caches. </p><p> Commercial real-time rendering systems lean heavily on high-performance graphics processing units (GPU) that use the rasterization and z-buffer algorithms for rendering. A single pass of rasterization throws out much of the global scene information by streaming the surface data that a ray tracer keeps resident in memory. As a result, ray tracing is more naturally able to support rendering effects involving global information, such as shadows, reflections, refractions and camera lens effects. Rasterization has a time complexity of approximately <i> O</i>(<i>N log</i>(<i>P</i>)) where <i>N</i> is the number of primitive polygons and <i>P</i> is the number of pixels in the image. Ray tracing, in contrast, has a time complexity of <i> O</i>(<i>P log</i>(<i>N</i>)) making ray tracing scale better to large scenes with many primitive polygons, allowing for increased surface detail. 
Finally, once the number of pixels reaches its limit, ray tracing should exceed the performance of rasterization by allowing the number of objects to increase with less of a penalty on performance.</p>
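The scaling argument above can be illustrated numerically. The sketch below simply evaluates the two asymptotic expressions from the abstract with all constant factors set to 1, which is an assumption made purely for illustration; the function names and the pixel count are hypothetical, and the absolute values say nothing about real hardware.

```python
import math

def rasterization_cost(n_polygons, n_pixels):
    """Approximate O(N log P) growth, constant factor assumed to be 1."""
    return n_polygons * math.log2(n_pixels)

def ray_tracing_cost(n_polygons, n_pixels):
    """Approximate O(P log N) growth, constant factor assumed to be 1."""
    return n_pixels * math.log2(n_polygons)

pixels = 2_000_000  # hypothetical fixed resolution of about two megapixels
for polygons in (10**3, 10**6, 10**9):
    r = rasterization_cost(polygons, pixels)
    t = ray_tracing_cost(polygons, pixels)
    print(f"N={polygons:>10}: rasterization~{r:.3g}, ray tracing~{t:.3g}")
```

With the pixel count held fixed, the rasterization expression grows linearly in the number of polygons while the ray tracing expression grows only logarithmically, so under this simplified model the ordering flips once the polygon count is large relative to the pixel count.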
|
38 |
Fast modular exponentiation using residue domain representation: A hardware implementation and analysis
Nguyen, Christopher Dinh 01 March 2014
<p> Using modular exponentiation as an application, we engineered on FPGA fabric and analyzed the first implementation of two arithmetic algorithms in Reduced-Precision Residue Number Systems (RP-RNS): the partial-reconstruction algorithm and quotient-first scaling algorithm. Residue number systems (RNS) provide an alternative representation to the binary system for computation. They offer full parallel computation for addition, subtraction, and multiplication. However, base extension, division, and sign detection become harder operations. Phatak's RP-RNS uses a time-memory trade-off to achieve O(lg N) running time for base extension and scaling, where N is the bit-length of the operands, compared with Kawamura's Cox-Rower architecture and its derivatives, which appear to take O(N) steps and therefore O(N) delay to the best of our knowledge. We implemented the fully parallel RP-RNS architecture based on Phatak's description and architecture diagrams. Our design decisions included distributing the lookup tables among each channel, removing the adder trees, and removing the parallel table access, thus trading size for speed. In retrospect, we should have hosted the tables in memory off the FPGA. We measured the FPGA utilization, storage size, and cycle counts. The data we present, though less than optimal, confirms the theoretical trends calculated by Phatak. FPGA utilization grows proportionally to K log(K), where K is the number of hardware channels. Storage grows proportionally to O(N<sup>3</sup> lg lg N). When using Phatak's recommendations, cycle count grows proportionally to O(lg N). Our contributions include documentation of our design, architecture, and implementation; a detailed testing methodology; and performance data based on our implementation to enable others to replicate our implementation and findings.</p>
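To make the channel-wise parallelism of a residue number system concrete, here is a minimal sketch of a plain RNS with Chinese Remainder Theorem reconstruction. It is not Phatak's RP-RNS and does not implement base extension or scaling; the moduli and function names are hypothetical choices for illustration.

```python
from math import prod

MODULI = (7, 11, 13)   # pairwise coprime channels (hypothetical, tiny on purpose)
M = prod(MODULI)       # dynamic range: values 0..1000 are representable

def to_rns(x):
    """Encode an integer as one residue per channel."""
    return tuple(x % m for m in MODULI)

def rns_add(a, b):
    """Addition is carry-free: each channel works independently."""
    return tuple((x + y) % m for x, y, m in zip(a, b, MODULI))

def rns_mul(a, b):
    """Multiplication is likewise independent per channel."""
    return tuple((x * y) % m for x, y, m in zip(a, b, MODULI))

def from_rns(residues):
    """Decode with the Chinese Remainder Theorem (needs Python 3.8+ for pow(x, -1, m))."""
    total = 0
    for r, m in zip(residues, MODULI):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)
    return total % M

product = from_rns(rns_mul(to_rns(23), to_rns(41)))
```

Addition and multiplication never move data between channels, which is what makes them fully parallel; decoding, by contrast, combines every channel, which hints at why operations such as base extension and sign detection are the harder ones.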
|
39 |
PLC code vulnerabilities through SCADA systems
Valentine, Sidney E. 15 June 2013
<p> Supervisory Control and Data Acquisition (SCADA) systems are widely used in automated manufacturing and in all areas of our nation's infrastructure. Applications range from chemical processes and water treatment facilities to oil and gas production and electric power generation and distribution. Current research on SCADA system security focuses on the primary SCADA components and targets network-centric attacks. Security risks via attacks against peripheral devices such as Programmable Logic Controllers (PLCs) have not been sufficiently addressed. Our research results address the need to develop PLC applications that are correct, safe and secure. This research provides an analysis of software safety and security threats. We develop countermeasures that are compatible with existing PLC technologies. We study both intentional and unintentional software errors and propose methods to prevent them. The main contributions of this dissertation are: (1) a taxonomy of software errors and attacks in ladder logic; (2) a model of ladder logic vulnerabilities; (3) security design patterns to avoid software vulnerabilities and incorrect practices; and (4) a proof-of-concept static analysis tool that detects the vulnerabilities in PLC code and recommends corresponding design patterns.</p>
|
40 |
Result Distribution in Big Data Systems
Cheelangi, Madhusudan 09 August 2013
<p> We are building a Big Data Management System (BDMS) called <b>AsterixDB</b> at UCI. Since AsterixDB is designed to operate on large volumes of data, the results for its queries can be potentially very large, and AsterixDB is also designed to operate under high concurrency workloads. As a result, we need a specialized mechanism to manage these large volumes of query results and deliver them to the clients. In this thesis, we present an architecture and an implementation of a new result distribution framework that is capable of handling large volumes of results under high concurrency workloads. We present the various components of this result distribution framework and show how they interact with each other to manage large volumes of query results and deliver them to clients. We also discuss various result distribution policies that are possible with our framework and compare their performance through experiments. </p><p> We have implemented a REST-like HTTP client interface on top of the result distribution framework to allow clients to submit queries and obtain their results. This client interface provides two modes for clients to choose from to read their query results: synchronous mode and asynchronous mode. In synchronous mode, query results are delivered to a client as a direct response to its query within the same request-response cycle. In asynchronous mode, a query handle is returned instead to the client as a response to its query. The client can store the handle and send another request later, including the query handle, to read the result for the query whenever it wants. The architectural support for these two modes is also described in this thesis. We believe that the result distribution framework, combined with this client interface, successfully meets the result management demands of AsterixDB. </p>
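The difference between the two delivery modes can be sketched with a small in-memory stand-in. This is not AsterixDB's HTTP API: the class `ResultService`, its methods, and the response shapes are hypothetical, and the query is executed eagerly for simplicity, whereas a real asynchronous mode would typically run it in the background.

```python
import uuid

class ResultService:
    """Toy stand-in for a query service with two result-delivery modes."""

    def __init__(self, execute):
        self._execute = execute   # callable: query string -> result
        self._stored = {}         # query handle -> stored result

    def submit(self, query, mode="synchronous"):
        result = self._execute(query)  # simplification: always runs immediately
        if mode == "synchronous":
            # Result comes back within the same request-response cycle.
            return {"results": result}
        if mode == "asynchronous":
            # Only a handle comes back; the result is kept for a later read.
            handle = uuid.uuid4().hex
            self._stored[handle] = result
            return {"handle": handle}
        raise ValueError(f"unknown mode: {mode}")

    def read(self, handle):
        """Second request in asynchronous mode: exchange the handle for the result."""
        return {"results": self._stored[handle]}

service = ResultService(lambda q: q.upper())  # hypothetical "query engine"
sync_response = service.submit("select 1")
handle = service.submit("select 1", mode="asynchronous")["handle"]
async_response = service.read(handle)
```

The asynchronous path requires the service to hold results between the two requests, which is the kind of storage and lifetime question a result distribution policy has to answer.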
|