Global ETD Search

51	Optimizing Virtual Machine I/O Performance in Cloud Environments Lu, Tao 01 January 2016 Maintaining closeness between data sources and data consumers is crucial for workload I/O performance. In cloud environments, this kind of closeness can be violated by system administrative events and storage architecture barriers. VM migration events are frequent in cloud environments. VM migration changes VM runtime inter-connection or cache contexts, significantly degrading VM I/O performance. Virtualization is the backbone of cloud platforms. I/O virtualization adds additional hops to the workload data access path, prolonging I/O latencies. I/O virtualization overheads cap the throughput of high-speed storage devices and impose high CPU utilization and energy consumption on cloud infrastructures. To maintain the closeness between data sources and workloads during VM migration, we propose Clique, an affinity-aware migration scheduling policy, to minimize the aggregate wide area communication traffic during storage migration in virtual cluster contexts. In host-side caching contexts, we propose Successor to recognize warm pages and prefetch them into caches of destination hosts before migration completion. To bypass the I/O virtualization barriers, we propose VIP, an adaptive I/O prefetching framework, which utilizes a virtual I/O front-end buffer for prefetching so as to avoid the on-demand involvement of I/O virtualization stacks and accelerate the I/O response. Analysis on the traffic trace of a virtual cluster containing 68 VMs demonstrates that Clique can reduce inter-cloud traffic by up to 40%. Tests of the MPI Reduce_scatter benchmark show that Clique can keep VM performance during migration at up to 75% of the non-migration scenario, which is more than 3 times that of the Random VM choosing policy.
In host-side caching environments, Successor performs better than existing cache warm-up solutions and achieves zero VM-perceived cache warm-up time with low resource costs. At the system level, we conducted a comprehensive quantitative analysis of I/O virtualization overheads. Our trace replay based simulation demonstrates the effectiveness of VIP for data prefetching with negligible additional cache resource costs. Cloud Computing Virtualization Virtual Machine Live Migration Cache Virtual I/O Computer and Systems Architecture Data Storage Systems Electrical and Computer Engineering Systems and Communications
52	DATA MINING: TRACKING SUSPICIOUS LOGGING ACTIVITY USING HADOOP Sodhi, Bir Apaar Singh 01 March 2016 In this modern, highly interconnected era, an organization’s top priority is to protect itself from the major security breaches that occur frequently within communication environments. Yet organizations often seem to fail at doing so. Every week there are new headlines about forged information, stolen funds, fraudulent credit card use, and so on. Hackers turn personal computers into “zombie machines” to steal confidential and financial information without disclosing their true identity. These identity thieves rob private data and ruin the very purpose of privacy. The purpose of this project is to identify suspicious user activity by analyzing a log file, which can later help an investigative agency such as the FBI track and monitor anonymous users who look for weaknesses in order to attack vulnerable parts of a system and gain access to it. The project also emphasizes the potential damage that malicious activity could do to the system. This project uses the Hadoop framework to search and store log files for logging activity and then runs a MapReduce program to compute and analyze the results. Parallel Computing Distributed File System Java Programming Parser Big Data Partitioner Reducer Combiner Mapper Computer and Systems Architecture Data Storage Systems Information Security Programming Languages and Compilers
53	Rethinking the I/O Stack for Persistent Memory Chowdhury, Mohammad Ataur Rahman 28 March 2018 Modern operating systems have been designed around the hypotheses that (a) memory is both byte-addressable and volatile and (b) storage is block-addressable and persistent. The arrival of new Persistent Memory (PM) technologies has made these assumptions obsolete. Despite much recent work in this space, the need for consistently sharing PM data across multiple applications remains an urgent, unsolved problem. Furthermore, the availability of simple yet powerful operating system support remains elusive. In this dissertation, we propose and build The Region System, a high-performance operating system stack for PM that implements usable consistency and persistence for application data. The region system provides support for consistently mapping and sharing data resident in PM across user application address spaces. The region system creates a novel IPI-based PMSYNC operation, which ensures atomic persistence of mapped pages across multiple address spaces. This allows applications to consume PM using the well-understood and much-desired memory-like model with an easy-to-use interface. Next, we propose a metadata structure without any redundant metadata to reduce CPU cache flushes. The high-performance design minimizes the expensive PM ordering and durability operations by embracing a minimalistic approach to metadata construction and management. To strengthen the case for the region system, in this dissertation, we analyze different types of applications to identify their dependence on memory-mapped data usage, and propose user-level libraries LIBPM-R and LIBPMEMOBJ-R to support shared persistent containers. The user-level libraries along with the region system demonstrate a comprehensive end-to-end software stack for consuming PM devices.
Persistent Memory Storage Systems Operating Systems Persistent Containers Computer and Systems Architecture Computer Engineering Computer Sciences Data Storage Systems OS and Networks Software Engineering Systems Architecture
54	Optimizing Main Memory Usage in Modern Computing Systems to Improve Overall System Performance Campello, Daniel Jose 20 June 2016 Operating systems use fast, CPU-addressable main memory to maintain an application’s temporary data as anonymous data and to cache copies of persistent data stored in slower block-based storage devices. However, the use of this faster memory comes at a high cost. Therefore, several techniques for using main memory more efficiently have been implemented in the literature. In this dissertation we introduce three distinct approaches to improve overall system performance by optimizing main memory usage. First, DRAM and host-side caching of file system data are used for speeding up virtual machine performance in today’s virtualized data centers. The clustering of VM images that share identical pages, coupled with data deduplication, has the potential to optimize main memory usage, since it provides more opportunity for sharing resources across processes and across different VMs. In our first approach, we study the use of content and semantic similarity metrics and a new algorithm to cluster VM images and place them on hosts where deduplication improves main memory usage. Second, while careful VM placement can improve memory usage by eliminating duplicate data, caches in current systems employ complex machinery to manage the cached data. Writing data to a page not present in the file system page cache causes the operating system to synchronously fetch the page into memory, blocking the writing process. In this thesis, we address this limitation with a new approach to managing page writes that buffers the written data elsewhere in memory and unblocks the writing process immediately. This buffering allows the system to service file writes faster and with fewer memory resources.
In our last approach, we investigate the use of emerging byte-addressable persistent memory technology to extend main memory as a less costly alternative to exclusively using expensive DRAM. We motivate and build a tiered memory system wherein persistent memory and DRAM co-exist and provide improved application performance at lower cost and power consumption with the goal of placing the right data in the right memory tier at the right time. The proposed approach seamlessly performs page migration across memory tiers as access patterns change and/or to handle tier memory pressure. operating systems storage persistent memory clustering virtual machine caching memory tiering asynchronous I/O system performance Computer Sciences Data Storage Systems OS and Networks Systems Architecture Theory and Algorithms
55	Sustainable Resource Management for Cloud Data Centers Mahmud, A. S. M. Hasan 15 June 2016 In recent years, the demand for data center computing has increased significantly due to the growing popularity of cloud applications and Internet-based services. Today's large data centers host hundreds of thousands of servers and the peak power rating of a single data center may even exceed 100 MW. The combined electricity consumption of global data centers accounts for about 3% of worldwide production, raising serious concerns about their carbon footprint. Utility providers and governments are consistently pressuring data center operators to reduce their carbon footprint and energy consumption. While these operators (e.g., Apple, Facebook, and Google) have taken steps to reduce their carbon footprints (e.g., by installing on-site/off-site renewable energy facilities), they are aggressively looking for new approaches that do not require expensive hardware installation or modification. This dissertation focuses on developing algorithms and systems to improve sustainability in data centers without incurring significant additional operational or setup costs. In the first part, we propose a provably-efficient resource management solution for a self-managed data center to cap and reduce the carbon emission while maintaining satisfactory service performance. Our solution reduces the carbon emission of a self-managed data center to net-zero level and achieves carbon neutrality. In the second part, we consider minimizing the carbon emission in a hybrid data center infrastructure that includes geographically distributed self-managed and colocation data centers. This segment identifies and addresses the challenges of resource management in a hybrid data center infrastructure and proposes an efficient distributed solution to optimize the workload and resource allocation jointly in both self-managed and colocation data centers.
In the final part, we explore sustainable resource management from cloud service users' point of view. A cloud service user purchases computing resources (e.g., virtual machines) from the service provider and does not have direct control over the carbon emission of the service provider's data center. Our proposed solution encourages a user to take part in sustainable (both economically and environmentally) computing by limiting its spending on cloud resource purchases while satisfying its application performance requirements. Data center resource management carbon neutrality distributed resource management ADMM cloud service budget Computer and Systems Architecture Data Storage Systems Systems Architecture Theory and Algorithms
56	Hadoop Based Data Intensive Computation on IAAS Cloud Platforms Vijayakumar, Sruthi 01 January 2015 Cloud computing is a relatively new form of computing which uses virtualized resources. It is dynamically scalable and is often provided as a pay-per-use service over the Internet or Intranet or both. With increasing demand for data storage in the cloud, the study of data-intensive applications is becoming a primary focus. Data-intensive applications are those which involve high CPU usage and process large volumes of data, typically hundreds of gigabytes, terabytes, or petabytes in size. The research in this thesis focuses on Amazon Elastic Compute Cloud (EC2) and Amazon Elastic MapReduce (EMR), using the HiBench Hadoop benchmark suite. HiBench is used for performing and evaluating Hadoop-based data-intensive computation on both of these cloud platforms. Both quantitative and qualitative comparisons of Amazon EC2 and Amazon EMR are presented. Also presented are their pricing models and suggestions for future research. Thesis University of North Florida UNF Computer and Systems Architecture Data Storage Systems Hardware Systems Other Computer Engineering
57	A book management system eLibrary Song, Shanpeng 01 January 2004 "eLibrary" is a book management software application that runs on Microsoft Windows platforms. The software incorporates a Windows Explorer-like interface and uses XML/XSL to display book details. The purpose of this project is to build a full-featured, commercial-quality software package to help people manage their books (either printed or electronic). The goal is for eLibrary to be a complete solution for people who wish to build their own personal electronic library catalog. Electronic books -- Computer programs Private libraries -- Catalogs Cataloging -- Software Database design Data Storage Systems Software Engineering
58	Entertainics Garza, Jesus Mario Torres 01 January 2003 Entertainics is a web-based software application used to gather information about DVD players from several websites on the Internet. The purpose of this software is to help users search for DVD players faster and more easily, by removing the need to navigate every website that lists this product. Web search engines World Wide Web -- Subject access DVD players User interfaces (Computer systems) Cataloging of computer network resources Data Storage Systems Software Engineering
59	System-wide Performance Analysis for Virtualization Jensen, Deron Eugene 13 June 2014 With the current trend in cloud computing and virtualization, more organizations are moving their systems from a physical host to a virtual server. Although this can significantly reduce hardware, power, and administration costs, it can increase the cost of analyzing performance problems. With virtualization, there is an initial performance overhead, and as more virtual machines are added to a physical host, the interference between guest machines increases. When this interference occurs, a virtualized guest application may not perform as expected. The virtual OS has little or no information about the interference, and the current performance tools in the guest are unable to show it. We examine the interference that has been shown in previous research, and relate that to existing tools and research in root cause analysis. We show that in virtualization there are additional layers which need to be analyzed, and design a framework to determine if degradation is occurring from an external virtualization layer. Additionally, we build a virtualization test suite with Xen and PostgreSQL and run multiple tests to create I/O interference. We show that our method can distinguish between a problem caused by interference from external systems and a problem from within the virtual guest. Computer systems -- Evaluation Cloud computing -- Research System design -- Research Input-output analysis Computer and Systems Architecture Data Storage Systems
60	A Method for Monitoring Operating Equipment Effectiveness with the Internet of Things and Big Data Hays, Carl D, III 01 June 2021 The purpose of this paper was to take the Overall Equipment Effectiveness productivity formula used in plant manufacturing and adapt it to measure forklift productivity. Productivity for a forklift was defined as being available and picking up and moving containers at port locations in Seattle and Alaska. This research takes performance measures from plant manufacturing and applies them to mobile equipment in order to establish the most effective means of analyzing reliability and productivity. Data were collected through the Internet of Things on fifteen forklift trucks in three different locations and analyzed over a six-month period to rank the forklifts’ productivity from 1 to 15 using the Operating Equipment Effectiveness formula (OPEE). This ranking was compared to the industry standard for utilization to demonstrate how this approach would yield a better performance analysis and provide a more accurate tool for operations managers to manage their fleets of equipment than current methods. This analysis was shared with a fleet operations manager, and his feedback indicated there would be considerable value in analyzing his operations using this process. The results of this research identified key areas for improvement in equipment reliability and the need for additional operator training on the proper use of machines, and provided insights into equipment operations in remote locations to managers who had not visited or evaluated those locations on-site. IoT IIoT Telematics Big Data OEE Industry 4.0 Agribusiness Business Analytics Business Intelligence Computer and Systems Architecture Data Storage Systems Hardware Systems
