41

Shelfaware: Accelerating Collaborative Awareness with Shelf CRDT

Waidhofer, John C, 01 March 2023
Collaboration has become a key feature of modern software, allowing teams to work together effectively in real time while in different locations. In order for a user to communicate their intentions to several distributed peers, computing devices must exchange high-frequency updates with transient metadata like mouse position, text range highlights, and temporary comments. Current peer-to-peer awareness solutions have high time and space complexity due to the ever-expanding logs that each client must maintain in order to ensure robust collaboration in eventually consistent environments. This paper proposes an awareness Conflict-Free Replicated Data Type (CRDT) library that provides the tooling to support an eventually consistent, decentralized, and robust multi-user collaborative environment. Our library is tuned for rapid iterative updates that communicate fine-grained user actions across a network of collaborators. Our approach holds memory constant for subsequent writes to an existing key on a shared resource and completely prunes stale data from shared documents. These features keep the CRDT's memory footprint small, making it a feasible solution for memory-constrained applications. Results show that our CRDT implementation matches or exceeds the performance of similar data structures in high-frequency read/write scenarios.
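
To make the constant-memory claim concrete, here is a minimal sketch (not the thesis's actual algorithm) of a last-writer-wins awareness map: repeated updates to the same key overwrite a single slot rather than appending to a log, and entries from silent peers are pruned after a timeout. The class and field names are invented for illustration.

```python
# Illustrative sketch: LWW awareness map with in-place overwrites and pruning.
import time

class AwarenessMap:
    def __init__(self, ttl_seconds=30.0):
        self.ttl = ttl_seconds
        # state[peer_id][key] = (lamport_clock, value, last_seen_wall_clock)
        self.state = {}

    def local_update(self, peer_id, key, value, clock):
        # Overwrite in place: memory stays constant for repeated writes.
        self.state.setdefault(peer_id, {})[key] = (clock, value, time.time())

    def merge(self, peer_id, key, clock, value):
        # Last-writer-wins merge: keep the entry with the larger clock.
        slot = self.state.setdefault(peer_id, {})
        current = slot.get(key)
        if current is None or clock > current[0]:
            slot[key] = (clock, value, time.time())

    def prune(self):
        # Completely drop state for peers that have gone silent past the TTL.
        now = time.time()
        for peer in list(self.state):
            fresh = {k: v for k, v in self.state[peer].items()
                     if now - v[2] <= self.ttl}
            if fresh:
                self.state[peer] = fresh
            else:
                del self.state[peer]
```
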
42

API-Based Acquisition of Evidence from Cloud Storage Providers

Barreto, Andres E, 11 August 2015
Cloud computing and cloud storage services, in particular, pose a new challenge to digital forensic investigations. Currently, evidence acquisition for such services still follows the traditional approach of collecting artifacts on a client device. In this work, we show that such an approach not only requires a substantial upfront investment in reverse engineering each service, but is also inherently incomplete, as it misses prior versions of the artifacts as well as cloud-only artifacts that do not have standard serialized representations on the client. We therefore introduce the concept of API-based evidence acquisition for cloud services, which addresses these concerns by utilizing the officially supported API of the service. To demonstrate the utility of this approach, we present a proof-of-concept acquisition tool, kumodd, which can acquire evidence from four major cloud storage providers: Google Drive, Microsoft OneDrive, Dropbox, and Box. The implementation provides both command-line and web user interfaces, and can be readily incorporated into established forensic processes.
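
As a rough illustration of API-based acquisition (a hedged sketch in the spirit of kumodd, not the tool's actual code), the following uses the public Google Drive v3 REST endpoints to enumerate files and download every stored revision, capturing cloud-side history that a client-device image would miss. The ACCESS_TOKEN placeholder stands in for a real OAuth2 token.

```python
# Sketch: pull file listings and per-file revision history over the Drive API.
import requests

API = "https://www.googleapis.com/drive/v3"
HEADERS = {"Authorization": "Bearer ACCESS_TOKEN"}  # placeholder OAuth2 token

def list_files():
    # One page of file metadata; real code would follow nextPageToken.
    resp = requests.get(f"{API}/files",
                        params={"fields": "files(id,name,md5Checksum)"},
                        headers=HEADERS)
    resp.raise_for_status()
    return resp.json().get("files", [])

def acquire_revisions(file_id):
    # Enumerate prior versions -- evidence a client-side copy would miss.
    resp = requests.get(f"{API}/files/{file_id}/revisions", headers=HEADERS)
    resp.raise_for_status()
    for rev in resp.json().get("revisions", []):
        blob = requests.get(f"{API}/files/{file_id}/revisions/{rev['id']}",
                            params={"alt": "media"}, headers=HEADERS)
        if blob.ok:
            with open(f"{file_id}.{rev['id']}", "wb") as out:
                out.write(blob.content)

for f in list_files():
    acquire_revisions(f["id"])
```
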
43

Curricular Optimization: Solving for the Optimal Student Success Pathway

Thompson-Arjona, William G., 01 January 2019
Considering the significant investment that students and their families make in higher education, graduating in a timely manner is of the utmost importance. Delays attributable to dropping out or retaking a course add cost and negatively affect a student's academic progression. It therefore becomes paramount for institutions to focus on student success in relation to term scheduling. Often overlooked, the complexity of a course schedule may be one of the most important factors in whether or not a student successfully completes his or her degree. More often than not, students entering an institution as first-time, full-time (FTFT) freshmen follow the advised and published schedule given by administrators. Providing the optimal schedule that gives the student the highest probability of success is critical. To create this optimal schedule, this thesis introduces a novel optimization algorithm whose objective is to separate courses that hurt students' pass rates when taken together. Inversely, we combine synergistic pairs that improve a student's probability of success when the courses are taken in the same semester. Using actual student data from the University of Kentucky, we categorically find these positive and negative combinations by analyzing recorded pass rates. Using the Julia language on top of the Gurobi solver, we solve for the optimal degree plan of a student in the electrical engineering program using linear and non-linear multi-objective optimization. A user interface is created for administrators to optimize their curricula at main.optimizeplans.com.
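
A toy sketch of the scheduling idea follows. The thesis solves the real problem in Julia on top of Gurobi; this illustration instead uses the open-source PuLP solver, with invented pair weights, to show how conflicting course pairs can be separated and synergistic pairs combined via a linearized binary program.

```python
# Sketch: assign courses to terms, rewarding synergistic co-enrollment pairs
# and penalizing pairs that hurt pass rates. Data here is invented.
import pulp

courses = ["Calc1", "Physics1", "Circuits", "Prog1"]
terms = [1, 2]
cap = 2  # max courses per term
# weight > 0: synergistic pair; weight < 0: pair that hurts pass rates
weight = {("Calc1", "Physics1"): 2.0, ("Calc1", "Circuits"): -3.0}

prob = pulp.LpProblem("degree_plan", pulp.LpMaximize)
x = pulp.LpVariable.dicts("x", [(c, t) for c in courses for t in terms],
                          cat="Binary")
y = pulp.LpVariable.dicts("y", [(a, b, t) for (a, b) in weight for t in terms],
                          cat="Binary")

# Objective: total co-enrollment benefit across all weighted pairs.
prob += pulp.lpSum(weight[a, b] * y[a, b, t]
                   for (a, b) in weight for t in terms)

for c in courses:                       # every course scheduled exactly once
    prob += pulp.lpSum(x[c, t] for t in terms) == 1
for t in terms:                         # per-term load cap
    prob += pulp.lpSum(x[c, t] for c in courses) <= cap
for (a, b) in weight:                   # linearize y = x_a AND x_b
    for t in terms:
        prob += y[a, b, t] <= x[a, t]
        prob += y[a, b, t] <= x[b, t]
        prob += y[a, b, t] >= x[a, t] + x[b, t] - 1

prob.solve()
for c in courses:
    print(c, "-> term", next(t for t in terms if x[c, t].value() == 1))
```
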
44

Efficient Storage and Domain-Specific Information Discovery on Semistructured Documents

Farfan, Fernando R, 12 November 2009
The increasing amount of available semistructured data demands efficient mechanisms to store, process, and search an enormous corpus of data to encourage its global adoption. Current techniques to store semistructured documents either map them to relational databases, or use a combination of flat files and indexes. These two approaches result in a mismatch between the tree structure of semistructured data and the access characteristics of the underlying storage devices. Furthermore, the inefficiency of XML parsing methods has slowed down the large-scale adoption of XML into actual system implementations. The recent development of lazy parsing techniques is a major step towards improving this situation, but lazy parsers still have significant drawbacks that undermine the massive adoption of XML. Once the processing (storage and parsing) issues for semistructured data have been addressed, another key challenge to leverage semistructured data is to perform effective information discovery on such data. Previous works have addressed this problem in a generic (i.e., domain-independent) way, but this process can be improved if knowledge about the specific domain is taken into consideration. This dissertation had two general goals. The first goal was to devise novel techniques to efficiently store and process semistructured documents. This goal had two specific aims: we proposed a method for storing semistructured documents that maps the physical characteristics of the documents to the geometrical layout of hard drives, and we developed a Double-Lazy Parser for semistructured documents which introduces lazy behavior in both the pre-parsing and progressive parsing phases of the standard Document Object Model's parsing mechanism. The second goal was to construct a user-friendly and efficient engine for performing Information Discovery over domain-specific semistructured documents. This goal also had two aims: we presented a framework that exploits domain-specific knowledge to improve the quality of the information discovery process by incorporating domain ontologies, and we proposed meaningful evaluation metrics to compare the results of search systems over semistructured documents.
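
As a small illustration of the lazy-evaluation idea (not the dissertation's Double-Lazy Parser itself), the following streams an XML document with Python's iterparse and discards each subtree once consumed, so the full DOM tree is never materialized.

```python
# Sketch: progressive XML parsing that touches only the elements it needs.
import xml.etree.ElementTree as ET
from io import StringIO

doc = StringIO("<library>" +
               "".join(f"<book id='{i}'><title>T{i}</title></book>"
                       for i in range(5)) +
               "</library>")

for event, elem in ET.iterparse(doc, events=("end",)):
    if elem.tag == "book":
        print(elem.get("id"), elem.findtext("title"))
        elem.clear()  # drop the subtree's contents once consumed
```
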
45

Billing and Receivables Database Application

Lukalapu, Sushma, 01 January 2000
The purpose of this project is to design, build, and implement an information retrieval database system for the Accounting Department at CSUSB. The database will focus on the financial details of the student accounts maintained by the accounting personnel. It offers detailed information pertinent to tuition, parking, housing, boarding, etc.
46

Energy Agile Cluster Communication

Mustafa, Muhammad Zain, 18 March 2015
Computing researchers have long focused on improving energy efficiency (the amount of computation per joule) under the implicit assumption that all energy is created equal. Energy, however, is not created equal: its cost and carbon footprint fluctuate over time due to a variety of factors. These fluctuations are expected to intensify as renewable penetration increases. Thus, in my work I introduce energy-agility, a design concept for a platform's ability to rapidly and efficiently adapt to such power fluctuations. I then introduce a representative application to assess energy-agility for the type of long-running, parallel, data-intensive tasks that are both common in data centers and most amenable to delays from variations in available power. Multiple variants of the application are implemented to illustrate the fundamental tradeoffs in designing energy-agile parallel applications. I find that with inactive power state transition latencies of up to 15 seconds, a design that regularly "blinks" servers outperforms one that minimizes transitions by only changing power states when power varies. While the latter approach has much lower transition overhead, it requires additional I/O, since servers are not always concurrently active. Unfortunately, I find that most server-class platforms today are not energy-agile: they have transition latencies beyond one minute, forcing them to minimize transitions and incur additional I/O.
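
A back-of-the-envelope sketch (with invented numbers, not the thesis's experiments) shows why transition latency decides whether blinking wins: each blink period pays for one sleep and one wake, so slow power-state transitions erase the useful work.

```python
# Sketch: useful-work fraction of a blinking server under varying transition
# latency. A 15 s transition still leaves useful time; 60 s leaves none.
def useful_fraction(period_s, duty_cycle, transition_s):
    """Fraction of each blink period spent doing useful work."""
    active = period_s * duty_cycle
    overhead = 2 * transition_s          # one sleep + one wake per period
    return max(0.0, (active - overhead) / period_s)

for latency in (1, 15, 60):              # seconds per power-state transition
    frac = useful_fraction(period_s=120, duty_cycle=0.5, transition_s=latency)
    print(f"transition={latency:>2}s -> useful work {frac:.0%} of each period")
```
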
47

An Interactive Visualization Model for Analyzing Data Storage System Workloads

Pungdumri, Steven Charubhat, 01 March 2012
The performance of hard disks has become increasingly important as the volume of data storage increases. At the bottom level of large-scale storage networks is the hard disk. Despite the importance of hard drives in a storage network, it is often difficult to analyze their performance due to the sheer size of the datasets seen by hard disks. Additionally, hard drive workloads can have several multi-dimensional characteristics, such as access time, queue depth, and block-address space. The result is that hard drive workloads are extremely diverse and large, making extracting meaningful information from them very difficult. This is one reason why there are several inefficiencies in storage networks. In this paper, we develop a tool that assists in communicating valuable insights into these datasets, resulting in an approach that utilizes parallel coordinates to model data storage workloads captured with bus analyzers. Users are presented with an effective visualization of workload captures with this implementation, along with methods to interact with and manipulate the model in order to more clearly analyze the lowest level of their storage systems. Design decisions regarding the feature set of this tool are based on the analysis needs of domain experts and feedback from a user study. Results from our user study evaluations demonstrate the efficacy of our tool for uncovering valuable insights, which can potentially assist in future storage system design and deployment decisions. Using this tool, domain experts were able to model storage system datasets with various features, manipulating the visualization to make observations and discoveries, such as detecting logical block address banding and observing dataset trends that were not readily noticeable using conventional analysis methods.
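
For readers unfamiliar with the underlying technique, here is a minimal sketch of parallel-coordinates rendering (using pandas and matplotlib, not the tool described above) in which each multi-dimensional I/O trace record becomes one polyline across per-dimension axes; the trace values are invented.

```python
# Sketch: render a toy I/O trace as parallel coordinates, colored by op type.
import pandas as pd
import matplotlib.pyplot as plt
from pandas.plotting import parallel_coordinates

trace = pd.DataFrame({
    "lba":         [1024, 2048, 1030, 900000, 901000],   # logical block address
    "size_kb":     [4, 8, 4, 128, 256],
    "queue_depth": [1, 2, 1, 16, 32],
    "latency_ms":  [0.2, 0.3, 0.2, 4.1, 5.0],
    "op":          ["read", "read", "read", "write", "write"],
})
parallel_coordinates(trace, class_column="op", colormap="coolwarm")
plt.title("I/O trace as parallel coordinates")
plt.show()
```
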
48

Design of an Open-Source SATA Core for Virtex-4 FPGAs

Gorman, Cory, 01 January 2013
Many hard drives manufactured today use the Serial ATA (SATA) protocol to communicate with the host machine, typically a PC. SATA is a much faster and much more robust protocol than its predecessor, ATA (also referred to as Parallel ATA or IDE). Many hardware designs, including those using Field-Programmable Gate Arrays (FPGAs), need a long-term storage solution, and a hard drive would be ideal. One such design is the high-speed Data Acquisition System (DAS) created for the NASA Surface Water and Ocean Topography mission. This system utilizes a Xilinx Virtex-4 FPGA. Although the DAS includes a SATA connector for interfacing with a disk, a SATA core is needed to implement the protocol for disk operations. In this work, an open-source SATA core for Virtex-4 FPGAs has been created. SATA cores for Virtex-5 and Virtex-6 devices were already available, but they are not compatible with the different serial transceivers in the Virtex-4. The core can interface with disks at SATA I or SATA II speeds, and has been shown working at rates up to 180 MB/s. It has been successfully integrated into the hardware design of the DAS board so that radar samples can be stored on the disk.
49

Relevance Analysis for Document Retrieval

Labouve, Eric, 01 March 2019
Document retrieval systems recover documents from a dataset and order them according to their perceived relevance to a user's search query. This is a difficult task for machines to accomplish because there exists a semantic gap between the meaning of the terms in a user's literal query and the user's true intentions. Even with the ambiguity that arises from this lack of context, users still expect the set of documents returned by a search engine to be both highly relevant to their query and properly ordered. The focus of this thesis is on document retrieval systems that explore methods of ordering documents from unstructured, textual corpora using text queries. The main goal of this study is to enhance the Okapi BM25 document retrieval model. In doing so, this research hypothesizes that the structure of text inside documents and queries holds valuable semantic information that can be incorporated into the Okapi BM25 model to increase its performance. Modifications that account for a term's part of speech, the proximity between a pair of related terms, the proximity of a term with respect to its location in a document, and query expansion are used to augment Okapi BM25. The study resulted in 87 modifications, all of which were validated using open-source corpora. The top-scoring modification from the validation phase was then tested on the LISA corpus, where the model performed 10.25% better than Okapi BM25 when evaluated under mean average precision. When compared against two industry-standard search engines, Lucene and Solr, the top-scoring modification outperforms these systems by up to 21.78% and 23.01%, respectively.
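
For reference, here is a compact sketch of the unmodified Okapi BM25 baseline that the thesis augments (the part-of-speech, proximity, and query-expansion modifications are not shown); the standard k1 = 1.2 and b = 0.75 parameters are assumed.

```python
# Sketch: score documents against a bag-of-words query with Okapi BM25.
import math

def bm25_score(query_terms, doc, corpus, k1=1.2, b=0.75):
    """Score one tokenized document against a list of query terms."""
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)          # document frequency
        if df == 0:
            continue
        idf = math.log((N - df + 0.5) / (df + 0.5) + 1)   # smoothed IDF
        tf = doc.count(term)
        denom = tf + k1 * (1 - b + b * len(doc) / avgdl)  # length-normalized TF
        score += idf * tf * (k1 + 1) / denom
    return score

corpus = [["okapi", "bm25", "ranking"], ["vector", "space", "model"],
          ["bm25", "relevance", "feedback"]]
for d in corpus:
    print(d, bm25_score(["bm25", "ranking"], d, corpus))
```
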
50

Optimizing Virtual Machine I/O Performance in Cloud Environments

Lu, Tao, 01 January 2016
Maintaining closeness between data sources and data consumers is crucial for workload I/O performance. In cloud environments, this closeness can be violated by system administrative events and storage architecture barriers. VM migration events are frequent in cloud environments, and migration changes a VM's runtime interconnections and cache contexts, significantly degrading VM I/O performance. Virtualization is the backbone of cloud platforms, but I/O virtualization adds additional hops to the workload data access path, prolonging I/O latencies. I/O virtualization overheads cap the throughput of high-speed storage devices and impose high CPU utilization and energy consumption on cloud infrastructures. To maintain the closeness between data sources and workloads during VM migration, we propose Clique, an affinity-aware migration scheduling policy, to minimize the aggregate wide-area communication traffic during storage migration in virtual cluster contexts. In host-side caching contexts, we propose Successor, which recognizes warm pages and prefetches them into the caches of destination hosts before migration completes. To bypass the I/O virtualization barriers, we propose VIP, an adaptive I/O prefetching framework that utilizes a virtual I/O front-end buffer for prefetching, avoiding the on-demand involvement of I/O virtualization stacks and accelerating I/O response. Analysis of the traffic trace of a virtual cluster containing 68 VMs demonstrates that Clique can reduce inter-cloud traffic by up to 40%. Tests of the MPI Reduce_scatter benchmark show that Clique can keep VM performance during migration at up to 75% of the non-migration scenario, more than three times that of a random VM-selection policy. In host-side caching environments, Successor performs better than existing cache warm-up solutions and achieves zero VM-perceived cache warm-up time with low resource costs. At the system level, we conducted a comprehensive quantitative analysis of I/O virtualization overheads. Our trace-replay-based simulation demonstrates the effectiveness of VIP for data prefetching with negligible additional cache resource costs.
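
As an illustration of the front-end prefetching idea behind VIP (a hedged sketch, not its actual implementation), the following buffer detects sequential block reads and reads ahead, so subsequent hits are served locally and skip the slow virtualized path; all names are invented.

```python
# Sketch: a guest-side read-ahead buffer in front of a virtualized block path.
class PrefetchBuffer:
    def __init__(self, backend_read, window=8):
        self.backend_read = backend_read   # slow path through virtualization
        self.window = window               # read-ahead depth in blocks
        self.cache = {}
        self.last_block = None

    def read(self, block):
        if block in self.cache:            # fast path: no virtualization hop
            return self.cache.pop(block)
        data = self.backend_read(block)
        if self.last_block is not None and block == self.last_block + 1:
            # Sequential pattern detected: prefetch the next window of blocks.
            for b in range(block + 1, block + 1 + self.window):
                self.cache[b] = self.backend_read(b)
        self.last_block = block
        return data

# Toy backend: block i holds the bytes b"blk<i>".
buf = PrefetchBuffer(lambda i: f"blk{i}".encode())
for i in range(4):
    print(buf.read(i))
```
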
