About
The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations. Our metadata is collected from universities around the world. If you manage a university, consortium, or country archive and want to be added, details can be found on the NDLTD website.
351

Energy-Efficient Databases Using Sweet Spot Frequencies

Lehner, Wolfgang, Götz, Sebastian, Ilsche, Thomas, Cardoso, Jorge, Spillner, Josef, Kissinger, Thomas, Aßmann, Uwe, Nagel, Wolfgang E., Schill, Alexander 12 January 2023
Database management systems (DBMS) are typically tuned for high performance and scalability. Nevertheless, carbon footprint and energy efficiency are becoming increasingly important concerns. Unfortunately, existing studies mainly present theoretical contributions and fall short of proposing practical techniques that administrators or query optimizers could use to increase the energy efficiency of a DBMS. This paper therefore explores the effect of so-called sweet spots, which are energy-efficient CPU frequencies, on the energy required to execute queries. From our findings, we derive the Sweet Spot Technique, which relies on identifying sweet-spot frequencies and the optimal number of threads that minimizes energy consumption for a query or an entire database workload. The technique is simple and has a practical implementation, leading to energy savings of up to 50% compared to using the nominal frequency and the maximum number of threads.
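The paper's own implementation is not reproduced here; as a rough, hypothetical illustration of the idea, the sketch below sweeps frequency/thread combinations and measures energy per query. It assumes a Linux host exposing Intel RAPL energy counters, the userspace cpufreq governor, and a placeholder ./run_query driver standing in for the DBMS workload — all assumptions, not artifacts of the paper.

```python
import itertools
import subprocess
from pathlib import Path

RAPL = Path("/sys/class/powercap/intel-rapl:0/energy_uj")  # package energy, microjoules

def read_energy_uj() -> int:
    return int(RAPL.read_text())

def set_frequency_khz(khz: int) -> None:
    # Pin every core to one frequency via the userspace cpufreq governor
    # (requires root; sysfs paths differ across kernels and vendors).
    for f in Path("/sys/devices/system/cpu").glob("cpu[0-9]*/cpufreq/scaling_setspeed"):
        f.write_text(str(khz))

def run_query(threads: int) -> None:
    # Placeholder benchmark driver; substitute the real DBMS query runner.
    subprocess.run(["./run_query", "--threads", str(threads)], check=True)

def find_sweet_spot(freqs_khz, thread_counts):
    best = None
    for khz, threads in itertools.product(freqs_khz, thread_counts):
        set_frequency_khz(khz)
        start = read_energy_uj()
        run_query(threads)
        joules = (read_energy_uj() - start) / 1e6  # ignores counter wraparound
        if best is None or joules < best[0]:
            best = (joules, khz, threads)
    return best  # (energy in J, sweet-spot frequency, best thread count)

if __name__ == "__main__":
    print(find_sweet_spot(range(1_200_000, 3_600_001, 400_000), (1, 2, 4, 8, 16)))
```

Because the objective is energy rather than latency, such a search can legitimately settle on a frequency below nominal that runs longer but draws disproportionately less power — exactly the sweet-spot effect the paper exploits.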
352

Anti-Tamper Databases: Querying Encrypted Databases

Chung, Sun S. 09 January 2006
No description available.
353

Computing Label-Constraint Reachability in Graph Databases

Hong, Hui 16 April 2012
No description available.
354

Function Computing in Vertically Partitioned Distributed Databases

Shinde, Kaustubh Arun January 2006
No description available.
355

A Novel Index Method for Write Optimization on Out-of-Core Column-Store Databases

Matacic, Tyler Joseph January 2016
No description available.
356

Query Processing in Distributed Database Systems

Unnava, Vasundhara January 1992
No description available.
357

Design and Performance Analysis of a Relational Replicated Database System

Hanson, Jon Gregory 01 January 1987
The hardware organization and software structure of a new database system are presented. This system, the relational replicated database system (RRDS), is based on a set of replicated processors operating on a partitioned database. Performance improvements and capacity growth can be obtained by adding more processors to the configuration. Based on the design goals, a set of hardware and software design questions was developed. The system then evolved through a five-phase process, based on simulation and analysis, which addressed and resolved the design questions. Strategies and algorithms were developed for data access, data placement, and directory management for the hardware organization. A predictive performance analysis was conducted to determine the extent to which the original design goals were satisfied. The predictive performance results, along with an analytical comparison with three other relational multi-backend systems, provided information about the strengths and weaknesses of our design as well as a basis for future research.
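The abstract does not spell out the placement algorithm itself; purely as a generic sketch of how a partitioned database can be spread across a set of identical backend processors (the hash scheme and names are assumptions, not the thesis's design):

```python
import hashlib

def backend_for(key: bytes, n_backends: int) -> int:
    """Hash-partition tuples across a set of identical backend processors.

    Capacity grows by increasing n_backends (and re-placing the affected
    partitions); a directory maps partitions to backends so the controller
    routes each query only to the backends that hold relevant data.
    """
    digest = hashlib.sha1(key).digest()
    return int.from_bytes(digest[:8], "big") % n_backends

# Example: place a tuple keyed by employee id on an 8-backend configuration.
print(backend_for(b"emp:1042", 8))  # deterministic backend in [0, 8)
```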
358

Multicore Scalability Through Asynchronous Work

Mathew, Ajit 13 January 2020
With the end of Moore's Law, computer architects have turned to multicore architectures to provide high performance. Unfortunately, to achieve higher performance, multicores require programs to be parallelized, which remains a largely untamed problem. Amdahl's law states that the maximum theoretical speedup of a program is dictated by the size of its non-parallelizable section. Hence, to achieve higher performance, programmers need to reduce the amount of sequential code in the program. This thesis explores asynchronous work as a means to reduce the sequential portions of a program. Using asynchronous work, a programmer can remove tasks that do not affect data consistency from the critical path and perform them using a background thread. Using this idea, the thesis introduces two systems. First, a synchronization mechanism, Multi-Version Read-Log-Update (MV-RLU), which extends Read-Log-Update (RLU) through multi-versioning. At the core of MV-RLU's design is a concurrent garbage collection algorithm which reclaims obsolete versions asynchronously, reducing blocking of threads. Second, a concurrent and highly scalable index structure for multicores called Hydralist. The key idea behind the design of Hydralist is that an index structure can be divided into two components (a search layer and a data layer); updates to the data layer are done synchronously, while updates to the search layer are propagated asynchronously using background threads. / Master of Science / Up until the mid-2000s, Moore's law predicted that CPU performance would double every two years, because improvements in transistor technology allowed smaller transistors that switch at higher frequencies, leading to faster CPU clocks. But faster clocks lead to higher heat dissipation, and as chips reached their thermal limits, computer architects could no longer increase clock speeds. Hence they moved to multicore architectures, wherein a single die contains multiple CPUs, to allow higher performance. Programmers are now required to parallelize their code to take advantage of all the CPUs in a chip, which is a non-trivial problem. The theoretical speedup achieved by a program on a multicore architecture is dictated by Amdahl's law, which identifies the non-parallelizable code in a program as the limiting factor for speedup. For example, a program with 99% parallelizable code can achieve a speedup of at most 100, whereas a program with 50% parallelizable code can only achieve a speedup of 2. Therefore, to achieve high speedup, programmers need to reduce the size of the serial section of their program. One way to do so is to remove non-critical tasks from the sequential section and perform them asynchronously using a background thread. This thesis explores this technique in two systems. First, Multi-Version Read-Log-Update (MV-RLU), a synchronization mechanism used to coordinate access to shared resources: MV-RLU achieves high performance by removing garbage collection from the critical path and performing it asynchronously using a background thread. Second, an index structure, Hydralist, based on the insight that an index structure can be decomposed into two components, a search layer and a data layer, and that decoupling updates to the two layers allows higher performance. Updates to the data layer are done synchronously, while updates to the search layer are done asynchronously using background threads. Evaluation shows that both systems perform better than state-of-the-art competitors across a variety of workloads.
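Neither system's actual code appears in the abstract; the toy sketch below (all names hypothetical) only illustrates the general pattern described: writers update a data layer synchronously and return, while a single background thread asynchronously folds the deferred work into a search layer, keeping it off the critical path.

```python
import bisect
import queue
import threading

updates = queue.Queue()   # writers hand completed keys to the background thread
data_layer = {}           # authoritative key/value store, updated synchronously
search_layer = []         # sorted-key index, maintained asynchronously
data_lock = threading.Lock()

def insert(key, value):
    """Critical path: update the data layer and return immediately;
    index maintenance is deferred to the background thread."""
    with data_lock:
        data_layer[key] = value
    updates.put(key)      # off-critical-path work, performed later

def index_maintainer():
    """Background thread: fold queued keys into the search layer."""
    while True:
        key = updates.get()
        if key is None:   # shutdown sentinel
            return
        pos = bisect.bisect_left(search_layer, key)
        if pos == len(search_layer) or search_layer[pos] != key:
            search_layer.insert(pos, key)

worker = threading.Thread(target=index_maintainer)
worker.start()
for k in (5, 3, 9):
    insert(k, str(k))
updates.put(None)
worker.join()
print(search_layer)       # [3, 5, 9]
```

A reader searching through a slightly stale search layer can still fall back to the authoritative data layer, which is why deferring index maintenance does not affect data consistency.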
359

Database alignment: fundamental limits and multiple databases setting

K, Zeynep 13 September 2024
In modern data analysis, privacy is a critical concern when dealing with user-related databases. Ensuring user anonymity while extracting meaningful correlations from the data poses a significant challenge, especially when side information can potentially enable de-anonymization. This dissertation explores standard information-theoretic problems in the correlated databases model. We define a "database" as a simple probabilistic model that contains a random feature vector for each user, with user labels shuffled to ensure anonymity. We first investigate correlation detection between two databases, formulating it as a composite binary hypothesis testing problem. Under the alternative hypothesis, there exists an unknown permutation that aligns users in the first database with those in the second, thereby matching correlated entries; the null hypothesis assumes that the databases are independent, with no such alignment. For the special case of Gaussian feature vectors, we derive both upper and lower bounds on the correlation required for detection to succeed or fail. Our results are tight up to a constant factor when the feature length exceeds the number of users. Regarding our achievability bound, we draw connections to the user label recovery problem, highlighting significant parallels and insights. Additionally, for the two-databases model, we examine potential gaps in the statistical analysis conducted thus far in the large-number-of-users regime by drawing parallels with similar problems in the literature. Motivated by these comparisons, we propose a novel approach to the detection problem, focusing on the hidden permutation structure and the intricate dependencies characterizing these relationships. Building on this, we present a comprehensive model for handling multiple correlated databases. In this multiple-databases setting, we address another fundamental information-theoretic problem: user label recovery. We evaluate the performance of the typicality matching estimator in relation to the asymptotic behavior of the feature length, demonstrating an impossibility result that holds up to a multiplicative constant factor. This exploration into multiple databases not only broadens the scope of our study but also underscores the complexity and richness of correlation detection in a more generalized framework. In conclusion, we summarize the statistical gaps identified in our findings and explore their possible origins. We also discuss the limitations of our simple probabilistic model and propose strategies to address them. Finally, we outline potential future research directions, including the information-theoretic problem of change detection, which remains an open area of significant interest.
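The dissertation's results are analytic bounds, not code; the following toy sketch (parameters and names are illustrative, not the thesis's estimator) only makes the alignment task concrete: two Gaussian databases whose rows are correlated under a hidden permutation, recovered here by solving an assignment problem on pairwise inner products.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)
n_users, feat_len, rho = 50, 200, 0.6   # illustrative parameters

# Database X, and database Y whose rows are rho-correlated with X's
# but shuffled by a hidden permutation pi (the anonymization step).
X = rng.standard_normal((n_users, feat_len))
noise = rng.standard_normal((n_users, feat_len))
pi = rng.permutation(n_users)
Y = (rho * X + np.sqrt(1.0 - rho**2) * noise)[pi]

# For jointly Gaussian features, the likelihood of matching user i to
# row j grows with the inner product <x_i, y_j>, so the best global
# alignment is an assignment problem over pairwise scores.
scores = X @ Y.T
_, match = linear_sum_assignment(-scores)   # maximize total score

correct = np.argsort(pi)                    # row of Y holding each X user
print("fraction of users re-identified:", np.mean(match == correct))
```

In this toy regime, where the feature length comfortably exceeds the number of users, the hidden permutation is typically recovered almost exactly — mirroring the parameter regime in which the dissertation's bounds are tight.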
360

Relational Computing Using HPC Resources: Services and Optimizations

Soundarapandian, Manikandan 15 September 2015
Computational epidemiology involves processing, analysing and managing large volumes of data. Such massive datasets cannot be handled efficiently by traditional standalone database management systems, owing to their limited computational efficiency and bandwidth for scaling to large volumes of data. In this thesis, we address the management and processing of large volumes of data for modeling, simulation and analysis in epidemiological studies. Traditionally, compute-intensive tasks are processed using high performance computing resources and supercomputers, whereas data-intensive tasks are delegated to standalone databases and custom programs. The DiceX framework is a one-stop solution for distributed database management and processing; its main mission is to leverage supercomputing resources for data-intensive computing, in particular relational data processing. While standalone databases are always on, so a user can submit queries at any time, supercomputing resources must be acquired and are available only for a limited time period. These resources are relinquished either upon completion of execution or at the expiration of the allocated time period. This reservation-based usage style poses critical challenges, including building and launching a distributed data engine onto the supercomputer, saving the engine and resuming from the saved image, devising efficient optimization upgrades to the data engine, and enabling other applications to seamlessly access the engine. These challenges and requirements align our approach closely with the cloud computing paradigms of Infrastructure as a Service (IaaS) and Platform as a Service (PaaS). In this thesis, we propose cloud-computing-like workflows that use supercomputing resources to manage and process relational data-intensive tasks. We propose and implement several services, including database freeze, migrate, and resume; ad hoc resource addition; and table redistribution. These services assist in carrying out the workflows defined. We also propose an optimization upgrade to the query planning module of postgres-XC, the core relational data processing engine of the DiceX framework. Using knowledge of domain semantics, we have devised a more robust data distribution strategy that forcefully pushes the most time-consuming SQL operations down to the postgres-XC data nodes, bypassing the query planner's default shippability criteria without compromising correctness. Forcing query push-down reduces query processing time by roughly 40%-60% for certain complex spatio-temporal queries on our epidemiology datasets. As part of this work, a generic broker service has also been implemented, which acts as an interface to the DiceX framework by exposing RESTful APIs that applications can use to query and retrieve results irrespective of programming language or environment. / Master of Science
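The thesis's planner modification itself is not shown in the abstract; as a hedged sketch of the underlying co-location idea, the snippet below distributes two hypothetical epidemiology tables on the same key so their join can execute on the data nodes. DISTRIBUTE BY HASH is standard postgres-XC DDL, but the host, database, table, and column names are illustrative assumptions.

```python
import psycopg2  # standard PostgreSQL driver; also works against a postgres-XC coordinator

# Host and database names are placeholders for a DiceX-style deployment.
conn = psycopg2.connect(host="coordinator", dbname="epidemiology")
cur = conn.cursor()

# Distributing both tables by the same domain key (here a county id)
# co-locates matching rows on the same data node, so the join below can
# run locally on each node instead of being pulled to the coordinator.
cur.execute("""
    CREATE TABLE visits (
        person_id bigint,
        county_id int,
        visit_day date
    ) DISTRIBUTE BY HASH (county_id)
""")
cur.execute("""
    CREATE TABLE counties (
        county_id  int,
        population int
    ) DISTRIBUTE BY HASH (county_id)
""")
conn.commit()

# A spatio-temporal aggregate whose join and grouping keys match the
# distribution key, making it a candidate for push-down to the data nodes.
cur.execute("""
    SELECT c.county_id, count(*) AS visit_count
    FROM visits v JOIN counties c USING (county_id)
    WHERE v.visit_day BETWEEN %s AND %s
    GROUP BY c.county_id
""", ("2014-01-01", "2014-01-31"))
print(cur.fetchall())
```

Choosing the distribution key from domain semantics (here, geography) is what lets expensive operations ship to the data nodes even when a planner's generic shippability test would refuse.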
