121. Implementation and applications of recursively defined relations. Clouâtre, André (January 1987)
No description available.
122. On automated query modification techniques for databases. Du, Kaizheng (January 1993)
No description available.
123. Design, implementation, testing, and documentation of employee processing/tracking system. Cardell, Justin Edward (01 January 1993)
The purpose of the design, implementation, testing, and documentation of the Employee Processing/Tracking Database System is to provide the user with the capability of processing and tracking all employee information using database files. A user manual and a technical manual are provided here for reference to the system. The system was developed in Clipper 5.1.
124. Dynamically tuning LSM tree based databases. Sharma, Sakshi (02 July 2024)
Log-Structured Merge (LSM) trees are a popular choice of data structure for key-value database systems due to their high ingestion rate and fast reads. They achieve this by appending new writes and updates sequentially and buffering changes in memory before flushing them to disk in sorted order. LSM tree behavior can be dynamically altered by a large set of tunable parameters to accommodate a wide range of workloads. While these parameters provide flexibility, identifying their optimal values to maximize system performance is a known hard problem. Offline tuning approaches can provide optimal configurations; however, they require a priori knowledge of the workload and/or the evaluation of hundreds of configurations, meaning that they lack the flexibility to adapt to evolving conditions. In the online setting, evaluating the performance impact of a particular tuning knob can be an expensive endeavor.
To this end, we propose Onix, a tuning framework that tackles the online setting of the tuning problem, specifically dynamically tuning LSM trees using Bayesian Optimization (BO). BO constructs a probabilistic model to navigate the space of tuning knobs, striking a careful balance between exploring uncharted parameter configurations and exploiting areas already identified as promising. We leverage BO's efficient convergence to minimize the number of configurations deployed for exploration. Onix integrates Microsoft's BO-based system tuning framework, MLOS, with Meta's state-of-the-art LSM tree implementation, RocksDB. As workloads are executed on RocksDB, Onix propagates appropriate information to MLOS, which in turn recommends the correct configuration for the current workload. This process is repeated periodically, re-evaluating whether a new tuning suggestion (i.e., configuration) can provide better performance. In the best case, Onix achieves up to 2× better performance (in terms of average read latency) than the default configuration, while in the worst case it performs at least as well as the default, guaranteeing no performance regression.
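The online tuning loop described in this abstract can be sketched as follows. This is a hypothetical stand-in, not Onix's code: the RocksDB knob (memtable size), the latency model, and the configuration candidates are invented, and a simple explore/exploit heuristic takes the place of MLOS's full Bayesian optimizer.

```python
import random

def measure_latency(memtable_mb):
    """Synthetic stand-in for running a workload on RocksDB and timing reads."""
    # Pretend 256 MB is optimal; latency grows as the knob moves away from it.
    return 1.0 + abs(memtable_mb - 256) / 256

def tune(candidates, rounds=20, explore_prob=0.3, seed=0):
    """Periodically re-evaluate configurations, keeping the best seen so far."""
    rng = random.Random(seed)
    best_cfg, best_lat = None, float("inf")
    history = {}  # measurements a real BO model would fit its surrogate to
    for _ in range(rounds):
        if best_cfg is None or rng.random() < explore_prob:
            cfg = rng.choice(candidates)   # explore an uncharted configuration
        else:
            cfg = best_cfg                 # exploit the current best
        lat = measure_latency(cfg)
        history[cfg] = lat
        if lat < best_lat:
            best_cfg, best_lat = cfg, lat
    return best_cfg, best_lat

best, lat = tune([64, 128, 256, 512, 1024])
```

A real deployment would replace `measure_latency` with an actual workload run and the heuristic with a BO acquisition function, but the periodic measure-then-recommend structure is the same.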
125. Distributed Graph Storage and Querying System. Balaji, Janani (12 August 2016)
Graph databases offer an efficient way to store and access interconnected data. However, to query large graphs that no longer fit in memory, it becomes necessary to make multiple trips to the storage device to filter and gather data based on the query. I/O accesses are expensive operations that immensely slow down query response time and prevent us from fully exploiting the graph-specific benefits that graph databases offer.
The storage models of most existing graph database systems view graphs as indivisible structures and hence do not allow a hierarchical layering of the graph. This adversely affects query performance for large graphs as there is no way to filter the graph on a higher level without actually accessing the entire information from the disk. Distributing the storage and processing is one way to extract better performance. But current distributed solutions to this problem are not entirely effective, again due to the indivisible representation of graphs adopted in the storage format. This causes unnecessary latency due to increased inter-processor communication.
In this dissertation, we propose an optimized distributed graph storage system for scalable and faster querying of big graph data. We start with our unique physical storage model, in which the graph is decomposed into three different levels of abstraction, each with a different storage hierarchy. We use a hybrid storage model to store the most critical component and restrict I/O trips to only when absolutely necessary. This lets us actively make use of multi-level filters while querying, without the need for comprehensive indexes. Our results show that our system outperforms established graph databases for several classes of queries. We show that this separation also eases the difficulties in distributing graph data, and go on to propose a more efficient distributed model for querying general-purpose graph data using the Spark framework.
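The multi-level filtering idea described above can be illustrated with a toy two-level store (the dissertation uses three levels and a real storage hierarchy; the partitioning scheme and classes here are invented). A small in-memory summary graph records which partitions are connected, so a reachability query only touches full adjacency lists when the summary says it must.

```python
from collections import defaultdict

class LayeredGraphStore:
    """Toy two-level store: a coarse partition summary filters I/O to the full graph."""
    def __init__(self, edges, partition_of):
        self.partition_of = partition_of
        # Full adjacency, grouped by partition (stands in for on-disk storage).
        self.partitions = defaultdict(lambda: defaultdict(list))
        # Coarse summary: which partitions share at least one edge.
        self.summary = defaultdict(set)
        for u, v in edges:
            pu, pv = partition_of[u], partition_of[v]
            self.partitions[pu][u].append(v)
            self.summary[pu].add(pv)
        self.loads = 0  # counts simulated I/O trips

    def neighbors(self, u):
        self.loads += 1  # simulate fetching a partition from disk
        return self.partitions[self.partition_of[u]][u]

    def reachable(self, src, dst):
        # Level 1: if dst's partition is unreachable in the summary,
        # answer without touching the full adjacency at all.
        seen, stack = set(), [self.partition_of[src]]
        while stack:
            p = stack.pop()
            if p in seen:
                continue
            seen.add(p)
            stack.extend(self.summary[p])
        if self.partition_of[dst] not in seen:
            return False
        # Level 2: ordinary DFS over the full graph.
        seen, stack = set(), [src]
        while stack:
            n = stack.pop()
            if n == dst:
                return True
            if n in seen:
                continue
            seen.add(n)
            stack.extend(self.neighbors(n))
        return False

store = LayeredGraphStore([(1, 2), (2, 3), (4, 5)],
                          {1: "A", 2: "A", 3: "A", 4: "B", 5: "B"})
```

Queries whose answer is decided at the summary level never increment `loads`, which is the point of filtering on a higher level before going to disk.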
126. Semantic Assistance for Data Utilization and Curation. Becker, Brian J (06 August 2013)
We propose that most data stores for large organizations are ill-designed for the future, owing to the limited searchability of their databases. The Semantic Web has been an emerging technology since it was first proposed by Berners-Lee. New vocabularies have emerged, such as the FOAF, Dublin Core, and PROV-O ontologies. Combined, these vocabularies can relate people, places, things, and events. Technologies developed for the Semantic Web, namely the standardized vocabularies for expressing metadata, make data easier to utilize. We gathered use cases for various data sources, from human resources to big enterprise; most reflect real-world data. We developed a software package for transforming data into these semantic vocabularies, and a method of querying via graphical constructs. Development and testing showed the approach to be useful. We conclude that data can be preserved or revived through the use of these Semantic Web metadata techniques.
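The kind of transformation the abstract describes, mapping a plain record into FOAF and Dublin Core terms, can be sketched as follows. The record fields and the subject URI are invented for illustration; a real pipeline would use an RDF library and the actual source schema.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
DC = "http://purl.org/dc/elements/1.1/"

def to_ntriples(subject_uri, record):
    """Render a {predicate_uri: literal} mapping as N-Triples lines."""
    lines = []
    for pred, value in sorted(record.items()):
        # Escape backslashes and quotes per the N-Triples literal grammar.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{subject_uri}> <{pred}> "{escaped}" .')
    return "\n".join(lines)

doc = to_ntriples(
    "http://example.org/employee/42",   # hypothetical subject URI
    {
        FOAF + "name": "Ada Lovelace",
        DC + "creator": "HR import job",
        DC + "date": "2013-08-06",
    },
)
```

Once records are expressed as triples like these, standard SPARQL tooling can relate people, events, and provenance across sources, which is what makes the data searchable beyond its original schema.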
127. Biometrics Technology: Understanding Dynamics Influencing Adoption for Control of Identification Deception Within Nigeria. Nwatu, Gideon U. (01 January 2011)
One of the objectives of any government is the establishment of an effective solution to significantly control crime. Identity fraud in Nigeria has generated global attention and negative publicity toward its citizens. The research problem addressed in this study was the lack of understanding of the dynamics that influenced the adoption and usability of biometrics technology for reliable identification and authentication to control identity deception. The support for this study was found in the theoretical framework of the technology acceptance model (TAM). The purpose of the study was to provide scholarly research about the factors that influenced the adoption of biometrics technology to reliably identify and verify individuals in Nigeria to control identity fraud. The mixed-method descriptive and inferential study used interview and survey questionnaires for data collection. Binary logistic regression, point-biserial correlation, independent-samples t tests, and content analyses were performed using SPSS version 18, Microsoft Excel 2007, and NVivo 7.0. The findings indicated statistical correlation between adoption of biometrics technology and three other variables: ease of use (r = .38, n = 120, p < .01), perceived usefulness (r = .41, n = 120, p < .01), and awareness (r = .33, n = 120, p < .01). The implications for social change include leveraging biometrics technology for recognition, confirmation, and accountability of individuals to prevent identity scheming, ensure security, and control the propagation of personal information. Beyond these immediate benefits, this research presents an example that other developing countries may use to facilitate the adoption of biometrics technology.
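The point-biserial correlations reported above are just Pearson's r computed between a dichotomous variable (adopt: yes/no) and a continuous score. A minimal sketch of that statistic, using the population standard deviation and invented data rather than the study's:

```python
import math

def point_biserial(binary, scores):
    """r_pb = (M1 - M0) / s * sqrt(p * q), with s the population SD of scores."""
    n = len(scores)
    g1 = [s for b, s in zip(binary, scores) if b == 1]
    g0 = [s for b, s in zip(binary, scores) if b == 0]
    m1, m0 = sum(g1) / len(g1), sum(g0) / len(g0)   # group means
    mean = sum(scores) / n
    sd = math.sqrt(sum((s - mean) ** 2 for s in scores) / n)
    p, q = len(g1) / n, len(g0) / n                 # group proportions
    return (m1 - m0) / sd * math.sqrt(p * q)

r = point_biserial([0, 0, 1, 1], [1.0, 2.0, 3.0, 4.0])
```

Because it is algebraically identical to Pearson's r, the same significance machinery (here, SPSS) applies directly.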
128. Security of genetic databases. Giggins, Helen (January 2009)
Research Doctorate - Doctor of Philosophy (PhD)
The rapid pace of growth in the field of human genetics has left researchers with many new challenges in the area of security and privacy. To encourage participation and foster trust towards research, it is important to ensure that genetic databases are adequately protected. This task is a particularly challenging one for statistical agencies due to the high prevalence of categorical data contained within statistical genetic databases. The absence of natural ordering makes the application of traditional Statistical Disclosure Control (SDC) methods less straightforward, which is why we have proposed a new noise addition technique for categorical values. The main contributions of the thesis are as follows. We provide a comprehensive analysis of the trust relationships that occur between the different stakeholders in a genetic data warehouse system. We also provide a quantifiable model of trust that allows the database manager to granulate the level of protection based on the amount of trust that exists between the stakeholders. To the best of our knowledge, this is the first time that trust has been applied in the SDC context. We propose a privacy protection framework for genetic databases which is designed to deal with the fact that genetic data warehouses typically contain a high proportion of categorical data. The framework includes the use of a clustering technique which allows for the easier application of traditional noise addition techniques for categorical values. Another important contribution of this thesis is a new similarity measure for categorical values, which aims to capture not only the direct similarity between values, but also some sense of transitive similarity. This novel measure also has possible applications in providing a way of ordering categorical values, so that more traditional SDC methods can be more easily applied to them.
Our analysis of experimental results also points to a numerical attribute phenomenon, whereby we typically have high similarity between numerical values that are close together, and the similarity decreases as the absolute difference between values increases. However, some numerical attributes do not appear to behave in a strictly 'numerical' way: values which are close together numerically do not always appear very similar. We also provide a novel noise addition technique for categorical values, which employs our similarity measure to partition the values in the data set. Our method, VICUS, then perturbs the original microdata file so that each value is more likely to be changed to another value in the same partition than to one from a different partition. The technique helps to ensure that the perturbed microdata file retains data quality while also preserving the privacy of individual records.
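The partition-based perturbation step can be sketched in the spirit of VICUS as described above. This is not the thesis's algorithm: the clusters are given directly rather than derived from the similarity measure, and the probabilities are invented.

```python
import random

def perturb(values, clusters, p_same=0.9, seed=0):
    """Replace each value, preferring a value from its own similarity cluster."""
    rng = random.Random(seed)
    cluster_of = {v: c for c, vs in enumerate(clusters) for v in vs}
    out = []
    for v in values:
        c = cluster_of[v]
        if rng.random() < p_same:
            out.append(rng.choice(clusters[c]))        # stay within the cluster
        else:
            other = rng.choice([i for i in range(len(clusters)) if i != c])
            out.append(rng.choice(clusters[other]))    # rare cross-cluster swap
    return out

clusters = [["A", "B"], ["C", "D"]]       # hypothetical similarity partitions
noisy = perturb(["A"] * 100, clusters, seed=1)
```

Because most replacements land in the same partition, aggregate statistics over similar categories are roughly preserved while any individual record's original value is masked.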
129. Optimization of XQuery queries in XML databases distributed over peer-to-peer networks. Butnaru, Bogdan (12 April 2012)
This thesis studies XQuery query optimization in distributed XML databases based on peer-to-peer networks. Our approach is unique in that it targets the XQuery language as a whole rather than a reduced language specific to the indexes used. The XQ2P system presented in this thesis embodies this architecture; it takes the form of a complete collection of fundamental software building blocks for developing similar applications. The peer-to-peer aspect is provided by P2PTester, a framework supplying modules for basic P2P functionality and a distributed system for tests and simulations. A version of the TwigStack algorithm adapted to P2P, using a structural index based on node numbering, is integrated into it. Together with a query pre-processing system, it allows XQ2P to evaluate structural queries efficiently over the distributed database. An alternative version of the same algorithm is also used for the efficient evaluation of most XQuery queries. One of the major novelties of XQuery 3.0 is support for time series. We defined a model for handling this type of data, using the XML model to represent values and XQuery 3.0 queries to manipulate them. We add to XQ2P an index adapted to this model; horizontal partitioning of long time series, optimized operators, and a technique for parallel evaluation of sub-expressions enable the efficient execution of operations on large data volumes.
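The node-numbering idea behind structural indexes like the one used with TwigStack can be illustrated with interval encoding: each element receives a (start, end) interval from a depth-first walk, and ancestor/descendant tests become interval containment, with no tree traversal at query time. The tiny document and helpers below are illustrative, not XQ2P's actual code.

```python
import xml.etree.ElementTree as ET
from itertools import count

def number_nodes(root):
    """Assign each element a (start, end) interval via a depth-first walk."""
    counter, intervals = count(1), {}
    def visit(node):
        start = next(counter)
        for child in node:
            visit(child)
        intervals[node] = (start, next(counter))
    visit(root)
    return intervals

def is_ancestor(intervals, a, b):
    """a is an ancestor of b iff a's interval strictly contains b's."""
    (s1, e1), (s2, e2) = intervals[a], intervals[b]
    return s1 < s2 and e2 < e1

root = ET.fromstring("<lib><book><title/></book><book/></lib>")
iv = number_nodes(root)
```

With this encoding, a distributed engine can ship compact intervals between peers and decide structural relationships locally, which is what makes holistic twig joins like TwigStack practical over a P2P index.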
130. Design Guidelines for Reducing Redundancy in Relational and XML Data. Kolahi, Solmaz (31 July 2008)
In this dissertation, we propose new design guidelines to reduce the amount of redundancy that databases carry. We use techniques from information theory to define a measure that evaluates a database design based on the worst possible redundancy carried in the instances. We then continue by revisiting the design problem of relational data with functional dependencies, and measure the lowest price, in terms of redundancy, that has to be paid to guarantee a dependency-preserving normalization for all schemas. We provide a formal justification for the Third Normal Form (3NF) by showing that we can achieve this lowest price by doing a good 3NF normalization.
We then study the design problem for XML documents that are views of relational data. We show that we can design a redundancy-free XML representation for some relational schemas while preserving all data dependencies. We present an algorithm for converting a relational schema to such an XML design.
We finally study the design problem for XML documents that are stored in relational databases. We look for XML design criteria that ensure a relational storage with low redundancy. First, we characterize XML designs that have a redundancy-free relational storage. Then we propose a restrictive condition for XML functional dependencies that guarantees a low redundancy for data values in the relational storage.
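The relational side of this work revolves around functional dependencies and normal forms. A standard building block for reasoning about both is attribute closure; the sketch below implements that textbook algorithm (the schema and FDs are an invented example, not the dissertation's) and uses it to spot the kind of dependency that forces redundancy.

```python
def closure(attrs, fds):
    """Compute the closure of a set of attributes under FDs given as (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_superkey(attrs, all_attrs, fds):
    return closure(attrs, fds) == all_attrs

# Example: R(A, B, C) with A -> B and B -> C.
R = {"A", "B", "C"}
fds = [({"A"}, {"B"}), ({"B"}, {"C"})]
# B -> C has a non-superkey left-hand side, so keeping R whole carries
# redundancy; a 3NF normalization would split out (B, C).
```

Closure also decides dependency preservation after a decomposition, which is exactly the property the dissertation prices in terms of unavoidable redundancy.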