Global ETD Search

1	Implementing a heterogeneous relational database node Long, J. A. January 1985 (has links) No description available. 621.39 Distributed database system
2	Selective Data Replication for Distributed Geographical Data Sets Gu, Xuan January 2008 (has links) The main purpose of this research is to incorporate additional higher-level semantics into the existing data replication strategies in such a way that their flexibility and performance can be improved in favour of both data providers and consumers. The resulting approach from this research is referred to as the selective data replication system. With this system, the data that has been updated by a data provider is captured and batched into messages known as update notifications. Once update notifications are received by data consumers, they are used to evaluate so-called update policies, which are specified by data consumers containing details on when data replications need to occur and what data needs to be updated during the replications. data replication distributed database ontology geographical information system
3	The Customized Database Fragmentation Technique in Distributed Database Systems : A case Study Shareef, Mohammed Ibrahim, Rawi, Aus Wail-Al January 2012 (has links) In current age, various companies are using a centralized database system for daily business transactions in different domains. Some critical issues have been observed related to the complexity, maintenance, performance and communication cost of data in centralized data repository for query processing, according to the demand of end users from different locations. So, different enterprises are striving to implement efficient distributed database systems in their business environments for scalability. The distributed database architecture covers different factors such as transparent management system, replication, fragmentation and allocation etc. This dissertation focuses on database fragmentation and techniques which are useful for performing database fragmentation. The objective of this research is to investigate efficient algorithm and technique for database fragmentation in distributed environment. We proposed a customized ISUD (Insert, Select, Update, Delete) technique after comparative study of the best suitable techniques, which is selected for implementation purpose. The functionality of the customized ISUD technique helps to get the precedence of the attribute of a relation horizontally in database from various sites or location. The practical objective of this dissertation is to design the architecture and develop, implement customized ISUD (Insert, Select, Update, Delete) user interface, and to test the selected algorithm or technique by using the interface. We used C#.Net as a development tool. This user interface accepts ISUD frequency as an input and produces ALP (attribute location precedence) values as output. 
We have incorporated design science research (DSR) method for customized ISUD technique development. This customized ISUD technique can be considered as a foundation to implement horizontal database fragmentation in distributed environment, so that the database administrator can take a proper decision for allocating the fragmented data to various sites at initial state of distributed database design. Distributed database Database Fragmentation Attribute Locality precedence Customized ISUD.
5	DESIGN OF DECOMPOSABLE ALGORITHMS FOR DISTRIBUTED DATABASES KHEDR, AHMED MOHAMED 17 April 2003 (has links) No description available. Computer Science decomposable algorithm distributed database clustering graph
6	A C++ Distributed Database Select-project-join Queryprocessor On A Hpc Cluster Ceran, Erhan 01 May 2012 (has links) (PDF) High performance computer clusters have become popular as they are more scalable, affordable and reliable than their centralized counterparts. Database management systems are particularly suitable for distributed architectures; however, distributed DBMS are still not used widely because of the design difficulties. In this study, we aim to help overcome these difficulties by implementing a simulation testbed for a distributed query plan processor. This testbed works on our departmental HPC cluster machine and is able to perform select, project and join operations. A data generation module has also been implemented which preserves the foreign key and primary key constraints in the database schema. The testbed has the capability to measure, simulate and estimate the response time of a given query execution plan using specified communication network parameters. Extensive experimental work is performed to show the correctness of the produced results. The estimated execution time costs are also compared with the actual run-times obtained from the testbed to verify the proposed estimation functions. Thus, we make sure that these estimation functions can be used in distributed database query optimization and distributed database design tools. QA Computer Software 76.75-76.765
7	Medical Data Management on the cloud / Gestion de données médicales sur le cloud Mohamad, Baraa 23 June 2015 (has links) French abstract unavailable / Medical data management has become a real challenge due to the emergence of new imaging technologies providing high image resolutions. This thesis focuses in particular on the management of DICOM files. DICOM is one of the most important medical standards. DICOM files have a special data format where one file may contain regular data, multimedia data and services. These files are extremely heterogeneous (the schema of a file cannot be predicted) and have large data sizes. The characteristics of DICOM files, added to the requirements of medical data management in general – in terms of availability and accessibility – have led us to construct our research question as follows: Is it possible to build a system that: (1) is highly available, (2) supports any medical images (different specialties, modalities and physicians’ practices), (3) can store extremely large, ever-increasing data, (4) provides expressive accesses and (5) is cost-effective? In order to answer this question we have built a hybrid (row-column) cloud-enabled storage system. The idea of this solution is to disperse DICOM attributes thoughtfully, depending on their characteristics, over both data layouts in a way that provides the best of row-oriented and column-oriented storage models in one system, all while exploiting the interesting features of the cloud that enable us to ensure the availability and portability of medical data. Storing data on such a hybrid data layout opens the door for a second research question: how to process queries efficiently over this hybrid data storage while enabling new and more efficient query plans. The originality of our proposal comes from the fact that there is currently no system that stores data in such hybrid storage (i.e. 
an attribute is either on a row-oriented database or on a column-oriented one and a given query could interrogate both storage models at the same time) and studies query processing over it. The experimental prototypes implemented in this thesis show interesting results and open the door for multiple optimizations and research questions. Imagerie médicale Medical Imaging Hybrid database Cloud computing Query processing Multi-database DICOM Distributed database
8	Distributed indexing and scalable query processing for interactive big data explorations Guzun, Gheorghi 01 August 2016 (has links) The past few years have brought a major surge in the volumes of collected data. More and more enterprises and research institutions find tremendous value in data analysis and exploration. Big Data analytics is used for improving customer experience, performing complex weather data integration and model prediction, as well as for personalized medicine and many other services. Advances in technology, along with high interest in big data, can only increase the demand on data collection and mining in the years to come. As a result, and in order to keep up with the data volumes, data processing has become increasingly distributed. However, most of the distributed processing for large data is done by batch processing and interactive exploration is hardly an option. To efficiently support queries over large amounts of data, appropriate indexing mechanisms must be in place. This dissertation proposes an indexing and query processing framework that can run on top of a distributed computing engine, to support fast, interactive data explorations in data warehouses. Our data processing layer is built around bit-vector based indices. This type of indexing features fast bit-wise operations and scales up well for high dimensional data. Additionally, compression can be applied to reduce the index size, and thus utilize less memory and network communication. Our work can be divided into two areas: index compression and query processing. Two compression schemes are proposed for sparse and dense bit-vectors. The design of these encoding methods is hardware-driven, and the query processing is optimized for the available computing hardware. Query algorithms are proposed for selection, aggregation, and other specialized queries. The query processing is supported on single machines, as well as computer clusters. 
Bit-vector Database Indexing Data Compression Data Exploration Distributed Database Query Algorithm Electrical and Computer Engineering
9	Genetic Algorithms For Distributed Database Design And Distributed Database Query Optimization Sevinc, Ender 01 October 2009 (has links) (PDF) The increasing performance of computers, reduced prices and the ability to connect systems with low cost gigabit ethernet LAN and ATM WAN networks make distributed database systems an attractive research area. However, the complexity of distributed database query optimization is still a limiting factor. Optimal techniques, such as dynamic programming, used in centralized database query optimization are not feasible because of the increased problem size. The recently developed genetic algorithm (GA) based optimization techniques present a promising alternative. We compared the best known GA with a random algorithm and showed that it achieves almost no improvement over the random search algorithm generating an equal number of random solutions. Then, we analyzed a set of possible GA parameters and determined that two-point truncate technique using GA gives the best results. New mutation and crossover operators defined in our GA are experimentally analyzed within a synthetic distributed database having increasing numbers of relations and nodes. The designed synthetic database replicated relations, but there was no horizontal/vertical fragmentation. We can translate a select-project-join query including a fragmented relation with N fragments into a corresponding query with N relations. Comparisons with optimal results found by exhaustive search are only 20% off the results produced by our new GA formulation showing a 50% improvement over the previously known GA based algorithm.
10	Replica selection in Apache Cassandra : Reducing the tail latency for reads using the C3 algorithm Thorsen, Sofie January 2015 (has links) Keeping response times low is crucial in order to provide a good user experience. Especially the tail latency proves to be a challenge to keep low as size, complexity and overall use of services scale up. In this thesis we look at reducing the tail latency for reads in the Apache Cassandra database system by implementing the new replica selection algorithm called C3, recently developed by Lalith Suresh, Marco Canini, Stefan Schmid and Anja Feldmann. Through extensive benchmarks with several stress tools, we find that C3 indeed decreases the tail latencies of Cassandra on generated load. However, when evaluating C3 on production load, results do not show any particular improvement. We argue that this is mostly due to the variable size records in the data set and token awareness in the production client. We also present a client-side implementation of C3 in the DataStax Java driver in an attempt to remove the caveat of token aware clients. The client-side implementation did give positive results, but as the benchmark results showed a lot of variance we deem the results to be too inconclusive to confirm that the implementation works as intended. We conclude that the server-side C3 algorithm will work effectively for systems with homogeneous row sizes where the clients are not token aware. cassandra replica selection distributed database tail latency Computer Sciences Datavetenskap (datalogi)
