21 |
Supporting multiple data stores based applications in cloud environments / Soutenir les applications utilisant des bases de données multiples dans un environnement Cloud Computing
Sellami, Rami 05 February 2016
With the advent of cloud computing and big data, new database management systems have emerged, generally known as NoSQL systems. Compared with relational systems, they are distinguished by the absence of a schema, specialization for particular data types (document, graph, key/value, and column), and the lack of declarative query languages. The offering is abundant, and there is currently no standard comparable to SQL for relational systems. Many applications may need to manipulate data stored in relational systems and in NoSQL systems at the same time. The programmer must then handle (at least) two different data models and (at least) two different query languages to write the application. In addition, the programmer must explicitly manage the application's whole life cycle: (1) code the application, (2) discover the database services deployed in each cloud environment and choose a deployment environment, (3) deploy the application, (4) execute multi-source queries by programming them explicitly in the application, and, where applicable, (5) migrate the application from one cloud environment to another. All these tasks are heavy and tedious, and the programmer risks getting lost in this high degree of heterogeneity. To overcome these problems and assist the programmer throughout the life cycle of applications that use multiple databases, we propose a coherent set of models, algorithms, and tools. The work in this thesis consists of four contributions. First, we propose a unified data model to cover the heterogeneity between relational and NoSQL data models. This data model is enriched with a set of refinement rules. 
Based on this model, we defined our query algebra. Next, we propose a programming interface called ODBAPI, based on our unified data model, which lets us manipulate any data source uniformly, whether relational or NoSQL. ODBAPI makes it possible to program applications independently of the databases used and to express simple queries and complex multi-source queries. Then, we define the notion of virtual data stores, which act as mediators and interact with the integrated databases via ODBAPI. The latter then plays the role of an adapter. Virtual data stores ensure that queries are executed optimally thanks to a cost model and an algorithm, both of which we define, for generating an optimal execution plan. Finally, we propose an automatic approach for discovering databases in cloud environments. Programmers can describe their database requirements in manifests, and our matching algorithm selects the most suitable environment in which to deploy the application. We then deploy the application using a generic deployment API called COAPS, which we extended to deploy applications that use several data sources. A prototype of the proposed solution was developed and applied in use cases of the OpenPaaS project. We also carried out various experiments to test the efficiency and accuracy of our contributions. / The production of huge amounts of data and the emergence of cloud computing have introduced new requirements for data management. 
Many applications need to interact with several heterogeneous data stores depending on the type of data they have to manage: traditional data types, documents, graph data from social networks, simple key-value data, etc. Interacting with heterogeneous data models through different APIs imposes challenging tasks on the developers of applications based on multiple data stores. Indeed, programmers have to be familiar with different APIs. In addition, complex queries over heterogeneous data models cannot currently be executed declaratively, as they can be in single-data-store applications, and therefore require extra implementation effort. Moreover, developers need to master and deal with the complex processes of cloud discovery and of application deployment and execution. In this manuscript, we propose an integrated set of models, algorithms, and tools aimed at easing developers' tasks in developing, deploying, and migrating multiple-data-store applications in cloud environments. Our approach focuses mainly on three points. First, we provide a unified data model used by application developers to interact with heterogeneous relational and NoSQL data stores. This model is enriched by a set of refinement rules. Based on that, we define our query algebra. Developers express queries using the OPEN-PaaS-DataBase API (ODBAPI), a single REST API that allows programmers to write their application code independently of the target data stores. Second, we propose virtual data stores, which act as mediators and interact with integrated data stores wrapped by ODBAPI. This run-time component supports the execution of simple and complex queries over heterogeneous data stores. It implements a cost model to execute queries optimally and a dynamic-programming-based algorithm to generate an optimal query execution plan. 
Finally, we present a declarative approach that lightens the burden of the tedious and non-standard tasks of (1) discovering relevant cloud environments and (2) deploying applications on them, while letting developers focus simply on specifying their storage and computing requirements. A prototype of the proposed solution has been developed and applied in use cases from the OpenPaaS project. We also performed different experiments to test the efficiency and accuracy of our proposals.
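The idea of a single interface over relational and NoSQL stores, with a mediator running a query across them, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `DataStore` protocol, the two in-memory stand-in stores, and the `join` helper are hypothetical names and do not reproduce ODBAPI's actual REST operations or the thesis's query algebra, cost model, or plan generator.

```python
from typing import Any, Dict, Iterable, List, Protocol, Sequence, Tuple

Entity = Dict[str, Any]


class DataStore(Protocol):
    """Hypothetical uniform interface; not the actual ODBAPI operations."""

    def get_all(self, entity_set: str) -> Iterable[Entity]: ...


class RelationalStore:
    """In-memory stand-in for a relational store: column names plus row tuples."""

    def __init__(self) -> None:
        self.tables: Dict[str, Tuple[Sequence[str], List[tuple]]] = {}

    def create_table(self, name: str, columns: Sequence[str], rows: List[tuple]) -> None:
        self.tables[name] = (columns, rows)

    def get_all(self, entity_set: str) -> Iterable[Entity]:
        columns, rows = self.tables[entity_set]
        for row in rows:
            # Expose each row as an attribute/value entity.
            yield dict(zip(columns, row))


class DocumentStore:
    """In-memory stand-in for a document store: collections of dictionaries."""

    def __init__(self) -> None:
        self.collections: Dict[str, List[Entity]] = {}

    def insert(self, collection: str, document: Entity) -> None:
        self.collections.setdefault(collection, []).append(document)

    def get_all(self, entity_set: str) -> Iterable[Entity]:
        for document in self.collections.get(entity_set, []):
            yield dict(document)


def join(left: Iterable[Entity], right: Iterable[Entity],
         left_key: str, right_key: str) -> List[Entity]:
    """Hash join over entities, regardless of which store produced them."""
    index: Dict[Any, List[Entity]] = {}
    for entity in right:
        if right_key in entity:
            index.setdefault(entity[right_key], []).append(entity)
    result: List[Entity] = []
    for entity in left:
        for match in index.get(entity.get(left_key), []):
            # On attribute name clashes the left entity wins.
            result.append({**match, **entity})
    return result


relational = RelationalStore()
relational.create_table("person", ["person_id", "name"], [(1, "Ada"), (2, "Linus")])

documents = DocumentStore()
documents.insert("review", {"author_id": 1, "text": "Clear and concise."})
documents.insert("review", {"author_id": 3, "text": "Orphan review."})

rows = join(relational.get_all("person"), documents.get_all("review"),
            "person_id", "author_id")
```

The application code above only touches `get_all` and `join`, so swapping either stand-in for another store that offers the same method would not change it, which is the independence property the abstract describes.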
|
22 |
DATA INTEGRITY IN THE HEALTHCARE INDUSTRY: ANALYZING THE EFFECTIVENESS OF DATA SECURITY IN GOOD DATA AND RECORD MANAGEMENT PRACTICES (A CASE STUDY OF COMPUTERIZING THE COMPETENCE MATRIX FOR A QUALITY CONTROL DRUG LABORATORY)
Marcel C Okezue (12522565) 06 October 2022
<p>This project analyzes the concept of time efficiency in the data management process associated with personnel training and competence assessments in the quality control (QC) laboratory of Nigeria’s food and drug authority (NAFDAC). The laboratory administrators are encumbered with a great deal of mental and paper-based record keeping because the personnel training data is managed manually. Consequently, personnel training and competence assessments in the laboratory are not conducted efficiently. The Microsoft Excel spreadsheet provided by a Purdue doctoral dissertation as a remedy was found to be deficient in handling operations on database tables. As a result, that doctoral dissertation did not appropriately address the inefficiencies.</p>
<p>The problem addressed by this study is the operational inefficiency that results from the manual or Excel-based personnel training data management process in the NAFDAC laboratory. The purpose, therefore, is to reduce the time it takes to generate, obtain, manipulate, exchange, and securely store the personnel competence training and assessment data. To do this, the study developed a software system integrated with a relational database management system (RDBMS) to improve on the manual/Microsoft Excel-based data management procedure. To ascertain the validity of the new system, this project compares its operational (time) efficiency with that of the manual or Excel-based format.</p>
<p>The data used in this qualitative research come from the literature and from simulating the difference between the times spent administering personnel training and competence assessment using the new system developed by this study and the Excel system from another project, respectively. The fundamental finding of this study is that the idea of improving the operational (time) efficiency of the personnel training and competence assessment process in the QC laboratory is valid. Doing so will reduce human errors, achieve more time-efficient operation, and improve personnel training and competence assessment processes.</p>
<p>Recommendations are made as to the procedure the laboratory administrator must adopt to take advantage of the new system. The study also recommends steps for future research to extend the capability of this project.</p>
|
23 |
Performance investigation into selected object persistence stores
Van Zyl, Pieter 21 July 2010
The current popular, distributed, n-tiered, object-oriented application architecture provokes many design debates. Designs of such applications are often divided into logical layers (or tiers), usually a user interface layer, a business logic layer and a domain object (or data) layer, each with its own design issues. In particular, the latter contains data that needs to be stored in and retrieved from permanent storage. Decisions need to be made as to the most appropriate way of doing this. The choices are usually whether to use an object database, to communicate directly with a relational database, or to use object-relational mapping (ORM) tools to allow objects to be translated to and from their relational form. Most often, depending on the perceived profile of the application, software architects make these decisions using rules of thumb derived from particular experience or the design patterns literature. Although helpful, these rules are often highly context-dependent and are often misapplied. Research into the nature and magnitude of 'design forces' in this area has resulted in a series of benchmarks, intended to allow architects to understand more clearly the implications of design decisions concerning persistence. This study provides some results to help guide the architect's decisions. The study focused on the performance of object persistence and compared ORM tools to object databases. ORM tools provide an extra layer between the business logic layer and the data layer. This study began with the hypothesis that this extra layer, and the mapping that happens at that point, slows down the performance of object persistence. The aim was to investigate the influence of this extra layer against the use of object databases, which remove the need for this extra mapping layer. The study also investigated the impact of certain optimisation techniques on performance. A benchmark was used to compare ORM tools to object databases. 
The benchmark provided criteria that were used to compare them with each other. The particular benchmark chosen for this study was OO7, widely used to comprehensively test object persistence performance. Part of the study was to investigate the OO7 benchmark in greater detail to get a clearer understanding of its code and inner workings. Included in this study was a comparison of the performance of an open source object database, db4o, against a proprietary object database, Versant. These representatives of object databases were compared against one another as well as against Hibernate, a popular open source representative of the ORM stable. It is important to note that these applications were initially used in their default modes (out of the box). Later some optimisation techniques were incorporated into the study, based on feedback obtained from the application developers. There is a common perception that an extra layer as introduced by Hibernate negatively impacts performance. This study showed that such a layer has minimal impact on performance. With the use of caching and other optimisation techniques, Hibernate compared well against object databases. Versant, a proprietary object database, was faster than Hibernate and the db4o open source object database. / Dissertation (MSc)--University of Pretoria, 2010. / Computer Science / unrestricted
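To make the notion of an "extra mapping layer" concrete, here is a small Python sketch that times the same query with and without a hand-written object-relational mapping step. It is only an illustration under stated assumptions: it uses the standard-library `sqlite3` module rather than Hibernate, db4o or Versant, the `Part` class and helper functions are hypothetical, and it is not the OO7 benchmark or any part of it.

```python
import sqlite3
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Part:
    """Hypothetical domain object, loosely inspired by benchmark-style parts."""
    part_id: int
    kind: str
    build_date: int


def open_store() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE part (part_id INTEGER PRIMARY KEY, kind TEXT, build_date INTEGER)"
    )
    return conn


def save_parts(conn: sqlite3.Connection, parts: List[Part]) -> None:
    """Mapping layer, object -> row."""
    conn.executemany(
        "INSERT INTO part VALUES (?, ?, ?)",
        [(p.part_id, p.kind, p.build_date) for p in parts],
    )
    conn.commit()


def load_parts(conn: sqlite3.Connection) -> List[Part]:
    """Mapping layer, row -> object."""
    cursor = conn.execute("SELECT part_id, kind, build_date FROM part ORDER BY part_id")
    return [Part(*row) for row in cursor]


def load_rows(conn: sqlite3.Connection) -> List[tuple]:
    """The same query with no mapping step, for comparison."""
    return conn.execute(
        "SELECT part_id, kind, build_date FROM part ORDER BY part_id"
    ).fetchall()


def time_call(fn: Callable[[], object], repeats: int = 5) -> float:
    """Best wall-clock time in seconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


conn = open_store()
parts = [Part(i, "atomic", 2000 + i % 10) for i in range(1000)]
save_parts(conn, parts)

mapped_seconds = time_call(lambda: load_parts(conn))
raw_seconds = time_call(lambda: load_rows(conn))
print(f"mapped: {mapped_seconds:.6f}s, raw rows: {raw_seconds:.6f}s")
```

The timings depend on the machine and say nothing about the products studied in the dissertation; the point is only to show where the object-to-row translation sits and how its cost could be isolated.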
|
24 |
Nativní XML rozhraní pro relační databázi / Native XML Interface for a Relational Database
Piwko, Karel January 2010
XML has emerged as the leading document format for exchanging data. Because of the vast number of XML documents available and transferred, there is a strong need to store and query the information in these documents. However, most companies still use an RDBMS for their data warehouses, and it is often necessary to combine legacy data with data in XML format, so it is useful to consider storage options for XML documents in a relational database. In this thesis we focus on structured and semi-structured data-based XML documents, because they are the most common in data exchange and can easily be validated against an XML schema. We propose a slightly modified Hybrid algorithm to shred documents into relations using an XSD schema, and we allow redundancy to make queries faster. Our goal was not to provide an academic solution but a fully working system supporting the latest standards, one that will outperform native XML databases in both performance and vertical scalability.
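Shredding an XML document into relations can be sketched in a few lines of Python with the standard-library `xml.etree.ElementTree` and `sqlite3` modules. This is a deliberately simple illustration, not the modified Hybrid algorithm of the thesis: the sample document, the hand-written `book` table and the `shred_books` function are assumptions, and the table layout is fixed by hand instead of being derived from an XSD schema.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document; each <book> has exactly one title and one year.
XML_DOC = """
<library>
  <book id="b1"><title>Dune</title><year>1965</year></book>
  <book id="b2"><title>Neuromancer</title><year>1984</year></book>
</library>
"""


def shred_books(xml_text: str, conn: sqlite3.Connection) -> int:
    """Store every <book> element as one row; return the number of rows stored."""
    conn.execute("CREATE TABLE book (id TEXT PRIMARY KEY, title TEXT, year INTEGER)")
    root = ET.fromstring(xml_text)
    rows = [
        # Single-occurrence children are inlined as columns of the parent's table.
        (book.get("id"), book.findtext("title"), int(book.findtext("year")))
        for book in root.iter("book")
    ]
    conn.executemany("INSERT INTO book VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
count = shred_books(XML_DOC, conn)

# Once shredded, the data is queried with plain SQL.
titles = [title for (title,) in
          conn.execute("SELECT title FROM book WHERE year > 1970 ORDER BY title")]
print(count, titles)
```

Storing single-occurrence children as columns of the parent row avoids a join per child, which is the kind of trade-off the abstract alludes to when it accepts redundancy for faster queries.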
|