Global ETD Search

1	Prédire les performances des requêtes et expliquer les résultats pour assister la consommation de données liées / Predicting query performance and explaining results to assist Linked Data consumption Hasan, Rakebul 04 November 2014 (has links) Prédire les performances des requêtes et expliquer les résultats pour assister la consommation de données liées. Notre objectif est d'aider les utilisateurs à comprendre les performances d'interrogation SPARQL, les résultats de la requête, et dérivations sur les données liées. Pour aider les utilisateurs à comprendre les performances des requêtes, nous fournissons des prévisions de performances des requêtes sur la base de d’historique de requêtes et d'apprentissage symbolique. Nous n'utilisons pas de statistiques sur les données sous-jacentes à nos prévisions. Ce qui rend notre approche appropriée au Linked Data où les statistiques sont souvent absentes. Pour aider les utilisateurs des résultats de la requête dans leur compréhension, nous fournissons des explications de provenance. Nous présentons une approche sans annotation pour expliquer le “pourquoi” des résultats de la requête. Notre approche ne nécessite pas de reconception du processeur de requêtes, du modèle de données, ou du langage de requête. Nous utilisons SPARQL 1.1 pour générer la provenance en interrogeant les données, ce qui rend notre approche appropriée pour les données liées. Nous présentons également une étude sur les utilisateurs montrant l'impact des explications. Enfin, pour aider les utilisateurs à comprendre les dérivations sur les données liées, nous introduisons le concept d’explications liées. Nous publions les métadonnées d’explication comme des données liées. Cela permet d'expliquer les résultats en suivant les liens des données utilisées dans le calcul et les liens des explications. Nous présentons une extension de l'ontologie PROV W3C pour décrire les métadonnées d’explication. Nous présentons également une approche pour résumer ces explications et aider les utilisateurs à filtrer les explications. / Our goal is to assist users in understanding SPARQL query performance, query results, and derivations on Linked Data. To help users in understanding query performance, we provide query performance predictions based on the query execution history. We present a machine learning approach to predict query performances. We do not use statistics about the underlying data for our predictions. This makes our approach suitable for the Linked Data scenario where statistics about the underlying data is often missing such as when the data is controlled by external parties. To help users in understanding query results, we provide provenance-based query result explanations. We present a non-annotation-based approach to generate why-provenance for SPARQL query results. Our approach does not require any re-engineering of the query processor, the data model, or the query language. We use the existing SPARQL 1.1 constructs to generate provenance by querying the data. This makes our approach suitable for Linked Data. We also present a user study to examine the impact of query result explanations. Finally to help users in understanding derivations on Linked Data, we introduce the concept of Linked Explanations. We publish explanation metadata as Linked Data. This allows explaining derived data in Linked Data by following the links of the data used in the derivation and the links of their explanation metadata. We present an extension of the W3C PROV ontology to describe explanation metadata. We also present an approach to summarize these explanations to help users filter information in the explanation, and have an understanding of what important information was used in the derivation. Données liées Performances des requêtes Explication SPARQL Linked data Explanation Query performance Prediction SPARQL
2	GraphQL query performance comparison using MySQL and MongoDB : By conducting Experiments with and without a DataLoader Nordström, Didrik, Vilhelmsson, Marcus January 2022 (has links) GraphQL is a query language rising in popularity, causing many transitions from traditional API endpoints to a GraphQL solution. Reflecting upon the positives and the flaws to using GraphQL, a DataLoader for batching queries sent to databases is sold as a solution to the infamous N+1 problem. Experiments were conducted to test how GraphQL response time, with and without DataLoader, changes when paired with MySQL and MongoDB. Along with the experiments, a Literature Review was conducted reflecting over the databases structural differences that could affect the response time for GraphQL. Results suggest that no major differences were to be found, and the explanation for the minor differences could rather be because of the disparity in query optimization instead of architectural differences for MySQL and MongoDB. DataLoader GraphQL MongoDB MySQL Query Query Performance Computer Sciences Datavetenskap (datalogi)
3	Um novo processo para refatoração de bancos de dados. / A new process to database refactoring. Domingues, Márcia Beatriz Pereira 15 May 2014 (has links) O projeto e manutenção de bancos de dados é um importante desafio, tendo em vista as frequentes mudanças de requisitos solicitados pelos usuários. Para acompanhar essas mudanças o esquema do banco de dados deve passar por alterações estruturais que muitas vezes prejudicam o desempenho e o projeto das consultas, tais como: relacionamentos desnecessários, chaves primárias ou estrangeiras criadas fortemente acopladas ao domínio, atributos obsoletos e tipos de atributos inadequados. A literatura sobre Métodos Ágeis para desenvolvimento de software propõe o uso de refatorações para evolução do esquema do banco de dados quando há mudanças de requisitos. Uma refatoração é uma alteração simples que melhora o design, mas não altera a semântica do modelo de dados, nem adiciona novas funcionalidades. Esta Tese apresenta um novo processo para aplicar refatorações ao esquema do banco de dados. Este processo é definido por um conjunto de tarefas com o objetivo de executar as refatorações de uma forma controlada e segura, permitindo saber o impacto no desempenho do banco de dados para cada refatoração executada. A notação BPMN foi utilizada para representar e executar as tarefas do processo. Como estudo de caso foi utilizado um banco de dados relacional, o qual é usado por um sistema de informação para agricultura de precisão. Esse sistema, baseado na Web, necessita fazer grandes consultas para plotagem de gráficos com informações georreferenciadas. / The development and maintenance of a database is an important challenge, due to frequent changes and requirements from users. To follow these changes, the database schema suffers structural modifications that, many times, negatively affect its performance and the result of the queries, such as: unnecessary relationships, primary and foreign keys, created strongly attached to the domain, with obsolete attributes or inadequate types of attributes. The literature about Agile Methods for software development suggests the use of refactoring for the evolution of database schemas when there are requirement changes. A refactoring is a simple change that improves the design, but it does not alter the semantics of the data model neither adds new functionalities. This thesis aims at proposing a new process to apply many refactoring to the database schema. This process is defined by a set of refactoring tasks, which is executed in a controlled, secure and automatized form, aiming at improving the design of the schema and allowing the DBA to know exactly the impact on the performance of the database for each refactoring performed. A notation BPMN has been used to represent and execute the tasks of the workflow. As a case study, a relational database, which is used by an information system for precision agriculture was used. This system is web based, and needs to perform large consultations to transfer graphics with geo-referential information. Database refactoring Database schema Esquemas Evolução de banco de dados Evolutionary databases Performance de consultas Processo Query performance Refatoração de banco de dados Workflow
4	Um novo processo para refatoração de bancos de dados. / A new process to database refactoring. Márcia Beatriz Pereira Domingues 15 May 2014 (has links) O projeto e manutenção de bancos de dados é um importante desafio, tendo em vista as frequentes mudanças de requisitos solicitados pelos usuários. Para acompanhar essas mudanças o esquema do banco de dados deve passar por alterações estruturais que muitas vezes prejudicam o desempenho e o projeto das consultas, tais como: relacionamentos desnecessários, chaves primárias ou estrangeiras criadas fortemente acopladas ao domínio, atributos obsoletos e tipos de atributos inadequados. A literatura sobre Métodos Ágeis para desenvolvimento de software propõe o uso de refatorações para evolução do esquema do banco de dados quando há mudanças de requisitos. Uma refatoração é uma alteração simples que melhora o design, mas não altera a semântica do modelo de dados, nem adiciona novas funcionalidades. Esta Tese apresenta um novo processo para aplicar refatorações ao esquema do banco de dados. Este processo é definido por um conjunto de tarefas com o objetivo de executar as refatorações de uma forma controlada e segura, permitindo saber o impacto no desempenho do banco de dados para cada refatoração executada. A notação BPMN foi utilizada para representar e executar as tarefas do processo. Como estudo de caso foi utilizado um banco de dados relacional, o qual é usado por um sistema de informação para agricultura de precisão. Esse sistema, baseado na Web, necessita fazer grandes consultas para plotagem de gráficos com informações georreferenciadas. / The development and maintenance of a database is an important challenge, due to frequent changes and requirements from users. To follow these changes, the database schema suffers structural modifications that, many times, negatively affect its performance and the result of the queries, such as: unnecessary relationships, primary and foreign keys, created strongly attached to the domain, with obsolete attributes or inadequate types of attributes. The literature about Agile Methods for software development suggests the use of refactoring for the evolution of database schemas when there are requirement changes. A refactoring is a simple change that improves the design, but it does not alter the semantics of the data model neither adds new functionalities. This thesis aims at proposing a new process to apply many refactoring to the database schema. This process is defined by a set of refactoring tasks, which is executed in a controlled, secure and automatized form, aiming at improving the design of the schema and allowing the DBA to know exactly the impact on the performance of the database for each refactoring performed. A notation BPMN has been used to represent and execute the tasks of the workflow. As a case study, a relational database, which is used by an information system for precision agriculture was used. This system is web based, and needs to perform large consultations to transfer graphics with geo-referential information. Esquemas Evolução de banco de dados Performance de consultas Processo Refatoração de banco de dados Database refactoring Database schema Evolutionary databases Query performance Workflow
5	Evaluation of PR-tree Window Query Performance : Under Modification By Heuristic Update Algorithms / Utvärdering av prestanda för fönstersökning i PR-träd : Under modifikation av heuristiska uppdateringsalgoritmer Kratz, Jakob January 2024 (has links) Spatial data arises in applications such as geographical information systems, computer aided design and computer vision. A practical indexing method for spatial data is the R-tree [1]. A common query to an R-tree is a window query which outputs all spatial objects that intersect a rectangular region in space. The PR-tree is the first R-tree variant where window query performance is asymptotically optimal in the worst case. In this work a PR-tree is updated using algorithms defined by Antonin Guttman [2] and Beckmann et al. [3], respectively and query performance is evaluated. The conclusion is that the R*-tree algorithms by Beckmann et al. [3] is superior to the algorithms by Antonin Guttman [2] for maintaining good query performance while updating a PR-tree / Spatial data är vanligt förekommande i geografiska informationssystem, CAD och datorseende. Ett praktiskt index för spatial data är ett R-träd [1]. En vanlig sökfråga till ett R-träd är en så kallad window query som ges ett rektangulärt område och returnerar alla spatiala objekt som skär detta område. PR-trädet är det första R-trädet med asymptotiskt optimal tidskomplexitet för en window query. I detta arbete används PR-trädet som bas och det modifieras med respektiva algoritmer definerade av Antonin Guttman [2] och Beckmann m. fl. [3] och prestanda för rektangulära sökfrågor utvärderas. Slutsatsen är att om målet är att bibehålla bra prestanda för sökfrågor medan PR-trädet modifieras ska algoritmerna av Beckmann m. fl. [3] föredras. Algorithms Computational Geometry Dynamic PR-tree Query performance evaluation Algoritmer Beräkningsgeometri Dynamiskt PR-träd prestandautvärdering Computer Sciences Datavetenskap (datalogi)
6	Efficient Reorganisation of Hybrid Index Structures Supporting Multimedia Search Criteria Kropf, Carsten 11 February 2017 (has links) (PDF) This thesis describes the development and setup of hybrid index structures. They are access methods for retrieval techniques in hybrid data spaces which are formed by one or more relational or normalised columns in conjunction with one non-relational or non-normalised column. Examples for these hybrid data spaces are, among others, textual data combined with geographical ones or data from enterprise content management systems. However, all non-relational data types may be stored as well as image feature vectors or comparable types. Hybrid index structures are known to function efficiently regarding retrieval operations. Unfortunately, little information is available about reorganisation operations which insert or update the row tuples. The fundamental research is mainly executed in simulation based environments. This work is written ensuing from a previous thesis that implements hybrid access structures in realistic database surroundings. During this implementation it has become obvious that retrieval works efficiently. Yet, the restructuring approaches require too much effort to be set up, e.g., in web search engine environments where several thousands of documents are inserted or modified every day. These search engines rely on relational database systems as storage backends. Hence, the setup of these access methods for hybrid data spaces is required in real world database management systems. This thesis tries to apply a systematic approach for the optimisation of the rearrangement algorithms inside realistic scenarios. Thus, a measurement and evaluation scheme is created which is repeatedly deployed to an evolving state and a model of hybrid index structures in order to optimise the regrouping algorithms to make a setup of hybrid index structures in real world information systems possible. Thus, a set of input corpora is selected which is applied to the test suite as well as an evaluation scheme. To sum up, it can be said that this thesis describes input sets, a test suite including an evaluation scheme as well as optimisation iterations on reorganisation algorithms reflecting a theoretical model framework to provide efficient reorganisations of hybrid index structures supporting multimedia search criteria. Algorithmenoptimierung Hybride Indexstrukturen Komplexität Geo-Textuelle Suchen Datenbanken Suchperformance Einfügeperformance Algorithmic Optimization Hybrid Index Structures Complexity Geo-Textual Searches Databases Query Performance Insertion Performance ddc:330 rvk:ST 130
7	Highspeed Graph Processing Exploiting Main-Memory Column Stores Hauck, Matthias, Paradies, Marcus, Fröning, Holger, Lehner, Wolfgang, Rauhe, Hannes 03 February 2023 (has links) A popular belief in the graph database community is that relational database management systems are generally ill-suited for efficient graph processing. This might apply for analytic graph queries performing iterative computations on the graph, but does not necessarily hold true for short-running, OLTP-style graph queries. In this paper we argue that, instead of extending a graph database management system with traditional relational operators—predicate evaluation, sorting, grouping, and aggregations among others—one should consider adding a graph abstraction and graph-specific operations, such as graph traversals and pattern matching, to relational database management systems. We use an exemplary query from the interactive query workload of the LDBC social network benchmark and run it against our enhanced in-memory, columnar relational database system to support our claims. Our performance measurements indicate that a columnar RDBMS—extended by graph-specific operators and data structures—can serve as a foundation for high-speed graph processing on big memory machines with non-uniform memory access and a large number of available cores. info:eu-repo/classification/ddc/004 ddc:004
8	A comparison of the impact of data vault and dimensional modelling on data warehouse performance and maintenance / Marius van Schalkwyk Van Schalkwyk, Marius January 2014 (has links) This study compares the impact of dimensional modelling and data vault modelling on the performance and maintenance effort of data warehouses. Dimensional modelling is a data warehouse modelling technique pioneered by Ralph Kimball in the 1980s that is much more effective at querying large volumes of data in relational databases than third normal form data models. Data vault modelling is a relatively new modelling technique for data warehouses that, according to its creator Dan Linstedt, was created in order to address the weaknesses of dimensional modelling. To date, no scientific comparison between the two modelling techniques have been conducted. A scientific comparison was achieved in this study, through the implementation of several experiments. The experiments compared the data warehouse implementations based on dimensional modelling techniques with data warehouse implementations based on data vault modelling techniques in terms of load performance, query performance, storage requirements, and flexibility to business requirements changes. An analysis of the results of each of the experiments indicated that the data vault model outperformed the dimensional model in terms of load performance and flexibility. However, the dimensional model required less storage space than the data vault model. With regards to query performance, no statistically significant differences existed between the two modelling techniques. / MSc (Computer Science), North-West University, Potchefstroom Campus, 2014 Data warehouse Dimensional modelling Data vault modelling Data warehouse modelling techniques Kimball Linstedt Query performance Load performance ETL performance Flexibility Storage requirements Datapakhuis Dimensionele modellering Datakluismodellering Datapakhuismodellering Navraagwerkverrigting Laaiwerkverrigting Aanpasbaarheid Stoorvereistes
9	A comparison of the impact of data vault and dimensional modelling on data warehouse performance and maintenance / Marius van Schalkwyk Van Schalkwyk, Marius January 2014 (has links) This study compares the impact of dimensional modelling and data vault modelling on the performance and maintenance effort of data warehouses. Dimensional modelling is a data warehouse modelling technique pioneered by Ralph Kimball in the 1980s that is much more effective at querying large volumes of data in relational databases than third normal form data models. Data vault modelling is a relatively new modelling technique for data warehouses that, according to its creator Dan Linstedt, was created in order to address the weaknesses of dimensional modelling. To date, no scientific comparison between the two modelling techniques have been conducted. A scientific comparison was achieved in this study, through the implementation of several experiments. The experiments compared the data warehouse implementations based on dimensional modelling techniques with data warehouse implementations based on data vault modelling techniques in terms of load performance, query performance, storage requirements, and flexibility to business requirements changes. An analysis of the results of each of the experiments indicated that the data vault model outperformed the dimensional model in terms of load performance and flexibility. However, the dimensional model required less storage space than the data vault model. With regards to query performance, no statistically significant differences existed between the two modelling techniques. / MSc (Computer Science), North-West University, Potchefstroom Campus, 2014 Data warehouse Dimensional modelling Data vault modelling Data warehouse modelling techniques Kimball Linstedt Query performance Load performance ETL performance Flexibility Storage requirements Datapakhuis Dimensionele modellering Datakluismodellering Datapakhuismodellering Navraagwerkverrigting Laaiwerkverrigting Aanpasbaarheid Stoorvereistes
10	Nativní XML rozhraní pro relační databázi / Native XML Interface for a Relational Database Piwko, Karel January 2010 (has links) XML has emerged as leading document format for exchanging data. Because of vast amounts of XML documents available and transfered, there is a strong need to store and query information in these documents. However, the most companies are still using a RDBMS for their data warehouses and it is often necessary to combine legacy data with the ones in XML format, so it might be useful to consider storage possibilities for XML documents in a relation database. In this thesis we focused on structured and semi-structured data-based XML documents, because they are the most common when exchanging data and they can be easily validated against an XML schema. We propose a slightly modified Hybrid algorithm to shred doc- uments into relations using an XSD scheme and we allowed redundancy to make queries faster. Our goal was not to provide an academic solution, but fully working system supporting latest standards, which will beat up native XML databases both by performance and vertical scalability.

Search results