Global ETD Search

1	SPACE ALLOCATION FOR MATERIALIZED VIEWS AND INDEXES USING GENETIC ALGORITHMS MACHIRAJU, SIRISHA 16 September 2002 No description available. data warehousing materialized views genetic algorithms index selection space allocation
2	The Impact of Storage Strategies on Maintenance of XML Views Åhgren, Mikael January 2001 Information in a data warehouse is stored in materialized views, which must be kept consistent with respect to changes made in the sources. This problem has been extensively studied in the relational model. The process is referred to as view maintenance. XML is emerging as the de facto standard for data representation and data exchange of semistructured data. Most discussions involving XML assume the XML data is stored in plain text files. However, there are a number of different approaches for storing XML data, which can be categorized according to the underlying system used. Views and materialized views can also be specified in XML. This dissertation investigates how view maintenance in an XML context is influenced by the storage approach used. We survey existing storage strategies using a relational database as the underlying system for storage, and storage strategies using plain text files. Further, we survey approaches for maintenance in the context of XML. We investigate three selected storage strategies in detail. We conclude with some insights gained during the investigation. XML Storage Strategies Materialized Views View Maintenance Computer and systems science
3	Materialized Views over Heterogeneous Structured Data Sources in a Distributed Event Stream Processing Environment January 2011 Data-driven applications are becoming increasingly complex, with support for processing events and data streams in a loosely coupled distributed environment and integrated access to heterogeneous data sources such as relational databases and XML documents. This dissertation explores the use of materialized views over structured heterogeneous data sources to support multiple-query optimization in a distributed event stream processing framework. The framework supports such applications through various query expressions for detecting events, monitoring conditions, handling data streams, and querying data. Materialized views store the results of the computed view so that subsequent access to the view retrieves the materialized results, avoiding the cost of recomputing the entire view from base data sources. Using a service-based metadata repository that provides metadata-level access to the various language components in the system, a heuristics-based algorithm detects the common subexpressions from the queries represented in a mixed multigraph model over relational and structured XML data sources. These common subexpressions can be relational, XML, or a hybrid join over the heterogeneous data sources. This research examines the challenges in the definition and materialization of views when the heterogeneous data sources are retained in their native format, instead of converting the data to a common model. LINQ serves as the materialized view definition language for creating the view definitions. An algorithm is introduced that uses LINQ to create a data structure for the persistence of these hybrid views. Any changes to base data sources used to materialize views are captured and mapped to a delta structure. The deltas are then streamed within the framework for use in the incremental update of the materialized view. Algorithms are presented that use the magic sets query optimization approach both to materialize the views efficiently and to propagate the relevant changes to the views for incremental maintenance. Using representative scenarios over structured heterogeneous data sources, an evaluation of the framework demonstrates an improvement in performance. Thus, defining the LINQ-based materialized views over heterogeneous structured data sources using the detected common subexpressions, and incrementally maintaining the views by using magic sets, enhances the efficiency of the distributed event stream processing environment. / Dissertation/Thesis / Ph.D. Computer Science 2011 Computer Science Common Subexpressions Incremental View Maintenance LINQ Magic Sets Materialized Views Metadata Repository
4	The Impact of Storage Strategies on Maintenance of XML Views Åhgren, Mikael January 2001 Information in a data warehouse is stored in materialized views, which must be kept consistent with respect to changes made in the sources. This problem has been extensively studied in the relational model. The process is referred to as view maintenance. XML is emerging as the de facto standard for data representation and data exchange of semistructured data. Most discussions involving XML assume the XML data is stored in plain text files. However, there are a number of different approaches for storing XML data, which can be categorized according to the underlying system used. Views and materialized views can also be specified in XML. This dissertation investigates how view maintenance in an XML context is influenced by the storage approach used. We survey existing storage strategies using a relational database as the underlying system for storage, and storage strategies using plain text files. Further, we survey approaches for maintenance in the context of XML. We investigate three selected storage strategies in detail. We conclude with some insights gained during the investigation. XML Storage Strategies Materialized Views View Maintenance Information Systems
5	Hypergraphs in the Service of Very Large Scale Query Optimization. Application : Data Warehousing / Les hypergraphes au service de l'optimisation de requêtes à très large échelle. Application : Entrepôt de données Boukorca, Ahcène 12 December 2016 The emergence of Big Data has created new, growing, and urgent needs to share data between users and communities, which has generated a large number of queries that DBMSs must handle. This problem has been compounded by further needs for query recommendation and exploration. Data processing remains feasible thanks to solutions in query optimization, physical design, and deployment architecture, all of which result from solving combinatorial problems based on queries; it is therefore essential to revisit traditional methods to meet the new scalability requirements. This thesis focuses on the problem of numerous queries and proposes a scalable approach, implemented in a framework called Big-Queries and based on the hypergraph, a flexible data structure with great modeling power that allows accurate formulation of many problems in combinatorial scientific computing. This approach is the result of a collaboration with the company Mentor Graphics. It aims to capture query interaction in a unified query plan and to use partitioning algorithms to ensure scalability and to obtain optimal optimization structures (materialized views and data partitioning). The unified plan is also used in the deployment phase of parallel data warehouses, by partitioning data into fragments and allocating these fragments to the corresponding processing nodes. An intensive experimental study showed the value of our approach in terms of algorithm scalability and reduction of query response time. Physical design Data partitioning Materialized views
6	Vers une conception logique et physique des bases de données avancées dirigée par la variabilité / Towards a Variability-Aware Logical and Physical Database Design Bouarar, Selma 13 December 2016 The evolution of computer technology has strongly impacted the database design process, which now requires more time and resources to encompass the diversity of DB applications. Designers rely on their talent and knowledge, which have proven insufficient to face the increasing diversity of design choices, raising the problem of the reliability and completeness of this knowledge. This problem is well known as variability management in software engineering. While some work exists on managing the variability of the physical and conceptual phases, very little has focused on logical design. Moreover, these works address the design phases separately and thus ignore their interdependencies. In this thesis, we first present a methodology to manage the variability of the whole DB design process using the technique of software product lines, so that (i) interdependencies between design phases can be considered, (ii) a holistic vision is provided to the designer, and (iii) process automation is increased. Given the scope of the study, we proceed step by step in implementing this vision, by studying a case chosen to show: (i) the importance of logical design variability, (ii) how to manage it by offering designers exhaustive choices and reliable selection, (iii) its impact on physical design (multi-phase management), and (iv) the evaluation of logical design, and of the impact of logical variability on physical design (materialized view selection), in terms of non-functional requirements: execution time, energy consumption, and storage space. Variability management Physical design Materialized views
7	Optimierung der materialisierten Sichten in einem Datawarehouse auf der Grundlage der aus einem ERP-System übernommenen operativen Daten / Optimization of Materialized Views in a Data Warehouse Based on Operational Data Taken from an ERP System Achs, Thomas Ludwig 10 1900 Planning and developing an optimal data warehouse system is a goal of many scientists and researchers from different fields. Numerous publications on this topic have been written and published in recent years. This literature attempts to develop a heuristic that delivers a near-optimal solution to the materialization problem in the data warehouse. In the past, many publications made assumptions, such as unlimited resources or fast access times, that do not hold in the real world. The vision behind this work is to develop an instrument that takes these limiting factors into account, or at least attempts to. The Aggregation Path Array modeling method of Prosser and Ossimitz has proven particularly suitable for finding a solution approach in this problem area. Above all, its simple graphical representation makes the method especially suitable for information-technology representation. It enables even inexperienced end users to design a warehouse. For this reason, the method is also particularly suitable for training and education purposes. The goal of this work is the cost-minimal physical provision of important information to decision makers in companies. This requires solving an optimization problem that respects limited time and storage resources while taking important information into account. Unfortunately, this information cannot always be regarded as homogeneous. For example, there is important information that is necessary for the survival of an organization, and information that is important but need not be constantly available. The attempt to find a solution approach to this problem is the heart of my work. (author's abstract)
8	Boa Views: Enabling Modularization and Sharing of Boa Queries Hung, Che Shian 09 August 2019 No description available. Computer Science
9	Interactive visualization of financial data : Development of a visual data mining tool Saltin, Joakim January 2012 In this project, a prototype visual data mining tool was developed, allowing users to interactively investigate large multi-dimensional datasets visually (using 2D visualization techniques) through so-called drill-down, roll-up and slicing operations. The project included all steps of the development, from writing specifications and designing the program to implementing and evaluating it. Using ideas from data warehousing, custom methods for storing pre-computed aggregations of data (commonly referred to as materialized views) and retrieving data from these were developed and implemented in order to achieve higher performance on large datasets. View materialization enables the program to easily fetch or calculate a view using other views, which can yield significant performance gains if view sizes are much smaller than the underlying raw dataset. The choice of which views to materialize was made automatically using a well-known algorithm, the greedy algorithm for view materialization, which selects the fraction of all possible views that is likely (but not guaranteed) to yield the best performance gain. The use of materialized views was shown to have good potential to increase performance for large datasets, with an average speedup (compared to on-the-fly queries) between 20 and 70 for a test dataset containing 500,000 rows. The end result was a program combining flexibility with good performance, which was also reflected in good scores in a user-acceptance test with participants from the company where this project was carried out. visual data mining visualization data warehousing software engineering materialized views OLAP OLAP cubes greedy algorithm high-performance query
10	Scalable view-based techniques for web data : algorithms and systems Katsifodimos, Asterios 03 July 2013 XML was recommended by W3C in 1998 as a markup language to be used by device- and system-independent methods of representing information. XML is nowadays used as a data model for storing and querying large volumes of data in database systems. In spite of significant research and systems development, processing very large amounts of XML data still raises many performance problems. Materialized views have long been used in databases to speed up queries. Materialized views can be seen as precomputed query results that can be re-used to evaluate (part of) another query, and have been a topic of intensive research, in particular in the context of relational data warehousing. This thesis investigates the applicability of materialized view techniques to optimize the performance of Web data management tools, in particular in distributed settings, considering XML data and queries. We make three contributions. We first consider the problem of choosing the best views to materialize within a given space budget in order to improve the performance of a query workload. Our work is the first to address the view selection problem for a rich subset of XQuery. The challenges we face stem from the expressive power and features of both the query and view languages and from the size of the search space of candidate views to materialize. While the general problem has prohibitive complexity, we propose and study a heuristic algorithm and demonstrate its superior performance compared to the state of the art. Second, we consider the management of large XML corpora in peer-to-peer networks, based on distributed hash tables (or DHTs, in short). We consider a platform leveraging distributed materialized XML views, defined by arbitrary XML queries, filled in with data published anywhere in the network, and exploited to efficiently answer queries issued by any network peer. This thesis has contributed important scalability-oriented optimizations, as well as a comprehensive set of experiments deployed in a country-wide WAN. These experiments exceed those of similar competitor systems by orders of magnitude in terms of data volumes and data dissemination throughput. Thus, they are the most advanced in understanding the performance behavior of DHT-based XML content management in real settings. Finally, we present a novel approach for scalable content-based publish/subscribe (pub/sub, in short) in the presence of constraints on the available computational resources of data publishers. We achieve scalability by off-loading subscriptions from the publisher, and leveraging view-based query rewriting to feed these subscriptions from the data accumulated in others. Our main contribution is a novel algorithm for organizing subscriptions in a multi-level dissemination network in order to serve large numbers of subscriptions, respect capacity constraints, and minimize latency. The efficiency and effectiveness of our algorithm are confirmed through extensive experiments and a large deployment in a WAN. Computer Science/Other XML Web data Materialized views Query optimization View selection Publish/subscribe Data management
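
Several of the abstracts above (entry 9 in particular) refer to greedy selection of views to materialize. The following is a minimal sketch of that kind of greedy heuristic, not code from any of the listed theses. It assumes a simple linear cost model in which answering a query costs as many rows as the smallest materialized view able to answer it, and it assumes the finest-grained view is always materialized. The names `greedy_select`, `sizes`, and `k` are hypothetical and chosen for illustration only.

```python
def greedy_select(sizes, k):
    """Pick up to k views to materialize, greedily by total benefit.

    sizes maps each view (a frozenset of grouping dimensions) to its
    estimated row count. A view v can answer a view w when w <= v,
    i.e. w groups by a subset of v's dimensions. (Illustrative model.)
    """
    # Assumption: the view with the most dimensions is the base view
    # and is always available.
    top = max(sizes, key=len)
    # cost[w] = rows scanned to answer w from the cheapest materialized view
    cost = {w: sizes[top] for w in sizes}
    chosen = []
    for _ in range(k):
        best, best_gain = None, 0
        for v in sizes:
            if v == top or v in chosen:
                continue
            # Benefit of v: rows saved over every view that v can answer
            gain = sum(max(0, cost[w] - sizes[v]) for w in sizes if w <= v)
            if gain > best_gain:
                best, best_gain = v, gain
        if best is None:  # no remaining view improves any query
            break
        chosen.append(best)
        for w in sizes:
            if w <= best:
                cost[w] = min(cost[w], sizes[best])
    return chosen


# Toy lattice over two dimensions "a" and "b" (sizes are made up)
example_sizes = {
    frozenset({"a", "b"}): 100,
    frozenset({"a"}): 50,
    frozenset({"b"}): 90,
    frozenset(): 1,
}
print(greedy_select(example_sizes, 2))
```

With these made-up sizes, the first pick is the view on `a` alone, because it halves the cost of two queries. The selection is a heuristic: as entry 9 notes, it is likely but not guaranteed to give the best performance gain.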
