  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Designing Conventional, Spatial, and Temporal Data Warehouses: Concepts and Methodological Framework

Malinowski Gajda, Elzbieta 02 October 2006
Decision support systems are interactive, computer-based information systems that provide data and analysis tools to assist managers at different levels of an organization in decision making. Data warehouses (DWs) have been developed and deployed as an integral part of decision support systems. A data warehouse is a database that stores high volumes of historical data required for analytical purposes. This data is extracted from operational databases, transformed into a coherent whole, and loaded into a DW during the extraction-transformation-loading (ETL) process. DW data can be dynamically manipulated using on-line analytical processing (OLAP) systems. DW and OLAP systems rely on a multidimensional model that includes measures, dimensions, and hierarchies. Measures are usually numeric additive values used for quantitative evaluation of different aspects of an organization. Dimensions provide different analysis perspectives, while hierarchies allow measures to be analyzed at different levels of detail. Nevertheless, designers as well as users currently find it difficult to specify the multidimensional elements required for analysis. One reason is the lack of conceptual models for DW and OLAP system design, which would allow data requirements to be expressed at an abstract level without considering implementation details. Another problem is that many kinds of complex hierarchies arising in real-world situations are not addressed by current DW and OLAP systems. To help designers build conceptual models for decision-support systems and to help users better understand the data to be analyzed, in this thesis we propose the MultiDimER model, a conceptual model for representing multidimensional data for DW and OLAP applications.
Our model is mainly based on existing ER constructs, for example, entity types, attributes, and relationship types with their usual semantics, which allow the common concepts of dimensions, hierarchies, and measures to be represented. It also includes a conceptual classification of the different kinds of hierarchies that exist in real-world situations and proposes graphical notations for them. In addition, users of DW and OLAP systems now also demand the inclusion of spatial data. The advantage of using spatial data in the analysis process is widely recognized, since its visualization reveals patterns that are difficult to discover otherwise. However, although DWs typically include a spatial or location dimension, this dimension is usually represented in an alphanumeric format. Furthermore, there is still no systematic study that analyzes the inclusion and management of hierarchies and measures represented using spatial data. To satisfy the growing requirements of decision-making users, we extend the MultiDimER model so that spatial data can be included in the different elements of the multidimensional model. The novelty of our contribution lies in the fact that a multidimensional model is seldom used for representing spatial data. To succeed with our proposal, we applied research results from the field of spatial databases to the specific features of a multidimensional model. The spatial extension of a multidimensional model raises several issues that we address in this thesis, such as the influence of the topological relationships between spatial objects forming a hierarchy on the procedures required for measure aggregation, the aggregation of spatial measures, and the inclusion of spatial measures without the presence of spatial dimensions, among others.
Moreover, one important characteristic of multidimensional models is the presence of a time dimension for keeping track of changes in measures. However, this dimension cannot be used to model changes in other dimensions. Usual multidimensional models are therefore not symmetric in the way they represent changes for measures and dimensions. Further, there is still no analysis indicating which concepts already developed for temporal support in conventional databases can be applied to, and be useful for, the different elements of a multidimensional model. To handle temporal changes to all elements of a multidimensional model in a similar manner, we introduce a temporal extension of the MultiDimER model. This extension is based on research in the area of temporal databases, which have been successfully used for modeling time-varying information for several decades. We propose the inclusion of different temporal types, such as valid time and transaction time, which are obtained from source systems, in addition to the DW loading time generated in DWs. We use this temporal support for a conceptual representation of time-varying dimensions, hierarchies, and measures. We also address specific constraints that should be imposed on time-varying hierarchies and the problem of handling multiple time granularities between source systems and DWs. Furthermore, the design of DWs is not an easy task. It requires considering all phases from requirements specification to the final implementation, including the ETL process. It should also take into account that the inclusion of different data items in a DW depends on both users' needs and data availability in source systems. However, designers currently must rely on their experience because no methodological framework considers the above-mentioned aspects.
To assist developers during the DW design process, we propose a methodology for the design of conventional, spatial, and temporal DWs. We cover the different phases: requirements specification and conceptual, logical, and physical modeling. We include three different methods for requirements specification, depending on whether users, operational data sources, or both are the driving force in requirements gathering. We show how each method leads to the creation of a conceptual multidimensional model. We also present logical and physical design phases that address DW structures and the ETL process. To ensure the correctness of the proposed conceptual models, i.e., with conventional data, with spatial data, and with time-varying data, we formally define them by providing their syntax and semantics. To assess the usability of our conceptual model, including the representation of different kinds of hierarchies as well as spatial and temporal support, we present real-world examples. So that the proposed conceptual solutions can be implemented, we include their logical representations using relational and object-relational databases.
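The abstract above describes measures as additive values that hierarchies let an analyst view at different levels of detail. A minimal Python sketch of that roll-up idea follows. It is not the MultiDimER model itself: the store, city, and region names, the `roll_up` helper, and the sample figures are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical hierarchy for a Store dimension: store -> city -> region.
STORE_TO_CITY = {"s1": "Ghent", "s2": "Ghent", "s3": "Liege"}
CITY_TO_REGION = {"Ghent": "Flanders", "Liege": "Wallonia"}

# Hypothetical fact rows: (store, month, sales amount). The amount is an additive measure.
FACTS = [
    ("s1", "2006-01", 100.0),
    ("s2", "2006-01", 50.0),
    ("s3", "2006-01", 70.0),
    ("s1", "2006-02", 30.0),
]


def roll_up(facts, child_to_parent):
    """Aggregate an additive measure from a child level to its parent level."""
    totals = defaultdict(float)
    for member, month, amount in facts:
        totals[(child_to_parent[member], month)] += amount
    # Sort the keys so the output order is deterministic.
    return [(parent, month, totals[(parent, month)]) for parent, month in sorted(totals)]


by_city = roll_up(FACTS, STORE_TO_CITY)        # store level -> city level
by_region = roll_up(by_city, CITY_TO_REGION)   # city level -> region level
```

Because the measure is additive, rolling up in two steps (store to city, then city to region) gives the same totals as aggregating directly from stores to regions; that property does not hold for non-additive measures such as averages.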
2

The inner and inter construct associations of the quality of data warehouse customer relationship data for problem enactment

Abril, Raul Mario January 2005
The literature identifies perceptions of data quality as a key factor influencing a wide range of attitudes and behaviors related to data in organizational settings (e.g. decision confidence). In particular, there is an overwhelming consensus that effective customer relationship management (CRM) depends on the quality of customer data. Data warehouses, if properly implemented, enable data integration, which is a key attribute of data quality. The literature highlights the relevance of formulating problem statements because this determines the course of action. CRM managers formulate problem statements through a cognitive process known as enactment. The literature on data quality is very fragmented. It posits that this construct is of a high-order nature (it is dimensional), that it is contextual and situational, and that it is closely linked to a utilitarian value. This study addresses all these disparate views of the nature of data quality from a holistic perspective. Social cognitive theory (SCT) is the backbone for studying data quality in terms of information search behavior and enhancements in formulating problem statements. The main objective of this study is to explore the nature of a data warehouse's customer relationship data quality in situations where there is a need to understand a customer relationship problem. The research question is: what are the inner and inter construct associations of the quality of data warehouse customer relationship data for problem enactment? To reach this objective, a positivistic approach was adopted, complemented with qualitative interventions along the research process. Observations were gathered with a survey. Scales were adjusted using a construct-based approach. Research findings confirm that data quality is a high-order construct with a contextual dimension and a situational dimension. Problem sense making enhancements is a dependent variable of data quality, with a confirmed positive association between the two constructs.
Problem sense making enhancements is also a high order construct with a mastering experience dimension and a self-efficacy dimension. Behavioral patterns for information search mode (scanning mode orientation vs. focus mode orientation) and for information search heuristic (template heuristic orientation vs. trial-and-error heuristic orientation) have been identified. Focus is the predominant information search mode orientation and template is the predominant information search heuristic orientation. Overall, the research findings support the associations advocated by SCT. The self-efficacy dimension in problem sense making enhancements is a discriminant for information search mode orientation (focus mode orientation vs. scanning mode orientation). The contextual dimension in data quality (i.e. data task utility) is a discriminant for information search heuristic (template heuristic orientation vs. trial-and-error heuristic orientation). A data quality cognitive metamodel and a data quality for problem enactment model are suggested for research in the areas of data quality, information search behavior, and cognitive enhancements.
3

Επέκταση του OLAP μοντέλου με σημαντικά δίκτυα / Extending the OLAP model with semantic networks

Στραγαλινός, Ευάγγελος 19 May 2011
Now that the use of computers has spread from research organizations to businesses, computers have become an integral business tool. By the early 1990s, large private and public institutions held enormous quantities of data that needed to be exploited. This, combined with the now established view that information is the most valuable asset, created a need for applications that analyze and process large volumes of data. The solution came from data warehouse technologies and online analytical processing (OLAP). Recently, considerable research has been conducted on ontologies and semantic networks, where information is described conceptually so that it is easier to retrieve, use, and compare; this creates the need for a new, combined way of representing knowledge. This master's dissertation introduces the reader to semantic networks. The kinds of semantic networks are presented in detail, particularly those used for handling exceptions. An introductory description of database theory is given, with emphasis on multidimensional databases and data warehouses. OLAP tools and their basic concepts are described in detail. Finally, a new technique is presented that builds on methods for handling exceptions in a semantic network in order to extend an OLAP model so that it can handle exceptions. The proposed extension allows exceptions among the dimension values of a data hypercube, which play a significant role in drawing correct conclusions.
4

Does data warehouse end-user metadata add value?

Foshay, N.; Mukherjee, Avinandan; Taylor, W. Andrew January 2007
Many data warehouses are currently underutilized by managers and knowledge workers. Can high-quality end-user metadata help to increase levels of adoption and use?
5

Processamento de consultas SOLAP drill-across e com junção espacial em data warehouses geográficos / Processing of drill-across and spatial join SOLAP queries over geographic data warehouses

Jaqueline Joice Brito 28 November 2012
A geographic data warehouse (GDW) is a special kind of multidimensional database. It is subject-oriented, integrated, historical, non-volatile, and usually organized in levels of aggregation. Furthermore, a GDW stores spatial data in one or more dimensions or in at least one numerical measure. To support decision making, GDWs allow SOLAP (spatial online analytical processing) queries, i.e., multidimensional analytical queries (e.g., drill-down, roll-up, drill-across) extended with spatial predicates (e.g., intersects, contains, is contained) defined for range and spatial join queries. A challenge in processing these complex queries is how to efficiently retrieve spatial and conventional data stored in very large GDWs. In the literature, there are few access methods dedicated to indexing GDWs, and none of them focuses on drill-across and spatial join SOLAP queries. In this master's thesis, we propose novel strategies for processing these complex queries. We introduce two strategies for processing SOLAP drill-across queries (namely, Divide and Unique), define a set of guidelines for the design of a GDW schema that enables the execution of these queries, and determine a set of classes of these queries to be issued over a GDW schema that follows the proposed guidelines. For the processing of spatial join SOLAP queries, we propose the SJB strategy, identify the characteristics of a GDW schema that enables the execution of these queries, and define the format of these queries.
We validated the proposed strategies through performance tests under different configurations, comparing them with star join computation and the use of materialized views. The results showed that our strategies are very efficient. For SOLAP drill-across queries, the Divide and Unique strategies reduced query time by 82.7% to 98.6% with respect to star join computation and the use of materialized views. For SOLAP spatial join queries, the SJB strategy gave the best results for most of the analyzed queries, with performance gains ranging from 0.3% to 99.2% over star join computation and the use of materialized views.
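To make the notion of a drill-across query with a spatial range predicate more concrete, here is a minimal Python sketch. It does not implement the Divide, Unique, or SJB strategies from the thesis, whose details the abstract does not give. The two fact tables, the shared city dimension with point coordinates, and the helper names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical conformed dimension shared by both fact tables: city -> (x, y) point.
CITY_LOCATION = {"A": (1.0, 1.0), "B": (4.0, 2.0), "C": (9.0, 9.0)}

# Two hypothetical fact tables that share the city dimension.
SALES = [("A", 10.0), ("B", 5.0), ("C", 7.0), ("A", 2.0)]   # (city, revenue)
SHIPMENTS = [("A", 3.0), ("B", 1.0), ("C", 8.0)]            # (city, units shipped)


def in_window(point, window):
    """Spatial range predicate: is the point inside the axis-aligned query window?"""
    x, y = point
    xmin, ymin, xmax, ymax = window
    return xmin <= x <= xmax and ymin <= y <= ymax


def aggregate(facts, cities):
    """Sum the measure per city, keeping only cities that passed the spatial filter."""
    totals = defaultdict(float)
    for city, value in facts:
        if city in cities:
            totals[city] += value
    return totals


def drill_across(window):
    # 1. Evaluate the spatial predicate once on the shared dimension.
    cities = {c for c, point in CITY_LOCATION.items() if in_window(point, window)}
    # 2. Aggregate each fact table separately.
    revenue = aggregate(SALES, cities)
    shipped = aggregate(SHIPMENTS, cities)
    # 3. Combine the partial results on the shared dimension member.
    return {c: (revenue[c], shipped[c]) for c in sorted(cities)}


result = drill_across((0.0, 0.0, 5.0, 5.0))
```

The point of the sketch is only the query shape: measures from two fact tables are compared side by side through a dimension they share, and the spatial predicate restricts which dimension members take part.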
7

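Análise de desempenho de consultas OLAP espaçotemporais em função da ordem de processamento dos predicados convencional, espacial e temporal / Performance analysis of spatiotemporal OLAP queries as a function of the processing order of the conventional, spatial, and temporal predicates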

Joaquim Neto, Cesar 08 March 2016
By providing ever-growing processing capabilities, many database technologies have become important support tools for enterprises and institutions. The need to include and control new data types in existing database technologies has also brought new challenges and research areas, giving rise to spatial, temporal, and spatiotemporal databases. In addition, new analytical capabilities were required, leading to the birth of data warehouse technology and, once more, to the need to include spatial or temporal data (or both) in it, thus originating spatial, temporal, and spatiotemporal data warehouses. The queries used with each database type have also evolved, culminating in STOLAP (Spatio-Temporal OLAP) queries, which are composed of predicates over conventional, spatial, and temporal data and whose execution can be aided by specialized index structures. This work investigates how the execution of each predicate affects the performance of STOLAP queries by varying the indexes used, the predicate execution order, and the query selectivity.
Bitmap Join Indexes support the execution of conventional predicates and some portions of the temporal processing, which also relies on SQL queries in some of the alternatives studied. The SB-index and HSB-index support spatial processing, while the STB-index is used to process temporal and spatial predicates together. The expected result is an analysis of the best predicate order for running the queries, also taking their selectivity into account. Another contribution of this work is the evolution of the HSB-index into a hierarchical version called the HSTB-index, which complements the execution options. No funding was received for this work.
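The question studied here, how predicate order and selectivity affect query cost, can be illustrated with a small Python sketch. It is a toy under stated assumptions and not the SB-index, HSB-index, STB-index, or a real Bitmap Join Index: plain Python integers stand in for bitmaps, the five sample rows are invented, and cost is counted simply as the number of predicate evaluations.

```python
# Hypothetical fact rows: (product category, (x, y) location, year).
ROWS = [
    ("food", (1, 1), 2014),
    ("food", (8, 8), 2014),
    ("toys", (2, 2), 2015),
    ("food", (2, 3), 2015),
    ("toys", (9, 1), 2014),
]


def bitmap(predicate, candidates=None):
    """Return (bits, tests): bit i is set when row i satisfies the predicate.

    Only rows whose bit is set in `candidates` are tested, so an earlier,
    more selective predicate reduces the work left for later ones.
    """
    if candidates is None:
        candidates = (1 << len(ROWS)) - 1  # all rows are candidates
    bits, tests = 0, 0
    for i, row in enumerate(ROWS):
        if (candidates >> i) & 1:
            tests += 1
            if predicate(row):
                bits |= 1 << i
    return bits, tests


def conventional(row):
    return row[0] == "food"


def spatial(row):
    x, y = row[1]
    return 0 <= x <= 5 and 0 <= y <= 5


def temporal(row):
    return row[2] == 2015


def run(order):
    """Apply the predicates in the given order; return matching row ids and total tests."""
    candidates, total_tests = None, 0
    for predicate in order:
        candidates, tests = bitmap(predicate, candidates)
        total_tests += tests
    matching = [i for i in range(len(ROWS)) if (candidates >> i) & 1]
    return matching, total_tests


selective_first = run((temporal, conventional, spatial))
selective_last = run((conventional, spatial, temporal))
```

Both orders return the same rows, but in this invented data set the temporal predicate is the most selective, so running it first leaves fewer candidates for the remaining predicates and lowers the total number of evaluations.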
8

Evaluation of view maintenance with complex joins in a data warehouse environment

Asthorsson, Kjartan January 2002
Data warehouse maintenance and maintenance cost have been well studied in the literature. Integrating data sources in a data warehouse environment often requires data cleaning, transformation, or some other function to be applied to the data. However, the impact on view maintenance when data is integrated with comparison operators other than those defined in a theta join has not been closely examined in previous studies.
In this study, the impact of using a complex join in a data warehouse environment is analyzed to measure how different maintenance strategies are affected when data needs to be integrated using comparison operators other than those defined in a theta join. The analysis shows that maintenance cost increases greatly when complex joins are used, since such joins often lack the optimization techniques that are available for a theta join. The study shows, among other things, that the join-aware capability of sources is not important when performing complex joins, and that incremental view maintenance is a better approach than recomputed view maintenance when complex joins are used. Strategies for maintaining data warehouses when data is integrated using a complex join therefore differ from those used with a theta join, and different maintenance strategies need to be applied.
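The contrast between recomputed and incremental view maintenance can be sketched in a few lines of Python. This is an illustration under assumptions, not the evaluation setup of the thesis: the `normalize` cleaning function stands in for an arbitrary matching function in a complex join, and the source contents are invented.

```python
def normalize(name):
    """Hypothetical cleaning function; the join compares cleaned values, not raw ones."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def complex_join(left, right):
    """Nested-loop join; for an arbitrary matching function no index shortcut is assumed."""
    return {(a, b) for a in left for b in right if normalize(a) == normalize(b)}


# Hypothetical contents of two sources feeding one materialized view.
source_a = {"Acme Corp.", "Globex"}
source_b = {"ACME corp", "Initech"}
view = complex_join(source_a, source_b)

# New rows arrive at source B.
delta_b = {"globex", "Umbrella"}

# Recomputation: join everything again from scratch.
recomputed = complex_join(source_a, source_b | delta_b)

# Incremental maintenance: join only the new rows and add the result to the old view.
incremental = view | complex_join(source_a, delta_b)
```

Both approaches produce the same view after an insertion, but the incremental one compares source A only against the delta rather than against all of source B, which matters more as the join condition becomes more expensive to evaluate.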
10

Formal Assessment and Measurement of Data Utilization and Value for Mines

Rogers, William Pratt January 2015
Most large contemporary mines already have considerable amounts of data, much of which goes largely unused. The key challenge in big data is increasing data utilization. Much of the data in the mine (not the plant) comes from a variety of systems, each with different databases and reporting environments. Standard technology deployments create a "silo-ification" of data, leading to poor system usage. Through modern server monitoring, data utilization can be measured quantifiably. A host of other quantifiable, often automated, approaches to measuring data use and value can also be incorporated as a means of monitoring value generation. A data valuation tool is presented to measure the data assets at an operation. The Data Value Index (DVI) quantifies business intelligence best practices and user interaction, considering managerial flexibility and data utilization rates. The DVI is built on many case studies of data warehousing at various mining companies, some of which will be presented.
