21

Shrinked Data Marts Enabled for Negative Caching

Lehner, Wolfgang, Thiele, Maik 15 June 2022 (has links)
Data marts storing pre-aggregated data, prepared for further roll-ups, play an essential role in data warehouse environments and lead to significant performance gains in query evaluation. However, to ensure the completeness of query results on the data mart without accessing the underlying data warehouse, null values need to be stored explicitly; this process is denoted as negative caching. Such null values typically occur in multidimensional data sets, which are naturally very sparse. To our knowledge, there is no work on shrinking the null tuples of a multidimensional data set within ROLAP. For these tuples, we propose a lossless compression technique that leads to a dramatic reduction in the size of the data mart. Queries depending on null-value information can be answered with 100% precision by partially inflating the shrunken data mart. We complement our analytical approach with an experimental evaluation on real and synthetic data sets, and demonstrate our results.
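To make the negative-caching idea above concrete, here is a minimal sketch (in Python, with illustrative names not taken from the thesis) of a data mart cache that stores explicit null entries, so that empty cells can be answered locally while only genuine cache misses fall back to the warehouse.

```python
# Minimal illustration of negative caching in a pre-aggregated data mart:
# an explicit NULL entry ("negative cache") lets the mart answer "no data"
# without falling back to the underlying data warehouse.
_NULL = object()  # sentinel marking a cell known to be empty

class DataMartCache:
    def __init__(self, warehouse_lookup):
        self._cells = {}                       # (dim1, dim2, ...) -> aggregate or _NULL
        self._warehouse_lookup = warehouse_lookup

    def load(self, key, aggregate):
        """Store a pre-aggregated value, or an explicit null for an empty cell."""
        self._cells[key] = _NULL if aggregate is None else aggregate

    def query(self, key):
        if key in self._cells:
            value = self._cells[key]
            return None if value is _NULL else value   # answered locally, even if empty
        # cache miss: only now do we pay for a warehouse round trip
        return self._warehouse_lookup(key)

# usage: empty cells are answered from the mart, unknown cells go to the warehouse
mart = DataMartCache(warehouse_lookup=lambda key: 0.0)
mart.load(("2022", "EU"), 1250.0)
mart.load(("2022", "ASIA"), None)        # negative cache entry
print(mart.query(("2022", "EU")))        # 1250.0
print(mart.query(("2022", "ASIA")))      # None, without any warehouse access
```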
22

The Design of Vague Spatial Data Warehouses

Lopes Siqueira, Thiago Luis 07 December 2015 (has links) (PDF)
Spatial data warehouses (SDW) and spatial online analytical processing (SOLAP) enhance decision making by enabling spatial analysis combined with multidimensional analytical queries. A SDW is an integrated and voluminous multidimensional database containing both conventional and spatial data. SOLAP allows querying SDWs with multidimensional queries that select spatial data that satisfy a given topological relationship and that aggregate spatial data. Existing SDW and SOLAP applications mostly consider phenomena represented by spatial data having exact locations and sharp boundaries. They neglect the fact that spatial data may be affected by imperfections, such as spatial vagueness, which prevents distinguishing an object from its neighborhood. A vague spatial object does not have a precisely defined boundary and/or interior. Thus, it may have a broad boundary and a blurred interior, and is composed of parts that certainly belong to it and parts that possibly belong to it. Although several real-world phenomena are characterized by spatial vagueness, no approach in the literature addresses both spatial vagueness and the design of SDWs nor provides multidimensional analysis over vague spatial data. These shortcomings motivated the elaboration of this doctoral thesis, which addresses both vague spatial data warehouses (vague SDWs) and vague spatial online analytical processing (vague SOLAP). A vague SDW is a SDW that comprises vague spatial data, while vague SOLAP allows querying vague SDWs. The major contributions of this doctoral thesis are: (i) the Vague Spatial Cube (VSCube) conceptual model, which enables the creation of conceptual schemata for vague SDWs using data cubes; (ii) the Vague Spatial MultiDim (VSMultiDim) conceptual model, which enables the creation of conceptual schemata for vague SDWs using diagrams; (iii) guidelines for designing relational schemata and integrity constraints for vague SDWs, and for extending the SQL language to enable vague SOLAP; (iv) the Vague Spatial Bitmap Index (VSB-index), which improves the performance to process queries against vague SDWs. The applicability of these contributions is demonstrated in two applications of the agricultural domain, by creating conceptual schemata for vague SDWs, transforming these conceptual schemata into logical schemata for vague SDWs, and efficiently processing queries over vague SDWs. / Les entrepôts de données spatiales (EDS) et l'analyse en ligne spatiale (ALS) améliorent la prise de décision en permettant l'analyse spatiale combinée avec des requêtes analytiques multidimensionnelles. Un EDS est une base de données multidimensionnelle intégrée et volumineuse qui contient des données classiques et des données spatiales. L'ALS permet l'interrogation des EDS avec des requêtes multidimensionnelles qui sélectionnent des données spatiales qui satisfont une relation topologique donnée et qui agrègent les données spatiales. Les EDS et l'ALS considèrent essentiellement des phénomènes représentés par des données spatiales ayant une localisation exacte et des frontières précises. Ils négligent que les données spatiales peuvent être affectées par des imperfections, comme l'imprécision spatiale, ce qui empêche de distinguer précisément un objet de son entourage. Un objet spatial vague n'a pas de frontière et/ou un intérieur précisément définis. Ainsi, il peut avoir une frontière large et un intérieur flou, et est composé de parties qui lui appartiennent certainement et des parties qui lui appartiennent éventuellement. 
Bien que plusieurs phénomènes du monde réel sont caractérisés par l'imprécision spatiale, il n'y a pas dans la littérature des approches qui adressent en même temps l'imprécision spatiale et la conception d'EDS ni qui fournissent une analyse multidimensionnelle des données spatiales vagues. Ces lacunes ont motivé l'élaboration de cette thèse de doctorat, qui adresse à la fois les entrepôts de données spatiales vagues (EDS vagues) et l'analyse en ligne spatiale vague (ALS vague). Un EDS vague est un EDS qui comprend des données spatiales vagues, tandis que l'ALS vague permet d'interroger des EDS vagues. Les contributions majeures de cette thèse de doctorat sont: (i) le modèle conceptuel Vague Spatial Cube (VSCube), qui permet la création de schémas conceptuels pour des EDS vagues à l'aide de cubes de données; (ii) le modèle conceptuel Vague Spatial MultiDim (VSMultiDim), qui permet la création de schémas conceptuels pour des EDS vagues à l'aide de diagrammes; (iii) des directives pour la conception de schémas relationnels et des contraintes d'intégrité pour des EDS vagues, et pour l'extension du langage SQL pour permettre l'ALS vague; (iv) l'indice Vague Spatial Bitmap (VSB-index) qui améliore la performance pour traiter les requêtes adressées à des EDS vagues. L'applicabilité de ces contributions est démontrée dans deux applications dans le domaine agricole, en créant des schémas conceptuels des EDS vagues, la transformation de ces schémas conceptuels en schémas logiques pour des EDS vagues, et le traitement efficace des requêtes sur des EDS vagues. / O data warehouse espacial (DWE) é um banco de dados multidimensional integrado e volumoso que armazena dados espaciais e dados convencionais. Já o processamento analítico-espacial online (SOLAP) permite consultar o DWE, tanto pela seleção de dados espaciais que satisfazem um relacionamento topológico, quanto pela agregação dos dados espaciais. Deste modo, DWE e SOLAP beneficiam o suporte a tomada de decisão. As aplicações de DWE e SOLAP abordam majoritarimente fenômenos representados por dados espaciais exatos, ou seja, que assumem localizações e fronteiras bem definidas. Contudo, tais aplicações negligenciam dados espaciais afetados por imperfeições, tais como a vagueza espacial, a qual interfere na identificação precisa de um objeto e de seus vizinhos. Um objeto espacial vago não tem sua fronteira ou seu interior precisamente definidos. Além disso, é composto por partes que certamente pertencem a ele e partes que possivelmente pertencem a ele. Apesar de inúmeros fenômenos do mundo real serem caracterizados pela vagueza espacial, na literatura consultada não se identificaram trabalhos que considerassem a vagueza espacial no projeto de DWE e nem para consultar o DWE. Tal limitação motivou a elaboração desta tese de doutorado, a qual introduz os conceitos de DWE vago e de SOLAP vago. Um DWE vago é um DWE que armazena dados espaciais vagos, enquanto que SOLAP vago provê os meios para consultar o DWE vago. 
Nesta tese, o projeto de DWE vago é abordado e as principais contribuições providas são: (i) o modelo conceitual VSCube que viabiliza a criação de um cubos de dados multidimensional para representar o esquema conceitual de um DWE vago; (ii) o modelo conceitual VSMultiDim que permite criar um diagrama para representar o esquema conceitual de um DWE vago; (iii) diretrizes para o projeto lógico do DWE vago e de suas restrições de integridade, e para estender a linguagem SQL visando processar as consultas de SOLAP vago no DWE vago; e (iv) o índice VSB-index que aprimora o desempenho do processamento de consultas no DWE vago. A aplicabilidade dessas contribuições é demonstrada em dois estudos de caso no domínio da agricultura, por meio da criação de esquemas conceituais de DWE vago, da transformação dos esquemas conceituais em esquemas lógicos de DWE vago, e do processamento de consultas envolvendo as regiões vagas do DWE vago. / Doctorat en Sciences de l'ingénieur et technologie / Location of the public defense: Universidade Federal de São Carlos, São Carlos, SP, Brazil. / info:eu-repo/semantics/nonPublished
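As a rough illustration of the vague spatial objects discussed above, the following sketch models a vague region by its certainly-belonging and possibly-belonging parts over a discrete grid; the class and method names are invented for this example and do not reproduce the VSCube or VSMultiDim constructs.

```python
# Toy representation of a vague spatial object as two sets of grid cells:
# cells that certainly belong to it and cells that only possibly belong to it.
from dataclasses import dataclass, field

@dataclass
class VagueRegion:
    certain: set = field(default_factory=set)    # e.g. {(x, y), ...}
    possible: set = field(default_factory=set)   # broad-boundary cells

    def certainly_intersects(self, window: set) -> bool:
        return bool(self.certain & window)

    def possibly_intersects(self, window: set) -> bool:
        return bool((self.certain | self.possible) & window)

# usage: a topological selection can qualify its answer as certain or possible
pest_area = VagueRegion(certain={(1, 1), (1, 2)}, possible={(2, 2), (2, 3)})
crop_field = {(2, 2), (3, 3)}
print(pest_area.certainly_intersects(crop_field))  # False
print(pest_area.possibly_intersects(crop_field))   # True
```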
23

Intégration holistique et entreposage automatique des données ouvertes / Holistic integration and automatic warehousing of open data

Megdiche Bousarsar, Imen 10 December 2015 (has links)
Les statistiques présentes dans les Open Data ou données ouvertes constituent des informations utiles pour alimenter un système décisionnel. Leur intégration et leur entreposage au sein du système décisionnel se fait à travers des processus ETL. Il faut automatiser ces processus afin de faciliter leur accessibilité à des non-experts. Ces processus doivent pallier aux problèmes de manque de schémas, d'hétérogénéité structurelle et sémantique qui caractérisent les données ouvertes. Afin de répondre à ces problématiques, nous proposons une nouvelle démarche ETL basée sur les graphes. Pour l'extraction du graphe d'un tableau, nous proposons des activités de détection et d'annotation automatiques. Pour la transformation, nous proposons un programme linéaire pour résoudre le problème d'appariement holistique de données structurelles provenant de plusieurs graphes. Ce modèle fournit une solution optimale et unique. Pour le chargement, nous proposons un processus progressif pour la définition du schéma multidimensionnel et l'augmentation du graphe intégré. Enfin, nous présentons un prototype et les résultats d'expérimentations. / Statistical Open Data provide useful information for feeding a decision-support system. Their integration and storage within such systems is achieved through ETL processes, which need to be automated in order to make them accessible to non-experts. These processes also need to cope with the lack of schemas and with the structural and semantic heterogeneity that characterize Open Data. To meet these issues, we propose a new graph-based ETL approach. For the extraction, we propose automatic detection and annotation activities based on a table model. For the transformation, we propose a linear program that performs the holistic matching of several graphs; this model supplies an optimal and unique solution. For the loading, we propose a progressive process for the definition of the multidimensional schema and the augmentation of the integrated graph. Finally, we present a prototype and experimental evaluations.
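The sketch below is only a toy stand-in for the matching step: it pairs column labels from two table graphs by string similarity with a greedy 1:1 assignment, whereas the thesis formulates a holistic linear program over several graphs at once; all names and data here are illustrative.

```python
# Simplified pairwise label matching between two table graphs, as a stand-in
# for the holistic linear-programming formulation described in the abstract.
from difflib import SequenceMatcher

def match_labels(labels_a, labels_b, threshold=0.6):
    """Greedy 1:1 matching of node labels by string similarity."""
    candidates = sorted(
        ((SequenceMatcher(None, a.lower(), b.lower()).ratio(), a, b)
         for a in labels_a for b in labels_b),
        reverse=True,
    )
    matched_a, matched_b, pairs = set(), set(), []
    for score, a, b in candidates:
        if score >= threshold and a not in matched_a and b not in matched_b:
            pairs.append((a, b, round(score, 2)))
            matched_a.add(a)
            matched_b.add(b)
    return pairs

# usage with two column-header sets from hypothetical open-data tables
print(match_labels(["Country", "Year", "Unemployment rate"],
                   ["country_name", "year", "unemployment_pct"]))
```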
24

Entrepôts de données NoSQL orientés colonnes dans un environnement cloud / Columnar NoSQL data warehouses in the cloud environment.

Dehdouh, Khaled 05 November 2015 (has links)
Le travail présenté dans cette thèse vise à proposer des approches pour construire et développer des entrepôts de données selon le modèle NoSQL orienté colonnes. L'intérêt porté aux modèles NoSQL est motivé d'une part, par l'avènement des données massives et d'autre part, par l'incapacité du modèle relationnel, habituellement utilisés pour implémenter les entrepôts de données, à permettre le passage à très grande échelle. En effet, les différentes modèles NoSQL sont devenus des standards dans le stockage et la gestion des données massives. Ils ont été conçus à l'origine pour construire des bases de données dont le modèle de stockage est le modèle « clé/valeur ». D'autres modèles sont alors apparus pour tenir compte de la variabilité des données : modèles orienté colonne, orienté document et orienté graphe. Pour développer des entrepôts de données massives, notre choix s'est porté sur le modèle NoSQL orienté colonnes car il apparaît comme étant le plus approprié aux traitements des requêtes décisionnelles qui sont définies en fonction d'un ensemble de colonnes (mesures et dimensions) issues de l'entrepôt. Cependant, le modèle NoSQL en colonnes ne propose pas d'opérateurs de type analyse en ligne (OLAP) afin d'exploiter les entrepôts de données.Nous présentons dans cette thèse des solutions innovantes sur la modélisation logique et physique des entrepôts de données NoSQL en colonnes. Nous avons proposé une approche de construction des cubes de données qui prend compte des spécificités de l'environnement du stockage orienté colonnes. Par ailleurs, afin d'exploiter les entrepôts de données en colonnes, nous avons défini des opérateurs d'agrégation permettant de créer des cubes OLAP. Nous avons proposé l'opérateur C-CUBE (Columnar-Cube) permettant de construire des cubes OLAP stockés en colonnes dans un environnement relationnel en utilisant la jointure invisible. MC-CUBE (MapReduce Columnar-Cube) pour construire des cubes OLAP stockés en colonnes dans un environnement distribué exploitant la jointure invisible et le paradigme MapReduce pour paralléliser les traitements. Et enfin, nous avons développé l'opérateur CN-CUBE (Columnar-NoSQL Cube) qui tient compte des faits et des dimensions qui sont groupés dans une même table lors de la génération de cubes à partir d'un entrepôt dénormalisé selon un certain modèle logique. Nous avons réalisé une étude de performance des modèles de données dimensionnels NoSQL et de nos opérateurs OLAP. Nous avons donc proposé un index de jointure en étoile adapté aux entrepôts de données NoSQL orientés colonnes, baptisé C-SJI (Columnar-Star Join Index). Pour évaluer nos propositions, nous avons défini un modèle de coût pour mesurer l'impact de l'apport de cet index. D'autre part, nous avons proposé un modèle logique baptisé FLM (Flat Logical Model) pour implémenter des entrepôts de données NoSQL orientés colonnes et de permettre une meilleure prise en charge par les SGBD NoSQL de cette famille.Pour valider nos différentes contributions, nous avons développé une plate-forme logicielle CG-CDW (Cube Generation for Columnar Data Warehouses) qui permet de générer des cubes OLAP à partir d'entrepôts de données en colonnes. 
Pour terminer et afin d'évaluer nos contributions, nous avons tout d'abord développé un banc d'essai décisionnel NoSQL en colonnes (CNSSB : Columnar NoSQL Star Schema Benchmark) basé sur le banc d'essai SSB (Star Schema Benchmark), puis, nous avons procédé à plusieurs tests qui ont permis de montrer l'efficacité des différents opérateurs d'agrégation que nous avons proposé. / The work presented in this thesis aims at proposing approaches for building data warehouses with the columnar NoSQL model. The use of NoSQL models is motivated by the advent of big data and by the inability of the relational model, usually used to implement data warehouses, to scale to very large volumes. Indeed, NoSQL models are suitable for storing and managing massive data. They were originally designed to build databases whose storage model is "key/value"; other models then appeared to account for the variability of the data: column-oriented, document-oriented and graph-oriented. We have chosen the column-oriented NoSQL model for building massive data warehouses because it is the most suitable for decisional queries, which are defined over a set of columns (measures and dimensions) of the warehouse. However, columnar NoSQL models do not offer online analytical processing (OLAP) operators for exploiting the data warehouse. We present in this thesis new solutions for the logical and physical modeling of columnar NoSQL data warehouses. We have proposed an approach that builds data cubes by taking the characteristics of the columnar storage environment into account, and we have defined new cube operators for building columnar cubes: C-CUBE (Columnar-CUBE) for columnar relational data warehouses; MC-CUBE (MapReduce Columnar-CUBE) for columnar NoSQL data warehouses where measures and dimensions are stored in different tables; and CN-CUBE (Columnar NoSQL-CUBE) where measures and dimensions are gathered in the same table according to a new logical model that we proposed. We have studied the performance of NoSQL dimensional data models and of our OLAP operators, and we have proposed a new star-join index, C-SJI (Columnar-Star Join Index), suitable for columnar NoSQL data warehouses that store measures and dimensions separately. To evaluate this contribution, we have defined a cost model to measure the impact of using this index. Furthermore, we have proposed a logical model called FLM (Flat Logical Model) to implement columnar NoSQL data warehouses and enable better support by the NoSQL DBMSs of this family. To validate our contributions, we have developed a software platform, CG-CDW (Cube Generation for Columnar Data Warehouses), that generates OLAP cubes from columnar data warehouses. We have also developed a columnar NoSQL decision-support benchmark, CNSSB (Columnar NoSQL Star Schema Benchmark), based on the SSB, and finally we conducted several tests that showed the effectiveness of the aggregation operators we proposed.
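To illustrate the kind of computation the columnar cube operators perform, here is a toy group-by aggregation over a column-oriented layout in Python; it is a simplification for illustration only, not the C-CUBE, MC-CUBE or CN-CUBE operators themselves, and the data are invented.

```python
# Toy aggregation over a column-oriented layout: each attribute is stored as
# its own array, and a cube cell is produced by grouping on dimension columns
# and summing a measure column (the general idea behind columnar cube operators).
from collections import defaultdict

columns = {
    "year":    [2014, 2014, 2015, 2015, 2015],
    "region":  ["EU", "EU", "EU", "ASIA", "ASIA"],
    "revenue": [10.0, 5.0, 7.0, 3.0, 4.0],
}

def columnar_group_sum(cols, dims, measure):
    acc = defaultdict(float)
    for i in range(len(cols[measure])):           # scan only the needed columns
        key = tuple(cols[d][i] for d in dims)
        acc[key] += cols[measure][i]
    return dict(acc)

# usage: one face of the cube, grouped by (year, region)
print(columnar_group_sum(columns, ["year", "region"], "revenue"))
# {(2014, 'EU'): 15.0, (2015, 'EU'): 7.0, (2015, 'ASIA'): 7.0}
```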
25

Designing conventional, spatial, and temporal data warehouses: concepts and methodological framework

Malinowski Gajda, Elzbieta 02 October 2006 (has links)
Decision support systems are interactive, computer-based information systems that provide data and analysis tools in order to better assist managers at different levels of an organization in the process of decision making. Data warehouses (DWs) have been developed and deployed as an integral part of decision support systems.

A data warehouse is a database that stores the high volumes of historical data required for analytical purposes. This data is extracted from operational databases, transformed into a coherent whole, and loaded into a DW during the extraction-transformation-loading (ETL) process.

DW data can be dynamically manipulated using on-line analytical processing (OLAP) systems. DW and OLAP systems rely on a multidimensional model that includes measures, dimensions, and hierarchies. Measures are usually numeric additive values that are used for the quantitative evaluation of different aspects of an organization. Dimensions provide different analysis perspectives, while hierarchies allow measures to be analyzed at different levels of detail.

Nevertheless, designers as well as users currently find it difficult to specify the multidimensional elements required for analysis. One reason is the lack of conceptual models for DW and OLAP system design, which would allow data requirements to be expressed at an abstract level without considering implementation details. Another problem is that many kinds of complex hierarchies arising in real-world situations are not addressed by current DW and OLAP systems.

In order to help designers build conceptual models for decision-support systems and to help users better understand the data to be analyzed, in this thesis we propose the MultiDimER model: a conceptual model for representing multidimensional data for DW and OLAP applications. Our model is mainly based on existing ER constructs, for example entity types, attributes, and relationship types with their usual semantics, allowing the common concepts of dimensions, hierarchies, and measures to be represented. It also includes a conceptual classification of the different kinds of hierarchies existing in real-world situations and proposes graphical notations for them.

On the other hand, users of DW and OLAP systems now also demand the inclusion of spatial data; its advantage in the analysis process is widely recognized, since its visualization allows patterns to be revealed that are difficult to discover otherwise.

However, although DWs typically include a spatial or location dimension, this dimension is usually represented in an alphanumeric format. Furthermore, there is still a lack of systematic study analyzing the inclusion as well as the management of hierarchies and measures that are represented using spatial data.

With the aim of satisfying the growing requirements of decision-making users, we extend the MultiDimER model by allowing spatial data to be included in the different elements composing the multidimensional model. The novelty of our contribution lies in the fact that a multidimensional model is seldom used for representing spatial data. To succeed with our proposal, we applied research achievements in the field of spatial databases to the specific features of a multidimensional model.
The spatial extension of a multidimensional model raises several issues, to which we refer in this thesis, such as the influence of the different topological relationships between spatial objects forming a hierarchy on the procedures required for measure aggregation, the aggregation of spatial measures, and the inclusion of spatial measures without the presence of spatial dimensions, among others.

Moreover, one of the important characteristics of multidimensional models is the presence of a time dimension for keeping track of changes in measures. However, this dimension cannot be used to model changes in other dimensions. Usual multidimensional models are therefore not symmetric in the way they represent changes for measures and dimensions. Further, there is still a lack of analysis indicating which concepts already developed for providing temporal support in conventional databases can be applied and be useful for the different elements composing a multidimensional model.

In order to handle temporal changes to all elements of a multidimensional model in a similar manner, we introduce a temporal extension of the MultiDimER model. This extension is based on research in the area of temporal databases, which have been successfully used for modeling time-varying information for several decades. We propose the inclusion of different temporal types, such as valid and transaction time, which are obtained from source systems, in addition to the DW loading time generated in DWs. We use this temporal support for a conceptual representation of time-varying dimensions, hierarchies, and measures. We also refer to specific constraints that should be imposed on time-varying hierarchies and to the problem of handling multiple time granularities between source systems and DWs.

Furthermore, the design of DWs is not an easy task. It requires considering all phases from requirements specification to the final implementation, including the ETL process. It should also take into account that the inclusion of different data items in a DW depends on both users' needs and data availability in source systems. However, designers currently must rely on their experience, due to the lack of a methodological framework that considers the above-mentioned aspects.

In order to assist developers during the DW design process, we propose a methodology for the design of conventional, spatial, and temporal DWs. We refer to the different phases of requirements specification and conceptual, logical, and physical modeling. We include three different methods for requirements specification, depending on whether users, operational data sources, or both are the driving force in the process of requirements gathering, and we show how each method leads to the creation of a conceptual multidimensional model. We also present the logical and physical design phases, which refer to DW structures and the ETL process.

To ensure the correctness of the proposed conceptual models, i.e. with conventional data, with spatial data, and with time-varying data, we formally define them, providing their syntax and semantics. With the aim of assessing the usability of our conceptual model, including the representation of different kinds of hierarchies as well as spatial and temporal support, we present real-world examples.
Pursuing the goal that the proposed conceptual solutions can be implemented, we include their logical representations using relational and object-relational databases. / Doctorat en sciences appliquées / info:eu-repo/semantics/nonPublished
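As a minimal illustration of the multidimensional concepts this abstract relies on (dimensions, hierarchies of levels, measures), the following sketch encodes them as plain data classes; the names are illustrative and do not reproduce the MultiDimER notation.

```python
# Minimal encoding of the multidimensional concepts discussed above
# (dimensions, hierarchies of levels, measures); the class names are
# illustrative and do not reproduce the MultiDimER notation.
from dataclasses import dataclass
from typing import List

@dataclass
class Hierarchy:
    name: str
    levels: List[str]          # ordered from finest to coarsest, e.g. store -> country

@dataclass
class Dimension:
    name: str
    hierarchies: List[Hierarchy]

@dataclass
class Measure:
    name: str
    additive: bool             # additive measures can be summed along hierarchies

store = Dimension("Store", [Hierarchy("Geography", ["store", "city", "state", "country"])])
time = Dimension("Time", [Hierarchy("Calendar", ["day", "month", "quarter", "year"])])
sales = Measure("sales_amount", additive=True)
print(store, time, sales, sep="\n")
```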
26

Analyse multidimensionnelle interactive de résultats de simulation : aide à la décision dans le domaine de l'agroécologie / Interactive multidimensional analysis of simulation results : decision support in the agroecology field

Bouadi, Tassadit 28 November 2013 (has links)
Dans cette thèse, nous nous sommes intéressés à l'analyse des données de simulation issues du modèle agro-hydrologique TNT. Les objectifs consistaient à élaborer des méthodes d'analyse des résultats de simulation qui replacent l'utilisateur au coeur du processus décisionnel, et qui permettent d'analyser et d'interpréter de gros volumes de données de manière efficace. La démarche développée consiste à utiliser des méthodes d'analyse multidimensionnelle interactive. Tout d'abord, nous avons proposé une méthode d'archivage des résultats de simulation dans une base de données décisionnelle (i.e. entrepôt de données), adaptée au caractère spatio-temporel des données de simulation produites. Ensuite, nous avons suggéré d'analyser ces données de simulations avec des méthodes d'analyse en ligne (OLAP) afin de fournir aux acteurs des informations stratégiques pour améliorer le processus d'aide à la prise de décision. Enfin, nous avons proposé deux méthodes d'extraction de skyline dans le contexte des entrepôts de données afin de permettre aux acteurs de formuler de nouvelles questions en combinant des critères environnementaux contradictoires, et de trouver les solutions compromis associées à leurs attentes, puis d'exploiter les préférences des acteurs pour détecter et faire ressortir les données susceptibles de les intéresser. La première méthode EC2Sky, permet un calcul incrémental et efficace des skyline en présence de préférences utilisateurs dynamiques, et ce malgré de gros volumes de données. La deuxième méthode HSky, étend la recherche des points skyline aux dimensions hiérarchiques. Elle permet aux utilisateurs de naviguer le long des axes des dimensions hiérarchiques (i.e. spécialisation / généralisation) tout en assurant un calcul en ligne des points skyline correspondants. Ces contributions ont été motivées et expérimentées par l'application de gestion des pratiques agricoles pour l'amélioration de la qualité des eaux des bassins versants agricoles, et nous avons proposé un couplage entre le modèle d'entrepôt de données agro-hydrologiques construit et les méthodes d'extraction de skyline proposées. / This thesis concerns the analysis of simulation data generated by the agro-hydrological model TNT. Our objective is to develop analytical methods for massive simulation results that place the user at the heart of the decision-making process while letting him handle and analyze large amounts of data very efficiently. Our first contribution is N-Catch, an original approach relying on interactive multidimensional analysis methods for archiving simulation results in a decisional database (i.e. a data warehouse) adapted to the spatio-temporal nature of the simulation data. In addition, we suggest analyzing the simulation data with online analytical processing (OLAP) methods to provide stakeholders with strategic information that improves the decision-making process. Our second contribution concerns two methods for computing skyline queries in the context of data warehouses. These methods enable stakeholders to formulate new questions by combining conflicting environmental criteria, to find compromise solutions matching their expectations, and to exploit stakeholder preferences to identify and highlight the data of potential interest. The first method, EC2Sky, focuses on answering skyline queries efficiently and progressively in the presence of several dynamic user preferences, despite large volumes of data.
The second method, HSky, extends skyline computation to hierarchical dimensions. It allows the user to navigate along the dimension hierarchies (i.e. specialization/generalization) while ensuring the online computation of the associated skylines. Finally, we present the application of our proposals to the management of agricultural practices for improving water quality in agricultural watersheds, and we propose a coupling between the agro-hydrological data warehouse model N-Catch and the proposed skyline computation methods.
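For readers unfamiliar with skyline queries, the sketch below shows only the textbook building block: a quadratic Pareto-set computation in which every criterion is minimized. It is not the incremental EC2Sky or hierarchical HSky method, and the sample data are invented.

```python
# Basic skyline (Pareto set) computation: keep the points not dominated by any
# other point, assuming every criterion is to be minimized. This is only the
# textbook building block, not the incremental EC2Sky or hierarchical HSky methods.
def dominates(p, q):
    """p dominates q if p is <= q on every criterion and < on at least one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

def skyline(points):
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]

# usage: (nitrate concentration, production cost) per simulated practice -- lower is better
practices = [(30, 120), (25, 150), (40, 100), (35, 140), (25, 160)]
print(skyline(practices))   # [(30, 120), (25, 150), (40, 100)]
```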
27

以業務流程角度建立資料倉儲模式之研究 - 以某事業單位為例 (A Study on Building a Data Warehouse Model from a Business Process Perspective: A Case Study of a Public Enterprise)

嚴千凱, Yen, Chien-Kai Unknown Date (has links)
One of the important tasks of a government-owned enterprise is to formulate its management and operational policies, so information about the operational side of the business becomes an essential basis for decision making. In addition, inquiries from legislative bodies are highly variable and can cover the entire scope of the enterprise's business, so the demand for decision information is wide-ranging and ever-changing. Obtaining operational information quickly and correctly has therefore become a severe challenge for every unit, and the current report-based model of operational decision making is gradually failing to keep up with increasingly volatile decision needs, whereas data warehouses and multidimensional model design can satisfy these rapidly changing information requirements. Through in-depth qualitative interviews with the case organization, this thesis establishes the organization's business processes and, based on a thorough understanding of its business content, analyzes in detail the decision points, reference information, and information flows involved in its policy making, in order to identify the attributes, dimensions, and facts that correspond to the sources and nature of the decision information. In this way, it becomes apparent that data warehouse design must be grounded in a process model; guided by such a model, a data warehouse and data model suited to the users' business content can be designed, providing maximum flexibility for adaptation during future organizational transformation.
28

The design of vague spatial data warehouses

Siqueira, Thiago Luís Lopes 07 December 2015 (has links)
Made available in DSpace on 2016-06-02T19:04:00Z (GMT). No. of bitstreams: 1 6824.pdf: 22060515 bytes, checksum: bde19feb7a6e296214aebe081f2d09de (MD5) Previous issue date: 2015-12-07 / Universidade Federal de Minas Gerais / O data warehouse espacial (DWE) é um banco de dados multidimensional integrado e volumoso que armazena dados espaciais e dados convencionais. Já o processamento analítico espacial online (SOLAP) permite consultar o DWE, tanto pela seleção de dados espaciais que satisfazem um relacionamento topológico, quanto pela agregação dos dados espaciais. Deste modo, DWE e SOLAP beneficiam o suporte a tomada de decisão. As aplicações de DWE e SOLAP abordam majoritarimente fenômenos representados por dados espaciais exatos, ou seja, que assumem localizações e fronteiras bem definidas. Contudo, tais aplicações negligenciam dados espaciais afetados por imperfeições, tais como a vagueza espacial, a qual interfere na identificação precisa de um objeto e de seus vizinhos. Um objeto espacial vago não tem sua fronteira ou seu interior precisamente definidos. Além disso, é composto por partes que certamente pertencem a ele e partes que possivelmente pertencem a ele. Apesar de inúmeros fenômenos do mundo real serem caracterizados pela vagueza espacial, na literatura consultada não se identificaram trabalhos que considerassem a vagueza espacial no projeto de DWE e nem para consultar o DWE. Tal limitação motivou a elaboração desta tese de doutorado, a qual introduz os conceitos de DWE vago e de SOLAP vago. Um DWE vago é um DWE que armazena dados espaciais vagos, enquanto que SOLAP vago provê os meios para consultar o DWE vago. Nesta tese, o projeto de DWE vago é abordado e as principais contribuições providas são: (i) o modelo conceitual VSCube que viabiliza a criação de um cubos de dados multidimensional para representar o esquema conceitual de um DWE vago; (ii) o modelo conceitual VSMultiDim que permite criar um diagrama para representar o esquema conceitual de um DWE vago; (iii) diretrizes para o projeto lógico do DWE vago e de suas restrições de integridade, e para estender a linguagem SQL visando processar as consultas de SOLAP vago no DWE vago; e (iv) o índice VSB-index que aprimora o desempenho do processamento de consultas no DWE vago. A aplicabilidade dessas contribuições é demonstrada em dois estudos de caso no domínio da agricultura, por meio da criação de esquemas conceituais de DWE vago, da transformação dos esquemas conceituais em esquemas lógicos de DWE vago, e do processamento de consultas envolvendo as regiões vagas do DWE vago. / Spatial data warehouses (SDW) and spatial online analytical processing (SOLAP) enhance decision making by enabling spatial analysis combined with multidimensional analytical queries. A SDW is an integrated and voluminous multidimensional database containing both conventional and spatial data. SOLAP allows querying SDWs with multidimensional queries that select spatial data that satisfy a given topological relationship and that aggregate spatial data. Existing SDW and SOLAP applications mostly consider phenomena represented by spatial data having exact locations and sharp boundaries. They neglect the fact that spatial data may be affected by imperfections, such as spatial vagueness, which prevents distinguishing an object from its neighborhood. A vague spatial object does not have a precisely defined boundary and/or interior. 
Thus, it may have a broad boundary and a blurred interior, and is composed of parts that certainly belong to it and parts that possibly belong to it. Although several real-world phenomena are characterized by spatial vagueness, no approach in the literature addresses both spatial vagueness and the design of SDWs nor provides multidimensional analysis over vague spatial data. These shortcomings motivated the elaboration of this doctoral thesis, which addresses both vague spatial data warehouses (vague SDWs) and vague spatial online analytical processing (vague SOLAP). A vague SDW is a SDW that comprises vague spatial data, while vague SOLAP allows querying vague SDWs. The major contributions of this doctoral thesis are: (i) the Vague Spatial Cube (VSCube) conceptual model, which enables the creation of conceptual schemata for vague SDWs using data cubes; (ii) the Vague Spatial MultiDim (VSMultiDim) conceptual model, which enables the creation of conceptual schemata for vague SDWs using diagrams; (iii) guidelines for designing relational schemata and integrity constraints for vague SDWs, and for extending the SQL language to enable vague SOLAP; (iv) the Vague Spatial Bitmap Index (VSB-index), which improves the performance to process queries against vague SDWs. The applicability of these contributions is demonstrated in two applications of the agricultural domain, by creating conceptual schemata for vague SDWs, transforming these conceptual schemata into logical schemata for vague SDWs, and efficiently processing queries over vague SDWs.
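The VSB-index is a bitmap-based structure; the sketch below shows only the plain bitmap-index mechanism such structures build on (one bit vector per attribute value, combined with bitwise operations), with invented data and without any of the vague-spatial specifics.

```python
# Plain bitmap index over a low-cardinality attribute, shown only to illustrate
# the mechanism that bitmap-based indices such as the VSB-index build on.
class BitmapIndex:
    def __init__(self, values):
        self.n = len(values)
        self.bitmaps = {}                      # attribute value -> int used as a bit vector
        for row, v in enumerate(values):
            self.bitmaps[v] = self.bitmaps.get(v, 0) | (1 << row)

    def rows_equal(self, value):
        """Row ids whose attribute equals `value`, recovered from one bitmap."""
        bm = self.bitmaps.get(value, 0)
        return [row for row in range(self.n) if (bm >> row) & 1]

    def rows_in(self, values):
        """Disjunctive predicate: OR the per-value bitmaps, then read off the rows."""
        bm = 0
        for v in values:
            bm |= self.bitmaps.get(v, 0)
        return [row for row in range(self.n) if (bm >> row) & 1]

# usage on a fact-table column of crop types
idx = BitmapIndex(["soy", "corn", "soy", "wheat", "corn"])
print(idx.rows_equal("soy"))           # [0, 2]
print(idx.rows_in(["corn", "wheat"]))  # [1, 3, 4]
```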
29

Modélisation de hiérarchies complexes dans les entrepôts de données XML et traitement des problèmes d'additivité dans l'analyse en ligne XOLAP / Modeling complex hierarchies in XML data warehouses and solving summarizability problems in XOLAP

Hachicha, Marouane 26 November 2012 (has links)
Depuis son apparition en 1998, le langage XML (eXtensible Markup Language) est devenu un standard pour la modélisation et l'échange de données. En effet, XML permet de modéliser des structures de données qui ne sont pas facilement représentées dans les systèmes relationnels. Dans ce contexte, les entrepôts de données XML représentent aujourd'hui la base de plusieurs applications décisionnelles qui exploitent des données hétérogènes (peu structurées et provenant des sources multiples) aux structures complexes comme par exemple des hiérarchies complexes.Dans ce mémoire, nous proposons une nouvelle solution XOLAP (XML-OLAP) en temps réel qui traite les problèmes d'additivité dus aux hiérarchies complexes. Tout d'abord, nous proposons un nouveau modèle de données : les arbres de données multidimensionnels, qui permet de modéliser les faits, les dimensions, les mesures et les hiérarchies complexes d'un entrepôt de données XML. Pour pouvoir interroger les arbres de données multidimensionnels, nous modélisons les requêtes utilisateur à l'aide de modèles d'arbre XML. Nous proposons ensuite un nouvel algorithme de regroupement et d'agrégation pour la résolution en temps réel des problèmes d'additivité dans les hiérarchies complexes. Nous généralisons enfin cet algorithme à un nouvel opérateur XOLAP de forage vers le haut (roll-up).Finalement, nous validons nos propositions de manière expérimentale. Pour cela, nous étendons le banc d'essais XWeB en introduisant des hiérarchies complexes dans son schéma. La comparaison de notre approche à une approche de référence montre que la surcharge due à l'exécution en temps réel de notre approche est tout à fait acceptable et que nos algorithmes sont susceptibles de passer à l'échelle. / Since its inception in 1998, the eXtensible Markup Language (XML) has emerged as a standard for data representation and exchange over the Internet. XML provides an opportunity for modeling data structures that are not easily represented in relational systems. In this context, XML data warehouses nowadays form the basis of several decision-support applications exploiting heterogeneous data (little structured and coming from various sources) bearing complex structures, such as complex hierarchies. In this thesis, we propose a novel XOLAP (XML-OLAP) approach that automatically detects and processes summarizability issues at query time, without requiring any particular expertise from the user. Thus, at the logical level, we choose XML data trees, so-called multidimensional data trees, to model the multidimensional structures (facts, dimensions, measures and complex hierarchies) of XML data warehouses. In order to query multidimensional data trees, we model user queries as XML pattern trees. Then, we introduce a new aggregation algorithm to address summarizability issues in complex hierarchies. On the basis of this algorithm, we propose a novel XOLAP roll-up operator. Finally, we experimentally validate our proposal and compare our approach with the reference approach for addressing summarizability issues in complex hierarchies. For this sake, we extend the XML warehouse benchmark XWeB with complex hierarchies to generate XML data warehouses with scalable complex hierarchies. The results of our experiments show that the overhead induced by managing hierarchy complexity at run-time is totally acceptable and that our approach is expected to scale up well.
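To see why complex hierarchies raise summarizability issues, the sketch below rolls a measure up a non-strict hierarchy (a child with two parents): a naive SUM overstates the total, and an even split among parents is shown as one common workaround. This is an illustration only, not the XOLAP roll-up operator proposed in the thesis.

```python
# Why complex hierarchies break plain roll-ups: a child with several parents is
# counted once per parent, so SUM over the parent level overstates the total.
# One common workaround (a simplification, not the thesis operator) is to split
# the child's measure evenly among its parents.
from collections import defaultdict

parents = {                       # non-strict hierarchy: "B" rolls up to two parents
    "A": ["P1"],
    "B": ["P1", "P2"],
    "C": ["P2"],
}
measure = {"A": 10.0, "B": 6.0, "C": 4.0}     # grand total should stay 20.0

def naive_rollup():
    out = defaultdict(float)
    for child, ps in parents.items():
        for p in ps:
            out[p] += measure[child]          # "B" counted twice
    return dict(out)

def weighted_rollup():
    out = defaultdict(float)
    for child, ps in parents.items():
        for p in ps:
            out[p] += measure[child] / len(ps)
    return dict(out)

print(naive_rollup())     # {'P1': 16.0, 'P2': 10.0} -> totals 26.0 (overstated)
print(weighted_rollup())  # {'P1': 13.0, 'P2': 7.0}  -> totals 20.0
```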
30

Implementace Business Intelligence ve firmě Haguess, a. s. / Implementace BI ve firmě Haguess a.s. (Implementation of Business Intelligence at Haguess, a. s.)

Bendák, Martin January 2010 (has links)
The subject of this thesis is a pilot project implementing a Business Intelligence (BI) solution in the company Haguess, a. s. The project is concerned with the analysis of data stored in a database management system (DBMS) that serves as the data source of the web application Customer Support Center (CSC). Haguess primarily uses CSC as a helpdesk for its clients and partners, but also uses it for internal purposes; the main use of the CSC application is to support the information systems delivered by Haguess. There were two motives for the choice of this subject: BI software tools had not previously been used at Haguess, and the company management was keen to obtain data analyses from the CSC application, which is why I decided to carry them out. My goal was to create a practically useful solution that would encourage Haguess to use BI software tools for other purposes as well. This thesis has two main goals. The first is the realisation of a pilot BI solution outlining the possibilities for analysing data from the CSC application with BI software tools. This involved the following activities: multidimensional analysis and BI solution design, i.e. the design of data pipelines, OLAP (On-Line Analytical Processing) cubes, and user tools in the MS Excel spreadsheet application. The second goal was to select suitable software tools for a future comprehensive realisation and operation of the BI solution. The first objective was achieved by analysing the CSC application data model, defining the user requirements for the output analyses, and comparing them with the data model analysis. Based on this comparison, the basic subjects of the output analyses were determined; these were the starting point for the implementation of the BI solution. The second objective was achieved as follows: on the basis of the implementation results, basic criteria for the features of BI software tools were determined, bearing in mind a possible future comprehensive realisation and operation of a BI solution. Research determined which BI software tools were available on the market, and the most suitable tool was selected after comparing the available options against the criteria mentioned above. The primary outcome of this thesis is a practically usable BI solution.
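As a hint of the kind of output such an analysis produces, here is a minimal pandas stand-in that pivots helpdesk tickets into a small cube (ticket counts by client and month); the column names and data are invented and do not reflect the actual CSC schema or the MS BI tooling used in the thesis.

```python
# A minimal pandas stand-in for the kind of OLAP output described above
# (ticket counts by client and month); the column names and data are invented,
# not the actual CSC database schema.
import pandas as pd

tickets = pd.DataFrame({
    "client":   ["Alfa", "Alfa", "Beta", "Beta", "Beta"],
    "month":    ["2010-01", "2010-02", "2010-01", "2010-01", "2010-02"],
    "priority": ["high", "low", "high", "low", "high"],
})

cube = pd.pivot_table(
    tickets,
    index="client",          # rows: one dimension
    columns="month",         # columns: another dimension
    values="priority",
    aggfunc="count",         # measure: number of tickets
    fill_value=0,
)
print(cube)
```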
