Global ETD Search

1	Intégration de données temps-réel issues de capteurs dans un entrepôt de données géo-décisionnel Mathieu, Jean 17 April 2018 (has links) Nous avons pu, au cours des dernières années, assister à une augmentation du nombre de capteurs utilisés pour mesurer des phénomènes de plus en plus variés. En effet, nous pouvons aujourd'hui utiliser les capteurs pour mesurer un niveau d'eau, une position (GPS), une température et même le rythme cardiaque d'un individu. La grande diversité de capteurs fait d'eux aujourd'hui des outils par excellence en matière d'acquisition de données. En parallèle à cette effervescence, les outils d'analyse ont également évolué depuis les bases de données transactionnelles et ont mené à l'apparition d'une nouvelle famille d’outils, appelés systèmes d’analyse (systèmes décisionnels), qui répond à des besoins d’analyse globale sur les données. Les entrepôts de données et outils OLAP (On-Line Analytical Processing), qui font partie de cette famille, permettent dorénavant aux décideurs d'analyser l'énorme volume de données dont ils disposent, de réaliser des comparaisons dans le temps et de construire des graphiques statistiques à l’aide de simples clics de la souris. Les nombreux types de capteurs peuvent certainement apporter de la richesse à une analyse, mais nécessitent de longs travaux d'intégration pour les amener jusqu'à un entrepôt géo-décisionnel, qui est au centre du processus de prise de décision. Les différents modèles de capteurs, types de données et moyens de transférer les données sont encore aujourd'hui des obstacles non négligeables à l'intégration de données issues de capteurs dans un entrepôt géo-décisionnel. Également, les entrepôts de données géo-décisionnels actuels ne sont pas initialement conçus pour accueillir de nouvelles données sur une base fréquente. 
Puisque l'utilisation de l'entrepôt par les utilisateurs est restreinte lors d'une mise à jour, les nouvelles données sont généralement ajoutées sur une base hebdomadaire, mensuelle, etc. Il existe pourtant des entrepôts de données capables d'être mis à jour plusieurs fois par jour sans que les performances lors de leur exploitation ne soient atteintes, les entrepôts de données temps-réel (EDTR). Toutefois, cette technologie est encore aujourd’hui peu courante, très coûteuse et peu développée. Ces travaux de recherche visent donc à développer une approche permettant de publier et standardiser les données temps-réel issues de capteurs et de les intégrer dans un entrepôt géo-décisionnel conventionnel. Une stratégie optimale de mise à jour de l'entrepôt a également été développée afin que les nouvelles données puissent être ajoutées aux analyses sans que la qualité de l'exploitation de l'entrepôt par les utilisateurs ne soit remise en cause. / Over the last decade, the use of sensors to measure a growing variety of phenomena has increased greatly. Sensors can now measure a water level, a GPS position, a temperature, and even a person's heart rate. This wide diversity makes sensors one of the best tools available for acquiring data. In parallel, analysis tools have evolved beyond transactional databases, leading to a new family of tools, analytical systems (Business Intelligence, BI), which meet the need for global analysis of data. Data warehouses and OLAP (On-Line Analytical Processing) tools, which belong to this family, let decision makers analyze large volumes of data, make comparisons over time, and build statistical charts with a few mouse clicks. Although the many types of sensors can certainly enrich an analysis, their data requires lengthy integration work before it reaches the geo-decisional data warehouse, which is at the center of the decision-making process. 
The variety of sensor models, data types, and data transfer methods remains a significant obstacle to integrating sensor data into a geo-decisional data warehouse. In addition, current geo-decisional data warehouses were not originally designed to receive new data frequently. Because users' access to the warehouse is restricted during an update, new data is usually added weekly, monthly, or at similar intervals. Some data warehouses, called Real-Time Data Warehouses (RTDW), can be updated several times a day without degrading performance for users. However, this technology is still uncommon, very costly, and immature. This research therefore aims to develop an approach for publishing and standardizing real-time sensor data and integrating it into a conventional geo-decisional data warehouse. An optimized update strategy was also developed so that new data can be added to the analyses without compromising the quality of service for users of the warehouse. SD 121 UL 2011 Entrepôts de données (Informatique) Temps réel (Informatique) Données géospatiales Capteurs
2	Conception et développement d'un service Web de constitution de mini cubes SOLAP pour clients mobiles Dubé, Étienne 13 April 2018 (has links) Les applications d’aide à la décision spatiale telles que SOLAP (Spatial OLAP) sont traditionnellement conçues pour les environnements informatiques de bureau. L’adaptation des applications SOLAP aux contextes d’utilisation mobile (e.g. PDA et téléphones mobiles) pose certains problèmes dus à la nature et aux contraintes de ces environnements. Ce projet de recherche vise à apporter une solution, basée sur une architecture orientée services (SOA), pour l’adaptation des cubes de données SOLAP aux environnements mobiles. Il s’agit d’un service Web capable de transformer les cubes SOLAP des entrepôts de données géo-décisionnelles en mini-cubes de taille réduite, adaptés aux clients mobiles. Le service permet de sélectionner un sous-ensemble des cubes existants par l’intermédiaire d’opérateurs paramétrables, d’appliquer des traitements de simplification aux membres spatiaux, et finalement de transmettre ces données en format XML. Ce travail de recherche ouvre donc la voie à la conception et au développement de nouvelles applications géospatiales décisionnelles mobiles. / Spatial decision-support applications such as SOLAP (Spatial OLAP) were originally designed for desktop environments. Adapting SOLAP applications to mobile contexts (e.g., PDAs and mobile phones) poses challenges because of the nature and constraints of these environments. This research project aims to provide a solution, based on a service-oriented architecture (SOA), for adapting SOLAP data cubes to mobile environments. It consists of a Web service that transforms SOLAP cubes from spatial data warehouses into mini-cubes of reduced size, suitable for mobile clients. 
This service selects a subset of existing cubes using parameterizable operators, applies simplification algorithms to spatial members, and finally transfers the data in XML format. This research opens the way to the design and development of new mobile geospatial decision-support applications. SD 121 UL 2008 SOLAP, Technologie Données géospatiales Informatique mobile Entrepôts de données (Informatique)
3	Conception et développement d'un service web de mise à jour incrémentielle pour les cubes de données spatiales Declercq, Charlotte 13 April 2018 (has links) Geo-decisional applications are moving toward real time and need a fast update mechanism. This process is complex, however, and very costly in computation time because of the denormalized structure of the data, which is stored as a cube. The classic method, which rebuilds the data cube entirely, takes longer and longer as the cube grows and is no longer viable. Newer, so-called incremental update methods have proven themselves in Business Intelligence. Unfortunately, such methods have never been transposed to decision-support geomatics, because geometric data requires specific and complex processing. Updating spatial data cubes raises problems previously unknown in conventional data cubes. In addition, there is widespread confusion around the notion of updating in data warehouses. Data warehouse architecture is also following the current trend in information system architectures toward distributing tasks and resources, at the expense of centralized systems, and toward interoperable systems. In this context, the emerging service-oriented architectures are becoming very popular. Yet services dedicated to cube update tasks do not exist so far, even though they would clearly help support decision making on data that is always up to date and consistent. The goal of this thesis is to develop incremental update methods for spatial cubes and to place the mechanism within a service-oriented architecture. 
Typologies formulated for data warehouse management and for cube updating served as the basis for the work. Existing incremental update methods for non-spatial cubes were reviewed and led to new incremental methods adapted to spatial cubes. Finally, a service-oriented architecture was designed; it integrates all the components of the data warehouse and contains the cube update Web service, which exposes the proposed methods. SD 121 UL 2008 Données géospatiales -- Informatique Entrepôts de données (Informatique) Temps réel (Informatique) Services Web Systèmes transactionnels
4	Developing a model and a language to identify and specify the integrity constraints in spatial datacubes Salehi, Mehrdad 16 April 2018 (has links) Data quality in spatial datacubes is important because the data serves as a basis for decision making in large organizations. Poor data quality in these cubes could lead to poor decisions. Integrity constraints play a key role in improving the logical consistency of any database, one of the main elements of data quality. Several spatial datacube models have been proposed in recent years, but none explicitly includes integrity constraints. As a result, integrity constraints of spatial datacubes are handled in a non-systematic, pragmatic way, which makes checking data consistency in spatial datacubes inefficient. This thesis provides a theoretical framework for identifying integrity constraints in spatial datacubes, along with a formal language for specifying them. To this end, we first proposed a formal model for spatial datacubes that describes their different components. Based on this model, we then identified and categorized the different types of integrity constraints in spatial datacubes. In addition, since spatial datacubes typically contain both spatial and temporal data, we proposed a classification of integrity constraints for databases dealing with space and time. We then presented a formal language for specifying the integrity constraints of spatial datacubes. This language is based on a controlled natural language hybridized with pictograms. 
Several examples of spatial datacube integrity constraints are defined using this language. Spatial datacube designers (analysts) can use the proposed framework to identify integrity constraints and specify them at the design stage of spatial datacubes. Moreover, the proposed formal language for specifying integrity constraints is close to the way end users express their integrity constraints. Consequently, using this language, end users can verify and validate the integrity constraints defined by the analyst at the design stage. SD 121 UL 2009 S163 Contraintes (Intelligence artificielle) Entrepôts de données (Informatique) Bases de données multidimensionnelles Bases de données spatio-temporelles
5	A BPMN-based conceptual language for designing ETL processes El Akkaoui, Zineb 27 June 2014 (has links) Business Intelligence (BI) is the set of techniques and technologies that support the decision-making process by providing an aggregated insight into the organization's data. Because the events and applications running in an organization hold large amounts of potentially useful data, the BI market calls for new technologies able to exploit this data for analysis wherever it is available. In particular, Extract, Transform, and Load (ETL) processes, the fundamental BI technology responsible for integrating and cleansing organizational data, must meet these requirements. However, the development of ETL processes is still considered very complex and time-consuming, to the point that roughly 80% of BI project effort is dedicated to ETL development. Among the phases of the ETL development life cycle, ETL modeling is a critical and laborious task. This phase produces the first effective formal representation of the ETL process, i.e., the ETL model, which is reused and refined in the subsequent phases of development. Typically, ETL processes are modeled using vendor-specific ETL tools from the very beginning of development. However, these tools are unsuitable for business users because they produce overwhelmingly fine-grained models. As an attempt to provide more appropriate tools to business users, vendor-independent ETL modeling languages have been proposed in the literature. Nevertheless, they remain immature. 
To get a precise view of these languages, we conduct a survey that: i) defines a set of criteria associated with the major ETL requirements identified in the literature; ii) compares the surveyed conceptual languages, which come from research work, with the physical languages of prominent ETL tools; and iii) studies the complete ETL development methodologies associated with these modeling languages. The analysis of our survey reveals several drawbacks in meeting the ETL requirements. In particular, the conceptual languages offer incomplete elements for ETL modeling, with little or no formalization. Several languages are only descriptive: they cannot be automatically implemented as executable code, nor automatically maintained as changes occur over time. To address these shortcomings, we present in this thesis a novel approach that tackles the whole development life cycle of ETL processes. First, we propose a new vendor-independent language for modeling ETL processes in the same way as typical business processes, the processes responsible for managing the operations of an organization. The rationale behind this proposal is to give ETL processes better access to data in the organization's events and applications, including fresh data, and better design capabilities, such as making analysis available to any user. By using the standard representation mechanism BPMN (Business Process Model and Notation) and a classification of ETL elements resulting from a study of the most widely used commercial and open-source ETL tools, the language enables building agile and full-fledged ETL processes. We name our language BPMN4ETL, for BPMN for ETL processes. Second, we build a model-driven framework that provides automatic code generation and improves maintenance support for our ETL language. 
We use Model-Driven Development (MDD) technology because it helps in developing software, particularly in automating the transformation from one phase of software development to the next. We present a set of model-to-text transformations that produce code for different business process engines and ETL engines. We also describe the model-to-model transformations that automatically update the ETL models, with the aim of supporting maintenance of the generated code as data sources evolve. A demonstration using a case study serves as an initial validation, showing that the framework covering modeling, implementation, and maintenance could be used in practice. To illustrate the new concepts introduced in the thesis, mainly the BPMN4ETL language and the implementation and maintenance framework, we use a case study from the fictitious Northwind Traders company, a retailer that imports and exports foods from around the world. / Doctorat en Sciences de l'ingénieur / info:eu-repo/semantics/nonPublished Informatique générale Data warehousing Entrepôts de données (Informatique) Systèmes d'information -- Gestion model-driven BPMN business process ETL processes data warehouse modeling conceptual
6	Designing conventional, spatial, and temporal data warehouses: concepts and methodological framework Malinowski Gajda, Elzbieta 02 October 2006 (has links) Decision support systems are interactive, computer-based information systems that provide data and analysis tools to assist managers at different levels of an organization in the decision-making process. Data warehouses (DWs) have been developed and deployed as an integral part of decision support systems. A data warehouse is a database that stores high volumes of historical data required for analytical purposes. This data is extracted from operational databases, transformed into a coherent whole, and loaded into a DW during the extraction-transformation-loading (ETL) process. DW data can be dynamically manipulated using on-line analytical processing (OLAP) systems. DW and OLAP systems rely on a multidimensional model that includes measures, dimensions, and hierarchies. Measures are usually numeric additive values used for the quantitative evaluation of different aspects of an organization. Dimensions provide different analysis perspectives, while hierarchies allow measures to be analyzed at different levels of detail. Nevertheless, designers as well as users currently find it difficult to specify the multidimensional elements required for analysis. One reason is the lack of conceptual models for DW and OLAP system design, which would allow data requirements to be expressed at an abstract level without considering implementation details. Another problem is that many kinds of complex hierarchies arising in real-world situations are not addressed by current DW and OLAP systems. To help designers build conceptual models for decision-support systems and to help users better understand the data to be analyzed, in this thesis we propose the MultiDimER model, a conceptual model for representing multidimensional data for DW and OLAP applications. 
Our model is mainly based on existing ER constructs, for example entity types, attributes, and relationship types with their usual semantics, which allows it to represent the common concepts of dimensions, hierarchies, and measures. It also includes a conceptual classification of the different kinds of hierarchies that exist in real-world situations and proposes graphical notations for them. In addition, users of DW and OLAP systems now also demand the inclusion of spatial data. The advantage of using spatial data in the analysis process is widely recognized, since visualizing it reveals patterns that are difficult to discover otherwise. However, although DWs typically include a spatial or location dimension, this dimension is usually represented in an alphanumeric format. Furthermore, there is still no systematic study that analyzes the inclusion and management of hierarchies and measures represented using spatial data. To satisfy the growing requirements of decision-making users, we extend the MultiDimER model to allow spatial data in the different elements that compose the multidimensional model. The novelty of our contribution lies in the fact that a multidimensional model is seldom used for representing spatial data. To succeed with our proposal, we applied research achievements from the field of spatial databases to the specific features of a multidimensional model. The spatial extension of a multidimensional model raises several issues that we address in this thesis, such as the influence of the different topological relationships between the spatial objects forming a hierarchy on the procedures required for measure aggregation, the aggregation of spatial measures, and the inclusion of spatial measures without spatial dimensions, among others. 
Moreover, one of the important characteristics of multidimensional models is the presence of a time dimension for keeping track of changes in measures. However, this dimension cannot be used to model changes in other dimensions. Therefore, usual multidimensional models are not symmetric in the way they represent changes for measures and dimensions. Further, there is still no analysis indicating which of the concepts already developed for temporal support in conventional databases can be usefully applied to the different elements of a multidimensional model. To handle temporal changes in a similar manner for all elements of a multidimensional model, we introduce a temporal extension of the MultiDimER model. This extension is based on research in the area of temporal databases, which have been used successfully for modeling time-varying information for several decades. We propose the inclusion of different temporal types, such as valid time and transaction time, which are obtained from source systems, in addition to the DW loading time generated in DWs. We use this temporal support for a conceptual representation of time-varying dimensions, hierarchies, and measures. We also address the specific constraints that should be imposed on time-varying hierarchies and the problem of handling multiple time granularities between source systems and DWs. Furthermore, designing DWs is not an easy task. It requires considering all phases from requirements specification to final implementation, including the ETL process. It should also take into account that the inclusion of different data items in a DW depends on both users' needs and data availability in source systems. However, designers must currently rely on their experience because there is no methodological framework that considers the above-mentioned aspects. 
To assist developers during the DW design process, we propose a methodology for the design of conventional, spatial, and temporal DWs. We cover different phases, such as requirements specification and conceptual, logical, and physical modeling. We include three methods for requirements specification, depending on whether users, operational data sources, or both are the driving force in requirements gathering. We show how each method leads to the creation of a conceptual multidimensional model. We also present logical and physical design phases that refer to DW structures and the ETL process. To ensure the correctness of the proposed conceptual models, i.e., with conventional data, with spatial data, and with time-varying data, we formally define them, providing their syntax and semantics. To assess the usability of our conceptual model, including the representation of different kinds of hierarchies as well as spatial and temporal support, we present real-world examples. To show that the proposed conceptual solutions can be implemented, we include their logical representations using relational and object-relational databases. / Doctorat en sciences appliquées / info:eu-repo/semantics/nonPublished Sciences de l'ingénieur Informatique générale OLAP technology Data warehousing Data warehousing -- Design Multidimensional databases OLAP, Technologie Entrepôts de données (Informatique) Bases de données multidimensionnelles temporal data warehouses spatial data warehouses OLAP hierarchies multidimensional model conceptual modeling data warehouses methodology for data warehouse design spatial OLAP
