1 |
Selection of maintenance policies for a data warehousing environment: a cost-based approach to meeting quality of service requirements. Engström, Henrik. January 2002.
No description available.
|
2 |
Integration of building design and construction information: a neutral object-oriented model. Kiwan, Mohd S. A. A. January 1994.
No description available.
|
3 |
Data Sharing and Exchange: Semantics and Query Answering. Awada, Rana. January 2015.
Exchanging and integrating data that belong to worlds with different vocabularies are two prominent problems in the database literature. While data coordination deals with managing and integrating data between autonomous yet related sources with possibly distinct vocabularies, data exchange is defined as the problem of extracting data from a source and materializing it in an independent target so that it conforms to the target schema. These two problems, however, have never been studied in a unified setting that allows both the exchange of data and the coordination of different vocabularies across sources. Our thesis shows that such a unified setting exhibits data integration capabilities beyond those provided by data exchange and data coordination separately. In this thesis, we propose a new setting, called DSE (for Data Sharing and Exchange), which allows the exchange of data between independent source and target applications that possess independent schemas as well as independent yet related domains of constants. To facilitate this type of exchange, we extend the source-to-target dependencies of the ordinary data exchange setting, which associate the source and the target at the schema level, with the mapping table construct introduced in the classical data coordination setting, which associates the source and the target at the instance level. A mapping table defines, for each source element, the set of associated (or corresponding) elements in the domain of the target. The semantics of this association between source and target elements changes with the requirements of different applications. Ordinary DE settings can represent DSE settings; however, we show that there exist DSE settings, with particular semantics for the related values in mapping tables, for which DE is not the best exchange solution to adopt. The thesis introduces two DSE settings with this property. We call the first DSE with unique-identity semantics: the semantics of a mapping table in this setting specifies that each source element should be uniquely mapped to at least one target element associated with it in the mapping table.
In this setting, classical DE is one method to perform a data exchange; however, it is not the best method to adopt, since it cannot represent exchange applications that, like DC applications, must compute both portions and complete sets of certain answers for conjunctive queries. In addition, we show that adopting known DE universal solutions as semantics for such DSE settings is not the most efficient choice when computing certain answers for conjunctive queries. The second DSE setting the thesis introduces with the same property is called DSE with equality semantics. This setting captures an interesting meaning of related data in a mapping table: each source element is related to a target element only if the two elements are equivalent (i.e., they have the same meaning). We show that this DSE setting differs from ordinary DE settings in that additional information can be entailed under such an interpretation of related data, and this added information must be augmented to both the source instance and the mapping table in order to generate target instances that correctly reflect both in a DSE scenario. In other words, in such a DSE setting, a source instance and a mapping table can be incomplete with respect to the semantics of the mapping table. We formally define the two aforementioned semantics of a DSE setting and distinguish between two types of solutions: universal DSE solutions, which contain the complete set of exchanged information, and universal DSE KB-solutions, which store a portion of the exchanged information together with implicit information in the form of a set of rules over the target. DSE KB-solutions allow applications to compute, on demand, both a portion and the complete set of certain answers for conjunctive queries. In addition, we define the semantics of conjunctive query answering, distinguish between sound and complete certain answers for conjunctive queries, and define algorithms to compute them efficiently. Finally, we provide experimental results that compare the run times to generate DSE solutions versus DSE KB-solutions, and compare the performance of computing sound and complete certain answers for conjunctive queries using both types of solutions.
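The mapping-table construct at the heart of this setting is easy to picture in code. Below is a minimal, illustrative Python sketch, not taken from the thesis, of exchanging a source relation into a target vocabulary under unique-identity semantics: every source constant is replaced by a target constant associated with it in the mapping table. The relations, constants, and the deterministic pick policy are all hypothetical.

```python
# Illustrative sketch of a DSE-style exchange driven by a mapping table.
# All data and the "pick one associated value" policy are hypothetical;
# the thesis defines the semantics far more precisely.

# Mapping table: each source constant -> set of related target constants.
mapping_table = {
    "NYC": {"New York", "New York City"},
    "SF": {"San Francisco"},
}

# Source instance: a single relation Flights(origin, destination).
source_flights = [("NYC", "SF"), ("SF", "NYC")]

def exchange(tuples, table):
    """Rewrite each tuple into the target vocabulary.

    Unique-identity semantics, as sketched here: every source element
    must map to at least one associated target element; we pick one
    deterministically. Elements absent from the table are an error.
    """
    target = []
    for row in tuples:
        new_row = []
        for value in row:
            related = table.get(value)
            if not related:
                raise ValueError(f"no target element related to {value!r}")
            new_row.append(min(related))  # deterministic choice
        target.append(tuple(new_row))
    return target

print(exchange(source_flights, mapping_table))
# [('New York', 'San Francisco'), ('San Francisco', 'New York')]
```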
|
4 |
Metadata-Driven Data Integration. Nadal Francesch, Sergi. 16 May 2019.
Data has an undoubtable impact on society. Storing and processing large amounts of available data is currently one of the key success factors for an organization. Nonetheless, we are witnessing a change represented by huge and heterogeneous amounts of data; indeed, 90% of the data in the world has been generated in the last two years. Thus, in order to carry out these data exploitation tasks, organizations must first perform data integration: combining data from multiple sources to yield a unified view over them. Yet, the integration of massive and heterogeneous amounts of data requires revisiting the traditional integration assumptions to cope with the new requirements posed by such data-intensive settings. This PhD thesis aims to provide a novel framework for data integration in the context of data-intensive ecosystems, which entails dealing with vast amounts of heterogeneous data, from multiple sources and in their original format. To this end, we advocate an integration process consisting of sequential activities governed by a semantic layer, implemented via a shared repository of metadata. From a stewardship perspective, these activities are the deployment of a data integration architecture, followed by the population of the shared metadata. From a data consumption perspective, the activities are virtual and materialized data integration, the former an exploratory task and the latter a consolidation one. Following the proposed framework, we provide contributions to each of the four activities. We begin by proposing a software reference architecture for semantic-aware data-intensive systems. This architecture serves as a blueprint to deploy a stack of systems, its core being the metadata repository. Next, we propose a graph-based metadata model as a formalism for metadata management, focusing on support for schema and data source evolution, a predominant factor in the heterogeneous sources at hand. For virtual integration, we propose query rewriting algorithms that rely on the proposed metadata model; we additionally consider semantic heterogeneities in the data sources, which the algorithms are capable of automatically resolving. Finally, the thesis focuses on the materialized integration activity and proposes a method to select intermediate results to materialize in data-intensive flows. Overall, the results of this thesis contribute to the field of data integration in contemporary data-intensive ecosystems.
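As a rough illustration of the kind of graph-based metadata model the thesis proposes, the following sketch (our own, not the author's formalism) records sources, their attributes, and mappings to a shared vocabulary in a small directed graph, then answers a "which sources provide this concept?" lookup of the sort a virtual-integration rewriter needs. All node and edge labels are assumptions.

```python
# Toy graph-based metadata repository: source attributes are nodes,
# "mapsTo" edges link them to a shared vocabulary, and "belongsTo"
# edges link them to their source. Purely illustrative.
import networkx as nx

g = nx.DiGraph()
g.add_edge("crm_db.cust_name", "vocab:CustomerName", label="mapsTo")
g.add_edge("web_api.customer", "vocab:CustomerName", label="mapsTo")
g.add_edge("crm_db.cust_name", "crm_db", label="belongsTo")
g.add_edge("web_api.customer", "web_api", label="belongsTo")

def sources_for(concept):
    """Virtual-integration helper: which sources expose this concept?"""
    attrs = [a for a, c in g.in_edges(concept)
             if g.edges[a, c]["label"] == "mapsTo"]
    return sorted({t for a in attrs
                   for _, t in g.out_edges(a)
                   if g.edges[a, t]["label"] == "belongsTo"})

print(sources_for("vocab:CustomerName"))  # ['crm_db', 'web_api']
```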
|
5 |
Statistical methods for robust analysis of transcriptome data by integration of biological prior knowledge. Jeanmougin, Marine. 16 November 2012.
Recent advances in Molecular Biology have led biologists toward high-throughput genomic studies. In particular, the investigation of the human transcriptome offers unprecedented opportunities for understanding cellular and disease mechanisms. In this PhD, we focus on providing robust statistical methods dedicated to the treatment and analysis of high-throughput transcriptome data. We discuss the differential analysis approaches available in the literature for identifying genes associated with a phenotype of interest and propose a comparison study, providing practical recommendations on the appropriate method to use based on various simulation models and real datasets. With the eventual goal of overcoming the inherent instability of differential analysis strategies, we developed an innovative approach called DiAMS, for DIsease Associated Modules Selection. This method selects significant modules of genes rather than individual genes, integrating both transcriptome and protein interaction data in a local-score strategy. We then focus on developing a framework to infer gene regulatory networks by integrating an informative biological prior over network structures using Gaussian graphical models. This approach offers the possibility of exploring the molecular relationships between genes, leading to the identification of altered regulations potentially involved in disease processes. Finally, we apply our statistical developments to study the metastatic relapse of breast cancer.
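For context, the differential-analysis methods compared at the start of the abstract include baselines like the one sketched below: a per-gene Welch t-test with Benjamini-Hochberg correction on simulated data. This is a generic illustration, not DiAMS or any specific method evaluated in the thesis.

```python
# Baseline differential-expression analysis: per-gene Welch t-test with
# Benjamini-Hochberg FDR correction. Generic illustration on simulated
# data; the thesis compares several such methods and proposes
# module-level selection instead.
import numpy as np
from scipy.stats import ttest_ind
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(0)
n_genes, n_per_group = 1000, 10
control = rng.normal(0.0, 1.0, size=(n_genes, n_per_group))
treated = rng.normal(0.0, 1.0, size=(n_genes, n_per_group))
treated[:50] += 2.0  # 50 truly differentially expressed genes

# Welch t-test per gene, then FDR control at 5%.
_, pvals = ttest_ind(control, treated, axis=1, equal_var=False)
rejected, qvals, _, _ = multipletests(pvals, alpha=0.05, method="fdr_bh")
print(f"{rejected.sum()} genes called significant at FDR 0.05")
```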
|
6 |
Distributed Data Integration using Web Services and XML. Mukker, Alka. 20 December 2004.
Data integration has been an active topic of research in the past. With advances in web technology, data integration faces new challenges imposed by the heterogeneity of sources and by their autonomy and independence. Web services, which are universally accessible software components deployed on the web, are becoming the focus of recent research due to their ability to interconnect systems and optimize costs. At the same time, XML has become one of the core technologies for business applications. By offering a standard, flexible, and inherently extensible data format, XML significantly reduces the burden of deploying the many technologies needed to ensure the success of web services. This thesis examines the opportunities for data integration in the context of the web services development paradigm. It reviews the existing technologies and standards of web services and XML, and provides an example of how web services can be used to unlock heterogeneous systems to extract and integrate data. The approach followed to illustrate this uses web service calls embedded inside XML documents. The main contributions of this thesis are: 1) comprehensive research of existing technologies; 2) an architecture to support the invocation of embedded web services; 3) an implementation of an application to show the results; and 4) the use of existing technologies to implement the proposed system.
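To make the idea of web service calls embedded inside XML documents concrete, here is a small sketch of our own devising: a placeholder element names a service endpoint, and a resolver walks the document, replacing each placeholder with the value the service returns. The element name, its attribute, and the stubbed fetch are assumptions, not the thesis's actual architecture.

```python
# Sketch: resolving web service calls embedded in an XML document.
# The <service-call> element, its 'url' attribute, and the stubbed
# fetch() are illustrative assumptions, not the thesis's design.
import xml.etree.ElementTree as ET

DOC = """<customer>
  <name>Acme Corp</name>
  <credit-rating>
    <service-call url="http://ratings.example.com/acme"/>
  </credit-rating>
</customer>"""

def fetch(url):
    # Stand-in for a real SOAP/REST invocation (e.g. via urllib).
    return "AA+"

def resolve(root):
    """Replace each <service-call> child with the service's response."""
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == "service-call":
                parent.remove(child)
                parent.text = fetch(child.get("url"))
    return root

tree = ET.fromstring(DOC)
print(ET.tostring(resolve(tree), encoding="unicode"))
```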
|
7 |
XML-based data integration. Kazlauskas, Marius. 7 January 2005.
Using different information systems, and ensuring correct communication between them, is an important problem today. Modern companies have a large number of major applications that take care of running the business; at different times, different people used different technologies to write these applications for different purposes. Data resources can take many different forms within an organization: relational databases, object databases, XML documents, and so on, and databases can run under different systems and use distinct software. This problem must be addressed by enterprise application integration (EAI) systems. The project analyzes data-level EAI. This type of integration is relatively inexpensive, as it does not incur the expense of changing, testing, and deploying the applications themselves. XML can be a powerful ally for data integration, and the main purpose of this project was to analyze how XML technologies can be used for it. The problem solved is how to use the same XML flow for several purposes: updating other databases, transforming it into an HTML document, and producing a PDF document for printing. These problems are analyzed and solved in a particular domain, the management of a public utility, for which an information system was designed and implemented.
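One leg of the "same XML flow, several outputs" idea, rendering the flow as HTML, can be pictured with the sketch below. It uses the third-party lxml library and an invented document structure; the actual system and schemas are not shown in the abstract.

```python
# Sketch: one XML flow rendered to HTML via XSLT (using lxml). The
# document structure and stylesheet are invented for illustration; the
# same source XML could also feed database updates or PDF generation.
from lxml import etree

xml_doc = etree.XML("""
<meters>
  <meter id="17"><reading>1042</reading></meter>
  <meter id="18"><reading>980</reading></meter>
</meters>""")

xslt_doc = etree.XML("""
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/meters">
    <table>
      <xsl:for-each select="meter">
        <tr><td><xsl:value-of select="@id"/></td>
            <td><xsl:value-of select="reading"/></td></tr>
      </xsl:for-each>
    </table>
  </xsl:template>
</xsl:stylesheet>""")

transform = etree.XSLT(xslt_doc)
print(str(transform(xml_doc)))  # an HTML <table> built from the XML flow
```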
|
8 |
Planning and management system of study modules. Jurčikonis, Dainius. 16 January 2005.
This work analyzes standards and technologies for data transfer between different data sources. The development and use of new databases requires integrating legacy database data into them, and a common integration structure must be created for this purpose. During integration, attention must be paid to name conflicts. XML is used because of its flexibility and simplicity; data exchange through XML also ensures that stored data can be regenerated in the future. The system was implemented and used in the activities of the KTU Computer Department.
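The abstract's two main points, exporting legacy relational data as XML and resolving name conflicts along the way, can be sketched as below. The table, column names, and rename map are hypothetical; the actual KTU system is not described in detail in the abstract.

```python
# Sketch: export legacy relational data to XML, renaming columns to a
# common integration vocabulary to avoid name conflicts. The table,
# columns, and rename map are hypothetical.
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE modul (pav TEXT, dest TEXT)")  # legacy names
conn.execute("INSERT INTO modul VALUES ('Databases', 'J. Smith')")

# Name-conflict resolution: legacy column -> shared vocabulary.
RENAME = {"pav": "title", "dest": "lecturer"}

root = ET.Element("modules")
cur = conn.execute("SELECT pav, dest FROM modul")
cols = [d[0] for d in cur.description]
for row in cur:
    mod = ET.SubElement(root, "module")
    for col, value in zip(cols, row):
        ET.SubElement(mod, RENAME.get(col, col)).text = value

print(ET.tostring(root, encoding="unicode"))
# <modules><module><title>Databases</title>...</module></modules>
```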
|
9 |
SociQL: a query language for the social web. Serrano Suarez, Diego Fernando. Unknown date.
No description available.
|
10 |
SociQL: a query language for the social web. Serrano Suarez, Diego Fernando. 06 1900.
Social network sites are becoming increasingly popular and useful, as well as relevant means for serious social research. However, despite their user appeal and wide adoption, the current generation of sites is hard to query and explore, offering limited views of local network neighbourhoods. Moreover, these sites are disconnected islands of information due to application and interface differences. We describe SociQL, a query language, along with a prototype implementation, that enables the representation, querying, and exploration of disparate social networks. Unlike generic web query languages, SociQL is designed to support the examination of sociological questions, incorporating social theory and integrating networks to form a single unified source of information. The thesis discusses the design and rationale for the elements of the language, and reports on our experiences in querying real social network sites with it.
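The integration step the abstract alludes to, fusing several sites into a single unified source of information, can be pictured with the sketch below, which merges two toy friend graphs under a shared identity map and then queries the merged neighbourhood. This is our own illustration in Python/networkx; it does not reflect SociQL's actual syntax or implementation.

```python
# Sketch: unify two social network sites into one graph using a shared
# identity map, then query across both. Site data, the identity map,
# and all names are invented; this is not SociQL syntax.
import networkx as nx

site_a_friends = [("alice", "bob"), ("bob", "carol")]
site_b_follows = [("a.smith", "d.jones")]

# Identity resolution: per-site handle -> canonical person.
identity = {"alice": "Alice", "bob": "Bob", "carol": "Carol",
            "a.smith": "Alice", "d.jones": "Dana"}

g = nx.Graph()
for u, v in site_a_friends:
    g.add_edge(identity[u], identity[v], source="site_a")
for u, v in site_b_follows:
    g.add_edge(identity[u], identity[v], source="site_b")

# A query spanning both sites: who is in Alice's unified neighbourhood?
print(sorted(g.neighbors("Alice")))  # ['Bob', 'Dana']
```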
|