21

Exploring Strategies to Integrate Disparate Bioinformatics Datasets

Fakhry, Charbel Bader 01 January 2019 (has links)
Distinct bioinformatics datasets make it challenging for bioinformatics specialists to locate the required datasets and unify their format for result extraction. The purpose of this single case study was to explore strategies to integrate distinct bioinformatics datasets. The technology acceptance model was used as the conceptual framework to understand the perceived usefulness and ease of use of integrating bioinformatics datasets. The population of this study included bioinformatics specialists of a research institution in Lebanon that has strategies to integrate distinct bioinformatics datasets. The data collection process included interviews with 6 bioinformatics specialists and a review of 27 organizational documents relating to integrating bioinformatics datasets. Thematic analysis was used to identify codes and themes related to integrating distinct bioinformatics datasets. Key themes resulting from the data analysis included a focus on integrating bioinformatics datasets, adding metadata to submitted bioinformatics datasets, a centralized bioinformatics database, resources, and bioinformatics tools. Analysis of the findings showed that specialists who promote standardization techniques, the addition of metadata, and centralization may increase efficiency in integrating distinct bioinformatics datasets. Bioinformaticians, bioinformatics providers, the health care field, and society might benefit from this research. Improvements in bioinformatics positively affect the health care field, which constitutes positive social change. The results of this study might also lead to positive social change in research institutions, such as reduced workload, less frustration, reduced costs, and increased efficiency in integrating distinct bioinformatics datasets.
22

A Property Valuation Model for Rural Victoria

Hayles, Kelly, kellyhayles@iinet.net.au January 2006 (has links)
Licensed valuers in the State of Victoria, Australia currently appraise rural land using manual techniques. Manual techniques typically involve site visits to the property and liaison with property owners through interviews, and require a valuer experienced in agricultural properties to determine a value. Manual techniques typically take longer to determine a property value than automated techniques, provided appropriate data are available. Manual methods of valuation can be subjective and lead to bias in valuation estimates, especially where valuers have varying levels of experience within a specific regional area. Automation may lend itself to more accurate valuation estimates by providing greater consistency between valuations. Automated techniques presently in use for valuation include artificial neural networks, expert systems, case-based reasoning and multiple regression analysis. The latter technique appears to be the most widely used for valuation. The research aimed to develop a conceptual rural property valuation model, and to develop and evaluate quantitative models for rural property valuation based on the variables identified in the conceptual model. The conceptual model was developed by examining peer research, Valuation Best Practice Standards (a standard in use throughout Victoria for rating valuations), and rural property valuation texts. Using data that are only available digitally and publicly, the research assessed this conceptualisation using properties from four LGAs in the Wellington and Wimmera Catchment Management Authority (CMA) areas in Victoria. Cluster analysis was undertaken to assess whether statistically determined sub-markets can lead to models that are more accurate than sub-markets determined using geographically defined areas. The research is divided into two phases: the 'available data phase' and the 'restricted data phase'. The 'available data phase' used publicly available digital data to build quantitative models to estimate the value of rural properties. The 'restricted data phase' used data that became available near the completion of the research. The research examined the effect of using statistically derived sub-markets as opposed to geographically derived ones for property valuation. Cluster analysis was used during both phases of model development and showed that one of the clusters developed in the available data phase was superior in its model prediction compared to the models produced using geographically derived regions. A number of limitations with the digital property data available for Victoria were found. Although GIS analysis can enable more property characteristics to be derived and measured from existing data, it is reliant on having access to suitable digital data. The research also identified limitations with the metadata elements in use in Victoria (ANZMETA DTD version 1). It is hypothesised that to further refine the models and achieve greater levels of price estimation, additional properties would need to be sourced and added to the current property database. It is suggested that additional research needs to address issues associated with sub-market identification. If the results of additional modelling indicated significantly different levels of price estimation, then these models could be used with manual techniques to evaluate manually derived valuation estimates.
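As an illustrative sketch (not the thesis's models or data), the snippet below shows one way statistically derived sub-markets could be formed with k-means clustering, with a separate regression-based valuation model fitted per sub-market; the attribute names, synthetic data and cluster count are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Synthetic properties with hypothetical attributes: land area (ha),
# distance to town (km), annual rainfall (mm)
X = rng.uniform(low=[10, 1, 300], high=[2000, 120, 900], size=(500, 3))
price = 1500 * X[:, 0] - 800 * X[:, 1] + 50 * X[:, 2] + rng.normal(0, 5e4, 500)

# Statistically derived sub-markets instead of geographically defined regions
submarkets = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# One valuation model per sub-market, with in-sample fit reported
for k in range(3):
    mask = submarkets == k
    model = LinearRegression().fit(X[mask], price[mask])
    print(f"sub-market {k}: n={mask.sum()}, R^2={model.score(X[mask], price[mask]):.3f}")
```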
23

Query Processing for Peer Mediator Databases

Katchaounov, Timour January 2003 (has links)
The ability to physically interconnect many distributed, autonomous and heterogeneous software systems on a large scale presents new opportunities for sharing and reuse of existing information and computational services, and for the creation of new ones. However, finding and combining information in many such systems is a challenge even for the most advanced computer users. To address this challenge, mediator systems logically integrate many sources to hide their heterogeneity and distribution and give the users the illusion of a single coherent system.

Many new areas, such as scientific collaboration, require cooperation between many autonomous groups willing to share their knowledge. These areas require that the data integration process can be distributed among many autonomous parties, so that large integration solutions can be constructed from smaller ones. For this we propose a decentralized mediation architecture, peer mediator systems (PMS), based on the peer-to-peer (P2P) paradigm. In a PMS, reuse of human effort is achieved through logical composability of the mediators in terms of other mediators and sources, by defining mediator views in terms of views in other mediators and sources.

Our thesis is that logical composability in a P2P mediation architecture is an important requirement and that composable mediators can be implemented efficiently through query processing techniques.

In order to compute answers to queries in a PMS, logical mediator compositions must be translated into query execution plans, where mediators and sources cooperate to compute query answers. The focus of this dissertation is on query processing methods to realize composability in a PMS architecture in an efficient way that scales over the number of mediators.

Our contributions consist of an investigation of the interfaces and capabilities of peer mediators, and the design, implementation and experimental study of several query processing techniques that realize composability in an efficient and scalable way.
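A minimal sketch of the logical composability idea, under the assumption of tiny in-memory sources and hypothetical mediator and source names (not the PMS implementation itself): each mediator's view is defined over the views exported by its peers, and a query is answered by recursive delegation.

```python
class Source:
    """An autonomous data source exporting a simple query interface."""
    def __init__(self, rows):
        self.rows = rows

    def query(self, predicate):
        return [r for r in self.rows if predicate(r)]


class Mediator:
    """A mediator whose view is defined over the views of its peers
    (other mediators and/or sources)."""
    def __init__(self, peers):
        self.peers = peers

    def query(self, predicate):
        # Logical composition: delegate to peers and combine their answers
        results = []
        for peer in self.peers:
            results.extend(peer.query(predicate))
        return results


# Two autonomous sources and a two-level mediator composition (hypothetical data)
source_a = Source([{"gene": "BRCA1", "organism": "human"}])
source_b = Source([{"gene": "TP53", "organism": "human"},
                   {"gene": "Trp53", "organism": "mouse"}])
lab_mediator = Mediator([source_a, source_b])   # integrates two sources
top_mediator = Mediator([lab_mediator])         # defined over another mediator

print(top_mediator.query(lambda r: r["organism"] == "human"))
```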
24

Tabular Representation of Schema Mappings: Semantics and Algorithms

Rahman, Md. Anisur 27 May 2011 (has links)
This thesis investigates a mechanism for representing schema mappings in tabular form and examines the utility of the new representation. A schema mapping is a high-level specification that describes the relationship between two database schemas. Schema mappings constitute essential building blocks of data integration, data exchange and peer-to-peer data sharing systems. Global-and-local-as-view (GLAV) is one of the approaches for specifying schema mappings. Tableaux are used for expressing queries and functional dependencies on a single database in a tabular form. In this thesis, we first introduce a tabular representation of GLAV mappings. We find that this tabular representation helps to solve many mapping-related algorithmic and semantic problems. For example, a well-known problem is to find the minimal instance of the target schema for a given instance of the source schema and a set of mappings between the source and the target schema. Second, we show that our proposed tabular mapping can be used as an operator on an instance of the source schema to produce an instance of the target schema that is 'minimal' and 'most general' in nature. There exists a tableaux-based mechanism for finding the equivalence of two queries. Third, we extend that mechanism to deduce equivalence between two schema mappings using their corresponding tabular representations. Sometimes there exist redundant conjuncts in a schema mapping, which make data exchange, data integration and data sharing operations more time consuming. Fourth, we present an algorithm that utilizes the tabular representations to reduce the number of constraints in the schema mappings. At present, either schema-level mappings or data-level mappings are used for data sharing purposes. Fifth, we introduce and give the semantics of bi-level mappings, which combine schema-level and data-level mappings. We also show that bi-level mappings are more effective for data sharing systems. Finally, we implemented our algorithms and developed a software prototype to evaluate our proposed strategies.
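The sketch below illustrates, in a much simplified form, the idea of applying a tabular mapping as an operator on a source instance to produce a target instance with labeled nulls for undetermined values; the relation names and the single mapping are hypothetical, and this is not the thesis's algorithm.

```python
import itertools

# Hypothetical source instance: Emp(name, dept)
source_instance = {"Emp": [("alice", "biology"), ("bob", "physics")]}

# Tabular GLAV-style mapping (illustrative): for every Emp(name, dept) there
# exist Works(name, D) and Dept(D, dept) for some department id D
_nulls = itertools.count(1)

def apply_mapping(instance):
    target = {"Works": [], "Dept": []}
    for name, dept in instance["Emp"]:
        d = f"N{next(_nulls)}"   # labeled null: value not determined by the source
        target["Works"].append((name, d))
        target["Dept"].append((d, dept))
    return target

# Produces a 'most general' target instance using labeled nulls
print(apply_mapping(source_instance))
```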
25

A Practical Approach to Merging Multidimensional Data Models

Mireku Kwakye, Michael 30 November 2011 (has links)
Schema merging is the process of incorporating data models into an integrated, consistent schema from which query solutions satisfying all incorporated models can be derived. The efficiency of such a process relies on the effective semantic representation of the chosen data models, as well as the mapping relationships between the elements of the source data models. Consider a scenario where, as a result of company mergers or acquisitions, a number of related but possibly disparate data marts need to be integrated into a global data warehouse. The ability to retrieve data across these disparate, but related, data marts poses an important challenge. Intuitively, forming an all-inclusive data warehouse involves the tedious tasks of identifying related fact and dimension table attributes, as well as the design of a schema merge algorithm for the integration. Additionally, the evaluation of the combined set of correct answers to queries, likely to be independently posed to such data marts, becomes difficult to achieve. Model management refers to a high-level, abstract programming language designed to efficiently manipulate schemas and mappings. In particular, model management operations such as match, compose mappings, apply functions and merge offer a way to handle the above-mentioned data integration problem within the domain of data warehousing. In this research, we introduce a methodology for the integration of star schema source data marts into a single consolidated data warehouse based on model management. In our methodology, we develop three main streamlined steps to facilitate the generation of a global data warehouse. That is, we adopt techniques for deriving attribute correspondences and for schema mapping discovery. Finally, we formulate and design a merge algorithm based on multidimensional star schemas, which is the core contribution of this research. Our approach focuses on delivering a polynomial-time solution needed for the expected volume of data and its associated large-scale query processing. The experimental evaluation shows that an integrated schema, alongside instance data, can be derived based on the type of mappings adopted in the mapping discovery step. The adoption of Global-And-Local-As-View (GLAV) mapping models delivered a maximally-contained or exact representation of all fact and dimensional instance data tuples needed in query processing on the integrated data warehouse. Additionally, different forms of conflicts, such as semantic conflicts for related or unrelated dimension entities and descriptive conflicts for differing attribute data types, were encountered and resolved in the developed solution. Finally, this research has highlighted some critical and inherent issues regarding functional dependencies in mapping models, integrity constraints at the source data marts, and multi-valued dimension attributes. These issues were encountered during the integration of the source data marts, as was the case when evaluating queries processed on the merged data warehouse against those on the independent data marts.
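A hedged sketch of the attribute-correspondence and merge steps for two hypothetical star-schema fact tables: correspondences are derived here by simple name matching and rows are unioned over the merged attribute set, which only approximates the model-management operators described above.

```python
# Two hypothetical star-schema fact tables from different data marts
fact_a = {"columns": ["date_id", "store_id", "sales_amt"],
          "rows": [(20110101, 10, 99.0)]}
fact_b = {"columns": ["date_id", "store_id", "units_sold"],
          "rows": [(20110101, 11, 5)]}

def merge_facts(a, b):
    # Attribute correspondences by simple name matching; merged attribute set
    merged_cols = list(dict.fromkeys(a["columns"] + b["columns"]))

    def lift(table):
        idx = {c: i for i, c in enumerate(table["columns"])}
        # Pad attributes missing from this source with None
        return [tuple(row[idx[c]] if c in idx else None for c in merged_cols)
                for row in table["rows"]]

    return {"columns": merged_cols, "rows": lift(a) + lift(b)}

print(merge_facts(fact_a, fact_b))
```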
28

Manifold Integration: Data Integration on Multiple Manifolds

Choi, Hee Youl May 2010 (has links)
In data analysis, data points are usually analyzed based on their relations to other points (e.g., distance or inner product). This kind of relation can be analyzed on the manifold of the data set. Manifold learning is an approach to understanding such relations. Various manifold learning methods have been developed and their effectiveness has been demonstrated in many real-world problems in pattern recognition and signal processing. However, most existing manifold learning algorithms only consider one manifold based on one dissimilarity matrix. In practice, multiple measurements may be available and could be utilized. In pattern recognition systems, data integration has been an important consideration for improved accuracy given multiple measurements. Some data integration algorithms have been proposed to address this issue. These integration algorithms mostly use statistical information from the data set, such as the uncertainty of each data source, but they do not use the structural information (i.e., the geometric relations between data points). Such a structure is naturally described by a manifold. Even though manifold learning and data integration have been successfully used for data analysis, they have not been considered in a single integrated framework. When we have multiple measurements generated from the same data set and mapped onto different manifolds, those measurements can be integrated using the structural information on these multiple manifolds. Furthermore, we can better understand the structure of the data set by combining multiple measurements in each manifold using data integration techniques. In this dissertation, I present a new concept, manifold integration, a data integration method using the structure of data expressed in multiple manifolds. In order to achieve manifold integration, I formulated the manifold integration concept and derived three manifold integration algorithms. Experimental results showed the algorithms' effectiveness in classification and dimension reduction. Moreover, I showed that manifold integration has promising theoretical and neuroscientific applications. I expect the manifold integration approach to serve as an effective framework for analyzing multimodal data sets on multiple manifolds. Also, I expect that my research on manifold integration will catalyze both manifold learning and data integration research.
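One simplified reading of manifold integration, sketched below under the assumption of two noisy measurements of the same synthetic points (not the dissertation's algorithms): similarity kernels built from each measurement are averaged and a joint spectral embedding is computed from the combined kernel.

```python
import numpy as np

rng = np.random.default_rng(1)
points = rng.normal(size=(100, 5))
measurement_a = points + 0.1 * rng.normal(size=points.shape)  # noisy view 1
measurement_b = points + 0.1 * rng.normal(size=points.shape)  # noisy view 2

def rbf_kernel(X, gamma=0.5):
    # Pairwise similarities describing the manifold of one measurement
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

# Integrate the two manifolds by averaging their similarity kernels
K = 0.5 * (rbf_kernel(measurement_a) + rbf_kernel(measurement_b))

# Joint 2-D spectral embedding from the top eigenpairs of the combined kernel
eigvals, eigvecs = np.linalg.eigh(K)
embedding = eigvecs[:, -2:] * np.sqrt(eigvals[-2:])
print(embedding.shape)  # (100, 2)
```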
29

Production Data Integration into High Resolution Geologic Models with Trajectory-based Methods and A Dual Scale Approach

Kim, Jong Uk August 2009 (has links)
Inverse problems associated with reservoir characterization are typically underdetermined and often have difficulties associated with the stability and convergence of the solution. A common approach to address this issue is the introduction of prior constraints, regularization or reparameterization to reduce the number of estimated parameters. We propose a dual scale approach to production data integration that relies on a combination of coarse-scale and fine-scale inversions while preserving the essential features of the geologic model. To begin with, we sequentially coarsen the fine-scale geological model by grouping layers in such a way that the heterogeneity measure of an appropriately defined 'static' property is minimized within the layers and maximized between the layers. Our coarsening algorithm results in a non-uniform coarsening of the geologic model with minimal loss of heterogeneity, and the 'optimal' number of layers is determined based on a bias-variance trade-off criterion. The coarse-scale model is then updated using production data via a generalized travel time inversion. The coarse-scale inversion proceeds much faster compared to a direct fine-scale inversion because of the significantly reduced parameter space. Furthermore, the iterative minimization is much more effective because at the larger scales there are fewer local minima and those tend to be farther apart. At the end of the coarse-scale inversion, a fine-scale inversion may be carried out, if needed. This constitutes the outer iteration in the overall algorithm. The fine-scale inversion is carried out only if the data misfit is deemed to be unsatisfactory. We propose a fast and robust approach to calibrating geologic models by transient pressure data using a trajectory-based approach based on a high-frequency asymptotic expansion of the diffusivity equation. Trajectory- or ray-based methods are routinely used in seismic tomography. In this work, we investigate seismic rays and compare them with streamlines. We then examine the applicability of streamline-based methods for transient pressure data inversion. Specifically, the high-frequency asymptotic approach allows us to analytically compute the sensitivity of the pressure responses with respect to reservoir properties such as porosity and permeability. It facilitates a very efficient methodology for the integration of pressure data into geologic models.
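A hedged sketch of the layer-grouping idea: adjacent layer groups are greedily merged so that the within-group variance of a 'static' property grows as little as possible until a target coarse layer count is reached; the property values, cost measure and greedy strategy are illustrative stand-ins for the thesis's heterogeneity measure and bias-variance criterion.

```python
import numpy as np

rng = np.random.default_rng(2)
layer_property = rng.lognormal(mean=3.0, sigma=1.0, size=40)  # e.g., layer permeability

def within_variance(group):
    # Size-weighted within-group variance of the static property
    return float(np.var(group) * len(group))

def coarsen(values, n_coarse):
    groups = [[v] for v in values]
    while len(groups) > n_coarse:
        # Cost of merging each pair of adjacent groups
        costs = [within_variance(groups[i] + groups[i + 1])
                 - within_variance(groups[i]) - within_variance(groups[i + 1])
                 for i in range(len(groups) - 1)]
        i = int(np.argmin(costs))
        groups[i:i + 2] = [groups[i] + groups[i + 1]]  # cheapest adjacent merge
    return groups

coarse_layers = coarsen(list(layer_property), n_coarse=8)
print([len(g) for g in coarse_layers])  # sizes of the non-uniform coarse layers
```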
30

Field scale history matching and assisted history matching using streamline simulation

Kharghoria, Arun 15 November 2004 (has links)
In this study, we apply the streamline-based production data integration method to condition a multimillion-cell geologic model to the historical production response of a giant Saudi Arabian reservoir. The field has been under peripheral water injection with 16 injectors and 70 producers. There is also a strong aquifer influx into the field. A total of 30 years of production history with detailed rate, infill well and re-perforation schedules were incorporated via multiple pressure updates during streamline simulation. Also, gravity and compressibility effects were included to account for water slumping and aquifer support. To our knowledge, this is the first and the largest such application of production data integration to geologic models accounting for realistic field conditions. We have developed novel techniques to analytically compute the sensitivities of the production response in the presence of gravity and changing field conditions. This makes our method computationally extremely efficient. The field application takes less than 6 hours to run on a PC. The geologic model derived after conditioning to the production response was validated using field surveillance data. In particular, the flood front movement, the aquifer encroachment and the bypassed oil locations obtained from the geologic model were found to be consistent with field observations. Finally, an examination of the permeability changes during production data integration revealed that most of these changes were aligned along the facies distribution, particularly the 'good' facies distribution, with no resulting loss in geologic realism. We also propose a novel assisted history matching procedure for finite difference simulators using streamline-derived sensitivity calculations. Unlike existing assisted history matching techniques, where the user is required to manually adjust the parameters, this procedure combines the rigor of finite difference models and the efficiency of streamline simulators to perform history matching. The finite difference simulator is used to solve for pressure, flux and saturations, which, in turn, are used as input to the streamline simulator for estimating the parameter sensitivities analytically. The streamline-derived sensitivities are then used to update the reservoir model. The updated model is then used in the finite difference simulator in an iterative mode until a satisfactory history match is obtained. The assisted history matching procedure has been tested on both synthetic and field examples. The results show a significant speed-up in history matching compared with conventional finite difference simulators.
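A minimal sketch of the outer loop of the assisted history matching procedure described above; the finite-difference and streamline simulator calls are hypothetical stubs, and the update rule is a simple illustrative placeholder rather than the actual generalized travel time inversion.

```python
import numpy as np

rng = np.random.default_rng(3)
observed = rng.normal(size=20)   # observed production response (placeholder data)
model = np.zeros(20)             # reservoir parameters, e.g., permeability multipliers

def run_finite_difference(m):
    # Stub for the finite difference simulator: pressure/flux/saturation solve
    return m + 0.05 * rng.normal(size=m.size)

def streamline_sensitivities(m):
    # Stub for the streamline-based analytic sensitivity (Jacobian) calculation
    return np.eye(m.size)

for iteration in range(10):
    simulated = run_finite_difference(model)
    residual = observed - simulated
    misfit = float(np.sum(residual ** 2))
    print(f"iteration {iteration}: misfit = {misfit:.3f}")
    if misfit < 1e-2:
        break
    J = streamline_sensitivities(model)
    model = model + 0.5 * J.T @ residual  # illustrative update using the sensitivities
```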
