
Interoperability between heterogeneous and distributed biodiversity data sources in structured data networks

The extensive capture of biodiversity data and its storage in heterogeneous information systems accessible over the internet across the globe has created many interoperability problems. One is that data providers are independent of one another and may run systems that were developed on different platforms, at different times, using different software products, to meet different information needs. A second arises from data modelling: the conversion of real-world data into a computerised data structure is not governed by a universal standard. Most importantly, interoperation between these disparate data sources is needed to obtain accurate and useful information for further analysis and decision making. A single, universal data definition structure for representing a biodiversity entity in software would be ideal, but this is not necessarily possible when integrating data from independently developed systems. Independent teams modelling the same real-world entity from different perspectives will use different terminologies, definitions, and representations of attributes and operations for it.

The research in this thesis is concerned with designing and developing a flexible, interoperable framework that allows data integration between various distributed and heterogeneous biodiversity data sources that adopt XML standards for data communication. In particular, the problems of scope and representational heterogeneity among the various XML data schemas are addressed. To demonstrate this research, a prototype system called BUFFIE (Biodiversity Users' Flexible Framework for Interoperability Experiments) was designed using a hybrid of object-oriented and functional design principles. The system accepts query information from the user in a web form and constructs an XML query. This request is enriched and made more specific to each data provider using the provider information stored in a repository. The requests are sent over HTTP to the different heterogeneous data resources across the internet. The responses arrive in varied XML formats and are integrated using knowledge mapping rules defined in XSLT and XML. The XML mappings are derived from a biodiversity domain knowledge base defined for schema mappings of different data exchange protocols. The integrated results are presented to users or client programs for further analysis.

The main results of this thesis are: (1) a framework model that allows interoperation between heterogeneous data source systems; (2) enriched querying, which improves the accuracy of responses by finding the correct information among autonomous, distributed and heterogeneous data resources; and (3) a methodology that provides a foundation for extensibility, since any new network data standard in XML can be added to the existing protocols. The presented approach shows that (1) semi-automated mapping and integration of datasets from heterogeneous and autonomous data providers is feasible, and (2) query enrichment and data integration allow useful data to be queried and harvested from various data providers for helpful analysis.
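The query enrichment and response integration steps described in the abstract can be illustrated with a minimal sketch. This is not BUFFIE's code. The provider names, XML element names, and sample responses are all hypothetical. A plain Python mapping table stands in for the XSLT mapping rules the thesis uses, and no HTTP requests are made.

```python
import xml.etree.ElementTree as ET

# Hypothetical provider repository (an assumption for illustration): each
# provider names the same concepts differently in its own XML schema.
PROVIDERS = {
    "providerA": {
        "record": "specimen",
        "fields": {"scientific_name": "sciName", "country": "country"},
    },
    "providerB": {
        "record": "unit",
        "fields": {
            "scientific_name": "taxon/fullName",
            "country": "gathering/countryName",
        },
    },
}


def enrich_query(provider, scientific_name):
    """Turn a generic user query into a provider-specific XML request."""
    rules = PROVIDERS[provider]
    request = ET.Element("request", {"provider": provider})
    # Address the filter using the provider's own path for the concept.
    flt = ET.SubElement(
        request, "filter", {"path": rules["fields"]["scientific_name"]}
    )
    flt.text = scientific_name
    return ET.tostring(request, encoding="unicode")


def integrate(responses):
    """Map each provider's XML response onto one common record layout."""
    merged = []
    for provider, xml_text in responses.items():
        rules = PROVIDERS[provider]
        root = ET.fromstring(xml_text)
        for rec in root.iter(rules["record"]):
            row = {"source": provider}
            for common_name, path in rules["fields"].items():
                row[common_name] = rec.findtext(path)
            merged.append(row)
    return merged


# Hypothetical responses in two different XML formats.
SAMPLE_RESPONSES = {
    "providerA": (
        "<result><specimen><sciName>Puma concolor</sciName>"
        "<country>Peru</country></specimen></result>"
    ),
    "providerB": (
        "<units><unit><taxon><fullName>Puma concolor</fullName></taxon>"
        "<gathering><countryName>Brazil</countryName></gathering>"
        "</unit></units>"
    ),
}
```

Calling `integrate(SAMPLE_RESPONSES)` yields one list of records with the same keys regardless of the source schema. Under this sketch's assumptions, supporting a new XML standard means adding one entry to `PROVIDERS`, which mirrors the extensibility claim in the abstract.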

Identifier: oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:567106
Date: January 2010
Creators: Sundaravadivelu, Rathinasabapathy
Publisher: Cardiff University
Source Sets: Ethos UK
Detected Language: English
Type: Electronic Thesis or Dissertation
Source: http://orca.cf.ac.uk/18086/
