Information fusion in taxonomic descriptions.

Providing a single access point to information drawn from multiple sources is helpful in many fields. As a case study, this research investigates the potential of applying information fusion techniques in the biodiversity domain, where researchers need information from different sources to support decision making on tasks such as biological identification. Collections in this area are massive, and the descriptive materials on the same species (object) are scattered across different places, so manually collecting them into a broader, integrated description is not easy.

Floras, among the most important descriptive materials in this field, are selected as the target of this research. The research tests a hypothesis concerning the organization of text and the constancy of fact-based information in text. We observe that an individual description may not contain sufficient information to differentiate the target species from others, and that different information sources may contain not only overlapping information but also helpful complementary information. We also observe that non-trivial complementary information can come from descriptions at different levels (family, genus, or species) within the same source. Using a sample dataset from Flora of North America (FNA) and Flora of China (FOC), we found that about 50% of the information could be found in only a single source and that another 25% of complementary information could be identified by fusion. Most importantly, conflicting information could be detected only by direct comparison.

The question is how to fuse the records in an automatic or semi-automatic manner so that each resulting record provides a broader yet non-redundant description of each species. The proposed system demonstrates that this is feasible with currently available techniques. The prototype system contains four modules: Text Segmentation and Taxonomic Name Identification, Organ-level and Sub-organ-level Information Extraction, Relationship Identification, and Information Fusion. Using sample descriptions from FNA and FOC, we demonstrate that the method achieves promising fusion results based on Cross-Description Relationships. From the evaluation results, we identify the key factors that contribute to fusion performance and discuss some methods that might lead to further improvement.

This study also demonstrates that, to a certain extent, this fusion approach is generalizable. Generalizability is a challenging problem because fusion methods are typically domain- and task-oriented. We identify the challenges that arise when applying the approach to a different data set.
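The three relationships the abstract distinguishes between descriptions (overlapping, complementary, and conflicting information) can be illustrated with a minimal sketch. This is not the prototype system from the thesis: the record shape, the function name `fuse`, and the toy character values are all assumptions made for illustration, and exact string comparison stands in for whatever relationship identification the actual system performs.

```python
from typing import Dict, Tuple

# Hypothetical record shape: (organ, character) -> value.
# The abstract does not specify a data model; this is an assumption.
Record = Dict[Tuple[str, str], str]


def fuse(a: Record, b: Record):
    """Merge two descriptions of the same taxon.

    Returns the fused record, the conflicting entries, and the
    relationship label assigned to every (organ, character) key.
    """
    fused: Record = {}
    conflicts: Dict[Tuple[str, str], Tuple[str, str]] = {}
    relations: Dict[Tuple[str, str], str] = {}
    for key in sorted(a.keys() | b.keys()):
        in_a, in_b = key in a, key in b
        if in_a and in_b:
            if a[key] == b[key]:
                # Same statement in both sources: keep one copy.
                relations[key] = "overlap"
                fused[key] = a[key]
            else:
                # Only visible by direct comparison; left for review.
                relations[key] = "conflict"
                conflicts[key] = (a[key], b[key])
        else:
            # Present in one source only: adds new information.
            relations[key] = "complementary"
            fused[key] = a[key] if in_a else b[key]
    return fused, conflicts, relations


# Made-up toy values, not taken from FNA or FOC.
source_a = {
    ("leaf", "shape"): "ovate",
    ("leaf", "margin"): "serrate",
    ("flower", "color"): "white",
}
source_b = {
    ("leaf", "shape"): "ovate",
    ("flower", "color"): "pink",
    ("fruit", "type"): "capsule",
}

fused, conflicts, relations = fuse(source_a, source_b)
for key, label in relations.items():
    print(key, label)
```

In this sketch conflicting values are kept out of the fused record and returned separately, which matches a semi-automatic workflow in which a person resolves them; a real system would also need to normalize terminology before comparing values.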

http://thesis.lib.nccu.edu.tw/cgi-bin/cdrfb3/gsweb.cgi?o=dstdcdr&i=sid=%22U0003496771%22.

Library Science.

Information Science.

Identifier	oai:union.ndltd.org:CHENGCHI/U0003496771
Creators	Wei, Qin.
Publisher	University of Illinois at Urbana-Champaign.
Source Sets	National Chengchi University Libraries
Detected Language	English
Type	text
Rights	Copyright © nccu library on behalf of the copyright holders
