Modeling species geographic distributions in aquatic ecosystems using a density-based clustering algorithm

Distributional ecology is a branch of ecology that aims to reconstruct and predict the geographic range of free-living and symbiotic organisms in terrestrial and aquatic ecosystems. More recently, distributional ecology has been used to map disease transmission risk. Its application to disease transmission has, however, been erroneous in many cases, and an inaccurate representation of disease distribution is detrimental to effective control and prevention. Furthermore, ecological niche modeling experiments are generally developed and tested using data from terrestrial organisms, neglecting aquatic organisms in case studies. Both disease and aquatic systems are often data limited, and current modeling methods are often insufficient. There is, therefore, a need to develop data-driven models that perform accurately even when only limited amounts of data are available or when there is little to no knowledge of the natural history of the species to be modeled.

Here, I propose a data-driven ecological niche modeling method that requires presence-only data (i.e., absence, pseudoabsence, or background records are not needed for model calibration). My method is expected to reconstruct environmental conditions where data-limited aquatic organisms are more likely to be present, using a density-based clustering algorithm as a proxy of the realized niche (i.e., the abiotic and biotic environmental conditions occupied by the organism). Supported by ecological theories and methods, my central hypothesis is that because density-based clustering machine-learning modeling prevents extrapolation and interpolation, it can robustly reconstruct the realized niche of a data-limited aquatic organism.

First, I assembled a comprehensive dataset of abiotic (temperature) and biotic (phytoplankton) environmental conditions and presence reports for Vibrio cholerae, a well-understood aquatic bacterial species found in coastal waters globally (Chapter 2). Second, using V. cholerae as a model system, I developed detailed parameterizations of density-based clustering models to determine the parameter values with the best capacity to reconstruct and predict the species' distribution in global seawaters (Chapter 3). Finally, I compared the performance of density-based clustering modeling against traditional, correlative machine-learning ecological niche modeling methods (Chapter 4). Density-based clustering models, when assessed on model fit and prediction, performed comparably to the traditional 'data-hungry' machine-learning correlative methods used in modern applications of ecological niche modeling.

Modeling the environmental and geographic ranges of V. cholerae, an aquatic organism with free-living and parasitic ecologies, is itself a novel approach in distributional ecology. Ecological niche modeling applications to pathogens, such as V. cholerae, provide an opportunity to further the knowledge of directly transmitted emerging diseases for which only limited data are available. Density-based clustering ecological niche modeling is termed here Marble, honoring a previous, experimental version of this analytical approach, and is expected to provide new opportunities to understand how an ecological niche modeling method influences estimates of the distribution of data-limited organisms of complex ecology. These lessons are applicable to novel, rare, and cryptic aquatic organisms, such as emerging diseases, endangered fishes, and elusive aquatic species.

Master of Science

Distributional ecology is a branch of ecology that aims to reconstruct and predict the geographic distribution of land and water organisms. In the case of diseases, a correct representation of their geographic distributions is key for successful management. Previous studies highlight the need to develop new models that perform accurately even when limited amounts of data are available and there is little to no knowledge of the organisms' ecology.

This thesis proposes a data-driven method, originally termed Marble. Marble is expected to help reconstruct environmental conditions where data-limited aquatic organisms are more likely to be found. Supported by ecological theories and methods, my hypothesis is that because Marble prevents under- and over-fitting, it will produce results that better fit the data. Using V. cholerae, an aquatic organism, as a model system, I compared the performance of Marble against other traditional modeling algorithms. I found that Marble, in terms of model fit, performed similarly to traditional methods used in distributional ecology. Modeling the ecology of V. cholerae is a new approach in and of itself in ecological modeling. Furthermore, modeling pathogens provides an opportunity to further the knowledge of directly transmitted diseases, and Marble is expected to provide opportunities to understand how algorithm selection can reconstruct (or not) the distribution of data-limited aquatic organisms of diverse ecologies.
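The general idea of using density-based clustering on presence-only records as a proxy of the realized niche can be sketched as follows. This is an illustration only, not the thesis's Marble implementation: the abstract does not name the clustering algorithm, so a plain DBSCAN-style procedure is assumed here, and the function names (`dbscan`, `in_niche`), the parameters (`eps`, `min_pts`), and the presence values are all hypothetical. The two environmental axes stand in for temperature and a phytoplankton measure, with synthetic values rescaled to the range 0 to 1.

```python
from math import dist

def dbscan(points, eps, min_pts):
    """Assumed DBSCAN-style clustering: returns a cluster id per point, -1 for noise."""
    labels = [None] * len(points)

    def neighbors(i):
        # All points (including i itself) within eps of point i.
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue  # already assigned; do not expand again
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # j is a core point, so keep expanding
    return labels

def in_niche(env, points, labels, eps):
    """True if env lies within eps of any clustered (non-noise) presence record."""
    return any(lab != -1 and dist(env, p) <= eps
               for p, lab in zip(points, labels))

# Synthetic presence records in environmental space (hypothetical values):
# (rescaled temperature, rescaled phytoplankton measure).
presences = [(0.70, 0.40), (0.72, 0.42), (0.68, 0.41),
             (0.71, 0.38), (0.69, 0.43), (0.10, 0.90)]

EPS, MIN_PTS = 0.05, 3  # illustrative parameter values, not tuned
labels = dbscan(presences, EPS, MIN_PTS)

# Conditions near the dense group of presences fall inside the niche proxy;
# conditions near the isolated record, or far from all records, do not.
print(in_niche((0.70, 0.41), presences, labels, EPS))
print(in_niche((0.10, 0.90), presences, labels, EPS))
```

In this sketch, only conditions close to dense groups of observed presences are treated as part of the niche, so isolated records and unsampled regions of environmental space are left out. A real analysis would need to standardize the environmental variables and choose `eps` and `min_pts` by explicit parameterization.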

Identifier: oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/111820
Date: 13 September 2022
Creators: Castaneda Guzman, Mariana
Contributors: Fish and Wildlife Conservation, Escobar Quinonez, Luis E., Abaid, Nicole, Frimpong, Emmanuel A.
Publisher: Virginia Tech
Source Sets: Virginia Tech Theses and Dissertation
Language: English
Detected Language: English
Type: Thesis
Format: ETD, application/pdf
Rights: In Copyright, http://rightsstatements.org/vocab/InC/1.0/
