<p>Streamflow prediction in ungauged basins (PUB) is the process of generating streamflow time series at ungauged reaches in a river network. PUB supports engineering tasks such as managing stormwater, water resources, and water-related environmental impacts. Machine Learning (ML) has emerged as a powerful tool for PUB, using its generalization process to capture streamflow generation processes from hydrological datasets (observations). ML’s generalization process is affected by two major components: the data splitting process applied to the observations and the architecture design. To reveal the potential limitations of ML’s generalization process, this dissertation explores its robustness and associated uncertainty. More precisely, this dissertation has three objectives: (1) analyzing the potential uncertainty caused by the data splitting process in ML modeling, (2) investigating whether ML models’ performance improves when hydrological processes are incorporated into their architectures, and (3) identifying potential biases in ML’s generalization process regarding the trend and periodicity of streamflow simulations.</p><p>The first objective of this dissertation is to assess the sensitivity and uncertainty caused by the regular data splitting process in ML modeling. The regular data splitting process in ML was initially designed for homogeneous and stationary datasets, so it may not be suitable for hydrological datasets in the context of PUB studies. Hydrological datasets usually consist of data collected from diverse watersheds with distinct streamflow generation regimes, influenced by varying meteorological forcing and watershed characteristics. To address the potential inconsistency in the data splitting process, multiple data splitting scenarios are generated using the Monte Carlo method. 
Scenarios with random data splitting frequently exhibit covariate shift, which tends to add uncertainty and bias to ML’s generalization process. The findings for this objective highlight the importance of avoiding covariate shift during data splitting when developing ML models for PUB, in order to enhance the robustness and reliability of ML’s performance.</p><p>The second objective of this dissertation is to investigate the improvement in ML models’ performance brought by Physics-Guided Architecture (PGA), which couples ML with the rainfall abstraction process. PGA is a theory-guided machine learning framework that integrates conceptual tutors (CTs) with ML models. In this study, CTs correspond to rainfall abstractions estimated by the Green-Ampt (GA) and SCS-CN models. Integrating the GA model’s CTs, which carry information on dynamic soil properties, into PGA models leads to better performance than a regular ML model. In contrast, PGA models integrating the SCS-CN model’s CTs yield no significant improvement in the ML model’s performance. The results of this objective demonstrate that ML’s generalization process can be improved by incorporating CTs that involve dynamic soil properties.</p><p>The third objective of this dissertation is to explore the limitations of ML’s generalization process in capturing trend and periodicity in streamflow simulations. Trend and periodicity are essential components of streamflow time series, representing long-term correlations and periodic patterns, respectively. Streamflow simulations generated by the ML models tend to have relatively strong long-term periodic components, such as yearly and multiyear periodic patterns. In addition, compared to the observed streamflow data, the ML simulations display relatively weak short-term periodic components, such as daily and weekly periodic patterns. 
As a result, ML’s generalization process may struggle to capture short-term periodic patterns in streamflow simulations. These biases in ML’s generalization process underscore the need for external knowledge to improve the representation of short-term periodic components in simulated streamflow.</p>
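<p>As an illustration of the periodicity comparison described for the third objective, the sketch below measures how much of a series’ variance sits at weekly periods. The abstract does not state which spectral method the dissertation uses, so a plain discrete Fourier transform is assumed here. The function names, the synthetic “observed” and “simulated” series, their amplitudes, and the 364-day year (chosen so that both periods divide the record length exactly) are all hypothetical and are not taken from the dissertation.</p>

```python
import cmath
import math


def dft_power(series):
    """Return (period_in_samples, power) pairs for the positive DFT frequencies."""
    n = len(series)
    mean = sum(series) / n
    centered = [v - mean for v in series]  # remove the mean so the zero frequency carries no power
    spectrum = []
    # Bins 1 .. ceil(n/2) - 1: positive frequencies, skipping the Nyquist bin when n is even.
    for k in range(1, (n + 1) // 2):
        coeff = sum(
            v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(centered)
        )
        spectrum.append((n / k, abs(coeff) ** 2))
    return spectrum


def band_power_fraction(series, min_period, max_period):
    """Share of spectral power at periods between min_period and max_period (in samples)."""
    spectrum = dft_power(series)
    total = sum(power for _, power in spectrum)
    band = sum(power for period, power in spectrum if min_period <= period <= max_period)
    return band / total


# Hypothetical daily series: a 364-day cycle plus a 7-day cycle.
# The "simulated" series deliberately has a weaker weekly cycle than the "observed" one.
n_days = 728
observed = [
    10 + 3 * math.sin(2 * math.pi * t / 364) + 2 * math.sin(2 * math.pi * t / 7)
    for t in range(n_days)
]
simulated = [
    10 + 3 * math.sin(2 * math.pi * t / 364) + 0.5 * math.sin(2 * math.pi * t / 7)
    for t in range(n_days)
]

obs_weekly = band_power_fraction(observed, 6, 8)
sim_weekly = band_power_fraction(simulated, 6, 8)
print(f"weekly power share, observed:  {obs_weekly:.3f}")
print(f"weekly power share, simulated: {sim_weekly:.3f}")
```

<p>By construction, the weekly share of the synthetic simulated series is lower than that of the synthetic observed series, mirroring the kind of short-term bias the abstract reports. With real streamflow records, whose cycles rarely divide the record length exactly, a windowed or wavelet-based estimate would likely be more appropriate than this raw periodogram.</p>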
Identifier | oai:union.ndltd.org:purdue.edu/oai:figshare.com:article/23826393 |
Date | 03 August 2023 |
Creators | Pin-Ching Li (16734693) |
Source Sets | Purdue University |
Detected Language | English |
Type | Text, Thesis |
Rights | CC BY 4.0 |
Relation | https://figshare.com/articles/thesis/Application_of_Machine_Learning_and_AI_for_Prediction_in_Ungauged_Basins/23826393 |