
Comparing machine learning models and physics-based models in groundwater science

The use of machine learning techniques in tackling hydrological problems has significantly
increased over the last decade. Machine learning tools can provide alternatives or surrogates to complex and comprehensive methodologies such as physics-based numerical models.
Machine learning algorithms have been used in hydrology for estimating streamflow, runoff,
water table fluctuations and calculating the impacts of climate change on nutrient loading
among many other applications. In recent years we have also seen arguments for and
advances in combining physics-based models and machine learning algorithms for mutual
benefit. This thesis contributes to these advances by developing machine learning
approaches for two different groundwater problems and comparing them with previously
developed physics-based models: i) estimating groundwater and surface water depletion
caused by groundwater pumping using artificial neural networks and ii) estimating a global
steady-state map of water table depth using random forests.
The first chapter of this thesis outlines its purpose and how it contributes to the
overall scientific knowledge on the topic. The results of this research
contribute to three of the twenty-three major unsolved problems in hydrology, as
summarized by a collective of hundreds of hydrologists.
In the second chapter, we tested the potential of artificial neural networks (ANNs), a
deep learning tool, as an alternative method for estimating source water of groundwater
abstraction compared to conventional methods (analytical solutions and numerical models).
Surrogate ANN models of three previously calibrated numerical groundwater models were
developed using hydrologically meaningful input parameters (e.g., well-stream distance and
hydraulic diffusivity) selected by predictor parameter optimization, combining hydrological
expertise and statistical methodologies (ANCOVA). The output parameters were three
transient sources of groundwater abstraction (shallow and deep storage release, and local
surface-water depletion). We found that the optimized ANNs have a predictive skill of up to
0.84 (R², 2σ = ±0.03) when predicting water sources compared to physics-based numerical
(MODFLOW) models. Optimal ANN skill was obtained when using between five and seven
predictor parameters, with hydraulic diffusivity and mean aquifer thickness being the most
important predictor parameters. Even though the initial results are promising and the
ANNs are computationally frugal, we found that the deep learning models are not yet
sufficient to replace, and do not outperform, numerical model simulations.
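As a minimal sketch of how a skill score such as the R² quoted above can be computed, the snippet below evaluates the coefficient of determination of surrogate predictions against reference model output. It assumes R² here means 1 - SS_res / SS_tot (it could also denote a squared correlation coefficient); the function name and the sample values are hypothetical and are not taken from the thesis.

```python
def r_squared(reference, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    if len(reference) != len(predicted) or len(reference) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_ref = sum(reference) / len(reference)
    # Residual sum of squares: surrogate error against the reference model.
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    # Total sum of squares: spread of the reference values around their mean.
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    if ss_tot == 0:
        raise ValueError("reference values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical fractions of pumped water drawn from one source, as simulated
# by a numerical model (reference) and by an ANN surrogate (predicted).
numerical_model = [0.10, 0.35, 0.50, 0.80]
ann_surrogate = [0.12, 0.30, 0.55, 0.78]
skill = r_squared(numerical_model, ann_surrogate)
```

A value of 1 indicates that the surrogate reproduces the reference output exactly, while 0 means it does no better than predicting the mean of the reference values.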
The third chapter used random forests to map steady-state water table depth on a global
scale (0.1° spatial resolution) and integrated the results to improve our understanding of
scale and perceptual modeling of global water table depth. In this study we used a spatially
biased ~1.5-million-point database of water table depth observations with a variety of
globally distributed above- and below-ground predictor variables with causal relationships to
steady-state water table depth. We mapped water table depth globally as well as at regional
to continental scales to interrogate performance, feature importance and hydrologic process
across scales and regions with varying hydrogeological landscapes and climates. The global
water table depth map has a cross-validated R² of 0.72, while our best-performing
continental map (Australia) has an R² of 0.86. The results of this study
surprisingly show that above-ground variables such as surface elevation, slope, drainage
density and precipitation are among the most important predictor parameters while
subsurface parameters such as permeability and porosity are notably less important. This is
contrary to conventional thought among hydrogeologists, who would assume that subsurface
parameters are very important. Overall, the machine learning results underestimate water
table depth, as do existing global physics-based groundwater models, and differ from those
models by amounts comparable to the differences among the physics-based models themselves.
The feature importance derived from our random forest models was used to develop
alternative perceptual models that highlight different water table depth controls between
areas with low relief and high relief. Finally, we considered the representativeness of the
prediction domain and the predictor database and found that 90% of the prediction domain
has a dissimilarity index lower than 0.75. We conclude that our random forest models have
good extrapolation potential for regions with unknown water table depth, except
for some high-elevation regions.
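The representativeness check described above can be sketched as follows. The abstract does not state how the dissimilarity index was defined, so this sketch assumes one common form: the distance from a prediction location to its nearest training point in standardized predictor space, divided by the mean pairwise distance among training points. The function name, the unweighted predictors and the sample values are all hypothetical.

```python
import math
from itertools import combinations


def dissimilarity_index(train, targets):
    """Nearest-training-point distance over mean pairwise training distance."""
    if len(train) < 2:
        raise ValueError("need at least two training points")
    n_feat = len(train[0])
    means = [sum(row[j] for row in train) / len(train) for j in range(n_feat)]
    # Population standard deviation per predictor; fall back to 1.0 if constant.
    stds = [
        math.sqrt(sum((row[j] - means[j]) ** 2 for row in train) / len(train)) or 1.0
        for j in range(n_feat)
    ]

    def scale(row):
        return [(row[j] - means[j]) / stds[j] for j in range(n_feat)]

    scaled_train = [scale(row) for row in train]
    pair_dists = [math.dist(a, b) for a, b in combinations(scaled_train, 2)]
    mean_dist = sum(pair_dists) / len(pair_dists)
    if mean_dist == 0:
        raise ValueError("all training points are identical")
    return [
        min(math.dist(scale(t), s) for s in scaled_train) / mean_dist
        for t in targets
    ]


# Hypothetical predictor vectors, e.g. (elevation, precipitation) in arbitrary units.
train_points = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
prediction_cells = [[0.0, 0.0], [1.0, 1.0], [10.0, 10.0]]
di = dissimilarity_index(train_points, prediction_cells)

# Share of the prediction domain below an illustrative threshold of 0.75.
share_below = sum(d < 0.75 for d in di) / len(di)
```

Cells with a low index resemble the training data in predictor space, while cells with a high index (here the third, far outside the training range) are the ones where extrapolation is least supported.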
Finally, in chapter four, the most important findings of chapters two and three are considered
as contributions to the unresolved questions in hydrology. Overall, this thesis has contributed
to advancing hydrological sciences through: i) mapping of global steady-state water table
depth using machine learning; ii) advancing hybrid modeling by using synthetic data derived
from physics-based models to train an artificial neural network for estimating storage
depletion; and iii) contributing to answering three unsolved problems in hydrology
involving themes of parameter scaling across temporal and spatial scales, extracting
hydrological insight from data, the use of innovative modeling techniques to estimate
hydrological fluxes/states and extrapolation of models to no-data regions. / Graduate

Identifier: oai:union.ndltd.org:uvic.ca/oai:dspace.library.uvic.ca:1828/13724
Date: 25 January 2022
Creators: Boerman, Thomas Christiaan
Contributors: Gleeson, Tom
Source Sets: University of Victoria
Language: English
Detected Language: English
Type: Thesis
Format: application/pdf
Rights: Available to the World Wide Web
