Data driven modelling for environmental water management

Management of water quality is generally based on physically-based equations or hypotheses describing the behaviour of water bodies. In recent years, models built on the growing availability of collected data have been gaining popularity. This approach can be called data driven modelling. Observational data represent specific knowledge, whereas a hypothesis represents a generalization of this knowledge that implies and characterizes all such observational data. Traditionally, deterministic numerical models have been used for predicting flow and water quality processes in inland and coastal basins. These models generally take a long time to run and so cannot be used as on-line decision support tools of the kind that would enable imminent threats to public health, flooding etc. to be predicted. Data driven models, by contrast, are data intensive, and the approach has some limitations. The extrapolation capability of data driven methods is a matter of conjecture. Furthermore, collecting the extensive data required for building a data driven model can be time and resource consuming, and when the aim is to predict the impact of a future development the data are unlikely to exist. The main objective of the study was to develop an integrated approach for rapid prediction of bathing water quality in estuarine and coastal waters. Faecal Coliforms (FC) were used as a water quality indicator, and two of the most popular data mining techniques, namely Genetic Programming (GP) and Artificial Neural Networks (ANNs), were used to predict the FC levels in a pilot basin. In order to provide enough data for training and testing the neural networks, a calibrated hydrodynamic and water quality model was used to generate input data for the neural networks. A novel non-linear data analysis technique, called the Gamma Test, was used to determine the data noise level and the number of data points required for developing smooth neural network models.
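As an illustration of how a Gamma Test estimates the data noise level, the following is a minimal sketch of the standard near-neighbour formulation, not the implementation used in the thesis. The function name `gamma_test`, the parameter `p` (number of near neighbours) and the brute-force distance computation are assumptions made for this example.

```python
import numpy as np

def gamma_test(X, y, p=10):
    """Estimate the noise variance of y = f(X) + noise (illustrative sketch).

    Returns (Gamma, A): the intercept and slope of the regression of
    gamma(k) on delta(k) for k = 1..p near neighbours.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    M = X.shape[0]

    # Brute-force pairwise squared distances, O(M^2) memory (fine for small M).
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbour

    # Indices of the p nearest neighbours of every point, nearest first.
    order = np.argsort(d2, axis=1)[:, :p]
    rows = np.arange(M)

    delta = np.empty(p)
    gamma = np.empty(p)
    for k in range(p):
        nb = order[:, k]
        delta[k] = d2[rows, nb].mean()              # mean squared input distance
        gamma[k] = 0.5 * ((y[nb] - y) ** 2).mean()  # half mean squared output gap

    # Least-squares line gamma = Gamma + A * delta; the intercept Gamma
    # (the limit as delta -> 0) is the noise variance estimate.
    slope, intercept = np.polyfit(delta, gamma, 1)
    return intercept, slope
```

Running the function on growing subsets of the data and watching where the intercept stabilises is one way to judge how many data points are needed before a smooth model can be fitted.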
Details are given of the data driven models, the numerical models and the Gamma Test. Details are also given of a series of experiments undertaken to test data driven model performance for different numbers of input parameters and time lags. The response time of the receiving water quality to the input boundary conditions, obtained from the hydrodynamic model, has been shown to be useful knowledge for developing accurate and efficient neural networks. It is known that a natural phenomenon like bacterial decay is affected by a whole host of parameters which cannot be captured accurately using deterministic models alone. Therefore, the data driven approach has been applied to field survey data collected in Cardiff Bay to investigate the relationship between bacterial decay and other parameters. Both the GP and ANN models gave similar, if not better, predictions of the field data in comparison with the deterministic model, with the added benefit of almost instant prediction of the bacterial levels for this recreational water body. The models have also been investigated using idealised and controlled laboratory data for the velocity distributions along compound channel reaches, with idealised rods located on the floodplain to replicate large vegetation (such as mangrove trees).
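The time-lag experiments mentioned above depend on pairing each output value with input values from earlier time steps. A minimal sketch of how such a lagged input matrix could be assembled is given below. The helper name `make_lagged_inputs` and the choice of lags are hypothetical and are not taken from the thesis.

```python
import numpy as np

def make_lagged_inputs(series, lags):
    """Build a lagged input matrix from a 1-D time series (illustrative sketch).

    Row i corresponds to time index t[i]; the column for lag L holds
    series[t[i] - L]. Only times with every lag available are kept.
    """
    series = np.asarray(series, dtype=float)
    lags = list(lags)
    if not lags or min(lags) < 0:
        raise ValueError("lags must be a non-empty list of non-negative integers")
    max_lag = max(lags)
    n = len(series) - max_lag
    if n <= 0:
        raise ValueError("series is shorter than the largest lag")

    columns = [series[max_lag - lag : max_lag - lag + n] for lag in lags]
    X = np.column_stack(columns)
    t = np.arange(max_lag, len(series))  # time index of each row
    return X, t
```

For example, with a boundary-condition series `q` and a target series `fc` of the same length, `X, t = make_lagged_inputs(q, [1, 3])` followed by `y = fc[t]` gives aligned training pairs. Repeating this for different lag sets is one way to compare candidate input configurations.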

Identifier oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:584036
Date January 2007
Creators Syed, Mofazzal
Publisher Cardiff University
Source Sets Ethos UK
Detected Language English
Type Electronic Thesis or Dissertation
Source http://orca.cf.ac.uk/54592/
