Surface water is essential to the eco-environment and serves many purposes in anthropogenic activities, such as domestic water supply, agricultural irrigation and drainage, and energy generation. Because of its importance to human health and ecosystems, effective models are needed to study water quantity and quality systematically. In this dissertation, machine learning (ML) is investigated as a modelling approach to simulate the water quality and quantity of rivers. The work covers the analysis of river pollution sources, downstream river flow prediction, long-term prediction of heavy metals in rivers, and the influence of environmental factors on river heavy metals. The research combines ML methods to improve the accuracy and applicability of models in water environment studies and to help the relevant sectors strengthen the management of surface water quantity and quality.
(1) The primary source contributors of trace metals in surface water were identified with the optimal trained model. This study trained and evaluated typical shallow and deep learning approaches for identifying and classifying source contributors based on a database, and used the proposed approaches to analyze the source apportionment of trace metals in the main stream and tributaries of the river basin.
(2) Interpretable ML models were developed to address an under-appreciated need: models for predicting heavy metals that are both predictive and transparent. This study employed and compared five tree-based ML models and then performed global and local feature importance analyses with the optimum models to identify the most important environmental factors for heavy metal management.
(3) Conventional and hybrid ML models with time-series decomposition were developed and compared for the long-term prediction of heavy metals. This study examined how the selection and division of the input time-series data affect the performance of conventional and hybrid models, and evaluated the long-term forecasts with standard metrics to select the optimum approaches for long-term prediction of typical metals.
(4) Downstream river flow was predicted with combined ML models. This study established a hybrid CNN-LSTM model that processes 2D rainfall radar information with a convolutional neural network (CNN) and time-series information with long short-term memory (LSTM), and explored the capacity of the hybrid model for river flow forecasting.
Table of Contents
List of Abbreviations
List of Publications on the Ph.D. Topic
List of Co-authored Publications during the Ph.D.
1 General Introduction
1.1 Background
1.2 Aim and Objective
1.3 Innovation and Contribution
1.4 Outline
1.5 References
2 Traceability Study of Metals
2.1 Introduction
2.2 Materials and Methods
2.2.1 Study Area
2.2.2 Ecological Risk Assessment
2.2.3 Model Development
2.2.4 The Classification Supervised ML Models
2.2.5 Model Evaluation
2.3 Results
2.3.1 Metal Characteristics of the Registered Dataset and Applicable Area
2.3.2 Spatial Assessment of Ecological Risk
2.3.3 Temporal Trend in Ecologic Risk
2.3.4 Performance Analysis of Classification Models
2.3.5 Resource Analysis Based on the Trained Model
2.4 Discussion
2.4.1 RF Outperformed the Other Models
2.4.2 Potential Sources of Metals in the Given Area
2.5 Conclusion
2.6 References
3 Elucidation of Environmental Factors’ Influence on Metals
3.1 Introduction
3.2 Materials and Methods
3.2.1 Study Sites and Data
3.2.2 Indexing Approach
3.2.3 Regression Models
3.2.4 Model Performance Metrics and Hyperparameter Tuning
3.2.5 SHapley Additive exPlanations (SHAP)
3.2.6 The Partial Dependence Plot (PDP)
3.3 Results
3.3.1 HPI and Environmental Variables
3.3.2 Assessment of Approaches
3.3.3 Global Feature Importance
3.3.4 Local Feature Importance
3.3.5 Sensitive Factor Analysis
3.4 Discussion
3.4.1 SHAP Outperformed the Other Importance Method
3.4.2 Robust Association of Top Variables with HPI
3.5 Conclusion
3.6 References
4 Long-Term Prediction of Metals Concentration
4.1 Introduction
4.2 Materials and Methods
4.2.1 Study Area and Water Quality Data
4.2.2 Input Identification
4.2.3 Wavelet Transform
4.2.4 Back-Propagation Neural Network (BPNN) Model
4.2.5 Nonlinear Autoregressive Exogenous (NARX) Model
4.2.6 Wavelet and BPNN (WNN) Hybrid Model
4.2.7 Wavelet and NARX (WNARX) Hybrid Model
4.2.8 Model Performance Evaluation
4.3 Results
4.3.1 Model Establishment
4.3.2 Performance Analysis of the Optimal Scenarios
4.4 Discussion
4.5 Conclusion
4.6 References
5 Prediction of Downstream River Flow
5.1 Introduction
5.2 Materials and Methods
5.2.1 Study Area and Data Acquisition
5.2.2 Convolutional Neural Network (CNN)
5.2.3 Long Short-Term Memory (LSTM)
5.2.4 River Flow Simulation
5.2.5 Performance Evaluation
5.3 Results
5.3.1 Flow Time Series
5.3.2 Input Selection
5.3.3 Flow Simulation
5.4 Discussion
5.5 Conclusion
5.6 References
6 Conclusions and Future Research
6.1 Traceability Study of Metals
6.2 Elucidation of Environmental Factors’ Influence on Metals
6.3 Long-Term Prediction of Metals Concentration
6.4 Prediction of River Flow
6.5 Discussion and Future Research
6.5.1 Discussion
6.5.2 Future Research
7 Appendices
7.1 Supporting Information for Traceability Study of Metals
7.1.1 Naive Bayes
7.1.2 Support Vector Machine
7.1.3 Neural Network
7.1.4 Random Forest
7.1.5 Long Short-Term Memory
7.1.6 Convolutional Neural Network
7.2 Supporting Information for Elucidation of Environmental Factors’ Influence on Metals
7.3 Supporting Information for Long-Term Prediction of Metals Concentration
7.3.1 Figures
7.4 References
Identifier | oai:union.ndltd.org:DRESDEN/oai:qucosa:de:qucosa:88843 |
Date | 04 January 2024 |
Creators | Li, Peifeng |
Contributors | Krebs, Peter, Wang, Gangsheng, Rauch, Wolfgang, Technische Universität Dresden |
Source Sets | Hochschulschriftenserver (HSSS) der SLUB Dresden |
Language | English |
Detected Language | English |
Type | info:eu-repo/semantics/publishedVersion, doc-type:doctoralThesis, info:eu-repo/semantics/doctoralThesis, doc-type:Text |
Rights | info:eu-repo/semantics/openAccess |
Relation | 10.1038/s41598-020-70438-8, 10.3390/w14060993, 10.1016/j.scitotenv.2022.155944, urn:nbn:de:bsz:14-qucosa2-805409, qucosa:80540 |