91 |
Comparing machine learning models and physics-based models in groundwater science
Boerman, Thomas Christiaan 25 January 2022
The use of machine learning techniques in tackling hydrological problems has significantly
increased over the last decade. Machine learning tools can provide alternatives or surrogates to complex and comprehensive methodologies such as physics-based numerical models.
Machine learning algorithms have been used in hydrology for estimating streamflow, runoff,
water table fluctuations and calculating the impacts of climate change on nutrient loading
among many other applications. In recent years we have also seen arguments for and
advances in combining physics-based models and machine learning algorithms for mutual
benefit. This thesis contributes to these advances by developing machine learning approaches
to two different groundwater problems and comparing them with previously
developed physics-based models: i) estimating groundwater and surface water depletion
caused by groundwater pumping using artificial neural networks and ii) estimating a global
steady-state map of water table depth using random forests.
The first chapter outlines the purpose of this thesis and how it contributes
to the overall scientific knowledge on the topic. The results of this research
contribute to three of the twenty-three major unsolved problems in hydrology, as has been
summarized by a collective of hundreds of hydrologists.
In the second chapter, we tested the potential of artificial neural networks (ANNs), a deep-learning
tool, as an alternative method for estimating source water of groundwater
abstraction compared to conventional methods (analytical solutions and numerical models).
Surrogate ANN models of three previously calibrated numerical groundwater models were
developed using hydrologically meaningful input parameters (e.g., well-stream distance and
hydraulic diffusivity) selected by predictor parameter optimization, combining hydrological
expertise and statistical methodologies (ANCOVA). The output parameters were three
transient sources of groundwater abstraction (shallow and deep storage release, and local
surface-water depletion). We found that the optimized ANNs have a predictive skill of up to
0.84 (R², 2σ = ±0.03) when predicting water sources compared to physics-based numerical
(MODFLOW) models. Optimal ANN skill was obtained when using between five and seven
predictor parameters, with hydraulic diffusivity and mean aquifer thickness being the most
important predictor parameters. Even though the initial results are promising and the approach is
computationally frugal, we found that the deep learning models were not yet sufficient and did not
outperform numerical model simulations.
The third chapter used random forests to map steady-state water table depth on a global
scale (0.1° spatial resolution) and integrated the results to improve our understanding of
scale and perceptual modeling of global water table depth. In this study we used a spatially
biased ~1.5-million-point database of water table depth observations with a variety of
globally distributed above- and below-ground predictor variables with causal relationships to
steady-state water table depth. We mapped water table depth globally as well as at regional
to continental scales to interrogate performance, feature importance and hydrologic process
across scales and regions with varying hydrogeological landscapes and climates. The global
water table depth map has a correlation (cross-validation error) of R² = 0.72, while the continental
map with the highest correlation (Australia) has R² = 0.86. The results of this study
surprisingly show that above-ground variables such as surface elevation, slope, drainage
density and precipitation are among the most important predictor parameters while
subsurface parameters such as permeability and porosity are notably less important. This is
contrary to conventional thought among hydrogeologists, who would assume that subsurface
parameters are very important. Overall, the machine learning results underestimate water
table depth, similar to existing global physics-based groundwater models, which also show
comparable differences among themselves.
The feature importance derived from our random forest models was used to develop
alternative perceptual models that highlight different water table depth controls between
areas with low relief and high relief. Finally, we considered the representativeness of the
prediction domain and the predictor database and found that 90% of the prediction domain
has a dissimilarity index lower than 0.75. We conclude that we see good extrapolation
potential for our random forest models to regions with unknown water table depth, except
for some high elevation regions.
Finally, in chapter four, the most important findings of chapters two and three are considered
as contributions to the unresolved questions in hydrology. Overall, this thesis has contributed
to advancing hydrological sciences through: i) mapping of global steady-state water table
depth using machine learning; ii) advancing hybrid modeling by using synthetic data derived
from physics-based models to train an artificial neural network for estimating storage
depletion; and iii) contributing to answering three unsolved problems in hydrology
involving themes of parameter scaling across temporal and spatial scales, extracting
hydrological insight from data, the use of innovative modeling techniques to estimate
hydrological fluxes/states and extrapolation of models to no-data regions. / Graduate
|
92 |
Community Recommendation in Social Networks with Sparse Data
Rahmaniazad, Emad 12 1900
Indiana University-Purdue University Indianapolis (IUPUI) / Recommender systems are widely used in many domains. In this work, the importance of a recommender system in an online learning platform is discussed. After explaining the concept of adding an intelligent agent to online education systems, some features of the Course Networking (CN) website are demonstrated. Finally, the relation between CN, the intelligent agent (Rumi), and the recommender system is presented, along with a discussion of three different approaches for building a community recommendation system. The results show that Neighboring Collaborative Filtering (NCF) outperforms both the transfer learning method and the continuous bag-of-words approach. The NCF algorithm has a general format with two different implementations that can be used for other recommendations, such as course, skill, major, and book recommendations.
|
93 |
Detection of Faults in HVAC Systems using Tree-based Ensemble Models and Dynamic Thresholds
Chakraborty, Debaditya January 2018
No description available.
|
94 |
IONA: Intelligent Online News Analysis
Doumit, Sarjoun S. January 2018
No description available.
|
95 |
Lifetime Performance Modeling of Commercial Photovoltaic Power Plants
Curran, Alan J. 26 August 2019
No description available.
|
96 |
Tracking, Recognizing and Analyzing Human Exercise Activity
Sathe, Pushkar Sunil January 2019
No description available.
|
97 |
Automatic Network Traffic Anomaly Detection and Analysis using Supervised Machine Learning Techniques
Syal, Astha January 2019
No description available.
|
98 |
Lifetime and Degradation Studies of Poly (Methyl Methacrylate) (PMMA) via Data-driven Methods
Li, Donghui 01 June 2020
No description available.
|
99 |
SqueezeFit Linear Program: Fast and Robust Label-aware Dimensionality Reduction
Lu, Tien-hsin 01 October 2020
No description available.
|
100 |
Pruning GHSOM to create an explainable intrusion detection system
Kirby, Thomas Michael 12 May 2023
Intrusion Detection Systems (IDS) that provide high detection rates but are black boxes lead to models that make predictions a security analyst cannot understand. Self-Organizing Maps (SOMs) have been used to predict intrusions into a network, while also explaining predictions through visualization and identifying significant features. However, they have not been able to compete with the detection rates of black box models. Growing Hierarchical Self-Organizing Maps (GHSOMs) have been used to obtain high detection rates on the NSL-KDD and CIC-IDS-2017 network traffic datasets, but they neglect creating explanations or visualizations, which results in another black box model. This paper offers a high-accuracy Explainable Artificial Intelligence (XAI) model based on GHSOMs. One obstacle to creating a white box hierarchical model is the model growing too large and complex to understand. Another contribution this paper makes is a pruning method used to cut down on the size of the GHSOM, which provides a model that can provide insights and explanation while maintaining a high detection rate.
|