201 |
Heavy Tails and Anomalous Diffusion in Human Online Dynamics. Wang, Xiangwen, 28 February 2019
In this dissertation, I extend the analysis of human dynamics to human movements in online activities. My work starts with a discussion of the human information foraging process based on three large collections of empirical search click-through logs collected in different time periods. With the analogy of viewing the click-through on search engine result pages as a random walk, a variety of quantities such as the distributions of step length and waiting time, as well as mean-squared displacements, correlations, and entropies, are discussed. Notable differences between the logs reveal an increased efficiency of the search engines, which is found to be related to the vanishing of the heavy-tailed characteristics of step lengths in newer logs as well as the switch from superdiffusion to normal diffusion in the diffusive processes of the random walks. In the language of foraging, the newer logs indicate that online searches overwhelmingly yield local searches, whereas for the older logs the foraging processes are a combination of local searches and relocation phases that are power-law distributed. The investigation highlights the presence of intermittent search processes in online searches, where phases of local exploration are separated by power-law distributed relocation jumps. In the second part of this dissertation I focus on an in-depth analysis of online gambling behaviors. For this analysis, the collected empirical gambling logs reveal the wide existence of heavy-tailed statistics in various quantities across different online gambling games. For example, when players are allowed to choose arbitrary bet values, the bet values follow log-normal distributions, whereas when they are restricted to using items as wagers, the distributions become truncated power laws. Under the analogy of viewing the net change of income of each player as a random walk, the mean-squared displacement and first-passage time distribution of these net-income random walks both exhibit anomalous diffusion. In particular, in an online lottery game the mean-squared displacement presents a crossover from a superdiffusive to a normal diffusive regime, which is reproduced using simulations and explained analytically. This investigation also reveals the scaling characteristics and probability reweighting in the risk attitudes of online gamblers, which may help to interpret behaviors in economic systems. This work was supported by the US National Science Foundation through grants DMR-1205309 and DMR-1606814. / Ph. D. / Humans are complex, and understanding complex human behaviors is of crucial importance in solving many social problems. In recent years, sociophysicists have made substantial progress in human dynamics research. In this dissertation, I extend this type of analysis to human movements in online activities. My work starts with a discussion of the human information foraging process. This investigation is based on empirical search logs and an analogy of viewing the click-through on search engine result pages as a random walk. With an increased efficiency of the search engines, the heavy-tailed characteristics of step lengths disappear, and the diffusive processes of the random walkers switch from superdiffusion to normal diffusion. In the language of foraging, the newer logs indicate that online searches overwhelmingly yield local searches, whereas for the older logs the foraging processes are a combination of local searches and relocation phases that are power-law distributed.
The investigation highlights the presence of intermittent search processes in online searches, where phases of local exploration are separated by power-law distributed relocation jumps. In the second part of this dissertation I focus on an in-depth analysis of online gambling behaviors, where the collected empirical gambling logs reveal the wide existence of heavy-tailed statistics in various quantities. Viewing the net change of income of each player as a random walk, the mean-squared displacement and first-passage time distribution of these net-income random walks exhibit anomalous diffusion. This investigation also reveals the scaling characteristics and probability reweighting in the risk attitudes of online gamblers, which may help to interpret behaviors in economic systems. This work was supported by the US National Science Foundation through grants DMR-1205309 and DMR-1606814.
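As an illustration of the random-walk analogy described above, the following sketch shows how step lengths, waiting times, and the mean-squared displacement could be computed from a click-through log. The column names and toy records are hypothetical placeholders, not the format of the actual logs analyzed in the dissertation.

```python
import numpy as np
import pandas as pd

# Hypothetical click-through log: one row per click, with a session id,
# the rank position of the clicked result, and a timestamp in seconds.
log = pd.DataFrame({
    "session": [1, 1, 1, 2, 2, 2, 2],
    "position": [1, 3, 12, 2, 1, 4, 30],
    "time": [0, 20, 95, 0, 15, 40, 300],
})

def session_statistics(df):
    """Treat successive clicks in a session as a 1-D random walk over result positions."""
    steps, waits, msd = [], [], {}
    for _, s in df.groupby("session"):
        pos = s["position"].to_numpy()
        t = s["time"].to_numpy()
        steps.extend(np.abs(np.diff(pos)))          # step lengths
        waits.extend(np.diff(t))                    # waiting times between clicks
        disp = pos - pos[0]
        for lag, d in enumerate(disp):
            msd.setdefault(lag, []).append(d ** 2)  # squared displacement vs. click index
    msd = {lag: np.mean(v) for lag, v in msd.items()}
    return np.array(steps), np.array(waits), msd

steps, waits, msd = session_statistics(log)
print("step lengths:", steps)
print("waiting times:", waits)
print("MSD by lag:", msd)
```

A heavy-tailed step-length histogram and superlinear growth of the MSD with click index would be the signatures of the superdiffusive, intermittent foraging regime seen in the older logs.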
|
202 |
Predictive Turbulence Modeling with Bayesian Inference and Physics-Informed Machine Learning. Wu, Jinlong, 25 September 2018
Reynolds-Averaged Navier-Stokes (RANS) simulations are widely used for engineering design and analysis involving turbulent flows. In RANS simulations, the Reynolds stress requires closure models, and the existing models have large model-form uncertainties. Therefore, RANS simulations are known to be unreliable in many flows of engineering relevance, including flows with three-dimensional structures, swirl, pressure gradients, or curvature. This lack of accuracy in complex flows has diminished the utility of RANS simulations as a predictive tool for engineering design, analysis, optimization, and reliability assessments. Recently, data-driven methods have emerged as a promising alternative for developing Reynolds stress models for RANS simulations. In this dissertation I explore two physics-informed, data-driven frameworks to improve RANS-modeled Reynolds stresses. First, a Bayesian inference framework is proposed to quantify and reduce the model-form uncertainty of the RANS-modeled Reynolds stress by leveraging online sparse measurement data together with empirical prior knowledge. Second, a machine-learning-assisted framework is proposed to utilize offline high-fidelity simulation databases. Numerical results show that the data-driven RANS models give better predictions of the Reynolds stress and other quantities of interest for several canonical flows. Two metrics are also presented for an a priori assessment of the prediction confidence of the machine-learning-assisted RANS model. The proposed data-driven methods are also applicable to the computational study of other physical systems whose governing equations contain unresolved physics that must be modeled. / Ph. D.
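The offline, machine-learning-assisted idea can be illustrated with a minimal regression sketch: a model is trained on a high-fidelity database to map mean-flow features to a Reynolds-stress discrepancy, which is then added back to the baseline RANS prediction. The features, synthetic data, and the choice of a random forest here are illustrative assumptions, not the dissertation's exact configuration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Placeholder training data standing in for a high-fidelity database:
# each row holds mean-flow features (e.g. strain-rate magnitude, pressure
# gradient, wall-distance-based Reynolds number) at one cell of a training flow.
X_train = rng.normal(size=(5000, 3))
# Target: discrepancy between high-fidelity and RANS-modeled Reynolds stress.
y_train = 0.3 * X_train[:, 0] - 0.1 * X_train[:, 1] ** 2 + 0.05 * rng.normal(size=5000)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# At prediction time, the learned discrepancy is added back to the baseline
# RANS Reynolds stress of a new (but physically similar) flow.
X_new = rng.normal(size=(4, 3))
tau_rans = rng.normal(size=4)
tau_corrected = tau_rans + model.predict(X_new)
print(tau_corrected)
```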
|
203 |
An evaluation of a data-driven approach to regional scale surface runoff modelling. Zhang, Ruoyu, 03 August 2018
Modelling surface runoff can benefit operations in many fields, such as agricultural planning, flood and drought risk assessment, and water resource management. In this study, we built a data-driven model that can reproduce monthly surface runoff on a 4-km grid network covering 13 watersheds in the Chesapeake Bay area. We used a random forest algorithm to build the model, where monthly precipitation, temperature, land cover, and topographic data were used as predictors, and monthly surface runoff generated by the SWAT hydrological model was used as the response. A sub-model was developed for each of the 12 monthly surface runoff estimates, independent of one another. Accuracy statistics and variable importance measures from the random forest algorithm reveal that precipitation was the most important variable in the model, but including climatological data from multiple months as predictors significantly improves model performance. Using 3-month climatological data, land cover, and DEM derivatives from 40% of the 4-km grid cells as the training dataset, our model successfully predicted surface runoff for the remaining 60% of the cells (the mean R2 (RMSE) across the 12 monthly models is 0.83 (6.60 mm)). The lowest R2 was associated with the model for August, when surface runoff values are the lowest of the year. Among the studied watersheds, the highest predictive errors were found in the watershed with the greatest topographic complexity, for which the model tended to underestimate surface runoff. For the other 12 watersheds studied, the data-driven model produced smaller and more spatially consistent predictive errors. / Master of Science / Surface runoff data are valuable in many fields, such as agricultural planning, water resource management, and flood and drought risk assessment. The traditional approach to acquiring surface runoff data is to run hydrological models, but running such models requires advanced knowledge of the watersheds and of the computational tools involved. In this study, we build a statistical model that can reproduce monthly surface runoff on a 4-km grid covering 13 watersheds in the Chesapeake Bay area. The model uses publicly accessible climate, land cover, and topographic datasets as predictors, and monthly surface runoff from the SWAT model as the response. We develop 12 monthly models, one for each month, independent of one another. To test whether the model can generalize surface runoff across the entire study area, we use 40% of the grid data as the training sample and the remainder as validation. The accuracy statistics, an annual mean R2 of 0.83 and an RMSE of 6.60 mm, show that our model accurately reproduces monthly surface runoff in the study area. The statistics for the August model are not as good as those for the other months; a possible reason is that surface runoff in August is the lowest of the year, so there is not enough variation for the algorithm to distinguish minor differences in the response during model building. When applying the model to watersheds with steep terrain, attention must be paid to the results, as the errors there may be relatively large.
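A minimal sketch of the modeling setup described above, using scikit-learn's random forest with a 40/60 train/validation split over grid cells. All column names and the synthetic data are placeholders standing in for the actual climate, land cover, DEM, and SWAT-derived datasets.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score, mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 2000  # hypothetical 4-km grid cells

# Placeholder predictors: current and two prior months of precipitation,
# temperature, a land cover fraction, and a DEM-derived slope.
grid = pd.DataFrame({
    "precip_m0": rng.gamma(2.0, 40.0, n),
    "precip_m1": rng.gamma(2.0, 40.0, n),
    "precip_m2": rng.gamma(2.0, 40.0, n),
    "temp_m0": rng.normal(15, 8, n),
    "forest_frac": rng.uniform(0, 1, n),
    "slope": rng.uniform(0, 30, n),
})
# Placeholder response standing in for SWAT-simulated monthly surface runoff (mm).
grid["runoff"] = 0.15 * grid["precip_m0"] + 0.05 * grid["precip_m1"] + rng.normal(0, 3, n)

# 40% of grid cells for training, 60% held out, mirroring the split in the abstract.
X = grid.drop(columns="runoff")
y = grid["runoff"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=0.4, random_state=0)

rf = RandomForestRegressor(n_estimators=300, random_state=0)
rf.fit(X_tr, y_tr)
pred = rf.predict(X_te)
print("R2:", r2_score(y_te, pred), "RMSE:", mean_squared_error(y_te, pred) ** 0.5)
print(dict(zip(X.columns, rf.feature_importances_.round(3))))
```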
|
204 |
Commutation Error in Reduced Order Modeling. Koc, Birgul, 01 October 2018
We investigate the effect of spatial filtering on the recently proposed data-driven correction reduced order model (DDC-ROM). We compare two filters: the ROM projection, which was originally used to develop the DDC-ROM, and the ROM differential filter, which uses a Helmholtz operator to attenuate the small scales in the input signal. We focus on the following questions: "Do filtering and differentiation with respect to the space variable commute when filtering is applied to the diffusion term?", or in other words, "Do we have a commutation error (CE) in the diffusion term?", and "If so, is the commutation error data-driven correction ROM (CE-DDC-ROM) more accurate than the original DDC-ROM?" If the CE exists, the DDC-ROM has two different correction terms: one comes from the diffusion term and the other from the nonlinear convection term. We investigate the DDC-ROM and the CE-DDC-ROM equipped with the two ROM spatial filters in numerical simulations of the Burgers equation with different diffusion coefficients and two different initial conditions (smooth and non-smooth). / M.S. / We propose reduced order models (ROMs) for an efficient and relatively accurate numerical simulation of nonlinear systems. We use the ROM projection and the ROM differential filters to construct a novel data-driven correction ROM (DDC-ROM). We show that ROM spatial filtering and differentiation do not commute for the diffusion operator. Furthermore, we show that the resulting commutation error has an important effect on the ROM, especially for low viscosity values. As a mathematical model for our numerical study, we use the one-dimensional Burgers equation with smooth and non-smooth initial conditions.
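The differential filter and the diffusion-term commutation error can be illustrated on a 1-D grid: the filter solves a Helmholtz problem, and the commutation error is the difference between filtering the second derivative and differentiating the filtered signal. This sketch assumes simple Dirichlet boundaries and placeholder parameters, not the Burgers-equation setup of the thesis.

```python
import numpy as np

def differential_filter(u, dx, delta):
    """Solve (I - delta^2 d^2/dx^2) u_bar = u with second-order finite differences."""
    n = u.size
    main = (1.0 + 2.0 * delta**2 / dx**2) * np.ones(n)
    off = (-delta**2 / dx**2) * np.ones(n - 1)
    A = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.solve(A, u)

def second_derivative(u, dx):
    d2 = np.zeros_like(u)
    d2[1:-1] = (u[2:] - 2 * u[1:-1] + u[:-2]) / dx**2
    return d2

n, delta = 200, 0.02
x = np.linspace(0.0, 1.0, n)
dx = x[1] - x[0]
u = np.sin(np.pi * x) + 0.1 * np.sin(20 * np.pi * x)  # smooth signal plus small scales

# Commutation error for the diffusion term:
# filter of the derivative minus derivative of the filtered signal.
filtered_deriv = differential_filter(second_derivative(u, dx), dx, delta)
deriv_filtered = second_derivative(differential_filter(u, dx, delta), dx)
ce = filtered_deriv - deriv_filtered
print("max commutation error:", np.abs(ce).max())
```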
|
205 |
CRITICAL TRANSITIONS OF POST-DISASTER RECOVERY VIA DATA-DRIVEN MULTI-AGENT SYSTEMS. Sangung Park (19201096), 26 July 2024
The increased frequency and intensity of disasters call for a dynamic post-disaster recovery process. Advances in human mobility modeling, household return decision-making models, and agent-based simulation in disaster management have opened a new door towards more intricate and enduring recovery frameworks. Despite these opportunities, the importance of a unified framework for identifying the underlying mechanisms that hinder the post-disaster recovery process remains underestimated. My research has been geared towards advancing civil and disaster management, focusing on two main areas: (1) modeling the post-disaster recovery process and (2) identifying critical transitions within the recovery process.

My dissertation explores the collective and individual dynamics of post-disaster recovery across different spatial and temporal scales. I have identified the best recovery strategies for various contexts by constructing data-driven socio-physical multi-agent systems, employing advanced computational methodologies including machine learning, system dynamics, causal discovery, econometrics, and network analysis. I start with aggregated-level analysis of post-disaster recovery. Initially, I examined a system dynamics model of the post-disaster recovery process in socio-physical systems, using the normalized visit density of points of interest and power outage information. Through counterfactual analyses of budget allocation strategies, I discovered their significant impact on recovery trajectories, noting that specific budget allocations substantially enhance recovery patterns. I also revealed urban-rural dissimilarity with a data-driven causal discovery approach, utilizing county-level normalized visit density of points of interest and nighttime light data to identify the relationships between counties. I found that urban and rural areas have similar but distinct recovery patterns across different types of points of interest.

Moving from aggregated- to disaggregated-level analysis of post-disaster recovery, I investigated household-level decision-making regarding disaster-induced evacuation and return behaviors. The model yielded insights into the varying influences of certain variables across urban and rural contexts. Subsequently, I developed a unified framework integrating the aggregated- and disaggregated-level analyses through multilayer multi-agent systems to model significant shifts in the post-disaster recovery process. I evaluated various scenarios to pinpoint the conditions that boost recovery and to assess the effects of different intervention strategies on these transitions. Lastly, a comparison between mathematical models and graph convolutional networks was conducted to better understand the conditions leading to critical transitions in the recovery process. The insights and methodologies presented in this dissertation contribute to a broader understanding of the disaster recovery process in complex urban systems, advocating for a shift towards a unified framework over individual models. By harnessing big data and complex systems modeling, a detailed quantitative analysis of the disaster recovery process becomes possible, including the critical transition conditions of post-disaster recovery. This approach facilitates the evaluation of recovery policies through inter-regional comparisons and the testing of various policy interventions in counterfactual scenarios.
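A purely illustrative sketch of the aggregated-level idea: a toy system-dynamics model in which normalized visit density recovers at a rate coupled to power restoration, and two counterfactual budget allocations are compared. The equations, coefficients, and variables are hypothetical and not taken from the dissertation's calibrated model.

```python
import numpy as np

def simulate_recovery(budget_infra, budget_household, days=120, dt=1.0):
    """Toy recovery dynamics: visit density approaches its pre-disaster level."""
    v = 0.3        # normalized visit density right after the disaster
    power = 0.4    # fraction of power restored
    trajectory = []
    for _ in range(int(days / dt)):
        power += dt * 0.02 * budget_infra * (1.0 - power)      # repairs restore power
        rate = 0.01 + 0.03 * power + 0.02 * budget_household    # aid and power speed return
        v += dt * rate * (1.0 - v)                              # logistic approach to full recovery
        trajectory.append(v)
    return np.array(trajectory)

# Counterfactual comparison of two allocations of the same total budget.
infra_heavy = simulate_recovery(budget_infra=0.8, budget_household=0.2)
aid_heavy = simulate_recovery(budget_infra=0.2, budget_household=0.8)
print("days to 90% recovery:",
      int(np.argmax(infra_heavy >= 0.9)), "vs", int(np.argmax(aid_heavy >= 0.9)))
```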
|
206 |
Sample-efficient Data-driven Learning of Dynamical Systems with Physical Prior Information and Active Learning / 物理的な事前情報とアクティブラーニングによる動的システムのサンプル効率の高いデータ駆動型学習. Tang, Shengbing, 25 July 2022
Kyoto University / New system, doctoral program / Doctor of Engineering / Kō No. 24146 / Kōhaku No. 5033 / New system||Eng.||1786 (University Library) / Department of Aeronautics and Astronautics, Graduate School of Engineering, Kyoto University / (Chief examiner) Professor Kenji Fujimoto, Professor Fumitoshi Matsuno, Professor Jun Morimoto / Qualified under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Philosophy (Engineering) / Kyoto University / DFAM
|
207 |
Trustworthy Soft Sensing in Water Supply Systems using Deep Learning. Sreng, Chhayly, 22 May 2024
In many industrial and scientific applications, accurate sensor measurements are crucial. Instruments such as nitrate sensors are vulnerable to environmental conditions, calibration drift, high maintenance costs, and degradation. Researchers have turned to advanced computational methods, including mathematical modeling, statistical analysis, and machine learning, to overcome these limitations. Deep learning techniques have shown promise in outperforming traditional methods in many applications by achieving higher accuracy, but they are often criticized as 'black-box' models due to their lack of transparency. This thesis presents a framework for deep learning-based soft sensors that quantifies the robustness of a soft sensor by estimating predictive uncertainty and evaluating performance across various scenarios. The framework facilitates comparisons between hard and soft sensors. To validate the framework, I conduct experiments using data generated by AI and Cyber for Water and Ag (ACWA), a cyber-physical, water-controlled environment testbed. Afterwards, the framework is tested on real-world data from Alexandria Renew Enterprise (AlexRenew), establishing its applicability and effectiveness in practical settings. / Master of Science / Sensors are essential in various industrial systems and offer numerous advantages. Essential to measurement science and technology, sensing enables reliable, high-resolution, low-cost measurement and impacts areas such as environmental monitoring, medical applications, and security. The importance of sensors extends to the Internet of Things (IoT) and large-scale data analytics. In these areas, sensors are vital to the generation of data used in industries such as health care, transportation, and surveillance. Big-data analytics processes this data for a variety of purposes, including health management and disease prediction, demonstrating the growing importance of sensors in data-driven decision making.
In many industrial and scientific applications, precision and trustworthiness in measurements are crucial for informed decision-making and maintaining high-quality processes. Instruments such as nitrate sensors are particularly susceptible to environmental conditions, calibration drift, high maintenance costs, and a tendency to become less reliable over time due to aging. The lifespan of these instruments can be as short as two weeks, posing significant challenges. To overcome these limitations, researchers have turned to advanced computational methods, including mathematical modeling, statistical analysis, and machine learning. Traditional methods have had some success, but they often struggle to fully capture the complex dynamics of natural environments. This has led to increased interest in more sophisticated approaches, such as deep learning techniques. Deep learning-based soft sensors have shown promise in outperforming traditional methods in many applications by achieving higher accuracy. However, they are often criticized as "black-box" models due to their lack of transparency. This raises questions about their reliability and trustworthiness, making it critical to assess these aspects.
This thesis presents a comprehensive framework for deep learning-based soft sensors. The framework quantifies the robustness of soft sensors by estimating predictive uncertainty and evaluating performance across a range of contextual scenarios, such as weather conditions, flood events, and water parameters. These evaluations help define the trustworthiness of the soft sensor and facilitate comparisons between hard and soft sensors. To validate the framework, we conduct experiments using data generated by ACWA, a cyber-physical, water-controlled environment testbed we developed, which provides a controlled environment in which to test and refine the framework. Subsequently, we test the framework on real-world data from AlexRenew, further establishing its applicability and effectiveness in practical settings and providing a robust and reliable tool for sensor data analysis and prediction. Ultimately, this work aims to contribute to the broader field of sensor technology, enhancing our ability to make informed decisions based on reliable and accurate sensor data.
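One common way to obtain the predictive uncertainty mentioned above is Monte Carlo dropout, sketched below for a generic soft-sensor network in PyTorch. The architecture, feature count, and the use of MC dropout are illustrative assumptions rather than the thesis's exact uncertainty-quantification method.

```python
import torch
import torch.nn as nn

class SoftSensor(nn.Module):
    """Small feed-forward network mapping process features to one sensed quantity."""
    def __init__(self, n_features):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 64), nn.ReLU(), nn.Dropout(0.2),
            nn.Linear(64, 64), nn.ReLU(), nn.Dropout(0.2),
            nn.Linear(64, 1),
        )

    def forward(self, x):
        return self.net(x)

def predict_with_uncertainty(model, x, n_samples=100):
    """Monte Carlo dropout: keep dropout active at inference and sample repeatedly."""
    model.train()  # leaves dropout layers active
    with torch.no_grad():
        samples = torch.stack([model(x) for _ in range(n_samples)])
    return samples.mean(dim=0), samples.std(dim=0)

# Hypothetical inputs: upstream water-quality and weather features for one batch.
model = SoftSensor(n_features=6)
x = torch.randn(32, 6)
mean, std = predict_with_uncertainty(model, x)
print(mean.shape, std.shape)  # predicted value and its predictive uncertainty
```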
|
208 |
Data Driven Positioning System for Underground Mines. Johdet Piwek, Oliver, January 2024
This thesis focuses on enhancing EMI's products Onboard and PocketMine, leading software solutions in the mining sector. The study explores how the extensive data gathered by Onboard can be used to develop a more precise and reliable positioning system for PocketMine and to create a foundation of redundancy for Onboard using machine learning models. Furthermore, it explores which machine learning model performs best with this data. The thesis is motivated by the potential of data-driven methodologies to enhance the safety and accuracy of EMI's products, significantly improving operational safety and precision in challenging underground environments while also contributing to the broader field of positioning technology. The goals of this thesis are achieved by comparing four different ML models on three distinct datasets, based on locations in the mine, to decide which model the final solution will use. Additionally, the idea of creating a model encapsulating the entire mine is examined and compared to the POI-specific models to see whether it is feasible for one model to learn the intricacies of the mine. The deployment strategy is also discussed. After comparing the models against each other and against the mine-wide model, weighted k-nearest neighbors was chosen as the model of choice based on several evaluation metrics. The large scale of the mine proved too great to be handled by one model, so the mine was clustered into 100 distinct clusters and one model was created for each cluster. The results show that the proposed solution achieves a large improvement in positional accuracy over PocketMine's current positioning algorithm. Together with tests against Onboard, this improvement suggests that the proposed model could effectively serve as a reliable backup system for Onboard.
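A minimal sketch of the cluster-then-regress design described above, using scikit-learn's KMeans and a distance-weighted k-nearest-neighbors regressor. The radio features, the routing of a query by a rough prior position, and the reduced cluster count are illustrative assumptions, not EMI's actual data or deployment.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)

# Hypothetical fingerprint data: radio features (e.g. signal strengths to several
# anchors) labeled with known 3-D positions collected by the Onboard system.
X = rng.normal(size=(5000, 8))            # radio/signal features
y = rng.uniform(0, 1000, size=(5000, 3))  # x, y, z position in the mine (m)

# Cluster the mine into regions and fit one weighted k-NN model per cluster,
# mirroring the 100-cluster design described in the abstract (10 used here).
n_clusters = 10
clusterer = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(y)
models = {}
for c in range(n_clusters):
    mask = clusterer.labels_ == c
    models[c] = KNeighborsRegressor(n_neighbors=5, weights="distance").fit(X[mask], y[mask])

def locate(features, rough_position):
    """Route a query to the model of the nearest cluster, then predict a position."""
    c = int(clusterer.predict(rough_position.reshape(1, -1))[0])
    return models[c].predict(features.reshape(1, -1))[0]

print(locate(X[0], y[0]))
```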
|
209 |
Physics-Informed, Data-Driven Framework for Model-Form Uncertainty Estimation and Reduction in RANS Simulations. Wang, Jianxun, 05 April 2017
Computational fluid dynamics (CFD) has been widely used to simulate turbulent flows. Although the increased availability of computational resources has enabled high-fidelity simulations (e.g., large eddy simulation and direct numerical simulation) of turbulent flows, models based on the Reynolds-Averaged Navier-Stokes (RANS) equations are still the dominant tools for industrial applications. However, the predictive capability of RANS models is limited by potential inaccuracies driven by hypotheses in the Reynolds stress closure. With the ever-increasing use of RANS simulations in mission-critical applications, the estimation and reduction of model-form uncertainties in RANS models have attracted attention in the turbulence modeling community. In this work, I focus on estimating uncertainties stemming from the RANS turbulence closure and calibrating discrepancies in the modeled Reynolds stresses to improve the predictive capability of RANS models. Both online and offline data are utilized to achieve this goal. The main contributions of this dissertation can be summarized as follows. First, a physics-based, data-driven Bayesian framework is developed for estimating and reducing model-form uncertainties in RANS simulations. An iterative ensemble Kalman method is employed to assimilate sparse online measurement data and empirical prior knowledge for a full-field inversion. The merits of incorporating prior knowledge and physical constraints in calibrating RANS model discrepancies are demonstrated and discussed. Second, a random matrix theoretic framework is proposed for estimating model-form uncertainties in RANS simulations. The maximum entropy principle is employed to identify the probability distribution that satisfies the given constraints without introducing artificial information. Objective prior perturbations of RANS-predicted Reynolds stresses in physical projections are provided based on comparisons between the physics-based and random matrix theoretic approaches. Finally, a physics-informed machine learning framework towards predictive RANS turbulence modeling is proposed. The functional forms of the model discrepancies with respect to mean flow features are extracted from an offline database of closely related flows using machine learning algorithms. The RANS-modeled Reynolds stresses of prediction flows can be significantly improved by the trained discrepancy function, which is an important step towards predictive turbulence modeling. / Ph. D. / Turbulence modeling is a critical component of computational fluid dynamics (CFD) simulations of industrial flows. Despite the significant growth in computational resources over the past two decades, time-resolved high-fidelity simulations (e.g., large eddy simulation and direct numerical simulation) are not feasible for engineering applications. Therefore, the small-scale turbulent velocity fluctuations have to be represented by time-averaged modeling. Turbulence models based on the Reynolds-averaged Navier-Stokes (RANS) equations describe the averaged flow quantities of turbulent flows and are expected to remain the dominant tools for industrial applications in the coming decades. However, for many practical flows, the predictive accuracy of RANS models is largely limited by the model-form uncertainties stemming from potential inaccuracies in the Reynolds stress closure.
As RANS models are used in the design and safety evaluation of many mission-critical systems, such as airplanes and nuclear power plants, properly estimating and reducing these model uncertainties is of significant importance. In this work, I focus on estimating uncertainties stemming from the RANS turbulence closure and calibrating discrepancies in the modeled Reynolds stresses to improve the predictive capability of RANS models. Several data-driven approaches based on state-of-the-art data assimilation and machine learning algorithms are proposed to achieve this goal by leveraging online and offline high-fidelity data. Numerical simulations of several canonical flows are used to demonstrate the merits of the proposed approaches. Moreover, the proposed methods also have implications for many fields in which the governing equations are well understood but the model uncertainties come from unresolved physical processes.
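The first contribution can be illustrated with a single stochastic ensemble Kalman analysis step on a linear toy problem. In the dissertation the step is iterated around a nonlinear RANS forward model and combined with physical constraints, none of which is reproduced in this placeholder sketch.

```python
import numpy as np

def enkf_analysis(ensemble, observe, y_obs, obs_cov, rng):
    """One stochastic ensemble Kalman analysis step.

    ensemble : (n_members, n_state) prior samples of the uncertain quantity
               (standing in for, e.g., Reynolds-stress discrepancy coefficients).
    observe  : maps one state vector to predicted observations (in the dissertation
               this would involve a RANS solve sampled at sensor locations).
    """
    n_members = ensemble.shape[0]
    hx = np.array([observe(m) for m in ensemble])          # predicted observations
    X = (ensemble - ensemble.mean(0)).T / np.sqrt(n_members - 1)   # state anomalies
    Y = (hx - hx.mean(0)).T / np.sqrt(n_members - 1)               # observation anomalies
    gain = X @ Y.T @ np.linalg.inv(Y @ Y.T + obs_cov)              # Kalman gain
    perturbed = y_obs + rng.multivariate_normal(np.zeros(len(y_obs)), obs_cov, n_members)
    return ensemble + (perturbed - hx) @ gain.T                    # analysis ensemble

# Toy linear problem: infer two coefficients from three noisy observations.
rng = np.random.default_rng(1)
H = np.array([[1.0, 0.5], [0.2, 1.0], [0.7, 0.7]])
truth = np.array([0.8, -0.3])
obs_cov = 0.01 * np.eye(3)
y_obs = H @ truth + rng.multivariate_normal(np.zeros(3), obs_cov)

prior = rng.normal(0.0, 1.0, size=(100, 2))
posterior = enkf_analysis(prior, lambda m: H @ m, y_obs, obs_cov, rng)
print("posterior mean:", posterior.mean(0), "truth:", truth)
```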
|
210 |
Dynamic skin deformation using finite difference solutions for character animation. Chaudhry, E., Bian, S.J., Ugail, Hassan, Jin, X., You, L.H., Zhang, J.J., 27 September 2014
We present a new skin deformation method to create dynamic skin deformations. The core elements of our approach are a dynamic deformation model, an efficient data-driven finite difference solution, and a curve-based representation of 3D models. We first reconstruct skin deformation models at different poses from photos taken of a male human arm movement to obtain realistic deformed skin shapes. Then, we extract curves from these reconstructed skin deformation models. A new dynamic deformation model is proposed to describe the physics of dynamic curve deformations, and its finite difference solution is developed to determine the shape changes of the extracted curves. In order to improve the visual realism of skin deformations, we employ data-driven methods and introduce the skin shapes at the initial and final poses into our proposed dynamic deformation model. Experimental examples and comparisons made in this paper indicate that our proposed dynamic skin deformation technique can create realistic deformed skin shapes efficiently with a small data size.
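A minimal finite-difference sketch of the dynamic curve idea described above: the deviation of an extracted skin curve from its final-pose shape evolves under a damped elastic model and is marched forward with explicit central differences. The curves, material constants, and boundary treatment are placeholders, not the paper's actual model or data.

```python
import numpy as np

# Discretize one extracted skin curve into n points and evolve its deviation
# d(u, t) from the final-pose curve with a damped elastic model,
#   rho * d_tt + eta * d_t = k * d_uu,
# solved by explicit central finite differences.
n, steps = 50, 2000
dt = 1e-3
u = np.linspace(0.0, 1.0, n)
du = u[1] - u[0]
rho, eta, k = 1.0, 4.0, 25.0          # density, damping, stiffness (hypothetical)

s_init = np.sin(np.pi * u)            # curve at the initial pose
s_final = 1.3 * np.sin(np.pi * u)     # curve at the final pose
d_prev = s_init - s_final             # deviation at t = 0
d_curr = d_prev.copy()                # zero initial velocity

for _ in range(steps):
    lap = np.zeros(n)
    lap[1:-1] = (d_curr[2:] - 2 * d_curr[1:-1] + d_curr[:-2]) / du**2
    accel = (k * lap - eta * (d_curr - d_prev) / dt) / rho
    d_next = 2 * d_curr - d_prev + dt**2 * accel
    d_next[0] = d_next[-1] = 0.0      # curve end points pinned to the final pose
    d_prev, d_curr = d_curr, d_next

skin_curve = s_final + d_curr         # deformed curve settles at the final pose
print("max deviation from final pose:", np.abs(d_curr).max())
```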
|