11

The spiral curriculum, integrated teaching and structured learning of mathematics at the secondary level

Alummoottil, Joseph Michael January 1990 (has links)
The investigator's experience of teaching mathematics at a college of education since 1983 has reinforced his conviction that trainee students come to college with significant gaps, weaknesses and faults in their (mathematical) conceptual structures, probably as a result of shortcomings in the mathematics teaching to which they have been exposed. The theme of this investigation is thus a natural choice of immediate relevance to secondary school mathematics teaching. The analysis of the issue leads to a unified perspective: the problem is placed in a theoretical framework where Bruner [spiral curriculum], Ausubel [structured learning] and Skemp [relational understanding] are brought together. How the curriculum, textbooks and examinations influence school mathematics teaching is examined in some depth and the consequences investigated. Two specific topics, viz. the generalised Pythagorean relation and absolute value, are investigated in relation to published work, curriculum and textbooks, and each (topic) is presented as a unifying theme in secondary mathematics to standard 9 pupils. The classroom exercise is assessed to test the hypothesis that structured, integrated presentation around a spiral curriculum promotes "relational understanding". Analysis of results supports the hypothesis.
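For context, the "generalised Pythagorean relation" treated at this level is, we assume, the law of cosines (the thesis itself defines the topic; this identification is our assumption):

```latex
% The generalised Pythagorean relation (law of cosines) for a triangle
% with sides a, b, c and angle C opposite side c; the classical
% Pythagorean theorem is recovered as the special case C = 90 degrees.
\[
  c^{2} = a^{2} + b^{2} - 2ab\cos C,
  \qquad
  C = 90^{\circ} \implies c^{2} = a^{2} + b^{2}.
\]
```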
12

Optimization and learning in dynamic environments: models and algorithms

Shukla, Apurv January 2022 (has links)
This dissertation proposes new models and algorithms for optimization and learning in dynamic environments. We consider three problems: the design of variance-aware algorithms for the optimal power flow (OPF) problem, robust streaming PCA, and the contextual Pareto bandit problem. For the variance-aware OPF problem, we observe that incorporating stochastic loads and generation into the operation of power grids exposes the operator to stochastic risk. This risk has been addressed in prior work through a variety of mechanisms, such as scenario generation or chance constraints, that can be incorporated into OPF computations. We introduce a variety of convex variants of OPF that explicitly address the interplay of power-flow variance with cost minimization, and present numerical experiments that highlight our contributions. For robust streaming PCA, we consider streaming principal component analysis (PCA) when the stochastic data-generating model is subject to adversarial perturbations. While existing models assume a fixed stochastic data-generating model, we instead adopt a robust perspective in which the data-generating model constrains the amount of allowed adversarial perturbation, and we establish fundamental limits on the achievable performance of any algorithm that recovers the appropriate principal components under this model. Using a novel proof technique, we establish rate-optimality of robust versions of the noisy power method, previously developed for the non-robust version of the problem. Our modeling and analysis framework provides a unique lens on sequential stochastic optimization with a non-convex objective and sheds light on the fragility of off-the-shelf PCA algorithms in an adversarial environment. Our claims are corroborated by numerical experiments across a range of parameter values governing the streaming PCA problem. For contextual Pareto bandits, we consider a continuum-armed contextual bandit problem with vectorial rewards. We propose a tree-based policy that maintains separate partitions for the action and covariate spaces, evaluate its performance in terms of Contextual Pareto regret, and establish an upper bound on the regret of this static policy. Finally, the efficacy of the proposed policy is demonstrated on a suite of numerical experiments.
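As background, here is a minimal sketch of the classical (non-robust) noisy power method that the robust analysis builds on; the batch model, dimensions, and covariance are illustrative assumptions, not the dissertation's setup:

```python
import numpy as np

def noisy_power_method(sample_cov, d, k, iters=50, rng=None):
    """Minimal (non-robust) noisy power method for top-k PCA.

    sample_cov(t) should return a noisy d x d covariance estimate,
    e.g. built from the t-th mini-batch of a data stream.
    """
    rng = np.random.default_rng(rng)
    Q, _ = np.linalg.qr(rng.standard_normal((d, k)))  # random start
    for t in range(iters):
        Y = sample_cov(t) @ Q          # power step with noisy covariance
        Q, _ = np.linalg.qr(Y)         # re-orthonormalize
    return Q                            # columns approximate top-k PCs

# Usage: a stream of mini-batch covariance estimates from a fixed model.
d, k, n = 20, 3, 200
A = np.diag(np.linspace(1.0, 0.05, d))   # assumed true covariance
rng = np.random.default_rng(0)
def batch_cov(t):
    X = rng.multivariate_normal(np.zeros(d), A, size=n)
    return X.T @ X / n

Q = noisy_power_method(batch_cov, d, k)
```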
13

Efficient portfolio optimisation by hybridised machine learning

26 March 2015 (has links)
D.Ing. / The task of managing an investment portfolio is one that continues to challenge both professionals and private individuals on a daily basis. Contrary to popular belief, the desire of these actors is not in all (or even most) instances to generate the highest profits imaginable, but rather to achieve an acceptable return for a given level of risk. In other words, the investor desires to have his funds generate money for him, while not feeling that he is gambling away his (or his clients') funds. The reasons for a given risk tolerance (or risk appetite) are as varied as the clients themselves: some clients simply have their own arbitrary risk appetites, others must maintain certain values to satisfy their mandates, and still others must meet regulatory requirements. To accomplish this task, many measures and representations of performance data are employed to communicate and understand the risk-reward trade-offs involved in the investment process. In light of the recent economic crisis, greater understanding and control of investment is being clamoured for around the globe, along with the concomitant finger-pointing and blame-assignation that inevitably follow such turmoil and such heavy costs. The reputation of the industry, always dubious in the best of times, has also taken a significant knock after these events, and while this author would not like to point fingers, the managers of funds, as custodians of other people's money, are clearly in no small measure responsible for the loss of the funds under their care. It is with these concerns in mind that this thesis explores the potential for utilising the powerful tools found within the disciplines of artificial intelligence and machine learning to aid fund managers in the balancing of portfolios, tailored specifically to their clients' individual needs. These fields hold particular promise due to their focus on generalised pattern recognition, multivariable optimisation and continuous learning. With these tools in hand, a fund manager is able to continuously rebalance a portfolio for a client, given the client's specific needs, and achieve optimal results while staying within the client's risk parameters (in other words, keeping within the client's comfort zone in terms of price/value fluctuations). This thesis will first explore the drivers and constraints behind the investment process, as well as the process undertaken by the fund manager as recommended by the CFA (Chartered Financial Analyst) Institute. The thesis will then elaborate on the existing theory behind modern investment, and the mathematics and statistics that underlie the process. Some common tools from the field of Technical Analysis will be examined, and their implicit assumptions and limitations will be shown, both for understanding and to show how they can still be utilised once their limitations are explicitly known. Thereafter the thesis will present the various tools from within the fields of machine learning and artificial intelligence that form its heart. Emphasis will be placed on data structuring, and on the inherent dangers to be aware of when structuring data representations for computational use. The thesis will then illustrate how to create an optimiser using a genetic algorithm for the purpose of balancing a portfolio.
Lastly, it will show how to create a learning system that continues to update its own understanding, and a hybrid learning optimiser to enable fund managers to do their job effectively and safely.
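To make the genetic-algorithm optimiser concrete, here is a minimal sketch of balancing a long-only portfolio under a mean-variance fitness; the returns, covariance, risk-aversion parameter, and GA settings are illustrative assumptions, not the thesis's hybrid system:

```python
import numpy as np

rng = np.random.default_rng(42)

# Assumed inputs: expected returns and covariance for 8 assets.
mu = rng.uniform(0.02, 0.12, 8)
Sigma = np.cov(rng.standard_normal((8, 250)))
risk_aversion = 3.0  # illustrative client risk parameter

def fitness(w):
    """Mean-variance utility: reward return, penalize variance."""
    return w @ mu - risk_aversion * (w @ Sigma @ w)

def normalize(w):
    w = np.clip(w, 0, None)              # long-only weights
    return w / w.sum()

pop = np.array([normalize(rng.random(8)) for _ in range(60)])
for gen in range(200):
    scores = np.array([fitness(w) for w in pop])
    parents = pop[np.argsort(scores)][-30:]           # keep fitter half
    children = []
    for _ in range(30):
        a, b = parents[rng.integers(30, size=2)]
        child = np.where(rng.random(8) < 0.5, a, b)   # uniform crossover
        child += rng.normal(0, 0.02, 8)               # Gaussian mutation
        children.append(normalize(child))
    pop = np.vstack([parents, children])

best = pop[np.argmax([fitness(w) for w in pop])]
print("best weights:", best.round(3))
```

In practice the fitness would encode the client's actual risk constraints rather than a single risk-aversion scalar.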
14

Sparse learning under regularization framework. / 正則化框架下的稀疏學習 / CUHK electronic theses & dissertations collection / Zheng ze hua kuang jia xia de xi shu xue xi

January 2011 (has links)
Regularization is a dominant theme in machine learning and statistics due to its prominent ability to provide an intuitive and principled tool for learning from high-dimensional data. As large-scale learning applications become popular, developing efficient algorithms and parsimonious models becomes both promising and necessary. Aiming at solving large-scale learning problems, this thesis tackles key research problems ranging from feature selection to learning with unlabeled data and learning data similarity representations. More specifically, we focus on problems in three areas: online learning, semi-supervised learning, and multiple kernel learning. / The first part of this thesis develops a novel online learning framework to solve group lasso and multi-task feature selection. To the best of our knowledge, the proposed online learning framework is the first for the corresponding models. The main advantages of the online learning algorithms are that (1) they can work in applications where training data arrive sequentially, so the training procedure can be started at any time; and (2) they can handle data of any size with any number of features. The efficiency of the algorithms is attained because we derive closed-form solutions to update the weights of the corresponding models. At each iteration, the online learning algorithms need only O(d) time and memory for group lasso, and O(d × Q) for multi-task feature selection, where d is the number of dimensions and Q is the number of tasks. Moreover, we provide theoretical analysis of the average regret of the online learning algorithms, which also guarantees their convergence rate. In addition, we extend the online learning framework to solve several related models that yield sparser solutions. / The second part of this thesis addresses a general scenario of semi-supervised learning for the binary classification problem, where the unlabeled data may be a mixture of data relevant and irrelevant to the target binary classification task. Without specifying the relatedness in the unlabeled data, we develop a novel maximum margin classifier, named the tri-class support vector machine (3C-SVM), to seek an inductive rule that can separate these data into three categories: −1, +1, or 0. This is achieved by adopting a novel min loss function and following the maximum entropy principle. For the implementation, we approximate the problem and solve it by a standard concave-convex procedure (CCCP). The approach is very efficient and scales to large datasets. / The third part of this thesis focuses on multiple kernel learning (MKL), addressing the insufficiency of the L1-MKL and Lp-MKL models. We propose a generalized MKL (GMKL) model by introducing an elastic-net-type constraint on the kernel weights: an MKL model with a constraint on a linear combination of the L1-norm and the squared L2-norm of the kernel weights, used to seek the optimal kernel combination weights. Previous MKL problems based on the L1-norm or L2-norm constraints can therefore be regarded as special cases. Moreover, our GMKL enjoys a favorable sparsity property in the solution and also facilitates a grouping effect. In addition, the optimization of GMKL is a convex problem, so a local solution is the globally optimal solution.
We further derive the level method to solve the optimization problem efficiently. / Yang, Haiqin. / Advisers: Kuo Chin Irwin King; Michael Rung Tsong Lyu. / Source: Dissertation Abstracts International, Volume: 73-04, Section: B, page: . / Thesis (Ph.D.)--Chinese University of Hong Kong, 2011. / Includes bibliographical references (leaves 152-173). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [201-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstract also in Chinese.
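As a rough illustration of how a closed-form update yields O(d) per-iteration cost, here is a FOBOS-style online group-lasso sketch with the group soft-threshold proximal step, under an assumed squared loss; this is a generic construction, not necessarily the thesis's exact update rule:

```python
import numpy as np

def group_soft_threshold(z, lam):
    """Closed-form prox of the group-lasso penalty: shrink each group
    toward zero; zero it out entirely when its norm is below lam."""
    norm = np.linalg.norm(z)
    return np.zeros_like(z) if norm <= lam else (1 - lam / norm) * z

def online_group_lasso(stream, groups, dim, eta=0.1, lam=0.05):
    """FOBOS-style online update: gradient step, then per-group prox.
    `stream` yields (x, y) pairs; `groups` is a list of index arrays."""
    w = np.zeros(dim)
    for t, (x, y) in enumerate(stream, start=1):
        step = eta / np.sqrt(t)               # decaying step size
        grad = (w @ x - y) * x                # squared-loss gradient
        w = w - step * grad
        for g in groups:                      # O(d) work in total
            w[g] = group_soft_threshold(w[g], step * lam)
    return w

# Usage on synthetic data where only the first group is active.
rng = np.random.default_rng(0)
groups = [np.arange(0, 5), np.arange(5, 10)]
w_true = np.concatenate([rng.normal(size=5), np.zeros(5)])
data = ((x, x @ w_true + 0.01 * rng.normal())
        for x in rng.normal(size=(2000, 10)))
print(online_group_lasso(data, groups, dim=10).round(2))
```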
15

Essays on Demand Estimation, Financial Economics and Machine Learning

He, Pu January 2019 (has links)
In this era of big data, we often rely on techniques ranging from simple linear regression and structural estimation to state-of-the-art machine learning algorithms to make operational and financial decisions based on data. This calls for a deep understanding of the practical and theoretical aspects of methods and models from statistics, econometrics, and computer science, combined with relevant domain knowledge. In this thesis, we study several practical, data-related problems in the domains of the sharing economy and financial economics/financial engineering, using appropriate approaches from an arsenal of data-analysis tools. On the methodological front, we propose a new estimator for the classic demand estimation problem in economics, which is important for pricing and revenue management. In the first part of this thesis, we study customer preference for the bike share system in London, in order to provide policy recommendations on bike share system design and expansion. We estimate a structural demand model on the station network to learn the preference parameters, and use the estimated model to provide insights on the design and expansion of the system. We highlight the importance of network effects in understanding customer demand and evaluating expansion strategies of transportation networks. In the particular example of the London bike share system, we find that allocating resources to some areas of the station network can be ten times more beneficial than to others in terms of system usage, and that the currently implemented station-density rule is far from optimal. We develop a new method to deal with the endogeneity of the choice set in estimating demand for network products. Our method can be applied to other settings in which the available set of products or services depends on demand. In the second part of this thesis, we study demand estimation methodology when data has a long-tail pattern, that is, when a significant portion of products have zero or very few sales. Long-tail distributions in sales or market share data have long been an issue in empirical studies in areas such as economics, operations, and marketing, and they are increasingly common nowadays as more detailed levels of data become available and many more products are offered in places like online retailers and platforms. The classic demand estimation framework cannot deal with zero sales, which yields inconsistent estimates. More importantly, biased demand estimates, if used as an input to subsequent tasks such as pricing, lead to managerial decisions that are far from optimal. We introduce two new two-stage estimators to solve the problem: our solutions apply machine learning algorithms to estimate market shares in the first stage, and in the second stage we utilize the first-stage results to correct for the selection bias in demand estimates. We find in simulations that our approach works better than traditional methods. In the third part of this thesis, we study how to extract a signal from option pricing models to form a profitable stock trading strategy. Recent work has documented roughness in the time series of stock market volatility and investigated its implications for option pricing. We study a strategy for trading stocks based on measures of their implied and realized roughness. A strategy that goes long the roughest-volatility stocks and short the smoothest-volatility stocks earns statistically significant excess annual returns of 6% or more, depending on the time period and strategy details.
Standard factors do not explain the profitability of the strategy. We compare alternative measures of roughness in volatility and find that the profitability of the strategy is greater when we sort stocks on implied rather than realized roughness. We interpret the profitability of the strategy as compensation for near-term idiosyncratic event risk. Lastly, we apply a heterogeneous treatment effect (HTE) estimator from statistics and machine learning to financial asset pricing. Recent progress in the interdisciplinary area of causal inference and machine learning has produced various promising estimators for HTE. We take the R-learner algorithm of [73] and adapt it to empirical asset pricing. We study characteristics associated with the standard size, value and momentum factors through the lens of HTE. Our goal is to identify sub-universes of stocks, "characteristic responders", in which size, value or momentum trading strategies perform best, compared with the performance had they been applied to the entire universe. Conversely, we identify subsets of "characteristic traps" in which the strategies perform worst. In our test period, the differences in average monthly returns between long-short strategies restricted to "characteristic responders" and "characteristic traps" range from 0.77% to 1.54%, depending on treatment characteristics. The differences are statistically significant and cannot be explained by standard factors: a long-short of long-short strategy generates alpha of significant magnitude, from 0.98% to 1.80% monthly, with respect to standard Fama-French plus momentum factors. Simple interaction terms between standard factors and ex-post important features do not explain the alphas either. We also characterize and interpret the characteristic traps and responders identified by our algorithm. Our study can be viewed as a systematic, data-driven way to investigate interaction effects between features and treatment characteristics, and to identify characteristic traps and responders.
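The abstract does not specify its roughness estimator; a common approach in the rough-volatility literature regresses log moments of log-volatility increments on log lag to recover a Hurst-type exponent. The sketch below uses that generic approach on synthetic data; all names and parameters are ours, not the thesis's:

```python
import numpy as np

def roughness_index(log_vol, lags=(1, 2, 4, 8, 16), q=2.0):
    """Estimate a Hurst-type roughness exponent H from a series of
    log volatilities: m(q, lag) ~ lag**(q*H), so the slope of
    log m against log lag, divided by q, estimates H. Smaller H
    means rougher volatility."""
    log_m = []
    for lag in lags:
        inc = log_vol[lag:] - log_vol[:-lag]
        log_m.append(np.log(np.mean(np.abs(inc) ** q)))
    slope, _ = np.polyfit(np.log(lags), log_m, 1)
    return slope / q

# Usage on synthetic log-volatility driven by scaled Gaussian noise
# (a real pipeline would use realized or implied volatilities).
rng = np.random.default_rng(1)
log_vol = np.cumsum(0.1 * rng.standard_normal(5000))  # H ~ 0.5 walk
print(f"estimated H: {roughness_index(log_vol):.2f}")
```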
16

A Mathematical Study of Learning Dynamics

Keller, Rachael Tara January 2021 (has links)
Data-driven discovery of dynamics, where data is used to learn unknown dynamics, is witnessing a resurgence of interest as data and computational tools have become widespread and increasingly accessible. Advances in machine learning, data science, and neural networks are fueling new data-driven studies and rapidly changing the landscape in almost every field. Meanwhile, classical numerical analysis remains a steady tool for analyzing these new problems. This thesis situates emerging work coupling machine learning, neural networks, and data-driven discovery of dynamics within classical numerical theory. We begin by formulating a universal learning framework grounded in optimization theory. We discuss how three paradigms of machine learning (supervised, unsupervised, and reinforcement learning) are encapsulated by this framework and form a general learning problem for discovery of dynamics. Using this formulation, we distill data-driven discovery of dynamics via the classical technique of linear multistep methods with neural networks to its most basic roots in numerical analysis. We establish for the first time a rigorous mathematical theory for using linear multistep methods in discovery of dynamics assuming exact data. We present refined notions of consistency, stability, and convergence for discovery, and show convergence results for the popular Adams-Bashforth, Adams-Moulton, and backward differentiation formula schemes. Extending the study to noisy data, we propose and analyze the recovery of a smooth approximation to the state using splines, and prove new results on discrete differentiation error estimates.
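As a toy illustration of discovery of dynamics with a linear multistep method (a plain least-squares sketch over an assumed feature library, not the thesis's neural-network framework):

```python
import numpy as np

def ab2_discover(x, h, features):
    """Recover the right-hand side f of x' = f(x) from trajectory
    samples x_0..x_N by enforcing the Adams-Bashforth 2-step relation
    x_{n+1} - x_n = h*(3/2 f(x_n) - 1/2 f(x_{n-1})) in least squares,
    with f parameterized over a linear feature library."""
    Theta = np.array([features(v) for v in x])
    lhs = (x[2:] - x[1:-1]) / h                    # scaled differences
    rhs = 1.5 * Theta[1:-1] - 0.5 * Theta[:-2]     # AB2 combination
    w, *_ = np.linalg.lstsq(rhs, lhs, rcond=None)
    return w                                        # feature weights

# Usage: exact data from x' = -0.5*x, so x_k = exp(-0.5*h*k).
h = 0.01
x = np.exp(-0.5 * h * np.arange(400))
features = lambda v: np.array([1.0, v, v * v])     # library: 1, x, x^2
print(ab2_discover(x, h, features).round(3))        # ~ [0, -0.5, 0]
```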
17

Salience Estimation and Faithful Generation: Modeling Methods for Text Summarization and Generation

Kedzie, Christopher January 2021 (has links)
This thesis is focused on a particular text-to-text generation problem, automatic summarization, where the goal is to map a large input text to a much shorter summary text. The research presented aims to both understand and tame existing machine learning models, hopefully paving the way for more reliable text-to-text generation algorithms. Somewhat against the prevailing trends, we eschew end-to-end training of an abstractive summarization model, and instead break down the text summarization problem into its constituent tasks. At a high level, we divide these tasks into two categories: content selection, or "what to say", and content realization, or "how to say it" (McKeown, 1985). Within these categories we propose models and learning algorithms for the problems of salience estimation and faithful generation. Salience estimation, that is, determining the importance of a piece of text relative to some context, falls into the former category: determining what should be selected for a summary. In particular, we experiment with a variety of popular or novel deep learning models for salience estimation in a single-document summarization setting, and design several ablation experiments to gain insight into which input signals are most important for making predictions. Understanding these signals is critical for designing reliable summarization models. We then consider the more difficult problem of estimating salience in a large document stream, and propose two alternative approaches using classical machine learning techniques from unsupervised clustering and structured prediction. These models incorporate salience estimates into larger text extraction algorithms that also consider redundancy and previous extraction decisions. Overall, we find that when simple, position-based heuristics are available, as in single-document news or research summarization, deep learning models of salience often exploit them to make predictions while ignoring the arguably more important content features of the input. In more demanding environments, like stream summarization, where heuristics are unreliable, more semantically relevant features become key to identifying salient content. In part two, on content realization, we assume content selection has already been performed and focus on methods for faithful generation, i.e., ensuring that output text utterances respect the semantics of the input content. Since they can generate very fluent and natural text, deep learning-based natural language generation models are a popular approach to this problem. However, they often omit, misconstrue, or otherwise generate text that is not semantically correct given the input content. In this section, we develop a data augmentation and self-training technique to mitigate this problem. Additionally, we propose a training method for making deep learning-based natural language generation models capable of following a content plan, allowing for more control over the output utterances generated by the model. Under a stress-test evaluation protocol, we demonstrate some empirical limits on several neural natural language generation models' ability to encode and properly realize a content plan. Finally, we conclude with some remarks on future directions for abstractive summarization outside of the end-to-end deep learning paradigm.
Our aim here is to suggest avenues for constructing abstractive summarization systems with transparent, controllable, and reliable behavior when it comes to text understanding, compression, and generation. Our hope is that this thesis inspires more research in this direction, and, ultimately, real tools that are broadly useful outside of the natural language processing community.
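For a concrete picture of extraction that trades salience against redundancy, here is a classical MMR-style greedy selector (Carbonell and Goldstein, 1998); this is a standard baseline, not one of the thesis's proposed models, and the embeddings and salience scores are assumed inputs:

```python
import numpy as np

def mmr_extract(sent_vecs, salience, k=3, lam=0.7):
    """Greedy MMR-style extraction: trade off salience against
    redundancy with already-selected sentences.
    sent_vecs: unit-normalized sentence embeddings."""
    selected = []
    candidates = list(range(len(sent_vecs)))
    while candidates and len(selected) < k:
        def score(i):
            redundancy = max((sent_vecs[i] @ sent_vecs[j]
                              for j in selected), default=0.0)
            return lam * salience[i] - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

# Usage with toy embeddings and salience estimates.
rng = np.random.default_rng(0)
vecs = rng.standard_normal((10, 50))
vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)
sal = rng.random(10)          # e.g. output of a salience model
print(mmr_extract(vecs, sal))
```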
18

Examination of Bandwidth Enhancement and Circulant Filter Frequency Cutoff Robustification in Iterative Learning Control

Zhang, Tianyi January 2021 (has links)
The iterative learning control (ILC) problem considers control tasks that perform a specific tracking command, where the command is to be performed many times. The system returns to the same initial conditions on the desired trajectory for each repetition, also called a run or iteration. The learning law adjusts the command to a feedback system based on the error observed in the previous run, and aims to converge to zero tracking error at sampled times as the iterations progress. The ILC problem is an inverse problem: it seeks the command that produces the desired output. Mathematically, that command is given by the inverse of the transfer function of the feedback system times the desired output. However, in many applications that unique command is an unstable function of time. A discrete-time system, converted from a continuous-time system fed by a zero-order hold, often has non-minimum-phase zeros, which become unstable poles in the inverse problem. An inverse discrete-time system will have at least one unstable pole if the pole-zero excess of the original continuous-time counterpart is three or larger and the sample rate is fast enough. The corresponding difference equation has roots of magnitude larger than one, and the homogeneous solution has components that are the values of these poles raised to the power k, with k being the time step. This creates an unstable command growing in magnitude with time step. If the ILC law aims at zero tracking error for such systems, the ILC iterations will produce a command input that grows exponentially in magnitude with each time step. This thesis examines several ways to circumvent this difficulty by designing filters that prevent this growth in ILC. The sister field of ILC, repetitive control (RC), aims at zero error at sample times when tracking a periodic command, eliminating a periodic disturbance of known period, or both. Instead of learning from a previous run that always starts from the same initial condition, RC learns from the error in the previous period of the periodic command or disturbance. Unlike ILC, the system in RC eventually enters steady state as time progresses. As a result, one can use frequency-response thinking. In ILC, frequency-response thinking is not applicable since the output of the system has transients in every run. RC is also an inverse problem, and the periodic command to the system converges to the inverse of the system times the desired output. Because what RC needs is zero error after reaching steady state, one can aim to invert the steady-state frequency response of the system instead of the system transfer function in order to have a stable solution to the inverse problem. This can be accomplished by designing a Finite Impulse Response (FIR) filter that mimics the steady-state frequency response and can be used in real time. This dissertation discusses how the digital feedback control system configuration affects the locations of sampling zeros, and discusses the effectiveness of RC design methods for these possible sampling zeros. Sampling zeros are zeros introduced by the discretization from a continuous-time to a discrete-time system. In the RC problem, the feedback control system can have sampling zeros outside the unit circle, and these are a challenge for RC law design.
Previous research concentrated on the situation where the sampling zeros of the feedback control system come from a zero-order hold on the input of a continuous-time feedback system, and studied the influence of these sampling zeros as the sample time interval approaches zero. Effective RC design methods were developed and tested for this configuration. In the real world, however, the feedback control system may not be a continuous-time system. Here we investigate the sampling-zero locations that can be encountered in digital control systems where the zero-order hold can be in various possible places in the control loop. We show that various new situations can occur, discuss the sampling-zero locations for different feedback system structures, and show that the RC design methods still work. Moreover, we compare the learning rates of different RC design methods and show that the design based on a quadratic fit of the reciprocal of the steady-state frequency response has the desired learning-rate features, balancing robustness with efficiency. This dissertation also discusses the steady-state response filter of the finite-time signal used in ILC. The ILC problem is sensitive to model errors and unmodelled high-frequency dynamics, so it needs a zero-phase low-pass filter to cut off learning at frequencies where the model is too inaccurate for convergence. But typical zero-phase low-pass filters, like filtfilt in MATLAB, produce filtered results with transients that can destabilize ILC. The associated issues are examined from several points of view. First, the dissertation discusses the use of a partial inverse of the feedback system as both learning gain matrix and low-pass filter. The partial system inverse is formed for frequencies where the model is accurate, eliminating the robustness issue. The concept also offers a way to improve a feedback control system whose bandwidth is not as high as desired: when the feedback design cannot achieve the desired bandwidth, a partial system inverse for frequencies above the bandwidth can boost the bandwidth, and if needed ILC can further correct the response up to the new bandwidth. The dissertation then discusses Discrete Fourier Transform (DFT) based filters that cut off learning at high frequencies where model uncertainty is too large for convergence. The concept of a low-pass filter is based on steady-state frequency response, but ILC is always a finite-time problem; this mismatch in the design process is what we seek to address. A mathematical proof shows that DFT-based filters directly give the steady-state response of the filter for the finite-time signal, which eliminates this source of ILC instability. However, such filters suffer from frequency leakage and the Gibbs phenomenon in applications, produced by the difference between the filtered signal's value at the start time and at the final time; this difference is present in the signal being filtered in nearly every ILC iteration. This dissertation discusses a single reflection that produces a signal whose start and end values match, after which the original portion of the filtered result is retained.
In addition, a double reflection of the signal is studied that aims not only to eliminate the discontinuity that produces the Gibbs phenomenon, but also to obtain continuity of the first derivative; a specific kind of double reflection is applied. It is shown mathematically that the two reflection methods reduce the Gibbs phenomenon, and a criterion is given to determine when such reflection methods should be considered for any signal. Numerical simulations demonstrate the benefits of these reflection methods in reducing the tracking error of the system.
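A small sketch of the single-reflection idea under assumed signal and cutoff choices (not the dissertation's exact filters): reflecting the signal makes the endpoints of the periodic extension match before a DFT-based zero-phase low-pass filter is applied, reducing Gibbs ringing:

```python
import numpy as np

def zero_phase_lowpass(u, cutoff_frac=0.25):
    """DFT-based zero-phase low-pass: zeroing rfft bins applies a
    filter with no phase lag and no start-up transient."""
    U = np.fft.rfft(u)
    n_keep = int(cutoff_frac * len(U))
    U[n_keep:] = 0.0                      # cut off learning frequencies
    return np.fft.irfft(U, n=len(u))

def filtered_with_reflection(u, cutoff_frac=0.25):
    """Single reflection: append the reversed signal so the periodic
    extension has matching endpoints, filter, keep original portion."""
    ext = np.concatenate([u, u[::-1]])
    return zero_phase_lowpass(ext, cutoff_frac)[: len(u)]

# Usage: a finite-time command whose endpoints differ.
t = np.linspace(0, 1, 256)
u = t + 0.05 * np.sin(2 * np.pi * 40 * t)   # trend + high-freq content
naive = zero_phase_lowpass(u)                # rings near both ends
better = filtered_with_reflection(u)         # much reduced ringing
print(abs(naive[0] - u[0]), abs(better[0] - u[0]))
```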
19

Application of Support Vector Machine in Predicting the Market's Monthly Trend Direction

Alali, Ali 10 December 2013 (has links)
In this work, we investigate techniques to predict the monthly trend direction of the S&P 500 market index. The techniques use a machine learning classifier with technical and macroeconomic indicators as input features. The Support Vector Machine (SVM) classifier was explored in depth in order to optimize performance using four different kernels: linear, Radial Basis Function (RBF), polynomial, and quadratic. One finding was that the classifier's performance can be optimized by reducing the number of macroeconomic features needed by 30% using Sequential Feature Selection. Further performance enhancement was achieved by optimizing the RBF kernel and SVM parameters through grid search. This resulted in final classification accuracy rates of 62% using technical features alone with grid search, and 60.4% using macroeconomic features alone with Rankfeatures.
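A minimal sketch of RBF-kernel SVM tuning by grid search in the spirit of the approach described; the synthetic features and parameter grid are assumptions, and scikit-learn stands in for whatever tooling the thesis used:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumed data: rows of monthly indicator values, labels = next month's
# trend direction (+1 up, -1 down). Replace with real features.
rng = np.random.default_rng(0)
X = rng.standard_normal((240, 12))           # 20 years x 12 indicators
y = np.sign(X[:, 0] + 0.5 * rng.standard_normal(240))

model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
grid = {"svc__C": [0.1, 1, 10, 100], "svc__gamma": [0.001, 0.01, 0.1, 1]}

# TimeSeriesSplit avoids training on data that postdates the test fold.
search = GridSearchCV(model, grid, cv=TimeSeriesSplit(n_splits=5))
search.fit(X, y)
print(search.best_params_, f"accuracy: {search.best_score_:.3f}")
```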
20

Towards Trustworthy Geometric Deep Learning for Elastoplasticity

Vlassis, Nikolaos Napoleon January 2021 (has links)
Recent advances in machine learning have unlocked new potential for innovation in engineering science. Neural networks are used as universal function approximators that harness high-dimensional data with excellent learning capacity. While this is an opportunity to accelerate computational mechanics research, application to constitutive modeling is not trivial. Machine learning material response predictions that do not enforce physical constraints may lack interpretability and could be detrimental in high-risk engineering applications. This dissertation presents a meta-modeling framework for automating the discovery of elastoplasticity models across material scales, with emphasis on establishing interpretable and, hence, trustworthy machine learning modeling tools. Our objective is to introduce a workflow that leverages computational mechanics domain expertise to enforce, or post hoc validate, physical properties of the data-driven constitutive laws. Firstly, we introduce a deep learning framework designed to train and validate neural networks that predict the hyperelastic response of materials. We adopt the Sobolev training method and adapt it for mechanics modeling to gain control over the higher-order derivatives of the learned functions. We generate machine learning models that are thermodynamically consistent, interpretable, and demonstrate enhanced learning capacity. The Sobolev training framework is shown through numerical experiments on different material data sets (e.g. β-HMX crystal, polycrystals, soil) to generate hyperelastic energy functionals that predict the elastic energy, stress, and stiffness measures more accurately than classical training methods that minimize L2 norms. To model path-dependent phenomena, we depart from the common approach of lumping the elastic and plastic response into one black-box neural network prediction. We decompose the elastoplastic behavior into its interpretable theoretical components by training separately a stored elastic energy function, a yield surface, and a plastic flow, each of which evolves based on a set of deep neural network predictions. We interpret the yield function as a level set and control its evolution as the neural-network-approximated solution of a Hamilton-Jacobi equation that governs the hardening/softening mechanism. Our framework may recover any classical yield function and hardening rule from the literature, as well as discover new mechanisms that are either previously unknown or difficult to express mathematically. Through numerical experiments on a 3D FFT-generated polycrystal material response database, we demonstrate that our novel approach provides more robust and accurate forward predictions of cyclic stress paths than black-box deep neural network models. We demonstrate the framework's capacity to readily extend to more complex plasticity phenomena, such as pressure sensitivity, rate dependence, and anisotropy. Finally, we integrate geometric deep learning and Sobolev training to generate constitutive models for the homogenized responses of anisotropic microstructures (e.g. polycrystals, granular materials). Commonly used hand-crafted homogenized microstructural descriptors (e.g. porosity or the averaged orientation of constituents) may not adequately capture the topological structure of a material. This is overcome by introducing weighted graphs as new high-dimensional descriptors that represent topological information, such as the connectivity of anisotropic grains in an assembly.
Through graph convolutional deep neural networks and graph embedding techniques, our neural networks extract low-dimensional features from the weighted graphs and, subsequently, learn the influence of these low-dimensional features on the resultant stored elastic energy functionals and plasticity models.
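As a minimal illustration of Sobolev training (an assumed 1-D toy, not the dissertation's code), one can penalize both the predicted energy and its autograd-computed derivative, the stress:

```python
import torch

# Fit an energy network psi(strain) while also matching its gradient
# (the stress) on synthetic 1-D data from psi*(e) = 0.5*k*e^2.
k_true = 2.0
e = torch.linspace(-1, 1, 128).reshape(-1, 1)
energy_true = 0.5 * k_true * e**2
stress_true = k_true * e                     # d psi*/d e

psi = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))
opt = torch.optim.Adam(psi.parameters(), lr=1e-2)

for step in range(2000):
    e_in = e.clone().requires_grad_(True)
    energy = psi(e_in)
    # Stress prediction via autograd; create_graph lets us backprop
    # through the derivative itself (the Sobolev H1 term).
    stress, = torch.autograd.grad(energy.sum(), e_in, create_graph=True)
    loss = torch.mean((energy - energy_true) ** 2) \
         + torch.mean((stress - stress_true) ** 2)
    opt.zero_grad()
    loss.backward()
    opt.step()

print(f"final loss: {loss.item():.2e}")
```

The create_graph=True flag is what makes the derivative term trainable, which is the essence of Sobolev training.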
