1.
A Self-Organizing Computational Neural Network Architecture with Applications to Sensorimotor Grounded Linguistic Grammar Acquisition. Jansen, Peter. 10 1900 (has links)
<p> Connectionist models of language acquisition typically have difficulty with systematicity, the ability of a network to generalize its limited experience with language to novel utterances. Connectionist systems learning grammar from a set of example sentences tend to store a set of specific instances, rather than a generalized abstract knowledge of the process of grammatical combination. Further, recent models that do show limited systematicity do so at the expense of simultaneously storing explicit lexical knowledge, and also make use of both developmentally-implausible training data and biologically-implausible learning rules. Consequently, this research program develops a novel unsupervised neural network architecture, and applies this architecture to the problem of systematicity in language models.</p> <p> In the first of several studies, a connectionist architecture capable of simultaneously storing explicit and separate representations of both conceptual and grammatical information is developed, where this architecture is a hybrid of a self-organizing map and an intra-layer Hebbian associative network. Over the course of several studies, this architecture's capacity to acquire linguistic grammar is evaluated, and the architecture is progressively refined until it is capable of acquiring a benchmark grammar consisting of several difficult clausal sentence structures - though it must acquire this grammar at the level of grammatical category, rather than the lexical level.</p> <p> The final study bridges the gap between the lexical and grammatical category levels, and
develops an activation function based on a semantic feature co-occurrence metric. In concert
with developmentally-plausible sensorimotor grounded conceptual representations, it is shown
that a network using this metric is able to undertake a process of semantic bootstrapping, and
successfully acquire separate explicit representations at the level of the concept, part-of-speech category, and grammatical sequence. This network demonstrates broadly systematic behaviour on a difficult test of systematicity, and extends its knowledge of grammar to novel sensorimotor-grounded words.</p> / Thesis / Doctor of Philosophy (PhD)
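The abstract above does not give the architecture in code; as a rough, hedged illustration of the general idea (all class names, sizes, and learning rates below are hypothetical assumptions, not the thesis's actual model), a self-organizing map can be coupled with an intra-layer Hebbian weight matrix that associates successively winning units, so that map units come to encode categories and the Hebbian links come to encode sequences:

```python
import numpy as np

rng = np.random.default_rng(0)

class SOMHebbian:
    """Toy hybrid of a self-organizing map and an intra-layer Hebbian
    associative network. Inputs are mapped to map units by the SOM;
    transitions between successively winning units are accumulated in
    a Hebbian weight matrix, crudely modelling how such a hybrid could
    store grammatical sequence knowledge over learned categories."""

    def __init__(self, n_units=16, dim=8, lr=0.5, hebb_lr=0.1):
        self.codebook = rng.normal(size=(n_units, dim))
        self.hebb = np.zeros((n_units, n_units))  # intra-layer associations
        self.lr, self.hebb_lr = lr, hebb_lr

    def bmu(self, x):
        # best-matching unit: index of the closest codebook vector
        return int(np.argmin(np.linalg.norm(self.codebook - x, axis=1)))

    def train_sequence(self, seq):
        prev = None
        for x in seq:
            w = self.bmu(x)
            # SOM update: move the winning unit toward the input
            self.codebook[w] += self.lr * (x - self.codebook[w])
            # Hebbian update: strengthen the link from the previous winner
            if prev is not None:
                self.hebb[prev, w] += self.hebb_lr
            prev = w

# two distinct "word" vectors forming a repeated two-word "sentence"
a, b = rng.normal(size=8), rng.normal(size=8) + 5.0
net = SOMHebbian()
for _ in range(20):
    net.train_sequence([a, b])

# the observed a -> b transition accumulates association strength
print(net.hebb.sum())
```

The actual thesis architecture additionally separates conceptual and grammatical layers and uses a semantic feature co-occurrence activation function; this sketch shows only the SOM-plus-Hebbian coupling in its simplest form.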
2.
Hybrid Machine Learning and Physics-Based Modeling Approaches for Process Control and Optimization. Park, Junho. 01 December 2022 (has links)
Transformer neural networks have made a significant impact on natural language processing. The Transformer's self-attention mechanism effectively addresses the vanishing gradient problem that limits a network's learning capability, especially as time series grow longer or networks grow deeper. This dissertation examines the use of the Transformer model for time-series forecasting and customizes it as a simultaneous multistep-ahead prediction model in a surrogate model predictive control (MPC) application. The proposed method demonstrates enhanced control performance and computational efficiency compared to long short-term memory (LSTM)-based MPC and to one-step-ahead prediction model structures for both LSTM and Transformer networks. In addition to the Transformer, this research investigates hybrid machine-learning modeling. Machine learning models are known for superior function approximation capability given sufficient data; however, data of the quantity and quality needed to ensure prediction precision are usually not readily available. The physics-informed neural network (PINN) is a hybrid modeling method that uses dynamic physics-based equations in training a standard machine learning model, in the form of a multi-objective optimization. This research studies the PINN approach with the Transformer, a state-of-the-art time-series network, providing a standard procedure for developing the Physics-Informed Transformer (PIT) and validating it with various case studies. This research also investigates the benefit of nonlinear model-based control and estimation algorithms for managed pressure drilling (MPD), presenting a new real-time high-fidelity flow model (RT-HFM) for bottom-hole pressure (BHP) regulation in MPD operations. Lastly, this work presents details of an Arduino microcontroller temperature control lab as a benchmark for modeling and control methods.
Standard benchmarks are essential for comparing competing models and control methods, especially when a new method is proposed. A physical benchmark considers real process characteristics such as the requirement to meet a cycle time, discrete sampling intervals, communication overhead with the process, and model mismatch. Novel contributions of this work are (1) a new MPC system built upon a Transformer time-series architecture, (2) a training method for time-series machine learning models that enables multistep-ahead prediction, (3) verification of Transformer MPC solution time performance improvement (15 times) over LSTM networks, (4) physics-informed machine learning to improve extrapolation potential, and (5) two case studies that demonstrate hybrid modeling and benchmark performance criteria.
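The multi-objective character of PINN training described above can be sketched in a few lines. The toy physics model, weights, and function names below are illustrative assumptions (the dissertation's actual PIT loss is not given in the abstract): the objective combines a data misfit with the residual of a governing equation, here a first-order decay dy/dt + k·y = 0 with the derivative approximated by finite differences on the prediction itself:

```python
import numpy as np

def pinn_loss(y_pred, y_data, t, k=0.5, w_phys=1.0):
    """Composite PINN-style objective: data misfit plus the squared
    residual of the physics model dy/dt + k*y = 0, with dy/dt taken
    by finite differences on the prediction. w_phys weights the two
    objectives against each other."""
    data_loss = np.mean((y_pred - y_data) ** 2)
    dydt = np.gradient(y_pred, t)
    phys_residual = np.mean((dydt + k * y_pred) ** 2)
    return data_loss + w_phys * phys_residual

t = np.linspace(0.0, 4.0, 50)
y_true = np.exp(-0.5 * t)                          # exact solution of dy/dt = -0.5*y
y_wrong = np.exp(-0.5 * t) + 0.3 * np.sin(3 * t)   # violates the physics

loss_consistent = pinn_loss(y_true, y_true, t)
loss_violating = pinn_loss(y_wrong, y_true, t)
print(loss_consistent < loss_violating)  # physics-consistent prediction scores lower
```

In a real PINN, `y_pred` would come from the network (here, a Transformer) and this combined loss would be minimized by gradient descent, so the physics term regularizes the model where data is scarce and improves extrapolation.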
3.
Neural network based identification and control of an unmanned helicopter. Samal, Mahendra. Engineering & Information Technology, Australian Defence Force Academy, UNSW. January 2009 (has links)
This research work provides the development of an Adaptive Flight Control System (AFCS) for autonomous hover of a Rotary-wing Unmanned Aerial Vehicle (RUAV). Due to the complex, nonlinear and time-varying dynamics of the RUAV, indirect adaptive control using Model Predictive Control (MPC) is utilised. The performance of the MPC depends mainly on the model of the RUAV used for predicting its future behaviour. Due to the complexities associated with the RUAV dynamics, a neural network based black-box identification technique is used for modelling the behaviour of the RUAV. An auto-regressive neural network architecture is developed for offline and online modelling purposes. A hybrid modelling technique that exploits the advantages of both the offline and online models is proposed: at every sample time, the predictions from the offline-trained model are corrected using the error predictions from the online model. To reduce the computational time for training the neural networks, a principal component analysis based algorithm that reduces the dimension of the input training data is also proposed, and is shown to reduce the computational time significantly. These identification techniques are validated in numerical simulations before flight testing on the Eagle and RMAX helicopter platforms. Using the validated RUAV models, a Neural Network based Model Predictive Controller (NN-MPC) is developed that takes the non-linearity of the RUAVs and the constraints into consideration. The parameters of the MPC are chosen to satisfy the performance requirements imposed on the flight controller, and the optimisation problem is solved numerically using nonlinear optimisation techniques. The performance of the controller is extensively validated using numerical simulation models before flight testing.
The effects of actuator and sensor delays and noise, along with wind gusts, are taken into account during these numerical simulations. In addition, the robustness of the controller is validated numerically against possible parameter variations, and the numerical simulation results are compared with a baseline PID controller. Finally, the NN-MPCs are flight tested for height control and autonomous hover, using SISO as well as multiple-SISO controllers. The flight tests are conducted in varying weather conditions to validate the utility of the control technique, and the NN-MPC in conjunction with the proposed hybrid modelling technique is shown to handle additional disturbances successfully. Extensive flight test results justify the use of the NN-MPC technique as a reliable method for controlling non-linear complex dynamic systems such as RUAVs.
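The PCA-based input reduction mentioned above can be illustrated with a short sketch. The thesis's actual algorithm and dimensions are not given in the abstract, so the data sizes, variance threshold, and function name here are assumptions; the sketch keeps only enough principal components to explain a chosen fraction of the variance of the identification network's training inputs:

```python
import numpy as np

rng = np.random.default_rng(1)

def pca_reduce(X, var_keep=0.95):
    """Project training inputs onto the leading principal components,
    keeping just enough of them to explain `var_keep` of the total
    variance. Shrinking the input dimension this way reduces the
    size, and hence the training time, of the identification network."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = S**2 / np.sum(S**2)
    k = int(np.searchsorted(np.cumsum(explained), var_keep)) + 1
    return Xc @ Vt[:k].T, Vt[:k]

# 200 samples of a nominally 20-dimensional input that really lives
# in a 3-dimensional subspace, plus a little measurement noise
latent = rng.normal(size=(200, 3))
mix = rng.normal(size=(3, 20))
X = latent @ mix + 0.01 * rng.normal(size=(200, 20))

Z, components = pca_reduce(X)
print(Z.shape)  # far fewer columns than the original 20
```

The reduced matrix `Z` would then replace `X` as the network's training input, with `components` retained to project new measurements into the same low-dimensional space online.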
4.
Multimodal Deep Learning for Multi-Label Classification and Ranking Problems. Dubey, Abhishek. January 2015 (has links) (PDF)
In recent years, deep neural network models have been shown to outperform many state-of-the-art algorithms. One reason is that unsupervised pretraining of multi-layered deep neural networks learns better features, which in turn improves many supervised tasks. These models not only automate the feature extraction process but also provide robust features for various machine learning tasks. However, unsupervised pretraining and feature extraction with multi-layered networks are restricted to the input features and do not extend to the output. The performance of many supervised learning algorithms depends on how well the output dependencies are handled [Dembczyński et al., 2012], and adapting standard neural networks to handle these output dependencies for a specific type of problem has been an active area of research [Zhang and Zhou, 2006, Ribeiro et al., 2012].
On the other hand, inference over multimodal data is considered a difficult problem in machine learning, and recently ‘deep multimodal neural networks’ have shown significant results [Ngiam et al., 2011, Srivastava and Salakhutdinov, 2012]. Several problems, such as classification with complete or missing modality data and generation of a missing modality, have been shown to work very well with these models. In this work, we consider three nontrivial supervised learning tasks: (i) multi-class classification (MCC), (ii) multi-label classification (MLC) and (iii) label ranking (LR), listed in order of increasing output complexity. Multi-class classification predicts exactly one class for every instance, multi-label classification predicts one or more classes for every instance, and label ranking assigns a rank to each label for every instance. Much of the work in this field centres on formulating new error functions that force the network to capture the output dependencies.
The aim of our work is to adapt neural networks to handle the output dependencies implicitly within the network structure, removing the need for hand-crafted error functions. We show that multimodal deep architectures can be adapted to these types of problems by treating the labels as one of the modalities, which also brings unsupervised pretraining to the output along with the input. These models not only outperform standard deep neural networks, but also outperform standard adaptations of neural networks for individual domains, under various metrics and over several data sets. We observe that the advantage of our models over the others grows as the complexity of the output increases.
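The three output structures ordered above can be made concrete with a small illustration (the label names and the four-label problem are hypothetical, not taken from the thesis): the target grows from a one-hot vector, to a multi-hot vector, to a full permutation of ranks, which is exactly the increasing output complexity the abstract refers to:

```python
import numpy as np

# Three output encodings of increasing complexity for the same
# hypothetical four-label problem.
labels = ["cat", "dog", "bird", "fish"]

# (i) multi-class: exactly one label per instance -> one-hot target
mcc_target = np.array([0, 1, 0, 0])      # instance is "dog"

# (ii) multi-label: any non-empty subset -> multi-hot target
mlc_target = np.array([1, 1, 0, 0])      # "cat" and "dog" both apply

# (iii) label ranking: a full ordering -> one rank per label (1 = best)
lr_target = np.array([2, 1, 4, 3])       # dog > cat > fish > bird

assert mcc_target.sum() == 1              # a single class
assert mlc_target.sum() >= 1              # at least one class
assert sorted(lr_target) == [1, 2, 3, 4]  # a permutation of ranks
```

In the labels-as-a-modality view, one of these target vectors is simply concatenated alongside the input features as a second modality during unsupervised pretraining, and is treated as the "missing" modality to be inferred at test time.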
5.
Hardware/Software Co-Design for Keyword Spotting on Edge Devices. Jacob Irenaeus M Bushur (15360553). 29 April 2023 (has links)
<p>The introduction of artificial neural networks (ANNs) to speech recognition applications has sparked the rapid development and popularization of digital assistants. These digital assistants perform keyword spotting (KWS), constantly monitoring the audio captured by a microphone for a small set of words or phrases known as keywords. Upon recognizing a keyword, a larger audio recording is saved and processed by a separate, more complex neural network. More broadly, neural networks in speech recognition have popularized voice as a means of interacting with electronic devices, sparking an interest among individuals in using speech recognition in their own projects. However, while large companies have the means to develop custom neural network architectures alongside proprietary hardware platforms, such development precludes those lacking similar resources from building efficient and effective neural networks for embedded systems. Small, low-power embedded systems are widely available in the hobbyist space, yet a clear process is needed for developing a neural network that accounts for the limitations of these resource-constrained systems; a wide variety of neural network architectures exists, but often little thought is given to deploying these architectures on edge devices. </p>
<p><br></p>
<p>This thesis first presents an overview of audio processing techniques, artificial neural network fundamentals, and machine learning tools. A summary of a set of specific neural network architectures is also discussed. Finally, the process of implementing and modifying these existing neural network architectures and training specific models in Python using TensorFlow is demonstrated. The trained models are also subjected to post-training quantization to evaluate the effect on model performance. The models are evaluated using metrics relevant to deployment on resource-constrained systems, such as memory consumption, latency, and model size, in addition to the standard comparisons of accuracy and parameter count. After evaluating the models and architectures, the process of deploying one of the trained and quantized models is explored on an Arduino Nano 33 BLE using TensorFlow Lite for Microcontrollers and on a Digilent Nexys 4 FPGA board using CFU Playground.</p>
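The post-training quantization step evaluated above can be sketched without the TensorFlow Lite toolchain. The sketch below is a simplified stand-in, not the converter's actual implementation: it applies per-tensor affine int8 quantization, the same basic scheme TFLite uses, storing int8 values plus a float scale and an integer zero point, and shows the 4x storage reduction and the bounded rounding error that motivate the technique on memory-limited devices:

```python
import numpy as np

rng = np.random.default_rng(2)

def quantize_int8(w):
    """Affine post-training quantization of a float32 tensor to int8:
    map [min, max] onto [-128, 127] with a single scale and integer
    zero point (a simplified per-tensor version of the TFLite scheme)."""
    lo, hi = float(w.min()), float(w.max())
    scale = (hi - lo) / 255.0
    zero_point = round(-lo / scale) - 128
    q = np.clip(np.round(w / scale) + zero_point, -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    # reconstruct approximate float weights for accuracy evaluation
    return (q.astype(np.float32) - zero_point) * scale

w = rng.normal(scale=0.1, size=(64, 64)).astype(np.float32)
q, scale, zp = quantize_int8(w)
w_hat = dequantize(q, scale, zp)

size_ratio = q.nbytes / w.nbytes          # int8 vs float32 storage
max_err = float(np.abs(w - w_hat).max())  # on the order of the scale
print(size_ratio, max_err)
```

This is why quantized models in the thesis are compared on memory consumption and model size as well as accuracy: the int8 copy is a quarter the size, at the cost of a small, scale-dependent rounding error in every weight.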