21 |
Compressing Deep Convolutional Neural Networks
Mancevo del Castillo Ayala, Diego. January 2017
Deep Convolutional Neural Networks and "deep learning" in general stand at the cutting edge of a range of applications, from image-based recognition and classification to natural language processing, speech and speaker recognition, and reinforcement learning. Very deep models, however, are often large, complex and computationally expensive to train and evaluate. Deep learning models are thus seldom deployed natively in environments where computational resources are scarce or expensive. To address this problem we turn our attention towards a range of techniques that we collectively refer to as "model compression", where a lighter student model is trained to approximate the output produced by the model we wish to compress. To this end, the output from the original model is used to craft the training labels of the smaller student model. This work contains some experiments on CIFAR-10 and demonstrates how to use the aforementioned techniques to compress a people counting model whose precision, recall and F1-score are improved by as much as 14% against our baseline.
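The idea of crafting the student's training labels from the original model's output can be illustrated with a small sketch. It assumes the labels are temperature-softened class probabilities and that the student is trained with a cross-entropy loss against them; the abstract does not state either choice, and the function names and the `temperature` parameter are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities; higher temperature flattens them."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_labels(teacher_logits, temperature=2.0):
    """Training labels for the student, derived from the teacher's output.

    Assumption: temperature softening is used; the abstract only says the
    original model's output is used to craft the labels.
    """
    return softmax(teacher_logits, temperature)


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between the teacher's soft labels and the student's output."""
    targets = soft_labels(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(t * math.log(p) for t, p in zip(targets, student_probs))
```

The loss is smallest when the student reproduces the teacher's distribution, which is the sense in which the student "approximates the output" of the larger model.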
|
22 |
Sequence-learning in a self-referential closed-loop behavioural system
Porr, Bernd. January 2003
This thesis focuses on the problem of "autonomous agents". It is assumed that such agents want to be in a desired state which can be assessed by the agent itself when it observes the consequences of its own actions. Therefore the feedback from the motor output via the environment to the sensor input is an essential component of such a system. As a consequence an agent is defined in this thesis as a self-referential system which operates within a closed sensor-motor-sensor feedback loop. The generic situation is that the agent is always prone to unpredictable disturbances which arrive from the outside, i.e. from its environment. These disturbances cause a deviation from the desired state (for example, the organism is attacked unexpectedly or the temperature in the environment changes). The simplest mechanism for managing such disturbances in an organism is to employ a reflex loop which essentially establishes reactive behaviour. Reflex loops are directly related to closed loop feedback controllers. Thus, they are robust and they do not need a built-in model of the control situation. However, reflexes have one main disadvantage, namely that they always occur 'too late'; i.e., only after a (for example, unpleasant) reflex eliciting sensor event has occurred. This defines an objective problem for the organism. This thesis provides a solution to this problem which is called Isotropic Sequence Order (ISO-) learning. The problem is solved by correlating the primary reflex and a predictive sensor input: the result is that the system learns the temporal relation between the primary reflex and the earlier sensor input and creates a new predictive reflex. This (new) predictive reflex does not have the disadvantage of the primary reflex, namely of always being too late. As a consequence the agent is able to maintain its desired input-state all the time.
In terms of engineering this means that ISO learning solves the inverse controller problem for the reflex, which is mathematically proven in this thesis. Summarising, this means that the organism starts as a reactive system and learning turns the system into a pro-active system. It will be demonstrated by a real robot experiment that ISO learning can successfully learn to solve the classical obstacle avoidance task without external intervention (like rewards). In this experiment the robot has to correlate a reflex (retraction after collision) with signals of range finders (turn before the collision). After successful learning the robot generates a turning reaction before it bumps into an obstacle. Additionally it will be shown that the learning goal of 'reflex avoidance' can also, paradoxically, be used to solve an attraction task.
|
23 |
Representation learning with a temporally coherent mixed-representation
Parkinson, Jon. January 2017
Guiding a representation towards capturing temporally coherent aspects present in video improves object identity encoding. Existing models apply temporal coherence uniformly over all features based on the assumption that optimal encoding of object identity only requires temporally stable components. We test the validity of this assumption by exploring the effects of applying a mixture of temporally coherent invariant features, alongside variable features, in a single 'mixed' representation. Applying temporal coherence to different proportions of the available features, we evaluate a range of models on a supervised object classification task. This series of experiments was tested on three video datasets, each with a different complexity of object shape and motion. We also investigated whether a mixed-representation improves the capture of information components associated with object position, alongside object identity, in a single representation. Tests were initially applied using a single layer autoencoder as a test bed, followed by subsequent tests investigating whether similar behaviour occurred in the more abstract features learned by a deep network. A representation applying temporal coherence in some fashion produced the best results in all tests, on both single layered and deep networks. The majority of tests favoured a mixed representation, especially in cases where the quantity of labelled data available to the supervised task was plentiful. This work is the first time a mixed-representation has been investigated, and demonstrates its use as a method for representation learning.
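Applying temporal coherence to only a proportion of the features can be pictured with the following sketch. It assumes a squared-difference "slowness" penalty between consecutive frames and that the coherent features are simply the first `k` entries of each feature vector; both are illustrative choices, not details taken from the thesis, and the function name is hypothetical.

```python
def temporal_coherence_penalty(features, coherent_fraction):
    """Mean squared change between consecutive frames over the coherent features.

    features: per-frame feature vectors (lists of floats) in temporal order.
    coherent_fraction: share of features (0.0 to 1.0) constrained to vary slowly.
    Assumption: the first k features are the constrained ones.
    """
    if len(features) < 2:
        return 0.0
    n_features = len(features[0])
    k = int(round(coherent_fraction * n_features))
    if k == 0:
        return 0.0
    total = 0.0
    for prev, curr in zip(features, features[1:]):
        total += sum((c - p) ** 2 for p, c in zip(prev[:k], curr[:k]))
    return total / ((len(features) - 1) * k)
```

With `coherent_fraction=1.0` this corresponds to the uniform case the abstract attributes to existing models; intermediate values give a mixed representation in which the remaining features stay free to vary.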
|
24 |
RELIABILITY AND RISK ASSESSMENT OF NETWORKED URBAN INFRASTRUCTURE SYSTEMS UNDER NATURAL HAZARDS
Rokneddin, Keivan. 16 September 2013
Modern societies increasingly depend on the reliable functioning of urban infrastructure systems in the aftermath of natural disasters such as hurricane and earthquake events. Apart from a sizable capital for maintenance and expansion, the reliable performance of infrastructure systems under extreme hazards also requires strategic planning and effective resource assignment. Hence, efficient system reliability and risk assessment methods are needed to provide insights to system stakeholders to understand infrastructure performance under different hazard scenarios and accordingly make informed decisions in response to them. Moreover, efficient assignment of limited financial and human resources for maintenance and retrofit actions requires new methods to identify critical system components under extreme events.
Infrastructure systems such as highway bridge networks are spatially distributed systems with many linked components. Therefore, network models describing them as mathematical graphs with nodes and links naturally apply to the study of their performance. Owing to the complex topology of these networks, general system reliability methods are ineffective for evaluating the reliability of large infrastructure systems. This research develops computationally efficient methods such as a modified Markov Chain Monte Carlo simulation algorithm for network reliability, and proposes a network reliability framework (BRAN: Bridge Reliability Assessment in Networks) that is applicable to large and complex highway bridge systems. Since the responses of system components to hazard scenario events are often correlated, the BRAN framework enables accounting for correlated component failure probabilities stemming from different correlation sources. Failure correlations from non-hazard sources are particularly emphasized, as they potentially have a significant impact on network reliability estimates, and yet they have often been ignored or only partially considered in the literature of infrastructure system reliability.
The developed network reliability framework is also used for probabilistic risk assessment, where network reliability is assigned as the network performance metric. Risk analysis studies may require a prohibitively large number of simulations for large and complex infrastructure systems, as they involve evaluating the network reliability for multiple hazard scenarios. This thesis addresses this challenge by developing network surrogate models using statistical learning tools such as random forests. The surrogate models can replace network reliability simulations in a risk analysis framework, and significantly reduce computation times. Therefore, the proposed approach provides an alternative to the established methods to enhance the computational efficiency of risk assessments, by developing a surrogate model of the complex system at hand rather than reducing the number of analyzed hazard scenarios by either hazard consistent scenario generation or importance sampling. Nevertheless, the application of surrogate models can be combined with scenario reduction methods to improve the analysis efficiency even further.
To address the problem of prioritizing system components for maintenance and retrofit actions, two advanced metrics are developed in this research to rank the criticality of system components. Both developed metrics combine system component fragilities with the topological characteristics of the network, and provide rankings which are either conditioned on specific hazard scenarios or probabilistic, based on the preference of infrastructure system stakeholders. Nevertheless, they both offer enhanced efficiency and practical applicability compared to the existing methods.
The developed frameworks for network reliability evaluation, risk assessment, and component prioritization are intended to address important gaps in the state-of-the-art management and planning for infrastructure systems under natural hazards. Their application can enhance public safety by informing the decision making process for expansion, maintenance, and retrofit actions for infrastructure systems.
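To make the notion of network reliability concrete, the sketch below estimates the probability that two nodes of a graph stay connected when links fail at random. It uses plain Monte Carlo sampling with independent link failures, which is a deliberate simplification: the thesis describes a modified Markov Chain Monte Carlo algorithm and correlated failures, neither of which is reproduced here. All function and parameter names are hypothetical.

```python
import random
from collections import deque


def is_connected(n_nodes, edges, source, target):
    """Breadth-first search: can target be reached from source over edges?"""
    adjacency = {node: [] for node in range(n_nodes)}
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return False


def estimate_reliability(n_nodes, links, source, target, n_samples=10000, seed=0):
    """Fraction of sampled damage states in which source and target stay connected.

    links: (u, v, p_fail) triples; failures are assumed independent here.
    """
    rng = random.Random(seed)
    successes = 0
    for _ in range(n_samples):
        surviving = [(u, v) for u, v, p_fail in links if rng.random() >= p_fail]
        if is_connected(n_nodes, surviving, source, target):
            successes += 1
    return successes / n_samples
```

For two parallel links that each fail with probability 0.5, the exact reliability is 1 - 0.5 × 0.5 = 0.75, and the estimate should approach that value as `n_samples` grows. The cost of repeating such simulations for many hazard scenarios is what motivates the surrogate models mentioned in the abstract.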
|
25 |
Polytopes Arising from Binary Multi-way Contingency Tables and Characteristic Imsets for Bayesian Networks
Xi, Jing. 01 January 2013
The main theme of this dissertation is the study of polytopes arising from binary multi-way contingency tables and characteristic imsets for Bayesian networks.
Firstly, we study three-way tables whose entries are independent Bernoulli random variables with canonical parameters under no three-way interaction generalized linear models. Here, we use the sequential importance sampling (SIS) method with the conditional Poisson (CP) distribution to sample binary three-way tables with the sufficient statistics, i.e., all two-way marginal sums, fixed. Compared with the Monte Carlo Markov Chain (MCMC) approach with a Markov basis (MB), the SIS procedure has the advantage that it does not require expensive or prohibitive pre-computations. Note that this problem can also be considered as estimating the number of lattice points inside the polytope defined by the zero-one and two-way marginal constraints. The theorems in Chapter 2 give the parameters for the CP distribution on each column when it is sampled. In this chapter, we also present the algorithms, the simulation results, and the results for Samson’s monks data.
Bayesian networks, a part of the family of probabilistic graphical models, are widely applied in many areas, and much work has been done on model selection for Bayesian networks. The second part of this dissertation investigates the problem of finding the optimal graph by using characteristic imsets, where characteristic imsets are defined as 0-1 vector representations of Bayesian networks which are unique up to Markov equivalence. Characteristic imset polytopes are defined as the convex hull of all characteristic imsets we consider. It was proven that the problem of finding the optimal Bayesian network for a specific dataset can be converted to a linear programming problem over the characteristic imset polytope [51]. In Chapter 3, we first consider characteristic imset polytopes for all diagnosis models and show that these polytopes are direct products of simplices. Then we give the combinatorial description of all edges and all facets of these polytopes. At the end of this chapter, we generalize these results to the characteristic imset polytopes for all Bayesian networks with a fixed underlying ordering of nodes.
Chapter 4 includes discussion and future work on these two topics.
|
26 |
Development of a real-time learning scheduler using adaptive critics concepts
Sahinoglu, Mehmet Murat. January 1993
Thesis (M.S.)--Ohio University, November, 1993. / Title from PDF t.p.
|
27 |
Discovery of low-dimensional structure in high-dimensional inference problems
Aksoylar, Cem. 10 March 2017
Many learning and inference problems involve high-dimensional data such as images, video or genomic data, which cannot be processed efficiently using conventional methods due to their dimensionality. However, high-dimensional data often exhibit an inherent low-dimensional structure; for instance, they can often be represented sparsely in some basis or domain. The discovery of an underlying low-dimensional structure is important for developing more robust and efficient analysis and processing algorithms.
The first part of the dissertation investigates the statistical complexity of sparse recovery problems, including sparse linear and nonlinear regression models, feature selection and graph estimation. We present a framework that unifies sparse recovery problems and construct an analogy to channel coding in classical information theory. We perform an information-theoretic analysis to derive bounds on the number of samples required to reliably recover sparsity patterns independent of any specific recovery algorithm. In particular, we show that sample complexity can be tightly characterized using a mutual information formula similar to channel coding results. Next, we derive major extensions to this framework, including dependent input variables and a lower bound for sequential adaptive recovery schemes, which helps determine whether adaptivity provides performance gains. We compute statistical complexity bounds for various sparse recovery problems, showing our analysis improves upon the existing bounds and leads to intuitive results for new applications.
In the second part, we investigate methods for improving the computational complexity of subgraph detection in graph-structured data, where we aim to discover anomalous patterns present in a connected subgraph of a given graph. This problem arises in many applications such as detection of network intrusions, community detection, detection of anomalous events in surveillance videos or disease outbreaks. Since optimization over connected subgraphs is a combinatorial and computationally difficult problem, we propose a convex relaxation that offers a principled approach to incorporating connectivity and conductance constraints on candidate subgraphs. We develop a novel nearly-linear time algorithm to solve the relaxed problem, establish convergence and consistency guarantees and demonstrate its feasibility and performance with experiments on real networks.
|
28 |
Learning in adaptive networks: analytical and computational approaches
Yang, Guoli. January 2016
The dynamics on networks and the dynamics of networks are usually entangled with each other in many highly connected systems, where the former means the evolution of state and the latter means the adaptation of structure. In this thesis, we study the coupled dynamics through analytical and computational approaches, where the adaptive networks are driven by learning of various complexities. Firstly, we investigate information diffusion on networks through an adaptive voter model, where two opinions are competing for dominance. Two types of dynamics facilitate the agreement between neighbours: one is pairwise imitation and the other is link rewiring. As the rewiring strength increases, the network of voters will transform from consensus to fragmentation. By exploring various strategies for structure adaptation and state evolution, our results suggest that network configuration is highly influenced by range-based rewiring and biased imitation. In particular, some approximation techniques are proposed to capture the dynamics analytically through moment-closure differential equations. Secondly, we study an evolutionary model under the framework of natural selection. In a structured community made up of cooperators and cheaters (or defectors), a new-born player will adopt a strategy and reorganise its neighbourhood based on social inheritance. Starting from a cooperative population, an invading cheater may spread in the population, occasionally leading to the collapse of cooperation. Such a collapse unfolds rapidly with the change of external conditions, bearing the traits of a critical transition. In order to detect the risk of invasions, some indicators based on population composition and network structure are proposed to signal the fragility of communities. Through the analyses of consistency and accuracy, our results suggest possible avenues for detecting the loss of cooperation in evolving networks.
Lastly, we incorporate distributed learning into adaptive agent coordination, which emerges as a consequence of rational individual behaviours. A generic framework of work-learn-adapt (WLA) is proposed to foster the success of agent organisation. To gain higher organisation performance, the division of labour is achieved by a series of events of state evolution and structure adaptation. Importantly, agents are able to adjust their states and structures through quantitative information obtained from distributed learning. The adaptive networks driven by explicit learning pave the way for a better understanding of intelligent organisations in the real world.
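The two competing mechanisms of the adaptive voter model in the first part of this abstract, pairwise imitation and link rewiring, can be sketched as a single update step. This follows a common textbook formulation (rewire to a random like-minded node with probability `rewire_prob`, otherwise copy the neighbour's opinion); the thesis's range-based rewiring and biased imitation variants are not reproduced, and all names are hypothetical.

```python
import random


def adaptive_voter_step(neighbours, opinion, rewire_prob, rng):
    """One update of a basic adaptive voter model (mutates both dictionaries).

    neighbours: dict mapping node -> set of adjacent nodes (kept symmetric).
    opinion: dict mapping node -> 0 or 1.
    """
    i = rng.choice(sorted(neighbours))
    discordant = sorted(j for j in neighbours[i] if opinion[j] != opinion[i])
    if not discordant:
        return  # node i already agrees with all of its neighbours
    j = rng.choice(discordant)
    if rng.random() < rewire_prob:
        # Structure adaptation: replace the link i-j by a link to a like-minded node.
        candidates = sorted(
            k for k in neighbours
            if k != i and opinion[k] == opinion[i] and k not in neighbours[i]
        )
        if candidates:
            k = rng.choice(candidates)
            neighbours[i].remove(j)
            neighbours[j].remove(i)
            neighbours[i].add(k)
            neighbours[k].add(i)
    else:
        # State evolution: node i imitates its neighbour j.
        opinion[i] = opinion[j]


def count_links(neighbours):
    """Number of undirected links in the network."""
    return sum(len(adjacent) for adjacent in neighbours.values()) // 2
```

Rewiring conserves the number of links while leaving opinions untouched, and imitation changes opinions while leaving the structure untouched, so `rewire_prob` plays the role of the rewiring strength that the abstract says drives the network from consensus towards fragmentation.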
|
29 |
Trénovatelná segmentace obrazu s použitím hlubokého učení / Trainable image segmentation using deep learning
Dolníček, Pavel. January 2017
This work focuses on the topic of machine learning, specifically implementation of a program for automated classification using deep learning. This work compares different trainable models of neural networks and describes practical solutions encountered during their implementation.
|
30 |
TOWARDS AN UNDERSTANDING OF RESIDUAL NETWORKS USING NEURAL TANGENT HIERARCHY
Yuqing Li (10223885). 06 May 2021
Deep learning has become an important toolkit for data science and artificial intelligence. In contrast to its practical success across a wide range of fields, theoretical understanding of the principles behind the success of deep learning has been an issue of controversy. Optimization, as an important component of theoretical machine learning, has attracted much attention. The optimization problems induced by deep learning are often non-convex and non-smooth, which makes it challenging to locate the global optima. However, in practice, global convergence of first-order methods like gradient descent can be guaranteed for deep neural networks. In particular, gradient descent yields zero training loss in polynomial time for deep neural networks despite the non-convex nature of the problem. Besides that, another mysterious phenomenon is the compelling performance of the Deep Residual Network (ResNet). Not only does training ResNet require weaker conditions, the employment of residual connections by ResNet even enables first-order methods to train neural networks with an order of magnitude more layers. Advantages arising from the usage of residual connections remain to be discovered.
In this thesis, we demystify these two phenomena accordingly. Firstly, we contribute to further understanding of gradient descent. The core of our analysis is the neural tangent hierarchy (NTH) that captures the gradient descent dynamics of deep neural networks. A recent work introduced the Neural Tangent Kernel (NTK) and proved that the limiting NTK describes the asymptotic behavior of neural networks trained by gradient descent in the infinite width limit. The NTH outperforms the NTK in two ways: (i) it can directly study the time variation of the NTK for neural networks; (ii) it improves the result to non-asymptotic settings. Moreover, by applying the NTH to ResNet with a smooth and Lipschitz activation function, we reduce the requirement on the layer width m with respect to the number of training samples n from quartic to cubic, obtaining a state-of-the-art result. Secondly, we extend our scope of analysis to structural properties of deep neural networks. By making fair and consistent comparisons between the fully-connected network and ResNet, we suggest strongly that the particular skip-connection architecture possessed by ResNet is the main reason for its triumph over the fully-connected network.
|