1 |
Towards Structured Prediction in Bioinformatics with Deep Learning
Li, Yu, 01 November 2020
Using machine learning, especially deep learning, to facilitate biological research
is a fascinating research direction. However, in addition to the standard classification
or regression problems, whose outputs are simple vectors or scalars, in bioinformatics,
we often need to predict more complex structured targets, such as 2D images
and 3D molecular structures. The above complex prediction tasks are referred to as
structured prediction. Structured prediction is more complicated than traditional classification but has much broader applications, especially in bioinformatics, considering that most bioinformatics problems have complex output objects.
Due to the properties of these structured prediction problems, such as problem-specific constraints and dependencies within the labeling space, straightforwardly applying existing deep learning models to them can lead to unsatisfactory results. In this dissertation, we argue that the following two ideas
can help resolve a wide range of structured prediction problems in bioinformatics.
Firstly, we can combine deep learning with classic algorithms, such as probabilistic graphical models, which model the problem structure explicitly. Secondly, we can design and train problem-specific deep learning architectures or methods by considering the structured labeling space and problem constraints, either explicitly or implicitly. We demonstrate our ideas with six projects from four bioinformatics subfields, including sequencing analysis, structure prediction, function annotation,
and network analysis. The structured outputs cover 1D electrical signals, 2D images, 3D structures, hierarchical labeling, and heterogeneous networks. With the help of
the above ideas, all of our methods can achieve state-of-the-art performance on the
corresponding problems.
The success of these projects motivates us to extend our work towards other challenging but important problems, such as health-care problems, which can directly benefit people's health and wellness. We thus conclude this thesis by discussing such future work and its potential challenges and opportunities.
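As a concrete illustration of the first idea (combining a deep network with a probabilistic graphical model that makes labeling-space dependencies explicit), consider a linear-chain model whose per-position label scores come from a network and whose MAP labeling is found by Viterbi decoding. This is a generic sketch, not one of the dissertation's models; the score matrices are placeholders.

```python
import numpy as np

def viterbi_decode(emissions, transitions):
    """MAP labeling of a linear-chain model: `emissions` (T x L) are
    per-position label scores (e.g., produced by a neural network), and
    `transitions` (L x L) encode dependencies between adjacent labels."""
    T, L = emissions.shape
    score = emissions[0].copy()            # best score ending in each label
    backptr = np.zeros((T, L), dtype=int)
    for t in range(1, T):
        # cand[i, j]: best path through label i at t-1 ending in j at t
        cand = score[:, None] + transitions + emissions[t][None, :]
        backptr[t] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    # follow back-pointers to recover the argmax labeling
    labels = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        labels.append(int(backptr[t][labels[-1]]))
    return labels[::-1]
```

Unlike independent per-position argmax, the decoder can overturn locally preferred labels when the transition scores penalize them.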
|
2 |
Effective and annotation efficient deep learning for image understanding
Gidaris, Spyridon, 11 December 2018
Recent developments in deep learning have achieved impressive results on image understanding tasks.
However, designing deep learning architectures that will effectively solve the image understanding tasks of interest is far from trivial. Even more, the success of deep learning approaches heavily relies on the availability of large amounts of manually labeled (by humans) data. In this context, the objective of this dissertation is to explore deep learning based approaches for core image understanding tasks that would increase the effectiveness with which they are performed as well as make their learning process more annotation efficient, i.e., less dependent on the availability of large amounts of manually labeled training data. We first focus on improving the state of the art in object detection. More specifically, we attempt to boost the ability of object detection systems to recognize (even difficult) object instances by proposing a multi-region and semantic segmentation-aware ConvNet-based representation that is able to capture a diverse set of discriminative appearance factors. Also, we aim to improve the localization accuracy of object detection systems by proposing iterative detection schemes and a novel localization model for estimating the bounding box of the objects. We demonstrate that the proposed technical novelties lead to significant improvements in object detection performance on the PASCAL and MS COCO benchmarks. Regarding the pixel-wise image labeling problem, we explore a family of deep neural network architectures that perform structured prediction by learning to (iteratively) improve some initial estimates of the output labels. The goal is to identify the optimal architecture for implementing such deep structured prediction models. In this context, we propose to decompose the label improvement task into three steps: 1) detecting the initial label estimates that are incorrect, 2) replacing the incorrect labels with new ones, and finally 3) refining the renewed labels by predicting residual corrections w.r.t. them.
We evaluate the explored architectures on the disparity estimation task and demonstrate that the proposed architecture achieves state-of-the-art results on the KITTI 2015 benchmark. In order to accomplish our goal of annotation efficient learning, we propose a self-supervised learning approach that learns ConvNet-based image representations by training the ConvNet to recognize the 2d rotation that is applied to the image that it gets as input. We empirically demonstrate that this apparently simple task actually provides a very powerful supervisory signal for semantic feature learning. Specifically, the image features learned from this task exhibit very good results when transferred to the visual tasks of object detection and semantic segmentation, surpassing prior unsupervised learning approaches and thus narrowing the gap with the supervised case. Finally, also in the direction of annotation efficient learning, we propose a novel few-shot object recognition system that, after training, is capable of dynamically learning novel categories from only a few data (e.g., only one or five training examples) while not forgetting the categories on which it was trained. In order to implement the proposed recognition system, we introduce two technical novelties: an attention-based few-shot classification weight generator, and a ConvNet-based recognition model whose classifier is implemented as a cosine similarity function between feature representations and classification vectors. We demonstrate that the proposed approach achieves state-of-the-art results on relevant few-shot benchmarks.
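Both annotation-efficiency ideas above are simple to state in code. The sketch below is purely illustrative (not the thesis implementation): the rotation pretext task derives free labels from the data itself, and the few-shot classifier head scores classes by scaled cosine similarity; the `scale` value is an assumption.

```python
import numpy as np

def rotation_task(image):
    """Self-supervised pretext task: the four 2d rotations of an image
    become inputs, and the rotation index k (k * 90 degrees) becomes a
    free classification target -- no human annotation required."""
    return [np.rot90(image, k) for k in range(4)], [0, 1, 2, 3]

def cosine_scores(features, class_vectors, scale=10.0):
    """Few-shot classifier head: scaled cosine similarity between
    L2-normalized feature representations and classification vectors."""
    f = features / np.linalg.norm(features, axis=-1, keepdims=True)
    w = class_vectors / np.linalg.norm(class_vectors, axis=-1, keepdims=True)
    return scale * f @ w.T
```

Because both features and class vectors are normalized, a novel class can be added at test time simply by appending its (generated) weight vector.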
|
3 |
Stochastic functional descent for learning Support Vector Machines
He, Kun, 22 January 2016
We present a novel method for learning Support Vector Machines (SVMs) in the online setting. Our method is generally applicable in that it handles the online learning of the binary, multiclass, and structural SVMs in a unified view.
The SVM learning problem consists of optimizing a convex objective function that is composed of two parts: the hinge loss and quadratic regularization. To date, the predominant family of approaches for online SVM learning has been gradient-based methods, such as Stochastic Gradient Descent (SGD). Unfortunately, we note that there are two drawbacks in such approaches: first, gradient-based methods are based on a local linear approximation to the function being optimized, but since the hinge loss is piecewise-linear and nonsmooth, this approximation can be ill-behaved. Second, existing online SVM learning approaches share the same problem formulation with batch SVM learning methods, and they all need to tune a fixed global regularization parameter by cross validation. On the one hand, global regularization is ineffective in handling local irregularities encountered in the online setting; on the other hand, even though the learning problem for a particular global regularization parameter value may be efficiently solved, repeatedly solving for a wide range of values can be costly.
We intend to tackle these two problems with our approach. To address the first problem, we propose to perform implicit online update steps to optimize the hinge loss, as opposed to explicit (or gradient-based) updates that utilize subgradients to perform local linearization. Regarding the second problem, we propose to enforce local regularization that is applied to individual classifier update steps, rather than having a fixed global regularization term.
Our theoretical analysis suggests that our classifier update steps progressively optimize the structured hinge loss, with the rate controlled by a sequence of regularization parameters; setting these parameters is analogous to setting the stepsizes in gradient-based methods. In addition, we give sufficient conditions for the algorithm's convergence. Experimentally, our online algorithm matches the classification performance of other state-of-the-art online SVM learning methods, as well as batch learning methods, after only one or two passes over the training data. More importantly, our algorithm attains these results without cross validation, while all other methods must perform time-consuming cross validation to determine the optimal choice of the global regularization parameter.
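For concreteness, the explicit (gradient-based) baseline that this work contrasts itself with can be sketched as a single subgradient step on the regularized hinge loss for a binary SVM; this illustrates the local linearization the thesis argues is ill-behaved at the hinge's kink, not the proposed implicit method. The step size and regularization values are arbitrary.

```python
import numpy as np

def sgd_hinge_step(w, x, y, lr=0.1, lam=0.01):
    """One explicit (subgradient) update on a single example (x, y),
    with y in {-1, +1}: a local linearization of the non-smooth hinge
    loss plus the gradient of the L2 regularizer."""
    margin = y * np.dot(w, x)
    grad = lam * w                      # gradient of the L2 regularizer
    if margin < 1:                      # hinge active: subgradient is -y*x
        grad = grad - y * x
    return w - lr * grad
```

Note the fixed global `lam`; the thesis instead applies local regularization per update step.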
|
4 |
USING MODULAR ARCHITECTURES TO PREDICT CHANGE OF BELIEFS IN ONLINE DEBATES
Aldo Fabrizio Porco (7460849), 17 October 2019
<div>
<div>
<div>
<p>Researchers studying persuasion have mostly focused on modeling arguments to understand how people's beliefs can change. However, in order to convince an audience, speakers usually adapt their speech. This can often be seen in political campaigns, where ideas are phrased (framed) in different ways according to the geographical region the candidate is in. This practice suggests that, in order to change people's beliefs, it is important to take into account their previous perspectives and topics of interest.
</p><p><br></p>
<p>In this work we propose ChangeMyStance, a novel task to predict whether a user will change their mind after being exposed to opposing views on a particular subject. This setting takes into account users' beliefs before a debate, thus modeling their preconceived notions about the topic. Moreover, we explore a new approach to the problem, in which the task is decomposed into "simpler" sub-problems. Breaking the main objective into several tasks allows us to build expert modules that, combined, produce better results. This strategy significantly outperforms a BERT end-to-end model over the same inputs.
</p>
</div>
</div>
</div>
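The decomposition strategy above can be sketched at a very high level: each expert module solves a simpler sub-problem, and a combiner maps the joint module outputs to the final change-of-stance prediction. The module set and combination rule below are illustrative assumptions, not the thesis's actual architecture.

```python
def modular_predict(debate, modules, combiner):
    """Hypothetical sketch of the decomposition idea: run each expert
    module on the debate to get a score, then let a small combiner
    (e.g., an average or a learned classifier) produce the prediction."""
    scores = [m(debate) for m in modules]
    return combiner(scores)
```

For instance, one module might score topic overlap with the user's stated interests and another the strength of the user's prior stance, with their mean as a trivial combiner.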
|
5 |
Improving the accuracy and scalability of discriminative learning methods for Markov logic networks
Huynh, Tuyen Ngoc, 01 June 2011
Many real-world problems involve data that have both complex structure and uncertainty. Statistical relational learning (SRL) is an emerging area of research that addresses the problem of learning from such noisy structured/relational data. Markov logic networks (MLNs), sets of weighted first-order logic formulae, are a simple but powerful SRL formalism that generalizes both first-order logic and Markov networks. MLNs have been successfully applied to a variety of real-world problems, ranging from extracting knowledge from text to visual event recognition. Most existing learning algorithms for MLNs are in the generative setting: they try to learn a model that is equally capable of predicting the values of all variables given an arbitrary set of evidence, and they do not scale to problems with thousands of examples. However, many real-world problems in structured/relational data are discriminative: the variables are divided into two disjoint sets, input and output, and the goal is to correctly predict the values of the output variables given evidence data about the input variables. In addition, these problems usually involve data with thousands of examples. Thus, it is important to develop new discriminative learning methods for MLNs that are more accurate and more scalable, which are the topics addressed in this thesis. First, we present a new method that discriminatively learns both the structure and parameters for a special class of MLNs where all the clauses are non-recursive. Non-recursive clauses arise in many learning problems in Inductive Logic Programming. To further improve predictive accuracy, we propose a max-margin approach to learning weights for MLNs. Then, to address the issue of scalability, we present CDA, an online max-margin weight learning algorithm for MLNs. After that, we present OSL, the first algorithm that performs both online structure learning and parameter learning.
Finally, we address an issue arising in applying MLNs to many real-world problems: learning in the presence of many hard constraints. Including hard constraints during training greatly increases the computational complexity of the learning problem. Thus, we propose a simple heuristic for selecting which hard constraints to include during training. Experimental results on several real-world problems show that the proposed methods are more accurate, more scalable (can handle problems with thousands of examples), or both more accurate and more scalable than existing learning methods for MLNs.
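As background for the MLN formalism above: an MLN assigns each possible world an unnormalized score equal to the sum, over weighted formulae, of the weight times the number of satisfied groundings, and a world's probability is proportional to exp of that score. A toy sketch, with groundings encoded as predicates over a world dictionary (an illustrative encoding, not the thesis's software):

```python
def mln_score(world, weighted_formulas):
    """Unnormalized log-probability of a world under an MLN:
    sum over formulae of weight * (number of satisfied groundings).
    Each grounding is a boolean function of the world."""
    return sum(w * sum(1 for g in groundings if g(world))
               for w, groundings in weighted_formulas)
```

For example, with the single ground rule Smokes(a) => Cancer(a) at weight 1.5, a world satisfying the rule scores 1.5, while one violating it scores 0, making the former exp(1.5) times more probable.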
|
6 |
Sublinear-Time Learning and Inference for High-Dimensional Models
Yan, Enxu, 01 May 2018
Across domains, the scale of data and complexity of models have both been increasing greatly in recent years. For many models of interest, tractable learning and inference without access to expensive computational resources have become challenging. In this thesis, we approach efficient learning and inference by leveraging sparse structures inherent in the learning objective, which allows us to develop algorithms sublinear in the size of parameters without compromising the accuracy of models. In particular, we address the following three questions for each problem of interest: (a) how to formulate model estimation as an optimization problem with tractable sparse structure, (b) how to efficiently, i.e. in sublinear time, search, maintain, and utilize the sparse structures during training and inference, and (c) how to guarantee fast convergence of our optimization algorithm despite its greedy nature. By answering these questions, we develop state-of-the-art algorithms in varied domains. Specifically, in the extreme classification domain, we utilize primal and dual sparse structures to develop greedy algorithms with complexity sublinear in the number of classes, which obtain state-of-the-art accuracies on several benchmark data sets with one to two orders of magnitude speedup over existing algorithms. We also apply the primal-dual-sparse theory to develop a state-of-the-art trimming algorithm for deep neural networks, which sparsifies the neuron connections of a DNN under a task-dependent theoretical guarantee, yielding models with smaller storage cost and faster inference speed. When it comes to structured prediction problems (i.e. graphical models) with inter-dependent outputs, we propose decomposition methods that exploit sparse messages to decompose a structured learning problem over large output domains into factorwise learning modules amenable to sublinear-time optimization methods, leading to practically much faster alternatives to existing learning algorithms. The decomposition technique is especially effective when combined with search data structures, such as those for Maximum Inner-Product Search (MIPS), to jointly improve learning efficiency. Last but not least, we design novel convex estimators for a latent-variable model by reparameterizing it as a solution of sparse support in an exponentially high-dimensional space and approximating it with a greedy algorithm, which yields the first polynomial-time approximation method for Latent-Feature Models and Generalized Mixed Regression without restrictive data assumptions.
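The greedy, sparsity-exploiting flavor of these algorithms can be illustrated by a coordinate-descent-style step that touches only one parameter, keeping the iterate sparse. In the thesis the maximizing coordinate is found in sublinear time (e.g., via MIPS-style search structures); this sketch uses a plain argmax for clarity.

```python
import numpy as np

def greedy_coordinate_step(w, grad, lr=0.5):
    """Greedy sketch: update only the coordinate with the largest
    gradient magnitude. The iterate is a sparse dict (index -> value),
    so untouched coordinates cost no storage."""
    j = int(np.argmax(np.abs(grad)))
    w = dict(w)
    w[j] = w.get(j, 0.0) - lr * grad[j]
    return w
```

Repeating such steps yields iterates whose support grows by at most one coordinate per step, which is what makes sublinear-time maintenance possible.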
|
7 |
Learning Structured and Deep Representations for Traffic Scene Understanding
Yu, Zhiding, 01 December 2017
Recent advances in representation learning have led to an increasing variety of vision-based approaches in traffic scene understanding. These include general vision problems such as object detection, depth estimation, edge/boundary/contour detection, semantic segmentation and scene classification, as well as application-driven problems such as pedestrian detection, vehicle detection, lane marker detection and road segmentation. In this thesis, we approach some of these problems by exploring structured and invariant representations from the visual input. Our research is mainly motivated by two facts: 1. Traffic scenes often contain highly structured layouts, so exploring structured priors is expected to help considerably in improving scene understanding performance. 2. A major challenge of traffic scene understanding lies in the diverse and changing nature of the contents; it is therefore important to find robust visual representations that are invariant against such variability. We start from highway scenarios, where we are interested in detecting the hard road borders and estimating the drivable space before such physical boundaries. To this end, we treat the task as a joint detection and tracking problem and formulate it with structured Hough voting (SHV): a conditional random field model that explores both intra-frame geometric and inter-frame temporal information to generate more accurate and stable predictions. Turning from highway scenes to urban scenes, we consider dense prediction problems such as category-aware semantic edge detection and semantic segmentation. Category-aware semantic edge detection is challenging, as the model is required to jointly localize object contours and classify each edge pixel into one or multiple predefined classes. We propose CASENet, a multilabel deep network with state-of-the-art edge detection performance.
To address the label misalignment problem in edge learning, we also propose SEAL, a framework for simultaneous edge alignment and learning. Failure across different domains has been a common bottleneck of semantic segmentation methods. In this thesis, we address the problem of adapting a segmentation model trained on a source domain to a different target domain without knowing the target domain labels, and propose a class-balanced self-training approach for such unsupervised domain adaptation. We adopt the "synthetic-to-real" setting, where a model is pre-trained on GTA-5 and adapted to real-world datasets such as Cityscapes and Nexar, as well as the "cross-city" setting, where a model is pre-trained on Cityscapes and adapted to unseen data from Rio, Tokyo, Rome and Taipei. Experiments show the superior performance of our method compared to state-of-the-art methods, such as adversarial-training-based domain adaptation.
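The class-balanced self-training idea can be sketched as follows: rather than one global confidence threshold (under which easy, frequent classes dominate the pseudo-labels), keep a fixed fraction of the most confident target predictions per class. This is a simplified sketch under assumed details (the fraction and the per-sample granularity), not the thesis's exact procedure.

```python
import numpy as np

def class_balanced_pseudo_labels(probs, frac=0.5):
    """Select pseudo-labels per class: for each predicted class, keep
    the top `frac` most confident samples; everything else stays
    unlabeled (-1). `probs` is (n_samples, n_classes)."""
    preds = probs.argmax(axis=1)
    conf = probs.max(axis=1)
    labels = np.full(len(probs), -1, dtype=int)
    for c in np.unique(preds):
        idx = np.where(preds == c)[0]
        k = max(1, int(frac * len(idx)))
        keep = idx[np.argsort(conf[idx])[::-1][:k]]  # most confident first
        labels[keep] = c
    return labels
```

Because selection is per class, a rare class with only moderately confident predictions still contributes pseudo-labels in the next self-training round.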
|
8 |
Efficient Algorithms for Learning Combinatorial Structures from Limited Data
Asish Ghoshal (5929691), 15 May 2019
<div>Recovering combinatorial structures from noisy observations is a recurrent problem in many application domains, including, but not limited to, natural language processing, computer vision, genetics, health care, and automation. For instance, dependency parsing in natural language processing entails recovering parse trees from sentences, which are inherently ambiguous. From a computational standpoint, such problems are typically intractable and call for designing efficient approximation or randomized algorithms with provable guarantees. From a statistical standpoint, algorithms that recover the desired structure using an optimal number of samples are of paramount importance.</div><div><br></div><div>We tackle several such problems in this thesis and obtain computationally and statistically efficient procedures. We demonstrate the optimality of our methods by proving fundamental lower bounds on the number of samples needed by any method for recovering the desired structures. Specifically, the thesis makes the following contributions:</div><div><br></div><div>(i) We develop polynomial-time algorithms for learning linear structural equation models, a widely used class of models for performing causal inference, that recover the correct directed acyclic graph structure under identifiability conditions weaker than existing ones. We also show that the sample complexity of our method is information-theoretically optimal.</div><div><br></div><div>(ii) We develop polynomial-time algorithms for learning the underlying graphical game from observations of the behavior of self-interested agents. The key combinatorial problem here is to recover the Nash equilibria set of the true game from behavioral data. 
We obtain fundamental lower bounds on the number of samples required for learning games and show that our method is statistically optimal.</div><div><br></div><div>(iii) Lastly, departing from the generative model framework, we consider the problem of structured prediction, where the goal is to learn predictors from data that predict complex structured objects directly from a given input. We develop efficient learning algorithms that learn structured predictors by approximating the partition function, and obtain generalization guarantees for our method. We demonstrate that randomization can improve not only efficiency but also generalization to unseen data.</div><div><br></div>
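Contribution (iii) hinges on approximating the partition function Z over an exponentially large output space. One generic randomized scheme (purely illustrative; the thesis's approximation differs) estimates Z by uniformly sampling candidate structures:

```python
import math, random

def sampled_log_partition(score, outputs, n_samples, seed=0):
    """Approximate log Z = log sum_y exp(score(y)) by uniform sampling:
    (|Y| / n) * sum over sampled y of exp(score(y)) is an unbiased
    estimate of Z. `outputs` stands in for an enumerable output space;
    in practice one samples structures without materializing the space."""
    rng = random.Random(seed)
    sample = [outputs[rng.randrange(len(outputs))] for _ in range(n_samples)]
    z_hat = len(outputs) / n_samples * sum(math.exp(score(y)) for y in sample)
    return math.log(z_hat)
```

With a uniform score the estimate is exact for any sample size, which makes the bookkeeping easy to check; for peaked score functions the variance grows and smarter proposals are needed.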
|
9 |
Asynchronous optimization for machine learning
Leblond, Rémi, 15 November 2018
The impressive breakthroughs of the last two decades in the field of machine learning can be in large part attributed to the explosion of computing power and available data. These two limiting factors have been replaced by a new bottleneck: algorithms. The focus of this thesis is thus on introducing novel methods that can take advantage of high data quantity and computing power. We present two independent contributions. First, we develop and analyze novel fast optimization algorithms which take advantage of advances in parallel computing architecture and can handle vast amounts of data. We introduce a new framework of analysis for asynchronous parallel incremental algorithms, which enables correct and simple proofs. We then demonstrate its usefulness by performing the convergence analysis for several methods, including two novel algorithms. Asaga is a sparse asynchronous parallel variant of the variance-reduced algorithm Saga which enjoys fast linear convergence rates on smooth and strongly convex objectives. We prove that it can be linearly faster than its sequential counterpart, even without sparsity assumptions. ProxAsaga is an extension of Asaga to the more general setting where the regularizer can be non-smooth. We prove that it can also achieve a linear speedup. We provide extensive experiments comparing our new algorithms to the current state of the art. Second, we introduce new methods for complex structured prediction tasks. We focus on recurrent neural networks (RNNs), whose traditional training algorithm, based on maximum likelihood estimation (MLE), suffers from several issues. The associated surrogate training loss notably ignores the information contained in structured losses and introduces discrepancies between train and test times that may hurt performance. To alleviate these problems, we propose SeaRNN, a novel training algorithm for RNNs inspired by the "learning to search" approach to structured prediction. SeaRNN leverages test-alike search space exploration to introduce global-local losses that are closer to the test error than the MLE objective. We demonstrate improved performance over MLE on three challenging tasks, and provide several subsampling strategies to enable SeaRNN to scale to large-scale tasks, such as machine translation. Finally, after contrasting the behavior of SeaRNN models to MLE models, we conduct an in-depth comparison of our new approach to the related work.
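The Saga update that Asaga parallelizes can be sketched sequentially: the variance-reduced direction combines the fresh gradient of one example with its stored stale gradient and the running average of the gradient table. This sketch omits the asynchronous and sparsity machinery (and any step-size tuning) that the thesis actually analyzes.

```python
import numpy as np

def saga_step(w, grads_table, i, grad_i_new, lr=0.1):
    """One sequential SAGA step. `grads_table` (n x d) stores the last
    seen gradient of each example; the update direction is the fresh
    gradient of example i, corrected by its stale entry and the table
    average, which removes variance without a full gradient pass."""
    avg = grads_table.mean(axis=0)
    direction = grad_i_new - grads_table[i] + avg
    w = w - lr * direction
    grads_table = grads_table.copy()
    grads_table[i] = grad_i_new       # refresh the gradient memory
    return w, grads_table
```

In Asaga, many workers run this step concurrently without locks; the thesis's analysis framework bounds the effect of the resulting stale reads.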
|
10 |
Efficient Deep Structured Prediction for Dense Labeling Tasks in Computer Vision
Chandra, Siddhartha, 11 May 2018
In this thesis we propose a structured prediction technique that combines the virtues of Gaussian Conditional Random Fields (G-CRFs) with Convolutional Neural Networks (CNNs). The starting point of this thesis is the observation that, while being of a limited form, G-CRFs allow us to perform exact Maximum-A-Posteriori (MAP) inference efficiently. We prefer exactness and simplicity over generality and advocate G-CRF based structured prediction in deep learning pipelines. Our proposed structured prediction methods accommodate (i) exact inference, (ii) both short- and long-term pairwise interactions, (iii) rich CNN-based expressions for the pairwise terms, and (iv) end-to-end training alongside CNNs. We devise novel implementation strategies which allow us to overcome memory and computational challenges when dealing with fully connected graphical models. These methods are illustrated through extensive experimental studies that demonstrate their utility; indeed, our methods improve upon the state of the art on varied applications in computer vision.
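The exact-MAP property that motivates the G-CRF choice is concrete: with energy E(x) = 0.5 x^T A x - b^T x and A symmetric positive definite (assembled upstream from the unary and pairwise CNN terms), the MAP is the unique solution of the linear system A x = b. A direct-solve sketch follows; the thesis develops far more memory- and compute-efficient strategies for fully connected models.

```python
import numpy as np

def gcrf_map(A, b):
    """Exact MAP inference for a Gaussian CRF: minimize the energy
    E(x) = 0.5 * x^T A x - b^T x, whose unique minimizer (for symmetric
    positive definite A) solves the linear system A x = b."""
    return np.linalg.solve(A, b)
```

At image scale, A is never formed densely; iterative solvers such as conjugate gradients only need matrix-vector products, which is one route to the efficient implementations the thesis describes.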
|