161 |
A Deep 3D Object Pose Estimation Framework for Robots with RGB-D Sensors
Wagh, Ameya Yatindra (24 April 2019)
Object detection and pose estimation have traditionally relied on template matching techniques. However, these algorithms are sensitive to outliers and occlusions and have high latency due to their iterative nature. Recent research in computer vision and deep learning has greatly improved the robustness of these algorithms. However, one major drawback is that they are specific to the objects. Moreover, the pose estimate depends significantly on the objects' RGB image features. Because these algorithms are trained on large, meticulously labeled datasets of ground-truth object poses, they are difficult to retrain for real-world applications. To overcome this problem, we propose a two-stage pipeline of convolutional neural networks that uses RGB images to localize objects in 2D space and depth images to estimate a 6DoF pose. The pose estimation network thus learns only the geometric features of the object and is not biased by its color features. We evaluate the framework on the LINEMOD dataset, which is widely used to benchmark object pose estimation frameworks, and find the results comparable with state-of-the-art algorithms using RGB-D images. Second, to show the transferability of the proposed pipeline, we implement it on the ATLAS robot for a pick-and-place experiment. Because the distribution of images in the LINEMOD dataset differs from that of images captured by the MultiSense sensor on ATLAS, we generate a synthetic dataset from very few real-world images captured by the MultiSense sensor. We use this dataset to train only the object detection networks used in the ATLAS robot experiment.
|
162 |
Characterisation and segmentation of basal ganglia mineralization in normal ageing with multimodal structural MRI
Glatz, Andreas (January 2016)
Iron is the most abundant trace metal in the brain and is essential for many biological processes, such as neurotransmitter synthesis and myelin formation. This thesis investigates small, multifocal hypointensities that are apparent on T2*-weighted (T2*w) MRI in the basal ganglia, where most iron presumably enters the brain across the blood-brain barrier along the penetrating arteries. These basal ganglia T2*w hypointensities are believed to arise from iron-rich microvascular mineral deposits, which are frequently found in community-dwelling elderly subjects and are associated with age-related cognitive decline. This thesis documents the characteristic spatial distribution and morphology of basal ganglia T2*w hypointensities in 98 community-dwelling elderly subjects in their seventies, as well as their imaging signatures on T1-weighted (T1w) and T2-weighted (T2w) MRI. A novel, fully automated method is introduced for the segmentation of basal ganglia T2*w hypointensities. It was developed to reduce the high intra- and inter-rater variability associated with current semi-automated segmentation methods and to facilitate the segmentation of these features in other single- and multi-centre studies. This thesis also presents a multi-parametric quantitative MRI relaxometry methodology for conventional clinical MRI scanners, which was developed and validated to improve the characterisation of brain iron. Lastly, this thesis describes the application of the developed methods to the segmentation of basal ganglia T2*w hypointensities in 243 community-dwelling participants of the Austrian Stroke Prevention Study Family (ASPS-Fam) and their analysis on R2* (=1/T2*) relaxation rate and Larmor frequency shift maps. This work confirms that basal ganglia T2*w hypointensities, especially in the globus pallidus, are potentially MRI markers of microvascular mineralization. Furthermore, the ASPS-Fam results show that basal ganglia mineral deposits mainly consist of paramagnetic particles, which presumably arise from an imbalance in brain iron homeostasis. Hence, basal ganglia T2*w hypointensities are possibly an indicator of age-related microvascular dysfunction with iron accumulation, which might help to explain the variability of cognitive decline in normal ageing.
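The relation between the R2* relaxation rate and the T2* relaxation time mentioned in this abstract can be written out as follows. The second expression assumes the common mono-exponential gradient-echo decay model; that model, and the symbols S_0 (signal at zero echo time) and T_E (echo time), are assumptions of this sketch and are not taken from the thesis.

```latex
R_2^* = \frac{1}{T_2^*}, \qquad S(T_E) = S_0 \, e^{-T_E \, R_2^*}
```

For instance, a T2* of 25 ms corresponds to an R2* of 40 s^-1. Under this model a shorter T2* (higher R2*) gives a lower signal at a fixed echo time, which is why such regions appear hypointense on T2*w images.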
|
163 |
Brain volumetric MRI study of extremely low gestational age newborns (ELGANs) at 9 to 10 years of age
Zhou, Qingde (08 April 2016)
PURPOSE: Extremely low gestational age newborns (ELGANs) are at high risk for developmental brain abnormalities, which can lead to cognitive, physical, emotional and behavioral deficits. This study aims to identify potential brain volumetric abnormalities in ELGAN children at 9 to 10 years of age.
METHODS: High-resolution magnetic resonance imaging (MRI) scans were obtained from 82 ELGAN children using a dual-echo turbo spin-echo (DE-TSE) pulse sequence at 3.0T (or 1.5T at only one site). The DICOM MR images were processed with quantitative MRI algorithms programmed in Mathcad. The brain gray matter (GM), white matter (WM) and cerebrospinal fluid (CSF) volumes were quantified using semi-automated clustering segmentation algorithms.
RESULTS: Total brain volumes (GM+WM) of ELGAN children showed a wide range, from 400 to 1500 mL. About 63% of the children had smaller brain volumes and 5% had larger brain volumes than published data from normal children of the same ages [1]. Smaller brain volumes were observed more often in males (74%) than in females (50%). WM reduction was the major change in ELGANs, with over 90% of them (86% of males and 92% of females) having reduced WM volumes. GM volumes were either reduced (15%) or enlarged (32%); GM reduction was observed more often in males (31%) than in females (4.8%), while GM enlargement was observed more often in females (35%) than in males (28%). Intracranial CSF volumes ranged from 25 mL to 600 mL, with 16% of ELGAN children (9% of males and 21% of females) having smaller CSF volumes and 38% (53% of males and 27% of females) having larger CSF volumes. Correlation analysis revealed a positive correlation between total intracranial matter (ICM) and CSF volumes (male: r = 0.4972, p = 0.0014; female: r = 0.3233, p = 0.0125), but a negative correlation between brain volumes and CSF volumes (male: r = -0.2998, p = 0.0424; female: r = -0.2279, p = 0.0596). Further analysis demonstrated a negative correlation between GM and CSF in both absolute (male: r = -0.4489, p = 0.0039; female: r = -0.3769, p = 0.0041) and relative (male: r = -0.8675, p < 0.0001; female: r = -0.8350, p < 0.0001) volumes, while WM volumes did not correlate with CSF volumes.
CONCLUSION: ELGAN children had mostly smaller brain volumes while some of them displayed larger brain volumes at ages of 9 to 10 years. The reduction of WM was a characteristic change in ELGAN children and contributed to smaller brain volumes. GM volumes either increased or decreased. Larger intracranial CSF volumes were associated with larger intracranial matter (ICM) volume.
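The two quantitative steps in this abstract, turning a tissue segmentation into a volume and correlating two volume series, can be sketched as below. This is a minimal illustration only: the function names, the label encoding and the voxel size are hypothetical, and the study itself used algorithms programmed in Mathcad, not this code.

```python
import math


def tissue_volume_ml(labels, tissue_label, voxel_volume_mm3):
    """Volume of one tissue class in millilitres.

    labels is a flat list of per-voxel class labels (hypothetical encoding);
    1 mL equals 1000 cubic millimetres.
    """
    voxel_count = sum(1 for label in labels if label == tissue_label)
    return voxel_count * voxel_volume_mm3 / 1000.0


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Toy data: two voxels of class 1, each 500 cubic millimetres.
toy_volume = tissue_volume_ml([1, 1, 2, 0], tissue_label=1, voxel_volume_mm3=500.0)
# Toy series that rise together, and a pair that move in opposite directions.
r_positive = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
r_negative = pearson_r([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

A negative coefficient, as reported between GM and CSF volumes, means that children with more of one tend to have less of the other.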
|
164 |
Image Characterization by Morphological Hierarchical Representations / Caractérisation d'images par des représentations morphologiques hiérarchiques
Fehri, Amin (25 May 2018)
This thesis deals with the extraction of hierarchical and multiscale descriptors from images, in order to interpret, characterize and segment them. It is divided into two parts. The first part presents theoretical and methodological elements for obtaining hierarchical clusterings of the nodes of an edge-weighted graph. These methods are then applied to graphs representing images, yielding several hierarchical image segmentation methods. In addition, we introduce different ways of combining hierarchical segmentations. Finally, we propose a methodology for structuring and studying the space of the hierarchies we have built, using the Gromov-Hausdorff distance between them. The second part explores several applications of these hierarchical image descriptions. We present a method for learning to automatically extract a good segmentation from these hierarchies, given a type of image and a segmentation quality score. We also propose image descriptors obtained by measuring inter-hierarchy distances, and demonstrate their effectiveness on real and simulated data. Finally, we extend the potential applications of these hierarchies by introducing a technique that takes any prior spatial information into account during their construction.
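To make the idea of a hierarchical clustering of the nodes of an edge-weighted graph concrete, here is a minimal sketch of one classical construction: a single-linkage hierarchy obtained by merging regions along edges of increasing weight. It is an illustration only; the class and function names are hypothetical, and nothing here is claimed to be the specific construction used in the thesis.

```python
class DisjointSet:
    """Union-find structure tracking which nodes share a region."""

    def __init__(self, size):
        self.parent = list(range(size))

    def find(self, node):
        # Walk up to the representative, shortening the path on the way.
        while self.parent[node] != node:
            self.parent[node] = self.parent[self.parent[node]]
            node = self.parent[node]
        return node

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a == root_b:
            return False
        self.parent[root_b] = root_a
        return True


def build_hierarchy(num_nodes, edges):
    """Return the merges (weight, u, v) in increasing weight order.

    edges is a list of (weight, u, v) tuples. The edges that are kept form
    a minimum spanning forest; each one is a merge level of the hierarchy.
    """
    regions = DisjointSet(num_nodes)
    return [(w, u, v) for w, u, v in sorted(edges) if regions.union(u, v)]


def cut_hierarchy(num_nodes, merges, threshold):
    """Label each node with its region after applying merges <= threshold."""
    regions = DisjointSet(num_nodes)
    for weight, u, v in merges:
        if weight <= threshold:
            regions.union(u, v)
    return [regions.find(node) for node in range(num_nodes)]


# Four pixels in a row with gray levels 10, 12, 50, 53; each edge weight is
# the absolute gray-level difference between neighbouring pixels.
edges = [(2, 0, 1), (38, 1, 2), (3, 2, 3)]
merges = build_hierarchy(4, edges)
fine = cut_hierarchy(4, merges, threshold=5)     # two regions
coarse = cut_hierarchy(4, merges, threshold=40)  # a single region
```

Cutting the same hierarchy at different thresholds gives nested segmentations, from fine to coarse, which is the multiscale behaviour the abstract refers to.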
|
165 |
Atlas anatômico da região da cabeça e do pescoço : em direção à radioterapia adaptativa [Anatomical atlas of the head and neck region: towards adaptive radiotherapy]
Parraga, Adriane (January 2008)
Intensity-modulated radiotherapy (IMRT) is a new external radiotherapy technique that makes it possible to shape the radiation dose in 2D or 3D images, delineating a radiation zone of almost any shape and modulating the beam intensity inside the target. The tumor can thus be irradiated while neighboring healthy tissue and important functional areas (e.g. the spinal cord) are spared, limiting the side effects of treatment. For IMRT to succeed, however, the tumor and the healthy organs to be protected must first be delineated precisely, which raises the issues of adequacy and accuracy in the selection and delineation of the target volumes. The purpose of this thesis is to provide tools for the automatic delineation of regions of interest and for adaptive radiotherapy of tumors located in the head and neck region. At present, an expert manually delineates the tumor volume and the organs at risk in the patient's computed tomography image, contour by contour and slice by slice. This task is tedious, highly time-consuming and requires expert knowledge. Moreover, radiotherapy planning typically relies on a single set of computed tomography images, acquired in treatment position during the pre-treatment week, on which target volumes (TVs) and normal structures are delineated and the dose is calculated. Treatment usually lasts several weeks, yet the dose estimated at the start is kept for all the remaining weeks. This is an oversimplification, because the pre-treatment images are only a snapshot of the patient's anatomy at a given time: shrinkage of the tumor and changes in the patient's anatomy at large (e.g. due to weight loss) may occur over the course of treatment, causing local and global anatomical changes. The main contributions of this thesis aim to advance the solution of these problems and are presented along two axes. The first is a methodology for choosing an image whose anatomy is the most representative of a population; this image is called the atlas. Registering the atlas to a new image of the patient allows the structures of interest to be segmented automatically, speeding up the delineation process and making it more robust. The second contribution concerns adaptive radiotherapy. Adjusting the dose estimated in the first week to the anatomical changes requires non-rigid registration algorithms. This stage therefore evaluates and adapts non-rigid registration methods so that the tumor region is well aligned across the different moments of the treatment.
|
166 |
Segmentation of mammographic images for computer aided diagnosis / Segmentation d’images mammographiques pour l’aide au diagnostic
Feudjio Kougoum, Cyrille Désiré (05 October 2016)
Computer-aided diagnosis (CAD) systems are now at the heart of many clinical protocols because they improve the quality of diagnosis and therefore of medical care. This research work puts forward a hierarchical architecture for the design of a robust and efficient CAD tool for breast cancer detection. More precisely, it focuses on reducing the false alarm rate by identifying the image regions of foremost interest, i.e. potentially cancerous areas. First, the dynamic range of gray-level intensities in dark regions is stretched to enhance the contrast between the breast region and the background, which favors accurate extraction of the breast region. A second segmentation follows, because the pectoral muscle remains embedded in the foreground region and interferes with breast tissue analysis. Extracting the pectoral muscle is challenging because it overlaps with dense breast tissue. Under these conditions, even exploiting spatial information during fuzzy C-means clustering does not always produce a relevant segmentation. To overcome this difficulty, a new validation step followed by a contour refinement strategy is proposed to detect and correct segmentation imperfections. The second macro-step is devoted to breast tissue density analysis. To address the variability of gray-level distributions within mammographic density classes, we introduce a contrast standardization for mammographic images based on an optimized gray-level transport map. Thanks to this technique, the relative area of dense tissue estimated by simple thresholding is very strongly correlated with the density classes of an annotated dataset.
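The first step described in this abstract, stretching the dynamic range of dark gray levels, can be illustrated with a piecewise-linear mapping. This is a sketch under stated assumptions: the function name and the knee values (50 mapped to 128 on an 8-bit scale) are hypothetical choices for illustration, not parameters reported in the thesis.

```python
def stretch_dark_levels(value, knee=50, out_knee=128, max_level=255):
    """Piecewise-linear contrast stretch for one gray level.

    Input levels in [0, knee] are spread over [0, out_knee], which widens
    the dark range; levels above the knee are compressed into what remains.
    The knee positions are illustrative assumptions.
    """
    if value <= knee:
        stretched = value * out_knee / knee
    else:
        stretched = out_knee + (value - knee) * (max_level - out_knee) / (max_level - knee)
    return int(round(stretched))


# Dark levels that were 25 apart end up 64 apart after stretching.
stretched = [stretch_dark_levels(v) for v in [0, 25, 50, 255]]
```

After such a mapping, a threshold separating the dark background from the slightly brighter breast border has more gray levels to work with, which is the motivation given in the abstract.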
|
167 |
The influence of expertise on segmentation and memory for basketball and Overwatch videos
Newberry, Kimberly Marie (January 1900)
Master of Science / Department of Psychological Sciences / Heather R. Bailey
Much research has shown that experts possess superior memory in their field of expertise. This memory benefit has been proposed to result from various encoding mechanisms, such as chunking and differentiation. Another potential encoding mechanism associated with memory is event segmentation, the process by which individuals parse continuous information into meaningful, discrete units. Event Segmentation Theory proposes that segmentation is influenced by perceptual (e.g., motion) and conceptual (e.g., semantic knowledge) cues. Previous research has found evidence supporting the influence of knowledge on segmentation, specifically through the manipulation of goals and familiarity for everyday activities. To date, few studies have investigated the influence of expertise on segmentation, and questions about expertise, segmentation ability, and their impact on memory remain. The goal of the current study was to investigate the influence of expertise on segmentation and memory ability in two different domains: basketball and Overwatch. Participants with high and low knowledge of basketball viewed and segmented basketball and Overwatch videos at coarse and fine grains, then completed memory tests. Differences in segmentation ability and memory were present between experts and novices, specifically for the basketball videos; however, for experts, segmentation predicted memory only for activities for which knowledge was lacking. Overall, this research suggests that experts' superior memory is not due to their segmentation ability, and it contributes to a growing body of literature showing evidence of conceptual effects on segmentation.
|
168 |
Segmentation of irises acquired in degraded conditions / Segmentation d’iris acquis en conditions dégradées
Lefevre, Thierry (30 January 2013)
This thesis focuses on the development of robust segmentation algorithms for iris recognition systems operating in degraded acquisition conditions (for example, a moving or weakly cooperative subject, or an image acquired far from the sensor), which severely degrade recognition performance. In controlled acquisition scenarios, iris segmentation is handled well by simple segmentation schemes that model the iris borders as circles and assume that the iris can only be occluded by eyelids. However, such simple models tend to fail when the iris is strongly occluded or off-angle, or when the iris borders are not sharp enough. In this thesis, we propose a complete segmentation system that works efficiently despite these degradations of the input data. After a study of the recent state of the art in iris recognition, we identified four key issues that an iris segmentation system should handle when confronted with images of poor quality, leading to four key modules for the complete system:
• The system should be able to locate the pupil in the image in order to initialize more complex algorithms. To address this problem, we propose an original and effective way to first segment dark elements in the image, such as eyelashes and eyebrows, that can mislead the pupil localization process. This rough segmentation detects high-frequency areas of the image; the system then uses pupil homogeneity as a criterion to identify the pupil area among the other dark regions of the image.
• Accurate segmentation of the iris texture in the eye image is a core task of iris segmentation systems. We propose to segment the iris texture with Active Contours because they meet the robustness and accuracy requirements for segmenting large databases of degraded images. We studied several Active Contours that vary in speed, robustness, accuracy and in the features they use to model the iris region, and we comparatively evaluate the influence of these algorithms on the performance of the complete system.
• A complete segmentation system must also accurately estimate the iris shape in occluded regions, in order to format the iris texture for recognition. We propose a robust and accurate scheme based on a variational formulation to fit an elliptic model to the iris borders. We explicitly derive evolution equations for circles and ellipses using the Active Contours formalism. We also propose an effective way to make this process less sensitive to local minima and initial conditions. This part of our work is currently under review for final acceptance in the international journal Computer Vision and Image Understanding (CVIU).
• Finally, we address the automatic detection of segmentation failures, since the global recognition system must be warned when an error has occurred. Few works in the literature address measuring the quality of a segmentation algorithm, a critical task for an operational system. We propose a set of novel quality measures for segmentation and show that each of them correlates with the intrinsic recognition performance of the segmented images. We fuse the individual quality measures with a Support Vector Regression (SVR) algorithm to obtain a more robust global segmentation quality score.
|
169 |
THE INFLUENCE OF CONTROL STRATEGY ON EVENT SEGMENTATION
Carlos, Vanessa (01 March 2018)
The dual mechanisms of cognitive control (DMC) framework describes cognitive control in terms of two strategies: proactive and reactive. Individuals using a proactive strategy focus on actively maintaining goal-relevant information in memory, whereas reactive individuals store goal-relevant information and retrieve it when cues are present. Reimer and colleagues (2015, 2017) added cue-probe location shifts to the typical AX-CPT as well as to a virtual-reality version of the AX-CPT. They found that the effect of location shifts varies depending on whether a proactive or reactive mode of control is used. The aim of the present study was therefore to test whether the effect of location shifts on cognitive control depends on the type of control strategy used. Two versions of the AX-CPT were used: shift alone and shift with no-go trials. The shift-alone AX-CPT examined the influence of location shifts in proactively biased young adults. The shift with no-go trials AX-CPT examined the influence of location shifts under a manipulation known to induce a reactive control strategy (Gonthier et al., 2016). It was hypothesized that cue-probe location shifts would have a differential effect depending on mode of control. Results demonstrated that the type of AX-CPT given, cue-probe location, and the type of trial presented each influenced participant performance. There was also an interaction between AX-CPT type and trial type, which provides evidence for a successful manipulation of mode of control. The hypothesized interaction between all variables, however, was not found. Possible limitations of the present study and future directions are discussed.
|
170 |
Apprentissage autosupervisé de modèles prédictifs de segmentation à partir de vidéos / Self-supervised learning of predictive segmentation models from video
Luc, Pauline (25 June 2019)
Predictive models of the environment hold promise for allowing the transfer of recent reinforcement learning successes to many real-world contexts, by decreasing the number of interactions needed with the real world. Video prediction has been studied in recent years as a particular case of such predictive models, with broad applications in robotics and navigation systems. While RGB frames are easy to acquire and hold a lot of information, they are extremely challenging to predict and cannot be directly interpreted by downstream applications. Here we introduce the novel tasks of predicting semantic and instance segmentation of future frames. The abstract feature spaces we consider are better suited for recursive prediction and allow us to develop models that convincingly predict segmentations up to half a second into the future. Predictions are more easily interpretable by downstream algorithms and remain rich, spatially detailed and easy to obtain, relying on state-of-the-art segmentation methods. We first focus on the task of semantic segmentation, for which we propose a discriminative approach based on adversarial training. Then, we introduce the novel task of predicting future semantic segmentation, and develop an autoregressive convolutional neural network to address it. Finally, we extend our method to the more challenging problem of predicting future instance segmentation, which additionally segments out individual objects. To deal with a varying number of output labels per image, we develop a predictive model in the space of high-level convolutional image features of the Mask R-CNN instance segmentation model. We are able to produce visually pleasing segmentations at high resolution for complex scenes involving a large number of instances, with convincing accuracy up to half a second ahead.
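The recursive (autoregressive) prediction mentioned in this abstract can be sketched as a rollout loop in which each prediction is fed back as input for the next step. This is an illustration only: `rollout` and `linear_extrapolation` are hypothetical names, and the toy one-step predictor on scalars merely stands in for the convolutional network described in the thesis.

```python
def rollout(predict_next, history, steps):
    """Predict several future steps by feeding predictions back as input.

    predict_next maps a fixed-length window of past items to the next item.
    history is the list of observed items that seeds the window.
    """
    window = list(history)
    predictions = []
    for _ in range(steps):
        next_item = predict_next(window)
        predictions.append(next_item)
        # Slide the window: drop the oldest item, append the new prediction.
        window = window[1:] + [next_item]
    return predictions


def linear_extrapolation(window):
    """Toy one-step predictor: continue the trend of the last two items."""
    return 2 * window[-1] - window[-2]


# Two observed values, three predicted ones; errors of a real predictor
# would accumulate over the steps because later inputs are predictions.
future = rollout(linear_extrapolation, [0.0, 1.0], steps=3)
```

Because every step after the first consumes predicted rather than observed inputs, accuracy typically degrades with the horizon, which is consistent with the half-second limit reported in the abstract.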
|