1 |
Deep Synthetic Noise Generation for RGB-D Data Augmentation
Hammond, Patrick Douglas, 01 June 2019
Considerable effort has been devoted to finding reliable methods of correcting noisy RGB-D images captured with unreliable depth-sensing technologies. Supervised neural networks have been shown to be capable of RGB-D image correction, but they require copious amounts of carefully corrected ground-truth data to train effectively. Data collection is laborious and time-intensive, especially for large datasets, and generation of ground-truth training data is prone to human error. It might be possible to train an effective method on a smaller dataset by using synthetically damaged depth data as input to the network, but this requires some understanding of the latent noise distribution of the camera in question. Datasets can be augmented to a certain degree using naive noise generation, such as random dropout or Gaussian noise, but models trained this way tend to generalize poorly to real data. A superior method would imitate real camera noise, damaging input depth images realistically so that the network learns to correct the appropriate depth-noise distribution.

We propose a novel noise-generating CNN capable of producing realistic noise customized to a variety of depth-noise distributions. To demonstrate the effects of synthetic augmentation, we also contribute a large novel RGB-D dataset captured with the Intel RealSense D415 and D435 depth cameras. This dataset pairs many examples of noisy depth images with automatically completed RGB-D images, which we use as a proxy for ground-truth data. We further provide an automated depth-denoising pipeline that may be used to produce proxy ground-truth data for novel datasets. We train a modified sparse-to-dense depth-completion network on splits of varying size from our dataset to establish reasonable baselines for improvement.

Through these tests we find that adding more noisy depth frames to each RGB-D image in the training set has a nearly identical impact on depth-completion training as gathering more ground-truth data. We leverage this finding by using our noise-generating CNN to produce additional synthetic noisy depth images for each RGB-D image in our baseline training sets. With our augmentation method, it is possible to achieve greater than 50% error reduction in supervised depth-completion training, even for small datasets.
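To make concrete the "naive noise generation" baseline the abstract argues against, the following is a minimal sketch (not the thesis's actual code) of damaging a depth map with random dropout and additive Gaussian noise; the function name and parameters are illustrative assumptions:

```python
import numpy as np

def naive_depth_noise(depth, dropout_prob=0.1, sigma=0.01, seed=None):
    """Naively damage a depth map: additive Gaussian noise plus random
    dropout (zeroed pixels mimicking missing depth returns). The abstract
    notes this kind of augmentation tends to generalize poorly to real
    sensor noise; it serves only as the baseline being improved upon."""
    rng = np.random.default_rng(seed)
    # Perturb every pixel with zero-mean Gaussian noise (sigma in meters).
    noisy = depth + rng.normal(0.0, sigma, size=depth.shape)
    # Zero out a random subset of pixels to simulate depth dropout.
    mask = rng.random(depth.shape) < dropout_prob
    noisy[mask] = 0.0
    return noisy

depth = np.full((4, 4), 1.5)  # a flat surface 1.5 m from the camera
damaged = naive_depth_noise(depth, dropout_prob=0.25, sigma=0.005, seed=0)
```

A learned noise model, by contrast, would replace both the dropout pattern and the per-pixel perturbation with structured noise matched to a specific camera's distribution.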
|
2 |
Towards deep unsupervised inverse graphics
Parent-Lévesque, Jérôme, 12 1900
A long-standing goal of computer vision is to infer the underlying 3D content of a scene from
a single photograph, a task known as inverse graphics. Machine learning has, in recent years,
enabled many approaches to make great progress toward solving this problem. However,
most approaches rely on 3D supervision data, which is expensive and sometimes impossible
to obtain, limiting the learning capabilities of such work. In this work, we explore
the deep unsupervised inverse graphics training pipeline and propose two methods based on
distinct 3D representations and associated differentiable rendering algorithms: namely surfels
and a novel Voronoi-based representation. In the first method based on surfels, we show that,
while effective at maintaining view-consistency, producing view-dependent surfels using a
learned depth map results in ambiguities as the mapping between depth map and rendering
is non-bijective. In our second method, we introduce a novel 3D representation based on
Voronoi diagrams which models objects/scenes both explicitly and implicitly simultaneously,
thereby combining the benefits of both. We show how this representation can be used in both
a supervised and unsupervised context and discuss its advantages compared to traditional
3D representations.
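To illustrate the explicit/implicit duality the abstract attributes to the Voronoi-based representation, here is a hypothetical sketch, loosely inspired by the idea rather than the thesis's actual formulation: explicit sites carry occupancy values, while a differentiable soft nearest-site query yields an implicit occupancy function. The function name, temperature parameter, and soft-assignment scheme are assumptions for illustration:

```python
import numpy as np

def voronoi_occupancy(query, sites, site_occ, temperature=0.05):
    """Soft Voronoi occupancy query: the query point is assigned to sites
    by a softmax over negative distances, so the occupancy it returns is
    differentiable with respect to the (explicit) site positions."""
    d = np.linalg.norm(query[None, :] - sites, axis=1)  # distance to each site
    w = np.exp(-d / temperature)
    w /= w.sum()                                        # soft nearest-site weights
    return float(w @ site_occ)                          # blended occupancy in [0, 1]

sites = np.array([[0.0, 0.0, 0.0],   # explicit representation: 3D sites...
                  [1.0, 0.0, 0.0]])
occ = np.array([1.0, 0.0])           # ...each labeled occupied (1) or empty (0)
inside = voronoi_occupancy(np.array([0.1, 0.0, 0.0]), sites, occ)
outside = voronoi_occupancy(np.array([0.9, 0.0, 0.0]), sites, occ)
```

At low temperature the soft assignment approaches the hard Voronoi partition, so queries near the occupied site return occupancy close to 1 and queries near the empty site return occupancy close to 0, while gradients still flow to the site positions.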
|