31

Algorithmes, architecture et éléments optiques pour l'acquisition embarquée d'images totalement focalisées et annotées en distance / Algorithms, architecture and optical components for an embedded all-in-focus and distance-annotated image acquisition system

Emberger, Simon 13 December 2017 (has links)
Acquiring the depth of a scene in addition to its image is a desirable feature for many applications that depend on the near environment. The state of the art in depth extraction offers many methods, but very few are well suited to miniaturized embedded systems: some are too cumbersome because of their large optical systems, while others require delicate calibration or reconstruction methods that are difficult to implement in embedded hardware. In this PhD thesis, we focus on methods with low hardware complexity in order to propose an algorithmic and optical solution for a sensor that extracts the depth of the scene, provides a relevance evaluation of this measurement, and produces all-in-focus images. We show that Depth from Focus (DfF) algorithms are the best adapted to these constraints. This method consists in acquiring a cube of multi-focus images of the same scene at different focusing distances. The images are analyzed in order to annotate each zone of the scene with an index related to its estimated depth; this index is then used to build an all-in-focus image. We worked on the sharpness criterion in order to propose low-complexity solutions, based only on additions and comparisons and therefore easily ported to a hardware architecture. The proposed solution performs a bidirectional local contrast analysis and then combines the most relevant depth estimates, weighted by detection confidence, at the end of processing. It comes in three variants with increasingly strict complexity restrictions, and thus increasing suitability for embedded use. For each method, depth and confidence maps are established, as well as an all-in-focus image composed of elements drawn from the entire multi-focus cube. These approaches are compared in quality and complexity with other state-of-the-art methods of similar complexity, and a hardware architecture is proposed for implementing the most promising solution. The design of these algorithms raises the problem of image quality: it is essential to have a clear contrast evolution as well as a motionless scene during the capture of the multi-focus cube. An effect that is very often neglected in this type of approach is the parasitic zoom caused by the lens responsible for the focus variation. This "focal zoom" weakens the scene-invariance assumption and causes artifacts in all three outputs: depth, image, and confidence. The search for optics adapted to DfF is therefore a second line of research in this work. We evaluated industrial liquid lenses and experimental nematic liquid crystal modal lenses designed during this thesis. These technologies were compared in terms of speed, image quality, intensity of the induced focal zoom, supply voltage, and the quality of the extracted depth maps and reconstructed all-in-focus images. The lens and the algorithm best suited to this embedded DfF problem were then evaluated by porting them to a CPU-GPU development platform that acquires images, depth maps, and confidence maps in real time.
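A minimal sketch of the Depth from Focus selection described above, assuming a grayscale focus stack as a NumPy array. This is my own illustration, not the thesis code; the sharpness measure below uses only additions and comparisons, as the abstract requires, but the exact operator and the confidence definition are assumptions:

```python
import numpy as np

def depth_from_focus(stack):
    """Assign each pixel the focus index where local contrast peaks.

    stack: (N, H, W) array of N grayscale images of the same scene,
    each focused at a different distance.
    Returns (depth_index, confidence, all_in_focus).
    """
    # Bidirectional local contrast: absolute differences with horizontal
    # and vertical neighbours, summed (additions and comparisons only).
    sharp = np.zeros(stack.shape, dtype=np.float64)
    sharp[:, :, :-1] += np.abs(np.diff(stack, axis=2))  # horizontal
    sharp[:, :-1, :] += np.abs(np.diff(stack, axis=1))  # vertical

    depth = np.argmax(sharp, axis=0)              # index of sharpest slice
    best = np.max(sharp, axis=0)
    confidence = best - np.median(sharp, axis=0)  # peak prominence as confidence

    # All-in-focus image: take each pixel from its sharpest slice.
    h, w = depth.shape
    aif = stack[depth, np.arange(h)[:, None], np.arange(w)[None, :]]
    return depth, confidence, aif
```

Because only the argmax index is kept per pixel, the depth map, confidence map, and all-in-focus image fall out of a single pass over the cube, which is what makes this family of methods attractive for hardware implementation.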
32

Generation of Synthetic Retinal Images with High Resolution

Aubrecht, Tomáš January 2020 (has links)
Capturing images of the retina, the most important part of the human eye, requires special equipment: a fundus camera. The goal of this work is therefore to design and implement a system capable of generating such images without this camera. The proposed system maps an input black-and-white image of the retinal blood vessels to a color output image of the whole retina. It consists of two neural networks: a generator, which produces retinal images, and a discriminator, which classifies the given images as real or synthetic. The system was trained on 141 images from publicly available databases. Subsequently, a new database was created containing more than 2,800 images of healthy retinas at a resolution of 1024x1024. This database can be used as a teaching aid for ophthalmologists or as a basis for developing various applications that work with retinal images.
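The generator/discriminator pair described above is trained adversarially. As a minimal sketch (my own illustration; the abstract does not give the exact losses used), the usual binary cross-entropy objectives can be computed from the discriminator's probability outputs:

```python
import numpy as np

def gan_losses(d_real, d_fake, eps=1e-8):
    """Binary cross-entropy objectives for a generator/discriminator pair.

    d_real: discriminator probabilities on real retinal images.
    d_fake: discriminator probabilities on generator outputs.
    Returns (discriminator_loss, generator_loss).
    """
    d_real = np.asarray(d_real, dtype=float)
    d_fake = np.asarray(d_fake, dtype=float)
    # Discriminator: push real images toward 1 and synthetic images toward 0.
    d_loss = -np.mean(np.log(d_real + eps)) - np.mean(np.log(1.0 - d_fake + eps))
    # Generator (non-saturating form): make synthetic images look real.
    g_loss = -np.mean(np.log(d_fake + eps))
    return d_loss, g_loss
```

In the image-to-image setting described (vessel mask in, color fundus image out), such an adversarial term is typically combined with a per-pixel reconstruction loss, though the abstract does not confirm this detail.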
33

Object Detection with Deep Convolutional Neural Networks in Images with Various Lighting Conditions and Limited Resolution / Detektion av objekt med Convolutional Neural Networks (CNN) i bilder med dåliga belysningförhållanden och lågupplösning

Landin, Roman January 2021 (has links)
Computer vision is a key component of any autonomous system. Real-world computer vision applications rely on proper and accurate detection and classification of objects. A detection algorithm that does not guarantee reasonable detection accuracy is not applicable in real-time scenarios where safety is the main objective. Factors that impact detection accuracy include illumination conditions and image resolution; both contribute to degradation of objects and lead to low classification and detection accuracy. Recent development of algorithms based on Convolutional Neural Networks (CNNs) offers possibilities for low-light (LL) image enhancement and super-resolution (SR) image generation, which makes it possible to combine such models in order to improve image quality and increase detection accuracy. This thesis evaluates different CNN models for SR generation and LL enhancement by comparing generated images against ground-truth images. To quantify the impact of each model on detection accuracy, a detection procedure was evaluated on the generated images. Experimental results on images selected from the NightOwls and Caltech Pedestrian datasets showed that super-resolution image generation and low-light image enhancement improve detection accuracy by a substantial margin. Additionally, a cascade of SR generation and LL enhancement further boosts detection accuracy. However, the main drawback of such cascades is the increased computation time, which limits their use in a range of real-time applications.
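The cascade described above can be sketched with simple stand-ins. Gamma correction and nearest-neighbour upscaling below replace the CNN-based enhancer and SR model, which the abstract does not name, so this is purely an illustration of the pipeline order, not of the evaluated models:

```python
import numpy as np

def enhance_low_light(img, gamma=0.5):
    """Stand-in for a CNN low-light enhancer: gamma-brighten a [0, 1] image."""
    return np.clip(img, 0.0, 1.0) ** gamma

def upscale(img, factor=2):
    """Stand-in for a CNN super-resolution model: nearest-neighbour upscaling."""
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

def preprocess(img):
    """The cascade evaluated in the thesis: enhance first, then upscale,
    before handing the image to the detector."""
    return upscale(enhance_low_light(img))
```

Each extra stage improves the input seen by the detector at the cost of additional latency, which is exactly the trade-off the abstract identifies for real-time use.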
34

Image generation through feature extraction and learning using a deep learning approach

Bruneel, Tibo January 2023 (has links)
With recent advancements, image generation has become increasingly feasible thanks to stronger generative artificial intelligence (AI) models. The ability to generate non-existing images that closely resemble real-world images is interesting for many use cases. Generated images could be used, for example, to augment, extend, or replace real data sets for training AI models, thereby reducing the costs of data collection and similar processes. Deep learning, a sub-field of AI, has been at the forefront of such methodologies because it can capture and learn highly complex, feature-rich data. This work focuses on deep generative learning approaches within a forestry application, with the goal of generating tree log end images in order to enhance an AI model that uses such images. This approach would reduce the cost of data collection not only for that model but also for many other information extraction models in the forestry field. The thesis includes a study of the state of the art in deep generative modelling and experiments using a full pipeline from a deep generative modelling stage to a log end recognition model. In addition, a variant architecture and an image sampling algorithm are proposed for this pipeline and evaluated. The experiments show that the applied generative approaches learn features well but lack high-quality, realistic generation, producing blurry results. The variant approach achieved slightly better feature learning with a trade-off in generation quality, and the proposed sampling algorithm worked well in a qualitative evaluation. The problems found in the generative models propagated into the training of the recognition model, making it impossible at this point in the research to improve another AI model using purely generated data.
The results of this research show that more work is needed to improve the generation quality so that it resembles real-world data closely enough for other models to be trained on artificial data. The variant approach does not bring a large improvement, and its findings, like those for the proposed image sampling algorithm, contribute to the field by documenting their strengths and weaknesses. Finally, this study provides a good starting point for research in this application, with many directions and opportunities for future work.
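The abstract does not specify how the proposed image sampling algorithm works. Purely as an illustration of sampling a diverse subset of generated images by their feature vectors, a greedy farthest-point selection (the function name and the approach are my own, not the thesis method) could look like:

```python
import numpy as np

def select_diverse(features, k):
    """Greedily pick k sample indices whose feature vectors are far apart.

    features: (n, d) array, one feature vector per generated image.
    Returns a list of k selected indices.
    """
    features = np.asarray(features, dtype=float)
    chosen = [0]  # seed with the first sample
    dist = np.linalg.norm(features - features[0], axis=1)
    while len(chosen) < k:
        nxt = int(np.argmax(dist))         # farthest from everything chosen
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(features - features[nxt], axis=1))
    return chosen
```

Selecting diverse rather than random generated samples is one plausible way to get more value out of a generator whose outputs are otherwise blurry and mutually similar.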
35

Deep Synthesis of Distortion-free 3D Omnidirectional Imagery from 2D Images

Christopher K May (18422640) 22 April 2024 (has links)
Omnidirectional images are a way to visualize an environment in all directions. They have a spherical topology and require careful attention when represented by a computer: mapping the sphere to a plane stretches the spherical image content, and at least one seam is needed to unwrap the sphere. Generative neural networks have shown an impressive ability to synthesize images, but generating spherical images is still challenging. Without specific handling of the spherical topology, the generated images often exhibit distorted content and discontinuities across the seams. We describe strategies for mitigating such distortions during image generation, as well as for keeping the image continuous across all boundaries. Our solutions can be applied to a variety of spherical image representations, including cube maps and equirectangular projections.

A closely related problem in generative networks is 3D-aware scene generation, in which the task involves creating an environment whose viewpoint can be directly controlled. Many NeRF-based solutions have been proposed, but they generally focus on generating single objects or faces. Full 3D environments are more difficult to synthesize and are less studied. We approach this problem by leveraging omnidirectional image synthesis, using the initial features of the network as a transformable foundation upon which to build the scene. By translating within the initial feature space, we correspondingly translate the output omnidirectional image, preserving the scene characteristics. We additionally develop a regularizing loss based on epipolar geometry to encourage geometric consistency between viewpoints. We demonstrate the effectiveness of our method with a structure-from-motion-based reconstruction metric, along with comparisons to related works.
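Seam continuity for equirectangular representations can be illustrated with padding. The abstract does not give the thesis's exact mechanism, so the sketch below only shows the common trick it alludes to: longitude wraps around, so a convolution that pads its input circularly in width sees no seam at all:

```python
import numpy as np

def pad_equirectangular(img, pad):
    """Pad a 2D equirectangular image so convolutions see a continuous seam.

    Longitude (width) wraps around the sphere, so pad circularly there;
    latitude (height) does not wrap, so reflect at the poles instead.
    """
    img = np.pad(img, ((0, 0), (pad, pad)), mode="wrap")     # wrap the seam
    img = np.pad(img, ((pad, pad), (0, 0)), mode="reflect")  # poles: reflect
    return img
```

Applied before every convolution, this keeps generated content continuous across the left/right boundary, which is one of the discontinuity sources the abstract mentions.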
36

Évaluation de la qualité des documents anciens numérisés

Rabeux, Vincent 06 March 2013 (has links)
This PhD thesis deals with quality evaluation of digitized document images. In order to measure the quality of a document image, we propose new features dedicated to characterizing the most common degradations found in digitized documents. We also propose to use these features to build models able to predict the performance of different types of document processing and analysis algorithms. The features are defined by analyzing the influence of each degradation on the performance of different algorithms, and are then used to create prediction models with statistical regressors. The relevance of the proposed features and of the prediction methodology is validated in several ways: first, by predicting the performance of eleven binarization algorithms; second, by creating an automatic procedure that selects the best-performing binarization algorithm for each image; and finally, by predicting the performance of two OCRs as a function of the severity of the bleed-through defect (ink from the recto diffusing onto the verso of a document). This work on performance prediction is also an opportunity to address the scientific problems of creating ground truth and evaluating performance.
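The prediction methodology described above can be sketched in a few lines (my own illustration, not the thesis code): degradation descriptors computed on each image are fed to a statistical regressor, here plain least squares, which then predicts an algorithm's score on unseen images:

```python
import numpy as np

def fit_quality_predictor(features, scores):
    """Fit a linear regressor mapping degradation descriptors to a score
    (e.g. the f-score of a binarization algorithm on that image)."""
    X = np.column_stack([np.ones(len(features)), features])  # add intercept
    coeffs, *_ = np.linalg.lstsq(X, scores, rcond=None)
    return coeffs

def predict(coeffs, features):
    """Predict algorithm scores for new images from their descriptors."""
    X = np.column_stack([np.ones(len(features)), features])
    return X @ coeffs
```

Fitting one such model per algorithm also yields the selection procedure the abstract mentions: for each image, predict every algorithm's score and keep the algorithm with the highest prediction.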
