Global ETD Search

141. Discrete Scale-Space Theory and the Scale-Space Primal Sketch
Lindeberg, Tony, January 1991
This thesis, within the subfield of computer science known as computer vision, deals with the use of scale-space analysis in early low-level processing of visual information. The main contributions comprise the following five subjects:
(1) The formulation of a scale-space theory for discrete signals. Previously, the scale-space concept has been expressed for continuous signals only. We propose that the canonical way to construct a scale-space for discrete signals is by convolution with a kernel called the discrete analogue of the Gaussian kernel, or equivalently by solving a semi-discretized version of the diffusion equation. Both the one-dimensional and two-dimensional cases are covered. An extensive analysis of discrete smoothing kernels is carried out for one-dimensional signals, and the discrete scale-space properties of the most common discretizations of the continuous theory are analysed.
(2) A representation, called the scale-space primal sketch, which gives a formal description of the hierarchical relations between structures at different levels of scale. It is aimed at making information in the scale-space representation explicit. We give a theory for its construction and an algorithm for computing it.
(3) A theory for extracting significant image structures and determining the scales of these structures from this representation in a solely bottom-up, data-driven way.
(4) Examples demonstrating how such qualitative information extracted from the scale-space primal sketch can be used for guiding and simplifying other early visual processes. Applications are given to edge detection, histogram analysis and classification based on local features. Among other possible applications one can mention perceptual grouping, texture analysis, stereo matching, model matching and motion.
(5) A detailed theoretical analysis of the evolution properties of critical points and blobs in scale-space, comprising drift velocity estimates under scale-space smoothing, a classification of the possible types of generic events at bifurcation situations, and estimates of how the number of local extrema in a signal can be expected to decrease as a function of the scale parameter. For two-dimensional signals the generic bifurcation events are annihilations and creations of extremum-saddle point pairs. Interpreted in terms of blobs, these transitions correspond to annihilations, merges, splits and creations.
Experiments on different types of real imagery demonstrate that the proposed theory gives perceptually intuitive results.
Keywords: Computer vision; low-level processing; scale-space; diffusion; Gaussian filtering; discrete smoothing; primal sketch; segmentation; descriptive elements; scale detection; image structure; focus-of-attention; tuning low-level processing; blob detection; edge detection; edge focusing; histogram analysis; junction classification; perceptual grouping; texture analysis; critical points; classification of blob events; bifurcations; drift velocity; density of local extrema; multi-scale representation; digital signal processing
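As a rough illustration of the discrete scale-space construction mentioned in the abstract above: the discrete analogue of the Gaussian kernel is commonly written T(n; t) = exp(-t) I_n(t), where I_n is the modified Bessel function of integer order n and t is the scale parameter. The sketch below assumes that form. The function names, the truncation radius and the zero-padded boundary handling are illustrative choices, not taken from the thesis.

```python
import math


def bessel_i(n, x, terms=60):
    """Modified Bessel function I_n(x) for integer n >= 0, via its power series."""
    term = (x / 2.0) ** n / math.factorial(n)  # k = 0 term of the series
    total = 0.0
    for k in range(terms):
        total += term
        # Ratio of consecutive series terms: (x/2)^2 / ((k+1) * (k+1+n)).
        term *= (x / 2.0) ** 2 / ((k + 1) * (k + 1 + n))
    return total


def discrete_gaussian_kernel(t, radius):
    """Samples T(n; t) = exp(-t) * I_n(t) for n = -radius..radius (truncated)."""
    half = [math.exp(-t) * bessel_i(n, t) for n in range(radius + 1)]
    return half[:0:-1] + half  # the kernel is symmetric: T(-n; t) = T(n; t)


def smooth(signal, kernel):
    """Convolve a 1-D signal with the kernel, treating outside samples as zero."""
    radius = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - radius
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        out.append(acc)
    return out


# Smoothing a unit impulse reproduces the kernel itself.
kernel = discrete_gaussian_kernel(t=2.0, radius=10)
impulse = [0.0] * 10 + [1.0] + [0.0] * 10
smoothed = smooth(impulse, kernel)
```

The untruncated kernel sums to one for every t, so the truncated weights above sum to one up to a small tail error. Larger values of t need a larger radius to keep that error small.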
142. Natural image processing and synthesis using deep learning
Ganin, Iaroslav, 09 1900
In the present thesis, we study how deep neural networks can be applied to various tasks in computer vision. Computer vision is an interdisciplinary field that deals with the understanding of digital images and video. Traditionally, the problems arising in this domain were tackled using heavily hand-engineered ad hoc methods. Until recently, a typical computer vision system consisted of a sequence of independently optimized modules which barely talked to each other. Such an approach is quite reasonable in the case of limited data, as it takes major advantage of the researcher's domain expertise. This strength turns into a weakness if some of the input scenarios are overlooked in the algorithm design process. With the rapidly increasing volumes and varieties of data and the advent of cheaper and faster computational resources, end-to-end deep neural networks have become an appealing alternative to the traditional computer vision pipelines. We demonstrate this in a series of research articles, each of which considers a particular task of either image analysis or synthesis and presents a solution based on a "deep" backbone.
In the first article, we deal with a classic low-level vision problem, edge detection. Inspired by a top-performing non-neural approach, we take a step towards building an end-to-end system by combining feature extraction and description in a single convolutional network. The resulting fully data-driven method matches or surpasses the detection quality of the existing conventional approaches in the settings for which they were designed, while being significantly more usable in out-of-domain situations.
In our second article, we introduce a custom architecture for image manipulation based on the idea that most of the pixels in the output image can be directly copied from the input.
This technique bears several significant advantages over the naive black-box neural approach: it retains the level of detail of the original images, does not introduce artifacts due to insufficient capacity of the underlying neural network, and simplifies the training process, to name a few. We demonstrate the efficiency of the proposed architecture on the challenging gaze correction task, where our system achieves excellent results.
In the third article, we slightly diverge from pure computer vision and study the more general problem of domain adaptation. There, we introduce a novel training-time algorithm, i.e., adaptation is attained by using an auxiliary objective in addition to the main one. We seek to extract features that maximally confuse a dedicated network, called the domain classifier, while being useful for the task at hand. The domain classifier is learned simultaneously with the features and attempts to tell whether those features come from the source or the target domain. The proposed technique is easy to implement, yet results in superior performance on all the standard benchmarks.
Finally, the fourth article presents a new kind of generative model for image data. Unlike conventional neural-network-based approaches, our system, dubbed SPIRAL, describes images in terms of concise low-level programs executed by off-the-shelf rendering software used by humans to create visual content. Among other things, this allows SPIRAL not to waste its capacity on the minutiae of datasets and to focus more on the global structure. The latent space of our model is easily interpretable by design and provides means for predictable image manipulation. We test our approach on several popular datasets and demonstrate its power and flexibility.
Keywords: Deep learning; Computer vision; Neural networks; Convolutional neural networks; Edge detection; Gaze correction; Spatial transformers; Domain adaptation; Adversarial; Generative models; Reinforcement learning; Inverse graphics
143. An Effective Framework of Autonomous Driving by Sensing Road/motion Profiles
Zheyuan Wang (11715263), 22 November 2021
With more and more videos taken from dash cams on thousands of cars, retrieving these videos and searching them for important information is a daunting task. The purpose of this work is to mine key road and vehicle motion attributes in a large-scale driving video data set for traffic analysis, sensing algorithm development and autonomous driving test benchmarks. Current sensing and control of autonomous cars, based on full-view identification, makes it difficult to maintain a high processing frequency in a fast-moving vehicle, since more and more computation is needed to cope with changes in the driving environment.
A big challenge in video data mining is how to deal with huge amounts of data. We use a compact representation called the road profile system to visualize the road environment in long 2D images. It reduces each video frame to one line, thereby compressing a video clip into a single image. This dimensionality reduction has several advantages. First, the data size is greatly reduced: because each frame is compressed into one line, the data shrinks by a factor of hundreds. Although the size and dimensionality of the data are greatly reduced, the useful information in the driving video is still completely preserved, and motion information is represented even more intuitively. Because of this reduction, identification is computationally more efficient than with the full-view method, which makes real-time identification on the road possible. Second, the data is easier to visualize: compressing three-dimensional video data into two-dimensional data is more conducive to visualization and mutual comparison. Third, continuously changing attributes are easier to show and capture. Because two-dimensional data is more convenient to visualize, the position, color and size of the same object across a few frames are easier to compare and capture. At the same time, in many cases the trouble caused by tracking and matching can be eliminated. Based on the road profile system, three autonomous driving tasks are addressed using road profile images.
The first application is road edge detection under different weather conditions and road appearances, for road following in autonomous driving; it captures the road profile image and the linearity profile image of the road profile system. This work uses naturalistic driving video data mining to study the appearance of roads, covering large-scale road data and its variations. A large number of naturalistic driving videos are mined to sample the light-sensitive area for its color feature distribution. The effective road contour image is extracted from long driving videos, thereby greatly reducing the amount of video data. The weather and lighting type can then be identified. For each weather and lighting condition, obvious features are identified at the edge of the road to distinguish the road edge.
The second application is detecting vehicle interactions in driving videos via the motion profile image of the road profile system. This work uses visual actions recorded in driving videos taken by a dashboard camera to identify these interactions. The motion profile images of the video are filtered at key locations, thereby reducing the complexity of object detection, depth sensing, target tracking and motion estimation. The purpose of this reduction is to support decision making for vehicle actions such as lane changing, vehicle following, and cut-in handling.
The third application is motion planning based on vehicle interactions and driving video. Taking note of the fact that a car travels in a straight line, we simply identify a few sample lines in the view to constantly scan the road, vehicles, and environment, generating only a portion of the entire video data. Without redundant data processing, we perform semantic segmentation on streaming road profile images. We plan the vehicle's path and motion using the smallest data set possible that contains all the information necessary for driving.
The results are obtained efficiently, and the accuracy is acceptable. The results can be used for driving video mining, traffic analysis, driver behavior understanding, etc.
Keywords: Applied Computer Science; data mining; road traffic; road vehicles; video signal processing; data visualisations; edge detection; feature selection; object detection; approaches; road safety; traffic engineering computing; feature extraction; algorithm; unsupervised learning; computer vision; algorithm; artificial intelligence; Mobile robots -- Automatic control; intelligent vehicle; intelligent vehicle systems; autonomous driving; ADAS; advanced driving assistance system
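The core idea of the road profile described in the abstract above, reducing every frame to one line and stacking those lines into a single long image, can be sketched as follows. This is a minimal illustration on synthetic data. The helper name `road_profile` and the choice of a fixed sampling row are assumptions, since the abstract does not say how the sampling lines are selected.

```python
def road_profile(frames, row):
    """Stack one pixel row from every frame into a single 2-D profile image.

    frames: list of frames, each a list of rows of grey values.
    row:    index of the sampling line (an assumed, fixed choice here).
    """
    return [list(frame[row]) for frame in frames]


# Synthetic "video": 5 frames, each 4 rows x 6 columns of grey values.
# A bright pixel (255) on row 2 moves one column to the right per frame.
frames = []
for t in range(5):
    frame = [[0] * 6 for _ in range(4)]
    frame[2][t] = 255
    frames.append(frame)

profile = road_profile(frames, row=2)  # 5 lines x 6 columns

for line in profile:
    print(line)
```

In the resulting image the vertical axis is time, so the moving pixel shows up as a diagonal trace. With one sampled row, the data shrinks by a factor equal to the frame height: 4 in this toy case, and hundreds for real video frames.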
144. Detekce jízdních pruhů a překážek / Traffic lanes and interruptions detection
Dojava, Marian, January 2011
This master's thesis deals with the use of a camera as the sensing element of a driver assistance system. It proposes how to find the road, the lanes and obstacles on the roadway using only a single camera. The solution relies on methods based on the color and gradient of the image, ranging from simple methods to methods that use a mathematical model. The result is a summary of the methods together with their testing and comparison. The implementation of the program is presented at the end of the thesis.
145. Analýza obrazu pro korekci elektronových mikroskopů / Image analysis for correction of electron microscopes
Smital, Petr, January 2011
This thesis describes the physical nature of corrections of an electron microscope and mathematical methods of image processing required for their complete automation. The corrections include different types of focusing, astigmatism correction, electron beam centring, and image stabilisation. The mathematical methods described in this thesis include various methods of measuring focus and astigmatism, with and without using the Fourier transform, edge detection, histogram operations, and image registration, i.e. detection of spatial transformations in images. This thesis includes detailed descriptions of the mathematical methods, their evaluation using an “offline” application, descriptions of the algorithms of their implementation into an actual electron microscope and results of their testing on the actual electron microscope, in the form of video footage grabbed from its control computer’s screen.
146. Rozpoznávání ručně psaného písma pomocí neuronových sítí / Handwritten Character Recognition Using Artificial Neural Networks
Horký, Vladimír, January 2012
This work presents neural networks trained with the back-propagation algorithm. The theoretical background of the algorithm is explained, and problems that arise when training neural networks are addressed. The work discusses several techniques for image preprocessing and image feature extraction, which is one of the main parts of classification. Part of the work describes experiments with neural networks using selected image features.
147. Detekce hran pomocí neuronové sítě / Neural Network Based Edge Detection
Janda, Miloš, January 2010
The aim of this thesis is to describe neural-network-based edge detection methods that can substitute for classic detection methods using edge operators. The first chapters give a general discussion of image processing, edge detection and neural networks. The main part shows the process of generating synthetic images and extracting training datasets, and discusses variants of neural network topologies suitable for edge detection. The last part of the thesis is dedicated to evaluating and measuring the accuracy of the neural network.
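For context on the "classic methods of detection using edge operators" that the abstract above contrasts with neural networks, here is a minimal sketch of one well-known operator, the Sobel gradient magnitude. The abstract does not say which operators the thesis considers, so the choice of Sobel and the function name `sobel_magnitude` are assumptions made for illustration.

```python
import math

# Sobel masks for the horizontal (x) and vertical (y) intensity derivatives.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Gradient magnitude at interior pixels; border pixels are left at 0."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            # Apply both 3x3 masks to the neighbourhood centred on (y, x).
            for dy in range(3):
                for dx in range(3):
                    p = image[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * p
                    gy += SOBEL_Y[dy][dx] * p
            out[y][x] = math.hypot(gx, gy)
    return out


# Vertical step edge: dark left half, bright right half.
image = [[0, 0, 0, 10, 10, 10] for _ in range(5)]
edges = sobel_magnitude(image)
```

On this step image the response is non-zero only in the two columns adjacent to the intensity jump. A learned detector of the kind the thesis studies would replace the fixed masks with weights estimated from training data.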
148. Une mesure de non-stationnarité générale : Application en traitement d'images et du signaux biomédicaux / A general non-stationarity measure : Application to biomedical image and signal processing
Xu, Yanli, 04 October 2013
Intensity variation is often used in signal or image processing algorithms after being quantified by a measurement method. The method for measuring and quantifying the intensity variation is called a "change measure", which is commonly used in methods for signal change detection, image edge detection, edge-based segmentation models, feature-preserving smoothing, etc. In these methods, the change measure plays such an important role that their performance is greatly affected by the result of the measurement of changes. When processing biomedical images or signals, the existing change measures may provide inaccurate information on changes, due to the high noise level or the strong randomness of the signals.
This leads to various undesirable phenomena in the results of such methods. On the other hand, new medical imaging techniques bring out new data types and require new change measures. How to robustly measure changes in these tensor-valued data becomes a new problem in image and signal processing. In this context, a change measure called the Non-Stationarity Measure (NSM) is improved and extended to become a general and robust change measure, able to quantify changes in multidimensional data of different types (scalar, vector or tensor) with respect to different statistical parameters. An NSM-based change detection method and an NSM-based edge detection method are proposed and applied, respectively, to detect changes in ECG and EEG signals and to detect edges in cardiac diffusion-weighted (DW) images. Experimental results show that the NSM-based detection methods provide more accurate positions of change points and edges and effectively reduce false detections. An NSM-based geometric active contour (NSM-GAC) model is proposed and applied to segment ultrasound images of the carotid. Experimental results show that the NSM-GAC model provides better segmentation results than the compared methods, with fewer iterations and less computation time, and reduces false contours and leakages. Last and most important, a new feature-preserving smoothing approach called "Nonstationarity adaptive filtering (NAF)" is proposed and applied to enhance human cardiac DW images. Experimental results show that the proposed method achieves a better compromise between the smoothness of homogeneous regions and the preservation of desirable features such as boundaries, thus leading to more homogeneous tensor fields and consequently to more coherent reconstructed cardiac fibers.
Keywords: Medical Imaging; Cardiac Imaging; Magnetic Resonance Image; Image Filtering; Adaptive filtering; Edge detection; Edge change detection; Non-stationary measure - NSM; Image Processing; Multi Dimensional Analytical Signal; Geometric active contour detection; 616.075 480 72
