251

Analyse des personnes dans les films stéréoscopiques / Person analysis in stereoscopic movies

Seguin, Guillaume 29 April 2016 (has links)
People are at the center of many computer vision tasks, such as surveillance systems or self-driving cars. They are also at the center of most visual content, potentially providing very large datasets for training models and algorithms. While stereoscopic data has long been studied, it is only recently that feature-length stereoscopic ("3D") movies became widely available. In this thesis, we study how to exploit the additional information provided by 3D movies for person analysis. We first explore how to extract a notion of depth from stereo movies in the form of disparity maps. We then evaluate how person detection and human pose estimation methods perform on such data. Leveraging the relative ease of the person detection task in 3D movies, we develop a method to automatically harvest examples of persons in 3D movies and train a person detector for standard color movies. We then focus on the task of segmenting multiple people in videos. We first propose a method to segment multiple people in 3D videos by combining cues derived from pose estimates with cues derived from disparity maps. We formulate the segmentation problem as a multi-label Conditional Random Field problem, and our method integrates an occlusion model to produce a layered, multi-instance segmentation. After showing the effectiveness of this approach as well as its limitations, we propose a second model which relies only on tracks of person detections and not on pose estimates.
We formulate this problem as a convex optimization: the minimization of a quadratic cost under linear equality and inequality constraints. These constraints weakly encode the localization information provided by the person detections. This method does not explicitly require pose estimates or disparity maps but can easily integrate these additional cues. It can also be used for segmenting instances of other object classes from videos. We evaluate all these aspects and demonstrate the superior performance of this new method.
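The quadratic-cost-under-linear-constraints formulation above can be sketched as an equality-constrained quadratic program solved through its KKT system. The matrices below are toy stand-ins, not the thesis's actual per-pixel costs or detection constraints:

```python
import numpy as np

# Toy quadratic program: minimize 0.5 x'Qx + c'x  subject to  E x = d.
# Sizes and values are illustrative only, not the thesis's formulation.
def solve_equality_qp(Q, c, E, d):
    """Solve the KKT linear system of an equality-constrained QP."""
    n, m = Q.shape[0], E.shape[0]
    K = np.block([[Q, E.T], [E, np.zeros((m, m))]])
    rhs = np.concatenate([-c, d])
    sol = np.linalg.solve(K, rhs)
    return sol[:n]  # primal variables (e.g. per-pixel label scores)

Q = np.array([[2.0, 0.0], [0.0, 2.0]])   # positive-definite cost
c = np.array([-2.0, -4.0])
E = np.array([[1.0, 1.0]])               # one linear constraint: x0 + x1 = 1
d = np.array([1.0])
x = solve_equality_qp(Q, c, E, d)        # projects the unconstrained optimum (1, 2) onto the constraint
```

Inequality constraints, as used in the thesis, would require an active-set or interior-point solver instead of one linear solve.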
252

Segmentation automatique d'images sur des critères géométriques, application à l'inspection visuelle de produits agroalimentaires / Automated segmentation of images using geometrical criteria, with application to the visual inspection of food products

Dubosclard, Pierre 25 January 2016 (has links)
In agriculture, the global grain harvest reaches several billion tons each year. Cereal producers exchange their crops at a price determined by the quality of their production. This assessment, called grading, is performed for each lot on a representative sample. The difficulty of this assessment is to fully characterize the sample. To do so, it is necessary to qualify each of its elements; in other words, each cereal grain must be evaluated individually. Historically, this has been performed manually by an operator who isolates each grain to inspect and evaluate it. This method is exposed to various problems. First, the results obtained by an operator are not perfectly repeatable: eyestrain, for example, can influence the assessment. Second, the evaluation depends on the operator and is therefore not reproducible: results can vary from one operator to another. The aim of this thesis is to develop a system that can perform this visual inspection. The acquisition system is introduced first: a cabin housing the lighting and image acquisition devices. Several tools were put in place to ensure the accuracy and stability of the acquisitions. A shape-model learning method is then detailed: based on an image of isolated objects, this step defines and models the grain shapes of the considered application (wheat, rice, barley). Two detection approaches are then introduced: a deterministic method and a probabilistic one. Both rely on common tools to segment the objects of an image, but they address the problem in different ways. The results presented in this thesis demonstrate the ability of the automatic system to serve as a reliable solution to the visual inspection of cereal grains.
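The deterministic route for isolated objects can be sketched with a threshold followed by connected-component labeling; the thesis's actual detectors use learned shape models, so this only illustrates the simplest case, on synthetic data:

```python
import numpy as np
from scipy import ndimage

# Hedged sketch: threshold the image, then label connected components as
# candidate grains. Threshold and images are made up for illustration.
def count_isolated_grains(image, threshold=0.5):
    mask = image > threshold
    labels, n = ndimage.label(mask)   # n = number of connected components
    return labels, n

img = np.zeros((10, 10))
img[1:3, 1:3] = 1.0   # first synthetic "grain"
img[6:9, 6:8] = 1.0   # second one
labels, n = count_isolated_grains(img)
```

Touching grains break this approach, which is precisely why the thesis develops shape-model-based deterministic and probabilistic detectors instead.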
253

UNRESTRICTED CONTROLLABLE ATTACKS FOR SEGMENTATION NEURAL NETWORKS

Guangyu Shen (8795963) 12 October 2021 (has links)
Despite the rapid development of adversarial attacks on machine learning models, many types of adversarial examples remain unknown. Undiscovered types of adversarial attacks pose a serious concern for the safety of the models, which raises the issue of the effectiveness of current adversarial robustness evaluation. Image semantic segmentation is a practical computer vision task, yet the robustness of segmentation networks under adversarial attacks has received insufficient attention. Recently, machine learning researchers have started to focus on generating adversarial examples beyond the norm-bound restriction for segmentation neural networks. In this thesis, a simple and efficient method, AdvDRIT, is proposed to synthesize unconstrained, controllable adversarial images by leveraging a conditional GAN. A simple CGAN yields poor image quality and low attack effectiveness; instead, the DRIT (Disentangled Representation Image Translation) structure is leveraged with a well-designed loss function, which can generate valid adversarial images in one step. AdvDRIT is evaluated on two large image datasets: ADE20K and Cityscapes. Experimental results show that AdvDRIT improves the quality of adversarial examples, decreasing the FID score down to 40% of that of state-of-the-art generative models such as Pix2Pix, and improves the attack success rate by 38% compared to other adversarial attack methods, including PGD.
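The FID metric used to judge adversarial image quality can be sketched directly from its definition (Fréchet distance between Gaussian fits of feature distributions); the random features below are stand-ins for real Inception activations:

```python
import numpy as np
from scipy.linalg import sqrtm

# Sketch of the Fréchet Inception Distance:
#   FID = ||mu_a - mu_b||^2 + Tr(C_a + C_b - 2 (C_a C_b)^(1/2))
def fid(feats_a, feats_b):
    mu_a, mu_b = feats_a.mean(0), feats_b.mean(0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    covmean = sqrtm(cov_a @ cov_b).real   # matrix square root; drop tiny imaginary parts
    return float(((mu_a - mu_b) ** 2).sum()
                 + np.trace(cov_a + cov_b - 2 * covmean))

rng = np.random.default_rng(0)
a = rng.normal(size=(200, 4))             # stand-in "real" features
b = rng.normal(loc=1.0, size=(200, 4))    # stand-in "generated" features
```

A distribution compared with itself gives a score near zero; lower scores between real and generated features mean better sample quality.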
254

Gore Classification and Censoring in Images

Larocque, William 30 November 2021 (has links)
With the large amount of content posted on the Internet every day, moderators, investigators, and analysts can be exposed to hateful, pornographic, or graphic content as part of their work. Exposure to this kind of content can have a severe impact on the mental health of these individuals, so measures must be taken to lessen their mental health burden. Significant effort has been made to find and censor pornographic content; gore has not been researched to the same extent. Research in this domain has focused on protecting the public from seeing graphic content in images, movies, or online videos. However, these solutions do little to flag this content for employees who need to review such footage as part of their work. In this thesis, we aim to address this problem by creating a full image processing pipeline to find and censor gore in images. This involves creating a dataset, as none is publicly available, and training and testing different machine learning solutions to automatically censor gore content. We propose an Image Processing Pipeline consisting of two models: a classification model, which determines whether the image contains gore, and a segmentation model, which censors the gore in the image. The classification results can be used to reduce accidental exposure to gore, for example by blurring the image in search results. They can also be used to reduce processing time and storage space by ensuring the segmentation model does not need to generate a censored image for every image submitted to the pipeline. Both models use pretrained Convolutional Neural Network (CNN) architectures and weights as part of their design and are fine-tuned using Machine Learning (ML). We have done so to maximize performance on the small dataset we gathered for these two tasks. The segmentation dataset contains 737 training images, while the classification dataset contains 3830 images.
We explored several variations of the proposed models, inspired by existing solutions in similar domains such as pornographic content detection and censoring and medical wound segmentation. These variations include Multiple Instance Learning (MIL), Generative Adversarial Networks (GANs), and Mask R-CNN. The best classification model we trained is a voting ensemble that combines the results of 4 classification models. This model achieved a 91.92% Double F1-Score, 87.30% precision, and 90.66% recall on the testing set. Our highest performing segmentation model achieved a testing Intersection over Union (IoU) value of 56.75%. However, when we employed the proposed Image Processing Pipeline (classification followed by segmentation), we achieved a testing IoU of 69.95%.
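The two-stage pipeline described above can be sketched as a cheap classifier gating an expensive segmenter; both models are stubs here, standing in for the fine-tuned CNNs:

```python
# Minimal sketch of the classify-then-segment pipeline. The dict-based
# "image" and the 0.5 threshold are illustrative assumptions, not the
# thesis's actual interfaces.
def classify(image):
    return image["gore_score"] > 0.5        # stand-in for the voting ensemble

def segment(image):
    return {"mask": "censored-region"}      # stand-in for the segmentation CNN

def pipeline(image):
    if not classify(image):
        return None   # skip segmentation: saves compute and storage
    return segment(image)

safe = pipeline({"gore_score": 0.1})
flagged = pipeline({"gore_score": 0.9})
```

Gating like this is what lets the pipeline avoid generating a censored image for every submitted picture, and it also explains why pipeline IoU (69.95%) can exceed the raw segmentation IoU: clean images never reach the segmenter.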
255

Segmentace ultrazvukových sekvencí / Ultrasound Image Sequences Segmentation

Kořínek, Peter January 2011 (has links)
When scanning image data with ultrasound, we have little information about the displayed scene. To understand the content of the image, we try to separate the observed objects of interest from the background; obtaining information about these objects is achieved through a process called segmentation. This work focuses on the segmentation of ultrasound image sequences using geometric active contours solved by the level-set method. To obtain better results, it also deals with image preprocessing. The result is an implementation of the segmentation methods on simulated and real data.
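The level-set mechanics behind geometric active contours can be sketched as an explicit update of an embedding function phi; real implementations add curvature and image-driven speed terms, so this constant-speed step only shows the basic machinery:

```python
import numpy as np

# Toy level-set step: evolve phi by a constant outward speed F with the
# explicit update  phi <- phi - dt * F * |grad phi|.  The contour is the
# zero level set of phi. Grid, speed, and step size are made up.
def level_set_step(phi, speed=1.0, dt=0.1, h=1.0):
    gy, gx = np.gradient(phi, h)                 # h = grid spacing
    grad_norm = np.sqrt(gx ** 2 + gy ** 2)
    return phi - dt * speed * grad_norm

x, y = np.meshgrid(np.linspace(-1, 1, 21), np.linspace(-1, 1, 21))
phi = np.sqrt(x ** 2 + y ** 2) - 0.5   # signed distance to a circle of radius 0.5
phi_next = level_set_step(phi, h=0.1)  # the circle expands outward one step
```

In a segmentation setting the speed would instead depend on local image statistics (e.g. an edge indicator from the preprocessed ultrasound frame), stopping the contour at object boundaries.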
256

Regularized neural networks for semantic image segmentation

Jia, Fan 10 September 2020 (has links)
Image processing consists of a series of tasks that appear widely in many areas: processing photos taken with personal cameras, radio astronomy, radar imaging, medical devices, and tomography. Among these tasks, image segmentation is fundamental to a range of applications and attracts researchers from many fields all over the world. Given an image defined over a domain, the segmentation task is to divide the domain into several sub-domains such that the pixels in each sub-domain share some common information. Variational methods have demonstrated their performance on all kinds of image processing problems, such as image denoising, image deblurring, and image segmentation, and they preserve image structures well. In recent decades, it has become increasingly popular to reformulate an image processing problem as an energy minimization problem, which is then solved by optimization-based methods. Meanwhile, convolutional neural networks (CNNs) have achieved outstanding results in a wide range of fields such as image processing, natural language processing, and video recognition. CNNs are data-driven techniques that often need large training datasets compared to other methods, such as variational ones. When handling image processing tasks with large-scale datasets, CNNs are the first choice due to their superior performance. However, semantic segmentation is a dense classification problem in which the class of each pixel is predicted independently, so the spatial regularity of the segmented objects remains a problem for these methods. Especially when given little training data, CNNs do not perform well on details: isolated and scattered small regions often appear in all kinds of CNN segmentation results.
In this thesis, we successfully add spatial regularization to the segmented objects. In our methods, spatial regularization such as total variation (TV) can be easily integrated into CNNs, producing smooth edges and eliminating isolated points. Spatial dependency is a very important prior for many image segmentation tasks. Generally, convolutional operations are building blocks that process one local neighborhood at a time, which means CNNs usually do not explicitly make use of this spatial prior. Empirical evaluations of the regularized neural networks on a series of image segmentation datasets show good performance and the ability to improve many image segmentation CNNs. We also design a recurrent structure composed of multiple TV blocks; applying this structure to a popular segmentation CNN further improves the segmentation results. This is an end-to-end framework to regularize the segmentation results: it gives smooth edges and eliminates isolated points, and compared to other post-processing methods it needs little extra computation, making it both effective and efficient. Since long-range dependency is also very important for semantic segmentation, we further present a non-local regularized softmax activation function for semantic image segmentation tasks. We introduce graph operators into CNNs by integrating a non-local total variation regularizer into the softmax activation function, and we compute the non-local regularized softmax with the primal-dual hybrid gradient method. Experiments show that the non-local regularized softmax activation function can bring a regularization effect and preserve object details at the same time.
257

Morfologická segmentace v češtině s využitím slovotvorné sítě / Morphological Segmentation in Czech using Word-Formation Network

Bodnár, Jan January 2020 (has links)
Morphological segmentation is the segmentation of words into morphemes, the smallest units carrying meaning. It is a low-level Natural Language Processing task. Since morphological segmentation is sometimes used for preprocessing, achieving better results on this task may help NLP algorithms solve various problems, especially in scenarios involving small amounts of data, and it may also help linguistic research. We propose a novel ensemble algorithm for the morphological segmentation of Czech lemmas which makes use of the DeriNet derivation tree dataset. As a side product, we also created suggestions for improvements to the DeriNet dataset.
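The ensemble idea can be sketched as majority voting over boundary positions proposed by base segmenters; the real ensemble and its DeriNet-based components are more involved, and the base proposals below are hypothetical:

```python
from collections import Counter

# Hedged sketch: each base segmenter proposes a set of cut positions for a
# lemma; positions proposed by a majority are kept and used to split the word.
def majority_segmentation(word, proposals, quorum=None):
    quorum = quorum or (len(proposals) // 2 + 1)
    counts = Counter(pos for p in proposals for pos in p)
    cuts = sorted(pos for pos, c in counts.items() if c >= quorum)
    pieces, last = [], 0
    for pos in cuts + [len(word)]:
        pieces.append(word[last:pos])
        last = pos
    return [piece for piece in pieces if piece]

# three hypothetical base segmenters voting on Czech "učitelka" (fem. teacher)
segs = majority_segmentation("učitelka", [{6, 7}, {6}, {6, 7}])
```

Voting over positions rather than over whole segmentations lets the ensemble keep a boundary that most, but not all, base systems agree on.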
258

Shape-Tailored Invariant Descriptors for Segmentation

Khan, Naeemullah 11 1900 (has links)
Segmentation is one of the first steps in the human visual system that help us see the world around us: humans pre-attentively segment scenes into regions of unique texture in around 10-20 ms. In this thesis, we address the problem of segmentation by grouping dense pixel-wise descriptors. Our work is based on the fact that human vision has a feed-forward and a feed-backward loop: low-level features are used to refine high-level features in the forward pass, and higher-level feature information is used to refine the low-level features in the backward pass. Most vision algorithms rely only on a feed-forward loop, where low-level features are used to construct and refine high-level features, without the feedback loop. We introduce "Shape-Tailored Local Descriptors", where high-level feature information (the region approximation) is used to update the low-level features, i.e., the descriptor, and the low-level descriptor information is used to update the segmentation regions. Shape-Tailored Local Descriptors are dense local descriptors tailored to an arbitrarily shaped region, aggregating data only within the region of interest. Since the segmentation, i.e., the regions, is not known a priori, we propose a joint problem for Shape-Tailored Local Descriptors and segmentation (regions). Furthermore, since natural scenes consist of multiple objects, which may have different visual textures at different scales, we propose a multi-scale approach to segmentation; both a set of discrete scales and a continuum of scales were used in our experiments, and both resulted in state-of-the-art performance. Lastly, we look into the nature of the selected features: we tried handcrafted color and gradient channels, and we also introduce an algorithm to incorporate the learning of optimal descriptors into segmentation approaches.
In the final part of this thesis we introduce techniques for the unsupervised learning of descriptors for segmentation. This avoids the main drawback of deep learning methods, which need huge amounts of training data: the optimal descriptors are learned, without any training data, on the fly during segmentation.
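The core "shape-tailored" idea can be sketched as aggregating a feature channel only over the pixels of a candidate region, so the descriptor never mixes statistics across a region boundary. The intensity statistics below are simple stand-ins for the thesis's PDE-smoothed channels:

```python
import numpy as np

# Hedged sketch: a descriptor computed only inside an arbitrarily shaped
# region mask, rather than over a square patch that may straddle a boundary.
def shape_tailored_descriptor(channel, region_mask):
    vals = channel[region_mask]
    return np.array([vals.mean(), vals.std()])   # toy 2-D descriptor

img = np.zeros((8, 8))
img[:, 4:] = 1.0                      # two "textures" split at column 4
left = np.zeros((8, 8), dtype=bool)
left[:, :4] = True
d_left = shape_tailored_descriptor(img, left)    # statistics of the left region
d_right = shape_tailored_descriptor(img, ~left)  # statistics of the right region
```

Because each descriptor depends on the current region estimate, descriptors and regions must be optimized jointly, which is exactly the coupled problem the thesis formulates.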
259

Template Matching on Vector Fields using Clifford Algebra

Ebling, J., Scheuermann, G. 14 December 2018 (has links)
Due to the amount of flow simulation and measurement data, automatic detection, classification, and visualization of features are necessary for inspection, and many automated feature detection methods have been developed in recent years. However, in most cases only one feature class is visualized at a time, and many algorithms have problems in the presence of noise or superposition effects. In contrast, image processing and computer vision have robust methods for feature extraction and the computation of derivatives of scalar fields; furthermore, interpolation and other filters can be analyzed in detail. Applying these methods to vector fields would provide a solid theoretical basis for feature extraction. The authors suggest Clifford algebra as a mathematical framework for this task. Clifford algebra provides a unified notation for scalars and vectors, as well as a multiplication of all basis elements. The Clifford product of two vectors captures the complete geometric information about their relative positions. Integrating this product yields Clifford correlation and convolution, which can be used for template matching on vector fields. Furthermore, for the frequency analysis of vector fields and the behavior of vector-valued filters, a Clifford Fourier transform has been derived for two and three dimensions. Convolution and other theorems have been proved, and fast algorithms for computing the Clifford Fourier transform exist; the computation of the Clifford convolution can therefore be accelerated by performing it in the Clifford Fourier domain. Clifford convolution and the Clifford Fourier transform can be used for a thorough analysis and subsequent visualization of vector fields.
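The geometric (Clifford) product underlying Clifford correlation can be sketched for plane vectors, where it splits into a scalar (inner) part and a bivector (wedge) part that together encode relative length and angle:

```python
import numpy as np

# Sketch of the 2D geometric product of two plane vectors u, v:
#   u v = (u . v)  +  (u ^ v) e1e2
# The scalar part measures alignment, the bivector part signed area/rotation.
def geometric_product_2d(u, v):
    scalar = u[0] * v[0] + u[1] * v[1]     # inner product  u . v
    bivector = u[0] * v[1] - u[1] * v[0]   # wedge product  u ^ v  (e1e2 coefficient)
    return scalar, bivector

u = np.array([1.0, 0.0])
v = np.array([0.0, 1.0])
s, b = geometric_product_2d(u, v)   # orthogonal vectors: pure bivector
```

Summing (integrating) this product of a template against a vector field over a window is the Clifford correlation described above: the scalar channel responds to parallel flow, the bivector channel to rotational alignment.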
260

Efficient rendering of real-world environments in a virtual reality application, using segmented multi-resolution meshes

Chiromo, Tanaka Alois January 2020 (has links)
Virtual reality (VR) applications are becoming increasingly popular and are being used in various settings. VR applications can be used to simulate large real-world landscapes in a computer program for purposes such as entertainment, education, or business. Typically, 3-dimensional (3D) and VR applications use environments that are made up of meshes of relatively small size. As the size of the meshes increases, the applications start experiencing lag and run-time memory errors, so it is inefficient to load large meshes into a VR application directly. Manually modelling an accurate real-world environment can also be a complicated task, due to the large size and complex nature of the landscapes. In this research, a method to automatically convert 3D point-clouds of any size and complexity into a format that can be efficiently rendered in a VR application is proposed. Apart from reducing the performance cost, the solution also reduces the risk of VR-induced motion sickness. The pipeline of the system incorporates three main steps: a surface reconstruction step, a texturing step, and a segmentation step. The surface reconstruction step is necessary to convert the 3D point-clouds into 3D triangulated meshes. Texturing is required to add a realistic feel to the appearance of the meshes. Segmentation is used to split large meshes into smaller components that can be rendered individually without overflowing the memory. A novel mesh segmentation algorithm, the Triangle Pool Algorithm (TPA), is designed to segment the mesh into smaller parts. To avoid using the complex geometric and surface features of natural scenes, the TPA algorithm uses the colour attribute of the natural scenes for segmentation.
The TPA algorithm produces results comparable to those of state-of-the-art 3D segmentation algorithms when segmenting regular 3D objects, and outperforms them when segmenting meshes of real-world natural landscapes. The VR application is designed using the Unreal and Unity 3D engines. Its principle of operation is to render regions close to the user with multiple highly detailed mesh segments, while regions further away are rendered as a lower-detail mesh. Segments that are not rendered at a particular time are kept in external storage. This principle frees up memory and reduces the computational power required to render highly detailed meshes. / Dissertation (MEng)--University of Pretoria, 2020. / Electrical, Electronic and Computer Engineering / MEng / Unrestricted
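The distance-based rendering policy described above can be sketched as a simple level-of-detail selection per mesh segment; the thresholds and labels are illustrative assumptions, not values from the dissertation:

```python
# Hedged sketch of the rendering policy: near segments get the high-detail
# mesh, mid-range segments a low-detail mesh, and far segments stay unloaded
# in external storage. The near/far thresholds are made up.
def choose_lod(segment_distance, near=10.0, far=50.0):
    if segment_distance <= near:
        return "high-detail"
    if segment_distance <= far:
        return "low-detail"
    return "unloaded"   # fetched from external storage only when needed

lods = [choose_lod(d) for d in (5.0, 30.0, 120.0)]
```

Re-evaluating this choice as the user moves, and streaming "unloaded" segments in before they cross the far threshold, is what keeps memory use bounded without visible popping.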
