121

Dynamic recognition of objects left behind by people (人が放置する物体の動的認識)

WATANABE, Takashi, MAEDA, Yuki 08 1900 (has links)
No description available.
122

Detection of black-backed jackal in still images

Pathare, Sneha P. 03 1900 (has links)
Thesis (MSc)--Stellenbosch University, 2015. / ENGLISH ABSTRACT: In South Africa, black-backed jackal (BBJ) predation of sheep causes heavy losses to sheep farmers. Different control measures such as shooting, gin-traps and poisoning have been used to control the jackal population; however, these techniques also kill many harmless animals, as they fail to differentiate between BBJ and harmless animals. In this project, a system is implemented to detect black-backed jackal faces in images. The system was implemented using the Viola-Jones object detection algorithm. This algorithm was originally developed to detect human faces, but can also be used to detect a variety of other objects. The three key features of the Viola-Jones algorithm are the representation of an image as a so-called "integral image", the use of the Adaboost boosting algorithm for feature selection, and the use of a cascade of classifiers to reduce false alarms. In this project, Python code was developed to extract Haar features from BBJ images; each feature acts as a weak classifier that distinguishes between a BBJ and the background. Furthermore, feature selection is done using the Asymboost instead of the Adaboost algorithm, so as to achieve a high detection rate and a low false positive rate. A cascade of strong classifiers is trained using a cascade learning algorithm. The inclusion of a special fifth type of Haar feature, adapted to the relative spacing of the jackal's eyes, improves accuracy further. The final system detects 78% of the jackal faces, while only 0.006% of other image frames are wrongly identified as faces. / AFRIKAANSE OPSOMMING (translated): Black-backed jackals cause heavy livestock losses in South Africa. Countermeasures such as hunting, gin-traps and poisoning are widely used, but are not selective enough and therefore also kill many non-target species. In this project a system was developed to find black-backed jackal faces in still images. The Viola-Jones detection algorithm, originally developed for the detection of human faces, was used for this purpose. Three key aspects of this algorithm are the representation of an image by means of a so-called integral image, the use of the Adaboost algorithm to select suitable features, and the use of a cascade of classifiers to lower false-alarm rates. In this project Python code was developed to extract the most useful Haar features for detecting these jackals. Experiments were done to contrast the usefulness of the Asymboost algorithm with that of the Adaboost algorithm. A cascade of classifiers was trained and compared for both of these techniques. The results show that the features produced by the Asymboost algorithm lead to lower false-alarm rates. The addition of a special fifth type of Haar feature, adapted to the relative spacing of the jackal's eyes, increases the accuracy further. The final system finds 78% of the faces, while only 0.006% of other image frames are wrongly classified as faces.
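To make the integral-image idea concrete, here is a minimal Python sketch (illustrative only, not the thesis code; all function names are ours): it builds an integral image and uses it to evaluate a two-rectangle Haar-like feature with a constant number of array lookups, regardless of window size.

    import numpy as np

    def integral_image(img):
        # ii[y, x] holds the sum of all pixels above and to the
        # left of (y, x), inclusive.
        return img.astype(np.float64).cumsum(axis=0).cumsum(axis=1)

    def box_sum(ii, y0, x0, y1, x1):
        # Sum over the inclusive rectangle [y0..y1] x [x0..x1]
        # using at most four lookups, independent of its size.
        s = ii[y1, x1]
        if y0 > 0:
            s -= ii[y0 - 1, x1]
        if x0 > 0:
            s -= ii[y1, x0 - 1]
        if y0 > 0 and x0 > 0:
            s += ii[y0 - 1, x0 - 1]
        return s

    def two_rect_feature(ii, y, x, h, w):
        # A two-rectangle Haar-like feature: difference between the
        # pixel sums of the left and right halves of a window.
        half = w // 2
        left = box_sum(ii, y, x, y + h - 1, x + half - 1)
        right = box_sum(ii, y, x + half, y + h - 1, x + w - 1)
        return left - right

In the full detector, a boosting algorithm such as Adaboost or Asymboost selects and weights many such features, and the resulting strong classifiers are chained into the cascade described above.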
123

Visual saliency computation for image analysis

Zhang, Jianming 08 December 2016 (has links)
Visual saliency computation is about detecting and understanding salient regions and elements in a visual scene. Algorithms for visual saliency computation can give clues to where people will look in images, what objects are visually prominent in a scene, etc. Such algorithms could be useful in a wide range of applications in computer vision and graphics. In this thesis, we study the following visual saliency computation problems. 1) Eye Fixation Prediction. Eye fixation prediction aims to predict where people look in a visual scene. For this problem, we propose a Boolean Map Saliency (BMS) model which leverages the global surroundedness cue using a Boolean map representation. We draw a theoretical connection between BMS and the Minimum Barrier Distance (MBD) transform to provide insight into our algorithm. Experimental results show that BMS compares favorably with state-of-the-art methods on seven benchmark datasets. 2) Salient Region Detection. Salient region detection entails computing a saliency map that highlights the regions of dominant objects in a scene. We propose a salient region detection method based on the MBD transform. We present a fast approximate MBD transform algorithm with an error-bound analysis. Powered by this fast MBD transform algorithm, our method can run at about 80 FPS and achieve state-of-the-art performance on four benchmark datasets. 3) Salient Object Detection. Salient object detection aims at localizing each salient object instance in an image. We propose a method using a Convolutional Neural Network (CNN) model for proposal generation and a novel subset optimization formulation for bounding box filtering. In experiments, our subset optimization formulation consistently outperforms heuristic bounding box filtering baselines, such as non-maximum suppression, and our method substantially outperforms previous methods on three challenging datasets. 4) Salient Object Subitizing. We propose a new visual saliency computation task, called Salient Object Subitizing, which is to predict the existence and the number of salient objects in an image using holistic cues. To this end, we present a dataset of about 14K everyday images, annotated using an online crowdsourcing marketplace. We show that an end-to-end trained CNN subitizing model can achieve promising performance without requiring any localization process. A method is proposed to further improve the training of the CNN subitizing model by leveraging synthetic images. 5) Top-down Saliency Detection. Unlike the aforementioned tasks, top-down saliency detection entails generating task-specific saliency maps. We propose a weakly supervised top-down saliency detection approach by modeling the top-down attention of a CNN image classifier. We propose Excitation Backprop and the concept of contrastive attention to generate highly discriminative top-down saliency maps. Our top-down saliency detection method achieves superior performance in weakly supervised localization tasks on challenging datasets. The usefulness of our method is further validated in the text-to-region association task, where it provides state-of-the-art performance using only weakly labeled web images for training.
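As a rough illustration of the global surroundedness cue behind BMS, the following Python/OpenCV sketch (a simplification of the published algorithm: grayscale only, fixed threshold step, no morphological opening) thresholds the image at multiple levels and, in each Boolean map, keeps only the regions that do not touch the image border:

    import cv2
    import numpy as np

    def boolean_map_saliency(gray, step=16):
        h, w = gray.shape
        attention = np.zeros((h, w), np.float32)
        n = 0
        for thresh in range(0, 256, step):
            for bmap in (gray > thresh, gray <= thresh):
                bmap = bmap.astype(np.uint8)
                # Label connected regions; any region touching the
                # border is not "surrounded" and is discarded.
                _, labels = cv2.connectedComponents(bmap)
                border = np.unique(np.concatenate(
                    [labels[0], labels[-1], labels[:, 0], labels[:, -1]]))
                surrounded = (bmap > 0) & ~np.isin(labels, border)
                attention += surrounded.astype(np.float32)
                n += 1
        # Average over all Boolean maps and normalize to [0, 1].
        return cv2.normalize(attention / n, None, 0, 1, cv2.NORM_MINMAX)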
124

Image context for object detection, object context for part detection

Gonzalez-Garcia, Abel January 2018 (has links)
Objects and parts are crucial elements for achieving automatic image understanding. The goal of the object detection task is to recognize and localize all the objects in an image. Similarly, semantic part detection attempts to recognize and localize the object parts. This thesis proposes four contributions. The first two make object detection more efficient by using active search strategies guided by image context. The last two involve parts: one explores the emergence of parts in neural networks trained for object detection, whereas the other improves part detection by adding object context. First, we present an active search strategy for efficient object class detection. Modern object detectors evaluate a large set of windows using a window classifier. Instead, our search sequentially chooses which window to evaluate next based on all the information gathered before. This results in a significant reduction in the number of window evaluations necessary to detect the objects in the image. We guide our search strategy using image context and the score of the classifier. In our second contribution, we extend this active search to jointly detect pairs of object classes that appear close together in the image, exploiting the valuable information that one class can provide about the location of the other. This leads to an even further reduction in the number of necessary evaluations for the smaller, more challenging classes. In the third contribution of this thesis, we study whether semantic parts emerge in Convolutional Neural Networks trained for different visual recognition tasks, especially object detection. We perform two quantitative analyses that provide a deeper understanding of their internal representation by investigating the responses of the network filters. Moreover, we explore several connections between discriminative power and semantics, which provides further insight into the role of semantic parts in the network. Finally, the last contribution is a part detection approach that exploits object context. We complement part appearance with the object appearance, its class, and the expected relative location of the parts inside it. We significantly outperform approaches that use part appearance alone in this challenging task.
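The sketch below conveys the flavour of such an active search in Python; it is a schematic reconstruction under our own assumptions (Gaussian score propagation, a greedy argmax policy), not the thesis's actual formulation:

    import numpy as np

    def active_search(windows, score_fn, context_prior,
                      n_evals=100, sigma=60.0):
        # windows: (N, 4) array of [x, y, w, h] candidate windows.
        # context_prior: length-N initial belief from image context.
        # score_fn: evaluates the window classifier on one window.
        belief = context_prior.astype(np.float64).copy()
        evaluated = np.zeros(len(windows), dtype=bool)
        centers = windows[:, :2] + windows[:, 2:] / 2.0
        detections = []
        for _ in range(min(n_evals, len(windows))):
            belief[evaluated] = -np.inf      # never revisit a window
            i = int(np.argmax(belief))
            s = score_fn(windows[i])
            evaluated[i] = True
            if s > 0:
                detections.append((windows[i], s))
            # Propagate the observed score to spatially nearby windows,
            # so a high score attracts the search and a low one repels it.
            d2 = ((centers - centers[i]) ** 2).sum(axis=1)
            belief += s * np.exp(-d2 / (2.0 * sigma ** 2))
        return detections

The point of the loop is that each classifier evaluation updates the belief over all remaining windows, so far fewer evaluations are needed than in exhaustive sliding-window scanning.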
125

Localizing spatially and temporally objects and actions in videos

Kalogeiton, Vasiliki January 2018 (has links)
The rise of deep learning has facilitated remarkable progress in video understanding. This thesis addresses three important tasks of video understanding: video object detection, joint object and action detection, and spatio-temporal action localization. Object class detection is one of the most important challenges in computer vision. Object detectors are usually trained on bounding boxes from still images. Recently, video has been used as an alternative source of data. Yet, training an object detector on one domain (either still images or videos) and testing on the other results in a significant performance gap compared to training and testing on the same domain. In the first part of this thesis, we examine the reasons behind this performance gap. We define and evaluate several domain shift factors: spatial location accuracy, appearance diversity, image quality, aspect distribution, and object size and camera framing. We examine the impact of these factors by comparing the detection performance before and after cancelling them out. The results show that all five factors affect the performance of the detectors and that their combined effect explains the performance gap. While most existing approaches for detection in videos focus on objects or human actions separately, in the second part of this thesis we aim at detecting non-human-centric actions, i.e., objects performing actions, such as cat eating or dog jumping. We introduce an end-to-end multitask objective that jointly learns object-action relationships. We compare it with different training objectives, validate its effectiveness for detecting object-action pairs in videos, and show that both tasks of object and action detection benefit from this joint learning. In experiments on the A2D dataset [Xu et al., 2015], we obtain state-of-the-art results on segmentation of object-action pairs. In the third part, we are the first to propose an action tubelet detector that leverages the temporal continuity of videos instead of operating at the frame level, as state-of-the-art approaches do. Just as modern detectors rely on anchor boxes, our tubelet detector is based on anchor cuboids: it takes as input a sequence of frames and outputs tubelets, i.e., sequences of bounding boxes with associated scores. Our tubelet detector outperforms the state of the art on the UCF-Sports [Rodriguez et al., 2008], J-HMDB [Jhuang et al., 2013a], and UCF-101 [Soomro et al., 2012] action localization datasets, especially at high overlap thresholds. The improvement in detection performance is explained by both more accurate scores and more precise localization.
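To illustrate the anchor-cuboid idea, here is a minimal Python sketch (our own simplification, not the authors' implementation): anchor boxes are replicated across the K input frames, and tubelet overlap is scored as the mean per-frame IoU, a common criterion when evaluating at high overlap thresholds:

    import numpy as np

    def anchor_cuboids(anchors, seq_len):
        # Replicate each 2D anchor box across the K frames of the
        # input sequence, giving (A, K, 4) anchor cuboids.
        return np.repeat(anchors[:, None, :], seq_len, axis=1)

    def tubelet_iou(tube_a, tube_b):
        # Mean per-frame IoU between two tubelets of shape (K, 4),
        # with boxes given as [x1, y1, x2, y2].
        ious = []
        for (ax1, ay1, ax2, ay2), (bx1, by1, bx2, by2) in zip(tube_a, tube_b):
            iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
            ih = max(0.0, min(ay2, by2) - max(ay1, by1))
            inter = iw * ih
            union = ((ax2 - ax1) * (ay2 - ay1)
                     + (bx2 - bx1) * (by2 - by1) - inter)
            ious.append(inter / union if union > 0 else 0.0)
        return float(np.mean(ious))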
126

Object Detection using deep learning and synthetic data

Lidberg, Love January 2018 (has links)
This thesis investigates how synthetic data can be utilized when training convolutional neural networks to detect flags with threatening symbols. The synthetic data used in this thesis consisted of rendered 3D flags with different textures and of flags cut out from real images. Training on synthetic data alone achieved an accuracy above 80%, compared to the 88% accuracy achieved by a dataset containing only real images. The highest accuracy was achieved by combining real and synthetic data, showing that synthetic data can be used as a complement to real data. Some attempts were also made to improve the accuracy using generative adversarial networks, without achieving any encouraging results.
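A combined real-plus-synthetic training set of this kind can be assembled in a few lines; this PyTorch sketch is hypothetical (the folder names are assumptions, and the thesis does not specify its framework):

    from torch.utils.data import ConcatDataset, DataLoader
    from torchvision import datasets, transforms

    # Assumed layout: each top-level directory contains one
    # subfolder per class; both directories must use identical
    # subfolder names so that class indices line up.
    tf = transforms.Compose([transforms.Resize((224, 224)),
                             transforms.ToTensor()])
    real = datasets.ImageFolder("data/real_flags", transform=tf)
    synthetic = datasets.ImageFolder("data/synthetic_flags", transform=tf)

    # Real and synthetic images are mixed into a single training set.
    train_set = ConcatDataset([real, synthetic])
    loader = DataLoader(train_set, batch_size=32, shuffle=True)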
127

Object Detection and Tracking

Al-Ridha, Moatasem Yaseen 01 May 2013 (has links)
An improved object tracking algorithm based on Kalman filtering is developed in this thesis. The algorithm uses a median filter and morphological operations during tracking. The problem created by object shadows is identified, and the primary focus is to incorporate shadow detection and removal to improve the tracking of multiple objects in complex scenes. It is shown that the Kalman filter, without the improvements, fails to remove shadows that connect different objects. The application of the median filter helps to separate different objects and thus enables multiple objects to be tracked individually. The performance of the Kalman filter and of the improved tracking algorithm was tested on a highway video sequence of moving cars, and it is shown that the proposed algorithm yields better performance in the presence of shadows.
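A minimal OpenCV sketch of this kind of pipeline is shown below; it is an approximation under our own assumptions (the MOG2 background subtractor's shadow flag stands in for the thesis's shadow detection and removal step, and the filter parameters are illustrative):

    import cv2
    import numpy as np

    backsub = cv2.createBackgroundSubtractorMOG2(detectShadows=True)

    kalman = cv2.KalmanFilter(4, 2)  # state [x, y, vx, vy], measurement [x, y]
    kalman.transitionMatrix = np.array([[1, 0, 1, 0],
                                        [0, 1, 0, 1],
                                        [0, 0, 1, 0],
                                        [0, 0, 0, 1]], np.float32)
    kalman.measurementMatrix = np.eye(2, 4, dtype=np.float32)
    kalman.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-3

    def track_frame(frame):
        fg = backsub.apply(frame)
        fg[fg == 127] = 0              # MOG2 marks shadow pixels as 127
        fg = cv2.medianBlur(fg, 5)     # median filter separates nearby blobs
        fg = cv2.morphologyEx(fg, cv2.MORPH_OPEN,
                              np.ones((5, 5), np.uint8))
        contours, _ = cv2.findContours(fg, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        prediction = kalman.predict()  # predicted position even with no blob
        if contours:
            x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))
            kalman.correct(np.array([[x + w / 2.0],
                                     [y + h / 2.0]], np.float32))
        return prediction[:2]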
128

Robust visual detection and tracking of complex objects: applications to space autonomous rendez-vous and proximity operations

Petit, Antoine 19 December 2013 (has links)
In this thesis, we address the issue of fully localizing a known object through computer vision, using a monocular camera, which is a central problem in robotics. Particular attention is paid to space robotics applications, with the aim of providing a unified visual localization system for autonomous navigation during space rendezvous and proximity operations. Two main challenges of the problem are tackled: initially detecting the targeted object, and then tracking it frame by frame, providing the complete pose between the camera and the object, given the 3D CAD model of the object. For detection, the pose estimation process is based on segmentation of the moving object and on an efficient probabilistic edge-based matching and alignment procedure that registers a set of synthetic views of the object with a sequence of initial images. For the tracking phase, pose estimation is handled by a 3D model-based tracking algorithm, for which we propose three different types of visual features that pertinently represent the object: its edges, its silhouette, and a set of interest points. The reliability of the localization process is evaluated by propagating the uncertainty from the errors of the visual features. This uncertainty also feeds a linear Kalman filter on the camera velocity parameters. Qualitative and quantitative experiments have been performed on various synthetic and real data, with challenging imaging conditions, showing the efficiency and benefits of the different contributions and their compliance with space rendezvous applications.
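One common way to score an edge-based alignment of projected model contours against image edges is a chamfer-style distance; the Python/OpenCV sketch below is a generic illustration of that idea, not the probabilistic procedure developed in the thesis:

    import cv2
    import numpy as np

    def chamfer_alignment_score(edge_map, projected_points):
        # edge_map: 8-bit image from an edge detector (255 on edges).
        # projected_points: (N, 2) array of model edge points, as
        # [x, y], projected under a candidate pose.
        inv = cv2.bitwise_not(edge_map)           # edge pixels become 0
        dist = cv2.distanceTransform(inv, cv2.DIST_L2, 3)
        pts = np.round(projected_points).astype(int)
        pts[:, 0] = np.clip(pts[:, 0], 0, dist.shape[1] - 1)
        pts[:, 1] = np.clip(pts[:, 1], 0, dist.shape[0] - 1)
        # Mean distance from each projected model point to the nearest
        # image edge; lower scores mean better alignment.
        return float(dist[pts[:, 1], pts[:, 0]].mean())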
129

Content Detection in Handwritten Documents

January 2018 (has links)
abstract: Handwritten documents have gained popularity in various domains including education and business. A key task in analyzing a complex document is to distinguish between various content types such as text, math, graphics, tables and so on. For example, a region of the document containing a mathematical expression would be labelled math. This differentiation facilitates the performance of specific recognition tasks depending on the content type. We hypothesize that the recognition accuracy of subsequent tasks such as textual, math, and shape recognition will increase, further leading to a better analysis of the document. Content detection on handwritten documents assigns a particular class to each homogeneous portion of the document. To complete this task, a set of handwritten solutions was digitally collected from middle school students located in two different geographical regions in 2017 and 2018. This research discusses the methods to collect, pre-process and detect content type in the collected handwritten documents. A total of 4049 documents were extracted in image and JSON formats and were labelled using object-labelling software, with the tags being text, math, diagram, cross out, table, graph, tick mark, arrow, and doodle. The labelled images were fed to TensorFlow's Object Detection API to train neural network models. We show results from two such models: Faster Region-based Convolutional Neural Network (Faster R-CNN) and Single Shot Detector (SSD). / Dissertation/Thesis / Masters Thesis Computer Science 2018
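Preparing such annotations for a detector typically starts with mapping the nine content tags to integer class IDs and parsing the labelled regions; the Python sketch below is hypothetical (the JSON schema and field names are assumptions, since the labelling tool's export format is not specified):

    import json

    # Hypothetical tag-to-ID map for the nine content classes above.
    LABEL_MAP = {tag: i + 1 for i, tag in enumerate(
        ["text", "math", "diagram", "cross out", "table",
         "graph", "tick mark", "arrow", "doodle"])}

    def load_annotations(path):
        # Assumed schema: a top-level "regions" list whose entries
        # hold a tag plus x/y/width/height in pixel coordinates.
        with open(path) as f:
            doc = json.load(f)
        boxes, labels = [], []
        for region in doc.get("regions", []):
            boxes.append([region["x"], region["y"],
                          region["x"] + region["width"],
                          region["y"] + region["height"]])
            labels.append(LABEL_MAP[region["tag"]])
        return boxes, labels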
130

A structural approach for object detection and indoor localization with mobile devices

Henrique Morimitsu 29 August 2011 (has links)
Object detection is an area of extreme importance for computer vision systems. Especially given the increasing use of mobile devices, it becomes more and more important to develop methods and applications that can run on such devices. In this sense, we propose the study and implementation of an application for mobile devices that detects, in real time, objects in indoor environments, with an application to help users localize themselves in the environment. The application depends solely on the device's own capabilities and is therefore flexible and unconstrained. Object detection is accomplished by keygraph matching between images of previously chosen signs and the image currently being captured by the device's camera. Keygraphs are a generalization of the traditional keypoint method and, by taking into consideration a set of points and their structural properties, are capable of describing objects robustly and efficiently. To perform localization, we chose to detect signs existing in the environment. After each detection, we apply a simple but very effective localization method based on a comparison between the detected sign and a dataset of images of the whole environment. The dataset was built using several cameras atop a mobile structure, systematically capturing images of the environment at regular intervals. The implementation is described in detail, and we show results obtained from real tests in the chosen environment using a Nokia N900 cell phone. These results are evaluated in terms of the precision of detection and localization estimation, as well as the time taken to perform the whole process.
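Keygraph matching itself is beyond a short example, but its core intuition (matches must agree not just in appearance but in the structure of the point set) can be sketched with standard OpenCV keypoints; everything here, from ORB to the consistency test, is our own stand-in rather than the thesis's method:

    import cv2
    import numpy as np

    def match_with_structure(query, scene, ratio=0.75, tol=0.2):
        # Match ORB keypoints between a sign image and the camera
        # frame, then keep only matches whose pairwise distances to
        # the other matched points scale consistently across images.
        orb = cv2.ORB_create(1000)
        kq, dq = orb.detectAndCompute(query, None)
        ks, ds = orb.detectAndCompute(scene, None)
        if dq is None or ds is None:
            return []
        pairs = cv2.BFMatcher(cv2.NORM_HAMMING).knnMatch(dq, ds, k=2)
        good = [p[0] for p in pairs
                if len(p) == 2 and p[0].distance < ratio * p[1].distance]
        if len(good) < 3:
            return good
        pq = np.float32([kq[m.queryIdx].pt for m in good])
        ps = np.float32([ks[m.trainIdx].pt for m in good])
        keep = []
        for i in range(len(good)):
            dqi = np.linalg.norm(pq - pq[i], axis=1) + 1e-6
            dsi = np.linalg.norm(ps - ps[i], axis=1) + 1e-6
            scale = np.median(dsi / dqi)
            # Keep the match if most distance ratios agree with the
            # median scale change between query and scene.
            if np.mean(np.abs(dsi / dqi - scale) < tol * scale) > 0.5:
                keep.append(good[i])
        return keep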
