21 |
Visual Tracking Using Stereo Images / Dehlin, Carl, January 2019 (has links)
Visual tracking concerns the problem of following an arbitrary object in a video sequence. In this thesis, we examine how to use stereo images to extend existing visual tracking algorithms, which methods exist to obtain information from stereo images, and how the results change as each tracker's parameters vary. For this purpose, four abstract approaches are identified, with five distinct implementations. Each tracker implementation is an extension of a baseline algorithm, MOSSE. The free parameters of each model are optimized with respect to two different evaluation strategies, called nor- and wir-tests, and four different objective functions, and are then fixed when comparing the models against each other. The results are created on single-target tracks extracted from the KITTI tracking dataset. The optimization results show that none of the objective functions are sensitive to the exposed parameters under the joint selection of model and dataset, and the evaluation results show that none of the extensions improve the results of the baseline tracker.
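The baseline named above, MOSSE (Minimum Output Sum of Squared Error), learns a correlation filter in the Fourier domain. The following is a minimal numpy sketch of that idea only, not code from the thesis; the function names and the regularization default are illustrative assumptions:

```python
import numpy as np

def train_mosse(patches, target_response, lam=1e-3):
    """Learn a MOSSE correlation filter in the Fourier domain.

    patches: list of 2-D grayscale training patches (augmentations of the target)
    target_response: desired 2-D Gaussian response, same shape as each patch
    lam: regularization added to the denominator (an assumed default)
    """
    G = np.fft.fft2(target_response)
    A = np.zeros_like(G)
    B = np.zeros_like(G)
    for p in patches:
        F = np.fft.fft2(p)
        A += G * np.conj(F)          # correlate desired output with input
        B += F * np.conj(F)          # accumulate input energy spectrum
    return A / (B + lam)             # filter (conjugate form)

def respond(H_conj, patch):
    """Apply the filter to a new patch; the response peak is the target location."""
    F = np.fft.fft2(patch)
    return np.real(np.fft.ifft2(H_conj * F))
```

In the full tracker the filter is also updated online with a running average as new frames arrive; that update step is omitted here.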
|
22 |
Contributions à l'estimation de mouvement 3D et à la commande par vision rapide : application aux robots parallèles / Contributions to 3D motion estimation and fast vision control / Dahmouche, Redwan, 18 November 2010 (has links)
La vision artificielle est un moyen de perception très apprécié en robotique. Les applications les plus courantes de la vision en robotique manipulatrice sont l'estimation de pose et la commande. D'un point de vue conceptuel, l'utilisation de la vision artificielle permet d'améliorer les performances de la commande des robots en termes de précision et de robustesse (vis-à-vis des erreurs sur les paramètres géométriques du robot). De plus, cette mesure est d'autant plus pertinente pour les robots parallèles puisque l'état de ces derniers est généralement mieux défini par la pose de l'effecteur que par les mesures articulaires, traditionnellement utilisées en commande. Cependant, les systèmes de vision classiques ne permettent pas de satisfaire les exigences des commandes hautes performances à cause de leur période d'acquisition et de leur temps de latence trop grands. Pour pallier ce problème, l'approche proposée dans cette thèse est de procéder à une acquisition séquentielle de fenêtres d'intérêt. En effet, le fait de ne transmettre que les régions de l'image contenant les primitives visuelles utiles a pour effet de diminuer la quantité de données à transmettre, ce qui permet de réduire la période d'acquisition et le temps de latence. De plus, l'acquisition non simultanée des primitives offre la possibilité d'estimer la pose et la vitesse du robot de façon conjointe. Différentes méthodes d'estimation et plusieurs schémas de commandes cinématiques et dynamiques utilisant ce mode d'acquisition ont ainsi été proposés dans ce document. Les résultats expérimentaux obtenus en commande dynamique par vision d'un robot parallèle montrent, pour la première fois, que la commande référencée vision peut être plus performante que la commande articulaire. / Visual sensing is highly valued in robotic manipulation since it provides measurements for pose estimation and robot control.
Conceptually, vision improves the accuracy and robustness (against kinematic parameter errors) of manipulator control. In addition, vision-based control is particularly relevant for parallel robot manipulators: the state of these robots is usually better described by the pose of their mobile platform than by their articular joint values, so the use of vision simplifies their control. However, typical vision systems do not meet the acquisition-rate and latency constraints of dynamic control. The approach proposed in this thesis is to perform a sequential acquisition of the regions of interest which contain the useful visual features, cutting down the amount of data to transmit and thus reducing the acquisition period and the latency. In addition, the sequential acquisition allows for estimating both the pose and the velocity of the robot platform. Several control laws are proposed on top of this acquisition method. The experimental results show that the proposed vision-based dynamic control laws outperform, for the first time, classical joint-space dynamic control.
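The bandwidth argument behind sequential window-of-interest acquisition can be made concrete with a back-of-the-envelope calculation. Every figure below (image size, window size, feature count, pixel depth) is an assumption chosen for illustration, not a value from the thesis:

```python
# Illustrative comparison of full-frame vs. region-of-interest (ROI) acquisition.
# All figures are assumptions for the sake of the example, not from the thesis.
FULL_W, FULL_H = 640, 480          # full image, pixels
ROI_W, ROI_H = 32, 32              # one window of interest
N_ROI = 8                          # number of tracked visual features
BYTES_PER_PX = 1                   # 8-bit grayscale

full_frame = FULL_W * FULL_H * BYTES_PER_PX
roi_total = N_ROI * ROI_W * ROI_H * BYTES_PER_PX

reduction = full_frame / roi_total
print(f"full frame: {full_frame} B, ROIs: {roi_total} B, reduction: {reduction:.1f}x")
```

Under these assumptions only a fraction of the image data crosses the sensor link each cycle, which is what shortens both the acquisition period and the latency.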
|
23 |
Robust visual detection and tracking of complex objects : applications to space autonomous rendez-vous and proximity operations / Détection et suivi visuels robustes d'objets complexes : applications au rendez-vous spatial autonome / Petit, Antoine, 19 December 2013 (has links)
Dans cette thèse nous étudions le fait de localiser complètement un objet connu par vision artificielle, en utilisant une caméra monoculaire, ce qui constitue un problème majeur dans des domaines comme la robotique. Une attention particulière est ici portée sur des applications de robotique spatiale, dans le but de concevoir un système de localisation visuelle pour des opérations de rendez-vous spatial autonome. Deux composantes principales du problème sont abordées : celle de la localisation initiale de l'objet ciblé, puis celle du suivi de cet objet image par image, donnant la pose complète entre la caméra et l'objet, connaissant le modèle 3D de l'objet. Pour la détection, l'estimation de pose est basée sur une segmentation de l'objet en mouvement et sur une procédure probabiliste d'appariement et d'alignement basée contours de vues synthétiques de l'objet avec une séquence d'images initiales. Pour la phase de suivi, l'estimation de pose repose sur un algorithme de suivi basé modèle 3D, pour lequel nous proposons trois différents types de primitives visuelles, dans l'idée de décrire l'objet considéré par ses contours, sa silhouette et par un ensemble de points d'intérêts. L'intégrité du système de localisation est elle évaluée en propageant l'incertitude sur les primitives visuelles. Cette incertitude est par ailleurs utilisée au sein d'un filtre de Kalman linéaire sur les paramètres de vitesse. Des tests qualitatifs et quantitatifs ont été réalisés, sur des données synthétiques et réelles, avec notamment des conditions d'image difficiles, montrant ainsi l'efficacité et les avantages des différentes contributions proposées, et leur conformité avec un contexte de rendez-vous spatial. / In this thesis, we address the issue of fully localizing a known object through computer vision, using a monocular camera, which is a central problem in robotics.
Particular attention is paid here to space robotics applications, with the aim of providing a unified visual localization system for autonomous navigation during space rendezvous and proximity operations. Two main challenges of the problem are tackled: initially detecting the targeted object and then tracking it frame by frame, providing the complete pose between the camera and the object, the 3D CAD model of the object being known. For detection, the pose estimation process is based on the segmentation of the moving object and on an efficient probabilistic edge-based matching and alignment procedure between a set of synthetic views of the object and a sequence of initial images. For the tracking phase, pose estimation is handled through a 3D model-based tracking algorithm, for which we propose three different types of visual features, representing the object by its edges, its silhouette and a set of interest points. The reliability of the localization process is evaluated by propagating the uncertainty from the errors of the visual features. This uncertainty also feeds a linear Kalman filter on the camera velocity parameters. Qualitative and quantitative experiments have been performed on various synthetic and real data, with challenging imaging conditions, demonstrating the efficiency and the benefits of the different contributions and their compliance with space rendezvous applications.
|
24 |
Sistema de controle servo visual de uma câmera pan-tilt com rastreamento de uma região de referência. / Visual servoing system of a pan-tilt camera using region template tracking. / Davi Yoshinobu Kikuchi, 19 April 2007 (has links)
Uma câmera pan-tilt é capaz de se movimentar em torno de dois eixos de rotação (pan e tilt), permitindo que sua lente possa ser apontada para um ponto qualquer no espaço. Uma aplicação possível dessa câmera é mantê-la apontada para um determinado alvo em movimento, através de posicionamentos angulares pan e tilt adequados. Este trabalho apresenta uma técnica de controle servo visual, em que, inicialmente, as imagens capturadas pela câmera são utilizadas para determinar a posição do alvo. Em seguida, calculam-se as rotações necessárias para manter a projeção do alvo no centro da imagem, em um sistema em tempo real e malha fechada. A técnica de rastreamento visual desenvolvida se baseia em comparação de uma região de referência, utilizando a soma dos quadrados das diferenças (SSD) como critério de correspondência. Sobre essa técnica, é adicionada uma extensão baseada no princípio de estimação incremental e, em seguida, o algoritmo é mais uma vez modificado através do princípio de estimação em multiresolução. Para cada uma das três configurações, são realizados testes para comparar suas performances. O sistema é modelado através do princípio de fluxo óptico e dois controladores são apresentados para realimentar o sistema: um proporcional integral (PI) e um proporcional com estimação de perturbações externas através de um filtro de Kalman (LQG). Ambos são calculados utilizando um critério linear quadrático e os desempenhos deles também são analisados comparativamente. / A pan-tilt camera can move around two rotation axes (pan and tilt), allowing its lens to be pointed at any point in space. A possible application of the camera is to keep it pointed at a certain moving target through appropriate pan-tilt angular positioning. This work presents a visual servoing technique which first uses the images captured by the camera to determine the target position.
The method then calculates the rotations needed to keep the target projection at the image center, in a real-time, closed-loop system. The developed visual tracking technique is based on template region matching and uses the sum of squared differences (SSD) as the similarity criterion. An extension based on the incremental estimation principle is added to the technique, and the algorithm is then modified again using a multiresolution estimation method. Experimental results allow a performance comparison between the three configurations. The system is modeled through the optical flow principle, and two controllers are presented to close the loop: a proportional-integral (PI) controller and a proportional controller with external disturbance estimation by a Kalman filter (LQG). Both are designed using a linear quadratic criterion, and their performance is also analyzed comparatively.
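The SSD matching criterion at the core of the tracker can be sketched as an exhaustive search for the window that minimizes the sum of squared differences; the incremental and multiresolution extensions described above are omitted, and all names are illustrative:

```python
import numpy as np

def ssd_match(image, template):
    """Exhaustive template search minimizing the sum of squared differences (SSD).

    Returns the (row, col) of the top-left corner of the best-matching window.
    This is only the bare matching criterion; real trackers restrict the search
    region and refine the estimate incrementally.
    """
    ih, iw = image.shape
    th, tw = template.shape
    best, best_pos = np.inf, (0, 0)
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            window = image[r:r + th, c:c + tw]
            score = np.sum((window - template) ** 2)   # SSD over the window
            if score < best:
                best, best_pos = score, (r, c)
    return best_pos
```

The brute-force double loop makes the cost of plain SSD search obvious, which is exactly what the incremental and multiresolution variants compared in the thesis are meant to reduce.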
|
25 |
Reconhecimento dos conceitos de forma, cor, tamanho e posição em 10 crianças com Síndrome de Rett / Recognition of the concepts of shape, color, size and position in 10 children with Rett Syndrome / Velloso, Renata de Lima, 29 January 2008 (has links)
Fundo Mackenzie de Pesquisa / Children with Rett Syndrome (RS) present progressive regression of psychomotor development and speech abilities as well as loss of voluntary hand movements,
resulting in severe difficulties for their communication. Several studies have reported that girls with RS use their gaze intentionally, to communicate or to express desires, and these findings make it possible to use eye movements as a tool for assessing other aspects of RS, such as cognition. In this investigation, ten girls with RS, aged 4 years 8 months to 12 years 10 months, were assessed with a computerized eye-tracking system for their ability to recognize concepts of color (red, yellow and blue), shape (circle, square and triangle), size (big and small) and spatial position (over and under). Fixation times on the required concepts did not differ significantly from fixation times on the concepts not required. A correlation was observed between increasing age and the ability to recognize the color concept "blue". When assessed with the eye-tracking system, the children did not demonstrate recognition of most of the required concepts. / Crianças com Síndrome de Rett (SR) apresentam regressão progressiva do desenvolvimento psicomotor e das habilidades de linguagem verbal e perda das habilidades manuais voluntárias, o que lhes dificulta a comunicação. Estudos relatam que meninas com SR utilizam o olhar com finalidade intencional, como forma de comunicação ou de expressão de
desejos, o que levanta a possibilidade de avaliação de outros aspectos por meio do olhar, como os aspectos cognitivos. O objetivo deste estudo foi avaliar, em crianças com SR, o reconhecimento dos conceitos de cor (vermelho, amarelo e azul), forma (círculo, quadrado e triângulo), tamanho (grande e pequeno) e posição espacial (em cima e embaixo), com a
utilização de equipamento computadorizado de rastreamento ocular. Participaram do estudo 10 crianças com diagnóstico de SR com idade entre 4 anos e 8 meses e 12 anos e 10 meses. Comparando-se o tempo de fixação do olhar das crianças para o conceito solicitado com o tempo de fixação para outros conceitos não solicitados, os resultados não indicaram muitas diferenças significativas. Houve correlação entre o conceito cor "azul" e o aumento da idade, indicando que as crianças mais velhas aprendem o conceito "azul". Concluiu-se que, com o método de avaliação utilizado, as crianças não reconheceram a maior parte dos conceitos
de cor, forma, tamanho e posição.
|
26 |
Modeling of structured 3-D environments from monocular image sequences / Repo, T. (Tapio), 08 November 2002 (has links)
Abstract
The purpose of this research has been to show, through applications, that polyhedral scenes can be modeled in real time with a single video camera, sometimes very efficiently and without any special image-processing hardware. The developed vision sensor estimates its three-dimensional position with respect to the environment while simultaneously modeling that environment. The estimates become recursively more accurate as objects are approached and observed from different viewpoints.
The modeling process starts by extracting interesting tokens, such as lines and corners, from the first image. These features are then tracked in subsequent frames; previously taught patterns can also be used in tracking. Only a few features are extracted per image, so processing can run at video frame rate. Newly appearing features can also be added to the environment structure.
Kalman filtering is used for estimation. The motion state consists of location and orientation and their first derivatives. The environment is treated as a rigid object with respect to the camera, and the environment structure consists of the 3-D coordinates of the tracked features. The initial model lacks depth information; relative depth is obtained by exploiting the fact that, during translational motion, closer points move faster on the image plane than more distant ones. Additional information is needed to obtain absolute coordinates.
Special attention has been paid to modeling uncertainties. Measurements with high uncertainty receive less weight when the motion and environment models are updated. The rigidity assumption is exploited by giving the initial structure uncertainties the shape of a thin pencil. By continuously observing the motion uncertainties, the performance of the modeler can be monitored.
In contrast to the usual solution, the estimates are kept in separate state vectors, which allows motion and 3-D structure to be estimated asynchronously. Besides yielding a more distributed solution, this technique provides an efficient failure-detection mechanism: several trackers can estimate motion simultaneously, and only those with the most confident estimates are allowed to update the common environment model.
Tests showed that motion with six degrees of freedom can be estimated in an unknown environment while the 3-D structure of the environment is estimated simultaneously. The achieved accuracies were on the order of millimeters at distances of 1-2 meters, in tests with both simple toy scenes and more demanding industrial pallet scenes. This is sufficient for manipulating objects when the modeler is used to provide visual feedback.
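The Kalman filtering scheme described above, with position-like parameters and their first derivatives in the state, can be sketched for a single scalar pose parameter. This 1-D constant-velocity simplification and the noise levels are assumptions for illustration, not the thesis's actual formulation:

```python
import numpy as np

def kalman_cv_step(x, P, z, dt, q=1e-3, r=1e-2):
    """One predict-update cycle of a constant-velocity Kalman filter.

    x: state [position, velocity]; P: 2x2 state covariance;
    z: scalar position measurement; q, r: assumed process/measurement noise.
    """
    F = np.array([[1.0, dt], [0.0, 1.0]])          # constant-velocity motion model
    H = np.array([[1.0, 0.0]])                     # only position is measured
    Q = q * np.array([[dt**3 / 3, dt**2 / 2],
                      [dt**2 / 2, dt]])            # process noise covariance
    # predict
    x = F @ x
    P = F @ P @ F.T + Q
    # update: uncertain measurements (large r) get less weight through the gain
    S = H @ P @ H.T + r                            # innovation covariance
    K = P @ H.T / S                                # Kalman gain
    x = x + (K * (z - H @ x)).ravel()
    P = (np.eye(2) - K @ H) @ P
    return x, P
```

The gain computation shows the weighting behavior the abstract describes: as the measurement noise r grows, K shrinks and the measurement influences the state less.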
|
27 |
Visual Tracking with Deformable Continuous Convolution Operators / Johnander, Joakim, January 2017 (has links)
Visual object tracking is the computer vision problem of estimating a target's trajectory in a video given only its initial state. A visual tracker often acts as a component in intelligent vision systems such as those found in surveillance, autonomous vehicles and robots, and unmanned aerial vehicles. Applications may require robust tracking on difficult sequences in which targets undergo large changes in appearance, while enforcing a real-time constraint. Discriminative correlation filters have shown promising tracking performance in recent years and have consistently improved the state of the art. With the advent of deep learning, robust deep features have improved tracking performance considerably. However, methods based on discriminative correlation filters learn a rigid template describing the target appearance. This implies an assumption of target rigidity which is not fulfilled in practice. This thesis introduces an approach which integrates deformability into a state-of-the-art tracker. The approach is thoroughly tested on three challenging visual tracking benchmarks, achieving state-of-the-art performance.
|
28 |
Contributions to dense visual tracking and visual servoing using robust similarity criteria / Contributions au suivi visuel et à l'asservissement visuel denses basées sur des critères de similarité robustes / Delabarre, Bertrand, 23 December 2014 (has links)
Dans cette thèse, nous traitons les problèmes de suivi visuel et d'asservissement visuel, qui sont des thèmes essentiels dans le domaine de la vision par ordinateur. La plupart des techniques de suivi et d'asservissement visuel présentes dans la littérature se basent sur des primitives géométriques extraites dans les images pour estimer le mouvement présent dans la séquence. Un problème inhérent à ce type de méthode est le fait de devoir extraire et mettre en correspondance des primitives à chaque nouvelle image avant de pouvoir estimer un déplacement. Afin d'éviter cette couche algorithmique et de considérer plus d'information visuelle, de récentes approches ont proposé d'utiliser directement la totalité des informations fournies par l'image. Ces algorithmes, alors qualifiés de directs, se basent pour la plupart sur l'observation des intensités lumineuses de chaque pixel de l'image. Mais ceci a pour effet de limiter le domaine d'utilisation de ces approches, car ce critère de comparaison est très sensible aux perturbations de la scène (telles que les variations de luminosité ou les occultations). Pour régler ces problèmes nous proposons de nous baser sur des travaux récents qui ont montré que des mesures de similarité comme la somme des variances conditionnelles ou l'information mutuelle permettaient d'accroître la robustesse des approches directes dans des conditions perturbées. Nous proposons alors plusieurs algorithmes de suivi et d'asservissement visuels directs qui utilisent ces fonctions de similarité afin d'estimer le mouvement présent dans des séquences d'images et de contrôler un robot grâce aux informations fournies par une caméra. Ces différentes méthodes sont alors validées et analysées dans différentes conditions qui viennent démontrer leur efficacité. / In this document, we address the visual tracking and visual servoing problems, which are crucial topics in the domain of computer and robot vision.
Most of these techniques use geometrical primitives extracted from the images in order to estimate motion from an image sequence. But using geometrical features means having to extract and match them in each new image before performing the tracking or servoing process. To get rid of this algorithmic step, recent approaches have proposed to use directly the information provided by the whole image instead of extracting geometrical primitives. Most of these algorithms, referred to as direct techniques, are based on the luminance values of every pixel in the image. But this strategy limits their use, since the criterion is very sensitive to scene perturbations such as illumination shifts or occlusions. To overcome this problem, we propose in this document to use robust similarity measures, the sum of conditional variance and the mutual information, in order to perform robust direct visual tracking and visual servoing. Several algorithms based on these criteria are then proposed in order to be robust to scene perturbations. These methods are tested and analyzed in several setups where perturbations occur, demonstrating their efficiency.
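One of the two similarity measures named above, mutual information, can be sketched with a simple histogram-based estimate; the bin count and the assumed [0, 1] value range are illustrative choices, not those of the thesis:

```python
import numpy as np

def mutual_information(img_a, img_b, bins=32):
    """Mutual information between two grayscale images with values in [0, 1].

    A histogram-based sketch of the similarity criterion: unlike plain intensity
    differences, MI depends only on the statistical dependence between the two
    images, which is what makes it more robust to illumination changes.
    """
    joint, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(),
                                 bins=bins, range=[[0, 1], [0, 1]])
    pxy = joint / joint.sum()                      # joint intensity distribution
    px = pxy.sum(axis=1, keepdims=True)            # marginal of img_a
    py = pxy.sum(axis=0, keepdims=True)            # marginal of img_b
    nz = pxy > 0                                   # avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))
```

In a direct tracker, a measure like this replaces the pixel-wise difference inside the optimization loop: the warp parameters are adjusted to maximize the similarity rather than to minimize an intensity residual.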
|
29 |
AATrackT: A deep learning network using attentions for tracking fast-moving and tiny objects : (A)ttention (A)ugmented - (Track)ing on (T)iny objects / Lundberg Andersson, Fredric, January 2022 (has links)
Recent advances in deep learning have made it possible to visually track objects in a video sequence. Moreover, as transformers were introduced into computer vision, new state-of-the-art performance was achieved in visual tracking. However, most of these studies have used attention to correlate the distinguishing factors between the target object and candidate objects in order to localize the object throughout the video sequence. This approach is not adequate for tracking tiny objects, and conventional trackers in general are often not applicable to extremely small or fast-moving objects. Therefore, the purpose of this study is to improve current methods for tracking tiny, fast-moving objects with the help of attention. A deep neural network, named AATrackT, is built to address this gap by formulating tracking as an image segmentation problem. The proposed method uses data extracted from broadcast videos of tennis. Moreover, to capture the global context of images, attention augmented convolutions are used as a substitute for the conventional convolution operation. Contrary to what the authors assumed, the experiments indicated that attention augmented convolutions did not increase tracking performance. Our findings suggest that the main reason is that the 72x128 spatial resolution of the activation maps is too large for the attention weights to converge.
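The global self-attention that attention augmented convolutions mix into a feature map can be sketched as follows; this is a simplified single-head version with no relative position embeddings, and all shapes are assumed. Note that the score matrix has one row and column per spatial position, so its size grows with the square of H x W, which hints at why large activation maps are hard for the attention weights to handle:

```python
import numpy as np

def spatial_self_attention(feat, Wq, Wk, Wv):
    """Single-head self-attention across all spatial positions of a feature map.

    feat: (H, W, C) feature map; Wq/Wk/Wv: (C, D) projection matrices.
    Each position attends to every other position, providing the global context
    that a plain convolution's local receptive field lacks.
    """
    H, W, C = feat.shape
    x = feat.reshape(H * W, C)                   # flatten the spatial grid
    q, k, v = x @ Wq, x @ Wk, x @ Wv             # query/key/value projections
    scores = q @ k.T / np.sqrt(k.shape[1])       # (HW, HW) pairwise affinities
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)      # softmax over positions
    out = attn @ v                               # context-weighted values
    return out.reshape(H, W, -1)
</antml>```

In an attention augmented convolution, an output like this is concatenated channel-wise with an ordinary convolution's output; that concatenation step is omitted here.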
|
30 |
Vision-based Measurement Methods for Schools of Fish and Analysis of their Behaviors / 動画像処理に基づく魚群の計測手法と行動解析 / Terayama, Kei, 23 March 2016 (has links)
Kyoto University / Doctoral dissertation, Doctor of Human and Environmental Studies (degree no. 甲第19807号), Graduate School of Human and Environmental Studies. Examination committee: Prof. 立木 秀樹 (chair), Assoc. Prof. 櫻川 貴司, Prof. 日置 尋久, Prof. 阪上 雅昭.
|