21

Segmentação de movimento coerente aplicada à codificação de vídeos baseada em objetos

Silva, Luciano Silva da January 2011 (has links)
A variedade de dispositivos eletrônicos capazes de gravar e reproduzir vídeos digitais vem crescendo rapidamente, aumentando com isso a disponibilidade deste tipo de informação nas mais diferentes plataformas. Com isso, se torna cada vez mais importante o desenvolvimento de formas eficientes de armazenamento, transmissão, e acesso a estes dados. Nesse contexto, a codificação de vídeos tem um papel fundamental ao compactar informação, otimizando o uso de recursos aplicados no armazenamento e na transmissão de vídeos digitais. Não obstante, tarefas que envolvem a análise de vídeos, manipulação e busca baseada em conteúdo também se tornam cada vez mais relevantes, formando uma base para diversas aplicações que exploram a riqueza da informação contida em vídeos digitais. Muitas vezes a solução destes problemas passa pela segmentação de vídeos, que consiste da divisão de um vídeo em regiões que apresentam homogeneidade segundo determinadas características, como por exemplo cor, textura, movimento ou algum aspecto semântico. Nesta tese é proposto um novo método para segmentação de vídeos em objetos constituintes com base na coerência de movimento de regiões. O método de segmentação proposto inicialmente identifica as correspondências entre pontos esparsamente amostrados ao longo de diferentes quadros do vídeo. Logo após, agrupa conjuntos de pontos que apresentam trajetórias semelhantes. Finalmente, uma classificação pixel a pixel é obtida a partir destes grupos de pontos amostrados. O método proposto não assume nenhum modelo de câmera ou de movimento global para a cena e/ou objetos, e possibilita que múltiplos objetos sejam identificados, sem que o número de objetos seja conhecido a priori. Para validar o método de segmentação proposto, foi desenvolvida uma abordagem para a codificação de vídeos baseada em objetos. 
Segundo esta abordagem, o movimento de um objeto é representado através de transformações afins, enquanto a textura e a forma dos objetos são codificadas simultaneamente, de modo progressivo. O método de codificação de vídeos desenvolvido fornece funcionalidades tais como a transmissão progressiva e a escalabilidade a nível de objeto. Resultados experimentais dos métodos de segmentação e codificação de vídeos desenvolvidos são apresentados, e comparados a outros métodos da literatura. Vídeos codificados segundo o método proposto são comparados em termos de PSNR a vídeos codificados pelo software de referência JM H.264/AVC, versão 16.0, mostrando a que distância o método proposto está do estado da arte em termos de eficiência de codificação, ao mesmo tempo que provê funcionalidades da codificação baseada em objetos. O método de segmentação proposto no presente trabalho resultou em duas publicações, uma nos anais do SIBGRAPI de 2007 e outra no periódico IEEE Transactions on Image Processing. / The variety of electronic devices for digital video recording and playback is growing rapidly, thus increasing the availability of such information on many different platforms. Thus, the development of efficient ways of storing, transmitting and accessing such data becomes increasingly important. In this context, video coding plays a key role in compressing data, optimizing resource usage for storing and transmitting digital video. Nevertheless, tasks involving video analysis, manipulation and content-based search also become increasingly relevant, forming a basis for several applications that exploit the abundance of information in digital video. Often the solution to these problems makes use of video segmentation, which consists of dividing a video into homogeneous regions according to certain characteristics such as color, texture, motion or some semantic aspect. 
In this thesis, a new method for segmenting videos into their constituent objects based on the motion coherence of regions is proposed. The proposed segmentation method initially identifies the correspondences of sparsely sampled points across different video frames. Then, it clusters sets of points that have similar trajectories. Finally, a pixelwise classification is obtained from these sampled point sets. The proposed method does not assume any camera model or global motion model for the scene and/or objects. Still, it allows the identification of multiple objects, without knowing the number of objects a priori. In order to validate the proposed segmentation method, an object-based video coding approach was developed. According to this approach, the motion of an object is represented by affine transformations, while object texture and shape are simultaneously coded, in a progressive way. The developed video coding method yields functionalities such as progressive transmission and object scalability. Experimental results obtained by the proposed segmentation and coding methods are presented, and compared to other methods from the literature. Videos coded by the proposed method are compared in terms of PSNR to videos coded by the reference software JM H.264/AVC, version 16.0, showing the distance of the proposed method from the state of the art in terms of coding efficiency, while providing functionalities of object-based video coding. The segmentation method proposed in this work resulted in two publications, one in the proceedings of SIBGRAPI 2007 and another in the journal IEEE Transactions on Image Processing.
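The clustering step this abstract describes — grouping sparsely sampled points whose trajectories move coherently — can be sketched as follows. This is a minimal illustration, not the thesis's actual algorithm: the greedy grouping, the displacement-based distance, and the threshold are all assumptions made for the example.

```python
# Minimal sketch: group sparse point trajectories that move coherently.
# The greedy agglomeration and the threshold are illustrative assumptions,
# not the thesis's actual clustering algorithm.

def motion_distance(t1, t2):
    """Compare frame-to-frame displacements, so trajectories moving together
    score as similar even when they are spatially far apart."""
    def deltas(t):
        return [(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in zip(t, t[1:])]
    d1, d2 = deltas(t1), deltas(t2)
    total = 0.0
    for (u1, v1), (u2, v2) in zip(d1, d2):
        total += ((u1 - u2) ** 2 + (v1 - v2) ** 2) ** 0.5
    return total / len(d1)

def cluster_trajectories(trajectories, threshold=1.0):
    """Greedy agglomeration: each trajectory joins the first cluster whose
    representative moves coherently with it, else it starts a new cluster."""
    clusters = []  # list of lists of trajectory indices
    for i, traj in enumerate(trajectories):
        for cluster in clusters:
            rep = trajectories[cluster[0]]
            if motion_distance(traj, rep) < threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters
```

For example, two points translating right together and one static point yield two clusters, one per coherent motion. A pixelwise classification could then be derived by assigning each pixel to the nearest cluster, as the abstract's final step suggests.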
22

Segmentação de movimento usando morfologia matemática / Motion segmentation using mathematical morphology

Lara, Arnaldo Camara 06 November 2007 (has links)
Esta dissertação apresenta um novo método de segmentação de movimento baseado na obtenção dos contornos e em filtros morfológicos. A nova técnica apresenta vantagens em relação ao número de falsos positivos e falsos negativos em situações específicas quando comparada às técnicas tradicionais. / This work presents a novel motion segmentation technique based on contour extraction and morphological filters. It yields fewer false positives and false negatives in specific situations when compared to classic techniques.
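To make the role of morphological filtering in motion segmentation concrete, here is an illustrative sketch (not the dissertation's actual pipeline): threshold a frame difference, then apply a morphological opening (erosion followed by dilation) with a 3x3 structuring element, which suppresses isolated false positives of exactly the kind the abstract mentions. All values are assumptions for the example.

```python
# Illustrative sketch: frame differencing followed by a morphological
# opening to suppress isolated false-positive pixels.

def frame_difference(prev, curr, thresh=10):
    """Binary motion mask: 1 where intensity changed by more than thresh."""
    h, w = len(curr), len(curr[0])
    return [[1 if abs(curr[y][x] - prev[y][x]) > thresh else 0
             for x in range(w)] for y in range(h)]

def erode(mask):
    """3x3 erosion: a pixel survives only if its whole neighborhood is set."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = int(all(mask[y + dy][x + dx]
                                for dy in (-1, 0, 1) for dx in (-1, 0, 1)))
    return out

def dilate(mask):
    """3x3 dilation: a pixel is set if any neighbor is set."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = int(any(0 <= y + dy < h and 0 <= x + dx < w
                                and mask[y + dy][x + dx]
                                for dy in (-1, 0, 1) for dx in (-1, 0, 1)))
    return out

def opening(mask):
    """Opening = erosion then dilation: removes specks, keeps large blobs."""
    return dilate(erode(mask))
```

A genuine 3x3 moving blob survives the opening while a single noisy pixel is erased, which is how such filters trade away false positives.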
23

Αναγνώριση αριθμού κινούμενων αντικειμένων και παρακολούθηση της τροχιάς των με μεθόδους μηχανικής όρασης / Recognition of the number of moving objects and tracking of their trajectories using machine vision methods

Κουζούπης, Δημήτριος 05 January 2011 (has links)
Η παρούσα διπλωματική εργασία αφορά την ανίχνευση και παρακολούθηση ανθρώπινων μορφών σε ακολουθίες βίντεο με μεθόδους μηχανικής όρασης. Οι ακολουθίες αυτές θεωρούμε πως έχουν ληφθεί από στατική κάμερα σε εσωτερικό ή εξωτερικό χώρο. Πιο συγκεκριμένα, το εν λόγω πρόβλημα υποδιαιρείται σε τρία κυρίως μέρη τα οποία μελετώνται, αναλύονται και υλοποιούνται σε ξεχωριστά κεφάλαια. Ξεκινάμε με το κομμάτι κατάτμησης κίνησης, συνεχίζουμε με την ταξινόμηση αντικειμένων ώστε να αναγνωριστούν οι άνθρωποι ανάμεσα στις κινούμενες οντότητες και τελειώνουμε με την παρακολούθηση των ανθρώπινων σιλουετών για καταγραφή της πορείας τους όση ώρα βρίσκονται στο πλάνο. Οι αλγόριθμοι που αναπτύχθηκαν λειτούργησαν ικανοποιητικά κάτω από διάφορες συνθήκες και τα αποτελέσματά τους μπορούν να περάσουν ως είσοδοι σε μια πληθώρα εφαρμογών υψηλότερου επιπέδου με σκοπό την αναγνώριση ανθρώπινης δραστηριότητας και την κατανόηση συμπεριφοράς. / This thesis addresses the detection and tracking of human figures in video sequences using machine vision methods, assuming footage from a static camera in indoor or outdoor settings. The problem is divided into three parts, each studied, analysed and implemented in a separate chapter: motion segmentation, object classification to identify humans among the moving entities, and tracking of the human silhouettes to record their paths while they remain in view. A separate chapter is also dedicated to optical flow techniques and related methods that can be employed for the same problem. The developed algorithms performed satisfactorily under a variety of conditions, and their outputs can feed a range of higher-level applications for human activity recognition and behaviour understanding.
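The motion-segmentation stage this abstract opens with is, in the static-camera setting it assumes, commonly built on background subtraction. The following is a minimal running-average sketch, not the thesis's actual method; the learning rate and threshold are illustrative assumptions.

```python
# Minimal background-subtraction sketch for a static camera:
# maintain a running-average background and flag pixels far from it.

def update_background(bg, frame, alpha=0.05):
    """Exponential running average of the scene; transient objects fade out."""
    return [[(1 - alpha) * b + alpha * f for b, f in zip(brow, frow)]
            for brow, frow in zip(bg, frame)]

def foreground_mask(bg, frame, thresh=20):
    """Binary mask of pixels deviating from the background model."""
    return [[1 if abs(f - b) > thresh else 0 for b, f in zip(brow, frow)]
            for brow, frow in zip(bg, frame)]
```

The resulting mask is what downstream stages (object classification, silhouette tracking) would consume; a real system would add noise filtering and adaptive thresholds.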
26

Estimação de movimento a partir de imagens RGBD usando homomorfismo entre grafos / Motion estimation from RGBD images using graph homomorphism

David da Silva Pires 14 December 2012 (has links)
Recentemente surgiram dispositivos sensores de profundidade capazes de capturar textura e geometria de uma cena em tempo real. Com isso, diversas técnicas de Visão Computacional, que antes eram aplicadas apenas a texturas, agora são passíveis de uma reformulação, visando o uso também da geometria. Ao mesmo tempo em que tais algoritmos, tirando vantagem dessa nova tecnologia, podem ser acelerados ou tornarem-se mais robustos, surgem igualmente diversos novos desafios e problemas interessantes a serem enfrentados. Como exemplo desses dispositivos podemos citar o do Projeto Vídeo 4D, do IMPA, e o Kinect (TM), da Microsoft. Esses equipamentos fornecem imagens que vêm sendo chamadas de RGBD, fazendo referência aos três canais de cores e ao canal adicional de profundidade (com a letra \'D\' vindo do termo depth, profundidade em inglês). A pesquisa descrita nesta tese apresenta uma nova abordagem não-supervisionada para a estimação de movimento a partir de vídeos compostos por imagens RGBD. Esse é um passo intermediário necessário para a identificação de componentes rígidos de um objeto articulado. Nosso método faz uso da técnica de casamento inexato (homomorfismo) entre grafos para encontrar grupos de pixels (blocos) que se movem para um mesmo sentido em quadros consecutivos de um vídeo. Com o intuito de escolher o melhor casamento para cada bloco, é minimizada uma função custo que leva em conta distâncias tanto no espaço de cores RGB quanto no XYZ (espaço tridimensional do mundo). A contribuição metodológica consiste justamente na manipulação dos dados de profundidade fornecidos pelos novos dispositivos de captura, de modo que tais dados passem a integrar o vetor de características que representa cada bloco nos grafos a serem casados. Nosso método não usa quadros de referência para inicialização e é aplicável a qualquer vídeo que contenha movimento paramétrico por partes. 
Para blocos cujas dimensões causem uma relativa diminuição na resolução das imagens, nossa aplicação roda em tempo real. Para validar a metodologia proposta, são apresentados resultados envolvendo diversas classes de objetos com diferentes tipos de movimento, tais como vídeos de pessoas caminhando, os movimentos de um braço e um casal de dançarinos de samba de gafieira. Também são apresentados os avanços obtidos na modelagem de um sistema de vídeo 4D orientado a objetos, o qual norteia o desenvolvimento de diversas aplicações a serem desenvolvidas na continuação deste trabalho. / Depth-sensing devices have emerged recently, allowing real-time capture of scene texture and depth. As a result, many computer vision techniques, previously applied only to textures, can now be reformulated to use the geometry as well. While these algorithms, taking advantage of the new technology, can be accelerated or made more robust, many interesting new challenges and problems arise. Examples of such devices include that of the 4D Video Project, from IMPA, and Microsoft's Kinect (TM). These devices provide the so-called RGBD images, named after the three color channels and the additional depth channel. The research described in this thesis presents a new unsupervised approach to motion estimation from videos composed of RGBD images. This is a necessary intermediate step toward identifying the rigid components of an articulated object. Our method uses inexact graph matching (homomorphism) to find groups of pixels (patches) that move in the same direction in consecutive video frames. In order to choose the best match for each patch, we minimize a cost function that accounts for distances in both the RGB color space and the XYZ (three-dimensional world coordinates) space. 
The methodological contribution lies in the handling of the depth data provided by the new capture devices, so that these data become part of the feature vector representing each patch in the graphs to be matched. Our method does not require reference frames for initialization and can be applied to any video containing piecewise parametric motion. For patch sizes that entail a relative reduction in image resolution, our application runs in real time. To validate the proposed methodology, we present results for several object classes with different kinds of motion, such as videos of people walking, the motion of an arm, and a couple dancing samba de gafieira. We also present advances in modeling an object-oriented 4D video system, which guides the development of several applications planned as future work.
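The matching cost described above — a combination of distances in RGB color space and XYZ world space for each patch — can be sketched as follows. The equal weights, the patch representation, and the exhaustive search over candidates are illustrative assumptions, not the thesis's exact formulation.

```python
# Sketch of a patch-matching cost combining RGB and XYZ distances.
# Weights and the brute-force candidate search are illustrative.

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def match_cost(patch_a, patch_b, w_rgb=1.0, w_xyz=1.0):
    """patch = {'rgb': (r, g, b), 'xyz': (x, y, z)}: weighted sum of the
    color-space and world-space distances between two patches."""
    return (w_rgb * euclidean(patch_a['rgb'], patch_b['rgb'])
            + w_xyz * euclidean(patch_a['xyz'], patch_b['xyz']))

def best_match(patch, candidates, **weights):
    """Index of the candidate patch in the next frame minimising the cost."""
    return min(range(len(candidates)),
               key=lambda i: match_cost(patch, candidates[i], **weights))
```

Folding the depth channel into the feature vector this way is what lets spatially close but geometrically distant patches (e.g. an object in front of a similar-colored background) be told apart.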
28

Analyse sémantique d'un trafic routier dans un contexte de vidéo-surveillance / Semantic analysis of road traffic in a video-surveillance context

Brulin, Mathieu 25 October 2012 (has links)
Les problématiques de sécurité, ainsi que le coût de moins en moins élevé des caméras numériques, amènent aujourd'hui à un développement rapide des systèmes de vidéosurveillance. Devant le nombre croissant de caméras et l'impossibilité de placer un opérateur humain devant chacune d'elles, il est nécessaire de mettre en oeuvre des outils d'analyse capables d'identifier des évènements spécifiques. Le travail présenté dans cette thèse s'inscrit dans le cadre d'une collaboration entre le Laboratoire Bordelais de Recherche en Informatique (LaBRI) et la société Adacis. L'objectif consiste à concevoir un système complet de vidéo-surveillance destiné à l'analyse automatique de scènes autoroutières et la détection d'incidents. Le système doit être autonome, le moins supervisé possible et doit fournir une détection en temps réel d'un évènement. Pour parvenir à cet objectif, l'approche utilisée se décompose en plusieurs étapes. Une étape d'analyse de bas-niveau, telle que l'estimation et la détection des régions en mouvement, une identification des caractéristiques d'un niveau sémantique plus élevé, telles que l'extraction des objets et la trajectoire des objets, et l'identification d'évènements ou de comportements particuliers, tel que le non respect des règles de sécurité. Les techniques employées s'appuient sur des modèles statistiques permettant de prendre en compte les incertitudes sur les mesures et observations (bruits d'acquisition, données manquantes, ...). Ainsi, la détection des régions en mouvement s'effectue au travers de la modélisation de la couleur de l'arrière-plan. Le modèle statistique utilisé est un modèle de mélange de lois, permettant de caractériser la multi-modalité des valeurs prises par les pixels. 
L'estimation du flot optique, de la différence de gradient et la détection d'ombres et de reflets sont employées pour confirmer ou infirmer le résultat de la segmentation. L'étape de suivi repose sur un filtrage prédictif basé sur un modèle de mouvement à vitesse constante. Le cas particulier du filtrage de Kalman (filtrage tout gaussien) est employé, permettant de fournir une estimation a priori de la position des objets en se basant sur le modèle de mouvement prédéfini. L'étape d'analyse de comportement est constituée de deux approches : la première consiste à exploiter les informations obtenues dans les étapes précédentes de l'analyse. Autrement dit, il s'agit d'extraire et d'analyser chaque objet afin d'en étudier le comportement. La seconde étape consiste à détecter les évènements à travers une coupe du volume 2D+t de la vidéo. Les cartes spatio-temporelles obtenues sont utilisées pour estimer les statistiques du trafic, ainsi que pour détecter des évènements tels que l'arrêt des véhicules. Pour aider à la segmentation et au suivi des objets, un modèle de la structure de la scène et de ses caractéristiques est proposé. Ce modèle est construit à l'aide d'une étape d'apprentissage durant laquelle aucune intervention de l'utilisateur n'est requise. La construction du modèle s'effectue à travers l'analyse d'une séquence d'entraînement durant laquelle les contours de l'arrière-plan et les trajectoires typiques des véhicules sont estimés. Ces informations sont ensuite combinées pour fournir une estimation du point de fuite, les délimitations des voies de circulation et une approximation des lignes de profondeur dans l'image. En parallèle, un modèle statistique du sens de direction du trafic est proposé. La modélisation de données orientées nécessite l'utilisation de lois de distributions particulières, due à la nature périodique de la donnée. Un mélange de lois de type von-Mises est utilisé pour caractériser le sens de direction du trafic. 
/ Automatic traffic monitoring plays an important role in traffic surveillance. Video cameras are relatively inexpensive surveillance tools, but they require robust, efficient and automated video analysis algorithms. The loss of information caused by perspective projection makes the automatic detection and tracking of vehicles a very challenging problem, yet one that is essential for extracting a semantic interpretation of vehicle behavior. The work proposed in this thesis stems from a collaboration between the LaBRI (Laboratoire Bordelais de Recherche en Informatique) and the company Adacis. The aim is to build a complete video-surveillance system designed for automatic incident detection. To reach this objective, traffic scene analysis proceeds from low-level processing to high-level descriptions of the traffic, which can be of a wide variety of types: vehicles entering or exiting the scene, vehicle collisions, vehicles moving too fast or too slowly, stopped vehicles, or objects obstructing part of the road... A large number of road traffic monitoring systems are based on background subtraction techniques to segment the regions of interest of the image. The resulting regions are then tracked, and the trajectories are used to extract a semantic interpretation of vehicle behavior. Motion detection is based on a statistical model of the background color. The model used is a mixture model, which makes it possible to characterize multimodal distributions for each pixel. Optical flow estimation, gradient-difference estimation, and shadow and highlight detection are used to confirm or invalidate the segmentation results. The tracking process is based on a predictive filter using a constant-velocity motion model. 
A simple Kalman filter is employed, which allows the state of objects to be predicted from a priori information given by the motion model. The behavior analysis step comprises two approaches: the first consists in exploiting information from the low-level and mid-level analysis. Objects and their trajectories are analysed and used to detect abnormal behavior. The second approach consists in analysing a spatio-temporal slice of the 3D video volume. The extracted maps are used to estimate traffic statistics and to detect abnormal behavior such as stopped vehicles or wrong-way drivers. In order to support the segmentation and tracking processes, a model of the scene structure is proposed. This model is constructed using an unsupervised learning step, during which gradient information from the background image and typical vehicle trajectories are estimated. The results are combined to estimate the vanishing point of the scene and the lane boundaries, and a rough depth estimation is performed. In parallel, a statistical model of the traffic flow direction is proposed. To deal with periodic data, a von Mises mixture model is used to characterize the traffic flow direction.
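The constant-velocity Kalman tracking step described above can be sketched in one dimension (the thesis tracks 2D object positions; one axis is shown, and the noise values are illustrative assumptions):

```python
# Minimal 1D constant-velocity Kalman filter: predict gives the a-priori
# position estimate from the motion model, update corrects it with a
# position-only measurement (H = [1, 0]).

class KalmanCV1D:
    def __init__(self, p0, v0=0.0, q=0.01, r=1.0):
        self.x = [p0, v0]                   # state: position, velocity
        self.P = [[1.0, 0.0], [0.0, 1.0]]   # state covariance
        self.q, self.r = q, r               # process / measurement noise

    def predict(self, dt=1.0):
        p, v = self.x
        self.x = [p + dt * v, v]            # constant-velocity motion model
        P = self.P
        # P <- F P F^T + Q, with F = [[1, dt], [0, 1]]
        p00 = P[0][0] + dt * (P[1][0] + P[0][1]) + dt * dt * P[1][1] + self.q
        p01 = P[0][1] + dt * P[1][1]
        p10 = P[1][0] + dt * P[1][1]
        p11 = P[1][1] + self.q
        self.P = [[p00, p01], [p10, p11]]
        return self.x[0]                    # a-priori position estimate

    def update(self, z):
        s = self.P[0][0] + self.r           # innovation covariance
        k0, k1 = self.P[0][0] / s, self.P[1][0] / s  # Kalman gain
        y = z - self.x[0]                   # innovation
        self.x = [self.x[0] + k0 * y, self.x[1] + k1 * y]
        p00, p01 = self.P[0][0], self.P[0][1]
        self.P = [[(1 - k0) * p00, (1 - k0) * p01],
                  [self.P[1][0] - k1 * p00, self.P[1][1] - k1 * p01]]
        return self.x[0]                    # a-posteriori position estimate
```

Fed a vehicle moving at constant speed, the filter's velocity estimate converges to the true value and its predictions track the next measurement, which is exactly the a-priori position the data-association step would use.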
29

Non-local active contours

Appia, Vikram VijayanBabu 17 May 2012 (has links)
This thesis deals with image segmentation problems that arise in various computer vision related fields such as medical imaging, satellite imaging, video surveillance, recognition and robotic vision. More specifically, this thesis deals with a special class of image segmentation technique called Snakes or Active Contour Models. In active contour models, image segmentation is posed as an energy minimization problem, where an objective energy function (based on certain image related features) is defined on the segmenting curve (contour). Typically, a gradient descent energy minimization approach is used to drive the initial contour towards a minimum for the defined energy. The drawback associated with this approach is that the contour has a tendency to get stuck at undesired local minima caused by subtle and undesired image features/edges. Thus, active contour based curve evolution approaches are very sensitive to initialization and noise. The central theme of this thesis is to develop techniques that can make active contour models robust against certain classes of local minima by incorporating global information in energy minimization. These techniques lead to energy minimization with global considerations; we call these models -- 'Non-local active contours'. In this thesis, we consider three widely used active contour models: 1) Edge- and region-based segmentation model, 2) Prior shape knowledge based segmentation model, and 3) Motion segmentation model. We analyze the traditional techniques used for these models and establish the need for robust models that avoid local minima. We address the local minima problem for each model by adding global image considerations.
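The local-minimum problem this abstract centers on can be illustrated with a deliberately artificial one-dimensional energy: plain gradient descent converges to different answers depending on initialisation, just as a contour evolved by gradient descent can latch onto a spurious nearby edge. The energy function here is a toy stand-in, not a contour's actual image-based functional.

```python
# Toy illustration of gradient descent's sensitivity to initialisation:
# an artificial 1D energy with two basins of attraction.

def energy(x):
    # Two basins: one minimum near x = -1 (global, thanks to the 0.5*x
    # tilt) and one near x = 2 (local).
    return (x + 1) ** 2 * (x - 2) ** 2 + 0.5 * x

def gradient(x, h=1e-5):
    """Central-difference numerical gradient."""
    return (energy(x + h) - energy(x - h)) / (2 * h)

def gradient_descent(x0, step=0.01, iters=2000):
    x = x0
    for _ in range(iters):
        x -= step * gradient(x)
    return x
```

Starting left of the barrier finds the global minimum; starting right finds the worse local one. Non-local energy terms, as proposed in the thesis, are one way to make the descent less dependent on where the contour is initialised.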
30

Trajectory-based Descriptors for Action Recognition in Real-world Videos

Narayan, Sanath January 2015 (has links) (PDF)
This thesis explores motion trajectory-based approaches to recognize human actions in real-world, unconstrained videos. Recognizing actions is an important task in applications such as video retrieval, surveillance, human-robot interactions, analysis of sports videos, summarization of videos, behaviour monitoring, etc. There has been a considerable amount of research done in this regard. Earlier work focused on videos captured by static cameras, where recognizing the actions was relatively easy. With more videos being captured by moving cameras, recognition of actions in videos with irregular camera motion remains a challenge in unconstrained settings with variations in scale, view, illumination, occlusion and unrelated motions in the background. With the increase in videos being captured from wearable or head-mounted cameras, recognizing actions in egocentric videos is also explored in this thesis. At first, an effective motion segmentation method to identify the camera motion in videos captured by moving cameras is explored. Next, action recognition in videos captured in the normal third-person view (perspective) is discussed. Further, action recognition approaches for first-person (egocentric) views are investigated. First-person videos are often associated with frequent unintended camera motion, due to the motion of the head and hence of the head-mounted (wearable) cameras. This is followed by recognition of actions in egocentric videos in a multi-camera setting. And lastly, novel feature encoding and subvolume sampling (for “deep” approaches) techniques are explored in the context of action recognition in videos. The first part of the thesis explores two effective segmentation approaches to identify the motion due to the camera. The first approach is based on curve fitting of the motion trajectories and finding the model which best fits the camera motion.
The curve fitting approach works when the trajectories generated are smooth enough. To overcome this drawback and segment trajectories under non-smooth conditions, a second approach based on trajectory scoring and grouping is proposed. By identifying the instantaneous dominant background motion and accordingly aggregating the scores (denoting the “foregroundness”) along the trajectory, the motion that is associated with the camera can be separated from the motion due to foreground objects. Additionally, the segmentation result has been used to align videos from moving cameras, resulting in videos that seem to be captured by nearly-static cameras. In the second part of the thesis, recognising actions in normal videos captured from third-person cameras is investigated. To this end, two kinds of descriptors are explored. The first descriptor is the covariance descriptor adapted for motion trajectories. The covariance descriptor for a trajectory encodes the co-variations of different features along the trajectory’s length. Covariance, being a second-order encoding, encodes information about the trajectory that is different from that of a first-order encoding. The second descriptor is based on Granger causality. The novel causality descriptor encodes the “cause and effect” relationships between the motion trajectories of the actions. This type of interaction descriptor captures the causal inter-dependencies among the motion trajectories and encodes complementary information, different from descriptors based on the occurrence of features. Causal dependencies are traditionally computed on time-varying signals. We extend this further to capture dependencies between spatio-temporal signals and compute generalised causality descriptors which perform better than their traditional counterparts. An egocentric or first-person video is captured from the perspective of the person-of-interest (POI). The POI wears a camera and moves around doing his/her activities.
This camera records the events and activities as seen by him/her. The POI who is performing actions or activities is not seen by the camera worn by him/her. Activities performed by the POI are called first-person actions, and third-person actions are those done by others and observed by the POI. The third part of the thesis explores action recognition in egocentric videos. Differentiating first-person and third-person actions is important when summarising/analysing the behaviour of the POI. Thus, the goal is to recognise both the action and the perspective from which it is being observed. Trajectory descriptors are adapted to recognise actions, with the motion trajectory ranking method of segmentation as a pre-processing step to identify the camera motion. The motion segmentation step is necessary to remove unintended head motion (camera motion) during video capture. To recognise actions and the corresponding perspectives in a multi-camera setup, a novel inter-view causality descriptor based on the causal dependencies between trajectories in different views is explored. Since this is a new problem being addressed, two first-person datasets are created with eight actions in third-person and first-person perspectives. The first dataset is a single-camera dataset with action instances from first-person and third-person views. The second dataset is a multi-camera dataset with each action instance having multiple first-person and third-person views. In the final part of the thesis, a feature encoding scheme and a subvolume sampling scheme for recognising actions in videos are proposed. The proposed Hyper-Fisher Vector feature encoding is based on embedding the Bag-of-Words encoding into the Fisher Vector encoding. The resulting encoding is simple, effective and improves the classification performance over state-of-the-art techniques. This encoding can be used in place of the traditional Fisher Vector encoding in other recognition approaches.
The proposed subvolume sampling scheme, used to generate second-layer features in “deep” approaches for action recognition in videos, is based on iteratively increasing the size of the valid subvolumes in the temporal direction to generate newer subvolumes. The proposed sampling requires fewer subvolumes to be generated to “better represent” the actions, and is thus less computationally intensive than the original sampling scheme. The techniques are evaluated on large-scale, challenging, publicly available datasets. The Hyper-Fisher Vector combined with the proposed sampling scheme performs better than state-of-the-art techniques for action classification in videos.
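As a rough illustration of the second-order trajectory encoding discussed in this abstract, a covariance descriptor over per-frame trajectory features might be sketched as follows; the feature layout is hypothetical, not the thesis's exact feature set:

```python
import numpy as np

def trajectory_covariance_descriptor(features):
    """Covariance descriptor for a single motion trajectory.

    `features` is a (T, d) array: one d-dimensional feature vector per
    tracked frame (e.g. position, velocity, local appearance responses;
    the exact choice is an assumption here). Returns the upper triangle
    of the d x d covariance matrix, so the descriptor has fixed length
    d * (d + 1) / 2 regardless of the trajectory length T, and encodes
    co-variations of the features along the trajectory.
    """
    C = np.cov(features, rowvar=False)   # (d, d) second-order statistics
    iu = np.triu_indices(C.shape[0])     # covariance is symmetric, so the
    return C[iu]                         # upper triangle suffices

# A hypothetical 15-frame trajectory with 4 features per frame:
rng = np.random.default_rng(0)
desc = trajectory_covariance_descriptor(rng.normal(size=(15, 4)))
```

Because every trajectory maps to the same fixed-length vector, such descriptors can be pooled or fed to a standard classifier irrespective of how long each trajectory was tracked.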
