  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
41

Detecção e contagem de veículos em vídeos de tráfego urbano / Detecting and counting vehicles in urban traffic video

Barcellos, Pablo Roberlan Manke January 2014
This work presents a new method for tracking and counting vehicles in urban traffic videos. Using image processing and particle clustering techniques, the proposed method relies on motion coherence and spatial coherence to group particles so that each group represents a vehicle in the video sequence. A foreground mask is created using a Gaussian Mixture Model and Motion Energy Images to determine where particles should be generated, and the convex regions of the resulting clusters are then analyzed to check whether they correspond to a vehicle. This analysis takes into account the convex shape of the particle groups (objects) and the foreground mask in order to merge or split the clusters obtained. Once a vehicle is identified, it is tracked using the similarity of color histograms in windows centered on the particles of each cluster. Vehicles are counted at user-defined virtual loops, by intersecting the tracked vehicles with the loops. Tests were conducted on six different traffic videos, totaling 80,000 frames. The results were compared with similar methods available in the literature and were equivalent or superior.
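The counting step at user-defined virtual loops can be illustrated with a minimal sketch. It assumes, purely for illustration, that each tracked vehicle is reduced to a list of centroid positions and that a virtual loop is a single line segment; the thesis intersects tracked vehicle regions with the loops, so this is a simplification and not the author's implementation. The names `segments_cross` and `count_crossings` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def _orient(a: Point, b: Point, c: Point) -> float:
    """Signed area of triangle abc; its sign tells on which side of line ab the point c lies."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def segments_cross(p1: Point, p2: Point, q1: Point, q2: Point) -> bool:
    """True if segment p1-p2 strictly crosses segment q1-q2 (touching endpoints do not count)."""
    d1 = _orient(q1, q2, p1)
    d2 = _orient(q1, q2, p2)
    d3 = _orient(p1, p2, q1)
    d4 = _orient(p1, p2, q2)
    return d1 * d2 < 0 and d3 * d4 < 0


def count_crossings(tracks: List[List[Point]], loop: Tuple[Point, Point]) -> int:
    """Count tracks whose centroid path crosses the virtual loop; each track counts at most once."""
    count = 0
    for track in tracks:
        steps = zip(track, track[1:])  # consecutive centroid positions
        if any(segments_cross(a, b, loop[0], loop[1]) for a, b in steps):
            count += 1
    return count


# Hypothetical data: a loop drawn across the road at y = 10, and three tracked centroids.
loop = ((0.0, 10.0), (10.0, 10.0))
tracks = [
    [(5.0, 0.0), (5.0, 4.0), (5.0, 12.0)],  # drives through the loop
    [(20.0, 0.0), (20.0, 5.0)],             # stays outside the loop
    [(1.0, 9.0), (9.0, 11.0)],              # crosses the loop diagonally
]
vehicles_counted = count_crossings(tracks, loop)
```

Counting each track at most once avoids double counts when a noisy centroid jitters back and forth over the loop.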
42

API datového úložiště pro práci s videem a obrázky / The API of a Video and Image Datastore

Fröml, Vojtěch January 2013
This master's thesis proposes and implements an extension of the database interface VTApi, which is being developed as part of the MV ČR project "Tools and methods for video and image processing for terrorism prevention" at FIT VUT. The interface supports the representation, management and indexing of multimedia data and related descriptive metadata used by analytic applications based on computer vision. It currently uses the PostgreSQL DBMS as its default datastore. The thesis describes basic techniques for processing image and video data, presents the VTApi concept, and proposes and implements modifications to support multiple types of datastores. As an example of an alternative datastore, support for SQLite databases is integrated into VTApi.
43

Quantification of geometric properties of the melting zone in laser-assisted welding

John, Björn, Markert, Daniel, Englisch, Norbert, Grimm, Michael, Ritter, Marc, Hardt, Wolfram, Kowerko, Danny 14 August 2018
By using camera systems suitable for industrial applications in combination with a large number of different measurement sensors, it is possible to monitor laser welding processes and their results in real time. However, a low signal-to-noise ratio at frame rates up to 2,400 fps allows only limited statements about the process behavior, especially concerning the analysis of new welding parameters and their impact on the melting bath. This article investigates the kinetic and geometric dependencies of the melting zone induced by different laser parameters, using a camera system with a high frame rate (1280x800 pixels at 3,140 fps) in combination with model-driven image and data processing.
44

Lane-based Weaving Area Traffic Analysis Using Field Camera Data

Wei Lin (17582646) 03 January 2024
Vehicle weaving describes the lane-changing actions of vehicles, which is a critical aspect of traffic management and road design. This study focused on the weaving behavior of vehicles occurring between ramp merge and diverge areas. Weaving in these areas causes congestion and increases the risk of accidents, especially during heavy traffic. Redesigning such areas for enhanced safety requires a comprehensive analysis of the traffic conditions. Obtaining the weaving pattern is a challenge in the traffic industry. To address this challenge, we leveraged AI and image processing technology to develop algorithms for quantitative analysis of weaving using surveillance videos at the consecutive ramp merge and diverge areas. This approach can also determine the weaving patterns of passenger cars and trucks respectively. The experimental results captured the lane-based weaving behavior of around 30% of vehicles in the favorable areas. The captured weaving data is used as weaving data samples to derive an overall analysis of a weaving location. Remarkably, our approach can reduce the manual processing time for weaving analysis by more than 90%, making this highly practical for use.
45

ADVANCES IN MACHINE LEARNING METHODOLOGIES FOR BUSINESS ANALYTICS, VIDEO SUPER-RESOLUTION, AND DOCUMENT CLASSIFICATION

Tianqi Wang (18431280) 26 April 2024
This dissertation encompasses three studies in distinct yet impactful domains: B2B marketing, real-time video super-resolution (VSR), and smart office document routing systems. In the B2B marketing sphere, the study addresses the extended buying cycle by developing an algorithm for customer data aggregation and employing a CatBoost model to predict potential purchases with 91% accuracy. This approach enables the identification of high-potential customers for targeted marketing campaigns, crucial for optimizing marketing efforts. Transitioning to multimedia enhancement, the dissertation presents a lightweight recurrent network for real-time VSR. Developed for applications requiring high-quality video with low latency, such as video conferencing and media playback, this model integrates an optical flow estimation network for motion compensation and leverages a hidden space for the propagation of long-term information. The model demonstrates high efficiency in VSR. A comparative analysis of motion estimation techniques underscores the importance of minimizing information loss. The evolution towards smart office environments underscores the importance of an efficient document routing system, conceptualized as an online class-incremental image classification challenge. This research introduces a one-versus-rest parametric classifier, complemented by two updating algorithms based on passive-aggressiveness, and adaptive thresholding methods to manage low-confidence predictions. Tested on 710 labeled real document images, the method reports a cumulative accuracy rate of approximately 97%, showcasing the effectiveness of the chosen aggressiveness parameter through various experiments.
46

Content-based digital video processing : digital videos segmentation, retrieval and interpretation

Chen, Juan January 2009
Recent research approaches in semantics-based video content analysis require shot boundary detection as the first step to divide video sequences into sections. Furthermore, with the advances in networking and computing capability, efficient retrieval of multimedia data has become an important issue. Content-based retrieval technologies have been widely implemented to protect intellectual property rights (IPR). In addition, automatic recognition of highlights from videos is a fundamental and challenging problem for content-based indexing and retrieval applications. In this thesis, a paradigm is proposed to segment, retrieve and interpret digital videos. Five algorithms are presented to solve the video segmentation task. Firstly, a simple shot cut detection algorithm is designed for real-time implementation. Secondly, a systematic method is proposed for shot detection using content-based rules and an FSM (finite state machine). Thirdly, shot detection is implemented using local and global indicators. Fourthly, a context awareness approach is proposed to detect shot boundaries. Fifthly, a fuzzy logic method is implemented for shot detection. Furthermore, a novel analysis approach is presented for the detection of video copies. It is robust to complicated distortions and capable of locating copied segments inside original videos. Then, objects and events are extracted from MPEG sequences for video highlight indexing and retrieval. Finally, a human fighting detection algorithm is proposed for movie annotation.
47

Analyse de l'hypovigilance au volant par fusion d'informations environnementales et d'indices vidéo / Driver hypovigilance analysis based on environmental information and video evidence

Garcia garcia, Miguel 19 October 2018
Driver hypovigilance (whether caused by distraction or drowsiness) is one of the main threats to road safety. This thesis is part of the Toucango project, led by the start-up Innov+, which aims to build a real-time hypovigilance detector based on the fusion of a near-infrared video stream and environmental information. The objective of the thesis is therefore to propose techniques for extracting relevant cues, as well as multimodal fusion algorithms that can be embedded in the system for real-time operation. In order to work under conditions close to the field, a real-world driving database was created in collaboration with several transport companies. We first present the scientific state of the art and a study of commercially available solutions for hypovigilance detection. We then propose several methods based on image processing (to detect relevant cues on the head, eyes, mouth and face) and data processing (for environmental cues based on geolocation). We carry out a study of the environmental factors related to hypovigilance and develop a contextual risk estimation system. Finally, we propose techniques for the multimodal fusion of these cues, with the aim of detecting several hypovigilance behaviors: visual or cognitive distraction, engagement in a secondary task, sleep deprivation, microsleep and drowsiness.
48

Desenvolvimento e implementação de instrumentação eletrônica para criação de estímulos visuais para experimentos com o duto óptico da mosca / High-performance visual stimulation system for use in neuroscience experiments with the blowfly

Gazziro, Mario Alexandre 23 September 2009
This thesis describes the development of visual stimulus generators for use in neuroscience experiments with invertebrates such as flies. The experiment consists of displaying a fixed image that is moved horizontally according to the stimulus data received. The system can display 640x480 pixels with 256 intensity levels at 200 frames per second on conventional raster-scan monitors. It is based on reconfigurable hardware (FPGA) and includes the logic that generates the video timings and synchronization signals, as well as the video memory. Special control logic was included to update the horizontal image offset according to the desired stimulus at 200 frames per second. In one of the generators developed, in order to double the horizontal positioning resolution, artificial inter-pixel steps were implemented using two video frame buffers containing, respectively, the odd and the even pixels of the original image to be displayed. This implementation produced a visual effect that doubles the horizontal positioning capability of the generator.
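One way to read the two-buffer scheme is sketched below in software. The abstract does not give the exact pixel mapping, so the following is an assumption: the source row is stored at twice the display width, the offset is expressed in half-pixel steps, its parity selects the buffer and its integer half selects the whole-pixel shift. The function names and the wrap-around at the row edge are hypothetical, and the real system is implemented in FPGA logic, not Python.

```python
from typing import List, Sequence, Tuple


def split_buffers(row: Sequence[int]) -> Tuple[List[int], List[int]]:
    """Split one source row into its even-indexed and odd-indexed pixels."""
    return list(row[0::2]), list(row[1::2])


def display_row(even: List[int], odd: List[int], offset: int, width: int) -> List[int]:
    """Return `width` displayed pixels for a horizontal offset given in half-pixel steps."""
    buf = odd if offset % 2 else even  # the parity of the offset selects the buffer
    shift = offset // 2                # whole-pixel shift inside the selected buffer
    # Wrapping around the end of the buffer is an assumption made for this sketch.
    return [buf[(j + shift) % len(buf)] for j in range(width)]


# Hypothetical 16-pixel source row shown on a 4-pixel-wide display.
source_row = list(range(16))
even, odd = split_buffers(source_row)
frames = [display_row(even, odd, k, 4) for k in range(4)]
```

With this mapping, displayed pixel `j` at offset `k` shows source pixel `2*j + k`, so each increment of `k` moves the image by half a displayed pixel, which is the doubling of positioning resolution described in the abstract.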
49

Coefficients de fiabilité et approche hierarchique pour la detection et le dénombrement de petits objets dans une vidéo / Reliability coefficients and hierarchical approach for detection and counting of small objects in videos

Pestova, Valentina 21 December 2018
The problem of counting a large number of very small moving objects in videos has so far received little attention. The main difficulty is that, because the objects appear so small in the video, no reliable geometric model of them can be defined. Existing work on object detection in video, however, often relies on such a geometric model of the objects of interest, so existing detection methods cannot be applied directly to very small objects. This thesis proposes a complete methodology for detecting many small objects, aimed in particular at detecting and counting migratory birds in a video. The innovative principle proposed as a solution is to associate detection reliability coefficients with the objects in order to count them while avoiding taking too many false detections into account. A hierarchical algorithm is proposed that analyzes the spatio-temporal aspect of objects (their appearance and their evolution over time) in a video using methods from image processing, statistics and fuzzy logic. The purpose of the reliability coefficients is to estimate the probability that the parameters of a detection match the parameters expected for the objects of interest. Finally, the set of coefficients is converted into a single value that evaluates the sequence of processing applied to an object, and the sum of these values corresponds to the number of objects of interest in the video. The results show that correct detections are mostly included in the count with reliability coefficients equal or close to 1, whereas false detections are removed or down-weighted with lower reliability coefficients. Counting results on videos containing very large numbers of birds are close to the ground truth, which supports the validity of the proposed solution as a means of automatically counting objects in videos.
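The idea of counting by summing reliability values instead of making hard keep-or-discard decisions can be sketched as follows. The abstract does not state how the per-stage coefficients are combined into one value, so taking their product is an assumption made only for illustration; the function names and the sample numbers are hypothetical.

```python
import math
from typing import Iterable, List


def detection_score(coefficients: Iterable[float]) -> float:
    """Combine the per-stage reliability coefficients of one detection (assumed rule: product)."""
    return math.prod(coefficients)


def estimated_count(detections: List[List[float]]) -> float:
    """Estimate the number of objects as the sum of the per-detection scores."""
    return sum(detection_score(c) for c in detections)


# Hypothetical coefficients in [0, 1] from two processing stages per detection.
detections = [
    [1.0, 1.0],  # confident detection, contributes a full count
    [0.5, 0.5],  # doubtful detection, contributes a quarter of a count
    [0.0, 1.0],  # rejected by one stage, contributes nothing
]
total = estimated_count(detections)
```

Under this rule a detection that every stage rates near 1 adds almost a full object to the count, while doubtful detections are down-weighted instead of being counted in full.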
50

Processamento e estilização de dados RGB-Z em tempo real / Real-time processing and stylization of RGB-Z data

Jesus, Alicia Isolina Pretel January 2014
Advisor: Prof. Dr. João Paulo Gois / Dissertation (master's) - Universidade Federal do ABC, Programa de Pós-Graduação em Ciências da computação, 2014. / The technological development of 3D capture devices in recent years has given users easy, low-cost access to 3D data. In this work we are interested in processing data from cameras that simultaneously produce sequences of images (RGB channels) and depth information for the objects that make up the scene (Z channel). Currently the most popular device for producing this type of information is the Microsoft Kinect, originally used for motion tracking in game applications. Depth information combined with the images enables many visual effects, such as relighting, abstraction and background segmentation, as well as modeling of the scene geometry. However, the depth sensor tends to generate noisy data, so multidimensional filters are required to stabilize the video frames. Accordingly, this work develops and evaluates a set of tools for RGB-Z video processing, from video stabilization filters to graphical effects (non-photorealistic renderings). To this end, a framework that captures and processes RGB-Z data interactively is proposed. The implementation of this framework explores GPU programming with the OpenGL Shading Language (GLSL).
