Global ETD Search

1	Multi-Object Tracking Using Dual-Attention with Regional-Representation Chen, Weijian January 2021 Researchers have shown that convolutional neural networks (CNNs) can achieve improved performance in multi-object tracking (MOT) by performing detection and re-identification (ReID) simultaneously. Many models have been created to overcome challenges and bring state-of-the-art performance to a new level. However, because CNN models only utilize features from a local region, the potential of the model has not been fully realized. Long-range dependencies in the spatial domain are usually difficult for a network to capture. Hence, how to obtain such dependencies has become a new focus in the MOT field. One approach is to adopt the self-attention mechanism used in the transformer. Since it was successfully transferred from natural language processing to computer vision, many recent works have incorporated it into their trackers. With the introduction of global information, the trackers become more robust and stable. There are also traditional methods, such as optical flow, which have been redesigned in the manner of CNNs and achieve satisfying performance. Optical flow can generate a correlation between feature maps and also obtain non-local information. However, the introduction of these mechanisms usually causes a significant surge in computational cost and memory. They also require a large number of epochs to train, so the training time is greatly increased. To solve this issue, we propose a new method to gather non-local information based on existing self-attention methods, which we name dual attention with regional-representation. It significantly reduces the training time as well as the inference time, causes only a small increase in memory usage, and is able to run at a reasonable speed. Our experiments show that this module helps ReID be more stable and improves performance in different tasks. 
/ Thesis / Master of Applied Science (MASc) Multi-Object Tracking Deep Learning Self-Attention
2	Improved 2D Camera-Based Multi-Object Tracking for Autonomous Vehicles Shinde, Omkar Mahesh 06 March 2025 Effective multi-object tracking is crucial for autonomous vehicles to navigate safely and efficiently in dynamic environments. To make autonomous vehicles more affordable, one area to address is the computational limitations of the sensors; cameras are therefore often the first-choice sensor. Three challenges in implementing multi-object tracking in autonomous vehicles are: 1) In these vehicles, sensors like cameras are not static, which can cause motion blur in the frames and make tracking inefficient. 2) Traditional methods for motion compensation, such as those used in Kalman Filter-based Multi-Object Tracking, require extensive parameter tuning to match features between consecutive frames accurately. 3) The simple intersection over union (IoU) metric is insufficient for reliable identification in such environments. This thesis proposes a novel methodology for 2D multi-object tracking in autonomous vehicles using a camera-based Tracking-by-Detection (TBD) approach, emphasizing four key innovations: (1) a real-time deblurring module to mitigate motion blur, ensuring clearer frames for accurate detection; (2) a deep learning-based motion compensation module that adapts dynamically to varying motion patterns, enhancing robustness; (3) an adaptive cost function for association, incorporating object appearance and temporal consistency to improve upon traditional IoU metrics; (4) the integration of the Unscented Kalman Filter (UKF) to effectively address non-linearities in the tracking process, enhancing state estimation accuracy. To maintain a Simple Online and Realtime (SORT) framework, we enhance detection by fine-tuning YOLOv8 and YOLOv9 models using autonomous driving datasets like BDD100K and KITTI, which are specifically tailored for these scenarios. 
Additionally, we incorporate a non-linear approach using the UKF to better capture the influence of various tracking dynamics, further improving tracking performance. Our evaluations show that the proposed methodology significantly outperforms existing state-of-the-art methods while maintaining the same inference rate as the baseline SORT model. These advancements not only improve the accuracy and reliability of multi-object tracking but also reduce the computational burden associated with parameter tuning and motion compensation. Consequently, this work presents a robust and efficient tracking solution for autonomous vehicles, making it viable for real-world deployment under both computational and cost constraints. / Master of Science / Tracking multiple objects is really important for self-driving cars to move safely in busy places. Cameras are often the best choice because they are cheaper and easier to use, but using cameras comes with three main challenges: (1) When cars move, cameras can make blurry images, which makes it harder to see and track things; (2) Traditional tracking methods, like Kalman Filters, need a lot of adjustments to work well; (3) Simple methods, like checking if objects overlap (called Intersection over Union), are not always good enough in crowded, complicated places. This thesis presents a new way to track lots of things using cameras, with four big improvements: (1) A real-time deblurring system that fixes blurry pictures so the camera can see things more clearly; (2) A smart system that uses deep learning to follow movement better; (3) A better way to match objects by using not just their positions but also how they look and move over time, which is better than old IoU methods; (4) A special tool called the Unscented Kalman Filter that helps track objects more accurately when their movements aren't simple or straight. 
To keep everything simple, fast, and real-time, we use object detectors to help find objects, and we train them with special self-driving datasets like BDD100K and KITTI. These datasets are great for showing the kinds of situations self-driving cars deal with. The Unscented Kalman Filter helps us track objects with more complicated movements, making everything more accurate. Our study shows that this new way works much better than older methods, without making the system slower. These improvements make tracking more reliable and cut down on the time needed for tuning and adjusting. Overall, this work provides a strong and simple solution for tracking things in self-driving cars, even if the computer isn't super powerful or the budget is small. Multi-Object Tracking 2D Perception Autonomous Driving
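As a rough illustration of innovation (3) in the abstract above, the sketch below blends an IoU term with an appearance term into a single association cost. It is not the thesis's cost function: the cosine-similarity appearance term, the function names, and the `w_iou` weight are assumptions chosen for illustration, and the temporal-consistency term is left out.

```python
import math


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes contribute no overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def association_cost(track_box, det_box, track_feat, det_feat, w_iou=0.5):
    """Hypothetical blended cost: 0 is a perfect match, 1 is no match.

    w_iou is an assumed weight, not a value taken from the thesis.
    """
    overlap_cost = 1.0 - iou(track_box, det_box)
    appearance_cost = 1.0 - cosine_similarity(track_feat, det_feat)
    return w_iou * overlap_cost + (1.0 - w_iou) * appearance_cost
```

A matrix of such costs between all tracks and detections would typically be passed to an assignment solver; with `w_iou=1.0` the cost reduces to the plain IoU criterion used by SORT-style trackers.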
3	Multi-object tracking with camera Thomas Brigneti, Andrés Attilio January 2019 Thesis submitted for the degree of Ingeniero Civil Eléctrico (Electrical Civil Engineer) / This work evaluates several tracking algorithms for the pedestrian tracking problem: given video from a security camera, the aim is to correctly recognize each individual over time while minimizing the number of wrongly assigned labels and unidentified objects (pedestrians). To this end, algorithms based on the concept of Random Finite Sets (RFS) are used, which use past measurements of the objects to predict the future positions of all of them simultaneously, while also accounting for object births and deaths. These algorithms were conceived for tracking objects with simple, predictable motion under a large amount of measurement noise, whereas the conditions under which they are evaluated here are drastically opposite, with a very high level of certainty in the measurements but highly non-linear and very unpredictable motion. An open library created by the researcher Ba Tuong Vo is used, in which several of the most classic algorithms in this area are implemented. For this reason, the work focuses on analyzing the results under these new conditions and observing how they compare to current algorithms from the Computer Vision (CV) / Machine Learning (ML) field, using both RFS and CV metrics. Computer vision Image processing Computational algorithms Multi Object Tracking
4	A Graph Convolutional Neural Network Based Approach for Object Tracking Using Augmented Detections With Optical Flow Papakis, Ioannis 18 May 2021 This thesis presents a novel method for online Multi-Object Tracking (MOT) using Graph Convolutional Neural Network (GCNN) based feature extraction and end-to-end feature matching for object association. The Graph based approach incorporates both appearance and geometry of objects at past frames as well as the current frame into the task of feature learning. This new paradigm enables the network to leverage the "contextual" information of the geometry of objects and allows us to model the interactions among the features of multiple objects. Another central innovation of the proposed framework is the use of the Sinkhorn algorithm for end-to-end learning of the associations among objects during model training. The network is trained to predict object associations by taking into account constraints specific to the MOT task. Additionally, in order to increase the sensitivity of the object detector, a new approach is presented that propagates previous frame detections into each new frame using optical flow. These are treated as added object proposals which are then classified as objects. A new traffic monitoring dataset is also provided, which includes naturalistic video footage from current infrastructure cameras in Virginia Beach City with a variety of vehicle density and environment conditions. Experimental evaluation demonstrates the efficacy of the proposed approaches on the provided dataset and the popular MOT Challenge Benchmark. / Master of Science / This thesis presents a novel method for Multi-Object Tracking (MOT) in videos, with the main goal of associating objects between frames. The proposed method is based on a Deep Neural Network Architecture operating on a Graph Structure. 
The Graph based approach makes it possible to use both appearance and geometry of detected objects to retrieve high level information about their characteristics and interaction. The framework includes the Sinkhorn algorithm, which can be embedded in the training phase to satisfy MOT constraints, such as the 1 to 1 matching between previous and new objects. Another approach is also proposed to improve the sensitivity of the object detector by using previous frame detections as a guide to detect objects in each new frame, resulting in fewer missed objects. Alongside the new methods, a new dataset is also provided which includes naturalistic video footage from current infrastructure cameras in Virginia Beach City with a variety of vehicle density and environment conditions. Experimental evaluation demonstrates the efficacy of the proposed approaches on the provided dataset and the popular MOT Challenge Benchmark. computer vision multi object tracking deep learning graph neural networks
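To make the role of the Sinkhorn algorithm in the abstract above more concrete, the sketch below shows the generic Sinkhorn normalization: a matrix of matching scores is exponentiated and its rows and columns are rescaled in turn until it is approximately doubly stochastic, which is a soft form of the 1 to 1 matching constraint. This is the textbook procedure under simplifying assumptions (square score matrix, fixed iteration count, plain Python lists); the function name and parameters are illustrative and are not taken from the thesis.

```python
import math


def sinkhorn(scores, n_iters=50):
    """Turn a square score matrix into an approximately doubly stochastic one.

    scores[i][j] is an assumed affinity between previous object i and
    new object j; higher means a more likely match.
    """
    # Exponentiate so every entry is strictly positive.
    m = [[math.exp(s) for s in row] for row in scores]
    n_cols = len(m[0])
    for _ in range(n_iters):
        # Row step: each previous object distributes a total weight of 1.
        m = [[v / sum(row) for v in row] for row in m]
        # Column step: each new object receives a total weight of 1.
        col_sums = [sum(row[j] for row in m) for j in range(n_cols)]
        m = [[v / col_sums[j] for j, v in enumerate(row)] for row in m]
    return m
```

Because every step is differentiable, a normalization of this kind can sit inside a network and be trained end to end; at inference time a hard assignment can be read off the resulting soft matrix.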
5	Mental Imagery and Tracking Bruzadin Nunes, Ugo 01 December 2018 This study aimed to better understand visuomotor tracking and spatial visual imagery. A total of 101 participants performed four tasks: a Manual Tracking Task (MTT), in which participants mouse-tracked the path of a circle, sometimes with occlusion; a Multi-Object Tracking task (MOT), in which participants tracked several objects simultaneously; the Sussex Cognitive Styles Questionnaire (SCSQ), in which participants self-reported their experience with imagery; and a Mental Rotation Task (MRT), in which participants mentally rotated Tetris-like objects. The results demonstrated a significant correlation between the technical/spatial subscale of the SCSQ and the occluded MTT, the MRT, and the MOT, but not the visible MTT. A multiple regression showed that occluded MTT and the MRT together significantly predicted the spatial/technical subscale of the SCSQ above visible MTT and MOT. These findings support the claim that the cognitive resources behind mental imagery may also be recruited during other tasks that arguably draw on the need for internal visualization. correlational study Manual Tracking Mental Imagery mental rotation multi-object tracking spatial imagery
6	Utilisation du contexte pour la détection et le suivi d'objets en vidéosurveillance / Using the context for objects detection and tracking in videosurveillance Rogez, Matthieu 09 June 2015 Les caméras de surveillance sont de plus en plus fréquemment présentes dans notre environnement (villes, supermarchés, aéroports, entrepôts, etc.). Ces caméras sont utilisées, entre autres, afin de pouvoir détecter des comportements suspects (intrusion par exemple) ou de reconnaître une catégorie d'objets ou de personnes (détection de genre, détection de plaques d'immatriculation par exemple). D'autres applications concernent également l'établissement de statistiques de fréquentation ou de passage (comptage d'entrée/sortie de personnes ou de véhicules) ou bien le suivi d'un ou plusieurs objets se déplaçant dans le champ de vision de la caméra (trajectoires d'objets, analyse du comportement des clients dans un magasin). Compte tenu du nombre croissant de caméras et de la difficulté à réaliser ces traitements manuellement, un ensemble de méthodes d'analyse vidéo ont été développées ces dernières années afin de pouvoir automatiser ces tâches. Dans cette thèse, nous nous concentrons essentiellement sur les tâches de détection et de suivi des objets mobiles à partir d'une caméra fixe. Contrairement aux méthodes basées uniquement sur les images acquises par les caméras, notre approche consiste à intégrer un certain nombre d'informations contextuelles à l'observation afin de pouvoir mieux interpréter ces images. Ainsi, nous proposons de construire un modèle géométrique et géolocalisé de la scène et de la caméra. Ce modèle est construit directement à partir des études de prédéploiement des caméras et peut notamment utiliser les données OpenStreetMap afin d'établir les modèles 3d des bâtiments proches de la caméra. 
Nous avons complété ce modèle en intégrant la possibilité de prédire la position du Soleil tout au long de la journée et ainsi pouvoir calculer les ombres projetées des objets de la scène. Cette prédiction des ombres a été mise à profit afin d'améliorer la segmentation des piétons par modèle de fond en supprimant les ombres du masque de mouvement. Concernant le suivi des objets mobiles, nous utilisons le formalisme des automates finis afin de modéliser efficacement les états et évolutions possibles d'un objet. Ceci nous permet d'adapter le traitement de chaque objet selon son état. Nous gérons les occultations inter-objets à l'aide d'un mécanisme de suivi collectif (suivi en groupe) des objets le temps de l'occultation et de ré-identification de ceux-ci à la fin de l'occultation. Notre algorithme s'adapte à n'importe quel type d'objet se déplaçant au sol (piétons, véhicules, etc.) et s'intègre naturellement au modèle de scène développé. Nous avons également développé un ensemble de "rétro-actions" tirant parti de la connaissance des objets suivis afin d'améliorer les détections obtenues à partir d'un modèle de fond. En particulier, nous avons abordé le cas des objets stationnaires, souvent intégrés à tort dans le fond, et avons revisité la méthode de suppression des ombres du masque de mouvement en tirant parti de la connaissance des objets suivis. L'ensemble des solutions proposées a été implémenté dans le logiciel de l'entreprise Foxstream et est compatible avec la contrainte d'exécution en temps réel nécessaire en vidéosurveillance. / Video-surveillance cameras are increasingly used in our environment. They are indeed present almost everywhere in the cities, supermarkets, airports, warehouses, etc. These cameras are used, among other things, in order to detect suspect behavior (an intrusion for instance) or to recognize a specific category of object or person (gender detection, license plates detection). 
Other applications also exist to count and/or track people in order to analyze their behavior. Due to the increasing number of cameras and the difficulty of performing these tasks manually, several video analysis methods have been developed in order to address them automatically. In this thesis, we mainly focus on the detection and tracking of moving objects from a fixed camera. Unlike methods based solely on images captured by cameras, our approach integrates contextual pieces of information in order to better interpret these images. Thus we propose to build a geometric and geolocalized model of the scene and the camera. This model is built directly from the pre-deployment studies of the cameras and uses the OpenStreetMap geographical database to build 3d models of buildings near the camera. We added to this model the ability to predict the position of the sun throughout the day and the resulting shadows in the scene. By predicting the shadows, and deleting them from the foreground mask, our method is able to improve the segmentation of pedestrians. Regarding the tracking of multiple mobile objects, we use the formalism of finite state machines to effectively model the states and possible transitions that an object is allowed to take. This allows us to tailor the processing of each object according to its state. We manage the inter-object occlusion using a collective tracking strategy. When taking part in an occlusion, objects are regrouped and tracked collectively. At the end of the occlusion, each object is re-identified and individual tracking resumes. Our algorithm adapts to any type of ground-moving object (pedestrians, vehicles, etc.) and seamlessly integrates into the developed scene model. We have also developed several retro-actions taking advantage of the knowledge of tracked objects to improve the detections obtained with the background model. 
In particular, we tackle the issue of stationary objects often integrated erroneously in the background and we revisited the initial proposal regarding shadow removal. All proposed solutions have been implemented in the Foxstream products and are able to run in real-time. Modèle de fond Suivi multi-objets Ombres Vidéosurveillance OpenStreetMap Background model Multi-object tracking Shadows Videosurveillance OpenStreetMap
7	A LIGHTWEIGHT CAMERA-LIDAR FUSION FRAMEWORK FOR TRAFFIC MONITORING APPLICATIONS / A CAMERA-LIDAR FUSION FRAMEWORK Sochaniwsky, Adrian January 2024 Intelligent Transportation Systems are advanced technologies used to reduce traffic and increase road safety for vulnerable road users. Real-time traffic monitoring is an important technology for collecting and reporting the information required to achieve these goals through the detection and tracking of road users inside an intersection. To be effective, these systems must be robust to all environmental conditions. This thesis explores the fusion of camera and Light Detection and Ranging (LiDAR) sensors to create an accurate and real-time traffic monitoring system. Sensor fusion leverages complementary characteristics of the sensors to increase system performance in low-light and inclement weather conditions. To achieve this, three primary components are developed: a 3D LiDAR detection pipeline, a camera detection pipeline, and a decision-level sensor fusion module. The proposed pipeline is lightweight, running at 46 Hz on modest computer hardware, and accurate, scoring 3% higher than the camera-only pipeline based on the Higher Order Tracking Accuracy metric. The camera-LiDAR fusion system is built on the ROS 2 framework, which provides a well-defined and modular interface for developing and evaluating new detection and tracking algorithms. Overall, the fusion of camera and LiDAR sensors will enable future traffic monitoring systems to provide cities with real-time information critical for increasing safety and convenience for all road-users. / Thesis / Master of Applied Science (MASc) / Accurate traffic monitoring systems are needed to improve the safety of road users. 
These systems allow the intersection to “see” vehicles and pedestrians, providing near instant information to assist future autonomous vehicles, and provide data to city planners and officials to enable reductions in traffic, emissions, and travel times. This thesis aims to design, build, and test a traffic monitoring system that uses a camera and 3D laser scanner to find and track road users in an intersection. By combining a camera and 3D laser scanner, this system aims to perform better than either sensor alone. Furthermore, this thesis will collect test data to prove it is accurate and able to see vehicles and pedestrians during the day and night, and test if it runs fast enough for “live” use. computer vision LiDAR object detection multi-object tracking intelligent transportation systems sensor fusion
8	Représenter pour suivre : exploitation de représentations parcimonieuses pour le suivi multi-objets / Representations for tracking : exploiting sparse representations for multi-object tracking Fagot-Bouquet, Loïc Pierre 20 March 2017 Le suivi multi-objets, malgré les avancées récentes en détection d'objets, présente encore plusieurs difficultés spécifiques et reste ainsi une problématique difficile. Au cours de cette thèse nous proposons d'examiner l'emploi de représentations parcimonieuses au sein de méthodes de suivi multi-objets, dans le but d'améliorer les performances de ces dernières. La première contribution de cette thèse consiste à employer des représentations parcimonieuses collaboratives dans un système de suivi en ligne pour distinguer au mieux les cibles. Des représentations parcimonieuses structurées sont ensuite considérées pour s'adapter plus spécifiquement aux approches de suivi à fenêtre glissante. Une dernière contribution consiste à employer des dictionnaires denses, prenant en considération un grand nombre de positions non détectées au sein des images, de manière à être plus robuste vis-à-vis de la performance du détecteur d'objets employé. / Despite recent advances in object detection, multi-object tracking still raises some specific issues and therefore remains a challenging problem. In this thesis, we propose to investigate the use of sparse representations within multi-object tracking approaches in order to improve their performance. The first contribution of this thesis consists in designing an online tracking approach that takes advantage of collaborative sparse representations to better distinguish between the targets. Then, structured sparse representations are considered in order to be better suited to tracking approaches based on a sliding window. 
In order to rely less on the object detector quality, the last contribution of this thesis uses dense dictionaries that take into account a large number of undetected locations inside each frame. Suivi visuel multi-objets Suivi par détection Représentations parcimonieuses Multi-object tracking Tracking by detection Sparse representations
9	Détermination et implémentation temps-réel de stratégies de gestion de capteurs pour le pistage multi-cibles / Real-Time Sensor Management Strategies for Multi-Object Tracking Gomes borges, Marcos Eduardo 19 December 2018 Les systèmes de surveillance modernes doivent coordonner leurs stratégies d’observation pour améliorer l’information obtenue lors de leurs futures mesures afin d’estimer avec précision les états des objets d’intérêt (emplacement, vitesse, apparence, etc.). Par conséquent, la gestion adaptative des capteurs consiste à déterminer les stratégies de mesure des capteurs exploitant les informations a priori afin de déterminer les actions de détection actuelles. L’une des applications la plus connue de la gestion des capteurs est le suivi multi-objet, qui fait référence au problème de l’estimation conjointe du nombre d’objets et de leurs états ou trajectoires à partir de mesures bruyantes. Cette thèse porte sur les stratégies de gestion des capteurs en temps réel afin de résoudre le problème du suivi multi-objet dans le cadre de l’approche RFS labélisée. La première contribution est la formulation théorique rigoureuse du filtre mono-capteur LPHD avec son implémentation Gaussienne. La seconde contribution est l’extension du filtre LPHD pour le cas multi-capteurs. La troisième contribution est le développement de la méthode de gestion de capteurs basée sur la minimisation du risque Bayes et formulée dans les cadres POMDP et LRFS. En outre, des analyses et des simulations des approches de gestion de capteurs existantes pour le suivi multi-objets sont fournies / Modern surveillance systems must coordinate their observation strategies to enhance the information obtained by their future measurements in order to accurately estimate the states of objects of interest (location, velocity, appearance, etc). 
Therefore, adaptive sensor management consists of determining sensor measurement strategies that exploit a priori information in order to determine current sensing actions. One of the most challenging applications of sensor management is the multi-object tracking, which refers to the problem of jointly estimating the number of objects and their states or trajectories from noisy sensor measurements. This thesis focuses on real-time sensor management strategies formulated in the POMDP framework to address the multi-object tracking problem within the LRFS approach. The first key contribution is the rigorous theoretical formulation of the mono-sensor LPHD filter with its Gaussian-mixture implementation. The second contribution is the extension of the mono-sensor LPHD filter for superpositional sensors, resulting in the theoretical formulation of the multi-sensor LPHD filter. The third contribution is the development of the Expected Risk Reduction (ERR) sensor management method based on the minimization of the Bayes risk and formulated in the POMDP and LRFS framework. Additionally, analyses and simulations of the existing sensor management approaches for multi-object tracking, such as Task-based, Information-theoretic, and Risk-based sensor management, are provided. Gestion de capteur Pistage multi-cibles Filtre PHD labelisée Filtre PHD multi-capteurs labelisée Sensor management Multi-object tracking Labeled PHD filter Multi-sensors labeled PHD filter
10	visual tracking and object motion prediction for intelligent vehicles / Suivi visuel et prédiction de mouvement des objets pour véhicules intelligents Yang, Tao 02 May 2019 Le suivi d’objets et la prédiction de mouvement sont des aspects importants pour les véhicules autonomes. Tout d'abord, nous avons développé une méthode de suivi mono-objet en utilisant le compressive tracking, afin de corriger le suivi à base de flux optique et d’arriver ainsi à un compromis entre performance et vitesse de traitement. Compte tenu de l'efficacité de l'extraction de caractéristiques comprimées (compressive features), nous avons appliqué cette méthode de suivi au cas multi-objets pour améliorer les performances sans trop ralentir la vitesse de traitement. Deuxièmement, nous avons amélioré la méthode de suivi mono-objet basée sur DCF en utilisant des caractéristiques provenant d’un CNN multicouches, une analyse de fiabilité spatiale (via un masque d'objet) ainsi qu’une stratégie conditionnelle de mise à jour de modèle. Ensuite, nous avons appliqué la méthode améliorée au cas du suivi multi-objets. Les VGGNet-19 et DCFNet pré-entraînés sont testés respectivement en tant qu’extracteurs de caractéristiques. Le modèle discriminant réalisé par DCF est pris en compte dans l’étape d'association des données. Troisièmement, deux modèles LSTM (seq2seq et seq2dense) pour la prédiction de mouvement des véhicules et piétons dans le système de référence de la caméra sont proposés. En se basant sur des données visuelles et un nuage de points 3D (LiDAR), un système de suivi multi-objets basé sur un filtre de Kalman avec un détecteur 3D sont utilisés pour générer les trajectoires des objets à tester. Les modèles proposées et le modèle de régression polynomiale, considéré comme méthode de référence, sont comparés et évalués. / Object tracking and motion prediction are important for autonomous vehicles and can be applied in many other fields. 
First, we design a single object tracker using compressive tracking to correct the optical flow tracking in order to achieve a balance between performance and processing speed. Considering the efficiency of compressive feature extraction, we apply this tracker to multi-object tracking to improve the performance without slowing down processing too much. Second, we improve the DCF based single object tracker by introducing multi-layer CNN features, spatial reliability analysis (through a foreground mask) and a conditional model updating strategy. Then, we apply the DCF based CNN tracker to multi-object tracking. The pre-trained VGGNet-19 and DCFNet are each tested as feature extractors. The discriminative model achieved by DCF is considered for data association. Third, two LSTM models (seq2seq and seq2dense) are proposed for motion prediction of vehicles and pedestrians in the camera coordinate system. Based on visual data and a 3D point cloud (LiDAR), a Kalman filter based multi-object tracking system with a 3D detector is used to generate the object trajectories for testing. The proposed models and a polynomial regression model, considered as the baseline, are compared and evaluated. Suivi mono-Objet Suivi multi-Objets Prédiction de mouvement Acquisition comprimée Apprentissage profond Single object tracking Multi-Object tracking Motion prediction Compressive sensing Deep learning 004
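The polynomial regression baseline mentioned in the abstract above can be pictured with a small sketch: fit a low-degree polynomial to one coordinate of a past trajectory by least squares, then evaluate it at a future time step. The degree, the per-coordinate fitting, and the function names `polyfit` and `predict` are assumptions for illustration; the abstract does not say how the baseline was configured.

```python
def polyfit(ts, xs, degree):
    """Least-squares coefficients c[0] + c[1]*t + ... + c[degree]*t**degree."""
    n = degree + 1
    # Normal equations A c = b, with A[i][j] = sum(t**(i+j)), b[i] = sum(x * t**i).
    a = [[sum(t ** (i + j) for t in ts) for j in range(n)] for i in range(n)]
    b = [sum(x * t ** i for t, x in zip(ts, xs)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= f * a[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - s) / a[i][i]
    return coeffs


def predict(coeffs, t):
    """Evaluate the fitted polynomial at time t."""
    return sum(c * t ** k for k, c in enumerate(coeffs))
```

For a 2D or 3D trajectory, the same fit would be run once per coordinate. A baseline of this kind extrapolates smooth motion well but cannot anticipate manoeuvres, which is the gap learned sequence models such as LSTMs aim to close.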
