1 |
Visual Tracking Using Deep Motion Features / Visuell följning med hjälp av djup inlärning och optiskt flödeGladh, Susanna January 2016 (has links)
Generic visual tracking is a challenging computer vision problem, where the position of a specified target is estimated through a sequence of frames. The only given information is the initial location of the target. Therefore, the tracker has to adapt and learn any kind of object, which it describes through visual features used to differentiate target from background. Standard appearance features only capture momentary visual information. This master’s thesis investigates the use of deep features extracted through optical flow images processed in a deep convolutional network. The optical flow is calculated using two consecutive images, and thereby captures the dynamic nature of the scene. Results show that this information is complementary to the standard appearance features, and improves performance of the tracker. Deep features are typically very high dimensional. Employing dimensionality reduction can increase both the efficiency and performance of the tracker. As a second aim in this thesis, PCA and PLS were evaluated and compared. The evaluations show that the two methods are almost equal in performance, with PLS actually receiving slightly better score than the popular PCA. The final proposed tracker was evaluated on three challenging datasets, and was shown to outperform other state-of-the-art trackers.
|
2 |
Incorporating Scene Depth in Discriminative Correlation Filters for Visual TrackingStynsberg, John January 2018 (has links)
Visual tracking is a computer vision problem where the task is to follow a targetthrough a video sequence. Tracking has many important real-world applications in several fields such as autonomous vehicles and robot-vision. Since visual tracking does not assume any prior knowledge about the target, it faces different challenges such occlusion, appearance change, background clutter and scale change. In this thesis we try to improve the capabilities of tracking frameworks using discriminative correlation filters by incorporating scene depth information. We utilize scene depth information on three main levels. First, we use raw depth information to segment the target from its surroundings enabling occlusion detection and scale estimation. Second, we investigate different visual features calculated from depth data to decide which features are good at encoding geometric information available solely in depth data. Third, we investigate handling missing data in the depth maps using a modified version of the normalized convolution framework. Finally, we introduce a novel approach for parameter search using genetic algorithms to find the best hyperparameters for our tracking framework. Experiments show that depth data can be used to estimate scale changes and handle occlusions. In addition, visual features calculated from depth are more representative if they were combined with color features. It is also shown that utilizing normalized convolution improves the overall performance in some cases. Lastly, the usage of genetic algorithms for hyperparameter search leads to accuracy gains as well as some insights on the performance of different components within the framework.
|
Page generated in 0.0222 seconds