1 |
Exploratory search through large video corpora
Castañón, Gregory David (21 June 2016)
Activity retrieval is a growing field in electrical engineering that specializes in the search and retrieval of relevant activities and events in video corpora. With the affordability and popularity of cameras for government, personal and retail use, the quantity of available video data is rapidly outscaling our ability to reason over it. To empower users to navigate and interact with the contents of these video corpora, we propose a framework for exploratory search that emphasizes activity structure and search space reduction over complex feature representations.
Exploratory search is a user-driven process in which a person provides a system with a query describing the activity, event, or object they are interested in finding. Typically, this description takes the implicit form of one or more exemplar videos, but it can also involve an explicit description. The system returns candidate matches, followed by query refinement and iteration. System performance is judged by the run-time of the system and the precision/recall curve of the query matches returned.
Scaling is one of the primary challenges in video search. From vast web-video archives like YouTube (1 billion videos and counting) to the 30 million active surveillance cameras shooting an estimated 4 billion hours of footage every week in the United States, trying to find a set of matches can be like looking for a needle in a haystack. Our goal is to create an efficient archival representation of video corpora that can be computed in real time as video streams in, and that then lets a user quickly retrieve a set of matching results.
First, we design a system for rapidly identifying simple queries in large-scale video corpora. Instead of focusing on feature design, our system focuses on the spatiotemporal relationships between those features as a means of disambiguating an activity of interest from background. We define a semantic feature vocabulary of concepts that are both readily extracted from video and easily understood by an operator. As data streams in, features are hashed to an inverted index and retrieved in constant time after the system is presented with a user's query.
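The hash-then-look-up idea described above can be sketched as a small inverted index. This is only an illustration under stated assumptions: the class name `InvertedIndex`, the clip identifiers, and the (attribute, value) feature tuples are hypothetical and are not taken from the thesis.

```python
from collections import defaultdict


class InvertedIndex:
    """Maps each semantic feature to the set of clips in which it was observed."""

    def __init__(self):
        # Hypothetical layout: feature -> set of clip identifiers.
        self._postings = defaultdict(set)

    def add(self, clip_id, features):
        # Called as video streams in: one dictionary (hash) insert per feature.
        for feature in features:
            self._postings[feature].add(clip_id)

    def lookup(self, feature):
        # Average-case constant-time hash lookup for a single feature.
        return self._postings.get(feature, set())

    def candidates(self, query_features):
        # Clips that contain every feature named in the query.
        posting_sets = [self.lookup(f) for f in query_features]
        if not posting_sets:
            return set()
        return set.intersection(*posting_sets)


index = InvertedIndex()
index.add("clip_a", [("color", "red"), ("speed", "fast"), ("type", "vehicle")])
index.add("clip_b", [("color", "red"), ("speed", "slow"), ("type", "person")])
index.add("clip_c", [("color", "blue"), ("speed", "fast"), ("type", "vehicle")])

red_vehicles = index.candidates([("color", "red"), ("type", "vehicle")])
```

Each individual feature lookup is a hash access; intersecting the posting sets then costs time proportional to their sizes, which is why a later reduction step over the candidates is still useful.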
We take a zero-shot approach to exploratory search: the user manually assembles vocabulary elements like color, speed, size and type into a graph. Given that information, we perform an initial downsampling of the archived data, and design a novel dynamic programming approach based on genome-sequencing to search for similar patterns. Experimental results indicate that this approach outperforms other methods for detecting activities in surveillance video datasets.
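The abstract does not specify the dynamic program itself. As a rough illustration of the kind of sequence-alignment recurrence used in genome analysis, the sketch below applies a Smith-Waterman-style local alignment to sequences of symbolic event labels. The function name, the label strings, and the scoring parameters are all assumptions chosen for the example, not the thesis's method.

```python
def local_alignment_score(query, archive, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between two label sequences (Smith-Waterman style)."""
    rows, cols = len(query) + 1, len(archive) + 1
    # score[i][j] = best score of an alignment ending at query[i-1], archive[j-1].
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if query[i - 1] == archive[j - 1] else mismatch
            score[i][j] = max(
                0,                        # start a fresh alignment here
                score[i - 1][j - 1] + step,  # align the two labels
                score[i - 1][j] + gap,    # skip a query label
                score[i][j - 1] + gap,    # skip an archive label
            )
            best = max(best, score[i][j])
    return best


# Hypothetical event labels extracted from a query and from archived video.
query = ["enter", "stop", "exit"]
exact = local_alignment_score(query, ["walk", "enter", "stop", "exit", "walk"])
gapped = local_alignment_score(query, ["enter", "wait", "stop", "exit"])
```

Because the recurrence tolerates gaps and mismatches, an archive sequence that contains the queried pattern with an extra intervening event still scores highly, only slightly below an exact occurrence.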
Second, we address the problem of representing complex activities that take place over long spans of space and time. Subgraph and graph matching methods have seen limited use in exploratory search because both problems are provably NP-hard. In this work, we render these problems computationally tractable by identifying the maximally discriminative spanning tree (MDST), and using dynamic programming to optimally reduce the archive data based on a custom algorithm for tree-matching in attributed relational graphs. We demonstrate the efficacy of this approach on popular surveillance video datasets in several modalities.
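How the thesis scores discriminativeness is not stated in this abstract. Purely as an assumption for illustration, suppose each edge of the query graph carries a numeric discriminativeness score; a spanning tree maximizing the total score can then be found with Kruskal's algorithm on edges sorted in descending order. The function name and the example scores below are hypothetical.

```python
def max_spanning_tree(num_nodes, edges):
    """Kruskal's algorithm on (score, u, v) edges, keeping the highest-scoring tree."""
    parent = list(range(num_nodes))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    # Visit edges from most to least discriminative (assumed scores).
    for score, u, v in sorted(edges, reverse=True):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two components, so no cycle forms
            parent[root_u] = root_v
            tree.append((u, v, score))
    return tree


# Hypothetical query graph with 4 nodes and assumed per-edge scores.
query_edges = [(0.9, 0, 1), (0.2, 1, 2), (0.7, 0, 2), (0.4, 2, 3)]
tree = max_spanning_tree(4, query_edges)
```

Matching a tree rather than the full graph is what makes dynamic programming applicable, since tree matching does not share the NP-hardness of general subgraph matching; the edges left out of the tree can be checked afterwards on the reduced candidate set.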
Finally, we design an approach for successive search space reduction in subgraph matching problems. Given a query graph and archival data, our algorithm iteratively selects spanning trees from the query graph that optimize the expected search space reduction at each step until the archive converges. We use this approach to efficiently reason over video surveillance datasets, simulated data, and large graphs of protein data.
|
2 |
Effective and Efficient Similarity Search in Video Databases
Jie Shao (date unknown)
Searching for relevant information based on content features in video databases is an interesting and challenging research topic that has drawn considerable attention recently. Video similarity search has many practical applications, such as TV broadcast monitoring, copyright compliance enforcement and search result clustering. However, existing studies are limited in their ability to provide fast and accurate solutions because of the diverse variations among the videos in large collections. In this thesis, we introduce database support for effective and efficient video similarity search from various sources, even in the presence of transformation distortion, partial content re-ordering, insertion, deletion or replacement. Specifically, we focus on processing two different types of content-based queries: video clip retrieval in a large collection of segmented short videos, and video subsequence identification from a long unsegmented stream.
The first part of the thesis investigates how to process a number of individual kNN searches on the same database simultaneously, in order to reduce the computational overhead of current content-based video search systems. We propose a Dynamic Query Ordering (DQO) algorithm for efficiently processing Batch Nearest Neighbor (BNN) search in high-dimensional space, with advanced optimizations of both I/O cost and CPU cost.
The second part of the thesis tackles the previously unstudied problem of temporally localizing similar content in a long unsegmented video sequence, extended to identify occurrences whose ordering or length may differ from the query because of video content editing. A graph transformation and matching approach, supported by the above BNN search, is proposed as a filter-and-refine query processing strategy to identify the most similar subsequence effectively while remaining efficient.
The third part of the thesis extends the Bounded Coordinate System (BCS) method that we introduced earlier for video clip retrieval. A novel collective perspective is presented that exploits the distributional discrepancy of samples to assess the similarity between two video clips. Several ideas from non-parametric hypothesis testing in statistics are used to test whether two ensembles of points are drawn from the same distribution. The proposed similarity measures can provide a more comprehensive analysis that captures the essence of invariant distribution information for retrieving video clips.
For each part, we present comprehensive experimental evaluations, which show improved performance compared with state-of-the-art methods. Finally, some planned extensions of this work are highlighted as future research objectives.
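The abstract does not name the specific non-parametric tests used. As one familiar example of the general idea, the sketch below computes the two-sample Kolmogorov-Smirnov statistic for one-dimensional samples: the largest gap between the two empirical distribution functions. Treating each clip as a list of scalar feature values is a simplifying assumption made for this illustration; the thesis works with ensembles of points and may use different tests.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic for one-dimensional samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    largest_gap = 0.0
    # The empirical CDFs only change at observed values, so checking those suffices.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # fraction of sample_a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of sample_b that is <= x
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Hypothetical scalar features from two pairs of clips.
same = ks_statistic([1, 2, 3], [1, 2, 3])
disjoint = ks_statistic([1, 2, 3], [10, 11, 12])
```

A statistic near 0 suggests the two samples could come from the same distribution, while a value near 1 indicates that they are well separated, so a quantity of this kind can serve as a dissimilarity score between clips.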
|
3 |
Vyhledávání informací a navigace v audiovizuálních archívech / Information retrieval and navigation in audio-visual archives
Galuščáková, Petra (January 2018)
The thesis probes issues associated with interactive audio and video retrieval of relevant segments. Text-based methods for search in audio-visual archives using automatic transcripts, subtitles and metadata are first described. Search quality is analyzed with respect to video segmentation methods. Navigation using multimodal hyperlinks between video segments is then examined as well as methods for automatic detection of the most informative anchoring segments suitable for subsequent hyperlinking application. The described text-based search, hyperlinking and anchoring methods are finally presented in working form through their incorporation in an online graphical user interface.
|