1 |
A Wireless Traffic Surveillance System Using Video Analytics
Luo, Ning 05 1900
Video surveillance systems are commonly used in transportation systems to support traffic monitoring, speed estimation, and incident detection. However, developing and deploying such systems poses several challenges, including high development and maintenance costs, bandwidth bottlenecks on long-range links, and a lack of advanced analytics. In this thesis, I leverage current wireless, video camera, and analytics technologies to present a wireless traffic monitoring system. I first present an overview of the system. Then I describe the site investigation and several test links with different hardware/software configurations to demonstrate the effectiveness of the system. The system development process was documented to provide guidelines for future development. Furthermore, I propose a novel speed-estimation analytics algorithm that takes into consideration roads with slope angles. I prove the correctness of the algorithm theoretically and validate its effectiveness experimentally. The experimental results on both synthetic and real datasets show that the algorithm is more accurate than the baseline algorithm 80% of the time. On average, the accuracy improvement of speed estimation is over 3.7%, even for very small slope angles.
|
2 |
Bayesian Nonparametric Modeling of Temporal Coherence for Entity-Driven Video Analytics
Mitra, Adway January 2015
In recent times there has been an explosion of online user-generated video content. This has generated significant research interest in video analytics. Human users understand videos based on high-level semantic concepts. However, most current research in video analytics is driven by low-level features and descriptors, which often lack semantic interpretation. Existing attempts at semantic video analytics are specialized and require additional resources like movie scripts, which are not available for most user-generated videos. There are no general-purpose approaches to understanding videos through semantic concepts.
In this thesis we attempt to bridge this gap. We view videos as collections of entities, which are semantic visual concepts like the persons in a movie or cars in an F1 race video. We focus on two fundamental tasks in Video Understanding, namely summarization and scene discovery. Entity-driven Video Summarization and Entity-driven Scene Discovery are important open problems. They are challenging due to the spatio-temporal nature of videos, and also due to the lack of a priori information about entities. We use Bayesian nonparametric methods to solve these problems. In the absence of external resources like scripts, we utilize fundamental structural properties of videos such as temporal coherence, which means that adjacent frames should contain the same set of entities and have similar visual features. There have been no focussed attempts to model this important property. This thesis makes several contributions in Computer Vision and Bayesian nonparametrics by addressing Entity-driven Video Understanding through temporal coherence modeling.
Temporal coherence in a video is observed across its frames at the level of features/descriptors, as well as at the semantic level. We start with an attempt to model temporal coherence at the level of features/descriptors. A tracklet is a spatio-temporal fragment of a video: a set of spatial regions in a short sequence (5-20) of consecutive frames, each of which encloses a particular entity. We attempt to find a representation of tracklets to aid tracking of entities. We explore region descriptors like covariance matrices of spatial features in individual frames. Due to temporal coherence, such matrices from corresponding spatial regions in successive frames have nearly identical eigenvectors. We utilize this property to model a tracklet using a covariance matrix, and use it for region-based entity tracking. We propose a new method to estimate such a matrix. Our method is found to be much more efficient and effective than alternative covariance-based methods for entity tracking.
Next, we move to modeling temporal coherence at a semantic level, with special emphasis on videos of movies and TV-series episodes. Each tracklet is associated with an entity (say a particular person). Spatio-temporally close but non-overlapping tracklets are likely to belong to the same entity, while tracklets that overlap in time can never belong to the same entity. Our aim is to cluster the tracklets based on the entities associated with them, with the goal of discovering the entities in a video along with all their occurrences. We argue that Bayesian nonparametrics is the most convenient approach to this task. We propose a temporally coherent version of the Chinese Restaurant Process (TC-CRP) that can encode such constraints easily, yields pure clusters of tracklets, and also filters out tracklets resulting from false detections. TC-CRP shows excellent performance on person discovery from TV-series videos. We also discuss semantic video summarization based on entity discovery.
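For readers unfamiliar with the process that TC-CRP builds on, the following is a minimal sketch of the standard Chinese Restaurant Process, not of TC-CRP itself: the temporal-coherence constraints described above are omitted. The function name `sample_crp` and the concentration parameter name `alpha` are illustrative assumptions and do not come from the thesis.

```python
import random


def sample_crp(num_customers, alpha, rng):
    """Sample table (cluster) labels from a standard Chinese Restaurant Process.

    Customer n (0-based) joins an existing table k with probability
    size_k / (n + alpha) and opens a new table with probability
    alpha / (n + alpha). `alpha` must be positive.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    table_sizes = []   # table_sizes[k] = number of customers at table k
    assignments = []   # assignments[n] = table chosen by customer n
    for _ in range(num_customers):
        # Unnormalised weights: one per existing table, plus alpha for a new one.
        weights = table_sizes + [alpha]
        table = rng.choices(range(len(weights)), weights=weights)[0]
        if table == len(table_sizes):
            table_sizes.append(1)      # open a new table
        else:
            table_sizes[table] += 1    # join an existing table
        assignments.append(table)
    return assignments


if __name__ == "__main__":
    labels = sample_crp(num_customers=20, alpha=1.0, rng=random.Random(0))
    print(labels)
```

In a tracklet-clustering setting, each "customer" would stand for a tracklet and each "table" for an entity; the number of tables is not fixed in advance, which is the property that makes the process attractive when the number of entities is unknown.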
Next, we consider entity-driven temporal segmentation of a video into scenes, where each scene is characterized by the entities present in it. This is a novel application, as existing work on temporal segmentation has focussed on low-level features of frames rather than entities. We propose EntScene, a generative model for videos based on entities and scenes, and an inference algorithm based on Blocked Gibbs Sampling for simultaneous entity discovery and scene discovery. We compare it to alternative inference algorithms, and show significant improvements in terms of segmentation and scene discovery.
Video representation by a low-rank matrix has gained popularity recently, and has been used for various tasks in Computer Vision. In such a representation, each column corresponds to a frame or a single detection. Such matrices are likely to have contiguous sets of identical columns due to temporal coherence, and hence they should be low-rank. However, we discover that none of the existing low-rank matrix recovery algorithms are able to preserve such structures. We study regularizers to encourage these structures for low-rank matrix recovery through convex optimization, but note that TC-CRP-like Bayesian modeling is better for enforcing them.
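The low-rank argument above can be illustrated with a small sketch: a matrix cannot have rank greater than its number of distinct columns, so long runs of identical columns keep the rank small. The helper names below are hypothetical and columns are represented as plain Python lists; this only illustrates the argument and is not one of the recovery algorithms studied in the thesis.

```python
def count_column_runs(columns):
    """Count maximal runs of consecutive identical columns."""
    runs = 0
    previous = None
    for column in columns:
        current = tuple(column)
        if runs == 0 or current != previous:
            runs += 1
            previous = current
    return runs


def rank_upper_bound(columns):
    """Upper bound on the rank: the number of distinct columns."""
    return len({tuple(column) for column in columns})


if __name__ == "__main__":
    # Six "frames" (columns) with two features each; only two distinct columns.
    frames = [[1, 0], [1, 0], [1, 0], [0, 2], [0, 2], [1, 0]]
    print(count_column_runs(frames))  # 3
    print(rank_upper_bound(frames))   # 2
```

The number of runs is always at least the number of distinct columns, so either quantity bounds the rank from above; the run structure is the part that a generic low-rank recovery method has no reason to preserve.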
We then focus our attention on modeling temporal coherence in hierarchically grouped sequential data, such as word-tokens grouped into sentences, paragraphs, documents, etc. in a text corpus. We attempt Bayesian modeling for such data, with application to multi-layer segmentation. We first make a detailed study of existing models for such data. We present a taxonomy for such models called Degree-of-Sharing (DoS), based on how various mixture components are shared by the groups of data in these models. We propose the Layered Dirichlet Process, which generalizes the Hierarchical Dirichlet Process to multiple layers and can also handle sequential information easily through a Markovian approach. This is applied to hierarchical co-segmentation of a set of news transcripts into broad categories (like politics, sports, etc.) and individual stories. We also propose an explicit-duration (semi-Markov) approach for this purpose, and provide an efficient inference algorithm for it. We also discuss generative processes for distribution matrices, where each column is a probability distribution. As an application, we discuss inferring the correct answers to questions on online answering forums from opinions provided by different users.
|
3 |
Video Processing for Agricultural Applications
He Liu (8735115) 24 April 2020
Cameras are widely used as sensors for a variety of engineering applications. In a typical video-based application, spatial segmentation is a fundamental step which provides the spatial positions of different targets for further analysis. In this thesis, we focus on video analytics applied to the agricultural industry and describe several video segmentation methods in the context of two practical projects: autonomous farming vehicles and analysis of dairy cow health. In the autonomous farming vehicle project, we propose three spatial segmentation methods based on traditional video features to isolate the regions of the video frame where critical information appears. Two applications that apply the segmentation methods are presented: farming activity classification and header-height control for a combine harvester. In the project on cow health, we propose a cow structural model based on the keypoints of joints from a side-view cow video. A detection system is developed using deep learning techniques to automatically extract the structural model from the videos. Based on this model, we also present a preliminary application which estimates the cow’s weight based on video information.
|
4 |
Digital Media Analytics: Towards an Understanding of Content Design and Social Media Promotion
January 2020
Digital media refers to any form of media which depends on electronic devices for its creation, distribution, viewing, and storage. Digital media analytics involves qualitative and quantitative analysis by businesses to understand users’ behaviors. This technique brings disruptive changes to many industries, and its path of economic disruption keeps widening. In the context of the increasingly popular digital media market, this dissertation investigates the best content delivery strategy and a new cultural phenomenon: the Internet Water Army. The first essay proposes a theory-guided computational approach that consolidates distinct data sources spanning unstructured text, image, and video data, systematically measures modes of persuasion, and unveils the multimedia content design strategies for crowdfunding projects. The second essay studies whether using the Internet Water Army helps sales and under what conditions it helps. This study finds that the Internet Water Army helps product sales at both the post level and the fans level. The effect is largely reflected in changes to the number of emotional fans. Furthermore, the earlier the water army is purchased, the more haters, likers, and neutral fans it can attract. The last essay builds a game model to study the trade-off between honestly promoting the product according to their evaluation and catering to the consumer’s prior belief about the product quality to stay on the market as long as possible. It provides insights on the optimum usage of promotion on social media and demonstrates how the conventional wisdom that negative reviews hurt business may be misleading in the presence of social media. These three studies jointly contribute to the crowdfunding and social media studies literature by elucidating the content delivery strategy, and the impact and purchasing strategy of the Internet Water Army. / Dissertation/Thesis / Doctoral Dissertation Business Administration 2020
|
5 |
Exploring Video Analytics as a Course Assessment Tool for Online Writing Instruction Stakeholders
Godfrey, Jason Michael 01 December 2018
Online Writing Instruction (OWI) programs, like online learning classes in general, are becoming more popular in post-secondary education. Yet few articles discuss how to tailor course assessment methods to an exclusively online environment. This thesis explores video analytics as a possible course assessment tool for online writing classrooms. Video analytics allow instructors, course designers, and writing program administrators to view how many students are engaging with video-based course materials. Additionally, video analytics can provide information about how active students are in their data-finding methods while they watch. By way of example, this thesis examines video analytics from one semester of a large western university’s online first-year writing sections (n=283). This study finds that video analytics afford stakeholders knowledge of patterns in how students interact with video-based course materials. Assuming the end goal of course assessment is to provide meaningful insight that will help improve student and teacher experience, video analytics can be a powerful, dynamic course assessment tool.
|
6 |
Software Systems for Large-Scale Retrospective Video Analytics
Tiantu Xu (10706787) 29 April 2021
Pervasive cameras are generating videos at an unprecedented pace, making videos the new frontier of big data. As processors such as CPUs and GPUs become increasingly powerful, cloud and edge nodes can generate useful insights from colossal video data. However, while research in computer vision (CV) has advanced vigorously, the systems area has been a blind spot in CV research. With colossal video data generated from cameras every day and limited compute resource budgets, how can software systems be designed to generate insights from video data efficiently?
Designing cost-efficient video analytics software systems is challenged by the expensive computation of vision operators, the colossal data volume, and the precious wireless bandwidth of surveillance cameras. To address these challenges, three software systems are proposed in this thesis. For the first system, we present VStore, a data store that supports fast, resource-efficient analytics over large archival videos. VStore manages video ingestion, storage, retrieval, and consumption and controls video formats through backward derivation of configuration: in the opposite direction along the video data path, VStore passes the video quantity and quality expected by analytics backward to retrieval, to storage, and to ingestion. VStore derives an optimal set of video formats, optimizes for different resources in a progressive manner, and runs queries at up to 362x video realtime. For the second system, we present a camera/cloud runtime called DIVA that supports querying cold videos distributed on low-cost wireless cameras. DIVA is built upon a novel zero-streaming paradigm: to save wireless bandwidth, when capturing video frames, a camera builds sparse yet accurate landmark frames without uploading any video data; when executing a query, a camera processes frames in multiple passes with increasingly expensive operators. On diverse queries over 15 videos, DIVA runs at more than 100x realtime and outperforms competitive alternatives remarkably. For the third system, we present Clique, a practical object re-identification (ReID) engine that builds upon two unconventional techniques. First, Clique assesses target occurrences by clustering unreliable object features extracted by ReID algorithms, with each cluster representing the general impression of a distinct object to be matched against the input. Second, to search across camera videos, Clique samples cameras to maximize the spatiotemporal coverage and incrementally adds cameras for processing on demand. Through evaluation on 25 hours of traffic videos from 25 cameras, Clique reaches a high recall at 5 of 0.87 across 70 queries and runs at 830x video realtime while achieving high accuracy.
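As an aid to reading the recall at 5 figure, the sketch below computes recall at k under one common definition: the fraction of true matches that appear among the top k ranked results. Whether Clique's evaluation uses exactly this definition is an assumption, and the function name and the sample IDs are hypothetical.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of relevant items that appear in the top-k of a ranked list."""
    if k <= 0:
        raise ValueError("k must be positive")
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("relevant_ids must not be empty")
    # Count relevant items among the first k ranked results.
    hits = len(relevant.intersection(ranked_ids[:k]))
    return hits / len(relevant)


if __name__ == "__main__":
    ranked = ["v7", "v2", "v9", "v4", "v1", "v3"]  # hypothetical ranked output
    truth = ["v2", "v3"]                            # hypothetical true matches
    print(recall_at_k(ranked, truth, k=5))          # 0.5
```

A figure aggregated over many queries, such as the one reported across 70 queries, would typically be the mean of this per-query value, although that averaging is also an assumption here.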
|
7 |
Accelerating Multi-target Visual Tracking on Smart Edge Devices
Nalaie, Keivan January 2023
Multi-object tracking (MOT) is a key building block in video analytics and finds extensive use in surveillance, search and rescue, and autonomous driving applications. Object detection, a crucial stage in MOT, dominates the overall tracking inference time due to its reliance on Deep Neural Networks (DNNs). Despite the superior performance of cutting-edge object detectors, their extensive computational demands limit their real-time application on embedded devices that possess constrained processing capabilities. Hence, we aim to reduce the computational burden of object detection while maintaining tracking performance.
As the first approach, we adapt frame resolutions to reduce computational complexity. During inference, frame resolutions can be tuned according to the complexity of visual scenes. We present DeepScale, a model-agnostic frame resolution selection approach that operates on top of existing fully convolutional network-based trackers. By analyzing the effect of frame resolution on detection performance, DeepScale strikes good trade-offs between detection accuracy and processing speed by adapting frame resolutions on-the-fly.
Our second approach focuses on enhancing the efficiency of a tracker by model adaptation. We introduce AttTrack to expedite tracking by interleaving the execution of object detectors of different model sizes during inference. A sophisticated network (teacher) runs for keyframes only while, for non-keyframes, knowledge is transferred from the teacher to a smaller network (student) to improve the latter’s performance.
Our third contribution involves exploiting temporal-spatial redundancies to enable real-time multi-camera tracking. We propose the MVSparse pipeline which consists of a central processing unit that aggregates information from multiple cameras (on an edge server or in the cloud) and distributed lightweight Reinforcement Learning (RL) agents running on individual cameras that predict the informative blocks in the current frame based on past frames on the same camera and detection results from other cameras. / Thesis / Doctor of Science (PhD)
|
8 |
Visualization of web site visit and usage data / Visualisering av webbplatsbesöks- och användningsdata
Winblad, Emanuel January 2014
This report documents the work and results of a master’s thesis in Media Technology that has been carried out at the Department of Science and Technology at Linköping University with the support of Sports Editing Sweden AB (SES). Its aim is to create a solution which aids the users of SES’ web CMS products in gaining insight into web site visit and usage statistics. The resulting solution is the concept and initial version of a web based service. This service has been developed through an agile process with user centered design in mind and provides a graphical user interface which makes extensive use of visualizations to achieve the project goal.
|
9 |
Video Analytics with Spatio-Temporal Characteristics of Activities
Cheng, Guangchun 05 1900
As video capturing devices become more ubiquitous, from surveillance cameras to smart phones, the demand for automated video analysis is increasing as never before. One obstacle in this process is to efficiently locate where a human operator’s attention should be, and another is to determine the specific types of activities or actions without ambiguity. It is the special interest of this dissertation to locate spatial and temporal regions of interest in videos and to develop a better action representation for video-based activity analysis. This dissertation follows the scheme of “locating then recognizing” activities of interest in videos, i.e., locations of potentially interesting activities are estimated before performing in-depth analysis. Theoretical properties of regions of interest in videos are first exploited, based on which a unifying framework is proposed to locate both spatial and temporal regions of interest with the same settings of parameters. The approach estimates the distribution of motion based on 3D structure tensors, and locates regions of interest according to persistent occurrences of low probability. Two contributions are further made to better represent the actions. The first is to construct a unifying model of spatio-temporal relationships between reusable mid-level actions which bridge low-level pixels and high-level activities. Dense trajectories are clustered to construct mid-level actionlets, and the temporal relationships between actionlets are modeled as Action Graphs based on Allen interval predicates. The second is an effort towards a novel and efficient representation of action graphs based on a sparse coding framework. Action graphs are first represented using Laplacian matrices and then decomposed as a linear combination of primitive dictionary items following a sparse coding scheme. The optimization is eventually formulated and solved as a determinant maximization problem, and 1-nearest neighbor is used for action classification. The experiments have shown better results than existing approaches for regions-of-interest detection and action recognition.
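The Allen interval predicates mentioned above classify how two time intervals relate to each other. The sketch below is a minimal illustration of the thirteen standard relations for intervals given as `(start, end)` pairs; the function name `allen_relation` and the relation labels are hypothetical choices, and this is not the dissertation's Action Graph construction.

```python
def allen_relation(a, b):
    """Return the Allen interval relation that holds from interval a to interval b.

    Each interval is a (start, end) pair with start < end.
    """
    a_start, a_end = a
    b_start, b_end = b
    if a_start >= a_end or b_start >= b_end:
        raise ValueError("intervals must satisfy start < end")
    # Disjoint or touching intervals.
    if a_end < b_start:
        return "before"
    if a_end == b_start:
        return "meets"
    if b_end < a_start:
        return "after"
    if b_end == a_start:
        return "met-by"
    # From here on the intervals share more than a single point.
    if a_start == b_start and a_end == b_end:
        return "equals"
    if a_start == b_start:
        return "starts" if a_end < b_end else "started-by"
    if a_end == b_end:
        return "finishes" if a_start > b_start else "finished-by"
    if b_start < a_start and a_end < b_end:
        return "during"
    if a_start < b_start and b_end < a_end:
        return "contains"
    return "overlaps" if a_start < b_start else "overlapped-by"


if __name__ == "__main__":
    print(allen_relation((0, 3), (3, 5)))  # meets
    print(allen_relation((0, 4), (3, 5)))  # overlaps
    print(allen_relation((1, 2), (0, 5)))  # during
```

In a graph of mid-level actions, each pair of actionlets with known time spans could be labeled with one of these relations to form an edge; how the dissertation encodes such edges is not specified in this abstract.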
|
10 |
Anomaly detection in surveillance camera data
Semerenska, Viktoriia January 2023
The importance of detecting anomalies in surveillance camera data cannot be overemphasized. With the increasing availability of surveillance cameras in public and private locations, the need for reliable and effective methods to detect anomalous behavior has become critical to public safety. Anomaly detection algorithms can help identify potential threats in real time, allowing for rapid intervention and prevention of criminal activity. Examples of anomalies that can be detected by analyzing surveillance camera data include suspicious loitering or lingering, unattended bags or packages, crowd gatherings or dispersals, trespassing or unauthorized access, vandalism or property damage, violence or aggressive behavior, abnormal traffic patterns, missing or abducted persons, unusual pedestrian behavior, and environmental anomalies. Detecting these anomalies in surveillance camera data can enable law enforcement, security personnel, and other relevant authorities to respond quickly and effectively to potential threats, ultimately contributing to a safer environment for all. Surveillance camera data contains a large amount of information that is difficult for humans to analyze in real time. In addition, the sheer volume of data generated by surveillance cameras makes manual analysis impractical. Therefore, the development of automated anomaly detection algorithms is crucial for effective and efficient surveillance. The goal of this master's thesis is to detect anomalies, such as anomalous human behavior, using video cameras with an embedded machine learning processor and video analytics. For this purpose, the most appropriate machine learning techniques will be selected and, after comparing the results of these techniques, the best anomaly detection technique for the given circumstances will be identified. To gather the evidence needed to answer the research questions, I will use a combination of methods appropriate to the study design. The study will follow a mixed-methods approach, combining a systematic literature review (SLR) and a formal experiment. In this study, we investigated the effectiveness of various machine learning algorithms in detecting anomalous human behavior in video surveillance data.
|