Global ETD Search

1	Investigating audio classification to automate the trimming of recorded lectures Govender, Devandran 01 February 2018 With the demand for recorded lectures to be made available as soon as possible, the University of Cape Town (UCT) needs to find innovative ways of removing bottlenecks in the lecture-capture workflow and thereby improving turn-around times from capture to publication. UCT utilises Opencast, an open source system that manages all the steps in the lecture-capture process. One of the steps involves manual trimming of unwanted segments from the beginning and end of the video before it is published. These segments generally contain student chatter. The trimming step of the lecture-capture process has been identified as a bottleneck due to its dependence on staff availability. In this study, we investigate the potential of audio classification to automate this step. A classification model was trained to detect two classes: speech and non-speech. Speech represents a single dominant voice, for example, the lecturer, and non-speech represents student chatter, silence and other environmental sounds. Using the output of the classification model, the first and last instances of the speech class are detected together with their timestamps. These timestamps are used to predict the start and end trim points for the recorded lecture. The classification model achieved a 97.8% accuracy rate at distinguishing speech from non-speech. The start trim point predictions were close to the gold standard data, with an average difference of -11.22s. End trim point predictions showed a much greater deviation, with an average difference of 145.16s from the gold standard data. Discussions between the lecturer and students after the lecture were predominantly the reason for this discrepancy. I.5 PATTERN RECOGNITION
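The trim-point step described in this abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the segment format, the label strings, the function names, and the sign convention of the difference (predicted minus gold standard) are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical input format: one (start_seconds, end_seconds, label) tuple
# per classified audio segment, with label "speech" or "non-speech".
Segment = Tuple[float, float, str]


def predict_trim_points(segments: List[Segment]) -> Optional[Tuple[float, float]]:
    """Return (start_trim, end_trim) from the first and last speech segments.

    Returns None when no segment is labelled as speech.
    """
    speech = [seg for seg in segments if seg[2] == "speech"]
    if not speech:
        return None
    start_trim = min(seg[0] for seg in speech)  # start of the first speech instance
    end_trim = max(seg[1] for seg in speech)    # end of the last speech instance
    return start_trim, end_trim


def mean_signed_difference(predicted: List[float], gold: List[float]) -> float:
    """Average of (predicted - gold) in seconds; the sign convention is assumed."""
    if not predicted or len(predicted) != len(gold):
        raise ValueError("predicted and gold must be non-empty and of equal length")
    return sum(p - g for p, g in zip(predicted, gold)) / len(predicted)


if __name__ == "__main__":
    example = [
        (0.0, 10.0, "non-speech"),   # student chatter before the lecture
        (10.0, 20.0, "speech"),
        (20.0, 30.0, "non-speech"),
        (30.0, 40.0, "speech"),
        (40.0, 50.0, "non-speech"),  # noise after the lecture
    ]
    print(predict_trim_points(example))
```

Under this sketch, any speech-like segment after the lecture (for example a lecturer answering questions) pushes the end trim point later, which is consistent with the larger end-point deviation reported in the abstract.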
2	Anomaly Detection and Prediction of Human Actions in a Video Surveillance Environment Spasic, Nemanja 01 December 2007 Worldwide focus has over the years been shifting towards security issues, not least due to recent worldwide terrorist activities. Several researchers have proposed state-of-the-art surveillance systems to help with some of the security issues, with varying success. Recent studies have suggested that these surveillance systems could be improved by giving them the ability to learn common environmental behaviour patterns and to detect and predict unusual, or anomalous, activities based on those learnt patterns. In addition, some of these surveillance systems are still run by human operators, who are prone to mistakes and may need some help from the surveillance systems themselves in detecting anomalous activities. This dissertation attempts to address these suggestions by combining the fields of Image Understanding and Artificial Intelligence, specifically Bayesian Networks, to develop a prototype video surveillance system that can learn common environmental behaviour patterns and can therefore detect and predict anomalous activity in the environment based on those learnt patterns. In addition, this dissertation aims to show how the prototype system can adapt to these anomalous behaviours and integrate them into its common patterns over a prolonged occurrence period. The prototype video surveillance system showed good performance and the ability to detect, predict and integrate anomalous activity in the evaluation tests that were performed using a volunteer in an experimental indoor environment. In addition, the prototype system performed quite well on the PETS 2002 dataset 1, which it was not designed for. The evaluation procedure used some of the evaluation metrics commonly used on the PETS datasets. Hence, the prototype system provides a good approach to anomaly detection and prediction using Bayesian Networks trained on common environmental activities. I.5 PATTERN RECOGNITION I.2 ARTIFICIAL INTELLIGENCE I.4 IMAGE PROCESSING AND COMPUTER VISION
