Analyzing large volumes of video data is a challenging and time-consuming task. Automating this process would be very valuable, especially in ecology, where large archives of video could unlock new avenues of research into the behaviour of animals in their environments. Deep Neural Networks, particularly Deep Convolutional Neural Networks (CNNs), are a powerful class of models for computer vision. When combined with Recurrent Neural Networks (RNNs), deep convolutional models can be applied to video for frame-level video classification. This research studies two datasets: penguins and seals. The purpose of the research is to compare the performance of image-only CNNs, which treat each frame of a video independently, against a combined CNN-RNN approach, and to assess whether incorporating the motion information in the temporal aspect of video improves classification accuracy on these two datasets. The video and image-only models offer similar out-of-sample performance on the simpler seals dataset, but the video model leads to moderate performance improvements on the more complex penguin action recognition dataset.
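As a rough illustration of the CNN-RNN architecture described above, the following is a minimal sketch of a frame-level video classifier, assuming a PyTorch implementation. The backbone (ResNet-18), hidden size, and class count are illustrative assumptions and are not taken from the thesis.

```python
# Minimal CNN-RNN frame-level video classifier sketch (assumed PyTorch setup).
import torch
import torch.nn as nn
from torchvision import models


class FrameLevelCNNRNN(nn.Module):
    def __init__(self, num_classes: int, hidden_size: int = 256):
        super().__init__()
        # Per-frame feature extractor: ResNet-18 with its classifier head removed.
        backbone = models.resnet18(weights=None)
        self.feature_dim = backbone.fc.in_features
        backbone.fc = nn.Identity()
        self.cnn = backbone
        # Recurrent layer that carries temporal (motion) context across frames.
        self.rnn = nn.GRU(self.feature_dim, hidden_size, batch_first=True)
        # Per-frame classification head.
        self.head = nn.Linear(hidden_size, num_classes)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, channels, height, width)
        b, t, c, h, w = frames.shape
        feats = self.cnn(frames.view(b * t, c, h, w))   # (b*t, feature_dim)
        feats = feats.view(b, t, self.feature_dim)      # (b, t, feature_dim)
        out, _ = self.rnn(feats)                        # (b, t, hidden_size)
        return self.head(out)                           # per-frame class logits


if __name__ == "__main__":
    # Example: classify each frame of two 8-frame clips into 3 classes.
    model = FrameLevelCNNRNN(num_classes=3)
    clip = torch.randn(2, 8, 3, 224, 224)
    logits = model(clip)
    print(logits.shape)  # torch.Size([2, 8, 3])
```

An image-only baseline of the kind compared in the thesis would drop the recurrent layer and classify each frame's CNN features directly, discarding temporal context.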
Identifier | oai:union.ndltd.org:netd.ac.za/oai:union.ndltd.org:uct/oai:localhost:11427/32520 |
Date | January 2020 |
Creators | Conway, Alexander |
Contributors | Durbach, Ian |
Publisher | University of Cape Town, Faculty of Science, Department of Statistical Sciences |
Source Sets | South African National ETD Portal |
Language | English |
Detected Language | English |
Type | Master Thesis, Masters, MSc |
Format | application/pdf |