While convolutional neural networks (CNNs) have taken the lead for many learning tasks, action recognition in videos has yet to see this jump in performance. Many teams are working on the issue but so far there is no definitive answer how to make CNNs work well with video data. Recently, introduced convolutional kernel networks, a special case of CNNs which can be trained layer by layer in an unsupervised manner. This is done by approximating a kernel function in every layer with finite-dimensional descriptors. In this work we show the application of the CKN training to video, discuss the adjustments necessary and the influence of the type of data presented to the networks as well as the number of filters used.
Identifer | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:kth-175797 |
Date | January 2015 |
Creators | Wynen, Daan |
Publisher | KTH, Skolan för datavetenskap och kommunikation (CSC) |
Source Sets | DiVA Archive at Upsalla University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |
Page generated in 0.0024 seconds