Return to search

3D Densely Connected Convolutional Network for the Recognition of Human Shopping Actions

In recent years, deep convolutional neural networks (CNNs) have shown remarkable results in the image domain. However, most of the neural networks in action recognition do not have very deep layer compared with the CNN in the image domain. This thesis presents a 3D Densely Connected Convolutional Network (3D-DenseNet) for action recognition that can have more than 100 layers without exhibiting performance degradation or overfitting. Our network expands Densely Connected Convolutional Networks (DenseNet) [32] to 3D-DenseNet by adding the temporal dimension to all internal convolution and pooling layers. The internal layers of our model are connected with each other in a feed-forward fashion. In each layer, the feature-maps of all preceding layers are concatenated along the last dimension and are used as inputs to all subsequent layers. We propose two different versions of 3D-DenseNets: general 3D-DenseNet and lite 3D-DenseNet. While general 3D-DenseNet has the same architecture as DenseNet, lite 3D-DenseNet adds a 3D pooling layer right after the first 3D convolution layer of general 3D-DenseNet to reduce the number of training parameters at the beginning so that we can reach a deeper network.
We test on two action datasets: the MERL shopping dataset [69] and the KTH dataset [63]. Our experiment results demonstrate that our method performs better than the state-of-the-art action recognition method on the MERL shopping dataset and achieves a competitive result on the KTH dataset.

Identiferoai:union.ndltd.org:uottawa.ca/oai:ruor.uottawa.ca:10393/36739
Date January 2017
CreatorsGu, Dongfeng
ContributorsLaganière, Robert, Petriu, Emil
PublisherUniversité d'Ottawa / University of Ottawa
Source SetsUniversité d’Ottawa
LanguageEnglish
Detected LanguageEnglish
TypeThesis

Page generated in 0.0023 seconds