Alex Krizhevsky and his colleagues changed the world of machine vision and image
processing in 2012 when their deep learning model, AlexNet, won the ImageNet
Large Scale Visual Recognition Challenge with an error rate more than 10.8
percentage points lower than that of their closest competitor. Ever since, deep
learning approaches have been an area of extensive research for tasks such as
object detection, classification, and pose estimation. This thesis presents a
comprehensive analysis of different deep learning models and architectures that
have delivered state-of-the-art performance in various machine vision tasks.
These models are compared to each other, and their strengths and weaknesses are
highlighted.
We introduce a new approach for human head and shoulder detection from RGB-D
data based on a combination of image processing and deep learning approaches.
Candidate head-top locations (CHL) are generated by a fast and accurate image
processing algorithm that operates on depth data. We propose enhancements to the
CHL algorithm that make it three times faster. Different deep learning models are
then evaluated for the tasks of classification and detection on the candidate
head-top locations to regress the head bounding boxes and detect shoulder
keypoints. We propose three different small models based on convolutional neural
networks for this problem. Experimental results for different architectures of
our model are highlighted. We also compare the performance of our model to
MobileNet.
Finally, we show the differences between using three types of input to CNN models:
RGB images, a 3-channel representation generated from depth data (Depth map,
Multi-order depth template, and Height difference map, or DMH), and a 4-channel
input composed of RGB+D data.
Identifier | oai:union.ndltd.org:uottawa.ca/oai:ruor.uottawa.ca:10393/39448 |
Date | 18 July 2019 |
Creators | El Ahmar, Wassim |
Contributors | Laganière, Robert |
Publisher | Université d'Ottawa / University of Ottawa |
Source Sets | Université d’Ottawa |
Language | English |
Detected Language | English |
Type | Thesis |
Format | application/pdf |