1

Automatic Analysis of Facial Actions: Learning from Transductive, Supervised and Unsupervised Frameworks

Chu, Wen-Sheng 01 January 2017
Automatic analysis of facial actions (AFA) can reveal a person's emotion, intention, and physical state, and make possible a wide range of applications. To enable reliable, valid, and efficient AFA, this thesis investigates automatic analysis of facial actions through transductive, supervised, and unsupervised learning.

Supervised learning for AFA is challenging, in part because of individual differences among persons in face shape and appearance and variation in video acquisition and context. To improve generalizability across persons, we propose a transductive framework, Selective Transfer Machine (STM), which personalizes generic classifiers through joint sample reweighting and classifier learning. By personalizing classifiers, STM offers improved generalization to unknown persons. As an extension, we develop a variant of STM for use when partially labeled data are available.

Additional challenges for supervised learning include learning an optimal representation for classification, variation in base rates of action units (AUs), correlation between AUs, and temporal consistency. While these challenges can be partly accommodated with an SVM or STM, a more powerful alternative is afforded by an end-to-end supervised framework (i.e., deep learning). We propose a convolutional network with long short-term memory (LSTM) and multi-label sampling strategies. We compare the SVM, STM, and deep-learning approaches with respect to AU occurrence and intensity within and between the BP4D+ [282] and GFT [93] databases, which together comprise approximately 0.6 million annotated frames.

Annotation of video is not always possible or desirable. We introduce an unsupervised branch-and-bound framework to discover correlated facial actions in unannotated video, which we term Common Event Discovery (CED). We evaluate CED on video and motion-capture data. CED achieves moderate convergence with supervised approaches and enables discovery of novel patterns occult to supervised approaches.
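The personalization idea behind STM, alternating sample reweighting with classifier learning, can be sketched as follows. This is a simplified illustration under stated assumptions, not the thesis's STM objective: STM solves a joint distribution-matching and SVM objective, whereas this sketch alternates a heuristic RBF-similarity reweighting with weighted SVM retraining, and all function names and parameters are illustrative.

```python
# Minimal sketch of STM-style personalization: alternate between
# (1) reweighting generic training frames so they resemble the unlabeled
#     test subject's data, and (2) retraining a weighted classifier.
# The reweighting here is a simple RBF-similarity heuristic, not the
# kernel-mean-matching term used in the thesis.
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel

def personalize_classifier(X_train, y_train, X_test_unlabeled,
                           n_iters=3, gamma=0.1, C=1.0):
    weights = np.ones(len(X_train))          # start from the generic classifier
    clf = SVC(kernel="rbf", gamma=gamma, C=C)
    for _ in range(n_iters):
        # (1) Reweight: training frames similar to the test subject get larger weight.
        sim = rbf_kernel(X_train, X_test_unlabeled, gamma=gamma).mean(axis=1)
        weights = sim / sim.sum() * len(X_train)
        # (2) Relearn: refit the classifier under the current sample weights.
        clf.fit(X_train, y_train, sample_weight=weights)
    return clf
```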
2

CONTENT UNDERSTANDING FOR IMAGING SYSTEMS: PAGE CLASSIFICATION, FADING DETECTION, EMOTION RECOGNITION, AND SALIENCY BASED IMAGE QUALITY ASSESSMENT AND CROPPING

Shaoyuan Xu (9116033) 12 October 2021
This thesis consists of four sections, each corresponding to a research project.

The first section concerns page classification. We extend our previous approach, which classified three classes of pages (Text, Picture, and Mixed), to five classes: Text, Picture, Mixed, Receipt, and Highlight. We first design new features to characterize the two new classes and then use a DAG-SVM to classify the five classes of images. The results show that our algorithm performs well and is able to classify all five page types.

The second section concerns fading detection. We develop an algorithm that automatically detects fading in both text and non-text regions. For text regions, we first perform global alignment and then local alignment. We then create a 3D color-node system, assign each connected component to a color node, and compute the color difference between corresponding connected components on the raster page and the scanned page. For non-text regions, after global alignment we divide the page into superpixels and compute the color difference between the raster superpixels and the testing superpixels. Compared with the traditional method, which uses a diagnostic page, our method is more efficient and effective.

The third section concerns CNN-based emotion recognition. We build our own emotion recognition classification and regression system from scratch, covering dataset collection, data preprocessing, model training, and testing. We extend the model to a real-time video application, where it performs accurately and smoothly. We also explore an alternative approach based on facial action unit detection; by extracting facial landmark features and adopting an SVM training framework, the facial-action-unit approach achieves accuracy comparable to the CNN-based approach.

The fourth section concerns saliency-based image quality assessment and cropping. We propose a method for image quality assessment and recomposition that uses image saliency information. Saliency identifies the regions of an image that naturally attract a viewer's attention. Through everyday examples as well as our experimental results, we demonstrate that utilizing saliency information benefits both tasks.
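The DAG-SVM step in the page-classification section can be illustrated with a short sketch. This is a hedged illustration rather than the thesis's implementation: the feature extraction, kernel choice, and helper names are assumptions; only the five page classes come from the abstract. A DAG-SVM trains one binary SVM per class pair and classifies by eliminating one candidate class at each node until a single class remains.

```python
# Minimal DAG-SVM sketch over the five page classes named in the abstract.
# Features are assumed to be precomputed elsewhere; the thesis's specific
# page features are not reproduced here.
from itertools import combinations
import numpy as np
from sklearn.svm import SVC

CLASSES = ["Text", "Picture", "Mixed", "Receipt", "Highlight"]

def train_pairwise_svms(X, y):
    """Train one binary SVM per class pair; y holds labels from CLASSES."""
    svms = {}
    for a, b in combinations(CLASSES, 2):
        mask = np.isin(y, [a, b])
        svms[(a, b)] = SVC(kernel="rbf").fit(X[mask], y[mask])
    return svms

def dag_svm_predict(svms, x):
    """Walk the decision DAG: each pairwise test eliminates one candidate class."""
    candidates = list(CLASSES)
    while len(candidates) > 1:
        a, b = candidates[0], candidates[-1]       # pair kept in CLASSES order
        winner = svms[(a, b)].predict(x.reshape(1, -1))[0]
        candidates.remove(a if winner == b else b)  # drop the losing class
    return candidates[0]
```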
