Thesis: Ph. D., Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2018. / Cataloged from PDF version of thesis. / Includes bibliographical references (pages 96-106). / Data-driven methods based on deep neural networks (DNNs) have ushered in a new era in machine learning and computer vision. Conventional algorithmic approaches are being replaced by end-to-end deep learning systems that can leverage big data. Deep learning has begun revolutionizing human-centric fields such as healthcare and finance, finding its way into automated screening and diagnosis. At present, developing and training artificial neural network architectures requires both human expertise and labor: millions of labeled data points for training and hours of engineering effort to develop the best-performing architectures. In this dissertation, my goal is to make deep learning more accessible by developing algorithms for low-shot learning (learning from a few examples). This work includes new semi-supervised approaches to learn from unlabeled datasets with only a fraction of labeled examples, deep learning methods to learn from generated data using simulation-based techniques, and learning to optimize neural networks for smaller datasets. Specifically, this dissertation focuses on three questions whose answers contribute both technical and conceptual advances to the literature. -- How can we use invariant-based approaches when training from small datasets? -- How can we enable training from multiple data sources, each carrying a very small amount of data? -- How can we use a meta-modeling approach to automatically generate high-performing DNNs?
To address these questions, this dissertation describes the following machine learning algorithms: (a) an action recognition autoencoder that learns from very small datasets; (b) an algorithm to train deep neural networks over multiple entities; and (c) a meta-modeling approach to automatically generate high-performing architectures. We also provide a dataset of neural network topologies used for predicting the accuracy of a deep neural network. / by Otkrist Gupta. / Ph. D.
Identifier | oai:union.ndltd.org:MIT/oai:dspace.mit.edu:1721.1/119072 |
Date | January 2018 |
Creators | Gupta, Otkrist |
Contributors | Ramesh Raskar, Program in Media Arts and Sciences (Massachusetts Institute of Technology) |
Publisher | Massachusetts Institute of Technology |
Source Sets | M.I.T. Theses and Dissertation |
Language | English |
Detected Language | English |
Type | Thesis |
Format | 106 pages, application/pdf |
Rights | MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission., http://dspace.mit.edu/handle/1721.1/7582 |