151

Klasifikace na množinách bodů v 3D / Classification on point sets in 3D

Střelský, Jakub January 2018 (has links)
Increasing interest in the classification of 3D geometrical data has led to the discovery of PointNet, a neural network architecture capable of processing unordered point sets. This thesis explores several methods of utilizing conventional point features within PointNet and their impact on classification. The classification performance of the presented methods was experimentally evaluated and compared with a baseline PointNet model on four different datasets. The results of the experiments suggest that some of the considered features can improve the classification effectiveness of PointNet on difficult datasets with objects that are not aligned into a canonical orientation. In particular, the well-known spin image representations can be employed successfully and reliably within PointNet. Furthermore, a feature-based alternative to the spatial transformer, the sub-network of PointNet responsible for aligning misaligned objects into a canonical orientation, has been introduced. Additional experiments demonstrate that this alternative can be competitive with the spatial transformer on challenging datasets.
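As a rough illustration of the core idea behind PointNet referenced in this abstract — a shared per-point MLP followed by a symmetric, order-invariant max-pooling aggregation, optionally fed extra hand-crafted per-point features — here is a minimal NumPy sketch. The layer sizes, random weights, and the shape of the extra features are illustrative assumptions, not the thesis's actual configuration.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def pointnet_forward(points, extra_features=None, num_classes=4, seed=0):
    """Toy PointNet-style forward pass.

    points:          (N, 3) array of xyz coordinates (an unordered set).
    extra_features:  optional (N, F) array of hand-crafted per-point
                     features (e.g. spin-image descriptors) concatenated
                     to the raw coordinates, as explored in the thesis.
    Returns class logits of shape (num_classes,).
    """
    x = points if extra_features is None else np.hstack([points, extra_features])
    rng = np.random.default_rng(seed)

    # Shared per-point MLP: the same weights are applied to every point,
    # so the representation does not depend on point ordering.
    dims = [x.shape[1], 64, 128]
    h = x
    for d_in, d_out in zip(dims[:-1], dims[1:]):
        W = rng.normal(scale=0.1, size=(d_in, d_out))
        h = relu(h @ W)

    # Symmetric aggregation: max over the point dimension yields a global
    # feature vector invariant to permutations of the input points.
    global_feat = h.max(axis=0)

    # Small classification head on the global feature.
    W_out = rng.normal(scale=0.1, size=(global_feat.shape[0], num_classes))
    return global_feat @ W_out

# Example: 1024 random points with 8 extra per-point features.
pts = np.random.rand(1024, 3)
feats = np.random.rand(1024, 8)
print(pointnet_forward(pts, feats))
```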
152

Techniques d'analyse de contenu appliquées à l'imagerie spatiale / Machine learning applied to remote sensing images

Le Goff, Matthieu 20 October 2017 (has links)
Since the 1970s, remote sensing has been a powerful tool for studying the Earth, in particular thanks to satellite images produced in digital format. Compared to airborne images, satellite images provide more information, with greater spatial coverage and a short revisit period. The rise of remote sensing was followed by the development of processing technologies enabling users to analyze satellite images with the help of increasingly automatic processing chains. Since the 1970s, the various Earth observation missions have gathered an important amount of information over time. This is due in particular to the frequent revisit time for the same region, the improvement of spatial resolution, and the increase of the swath (the spatial coverage of an acquisition). Remote sensing, which was once confined to the study of a single image, has gradually turned into the analysis of long time series of multispectral images acquired at different dates. The annual flow of satellite images is expected to reach several petabytes in the near future. The availability of such a large amount of data is an asset for developing advanced processing chains. The machine learning techniques used in remote sensing have greatly improved, whereas the robustness of traditional machine learning approaches was often limited by the amount of available data; new techniques have been developed to use this large data flow effectively. However, the amount of data and the complexity of the algorithms embedded in the new processing pipelines require high computing power. In parallel, the computing power available for image processing has also increased: Graphics Processing Units (GPUs) are increasingly being used, and the use of public or private clouds is becoming more widespread. All the power required for automatic processing chains is now available at a reasonable cost, and the design of new processing chains must take this factor into account. In remote sensing, the volume of data available for exploitation has become a problem because of the computing power required for its analysis. Traditional remote sensing algorithms were designed for data that can be stored in internal memory throughout processing, a condition that is increasingly violated given the quantity of images and their resolution. These algorithms need to be revisited and adapted for large-scale data processing. This need is not specific to remote sensing; it is found in other sectors such as the web, medicine, and speech recognition, which have already solved some of these problems, and part of the techniques and technologies they developed still need to be adapted to satellite images. This thesis focuses on remote sensing algorithms for processing massive data volumes. In particular, an existing machine learning algorithm is studied and adapted for a distributed implementation, the aim being scalability, i.e. the ability to process a large quantity of data given suitable computing power. Finally, the second proposed methodology is based on recent machine learning algorithms, convolutional neural networks, and proposes a way to apply them to our use cases on satellite images.
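The scalability concern raised in this abstract — classical algorithms assuming the whole image fits in memory — can be illustrated with a simple tile-based, out-of-core-style processing loop. The tile size and the pixel-wise "classifier" below are placeholder assumptions; a real pipeline would stream tiles from a raster file and apply a trained model rather than a threshold.

```python
import numpy as np

def classify_tile(tile):
    """Placeholder per-pixel classifier: thresholds a crude NDVI-like index.
    Assumes band 0 = red, band 1 = near-infrared (illustrative only)."""
    red, nir = tile[..., 0], tile[..., 1]
    ndvi = (nir - red) / (nir + red + 1e-6)
    return (ndvi > 0.3).astype(np.uint8)

def process_in_tiles(image, tile_size=512):
    """Process a large multispectral image tile by tile, so that only one
    tile needs to be held in memory at a time."""
    h, w, _ = image.shape
    out = np.zeros((h, w), dtype=np.uint8)
    for i in range(0, h, tile_size):
        for j in range(0, w, tile_size):
            tile = image[i:i + tile_size, j:j + tile_size]
            out[i:i + tile_size, j:j + tile_size] = classify_tile(tile)
    return out

# Example with a synthetic 2-band "scene"; a real workflow would read each
# tile from disk (or a distributed store) instead of slicing an array.
scene = np.random.rand(2048, 2048, 2).astype(np.float32)
mask = process_in_tiles(scene)
print(mask.mean())
```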
153

Image Reconstruction, Classification, and Tracking for Compressed Sensing Imaging and Video

January 2016 (has links)
abstract: Compressed sensing (CS) is a novel approach to collecting and analyzing data of all types. By exploiting prior knowledge of the compressibility of many naturally occurring signals, specially designed sensors can dramatically undersample the data of interest and still achieve high performance. However, the generated data are pseudorandomly mixed and must be processed before use. In this work, a model of a single-pixel compressive video camera is used to explore the problems of performing inference based on these undersampled measurements. Three broad types of inference from CS measurements are considered: recovery of video frames, target tracking, and object classification/detection. Potential applications include automated surveillance, autonomous navigation, and medical imaging and diagnosis. Recovery of CS video frames is far more complex than that of still images, which are known to be (approximately) sparse in a linear basis such as the discrete cosine transform. By combining the sparsity of individual frames with an optical flow-based model of inter-frame dependence, the perceptual quality and peak signal-to-noise ratio (PSNR) of reconstructed frames are improved. The efficacy of this approach is demonstrated for the cases of a priori known image motion and unknown but constant image-wide motion. Although video sequences can be reconstructed from CS measurements, the process is computationally costly. In autonomous systems, this reconstruction step is unnecessary if higher-level conclusions can be drawn directly from the CS data. A tracking algorithm is described and evaluated which can track target vehicles at very high levels of compression, where reconstruction of video frames fails. The algorithm performs tracking by detection using a particle filter, with the likelihood given by a maximum average correlation height (MACH) target template model. Motivated by possible improvements over the MACH filter-based likelihood estimation of the tracking algorithm, the application of deep learning models to detection and classification of compressively sensed images is explored. In tests, a Deep Boltzmann Machine trained on CS measurements outperforms a naive reconstruct-first approach. Taken together, progress in these three areas of CS inference has the potential to lower system cost and improve performance, opening up new applications of CS video cameras. / Dissertation/Thesis / Doctoral Dissertation Electrical Engineering 2016
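The compressive measurement model and a basic sparse recovery step underlying this kind of work can be sketched as follows: a sparse signal is observed through a random sensing matrix, and an iterative soft-thresholding (ISTA) loop recovers it. The canonical-basis sparsity, step size, and threshold are illustrative choices, not the reconstruction algorithm used in the dissertation.

```python
import numpy as np

def soft_threshold(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista(A, y, lam=0.05, n_iter=300):
    """Iterative soft-thresholding for min_x 0.5*||Ax - y||^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)
        x = soft_threshold(x - grad / L, lam / L)
    return x

rng = np.random.default_rng(0)
n, m, k = 256, 80, 8                        # signal length, measurements, sparsity
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.normal(size=k)

A = rng.normal(size=(m, n)) / np.sqrt(m)    # random (pseudorandom-mixing) sensing matrix
y = A @ x_true                              # compressive measurements, m << n

x_hat = ista(A, y)
print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```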
154

Compressive Light Field Reconstruction using Deep Learning

January 2017 (has links)
abstract: Light field imaging is limited by the heavy computational demands of dense sampling in both the spatial and angular dimensions. Single-shot light field cameras sacrifice spatial resolution to sample angular viewpoints, typically by multiplexing incoming rays onto a 2D sensor array. While this resolution can be recovered using compressive sensing, the iterative solutions involved are slow in processing a light field. We present a deep learning approach using a new two-branch network architecture, consisting jointly of an autoencoder and a 4D CNN, to recover a high-resolution 4D light field from a single coded 2D image. This network decreases reconstruction time significantly while achieving average PSNR values of 26-32 dB on a variety of light fields. In particular, reconstruction time is decreased from 35 minutes to 6.7 minutes compared to the dictionary method, for equivalent visual quality. These reconstructions are performed at small sampling/compression ratios as low as 8%, allowing for cheaper coded light field cameras. We test our network reconstructions on synthetic light fields, simulated coded measurements of real light fields captured from a Lytro Illum camera, and real coded images from a custom CMOS diffractive light field camera. The combination of compressive light field capture with deep learning opens the potential for real-time light field video acquisition systems in the future. / Dissertation/Thesis / Masters Thesis Computer Engineering 2017
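The coded single-shot capture described above can be sketched as a measurement model: each angular view of the 4D light field is modulated by a per-view mask and the results are summed onto a single 2D sensor image. The mask statistics and light-field dimensions below are arbitrary assumptions used only to show the shape of the inverse problem the network must solve.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 4D light field: (angular_u, angular_v, height, width).
U, V, H, W = 5, 5, 64, 64
light_field = rng.random((U, V, H, W)).astype(np.float32)

# One random binary modulation mask per angular view (illustrative coding scheme).
masks = (rng.random((U, V, H, W)) > 0.5).astype(np.float32)

# Single coded 2D sensor image: all angular views multiplexed onto one sensor.
coded_image = (masks * light_field).sum(axis=(0, 1)) / (U * V)

# The reconstruction network in the thesis learns the inverse mapping
# coded_image (H, W)  ->  light_field (U, V, H, W); here we only report shapes.
print(coded_image.shape, "->", light_field.shape)
```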
155

Compressive Visual Question Answering

January 2017 (has links)
abstract: Compressive sensing theory makes it possible to sense and reconstruct signals/images at a lower sampling rate than the Nyquist rate. Applications in resource-constrained environments stand to benefit from this theory, opening up many possibilities for new applications at the same time. The traditional inference pipeline for computer vision first reconstructs the image from compressive measurements. However, the reconstruction process is a computationally expensive step that also provides poor results at high compression rates. There have been several successful attempts to perform inference tasks, such as activity recognition, directly on compressive measurements. In this thesis, I tackle a more challenging vision problem - visual question answering (VQA) - without reconstructing the compressive images. I investigate the feasibility of this problem with a series of experiments, evaluate the proposed methods on a VQA dataset, and discuss promising results and directions for future work. / Dissertation/Thesis / Masters Thesis Computer Engineering 2017
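The idea of skipping reconstruction and running inference directly on compressive measurements can be shown with a toy pipeline: images are reduced to random projections and a simple classifier operates on those projections alone. The nearest-centroid classifier and synthetic two-class data are stand-ins; the thesis targets the much harder VQA task with learned models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "images": two classes with different mean brightness patterns.
n_per_class, dim, m = 200, 28 * 28, 64          # m compressive measurements per image
class0 = rng.normal(0.2, 0.3, size=(n_per_class, dim))
class1 = rng.normal(0.6, 0.3, size=(n_per_class, dim))
X = np.vstack([class0, class1])
y = np.array([0] * n_per_class + [1] * n_per_class)

Phi = rng.normal(size=(dim, m)) / np.sqrt(m)    # random sensing matrix
Z = X @ Phi                                     # compressive measurements (no reconstruction)

# Nearest-centroid classifier fit directly in measurement space.
centroids = np.stack([Z[y == c].mean(axis=0) for c in (0, 1)])
pred = np.argmin(((Z[:, None, :] - centroids[None]) ** 2).sum(-1), axis=1)
print("training accuracy on measurements:", (pred == y).mean())
```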
156

Unconstrained Periocular Face Recognition: From Reconstructive Dictionary Learning to Generative Deep Learning and Beyond

Juefei-Xu, Felix 01 April 2018 (has links)
Many real-world face recognition tasks are performed under unconstrained conditions such as off-angle pose variations, illumination variations, facial occlusion, facial expression, etc. In this work, we focus on real-world scenarios where only the periocular region of a face is visible, such as in the ISIS case. In Part I of the dissertation, we will showcase the face recognition capability based on the periocular region, which we call periocular face recognition. We will demonstrate that face matching using the periocular region directly is more robust than using the full face in terms of age-tolerant, expression-tolerant, and pose-tolerant face recognition, and that it contains more cues for determining the gender of a subject. In this dissertation, we will study direct periocular matching more comprehensively and systematically using both shallow and deep learning methods. Based on this, in Parts II and III of the dissertation, we will continue to explore an indirect way of carrying out periocular face recognition: periocular-based full face hallucination, because we want to capitalize on the powerful commercial face matchers and deep learning-based face recognition engines which are all trained on large-scale full face images. The reproducibility and feasibility of re-training for a proprietary facial region, such as the periocular region, is relatively low, due to the non-open-source nature of commercial face matchers as well as the amount of training data and computational power required by the deep learning-based models. We will carry out periocular-based full face hallucination with two proposed reconstructive dictionary learning methods, the dimensionally weighted K-SVD (DW-KSVD) dictionary learning approach and its kernel feature space counterpart using a Fastfood kernel expansion approximation, to reconstruct high-fidelity full face images from the periocular region. We will also propose two generative deep learning approaches that build upon deep convolutional generative adversarial networks (DCGAN) to generate the full face from periocular observations: the Gang of GANs (GoGAN) method and the discriminant nonlinear many-to-one generative adversarial networks (DNMM-GAN), with applications such as generative open-set landmark-free frontalization (Golf) for faces and universal face optimization (UFO), which tackle an even broader set of problems than periocular-based full face hallucination. Throughout Parts I-III, we will study how to handle challenging real-world conditions such as unconstrained pose variations, unconstrained illumination, and unconstrained low resolution of the periocular and facial images. Together, we aim to achieve unconstrained periocular face recognition through both direct periocular face matching and indirect periocular-based full face hallucination. In the final Part IV of the dissertation, we will go beyond and explore several new deep learning methods that are statistically efficient for general-purpose image recognition. These methods include local binary convolutional neural networks (LBCNN), perturbative neural networks (PNN), and polynomial convolutional neural networks (PolyCNN).
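A minimal version of the direct periocular matching described above — cropping the eye region from an aligned face image and comparing crops with a simple similarity score — is sketched below. The crop coordinates and cosine similarity on raw pixels are placeholder assumptions; the dissertation uses far richer shallow and deep features.

```python
import numpy as np

def periocular_crop(face, top=0.25, bottom=0.50):
    """Crop a horizontal band around the eyes from an aligned face image.
    The band fractions are illustrative and assume rough eye alignment."""
    h = face.shape[0]
    return face[int(top * h):int(bottom * h), :]

def cosine_similarity(a, b):
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def periocular_match_score(face_a, face_b):
    return cosine_similarity(periocular_crop(face_a), periocular_crop(face_b))

# Example with two synthetic 128x128 grayscale "faces".
rng = np.random.default_rng(0)
face1 = rng.random((128, 128))
face2 = face1 + 0.05 * rng.normal(size=(128, 128))   # slightly perturbed copy
print(periocular_match_score(face1, face2))
```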
157

A study of semantics across different representations of language

Dharmaretnam, Dhanush 28 May 2018 (has links)
Semantics is the study of meaning, and here we explore it through three major representations: brain, image, and text. Researchers have performed various studies to understand the similarities between semantic features across all three representations. Distributional Semantic (DS) models, or word vectors, trained on text corpora have been widely used to study the convergence of semantic information in the human brain. Moreover, they have been incorporated into various NLP applications such as document categorization, speech-to-text, and machine translation. Due to their widespread adoption by researchers and industry alike, it becomes imperative to test and evaluate the performance of different word vector models. In this thesis, we publish the second iteration of BrainBench: a system designed to evaluate and benchmark word vectors using brain data, now incorporating two new Italian brain datasets collected using fMRI and EEG technology. In the second half of the thesis, we explore semantics in Convolutional Neural Networks (CNNs). The CNN is a computational model that is the state-of-the-art technology for object recognition from images. However, these networks are currently considered a black box, and there is an apparent lack of understanding of why various CNN architectures perform better than others. In this thesis, we also propose a novel method to understand CNNs by studying the semantic representation through their hierarchical layers. The convergence of semantic information in these networks is studied with the help of DS models, following methodologies similar to those used to study semantics in the human brain. Our results provide substantial evidence that Convolutional Neural Networks do learn semantics from images, and that the features learned by CNNs correlate with the semantics of the object in the image. Our methodology and results could potentially pave the way for improved design and debugging of CNNs. / Graduate
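The methodology of comparing semantic spaces across representations can be illustrated with a small representational-similarity sketch: build a pairwise similarity matrix for the same set of concepts in each representation (e.g., a CNN layer's activations and distributional word vectors) and correlate the two matrices. The random features below stand in for real activations and embeddings, and the simple rank correlation ignores ties.

```python
import numpy as np

def cosine_sim_matrix(X):
    Xn = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-12)
    return Xn @ Xn.T

def upper_triangle(M):
    i, j = np.triu_indices_from(M, k=1)
    return M[i, j]

def spearman(a, b):
    """Spearman correlation implemented as Pearson correlation of simple ranks."""
    ra = np.argsort(np.argsort(a)).astype(float)
    rb = np.argsort(np.argsort(b)).astype(float)
    return float(np.corrcoef(ra, rb)[0, 1])

rng = np.random.default_rng(0)
n_concepts = 40
cnn_layer_feats = rng.normal(size=(n_concepts, 512))   # e.g. activations for 40 object images
word_vectors = rng.normal(size=(n_concepts, 300))      # e.g. DS vectors for the 40 object names

rsa_score = spearman(upper_triangle(cosine_sim_matrix(cnn_layer_feats)),
                     upper_triangle(cosine_sim_matrix(word_vectors)))
print("representational similarity:", rsa_score)
```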
158

Data-Driven Representation Learning in Multimodal Feature Fusion

January 2018 (has links)
abstract: Modern machine learning systems leverage data and features from multiple modalities to gain more predictive power. In most scenarios, the modalities are vastly different and the acquired data are heterogeneous in nature. Consequently, building highly effective fusion algorithms is at the core of achieving improved model robustness and inference performance. This dissertation focuses on representation learning approaches as the fusion strategy. Specifically, the objective is to learn a shared latent representation that jointly exploits the structural information encoded in all modalities, such that a straightforward learning model can be adopted to obtain the prediction. We first consider sensor fusion, a typical multimodal fusion problem critical to building a pervasive computing platform. A systematic fusion technique is described to support both multiple sensors and descriptors for activity recognition. Targeted at learning the optimal combination of kernels, Multiple Kernel Learning (MKL) algorithms have been successfully applied to numerous fusion problems in computer vision and beyond. Utilizing the MKL formulation, we next describe an auto-context algorithm for learning image context via fusion with low-level descriptors. Furthermore, a principled fusion algorithm using deep learning to optimize kernel machines is developed. By bridging deep architectures with kernel optimization, this approach leverages the benefits of both paradigms and is applied to a wide variety of fusion problems. In many real-world applications, the modalities exhibit highly specific data structures, such as time sequences and graphs, and consequently a special design of the learning architecture is needed. In order to improve temporal modeling for multivariate sequences, we developed two architectures centered around attention models. A novel clinical time series analysis model is proposed for several critical problems in healthcare. Another model, coupled with a triplet ranking loss as a metric learning framework, is described to better solve speaker diarization. Compared to state-of-the-art recurrent networks, these attention-based multivariate analysis tools achieve improved performance while having lower computational complexity. Finally, in order to perform community detection on multilayer graphs, a fusion algorithm is described that derives node embeddings from word embedding techniques and also exploits the complementary relational information contained in each layer of the graph. / Dissertation/Thesis / Doctoral Dissertation Electrical Engineering 2018
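A bare-bones version of the kernel-combination idea behind MKL-style fusion mentioned in this abstract is sketched below: RBF kernels computed on two modalities are mixed with weights and fed to kernel ridge regression. In true MKL the weights are learned jointly with the predictor; here they are fixed by hand as an assumption to keep the sketch short, and the synthetic modalities are arbitrary.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(0)
n = 120
modality_a = rng.normal(size=(n, 5))          # e.g. accelerometer descriptors
modality_b = rng.normal(size=(n, 8))          # e.g. gyroscope descriptors
y = np.sin(modality_a[:, 0]) + 0.5 * modality_b[:, 1] + 0.1 * rng.normal(size=n)

# Fixed fusion weights (an MKL solver would learn these from data).
w_a, w_b = 0.6, 0.4
K = w_a * rbf_kernel(modality_a, modality_a) + w_b * rbf_kernel(modality_b, modality_b)

# Kernel ridge regression on the fused kernel: alpha = (K + lam*I)^-1 y.
lam = 1e-2
alpha = np.linalg.solve(K + lam * np.eye(n), y)
y_hat = K @ alpha
print("training MSE:", float(((y_hat - y) ** 2).mean()))
```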
159

Tree-Based Deep Mixture of Experts with Applications to Visual Saliency Prediction and Quality Robust Visual Recognition

January 2018 (has links)
abstract: Mixture of experts is a machine learning ensemble approach that consists of individual models trained to be "experts" on subsets of the data, and a gating network that provides weights to output a combination of the expert predictions. Mixture of experts models do not currently see wide use due to the difficulty of training diverse experts and their high computational requirements. This work presents modifications of the mixture of experts formulation that use domain knowledge to improve training and incorporate parameter sharing among experts to reduce computational requirements. First, this work presents an application of mixture of experts models for quality-robust visual recognition. It is first shown that human subjects outperform deep neural networks on classification of distorted images, and a model, MixQualNet, is then proposed that is more robust to distortions. The proposed model consists of "experts" that are each trained on a particular type of image distortion. The final output of the model is a weighted sum of the expert models, where the weights are determined by a separate gating network. The proposed model also incorporates weight sharing to reduce the number of parameters as well as increase performance. Second, an application of mixture of experts to predict visual saliency is presented. A computational saliency model attempts to predict where humans will look in an image. In the proposed model, each expert network is trained to predict saliency for a set of closely related images. The final saliency map is computed as a weighted mixture of the expert networks' outputs, with weights determined by a separate gating network. The proposed model achieves better performance than several other visual saliency models and a baseline non-mixture model. Finally, this work introduces a saliency model that is a weighted mixture of models trained for different levels of saliency. Levels of saliency include high saliency, which corresponds to regions where almost all subjects look, and low saliency, which corresponds to regions where some, but not all, subjects look. The weighted mixture shows improved performance compared with the baseline models because of the diversity of the individual model predictions. / Dissertation/Thesis / Doctoral Dissertation Electrical Engineering 2018
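The basic mixture-of-experts computation described here — expert outputs combined with weights produced by a gating network — reduces to a few lines. The linear experts and softmax gate below are deliberately tiny assumptions; the dissertation's experts are deep networks and its gates condition on distortion type or image content.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mixture_of_experts(x, expert_weights, gate_weights):
    """x: (batch, d). Each expert is a linear map d -> k; the gate produces
    one weight per expert, and the output is the gate-weighted sum."""
    expert_outputs = np.stack([x @ W for W in expert_weights], axis=1)  # (batch, n_experts, k)
    gate = softmax(x @ gate_weights)                                    # (batch, n_experts)
    return (gate[..., None] * expert_outputs).sum(axis=1)               # (batch, k)

rng = np.random.default_rng(0)
d, k, n_experts, batch = 16, 10, 3, 4
experts = [rng.normal(scale=0.1, size=(d, k)) for _ in range(n_experts)]
gate_W = rng.normal(scale=0.1, size=(d, n_experts))

x = rng.normal(size=(batch, d))
print(mixture_of_experts(x, experts, gate_W).shape)   # -> (4, 10)
```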
160

Aprimoramento na predição de doses em casos de acidentes nucleares utilizando deep nets e GPU / Improving dose prediction for nuclear accident scenarios using deep nets and GPU

Desterro, Filipe Santana Moreira do, Instituto de Engenharia Nuclear, March 2018 (has links)
Recently, the use of mobile devices has been proposed for dose assessment during nuclear accidents. The idea is to support field teams by providing a rough estimate of the dose distribution map in the vicinity of the nuclear power plant (NPP), without the need to connect to the NPP systems. In order to provide this autonomous execution, a set of artificial neural networks (ANNs) is proposed in place of the traditional atmospheric dispersion of radionuclides (ADR) systems, which use complex physical models that require excessive processing time. One limitation observed in this approach is the very time-consuming training of the ANNs. In addition, if the number of input parameters increases, the performance of standard ANNs, such as the Multilayer Perceptron (MLP) with backpropagation training or General Regression Neural Networks (GRNN), is affected, significantly degrading the prediction. This work therefore focuses on the study of computational technologies to improve the ANNs to be used in the mobile application, as well as their training algorithms. To refine learning and allow better dose estimates, more complex ANN architectures are required. ANNs with many layers (far more than the typical number), sometimes referred to as Deep Neural Networks (DNNs), have been shown to obtain better results. On the other hand, the training of such ANNs is very slow. Thus, in order to allow the use of these DNNs within a reasonable training time, a parallel programming solution is proposed, using Graphics Processing Units (GPUs). In this context, this work used the TensorFlow framework to develop deep neural networks with 9 layers. As a result, speedups between 50 and 100 times (depending on the ANN architectures compared) were achieved in the training process, without affecting the quality of the obtained results (dose estimates).
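A deep fully connected network of the kind described (nine layers, built with TensorFlow and trained on GPU when one is available) might be set up roughly as follows. The layer widths, activation, optimizer, and the synthetic meteorological-style inputs and dose targets are assumptions for illustration; they are not the architecture or data used in the dissertation.

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-in data: a few release/weather parameters -> dose values
# at a handful of monitoring points (purely illustrative shapes).
rng = np.random.default_rng(0)
X = rng.random((5000, 12)).astype("float32")
y = rng.random((5000, 8)).astype("float32")

# Nine-layer fully connected network (8 hidden layers + output layer).
model = tf.keras.Sequential(
    [tf.keras.layers.Dense(128, activation="relu", input_shape=(12,))]
    + [tf.keras.layers.Dense(128, activation="relu") for _ in range(7)]
    + [tf.keras.layers.Dense(8)]
)

model.compile(optimizer="adam", loss="mse")

# TensorFlow places training on a GPU automatically if one is visible,
# which is where the reported speedups over CPU training come from.
model.fit(X, y, epochs=5, batch_size=256, verbose=0)
print(model.evaluate(X, y, verbose=0))
```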
