About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.

Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
141

Gesture passwords: concepts, methods and challenges

Wu, Jonathan 21 June 2016 (has links)
Biometrics are a convenient alternative to traditional forms of access control such as passwords and pass-cards since they rely solely on user-specific traits. Unlike alphanumeric passwords, biometrics cannot be given or told to another person, and unlike pass-cards, are always “on-hand.” Perhaps the most well-known biometrics with these properties are: face, speech, iris, and gait. This dissertation proposes a new biometric modality: gestures. A gesture is a short body motion that contains static anatomical information and changing behavioral (dynamic) information. This work considers both full-body gestures such as a large wave of the arms, and hand gestures such as a subtle curl of the fingers and palm. For access control, a specific gesture can be selected as a “password” and used for identification and authentication of a user. If this particular motion were somehow compromised, a user could readily select a new motion as a “password,” effectively changing and renewing the behavioral aspect of the biometric. This thesis describes a novel framework for acquiring, representing, and evaluating gesture passwords for the purpose of general access control. The framework uses depth sensors, such as the Kinect, to record gesture information from which depth maps or pose features are estimated. First, various distance measures, such as the log-Euclidean distance between feature covariance matrices and distances based on feature sequence alignment via dynamic time warping, are used to compare two gestures, and train a classifier to either authenticate or identify a user. In authentication, this framework yields an equal error rate on the order of 1-2% for body and hand gestures in non-adversarial scenarios. Next, through a novel decomposition of gestures into posture, build, and dynamic components, the relative importance of each component is studied. The dynamic portion of a gesture is shown to have the largest impact on biometric performance with its removal causing a significant increase in error. In addition, the effects of two types of threats are investigated: one due to self-induced degradations (personal effects and the passage of time) and the other due to spoof attacks. For body gestures, both spoof attacks (with only the dynamic component) and self-induced degradations increase the equal error rate as expected. Further, the benefits of adding additional sensor viewpoints to this modality are empirically evaluated. Finally, a novel framework that leverages deep convolutional neural networks for learning a user-specific “style” representation from a set of known gestures is proposed and compared to a similar representation for gesture recognition. This deep convolutional neural network yields significantly improved performance over prior methods. A byproduct of this work is the creation and release of multiple publicly available, user-centric (as opposed to gesture-centric) datasets based on both body and hand gestures.
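A hedged sketch of the two distance measures named above (not the dissertation's released code): the log-Euclidean distance between per-gesture feature covariance matrices, and a basic dynamic-time-warping alignment cost between feature sequences. The regularization constant and the Euclidean per-frame cost are illustrative assumptions.

```python
import numpy as np

def spd_log(c):
    """Matrix logarithm of a symmetric positive-definite matrix via eigendecomposition."""
    w, v = np.linalg.eigh(c)
    return (v * np.log(w)) @ v.T

def log_euclidean_distance(seq_a, seq_b, eps=1e-6):
    """Each gesture is a (frames x features) array; compare log-covariances."""
    def cov(seq):
        # eps * I keeps the covariance strictly positive definite (assumed regularizer).
        return np.cov(seq, rowvar=False) + eps * np.eye(seq.shape[1])
    return np.linalg.norm(spd_log(cov(seq_a)) - spd_log(cov(seq_b)), ord='fro')

def dtw_distance(seq_a, seq_b):
    """Classic O(n*m) dynamic time warping with Euclidean per-frame cost."""
    n, m = len(seq_a), len(seq_b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(seq_a[i - 1] - seq_b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

Either distance can then feed a threshold-based verifier or a nearest-neighbor identifier, in line with the classifier training the abstract describes.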
142

Artificial intelligence system for continuous affect estimation from naturalistic human expressions

Abd Gaus, Yona Falinie January 2018 (has links)
Automatic affect estimation from human expression is an active research topic in the computer vision community. Most reported affect recognition systems, however, only consider subjects performing well-defined acted expressions under very controlled conditions, so they are not robust enough for real-life recognition tasks with subject variation, varying acoustic surroundings, and illumination changes. In this thesis, an artificial intelligence system is proposed to continuously estimate affective behaviour (represented along a continuum, e.g., from -1 to +1) in terms of latent dimensions (e.g., arousal and valence) from naturalistic human expressions. To tackle these issues, both feature representation and machine learning strategies are addressed. For feature representation, human expression is captured through multiple modalities: audio, video, physiological signals, and text. Hand-crafted features are extracted from each modality per frame, in order to match the consecutive affect labels. However, the extracted features may be missing information due to factors such as background noise or lighting conditions. The Haar Wavelet Transform is therefore employed to determine whether a noise-cancellation mechanism in feature space should be considered in the design of an affect estimation system. Beyond hand-crafted features, deep learning features are also analysed layer-wise, across convolutional and fully connected layers. Convolutional Neural Networks such as AlexNet, VGGFace, and ResNet are selected as deep learning architectures for feature extraction from facial expression images. A multimodal fusion scheme then combines deep learning and hand-crafted features to improve performance. For the machine learning strategy, a two-stage regression approach is introduced. In the first stage, baseline regression methods such as Support Vector Regression are applied to estimate each affect dimension per time step. In the second stage, a subsequent model such as a Time Delay Neural Network, Long Short-Term Memory network, or Kalman Filter models the temporal relationships between consecutive estimates of each affect dimension. In doing so, the temporal information employed by the subsequent model is not biased by the high variability present in consecutive frames, and at the same time the network can exploit the slowly changing emotional dynamics more efficiently. Following the two-stage regression approach for unimodal affect analysis, the fusion of information from different modalities is elaborated. Continuous emotion recognition in the wild is addressed by investigating mathematical models for each emotion dimension: Linear Regression, Exponent Weighted Decision Fusion, and Multi-Gene Genetic Programming are implemented to quantify the relationships between modalities. In summary, the research presented in this thesis develops a fundamental approach for automatically and continuously estimating affect values from naturalistic human expressions. The proposed system consists of feature smoothing, deep learning features, a two-stage regression framework, and mathematical fusion across modalities. It offers a strong basis for artificial intelligence systems for continuous affect estimation, and more broadly for building real-time emotion recognition systems for human-computer interaction.
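As a hedged sketch of the two-stage regression described above, the code below fits a frame-wise Support Vector Regression and then applies a simple exponential smoother as a stand-in second stage; the thesis itself uses Time Delay Neural Networks, LSTMs, or a Kalman Filter there, so the smoother is an assumption made to keep the example self-contained.

```python
import numpy as np
from sklearn.svm import SVR

def two_stage_affect(train_X, train_y, test_X, alpha=0.1):
    # Stage 1: frame-wise regression from features to one affect dimension
    # (e.g., valence on a -1..+1 continuum).
    svr = SVR(kernel='rbf', C=1.0)
    svr.fit(train_X, train_y)
    raw = svr.predict(test_X)
    # Stage 2: temporal model over consecutive estimates; exponential
    # smoothing stands in for the TDNN / LSTM / Kalman Filter of the thesis.
    smooth = np.empty_like(raw)
    smooth[0] = raw[0]
    for t in range(1, len(raw)):
        smooth[t] = alpha * raw[t] + (1 - alpha) * smooth[t - 1]
    return smooth
```

The second stage makes each output depend on the history of frame-wise estimates rather than on any single noisy frame, which is exactly the bias-reduction argument made in the abstract.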
143

ScatterNet hybrid frameworks for deep learning

Singh, Amarjot January 2019 (has links)
Image understanding is the task of interpreting images by effectively solving the individual tasks of object recognition and semantic image segmentation. An image understanding system must have the capacity to distinguish between similar-looking image regions while being invariant in its response to regions that have been altered by appearance-altering transformations. The fundamental challenge for any such system lies in this simultaneous requirement for both invariance and specificity. Many image understanding systems have been proposed that capture geometric properties such as shapes, textures, motion and 3D perspective projections using filtering, non-linear modulus, and pooling operations. Deep learning networks, by contrast, ignore these geometric considerations and compute descriptors with suitable invariance and stability to geometric transformations using (end-to-end) learned multi-layered network filters. In recent years these deep learning networks have come to dominate the previously separate fields of research in machine learning, computer vision, natural language understanding and speech recognition. Despite the success of these deep networks, there remains a fundamental lack of understanding of their design and optimization, which makes them difficult to develop. Also, training these networks requires large labeled datasets which in numerous applications may not be available. In this dissertation, we propose the ScatterNet Hybrid Framework for Deep Learning, inspired by the circuitry of the visual cortex. The framework uses a hand-crafted front-end, an unsupervised-learning-based middle section, and a supervised back-end to rapidly learn hierarchical features from unlabelled data. Each layer in the proposed framework is automatically optimized to produce the desired computationally efficient architecture. The term 'Hybrid' is coined because the framework uses both unsupervised and supervised learning. We propose two hand-crafted front-ends that can extract locally invariant features from the input signals. Next, two ScatterNet Hybrid Deep Learning (SHDL) networks (one generative, one deterministic) are introduced by combining the proposed front-ends with two unsupervised learning modules which learn hierarchical features. These hierarchical features are finally used by a supervised learning module to solve the task of either object recognition or semantic image segmentation. The proposed front-ends have also been shown to improve the performance and learning of current deep supervised learning networks (VGG, NIN, ResNet) with reduced computing overhead.
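To make the "filtering, non-linear modulus, and pooling" recipe concrete, here is a minimal numpy sketch of a first-order scattering-style front-end built from a Gabor filter bank; filter sizes, frequencies, and pooling factors are illustrative assumptions rather than the dissertation's settings.

```python
import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(size, theta, freq, sigma):
    """Complex Gabor filter at orientation theta and spatial frequency freq."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    rot = xx * np.cos(theta) + yy * np.sin(theta)
    envelope = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return envelope * np.exp(1j * 2 * np.pi * freq * rot)

def scatter_front_end(image, n_orient=6, freq=0.25, sigma=3.0, pool=4):
    """First-order scattering-style features: |image * psi| then average pooling."""
    feats = []
    for k in range(n_orient):
        psi = gabor_kernel(15, np.pi * k / n_orient, freq, sigma)
        mod = np.abs(fftconvolve(image, psi, mode='same'))  # non-linear modulus
        # Local averaging gives stability to small translations and deformations.
        h, w = mod.shape
        mod = mod[:h - h % pool, :w - w % pool]
        feats.append(mod.reshape(h // pool, pool, w // pool, pool).mean(axis=(1, 3)))
    return np.stack(feats)
```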
144

Using Capsule Networks for Image and Speech Recognition Problems

January 2018 (has links)
In recent years, conventional convolutional neural networks (CNNs) have achieved outstanding performance in image and speech processing applications. Unfortunately, the pooling operation in a CNN discards spatial information, which is an important attribute in many applications. The recently proposed capsule network retains spatial information and improves on the capabilities of the traditional CNN. It uses capsules to describe features in multiple dimensions and dynamic routing to increase the statistical stability of the network. In this work, we first use a capsule network for the overlapping-digit recognition problem. We evaluate the performance of the network with respect to recognition accuracy, convergence, and training time per epoch. We show that the capsule network achieves higher accuracy when the training set is small; when the training set is larger, the capsule network and a conventional CNN have comparable recognition accuracy. The training time per epoch for the capsule network is longer than for a conventional CNN because of the dynamic routing algorithm. An analysis of the GPU timing shows that adjusting the capsule structure can significantly decrease the time complexity of the dynamic routing algorithm. Next, we design a capsule network for speech recognition, specifically overlapping-word recognition. We use both a capsule network and a conventional CNN to recognize 2 overlapping words in speech files created from 5 word classes. We show that the capsule network achieves a considerably higher recognition accuracy (96.92%) than the conventional CNN (85.19%). Our results show that the capsule network recognizes overlapping words by recognizing each individual word in the speech. We also verify the scalability of the capsule network by increasing the number of word classes from 5 to 10. The capsule network still shows a high recognition accuracy of 95.42% with 10 words, while the accuracy of the conventional CNN decreases sharply to 73.18%. / Dissertation/Thesis / Masters Thesis Electrical Engineering 2018
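For readers unfamiliar with the routing step whose cost the GPU timing analysis above refers to, this is a hedged numpy sketch of dynamic routing-by-agreement in the style of Sabour et al.; shapes and the iteration count are illustrative, and in a real layer the prediction vectors u_hat come from learned transformation matrices.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-9):
    """Non-linearity that keeps vector orientation and bounds length to [0, 1)."""
    norm2 = np.sum(s**2, axis=axis, keepdims=True)
    return (norm2 / (1.0 + norm2)) * s / np.sqrt(norm2 + eps)

def dynamic_routing(u_hat, n_iter=3):
    """u_hat: (n_in, n_out, dim) predictions from input to output capsules."""
    n_in, n_out, _ = u_hat.shape
    b = np.zeros((n_in, n_out))                               # routing logits
    for _ in range(n_iter):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)  # coupling coefficients
        s = (c[..., None] * u_hat).sum(axis=0)                # weighted sum per output capsule
        v = squash(s)                                         # output capsule vectors
        b += np.einsum('iod,od->io', u_hat, v)                # agreement update
    return v
```

The loop is why training time per epoch grows: every forward pass repeats this agreement computation, so shrinking the capsule structure directly shrinks the routing cost, as the timing analysis observes.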
145

Systematic generation of datasets and benchmarks for modern computer vision

Malireddi, Sri Raghu 03 April 2019 (has links)
Deep Learning is dominant in the field of computer vision thanks to its high performance, which is driven by large annotated datasets and proper evaluation benchmarks. However, two important areas in computer vision, depth-based hand segmentation and local features, lack, respectively, a large well-annotated dataset and a benchmark protocol that properly demonstrates their practical performance. Therefore, in this thesis, we focus on these two problems. For hand segmentation, we create a novel, systematic way to easily produce automatic semantic segmentation annotations for large datasets. We achieve this with the help of traditional computer vision techniques and a minimal hardware setup of one RGB-D camera and two distinctly colored skin-tight gloves. Our method allows easy creation of large-scale datasets with high annotation quality. For local features, we create a new, modern benchmark that evaluates their different aspects, specifically wide-baseline stereo matching and Multi-View Stereo (MVS), in a more practical setup, namely Structure-from-Motion (SfM). We believe that through our new benchmark we will be able to spur research on learned local features in a more practical direction. In this respect, the benchmark developed for the thesis will be used to host a challenge on local features. / Graduate
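The sketch below illustrates the colored-glove annotation idea under assumed HSV thresholds (the actual ranges depend on the gloves and lighting used in capture); each distinctly colored skin-tight glove is segmented by simple color thresholding, yielding per-hand semantic masks on every RGB-D frame at essentially no annotation cost.

```python
import cv2
import numpy as np

# Hypothetical HSV ranges for two glove colors; tune per capture setup.
GLOVE_RANGES = {
    'left_hand':  ((50, 80, 40), (70, 255, 255)),    # e.g., a green glove
    'right_hand': ((110, 80, 40), (130, 255, 255)),  # e.g., a blue glove
}

def glove_masks(bgr_frame):
    """Return one binary mask per glove color from an RGB-D camera's color frame."""
    hsv = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2HSV)
    masks = {}
    for name, (lo, hi) in GLOVE_RANGES.items():
        m = cv2.inRange(hsv, np.array(lo), np.array(hi))
        # Morphological opening removes speckle from sensor noise.
        m = cv2.morphologyEx(m, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
        masks[name] = m
    return masks
```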
146

Adaptive Lighting for Data-Driven Non-Line-Of-Sight 3D Localization

January 2019 (has links)
Non-line-of-sight (NLOS) imaging of objects not visible to either the camera or illumination source is a challenging task with vital applications including surveillance and robotics. Recent NLOS reconstruction advances have been achieved using time-resolved measurements, but acquiring these measurements requires expensive and specialized detectors and laser sources. This work proposes a data-driven approach for NLOS 3D localization requiring only a conventional camera and projector. The localization is performed using both a voxelization and a regression formulation. Accuracy of greater than 90% is achieved in localizing an NLOS object to a 5 cm × 5 cm × 5 cm volume in real data. By adopting the regression approach, an object of width 10 cm is localized to approximately 1.5 cm. To generalize to line-of-sight (LOS) scenes with non-planar surfaces, an adaptive lighting algorithm is adopted. This algorithm, based on radiosity, identifies and illuminates scene patches in the LOS which most contribute to the NLOS light paths, and can factor in system power constraints. Improvements ranging from 6% to 15% in accuracy with a non-planar LOS wall using adaptive lighting are reported, demonstrating the advantage of combining the physics of light transport with active illumination for data-driven NLOS imaging. / Dissertation/Thesis / Masters Thesis Electrical Engineering 2019
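A hedged sketch of the patch-selection step in the adaptive lighting algorithm: assuming per-patch contributions to the NLOS light paths have already been estimated by the radiosity model, the most useful LOS patches are illuminated subject to a total power budget. The greedy contribution-per-watt rule is an illustrative stand-in, not necessarily the thesis's exact allocation.

```python
import numpy as np

def select_patches(contribution, patch_power, power_budget):
    """Greedy knapsack-style selection of LOS wall patches to illuminate.

    contribution: per-patch estimated contribution to NLOS light paths
    patch_power:  per-patch illumination cost in watts
    """
    order = np.argsort(-contribution / patch_power)  # best contribution per watt first
    chosen, used = [], 0.0
    for i in order:
        if used + patch_power[i] <= power_budget:
            chosen.append(int(i))
            used += patch_power[i]
    return chosen
```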
147

Multi-scale convolutional neural networks for segmentation of pulmonary structures in computed tomography

Gerard, Sarah E. 01 December 2018 (has links)
Computed tomography (CT) is routinely used for diagnosing lung disease and developing treatment plans, using images of intricate lung structure with submillimeter resolution. Automated segmentation of anatomical structures in such images is important to enable efficient processing in clinical and research settings. Convolutional neural networks (ConvNets) are largely successful at image segmentation, with the ability to learn discriminative abstract features that yield generalizable predictions. However, constraints in hardware memory do not allow deep networks to be trained with high-resolution volumetric CT images. Restricted by memory, current applications of ConvNets to volumetric medical images use a subset of the full image, limiting the capacity of the network to learn informative global patterns. Local patterns, such as edges, are necessary for precise boundary localization; however, they suffer from low specificity. Global information can disambiguate structures that are locally similar. The central thesis of this doctoral work is that both local and global information is important for segmentation of anatomical structures in medical images. A novel multi-scale ConvNet is proposed that divides the learning task across multiple networks, each of which learns features over a different range of scales. It is hypothesized that multi-scale ConvNets will lead to improved segmentation performance, as no compromise needs to be made between image resolution, image extent, and network depth. Three multi-scale models were designed to target segmentation of three pulmonary structures: lungs, fissures, and lobes. The proposed models were evaluated on diverse datasets and compared to architectures that do not use both local and global features. The lung model was evaluated on humans and three animal species; the results demonstrated that the multi-scale model outperformed single-scale models at different resolutions. The fissure model showed superior performance compared to both a traditional Hessian filter and a standard U-Net architecture that is limited in global extent. The results demonstrated that multi-scale ConvNets improved pulmonary CT segmentation by incorporating both local and global features using multiple ConvNets within a memory-constrained system. Overall, the proposed pipeline achieved high accuracy and was robust to variations resulting from different imaging protocols, reconstruction kernels, scanners, lung volumes, and pathological alterations, demonstrating its potential for enabling high-throughput image analysis in clinical and research settings.
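A minimal PyTorch sketch (not the thesis code) of the multi-scale idea: one branch sees a downsampled volume for global context while another sees a full-resolution local patch, and their features are fused before the voxel-wise prediction, so neither resolution nor extent has to be sacrificed. Channel counts and layer depths are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleSeg(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.local_branch = nn.Sequential(   # high-res patch: precise boundaries
            nn.Conv3d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv3d(16, 16, 3, padding=1), nn.ReLU())
        self.global_branch = nn.Sequential(  # downsampled volume: global context
            nn.Conv3d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv3d(16, 16, 3, padding=1), nn.ReLU())
        self.head = nn.Conv3d(32, n_classes, 1)  # voxel-wise classifier

    def forward(self, local_patch, global_volume):
        lf = self.local_branch(local_patch)
        gf = self.global_branch(global_volume)
        # Resample global features onto the local patch grid, then fuse.
        gf = F.interpolate(gf, size=lf.shape[2:], mode='trilinear',
                           align_corners=False)
        return self.head(torch.cat([lf, gf], dim=1))
```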
148

A STUDY OF REAL TIME SEARCH IN FLOOD SCENES FROM UAV VIDEOS USING DEEP LEARNING TECHNIQUES

Gagandeep Singh Khanuja (7486115) 17 October 2019 (has links)
Following a natural disaster, one of the most important factors that influences a person's chances of survival and of being found is how quickly they are rescued. Traditional search operations involving dogs, ground robots, and humanitarian intervention are time-intensive and can be a major bottleneck. The main aim of these operations is to rescue victims without critical delay, in the shortest time possible, which can be realized in real time by using UAVs. With advancements in computational devices and the ability to learn from complex data, deep learning can be leveraged in a real-time environment for search and rescue operations. This research aims to improve on traditional search operations by using deep learning for real-time object detection and photogrammetry for precise geo-location mapping of the detected objects (person, car) in real time. To this end, various pre-trained algorithms (Mask-RCNN, SSD300, YOLOv3) and custom-trained algorithms (YOLOv3) have been deployed, and their results compared, as means of addressing the search operation in real time.
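As a hedged sketch of the photogrammetry step, the function below projects a detected pixel onto flat ground given the UAV's altitude and camera intrinsics, assuming a nadir-pointing camera; a real flight needs the full camera pose and a terrain model, and all parameter names here are illustrative.

```python
def pixel_to_ground(px, py, altitude_m, fx, fy, cx, cy, uav_east, uav_north):
    """Map a detection's pixel (px, py) to local East/North ground coordinates.

    fx, fy, cx, cy: pinhole intrinsics; altitude_m: height above flat ground.
    """
    # For a downward-looking pinhole camera, ground offset scales with altitude.
    east = uav_east + altitude_m * (px - cx) / fx
    north = uav_north - altitude_m * (py - cy) / fy  # image y grows downward
    return east, north
```

In a full pipeline, (px, py) would be the center of a bounding box returned by the detector (e.g., YOLOv3) on each UAV video frame.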
149

Association Learning Via Deep Neural Networks

Landeen, Trevor J. 01 May 2018 (has links)
Deep learning has been making headlines in recent years and is often portrayed as an emerging technology on a meteoric rise towards fully sentient artificial intelligence. In reality, deep learning is the most recent renaissance of a 70-year-old technology and is far from possessing true intelligence. The renewed interest is motivated by recent successes on challenging problems, the accessibility made possible by hardware developments, and dataset availability. The predecessor to deep learning, commonly known as the artificial neural network, is a computational network set up to mimic the biological neural structure found in brains. However, unlike human brains, artificial neural networks in most cases cannot make inferences from one problem to another. As a result, developing an artificial neural network requires a large number of examples of desired behavior for a specific problem. Furthermore, developing an artificial neural network capable of solving the problem can take days, or even weeks, of computation. Two specific problems addressed in this dissertation are both input association problems. One problem challenges a neural network to identify overlapping regions in images and is used to evaluate the ability of a neural network to learn associations between inputs of similar types. The other problem asks a neural network to identify which observed wireless signals originated from which observed potential sources, and is used to assess the ability of a neural network to learn associations between inputs of different types. The neural network solutions to both problems introduced, discussed, and evaluated in this dissertation demonstrate deep learning's applicability to problems which have previously attracted little attention.
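A minimal sketch, assuming a generic architecture rather than the dissertation's, of a deep association scorer: two encoders (which may differ when the input types differ, as in the signal-to-source problem) embed each input, and a small head scores whether the pair is associated.

```python
import torch
import torch.nn as nn

class AssociationNet(nn.Module):
    def __init__(self, dim_a, dim_b, hidden=64):
        super().__init__()
        self.enc_a = nn.Sequential(nn.Linear(dim_a, hidden), nn.ReLU())
        self.enc_b = nn.Sequential(nn.Linear(dim_b, hidden), nn.ReLU())
        self.head = nn.Linear(2 * hidden, 1)  # logit: associated or not

    def forward(self, a, b):
        # Concatenate the two embeddings and score the pairing.
        return self.head(torch.cat([self.enc_a(a), self.enc_b(b)], dim=-1))
```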
150

Organisation, mathematical content and feedback in special education : A qualitative case study of some special educators' mathematics teaching

Bäckström, Inger January 2008 (has links)
This is a qualitative case study of five special educators' work with special education in mathematics. The aim is to map their work and conceptions by describing and analyzing how they organize their mathematics teaching, how they teach the subject matter, how they give feedback to students in the classroom, and how they themselves describe what they perceive as the special education component of their teaching. For the collection of empirical data, a qualitative approach was used, with semi-structured in-depth interviews and observations in the form of audio recordings and unstructured field notes taken during lessons. Frame factor theory, phenomenographic theory, learning theories, and the case study method guided the processing and analysis of the empirical material. The results show that the learning offered to students by four of the special educators contributes to a surface approach to learning, and the feedback they offer contributes to external motivation. The special education components they consider important are: to see the person behind a behavior, to have positive expectations, to base the teaching on the child's experience, to be personal, to like the students, and to be concrete in teaching. The fifth educator offers a deep approach to learning and a way of working during lessons that creates the will and motivation to learn. The special education component she considers important is to find out where the student stands in terms of knowledge and to work from there, so that the student has the opportunity to understand what he or she has not yet understood.
