11

Image understanding for automatic human and machine separation

Romero Macias, Cristina January 2013 (has links)
The research presented in this thesis aims to extend the capabilities of human interaction proofs in order to improve security in web applications and services. The research focuses on developing a more robust and efficient Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA) to increase the gap between human recognition and machine recognition. Two main novel approaches are presented, each targeting a different area of human and machine recognition: a character recognition test and an image recognition test. Along with the novel approaches, a categorisation of the available CAPTCHA methods is also introduced. The character recognition CAPTCHA is based on the creation of depth perception by using shadows to represent characters. The characters are formed by imaginary shadows cast by a light source, drawing on the Gestalt principle that human beings perceive whole forms rather than just a collection of simple lines and curves. This approach was developed in two stages: firstly, two-dimensional characters, and secondly, three-dimensional character models. The image recognition CAPTCHA is based on the creation of cartoons from faces. The faces used belong to people in the entertainment business, politicians, and sports figures. The principal basis of this approach is that face perception is a cognitive process that humans perform easily and with a high rate of success. The process uses face morphing techniques to distort the faces into cartoons, making the resulting image more robust against machine recognition. Exhaustive tests on both approaches using OCR software, SIFT image recognition, and face recognition software show an improvement in the human recognition rate, whilst preventing robots from breaking through the tests.
12

Using Capsule Networks for Image and Speech Recognition Problems

January 2018 (has links)
abstract: In recent years, the conventional convolutional neural network (CNN) has achieved outstanding performance in image and speech processing applications. Unfortunately, the pooling operation in a CNN discards important spatial information, which is a valuable attribute in many applications. The recently proposed capsule network retains spatial information and improves on the capabilities of the traditional CNN. It uses capsules to describe features in multiple dimensions and dynamic routing to increase the statistical stability of the network. In this work, we first use a capsule network for the overlapping digit recognition problem. We evaluate the performance of the network with respect to recognition accuracy, convergence, and training time per epoch. We show that the capsule network achieves higher accuracy when the training set is small; when the training set is larger, the capsule network and the conventional CNN have comparable recognition accuracy. The training time per epoch for the capsule network is longer than for the conventional CNN because of the dynamic routing algorithm. An analysis of the GPU timing shows that adjusting the capsule structure can significantly decrease the time complexity of the dynamic routing algorithm. Next, we design a capsule network for speech recognition, specifically overlapping word recognition. We use both a capsule network and a conventional CNN to recognize 2 overlapping words in speech files created from 5 word classes. We show that the capsule network achieves a considerably higher recognition accuracy (96.92%) compared to the conventional CNN (85.19%). Our results show that the capsule network recognizes overlapping words by recognizing each individual word in the speech. We also verify the scalability of the capsule network by increasing the number of word classes from 5 to 10. The capsule network still shows a high recognition accuracy of 95.42% in the case of 10 words, while the accuracy of the conventional CNN decreases sharply to 73.18%.
/ Dissertation/Thesis / Masters Thesis Electrical Engineering 2018
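The dynamic routing step mentioned in this abstract can be sketched in a few lines of NumPy. This is an illustrative outline of routing-by-agreement as commonly published, not the thesis's implementation; the capsule shapes, iteration count and squashing nonlinearity are assumptions.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-9):
    # Squashing nonlinearity: preserves direction, maps the norm into [0, 1).
    sq = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq / (1.0 + sq)) * s / np.sqrt(sq + eps)

def dynamic_routing(u_hat, n_iters=3):
    # u_hat: prediction vectors from lower capsules, shape (n_in, n_out, dim).
    n_in, n_out, dim = u_hat.shape
    b = np.zeros((n_in, n_out))  # routing logits
    for _ in range(n_iters):
        # Coupling coefficients: softmax over output capsules.
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)
        # Weighted sum of predictions per output capsule, then squash.
        s = (c[..., None] * u_hat).sum(axis=0)
        v = squash(s)  # output capsules, shape (n_out, dim)
        # Agreement update: reinforce routes whose prediction matches output.
        b = b + np.einsum('iod,od->io', u_hat, v)
    return v

rng = np.random.default_rng(0)
u_hat = rng.normal(size=(8, 3, 4))  # 8 input capsules, 3 outputs, 4-d poses
v = dynamic_routing(u_hat)
print(v.shape)  # (3, 4)
```

The iterative update is what makes routing more expensive per epoch than pooling, matching the timing observation in the abstract.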
13

Incremental nonparametric discriminant analysis based active learning and its applications

Dhoble, Kshitij January 2010 (has links)
Learning is an innate general cognitive ability that has empowered living beings, and especially humans, with intelligence. It is exercised by acquiring new knowledge and skills that enable them to adapt and survive. With the advancement of technology, a large amount of information gets amassed, and due to its sheer volume, analysing it manually is infeasible and impractical. Therefore, for the analysis of massive data we need machines (such as computers) with the ability to learn and evolve in order to discover new knowledge from the analysed data. The majority of traditional machine learning algorithms function optimally on parametric (static) data. However, the datasets acquired in real practice are often vast, inaccurate, inconsistent, non-parametric and highly volatile. Therefore, a learning algorithm’s optimized performance can only be transitory, which calls for a learning algorithm that can constantly evolve and adapt according to the data it processes. In light of the need for such a machine learning algorithm, we look for inspiration in humans’ innate cognitive learning ability. Active learning is one such biologically inspired model, designed to mimic humans’ dynamic, evolving, adaptive and intelligent cognitive learning ability. Active learning is a class of learning algorithms that aim to create an accurate classifier by iteratively selecting important unlabeled data points by means of adaptive querying and training the classifier on those data points which are potentially useful for the targeted learning task (Tong & Koller, 2002). Traditional active learning techniques are implemented under supervised or semi-supervised learning settings (Pang et al., 2009).
Our proposed model performs active learning in an unsupervised setting by introducing a discriminative selective sampling criterion, which reduces the computational cost by substantially decreasing the number of irrelevant instances to be learned by the classifier. Methods based on passive learning (which assume the entire training dataset is truly informative and is presented in advance) prove to be inadequate in real-world applications (Pang et al., 2009). To overcome this limitation, we have developed Active Mode Incremental Nonparametric Discriminant Analysis (aIncNDA), which undertakes adaptive discriminant selection of instances for incremental NDA learning. NDA is a discriminant analysis method that has been incorporated into our selective sampling technique in order to reduce the effect of outliers (anomalous observations/data points in a dataset). It works with significant efficiency on anomalous datasets, thereby minimizing the computational cost (Raducanu & Vitrià, 2008). NDA is one of the methods used in the proposed active learning model. This thesis presents research on discrimination-based active learning where NDA is extended for fast discriminant analysis and data sampling. In addition to NDA, a base classifier (such as a Support Vector Machine (SVM) or k-Nearest Neighbor (k-NN)) is applied to discover and merge the knowledge from newly acquired data. The performance of our proposed method is evaluated against benchmark University of California, Irvine (UCI) datasets, face image datasets, and object image category datasets. The assessment carried out on the UCI datasets showed that aIncNDA performs on par with, and in many cases better than, incremental NDA while using a lower number of instances. Additionally, aIncNDA performs efficiently under different levels of redundancy, and more often than not shows better discrimination performance than passive incremental NDA.
In an application undertaking face image and object image recognition and retrieval, the proposed multi-example active learning system dynamically and incrementally learns from newly obtained images, gradually reducing its retrieval (classification) error rate by means of iterative refinement. The results of the empirical investigation show that our proposed active learning model can be used for classification with increased efficiency. Furthermore, given the nature of network data, which is large, streaming, and constantly changing, we believe that our method can find practical application in the field of Internet security.
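The selective sampling idea above, training only on instances near a decision boundary, can be illustrated with a small sketch. This is not aIncNDA itself: it substitutes a simple nearest-centroid margin for the NDA criterion, and all names and parameters are hypothetical.

```python
import numpy as np

def margin_scores(X, centroids):
    # Distance margin between the two nearest class centroids:
    # a small margin means the point lies near a decision boundary.
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=-1)
    d.sort(axis=1)
    return d[:, 1] - d[:, 0]

def select_informative(X_pool, centroids, k):
    # Selective sampling: pick the k pool points with the smallest margin,
    # i.e. the candidates most likely to refine the classifier.
    scores = margin_scores(X_pool, centroids)
    return np.argsort(scores)[:k]

rng = np.random.default_rng(0)
# Two synthetic clusters standing in for two classes.
X_pool = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(4, 1, (50, 2))])
centroids = np.array([[0.0, 0.0], [4.0, 4.0]])
idx = select_informative(X_pool, centroids, k=10)
```

The payoff described in the abstract, fewer instances for comparable accuracy, comes from discarding the pool points whose labels the current model could already predict confidently.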
15

Robotics semantic localization using deep learning techniques

Cruz, Edmanuel 20 March 2020 (has links)
The tremendous technological advances of recent years have allowed the development and implementation of algorithms capable of performing different tasks that help humans in their daily lives. Scene recognition is one of the fields that has benefited most from these advances. Scene recognition gives different systems the ability to define a context for the identification or recognition of objects or places. In this same line of research, semantic localization allows a robot to identify a place semantically. Semantic classification is currently an exciting topic and the main goal of a large number of works. Within this context, it is a challenge for a system or a mobile robot to identify an environment semantically, either because the environment is visually different or because it has been gradually modified. Changing environments are challenging scenarios because, in real-world applications, the system must be able to adapt to them. This research focuses on recent techniques for categorizing places that take advantage of deep learning (DL) to produce a semantic definition for a zone. As a contribution to the solution of this problem, this work designs a method capable of updating a previously trained model. This method was used as a module of an agenda system to help people with cognitive problems in their daily tasks. An augmented reality mobile phone application was also designed which uses DL techniques to determine a customer’s location and provide useful information, thus improving their shopping experience. These solutions are described and explained in detail throughout the following document.
16

Compressing Deep Convolutional Neural Networks

Mancevo del Castillo Ayala, Diego January 2017 (has links)
Deep Convolutional Neural Networks, and "deep learning" in general, stand at the cutting edge of a range of applications, from image-based recognition and classification to natural language processing, speech and speaker recognition, and reinforcement learning. Very deep models, however, are often large, complex and computationally expensive to train and evaluate, so deep learning models are seldom deployed natively in environments where computational resources are scarce or expensive. To address this problem we turn our attention to a range of techniques that we collectively refer to as "model compression", in which a lighter student model is trained to approximate the output produced by the model we wish to compress. To this end, the output from the original model is used to craft the training labels of the smaller student model. This work contains experiments on CIFAR-10 and demonstrates how to use these techniques to compress a people-counting model whose precision, recall and F1-score are improved by as much as 14% against our baseline.
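The student-teacher scheme described above is commonly realised as knowledge distillation, where softened teacher outputs become the student's training labels. A minimal sketch under that assumption follows; the temperature value and tensor shapes are illustrative, not taken from the thesis.

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax; higher T flattens the distribution.
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_targets(teacher_logits, T=4.0):
    # Soft labels: softened teacher probabilities carry more inter-class
    # information ("dark knowledge") than hard one-hot labels.
    return softmax(teacher_logits, T)

def distillation_loss(student_logits, teacher_logits, T=4.0):
    # Cross-entropy between softened teacher and softened student outputs;
    # the student minimises this instead of (or alongside) the hard-label loss.
    p = distillation_targets(teacher_logits, T)
    q = softmax(student_logits, T)
    return -np.mean(np.sum(p * np.log(q + 1e-12), axis=-1))
```

The loss is minimised exactly when the student reproduces the teacher's softened distribution, which is what "crafting the training labels" from the original model amounts to.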
17

Finding license-plates in varying lighting conditions using two machine learning methods

Sturesson, André, Böök, Johannes January 2023 (has links)
Object detection and machine learning are important fields in computer science. This report presents two methods for finding the bounding box of a license plate and evaluates which approach best copes with varying lighting conditions. The first method uses edge detection to find a number of potential candidates, where each candidate is fed to a machine learning model that decides whether the candidate is a license plate or not. This method had an accuracy of 39% and appears to struggle with varying light levels: the lowest accuracy was measured at the highest and lowest mean brightness values. The second method relies mostly on machine learning to find the bounding box of a license plate and achieved a higher accuracy of 68%. This method seems to perform better in low-light conditions and is more uniform in accuracy across different lighting conditions.
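The first method's two-stage structure, cheap candidate generation followed by a learned accept/reject decision, can be sketched as follows. This is an illustrative reconstruction, not the report's code: the vertical-edge cue, window size and threshold are assumptions, and the stage-2 classifier is left out.

```python
import numpy as np

def vertical_edge_map(img):
    # Characters on a plate produce dense vertical edges, so a strong
    # horizontal gradient is a cheap cue for candidate regions.
    return np.abs(np.diff(img.astype(float), axis=1))

def propose_candidates(img, win=(10, 30), stride=5):
    # Stage 1: slide a window and keep regions whose edge density is well
    # above the image average; stage 2 (not shown) would pass each crop
    # to a trained classifier that accepts or rejects it as a plate.
    edges = vertical_edge_map(img)
    thresh = 2.0 * edges.mean()
    h, w = win
    boxes = []
    for y in range(0, edges.shape[0] - h + 1, stride):
        for x in range(0, edges.shape[1] - w + 1, stride):
            if edges[y:y + h, x:x + w].mean() > thresh:
                boxes.append((x, y, w, h))
    return boxes

# Synthetic frame: a striped "plate" on a flat background.
frame = np.zeros((60, 120))
frame[20:35, 40:90] = np.tile([0.0, 255.0], 25)
candidates = propose_candidates(frame)
```

A global-mean threshold like this is exactly the kind of heuristic that degrades at extreme brightness levels, which is consistent with the accuracy pattern the report observed for its candidate-based method.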
18

TOPOLOGICAL PROPERTIES OF A NETWORK OF SPIKING NEURONS IN FACE IMAGE RECOGNITION

Shin, Joo-Heon 24 March 2010 (has links)
We introduce a novel system for the recognition of partially occluded and rotated images. The system is based on a hierarchical network of integrate-and-fire spiking neurons with random synaptic connections and a novel organization process. The network generates integrated output sequences that are used for image classification. The network performed satisfactorily given an appropriate topology, i.e. a number of neurons and synaptic connections corresponding to the size of the input images. A comparison of the Synaptic Plasticity Activity Rule (SAPR) and Spike Timing Dependent Plasticity (STDP) rules, used to update the connections between neurons, indicated that SAPR gave better results and was therefore used throughout. Test results showed that the network performed better than Support Vector Machines. We also introduced a stopping criterion based on entropy, which significantly shortened the iterative process while only slightly affecting classification performance.
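The integrate-and-fire neuron underlying such networks can be sketched compactly. This is a generic leaky integrate-and-fire model, not the thesis's network (which adds random synaptic connections and the SAPR update rule); the time constant, threshold and input values are illustrative.

```python
def simulate_lif(input_current, tau=20.0, v_thresh=1.0, v_reset=0.0, dt=1.0):
    # Leaky integrate-and-fire: the membrane potential leaks toward rest,
    # integrates input current, and emits a spike (recorded by its time
    # step) whenever it crosses the threshold, after which it resets.
    v = 0.0
    spikes = []
    for t, current in enumerate(input_current):
        v += dt * (-v / tau + current)
        if v >= v_thresh:
            spikes.append(t)
            v = v_reset
    return spikes

# Constant drive well above threshold produces a regular spike train.
spike_times = simulate_lif([0.15] * 100)
```

A classifier like the one described reads out such spike sequences: stronger or better-matched input drives earlier and more frequent spikes, and the integrated output sequence encodes the image.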
19

Attribute learning for image/video understanding

Fu, Yanwei January 2015 (has links)
For the past decade, computer vision research has achieved increasing success in visual recognition, including object detection and video classification. Nevertheless, these achievements still cannot meet the urgent needs of image and video understanding. The recent rapid development of social media sharing has created a huge demand for automatic media classification and annotation techniques. In particular, these types of media data usually contain very complex social activities of a group of people (e.g. a YouTube video of a wedding reception) and are captured by consumer devices with poor visual quality. Thus it is extremely challenging to automatically understand such a high number of complex image and video categories, especially when these categories have never been seen before. One way to understand categories with no or few examples is transfer learning, which transfers knowledge across related domains, tasks, or distributions. In particular, lifelong learning, which aims at transferring information to tasks without any observed data, has recently become popular. In computer vision, transfer learning often takes the form of attribute learning. The key underpinning idea of attribute learning is to exploit transfer learning via intermediate-level semantic representations: attributes. Semantic attributes are most commonly used as a semantically meaningful bridge between low-level feature data and higher-level class concepts, since they can be used both descriptively (e.g., 'has legs') and discriminatively (e.g., 'cats have it but dogs do not'). Previous works propose many different attribute learning models for image and video understanding. However, several intrinsic limitations and problems exist in previous attribute learning work.
The limitations discussed in this thesis include the restrictions of user-defined attributes, projection domain-shift problems, prototype sparsity problems, the inability to combine multiple semantic representations, and noisy annotations of relative attributes. To tackle these limitations, this thesis explores attribute learning for image and video understanding from three aspects. Firstly, to overcome the limitations of user-defined attributes, a framework for learning latent attributes is presented for the automatic classification and annotation of unstructured group social activity in videos, which enables attribute learning for understanding complex multimedia data with sparse and incomplete labels. We investigate the learning of latent attributes for content-based understanding, which aims to model and predict classes and tags relevant to objects, sounds and events: anything likely to be used by humans to describe or search for media. Secondly, we propose a framework of transductive multi-view embedding hypergraph label propagation and solve three inherent limitations of most previous attribute learning work, i.e., the projection domain-shift problem, the prototype sparsity problem and the inability to combine multiple semantic representations. We explore the manifold structure of the data distributions of different views projected onto the same embedding space via label propagation on a graph. Thirdly, a novel framework for robust learning is presented to effectively learn relative attributes from extremely noisy and sparse annotations. Relative attributes are increasingly learned from pairwise comparisons collected via crowdsourcing tools, which are more economical and scalable than conventional laboratory-based data annotation. However, a major challenge of a crowdsourcing strategy is the detection and pruning of outliers.
We thus propose a principled way to identify annotation outliers by formulating the relative attribute prediction task as a unified robust learning to rank problem, tackling both the outlier detection and relative attribute prediction tasks jointly. In summary, this thesis studies and solves the key challenges and limitations of attribute learning in image/video understanding. We show the benefits of solving these challenges and limitations in our approach which thus achieves better performance than previous methods.
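The core idea of attributes as a bridge between features and unseen classes can be illustrated with a toy direct-attribute-prediction sketch. The class names, attribute table and matching rule here are hypothetical, and the thesis's own models (latent attributes, multi-view embedding, robust ranking) are considerably richer.

```python
import numpy as np

# Hypothetical class-attribute table: each class is described by
# human-interpretable attributes rather than by training examples.
# Attribute order: has_legs, has_stripes, has_wings.
CLASS_ATTRIBUTES = {
    "zebra": np.array([1, 1, 0]),
    "eagle": np.array([1, 0, 1]),
    "snake": np.array([0, 0, 0]),
}

def zero_shot_classify(predicted_attributes):
    # Match attribute scores (as an attribute predictor trained on *seen*
    # classes would emit them) against each class signature; an unseen
    # class needs no training images, only its attribute description.
    names = list(CLASS_ATTRIBUTES)
    dists = [np.linalg.norm(predicted_attributes - CLASS_ATTRIBUTES[n])
             for n in names]
    return names[int(np.argmin(dists))]

print(zero_shot_classify(np.array([0.9, 0.8, 0.1])))  # prints "zebra"
```

The limitations the thesis targets show up even in this toy: the table is user-defined, the attribute predictors are trained on a different distribution (projection domain shift), and each class has only a single prototype vector.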
20

Development of three AI techniques for 2D platform games

Persson, Martin January 2005 (has links)
This thesis serves as an introduction for anyone who has an interest in artificial intelligence in games and experience in programming, or anyone who knows nothing of computer games but wants to learn about them. The first part presents a brief introduction to AI, then an introduction to games and game programming for someone with little knowledge of games. This part covers game programming terminology, different game genres and a little history of games, followed by an introduction to a couple of common techniques used in game AI. The main contribution of this dissertation is in the second part, where three techniques that were never properly implemented before 3D games took over the market are introduced, together with an explanation of how they would need to be implemented to live up to today’s standards and demands. These are: line of sight, image recognition and pathfinding. These three techniques are used in today’s 3D games, so if a 2D game were to be released today, the demands on its AI would be much higher than they were ten years ago when 2D games stagnated. The last part is an evaluation of the three discussed topics.
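Of the three techniques, line of sight is the easiest to make concrete for a 2D tile map: trace the grid cells between two points and report whether a wall interrupts the line. A minimal sketch using Bresenham's line algorithm follows; the grid layout and coordinate conventions are illustrative, not taken from the thesis.

```python
def line_of_sight(grid, a, b):
    # Bresenham traversal from cell a to cell b; sight is blocked if any
    # intermediate cell holds a wall (truthy value). The endpoints are not
    # tested, so an agent standing next to a wall can still see past it.
    (x0, y0), (x1, y1) = a, b
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    while (x0, y0) != (x1, y1):
        if (x0, y0) != a and grid[y0][x0]:
            return False
        e2 = 2 * err
        if e2 >= dy:
            err += dy
            x0 += sx
        if e2 <= dx:
            err += dx
            y0 += sy
    return True

# 0 = open floor, 1 = wall.
grid = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
```

The same grid traversal is the usual building block for 2D pathfinding as well, where an algorithm such as A* expands neighbouring cells instead of walking a single line.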
