  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
791

Mixture models for clustering and dimension reduction

Verbeek, Jakob 08 December 2004 (has links) (PDF)
In Chapter 1 we give a general introduction and motivate the need for clustering and dimension reduction methods. We continue in Chapter 2 with a review of different types of existing clustering and dimension reduction methods.

In Chapter 3 we introduce mixture densities and the expectation-maximization (EM) algorithm to estimate their parameters. Although the EM algorithm has many attractive properties, it is not guaranteed to return optimal parameter estimates. We present greedy EM parameter estimation algorithms which start with a one-component mixture and then iteratively add a component to the mixture and re-estimate the parameters of the current mixture. Experimentally, we demonstrate that our algorithms avoid many of the sub-optimal estimates returned by the EM algorithm. Finally, we present an approach to accelerate the estimation of mixture densities from many data points. We apply this approach to both the standard EM algorithm and our greedy EM algorithm.

In Chapter 4 we present a non-linear dimension reduction method that uses a constrained EM algorithm for parameter estimation. Our approach is similar to Kohonen's self-organizing map, but in contrast to the self-organizing map, our parameter estimation algorithm is guaranteed to converge and optimizes a well-defined objective function. In addition, our method allows data with missing values to be used for parameter estimation, and it is readily applied to data that is not specified by real numbers but, for example, by discrete variables. We present the results of several experiments to demonstrate our method and to compare it with Kohonen's self-organizing map.

In Chapter 5 we consider an approach for non-linear dimension reduction which is based on a combination of clustering and linear dimension reduction. This approach forms one global non-linear low-dimensional data representation by combining multiple, locally valid, linear low-dimensional representations. We derive an improvement of the original parameter estimation algorithm, which requires less computation and leads to better parameter estimates. We experimentally compare this approach to several other dimension reduction methods. We also apply this approach to a setting where high-dimensional 'outputs' have to be predicted from high-dimensional 'inputs'. Experimentally, we show that the considered non-linear approach leads to better predictions than a similar approach which also combines several local linear representations, but does not combine them into one global non-linear representation.

In Chapter 6 we summarize our conclusions and discuss directions for further research.
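The greedy strategy described in Chapter 3 can be sketched in a few lines. The following is a minimal illustrative version, assuming scikit-learn's GaussianMixture for the EM re-estimation step and a simple worst-explained-point heuristic for seeding each new component; it is not the thesis implementation.

```python
# A simplified greedy EM sketch for Gaussian mixtures: start from a single
# component, repeatedly seed a new component at the point the current mixture
# explains worst, and re-estimate all parameters with EM. Illustrative only;
# the seeding heuristic and stopping rule are assumptions, not the thesis method.
import numpy as np
from sklearn.mixture import GaussianMixture

def greedy_gmm(X, max_components=10, tol=1e-3, seed=0):
    gmm = GaussianMixture(n_components=1, random_state=seed).fit(X)
    best_ll = gmm.score(X)                      # mean log-likelihood per sample
    for k in range(2, max_components + 1):
        worst = X[np.argmin(gmm.score_samples(X))]   # worst-explained data point
        means_init = np.vstack([gmm.means_, worst])
        cand = GaussianMixture(n_components=k, means_init=means_init,
                               random_state=seed).fit(X)
        cand_ll = cand.score(X)
        if cand_ll - best_ll < tol:             # stop when the gain is negligible
            break
        gmm, best_ll = cand, cand_ll
    return gmm

# Toy usage: two well-separated clusters should yield two components.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(6, 1, (200, 2))])
print(greedy_gmm(X).n_components)
```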
792

SPEECH AND LANGUAGE TECHNOLOGIES FOR SEMANTICALLY LINKED INSTRUCTIONAL CONTENT

Swaminathan, Ranjini January 2011 (has links)
Recent advances in technology have made it possible to offer educational content online in the form of e-learning systems. The Semantically Linked Instructional Content (SLIC) system, developed at The University of Arizona, is one such system that hosts educational and technical videos online. This dissertation proposes the integration of speech and language technologies with the SLIC system.

Speech transcripts are being used increasingly in video browsing systems to help users understand the video content better and to search the content with text queries. Transcripts are especially useful for people with disabilities and those who have a limited understanding of the language of the video. Automatic Speech Recognizers (ASRs) are commonly used to generate speech transcripts for videos but are not consistent in their performance. This issue is more pronounced in a system like SLIC because the technical nature of the talks means many words are not in the ASR vocabulary, and the many speakers with different voices and accents make recognition harder.

The videos in SLIC come with presentation slides that contain words specific to the talk subject, and the speech transcript itself can be considered to be composed of these slide words interspersed with other words. Furthermore, the errors in the transcript are words that sound similar to what was actually spoken; "notes" instead of "nodes", for example. The errors that occur due to misrecognized slide words can be fixed if we know which slide words were actually spoken and where they occur in the transcript. In other words, the slide words are matched, or aligned, with the transcript.

In this dissertation two algorithms are developed to phonetically align transcript words with slide words, based on a hidden Markov model and a hybrid hidden semi-Markov model respectively. The slide words constitute the hidden states and the transcript words are the observed states in both models. The alignment algorithms are adapted for different applications such as transcript correction (as already mentioned), search and indexing, video segmentation, and closed captioning. Results from experiments conducted show that the corrected transcripts have improved accuracy and yield better search results for slide word queries.
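To make the alignment idea concrete, here is a toy Viterbi alignment in which slide words serve as hidden states and transcript words as observations. A simple string-similarity score stands in for the phoneme-level emission model, and the left-to-right transition structure is a rough simplification of the HMM and hidden semi-Markov formulations described above.

```python
# Toy Viterbi alignment: slide words act as hidden states and ASR transcript
# words as observations. Real phonetic similarity is replaced by a crude
# string-similarity stand-in (difflib), so this is only an illustrative sketch,
# not the dissertation's HMM / hidden semi-Markov formulation.
import numpy as np
from difflib import SequenceMatcher

def emission(slide_word, transcript_word):
    # Stand-in for a phoneme-level match probability.
    sim = SequenceMatcher(None, slide_word.lower(), transcript_word.lower()).ratio()
    return max(sim, 1e-3)

def align(slide_words, transcript_words, stay=0.5):
    S, T = len(slide_words), len(transcript_words)
    logp = np.full((T, S), -np.inf)          # best log-score ending in state s at time t
    back = np.zeros((T, S), dtype=int)
    logp[0] = [np.log(emission(s, transcript_words[0])) for s in slide_words]
    for t in range(1, T):
        for s in range(S):
            # Allow staying on the same slide word or arriving from any earlier one.
            trans = np.where(np.arange(s + 1) == s, stay, (1 - stay) / max(s, 1))
            prev = logp[t - 1, :s + 1] + np.log(trans)
            back[t, s] = np.argmax(prev)
            logp[t, s] = prev[back[t, s]] + np.log(emission(slide_words[s], transcript_words[t]))
    path = [int(np.argmax(logp[-1]))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return list(reversed(path))              # slide-word index for each transcript word

slides = ["hidden", "markov", "nodes"]
transcript = ["the", "hidden", "markov", "model", "uses", "notes"]
print(align(slides, transcript))   # the ASR error "notes" should map to the slide word "nodes"
```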
793

Using Real-Time Physiological and Behavioral Data to Predict Students' Engagement during Problem Solving: A Machine Learning Approach

Cirett Galan, Federico M. January 2012 (has links)
The goal of this study was to evaluate whether Electroencephalography (EEG) estimates of attention and cognitive workload, captured as students solved math problems, could be used to predict success or failure at solving the problems. Students solved a series of SAT math problems while wearing an EEG headset that generated estimates of sustained attention and cognitive workload each second. Students also reported on their level of frustration and the perceived difficulty of each problem. Results from training a Support Vector Machine (SVM) indicated that problem outcomes could be correctly predicted from the combination of attention and workload signals at rates better than chance. The EEG data also correlated with students' self-reports of problem difficulty. Findings suggest that relatively non-intrusive EEG technologies could be used to improve the efficacy of tutoring systems.
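A hedged sketch of the kind of classifier the abstract describes: summarize the per-second attention and workload signals for each problem attempt and train an SVM to predict solved versus failed. The summary features and synthetic data below are illustrative assumptions, not the study's actual feature set or data.

```python
# Predict problem success/failure from per-second attention and workload
# estimates. Feature choices (means and standard deviations over the problem)
# are illustrative assumptions.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def problem_features(attention, workload):
    # Summarize the two per-second EEG signals for one problem attempt.
    return [np.mean(attention), np.std(attention),
            np.mean(workload), np.std(workload)]

rng = np.random.default_rng(0)
# Synthetic stand-in data: 100 problem attempts with variable-length signals.
X = np.array([problem_features(rng.random(rng.integers(30, 120)),
                               rng.random(rng.integers(30, 120)))
              for _ in range(100)])
y = rng.integers(0, 2, size=100)             # 1 = solved, 0 = failed

clf = SVC(kernel="rbf", C=1.0)
print(cross_val_score(clf, X, y, cv=5).mean())   # chance-level on random labels
```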
794

Human Action Recognition on Videos: Different Approaches

Mejia, Maria Helena January 2012 (has links)
The goal of human action recognition on videos is to determine automatically what is happening in a video. This work focuses on providing an answer to this question: given consecutive frames from a video in which a person or persons are performing an action, can an automatic system recognize the action that is going on for each person? Seven approaches are provided, most of them based on an alignment process used to obtain a measure of distance or similarity for classification. Some are based on fluents that are converted to qualitative sequences of Allen relations, making it possible to measure the distance between a pair of sequences by aligning them. The fluents are generated in various ways: representations based on feature extraction of human pose propositions in a single image or a small sequence of images, changes of time series mainly on the angle of slope, changes of the time series focused on the slope direction, and propositions based on symbolic sequences generated by SAX. Another approach based on alignment corresponds to Dynamic Time Warping on subsets of highly dependent parts of the body. An additional approach explored is based on SAX symbolic sequences and their pairwise alignment. The last approach is based on discretization of the multivariate time series, but instead of alignment, a spectrum kernel and an SVM are used, as is done to classify protein sequences in biology. Finally, a sliding window method is used to recognize the actions along the video. These approaches were tested on three datasets derived from RGB-D cameras (e.g., Microsoft Kinect) as well as ordinary video, and a selection of the approaches was compared to the results of other researchers.
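The Dynamic Time Warping approach mentioned above can be illustrated with a minimal univariate example; the dissertation applies DTW to subsets of highly dependent body parts, so the sketch below is only a simplified stand-in.

```python
# A small dynamic time warping (DTW) sketch applied to univariate joint-angle
# series; two executions of the same action at different speeds remain close
# under DTW even though they differ point-by-point.
import numpy as np

def dtw_distance(a, b):
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed warping steps.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Two elbow-angle sequences of the same action at different speeds stay close.
wave_fast = np.sin(np.linspace(0, 2 * np.pi, 30))
wave_slow = np.sin(np.linspace(0, 2 * np.pi, 60))
print(dtw_distance(wave_fast, wave_slow))      # small compared to a flat series
print(dtw_distance(wave_fast, np.zeros(60)))
```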
795

Recognizing User Identity by Touch on Tabletop Displays: An Interactive Authentication Method

Torres Peralta, Raquel January 2012 (has links)
Multi-touch tablets allow users to interact with computers through intuitive, natural gestures and direct manipulation of digital objects. One advantage of these devices is that they can offer a large, collaborative space where several users can work on a task at the same time. However, the lack of privacy in these situations means that standard password-based authentication is easily compromised. This work presents a new gesture-based authentication system based on users' unique signature of touch motion. The technique has two key features. First, at each step in authentication the system prompts the user to make a specific gesture, selected to maximize the expected long-term information gain. Second, each gesture is integrated using a hierarchical probabilistic model, allowing the system to accept or reject a user after a variable number of gestures. This touch-based approach would allow the user to authenticate accurately without the need to cover their hand or look over their shoulder. The method has been tested using a set of samples collected under real-world conditions in a business office, with a touch tablet that was used on a near-daily basis by users familiar with the device. Despite the lack of sophisticated, high-precision equipment, the system is able to achieve high user recognition accuracy with relatively few gestures, demonstrating that human touch patterns have a distinctive "signature" that can be used as a powerful biometric measure for user recognition and personalization.
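The accept-after-a-variable-number-of-gestures idea can be sketched as a sequential evidence-accumulation loop. The version below uses simple per-gesture Gaussian score models and picks the most discriminative remaining gesture as a crude stand-in for the expected-information-gain criterion; the hierarchical probabilistic model of the thesis is not reproduced.

```python
# Sequential authentication sketch: each gesture yields a feature score,
# per-user and impostor Gaussian models turn it into a log-likelihood ratio,
# and evidence accumulates until an accept/reject threshold is crossed.
import numpy as np
from scipy.stats import norm

GESTURES = ["swipe", "pinch", "rotate", "tap"]

def authenticate(user_models, impostor_models, perform, accept=3.0, reject=-3.0):
    """user_models/impostor_models: gesture -> (mean, std); perform(g) -> observed score."""
    evidence, remaining = 0.0, list(GESTURES)
    while remaining:
        # Pick the most discriminative remaining gesture (proxy for info gain).
        g = max(remaining,
                key=lambda gg: abs(user_models[gg][0] - impostor_models[gg][0])
                               / (user_models[gg][1] + impostor_models[gg][1]))
        remaining.remove(g)
        x = perform(g)
        evidence += (norm.logpdf(x, *user_models[g])
                     - norm.logpdf(x, *impostor_models[g]))
        if evidence >= accept:
            return True
        if evidence <= reject:
            return False
    return evidence > 0                       # fall back to the sign of the evidence

# Toy usage with made-up models and a genuine user whose scores match their model.
user = {g: (1.0, 0.2) for g in GESTURES}
impostor = {g: (0.0, 0.5) for g in GESTURES}
rng = np.random.default_rng(0)
print(authenticate(user, impostor, lambda g: rng.normal(1.0, 0.2)))   # likely True
```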
796

Emotional Sophistication: Studies of Facial Expressions in Games

Rossi, Filippo January 2012 (has links)
Decision-making is a complex process. Monetary incentives constitute one of the forces driving it; however, the motivational space of decision-makers is much broader. We care about other people, we experience emotional reactions, and sometimes we make mistakes. Such social motivations (Sanfey, 2007) drive our own decisions, as well as affect our beliefs about what motivates others' decisions. Behavioral and brain sciences have started addressing the role of social motivations in economic games (Camerer, 2004; Glimcher et al., 2009); however, several aspects of social decisions, such as the process of thinking about others' emotional states - emotional sophistication - have rarely been investigated. The goal of this project is to use automatic measurements of dynamic facial expressions to investigate non-monetary motivations and emotional sophistication. The core of our approach is to use state-of-the-art computer vision techniques to extract facial actions from videos in real time (based on the Facial Action Coding System of Ekman and Friesen (1978)) while participants are playing economic games. We will use powerful statistical machine learning techniques to make inferences about participants' internal emotional states during these interactions. These inferences will be used (a) to predict behavior; (b) to explain why a decision is made in terms of the hidden forces driving it; and (c) to investigate the ways in which people construct their beliefs about other people's future actions. The contributions of this targeted interdisciplinary project are threefold. First, it develops new methodologies to study decision processes. Second, it uses these methods to test hypotheses about the role of first-order beliefs about social motivations. Finally, our statistical approach sets the ground for "affectively aware" systems that can use facial expressions to assess the internal states of their users, thus improving human-machine interactions.
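A minimal sketch of the prediction step outlined above: summarize per-frame facial action unit intensities over a decision window and learn to predict the player's choice. The action units, window features, classifier, and synthetic data below are illustrative assumptions, not the project's actual pipeline.

```python
# Predict a player's decision from facial action unit (AU) intensities
# summarized over a decision window. AU names and features are assumed
# examples for illustration only.
import numpy as np
from sklearn.linear_model import LogisticRegression

AUS = ["AU04_brow_lowerer", "AU12_lip_corner_puller"]   # assumed example AUs

def window_features(au_frames):
    """au_frames: array of shape (n_frames, n_AUs) of AU intensities."""
    return np.concatenate([au_frames.mean(axis=0), au_frames.max(axis=0)])

rng = np.random.default_rng(1)
# Synthetic stand-in: 80 decisions, 2-second windows at 30 fps.
X = np.array([window_features(rng.random((60, len(AUS)))) for _ in range(80)])
y = rng.integers(0, 2, size=80)              # 1 = accept offer, 0 = reject

model = LogisticRegression().fit(X, y)
print(model.predict(X[:5]))
```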
797

Machine Learning in Computational Biology: Models of Alternative Splicing

Shai, Ofer 03 March 2010 (has links)
Alternative splicing, the process by which a single gene may code for similar but different proteins, is an important process in biology, linked to development, cellular differentiation, genetic diseases, and more. Genome-wide analysis of alternative splicing patterns and regulation has recently been made possible by new high-throughput techniques for monitoring gene expression and genomic sequencing. This thesis introduces two algorithms for alternative splicing analysis based on large microarray and genomic sequence data. The algorithms, based on generative probabilistic models that capture structure and patterns in the data, are used to study global properties of alternative splicing. In the first part of the thesis, a microarray platform for monitoring alternative splicing is introduced. A spatial noise removal algorithm that removes artifacts and improves data fidelity is presented. The GenASAP algorithm (generative model for alternative splicing array platform) models the non-linear process in which targeted molecules bind to a microarray’s probes and is used to predict patterns of alternative splicing. Two versions of GenASAP have been developed. The first uses a variational approximation to infer the relative amounts of the targeted molecules, while the second incorporates a more accurate noise and generative model and utilizes Markov chain Monte Carlo (MCMC) sampling. GenASAP, the first method to provide quantitative predictions of alternative splicing patterns on large-scale data sets, is shown to generate useful and precise predictions based on independent RT-PCR validation (a slow but more accurate approach to measuring cellular expression patterns). In the second part of the thesis, the results obtained by GenASAP are analysed to reveal jointly regulated genes. The sequences of the genes are examined for potential regulatory factor binding sites using a new motif finding algorithm designed for this purpose. The motif finding algorithm, called GenBITES (generative model for binding sites), uses a fully Bayesian generative model for sequences, and the MCMC approach used for inference in the model includes moves that can efficiently create or delete motifs, and extend or contract the width of existing motifs. GenBITES has been applied to several synthetic and real data sets, and is shown to be highly competitive at a task for which many algorithms already exist. Although developed to analyze alternative splicing data, GenBITES outperforms most reported results on a benchmark data set based on transcription data.
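For readers unfamiliar with motif finding, a compact Gibbs-sampling motif sampler is sketched below. It only illustrates the general family of methods; GenBITES is a fully Bayesian model whose MCMC moves can create or delete motifs and change motif width, none of which this fixed-width sketch attempts.

```python
# Fixed-width Gibbs-sampling motif finder: hold out one sequence, build a
# position weight matrix (PWM) from the motif positions in the others, and
# resample the held-out sequence's motif start proportionally to the PWM score.
import numpy as np

BASES = "ACGT"
IDX = {b: i for i, b in enumerate(BASES)}

def gibbs_motif(seqs, width, iters=200, seed=0):
    rng = np.random.default_rng(seed)
    pos = [rng.integers(0, len(s) - width + 1) for s in seqs]   # one motif start per sequence
    for _ in range(iters):
        for i in range(len(seqs)):
            # Build a PWM from all sequences except i (with +1 pseudocounts).
            counts = np.ones((width, 4))
            for j, s in enumerate(seqs):
                if j != i:
                    for w, base in enumerate(s[pos[j]:pos[j] + width]):
                        counts[w, IDX[base]] += 1
            pwm = counts / counts.sum(axis=1, keepdims=True)
            # Score every possible start in sequence i and sample one.
            starts = len(seqs[i]) - width + 1
            scores = np.array([np.prod([pwm[w, IDX[seqs[i][p + w]]] for w in range(width)])
                               for p in range(starts)])
            pos[i] = rng.choice(starts, p=scores / scores.sum())
    return pos

seqs = ["TTTTACGTGGTT", "CCACGTGGCCCC", "GGGGGACGTGGA"]
print(gibbs_motif(seqs, width=6))   # should settle near the shared ACGTGG motif
```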
798

Representing and learning affordance-based behaviors

Hermans, Tucker Ryer 22 May 2014 (has links)
Autonomous robots deployed in complex, natural human environments such as homes and offices need to manipulate numerous objects throughout their deployment. For an autonomous robot to operate effectively in such a setting and not require excessive training from a human operator, it should be capable of discovering how to reliably manipulate novel objects it encounters. We characterize the possible methods by which a robot can act on an object using the concept of affordances. We define affordance-based behaviors as object manipulation strategies available to a robot, which correspond to specific semantic actions over which a task-level planner or end user of the robot can operate. This thesis concerns itself with developing the representation of these affordance-based behaviors along with associated learning algorithms. We identify three specific learning problems. The first asks which affordance-based behaviors a robot can successfully apply to a given object, including ones seen for the first time. Second, we examine how a robot can learn to best apply a specific behavior as a function of an object’s shape. Third, we investigate how learned affordance knowledge can be transferred between different objects and different behaviors. We claim that decomposing affordance-based behaviors into three separate factors (a control policy, a perceptual proxy, and a behavior primitive) aids an autonomous robot in learning to manipulate. Having a varied set of affordance-based behaviors available allows a robot to learn which behaviors perform most effectively as a function of an object’s identity or pose in the workspace. For a specific behavior a robot can use interactions with previously encountered objects to learn to robustly manipulate a novel object when first encountered. Finally, our factored representation allows a robot to transfer knowledge learned with one behavior to effectively manipulate an object in a qualitatively different manner by using a distinct controller or behavior primitive. We evaluate all work on a bimanual, mobile-manipulator robot. In all experiments the robot interacts with real-world objects sensed by an RGB-D camera.
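The three-way factorization claimed above maps naturally onto a simple data structure. The sketch below is illustrative only; the class and function names are assumptions, not the thesis API.

```python
# An affordance-based behavior bundles a perceptual proxy, a control policy,
# and a behavior primitive, so individual factors can be swapped and knowledge
# reused across objects and behaviors.
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class AffordanceBehavior:
    name: str                                  # semantic action, e.g. "push"
    perceptual_proxy: Callable[[Any], Any]     # raw object observation -> compact features
    control_policy: Callable[[Any], Any]       # features -> robot commands
    behavior_primitive: Callable[[Any], None]  # low-level skill that executes the commands

    def execute(self, observation):
        features = self.perceptual_proxy(observation)
        commands = self.control_policy(features)
        self.behavior_primitive(commands)

# Transferring knowledge: reuse a learned policy with a different primitive,
# e.g. the same pushing policy executed by a qualitatively different push skill.
def make_variant(behavior, new_primitive, name):
    return AffordanceBehavior(name, behavior.perceptual_proxy,
                              behavior.control_policy, new_primitive)
```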
799

Parallelizing support vector machines for scalable image annotation

Alham, Nasullah Khalid January 2011 (has links)
Machine learning techniques have facilitated image retrieval by automatically classifying and annotating images with keywords. Among them, Support Vector Machines (SVMs) are used extensively due to their generalization properties. However, SVM training is a notably computationally intensive process, especially when the training dataset is large. In this thesis distributed computing paradigms have been investigated to speed up SVM training, by partitioning a large training dataset into small data chunks and processing each chunk in parallel utilizing the resources of a cluster of computers. A resource-aware parallel SVM algorithm is introduced for large-scale image annotation using a cluster of computers. A genetic-algorithm-based load balancing scheme is designed to optimize the performance of the algorithm in heterogeneous computing environments. SVM was initially designed for binary classification. However, most classification problems arising in domains such as image annotation usually involve more than two classes. A resource-aware parallel multiclass SVM algorithm for large-scale image annotation using a cluster of computers is therefore introduced. The combination of classifiers leads to a substantial reduction of classification error in a wide range of applications. Among them, SVM ensembles with bagging are shown to outperform a single SVM in terms of classification accuracy. However, SVM ensemble training is a notably computationally intensive process, especially when the number of replicated samples generated by bootstrapping is large. A distributed SVM ensemble algorithm for image annotation is introduced which re-samples the training data by bootstrapping and trains an SVM on each sample in parallel using a cluster of computers. The above algorithms are evaluated in both experimental and simulation environments, showing that the distributed SVM algorithm, the distributed multiclass SVM algorithm, and the distributed SVM ensemble algorithm reduce the training time significantly while maintaining a high level of classification accuracy.
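The core partition-and-train idea can be sketched as follows, assuming scikit-learn for the per-chunk SVMs and a simple majority vote to combine them; the thesis's resource-aware scheduling and genetic-algorithm load balancing are not reproduced here.

```python
# Split the training set into chunks, train an SVM per chunk in parallel
# processes, and combine the chunk models by majority vote.
import numpy as np
from multiprocessing import Pool
from sklearn.svm import SVC

def train_chunk(args):
    X_chunk, y_chunk = args
    return SVC(kernel="rbf").fit(X_chunk, y_chunk)

def parallel_svm(X, y, n_chunks=4):
    chunks = list(zip(np.array_split(X, n_chunks), np.array_split(y, n_chunks)))
    with Pool(n_chunks) as pool:
        return pool.map(train_chunk, chunks)       # one SVM per chunk

def ensemble_predict(models, X):
    votes = np.stack([m.predict(X) for m in models])
    # Majority vote across chunk models (binary labels assumed).
    return (votes.mean(axis=0) >= 0.5).astype(int)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(2000, 16))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)
    models = parallel_svm(X, y)
    print((ensemble_predict(models, X) == y).mean())   # training-set accuracy
```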
800

Learning generative models of mid-level structure in natural images

Heess, Nicolas Manfred Otto January 2012 (has links)
Natural images arise from complicated processes involving many factors of variation. They reflect the wealth of shapes and appearances of objects in our three-dimensional world, but they are also affected by factors such as distortions due to perspective, occlusions, and illumination, giving rise to structure with regularities at many different levels. Prior knowledge about these regularities and suitable representations that allow efficient reasoning about the properties of a visual scene are important for many image processing and computer vision tasks. This thesis focuses on models of image structure at intermediate levels of complexity as required, for instance, for image inpainting or segmentation. It aims at developing generative, probabilistic models of this kind of structure, and, in particular, at devising strategies for learning such models in a largely unsupervised manner from data. One hallmark of natural images is that they can often be decomposed into regions with very different visual characteristics. The main approach of this thesis is therefore to represent images in terms of regions that are characterized by their shapes and appearances, and an image is then composed from many such regions. We explore approaches to learn about the appearance of regions, to learn about region shapes, and ways to combine several regions to form a full image. To achieve this goal, we make use of some ideas for unsupervised learning developed in the literature on models of low-level image structure and in the “deep learning” literature. These models are used as building blocks of more structured model formulations that incorporate additional prior knowledge of how images are formed. The thesis makes the following contributions: Firstly, we investigate a popular, MRF-based prior of natural image structure, the Field-of-Experts, with respect to its ability to model image textures, and propose an extended formulation that is considerably more successful at this task. This formulation gives rise to a fully parametric, translation-invariant probabilistic generative model of image textures. We illustrate how this model can be used as a component of a more comprehensive model of images comprising multiple textured regions. Secondly, we develop a model of region shape. This work is an extension of the “Masked Restricted Boltzmann Machine” proposed by Le Roux et al. (2011), and it allows explicit reasoning about the independent shapes and relative depths of occluding objects. We develop an inference and unsupervised learning scheme and demonstrate how this shape model, in combination with the masked RBM, gives rise to a good model of natural image patches. Finally, we demonstrate how this model of region shape can be extended to model shapes in large images. The result is a generative model of large images which are formed by composition from many small, partially overlapping and occluding objects.
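As background for the shape model discussed above, the sketch below shows a plain binary restricted Boltzmann machine trained with one-step contrastive divergence, the generic building block the masked RBM extends. The masking and depth-ordering machinery of the thesis is not shown.

```python
# A minimal binary RBM trained with one-step contrastive divergence (CD-1),
# shown only as the generic unsupervised component behind masked-RBM models.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(V, n_hidden=32, lr=0.05, epochs=20, seed=0):
    rng = np.random.default_rng(seed)
    n_visible = V.shape[1]
    W = 0.01 * rng.normal(size=(n_visible, n_hidden))
    b_v, b_h = np.zeros(n_visible), np.zeros(n_hidden)
    for _ in range(epochs):
        # Positive phase: hidden activations driven by the data.
        ph = sigmoid(V @ W + b_h)
        h = (rng.random(ph.shape) < ph).astype(float)
        # Negative phase: one Gibbs step back to a reconstruction.
        pv = sigmoid(h @ W.T + b_v)
        ph_recon = sigmoid(pv @ W + b_h)
        # CD-1 parameter updates, averaged over the batch.
        W += lr * (V.T @ ph - pv.T @ ph_recon) / len(V)
        b_v += lr * (V - pv).mean(axis=0)
        b_h += lr * (ph - ph_recon).mean(axis=0)
    return W, b_v, b_h

# Toy usage on random binary "patches".
V = (np.random.default_rng(1).random((500, 64)) < 0.3).astype(float)
W, b_v, b_h = train_rbm(V)
print(W.shape)
```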
