Watkins, Andrew B.
Thesis (M.S.)--Mississippi State University. Department of Computer Science. / Title from title screen. Includes bibliographical references.
Sadid-Al-Hasan, Sheikh, University of Lethbridge. Faculty of Arts and Science
The term “Google” has become a verb for most of us. Search engines, however, have certain limitations. For example, ask one for the impact of the current global financial crisis in different parts of the world, and you can expect to sift through thousands of results for the answer. This motivates research in complex question answering, where the purpose is to create summaries of large volumes of information as answers to complex questions, rather than simply offering a listing of sources. Unlike simple questions, complex questions cannot be answered easily, as they often require inferencing and synthesizing information from multiple documents. Hence, this task is accomplished by query-focused multi-document summarization systems. In this thesis we apply different supervised learning techniques to confront the complex question answering problem. To run our experiments, we consider the DUC-2007 main task. A huge amount of labeled data is a prerequisite for supervised training, and manual labeling by humans is expensive and time-consuming. Automatic labeling can be a good remedy to this problem. We employ five different automatic annotation techniques to build extracts from human abstracts, using ROUGE, Basic Element (BE) overlap, a syntactic similarity measure, a semantic similarity measure, and the Extended String Subsequence Kernel (ESSK). The representative supervised methods we use are Support Vector Machines (SVM), Conditional Random Fields (CRF), Hidden Markov Models (HMM) and Maximum Entropy (MaxEnt). We annotate DUC-2006 data and use them to train our systems, whereas 25 topics of the DUC-2007 data set are used as test data. The evaluation results reveal the impact of automatic labeling methods on the performance of the supervised approaches to complex question answering. We also experiment with two ensemble-based approaches that show promising results for this problem domain. / x, 108 leaves : ill. ; 29 cm
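The automatic annotation step described above can be illustrated with a minimal sketch. It assumes a simplified ROUGE-1 recall (whitespace tokenization, no stemming or stop-word handling) and a hypothetical `label_sentences` helper that marks the best-matching document sentences as positive training examples; the thesis's actual ROUGE configuration and labeling thresholds are not given in the abstract, so the function names and the `top_k` parameter are illustrative only.

```python
from collections import Counter


def rouge1_recall(candidate: str, reference: str) -> float:
    """Fraction of reference unigrams (with multiplicity) also found in the candidate."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    if not ref:
        return 0.0
    overlap = sum(min(count, cand[word]) for word, count in ref.items())
    return overlap / sum(ref.values())


def label_sentences(doc_sentences, abstract_sentences, top_k=2):
    """Label the top_k document sentences (by best overlap with any
    abstract sentence) as +1 and all others as -1.

    Hypothetical helper: assumes abstract_sentences is non-empty.
    """
    scores = [
        max(rouge1_recall(s, a) for a in abstract_sentences)
        for s in doc_sentences
    ]
    ranked = sorted(range(len(doc_sentences)), key=lambda i: scores[i], reverse=True)
    positive = set(ranked[:top_k])
    return [1 if i in positive else -1 for i in range(len(doc_sentences))]


if __name__ == "__main__":
    docs = ["the crisis hit banks", "weather was fine", "banks failed in europe"]
    abstract = ["banks failed during the crisis"]
    print(label_sentences(docs, abstract, top_k=2))
```

The resulting +1/-1 labels are the kind of automatically generated extract annotations that could then be fed to a supervised learner such as an SVM.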
Melhoria da atratividade de faces em imagens = Enhancement of faces attractiveness in images
Leite, Tatiane Silvia
20 August 2018
Advisor: José Mario De Martino / Dissertation (master's) - Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação / Previous issue date: 2012
Abstract: The face plays an important role in communication and the expression of emotions. Because it serves as a person's calling card and shapes the first impression they make, its appearance and shape have become the subject of many studies. A more attractive face more easily captures not only the attention of the beholder but also their empathy. In this vein, this work aims to develop a methodology for manipulating and transforming photographic images of faces in order to increase their attractiveness. Two aspects of face modification were addressed: geometry and the texture of the facial skin. A large database of face images was built for this work. Feature points were marked on the images, and distances between them were computed to characterize the proportions of each face. In addition, each face was assigned an attractiveness score based on ratings by a group of 40 volunteers. The proportion measurements and attractiveness scores were used as the training set for the machine learning algorithms in the geometric enhancement process. This stage generates new measurements for the face to be made more attractive. Using warping, the input face image is modified to fit the new measurements. The resulting image serves as the input to the texture modification process, which generates a new image in which the color of the pixels in the skin region of the face is changed. The main contribution of this work is to combine geometric modification of the face with modification of the skin texture. This combination yielded a greater gain in attractiveness than either technique used separately. The gain was confirmed by post-evaluation tests in which volunteers analyzed the final images. / Master's / Computer Engineering / Master in Electrical Engineering
Craddock, Richard Cameron
17 November 2009
Since its discovery in 1995, resting state functional connectivity derived from functional MRI data has become a popular neuroimaging method for studying psychiatric disorders. Current methods for analyzing resting state functional connectivity in disease involve thousands of univariate tests and the specification of regions of interest to employ in the analysis. There are several drawbacks to these methods. First, the mass univariate tests employed are insensitive to the information present in distributed networks of functional connectivity. Second, the null hypothesis testing employed to select functional connectivity differences between groups does not evaluate the predictive power of the identified functional connectivities. Third, the specification of regions of interest is confounded by experimenter bias in terms of which regions should be modeled, and by experimental error in terms of the size and location of these regions. The objective of this dissertation is to improve the methods for functional connectivity analysis using multivariate predictive modeling, feature selection, and whole brain parcellation. A method of applying support vector classification (SVC) to resting state functional connectivity data was developed in the context of a neuroimaging study of depression. The interpretability of the obtained classifier was optimized using feature selection techniques that incorporate reliability information. The problem of selecting regions of interest for whole brain functional connectivity analysis was addressed by clustering whole brain functional connectivity data to parcellate the brain into contiguous, functionally homogeneous regions. This newly developed framework was applied to derive a classifier capable of correctly separating the functional connectivity patterns of patients with depression from those of healthy controls 90% of the time.
The features most relevant to the obtained classifier match those identified in previous studies, but also include several regions not previously implicated in the functional networks underlying depression.
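As an illustration of how functional connectivity can be turned into classifier input, the sketch below computes pairwise Pearson correlations between region time series and flattens them into a feature vector. This is a common construction and only an assumption about the kind of features used; the dissertation's actual parcellation, preprocessing, and SVC setup are not specified in the abstract, and the function names are hypothetical.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def connectivity_features(region_series):
    """One correlation per unordered pair of regions (upper triangle of the
    connectivity matrix), in a fixed order so subjects are comparable."""
    return [
        pearson(region_series[i], region_series[j])
        for i, j in combinations(range(len(region_series)), 2)
    ]


if __name__ == "__main__":
    # Three toy "regions" with four time points each.
    series = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1]]
    print(connectivity_features(series))
```

Each subject then contributes one such vector, and a classifier can be trained on the vectors with patient/control labels.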
Torres, Juan Félix
17 May 2010
Speech communication encompasses diverse types of information, including phonetics, affective state, voice quality, and speaker identity. From a speech production standpoint, the acoustic speech signal can be mainly divided into glottal source and vocal tract components, which play distinct roles in rendering the various types of information it contains. Most deployed speech analysis systems, however, do not explicitly represent these two components as distinct entities, as their joint estimation from the acoustic speech signal becomes an ill-defined blind deconvolution problem. Nevertheless, because of the desire to understand glottal behavior and how it relates to perceived voice quality, there has been continued interest in explicitly estimating the glottal component of the speech signal. To this end, several inverse filtering (IF) algorithms have been proposed, but they are unreliable in practice because of the blind formulation of the separation problem. In an effort to develop a method that can bypass the challenging IF process, this thesis proposes a new glottal source information extraction method that relies on supervised machine learning to transform smoothed spectral representations of speech, which are already used in some of the most widely deployed and successful speech analysis applications, into a set of glottal source features. A transformation method based on Gaussian mixture regression (GMR) is presented and compared to current IF methods in terms of feature similarity, reliability, and speaker discrimination capability on a large speech corpus, and potential representations of the spectral envelope of speech are investigated for their ability to represent glottal source variation in a predictable manner. The proposed system was found to produce glottal source features that reasonably matched their IF counterparts in many cases, while being less susceptible to spurious errors.
The development of the proposed method entailed a study into the aspects of glottal source information that are already contained within the spectral features commonly used in speech analysis, yielding an objective assessment regarding the expected advantages of explicitly using glottal information extracted from the speech signal via currently available IF methods, versus the alternative of relying on the glottal source information that is implicitly contained in spectral envelope representations.
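For reference, the standard form of Gaussian mixture regression can be written as follows. This is the textbook conditional-mean estimator for a joint Gaussian mixture, not necessarily the exact formulation used in the thesis; the symbols are assumptions, with $x$ standing for the spectral envelope features, $y$ for the glottal source features, and $\pi_k$, $\mu_k$, $\Sigma_k$ for the weight, mean, and covariance of the $k$-th component of a mixture fitted to the joint vectors $[x; y]$.

```latex
\hat{y}(x) = \mathbb{E}[\,y \mid x\,]
  = \sum_{k=1}^{K} h_k(x)\,
    \left[ \mu_{y,k} + \Sigma_{yx,k}\,\Sigma_{xx,k}^{-1}\,(x - \mu_{x,k}) \right],
\qquad
h_k(x) = \frac{\pi_k\,\mathcal{N}(x;\,\mu_{x,k},\,\Sigma_{xx,k})}
              {\sum_{j=1}^{K} \pi_j\,\mathcal{N}(x;\,\mu_{x,j},\,\Sigma_{xx,j})}
```

Each component contributes a local linear regression from $x$ to $y$, and the weights $h_k(x)$ blend those regressions according to how well each component explains the observed spectral features.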
On discriminative semi-supervised incremental learning with a multi-view perspective for image concept modeling
Byun, Byungki
17 January 2012
This dissertation presents the development of a semi-supervised incremental learning framework with a multi-view perspective for image concept modeling. For reliable image concept characterization, having a large number of labeled images is crucial. However, the size of the training set is often limited by the cost of generating concept labels for objects in a large quantity of images. To address this issue, we propose to incrementally incorporate unlabeled samples into the learning process to enhance concept models originally learned with a small number of labeled samples. To tackle the sub-optimality problem of conventional techniques, the proposed incremental learning framework selects unlabeled samples using an expected error reduction function that measures the contribution of each unlabeled sample by its ability to increase the modeling accuracy. To improve the convergence property of the proposed incremental learning framework, we further propose a multi-view learning approach that makes use of multiple image features, such as color and texture, when including unlabeled samples. For robustness to mismatches between training and testing conditions, a discriminative learning algorithm, namely a kernelized maximal-figure-of-merit (kMFoM) learning approach, is also developed. Combining the individual techniques, we conduct a set of experiments on various image concept modeling problems, such as handwritten digit recognition, object recognition, and image spam detection, to highlight the effectiveness of the proposed framework.
Performance comparison of support vector machine and relevance vector machine classifiers for functional MRI data
Perez, Daniel Antonio
12 July 2010
Multivariate pattern analysis (MVPA) of fMRI data has been growing in popularity due to its sensitivity to networks of brain activation. It is performed in a predictive modeling framework, which is natural for implementing brain state prediction and real-time fMRI applications such as brain-computer interfaces. Support vector machines (SVM) have been particularly popular for MVPA owing to their high prediction accuracy even with noisy datasets. Recent work has proposed the use of relevance vector machines (RVM) as an alternative to SVM. RVMs are particularly attractive in time-sensitive applications such as real-time fMRI since they tend to perform classification faster than SVMs. Despite the use of both methods in fMRI research, little has been done to compare the performance of these two techniques. This study compares RVM to SVM in terms of time and accuracy to determine which is better suited to real-time applications.
02 July 2012
Traditionally, design engineers have used the Factor of Safety method to ensure that designs do not fail in the field. Access to advanced computational tools and resources has made this process obsolete, and new methods to introduce higher levels of reliability in engineering systems are currently being investigated. However, even though substantial computational resources are available, the computational cost of reliability analysis procedures remains high. Furthermore, regression-based surrogate modeling techniques fail when there is discontinuity in the design space, caused by failure mechanisms, when the design is required to perform under severe externalities. Hence, in this research we propose efficient Semi-Supervised Learning based surrogate modeling techniques that enable accurate estimation of a system's response, even under discontinuity. These methods combine the available labeled and unlabeled data and provide better models than using labeled data alone. Labeled data is expensive to obtain since the responses have to be evaluated, whereas unlabeled data is plentiful during reliability estimation, since the PDF information of the uncertain variables is assumed to be known. This superior performance is gained by combining the efficiency of Probabilistic Neural Networks (PNN) for classification with the Expectation-Maximization (EM) algorithm, which treats the unlabeled data as labeled data with hidden labels.
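The idea of treating unlabeled data as labeled data with hidden labels can be sketched with a small EM loop. The example below is a simplification under stated assumptions: it uses one-dimensional Gaussian class-conditional densities for two classes (for instance, "safe" and "failed") instead of the Probabilistic Neural Network named in the abstract, and the function and parameter names are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def semi_supervised_em(labeled, unlabeled, n_iter=50, min_var=1e-3):
    """Fit two 1-D Gaussian classes from labeled (x, y) pairs, y in {0, 1},
    plus unlabeled x values whose labels are treated as hidden variables.

    Assumes each class has at least one labeled example.
    """
    xs = [x for x, _ in labeled] + list(unlabeled)
    means, variances, priors = [], [], []
    # Initialize the parameters from the labeled data alone.
    for c in (0, 1):
        cls = [x for x, y in labeled if y == c]
        m = sum(cls) / len(cls)
        means.append(m)
        variances.append(max(sum((x - m) ** 2 for x in cls) / len(cls), min_var))
        priors.append(len(cls) / len(labeled))

    for _ in range(n_iter):
        # E-step: soft class memberships for the unlabeled points.
        resp = []
        for x in unlabeled:
            p = [priors[c] * gaussian_pdf(x, means[c], variances[c]) for c in (0, 1)]
            total = sum(p)
            resp.append([pc / total for pc in p] if total > 0 else [0.5, 0.5])
        # M-step: labeled points keep hard weights, unlabeled use soft weights.
        for c in (0, 1):
            weights = [1.0 if y == c else 0.0 for _, y in labeled]
            weights += [r[c] for r in resp]
            wsum = sum(weights)
            means[c] = sum(w * x for w, x in zip(weights, xs)) / wsum
            var = sum(w * (x - means[c]) ** 2 for w, x in zip(weights, xs)) / wsum
            variances[c] = max(var, min_var)
            priors[c] = wsum / len(xs)
    return means, variances, priors


if __name__ == "__main__":
    labeled = [(0.0, 0), (0.5, 0), (5.0, 1), (5.5, 1)]
    unlabeled = [0.2, -0.1, 0.3, 5.2, 4.9, 5.3]
    print(semi_supervised_em(labeled, unlabeled))
```

The labeled points anchor each class, while the cheaper unlabeled points refine the class means, variances, and priors at every iteration.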
25 September 2009
This thesis addresses decision problems in which decisions must be made sequentially within a random environment. At each step of such a decision problem, one alternative must be selected from a set of alternatives. Each alternative has its own mean gain, and when one of them is selected, it yields a random gain. The selection can pursue two types of objectives. In the first case, the trials aim to maximize the sum of the gains collected. A fair trade-off must then be found between exploitation and exploration. This problem is commonly referred to in the scientific literature as the "multi-armed bandit problem". In the second case, a maximum number of selections is imposed, and the objective is to allocate these selections so as to increase the chances of finding the alternative with the highest mean gain. This second problem is commonly referred to in the scientific literature as "selecting the best". Greedy selection plays an important role in solving these decision problems and operates by choosing the alternative that has so far proved optimal. However, the generally random nature of the environment makes the outcomes of such a selection uncertain. In this thesis, we introduce a new quantity, called the "expected gain of a greedy action". Based on some properties of this quantity, new algorithms for solving the two aforementioned decision problems are proposed. Particular attention is paid to applying the presented techniques to model selection in supervised machine learning. Collaboration with the anesthesia department of the Hôpital Erasme allowed us to apply the proposed algorithms to real data from the medical field.
We also developed a decision support system, a prototype of which has already been tested under real conditions on a small sample of patients. / Doctorate in Sciences / info:eu-repo/semantics/nonPublished
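To make the exploitation/exploration trade-off concrete, here is a minimal sketch of the well-known epsilon-greedy strategy for the multi-armed bandit problem. It is a standard baseline, not the algorithm based on the "expected gain of a greedy action" that the thesis proposes; the Gaussian reward model with unit variance and all names and default values are assumptions made for illustration.

```python
import random


def epsilon_greedy(arm_means, n_rounds=1000, epsilon=0.1, seed=0):
    """Play a Gaussian bandit: with probability epsilon explore a random
    arm, otherwise exploit the arm with the best estimated mean so far."""
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms        # number of times each arm was selected
    estimates = [0.0] * n_arms   # running mean of observed gains per arm
    total_gain = 0.0
    for _ in range(n_rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                           # exploration
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # greedy choice
        reward = rng.gauss(arm_means[arm], 1.0)                   # random gain
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_gain += reward
    return counts, estimates, total_gain


if __name__ == "__main__":
    counts, estimates, total_gain = epsilon_greedy([0.0, 0.5, 1.0])
    print(counts, [round(e, 2) for e in estimates], round(total_gain, 1))
```

Setting `epsilon=0.0` gives purely greedy selection, which illustrates the risk noted in the abstract: in a random environment, the arm that has looked best so far is not necessarily the arm with the highest mean gain.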
08 April 2010
Supervised learning tasks, such as building a classifier or estimating the error rate of predictors, are typically performed with labeled data. In most cases, obtaining labeled data is costly as it requires manual labeling. On the other hand, unlabeled data is available in abundance. In this thesis, we discuss methods to perform supervised learning tasks with no labeled data. We prove consistency of the proposed methods and demonstrate their applicability with synthetic and real-world experiments. In some cases, small quantities of labeled data may be easily available and can be supplemented with large quantities of unlabeled data (semi-supervised learning). We derive the asymptotic efficiency of generative models for semi-supervised learning and quantify the effect of labeled and unlabeled data on the quality of the estimate. Another independent track of the thesis is efficient computational methods for nonnegative tensor factorization (NTF). NTF provides the user with rich modeling capabilities, but it comes with an added computational cost. We provide a fast algorithm for performing NTF using a modified active set method called the block principal pivoting method and demonstrate its applicability to social network analysis and text mining.