  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.

Machine Learning Methods for Articulatory Data

Berry, Jeffrey James January 2012 (has links)
Humans make use of more than just the audio signal to perceive speech. Behavioral and neurological research has shown that a person's knowledge of how speech is produced influences what is perceived. With methods for collecting articulatory data becoming more widely available, feature extraction methods are needed to make these data useful to speech scientists and to speech technology applications. This dissertation presents feature extraction methods for ultrasound images of the tongue and for data collected with an Electro-Magnetic Articulograph (EMA). The usefulness of these features is tested in several phoneme classification tasks. The feature extraction methods for ultrasound tongue images presented here consist of automatically tracing the tongue surface contour using a modified Deep Belief Network (DBN) (Hinton et al. 2006), and of methods inspired by research in face recognition which use the entire image. The tongue tracing method consists of training a DBN as an autoencoder on concatenated images and traces, and then retraining the first two layers to accept only the image at runtime. This 'translational' DBN (tDBN) method is shown to produce traces comparable to those made by human experts. An iterative bootstrapping procedure is presented for using the tDBN to assist a human expert in labeling a new data set. Tongue contour traces are compared with the Eigentongues method of Hueber et al. (2007) and with a Gabor Jet representation in a 6-class phoneme classification task using Support Vector Classifiers (SVC), with Gabor Jets performing best. These SVC methods are compared to a tDBN classifier, which extracts features from raw images and classifies them with accuracy only slightly lower than the Gabor Jet SVC method.

For EMA data, supervised binary SVC feature detectors are trained for each feature in three versions of Distinctive Feature Theory (DFT): Preliminaries (Jakobson et al. 1954), The Sound Pattern of English (Chomsky and Halle 1968), and Unified Feature Theory (Clements and Hume 1995). Each of these feature sets, together with a fourth, unsupervised feature set learned using Independent Components Analysis (ICA), is compared on its usefulness in a 46-class phoneme recognition task. Phoneme recognition is performed using a linear-chain Conditional Random Field (CRF) (Lafferty et al. 2001), which takes advantage of the temporal nature of speech by looking at observations adjacent in time. Results of the phoneme recognition task show that Unified Feature Theory performs slightly better than the other versions of DFT. Surprisingly, ICA actually performs worse than running the CRF on raw EMA data.
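The supervised feature detectors described above could be sketched as follows. This is an illustrative reconstruction with synthetic data and scikit-learn, not the author's code; the array shapes, kernel choice, and the synthetic labeling rule are all assumptions.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for EMA data: 200 frames x 12 sensor channels.
# Real data would be articulograph sensor trajectories.
X = rng.normal(size=(200, 12))
# One binary distinctive-feature label per frame (e.g. [+/- voice]);
# here tied to the first two channels purely for illustration.
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# One binary SVC per distinctive feature, as in the supervised
# detectors described above; a single illustrative detector is shown.
detector = SVC(kernel="rbf", C=1.0).fit(X, y)

# Stacked over all features of a feature set, detector outputs would
# form the observation vector fed to the downstream CRF recognizer.
print(detector.score(X, y))
```

In the thesis's setup there would be one such detector per feature of each DFT feature set, trained on labeled articulatory frames.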

Segmentation of heterogeneous document images : an approach based on machine learning, connected components analysis, and texture analysis

Bonakdar Sakhi, Omid 06 December 2012 (has links) (PDF)
Document page segmentation is one of the most crucial steps in document image analysis. It ideally aims to explain the full structure of any document page, distinguishing text zones, graphics, photographs, halftones, figures, tables, etc. Although several attempts have been made to achieve correct page segmentation, many difficulties remain. The leader of the project that funded this PhD work (*) uses a complete processing chain in which page segmentation mistakes are corrected manually by human operators. Aside from the costs this represents, it demands tuning a large number of parameters; moreover, some segmentation mistakes escape the vigilance of the operators. Current automated page segmentation methods are well accepted for clean printed documents, but they often fail to separate regions in handwritten documents when the layout structure is loosely defined or when side notes are present inside the page. Moreover, tables and advertisements bring additional challenges for region segmentation algorithms. Our method addresses these problems. It is divided into four parts:
1. Unlike most popular page segmentation methods, we first separate the text and graphics components of the page using a boosted decision tree classifier.
2. The separated text and graphics components are used, among other features, to separate columns of text in a two-dimensional conditional random field framework.
3. A text line detection method based on piecewise projection profiles is then applied to detect text lines with respect to text region boundaries.
4. Finally, a new paragraph detection method, trained on common models of paragraphs, is applied to text lines to find paragraphs based on the geometric appearance of text lines and their indentations.
Our contribution over existing work lies essentially in the use, or adaptation, of algorithms borrowed from the machine learning literature to solve difficult cases. Indeed, we demonstrate a number of improvements: separating text columns when one is situated very close to another; preventing the contents of a table cell from being merged with the contents of adjacent cells; and preventing regions inside a frame from being merged with surrounding text regions, especially side notes, even when the latter are written in a font similar to that of the text body. Quantitative assessment, and comparison of our method's performance with competing algorithms using widely acknowledged metrics and evaluation methodologies, is also provided to a large extent. (*) This PhD thesis was funded by the Conseil Général de Seine-Saint-Denis through the FUI6 project Demat-Factory, led by Safig SA.
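Step 3 above, in its simplest whole-page form, amounts to thresholding the horizontal projection profile of the binarized page. The sketch below is a simplified illustration (the thesis applies piecewise profiles within detected text regions); the function and parameter names are hypothetical.

```python
import numpy as np

def text_lines_from_profile(binary_page, min_gap=2):
    """Detect text-line bands from the horizontal projection profile.

    binary_page: 2-D array, 1 = ink, 0 = background.
    Returns a list of (top_row, bottom_row) bands. A whole-page
    simplification of the piecewise profiles mentioned above.
    """
    profile = binary_page.sum(axis=1)          # ink count per row
    ink_rows = profile > 0
    lines, start, gap = [], None, 0
    for r, has_ink in enumerate(ink_rows):
        if has_ink:
            if start is None:
                start = r
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:                 # blank band ends the line
                lines.append((start, r - gap))
                start = None
    if start is not None:
        lines.append((start, len(ink_rows) - 1))
    return lines

# Two synthetic "text lines" separated by a blank band.
page = np.zeros((12, 20), dtype=int)
page[1:4, 2:18] = 1
page[7:10, 2:18] = 1
print(text_lines_from_profile(page))   # [(1, 3), (7, 9)]
```

Restricting the profile to one text-region's columns at a time is what makes the method robust to multi-column layouts.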

Multiscale Methods in Image Modelling and Image Processing

Alexander, Simon January 2005 (has links)
The field of modelling and processing of 'images' has fairly recently become important, even crucial, to areas of science, medicine, and engineering. The inevitable explosion of imaging modalities and approaches stemming from this fact has become a rich source of mathematical applications.

'Imaging' is quite broad, and suffers somewhat from this broadness. The general question of 'what is an image?' or perhaps 'what is a natural image?' turns out to be difficult to address. To make real headway one may need to strongly constrain the class of images being considered, as will be done in part of this thesis. On the other hand, there are general principles that can guide research in many areas. One such principle considered is the assertion that (classes of) images have multiscale relationships, whether at a pixel level, between features, or other variants. There are both practical reasons (in terms of computational complexity) and more philosophical ones (mimicking the human visual system, for example) that suggest looking at such methods. Looking at scaling relationships may also have the advantage of opening a problem up to many mathematical tools.

This thesis will detail two investigations into multiscale relationships, in quite different areas. One will involve Iterated Function Systems (IFS), and the other a stochastic approach to reconstruction of binary images (binary phase descriptions of porous media). The use of IFS in this context, often called 'fractal image coding', has primarily been viewed as an image compression technique. We will revisit this approach, proposing it as a more general tool. Some study of the implications of that idea will be presented, along with applications suggested by the results. In the area of reconstruction of binary porous media, a novel multiscale, hierarchical annealing approach is proposed and investigated.
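As a concrete reminder of what an IFS is, the Sierpinski triangle can be generated from three contractive affine maps via the classic "chaos game". This is only standard textbook background for the fractal-coding discussion above, not material from the thesis.

```python
import numpy as np

# The three contractive maps of the Sierpinski triangle IFS:
# each map halves the distance to one vertex of the triangle.
vertices = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 1.0]])

rng = np.random.default_rng(1)
point = np.array([0.2, 0.2])
points = []
for _ in range(5000):
    v = vertices[rng.integers(3)]   # pick one of the maps at random
    point = (point + v) / 2.0       # apply the contraction
    points.append(point)
points = np.array(points)

# Every iterate stays inside the attractor's bounding box [0,1]^2.
print(points.min(axis=0), points.max(axis=0))
```

Fractal image coding inverts this picture: given an image, it searches for a set of contractive maps whose attractor approximates the image, which is what makes the representation usable for compression and, as the thesis argues, for more general modelling.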

Modèles graphiques discriminants pour l'étiquetage de séquences : application à la reconnaissance d'entités nommées radiophoniques / Discriminative graphical models for sequence labelling: application to named entity recognition in audio broadcast news

Zidouni, Azeddine 08 December 2010 (has links)
The automatic processing of complex and varied data is a fundamental step in information extraction applications. The combinatorial explosion in the composition of journalistic texts and the evolution of the vocabulary make the extraction of semantic indicators, such as named entities, harder for symbolic approaches. Structured stochastic models such as conditional random fields (CRFs) make it possible to optimize information extraction systems with a strong capacity for generalization. The first contribution of this thesis is devoted to defining the optimal context for extracting regularities between words and annotations in the named entity recognition task. We integrate diverse sources of information in order to enrich the observations and improve the system's predictive quality. In the second part we propose a new approach for adapting annotations between two different protocols, based on enriching observations with data generated by other systems. This work is experimented with and validated on data from the ESTER campaign. Finally, we propose an approach coupling the signal level, represented by a voicing-quality index, with the semantic level; the objective of this study is to find the link between a speaker's degree of articulation and the importance of his or her speech. / Recent research in information extraction is designed to extract fixed types of information from data. Sequence annotation systems are developed to associate structured annotations with input data presented in sequential form. The named entity recognition (NER) task consists of identifying and classifying every word in a document into predefined categories such as person names, locations, organizations, and dates.

The complexity of NER is largely related to the definition of the task and to the complexity of the relationships between words and their associated semantics. Our first contribution is devoted to solving the NER problem using discriminative graphical models. The proposed approach investigates the use of various contexts of the words to improve recognition. NER systems are fixed in accordance with a specific annotation protocol; thus, new applications are developed for new protocols. The challenge is: how can we adapt an annotation system built for one application to another target application? We propose in this work an adaptation approach for sequence labelling based on annotation enrichment using conditional random fields (CRFs). Experimental results show that the proposed approach outperforms a rule-based approach on the NER task. Finally, we propose a multimodal approach to NER, integrating low-level features as contextual information in radio broadcast news data. The objective of this study is to measure the correlation between the speaker's voicing quality and the importance of his or her speech.
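Decoding in a linear-chain model such as the CRFs used here is typically done with the Viterbi algorithm. The following is a generic sketch, not tied to the thesis's implementation; the toy score matrices are invented for illustration.

```python
import numpy as np

def viterbi(emission_scores, transition_scores):
    """Most likely label sequence for a linear-chain model.

    emission_scores: (T, K) per-position label scores.
    transition_scores: (K, K) score of moving from label i to label j.
    Works for any linear-chain scorer, CRF-trained or otherwise.
    """
    T, K = emission_scores.shape
    delta = emission_scores[0].copy()        # best score ending in each label
    back = np.zeros((T, K), dtype=int)       # backpointers
    for t in range(1, T):
        cand = delta[:, None] + transition_scores   # (K, K) candidates
        back[t] = cand.argmax(axis=0)
        delta = cand.max(axis=0) + emission_scores[t]
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Toy example: 3 positions, 2 labels, self-transitions preferred.
em = np.array([[2.0, 0.0], [1.5, 0.1], [0.0, 0.2]])
tr = np.array([[1.0, -1.0], [-1.0, 1.0]])
print(viterbi(em, tr))   # [0, 0, 0]
```

In a trained CRF the emission and transition scores would come from learned feature weights rather than fixed matrices.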

Modelos gaussianos geoestatísticos espaço-temporais e aplicações / Space-time geostatistical Gaussian models and applications

Silva, Alexandre Sousa da 08 February 2007 (has links)
The specification of space-time covariance functions is one possible strategy for modelling processes observed at different locations and time points. Such functions can define separable or non-separable processes and must satisfy the condition of positive-definiteness. Among the strategies to obtain such valid functions are those suggested by Cressie and Huang (1999) and by Gneiting (2002). The former is based on the idea of obtaining valid functions in a space of increased dimension from valid functions on the original space, and requires operations in the frequency domain. Alternatively, the latter combines completely monotone and strictly increasing functions, avoiding the inversion of spectral representations. There are still few reports of usage and comparisons of the strategies. This work follows Gneiting's proposal with different values for the parameter that indicates the strength of the space-time interaction. Models were applied to the analysis of two real data sets, one on fish stocks off the Portuguese coast and a second on water storage in a soil under citrus. The implementation in the R package RandomFields was used, with the methodology and computational implementation being reviewed. For both data sets the separable model provided a satisfactory fit to the available observations, with model choice based on maximum likelihood estimation.
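As a rough illustration of the space-time interaction parameter discussed above, one commonly cited member of Gneiting's (2002) non-separable class can be written down directly. The parameterization below is a sketch from memory; the parameter names are ours, and the thesis's exact model may differ.

```python
import numpy as np

def gneiting_cov(h, u, sigma2=1.0, a=1.0, c=1.0,
                 alpha=1.0, gamma=0.5, beta=0.0):
    """A commonly cited member of Gneiting's non-separable class.

    h: spatial lag (distance); u: temporal lag.
    beta in [0, 1] controls the space-time interaction;
    beta = 0 yields a separable covariance.
    """
    psi = a * np.abs(u) ** (2 * alpha) + 1.0
    return sigma2 / psi * np.exp(
        -c * np.abs(h) ** (2 * gamma) / psi ** (beta * gamma))

# With beta = 0 the function factorizes into purely spatial and
# purely temporal parts (here sigma2 = 1, so no double-counting):
sep = gneiting_cov(2.0, 3.0, beta=0.0)
factored = gneiting_cov(2.0, 0.0, beta=0.0) * gneiting_cov(0.0, 3.0, beta=0.0)
print(np.isclose(sep, factored))
```

Fitting would then amount to maximizing the Gaussian likelihood over (sigma2, a, c, alpha, gamma, beta), with the fitted beta indicating how strong the space-time interaction is.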

On conditional random fields: applications, feature selection, parameter estimation and hierarchical modelling

Tran, The Truyen January 2008 (has links)
There has been a growing interest in stochastic modelling and learning with complex data, whose elements are structured and interdependent. One of the most successful methods to model data dependencies is graphical models, a combination of graph theory and probability theory. This thesis focuses on a special type of graphical model known as Conditional Random Fields (CRFs) (Lafferty et al., 2001), in which the output state spaces, when conditioned on some observational input data, are represented by undirected graphical models. The contributions of this thesis involve both (a) broadening the current applicability of CRFs in the real world and (b) deepening the understanding of their theoretical aspects. On the application side, we empirically investigate the applications of CRFs in two real-world settings. The first application is the novel domain of Vietnamese accent restoration, in which we need to restore the accents of an accent-less Vietnamese sentence. Experiments on half a million sentences of news articles show that the CRF-based approach is highly accurate. In the second application, we develop a new CRF-based movie recommendation system called Preference Network (PN). The PN jointly integrates various sources of domain knowledge into a large and densely connected Markov network. We obtain competitive results against well-established methods in the recommendation field.

On the theory side, the thesis addresses three important theoretical issues of CRFs: feature selection, parameter estimation, and modelling recursive sequential data. These issues are all addressed under a general setting of partial supervision, in which training labels are not fully available. For feature selection, we introduce a novel learning algorithm called AdaBoost.CRF that incrementally selects features out of a large feature pool as learning proceeds. AdaBoost.CRF is an extension of the standard boosting methodology to structured and partially observed data. We demonstrate that AdaBoost.CRF is able to eliminate irrelevant features and, as a result, returns a very compact feature set without significant loss of accuracy. Parameter estimation of CRFs is generally intractable for arbitrary network structures. This thesis contributes to this area by proposing a learning method called AdaBoost.MRF (which stands for AdaBoosted Markov Random Forests). As learning proceeds, AdaBoost.MRF incrementally builds a tree ensemble (a forest) that covers the original network by selecting the best spanning tree one at a time. As a result, we can approximately learn many rich classes of CRFs in linear time. The third theoretical work is on modelling recursive, sequential data in which each level of resolution is a Markov sequence and each state in the sequence is also a Markov sequence at the finer grain. One of the key contributions of this thesis is Hierarchical Conditional Random Fields (HCRF), an extension of the currently popular sequential CRF and the recent semi-Markov CRF (Sarawagi and Cohen, 2004). Unlike previous CRF work, the HCRF does not assume any fixed graphical structure.

Rather, it treats structure as an uncertain aspect and can estimate the structure automatically from the data. The HCRF is motivated by the Hierarchical Hidden Markov Model (HHMM) (Fine et al., 1998). Importantly, the thesis shows that the HHMM is a special case of the HCRF with slight modification, and the semi-Markov CRF is essentially a flat version of the HCRF. Central to our contribution in HCRF is a polynomial-time algorithm based on the Asymmetric Inside-Outside (AIO) family developed by Bui et al. (2004) for learning and inference. Another important contribution is extending the AIO family to address learning with missing data and inference under partially observed labels. We also derive methods to deal with practical concerns associated with the AIO family, including numerical overflow and cubic-time complexity. Finally, we demonstrate good performance of HCRF against rivals on two applications: indoor video surveillance and noun-phrase chunking.
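The spanning-tree selection step underlying an AdaBoost.MRF-style tree ensemble can be illustrated with an off-the-shelf routine. The weight matrix below is invented, and using SciPy's minimum-spanning-tree routine on negated weights to obtain a maximum-weight tree is our simplification, not the thesis's boosting-weighted selection.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

# Symmetric edge-strength matrix of a small 4-node network
# (a stand-in for pairwise potentials in a CRF graph).
W = np.array([
    [0, 3, 1, 0],
    [3, 0, 2, 1],
    [1, 2, 0, 4],
    [0, 1, 4, 0],
], dtype=float)

# A maximum-weight spanning tree: negate weights and take the minimum
# spanning tree. In an AdaBoost.MRF-like scheme, one such tree would
# be selected per boosting round and the ensemble of trees would
# approximate inference on the full loopy network.
mst = minimum_spanning_tree(-W).toarray()
tree_edges = [(i, j) for i in range(4) for j in range(4) if mst[i, j] != 0]
print(tree_edges)   # 3 edges connecting all 4 nodes
```

Inference on each tree is exact and linear-time, which is the source of the overall linear-time approximation claimed above.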

Road Surface Modeling using Stereo Vision / Modellering av Vägyta med hjälp av Stereokamera

Lorentzon, Mattis, Andersson, Tobias January 2012 (has links)
Modern cars are often equipped with a variety of sensors that collect information about the car and its surroundings. The stereo camera is an example of a sensor that, in addition to regular images, also provides distances to points in its environment. This information can, for example, be used for detecting approaching obstacles and warning the driver if a collision is imminent, or even automatically braking the vehicle. Objects that constitute a potential danger are usually located on the road in front of the vehicle, which makes the road surface a suitable reference level from which to measure object heights. This Master's thesis describes how an estimate of the road surface can be found in order to make these height measurements. The thesis describes how the large amount of data generated by the stereo camera can be scaled down to a more effective representation in the form of an elevation map. The report discusses a method for relating data from different instances in time using information from the vehicle's motion sensors and shows how this method can be used for temporal filtering of the elevation map. For estimating the road surface, two different methods are compared: one that uses a RANSAC approach to iterate towards a good surface model fit, and one that uses conditional random fields to model the probability that different parts of the elevation map belong to the road. A way to detect curb lines, and how to use them to improve the road surface estimate, is shown. Both methods for road classification show good results, with a few differences that are discussed towards the end of the report. An example of how the road surface estimate can be used to detect obstacles is also included.
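A minimal version of the RANSAC surface fit mentioned above might look as follows, using a plane z = ax + by + c as the road model. The data, threshold, and iteration count are illustrative assumptions, not the thesis's configuration (which fits a richer surface model to an elevation map).

```python
import numpy as np

def ransac_plane(points, iters=200, threshold=0.05, rng=None):
    """Fit z = a*x + b*y + c to 3-D points with RANSAC.

    A simplified stand-in for fitting a road-surface model to
    elevation data in the presence of obstacle outliers.
    """
    rng = rng or np.random.default_rng(0)
    best_params, best_inliers = None, -1
    for _ in range(iters):
        sample = points[rng.choice(len(points), 3, replace=False)]
        A = np.c_[sample[:, :2], np.ones(3)]
        try:
            params = np.linalg.solve(A, sample[:, 2])  # [a, b, c]
        except np.linalg.LinAlgError:
            continue                       # degenerate (collinear) sample
        residuals = np.abs(points[:, :2] @ params[:2] + params[2]
                           - points[:, 2])
        inliers = int((residuals < threshold).sum())
        if inliers > best_inliers:
            best_params, best_inliers = params, inliers
    return best_params, best_inliers

rng = np.random.default_rng(0)
xy = rng.uniform(-5, 5, size=(300, 2))
z = 0.02 * xy[:, 0] + 0.01 * xy[:, 1] + 0.5      # gently sloped "road"
z[:30] += rng.uniform(0.5, 2.0, size=30)         # obstacle points above it
points = np.c_[xy, z]
params, n_inliers = ransac_plane(points)
print(params, n_inliers)
```

Points far above the fitted surface are then natural obstacle candidates, which is how the surface estimate feeds the height measurements described above.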

Statistical computation and inference for functional data analysis

Jiang, Huijing 09 November 2010 (has links)
My doctoral research dissertation focuses on two aspects of functional data analysis (FDA): FDA under spatial interdependence and FDA for multi-level data. The first part of my thesis focuses on developing modelling and inference procedures for functional data under spatial dependence. The methodology introduced in this part is motivated by a research study on inequities in accessibility to financial services. The first research problem in this part is concerned with a novel model-based method for clustering random time functions which are spatially interdependent. A cluster consists of time functions which are similar in shape. The time functions are decomposed into spatial global and time-dependent cluster effects using a semi-parametric model. We also assume that the clustering membership is a realization from a Markov random field. Under these model assumptions, we borrow information across curves from nearby locations, resulting in enhanced estimation accuracy of the cluster effects and of the cluster membership. In a simulation study, we assess the estimation accuracy of our clustering algorithm under a series of settings: small numbers of time points, high noise levels, and varying dependence structures. Over all simulation settings, the spatial-functional clustering method outperforms existing model-based clustering methods. In the case study presented in this project, we estimate and classify service accessibility patterns varying over a large geographic area (California and Georgia) and over a period of 15 years. The focus of this study is on financial services, but it applies generally to other service operations. The second research project of this part develops an association analysis of space-time varying processes which is rigorous, computationally feasible, and implementable with standard software. We introduce general measures to model different aspects of the temporal and spatial association between processes varying in space and time.

Using a nonparametric spatiotemporal model, we show that the proposed association estimators are asymptotically unbiased and consistent. We complement the point association estimates with simultaneous confidence bands to assess the uncertainty in the point estimates. In a simulation study, we evaluate the accuracy of the association estimates with respect to the sample size as well as the coverage of the confidence bands. In the case study in this project, we investigate the association between service accessibility and income level. The primary objective of this association analysis is to assess whether there are significant changes in the income-driven equity of financial service accessibility over time and to identify potentially under-served markets. The second part of the thesis discusses novel statistical methodology for analyzing multilevel functional data, including a clustering method based on a functional ANOVA model and a spatio-temporal model for functional data with a nested hierarchical structure. In this part, I introduce and compare a series of clustering approaches for multilevel functional data. For brevity, I present the clustering methods for two-level data: multiple samples of random functions, each sample corresponding to a case and each random function within a sample/case corresponding to a measurement type. A cluster consists of cases which have similar within-case means (level-1 clustering) or similar between-case means (level-2 clustering). Our primary focus is to compare a model-based clustering approach with more straightforward hard clustering methods. The clustering model is based on a multilevel functional principal component analysis. In a simulation study, we assess the estimation accuracy of our clustering algorithm under a series of settings: small vs. moderate numbers of time points, high noise levels, and small numbers of measurement types.

We demonstrate the applicability of the clustering analysis on a real data set consisting of time-varying sales for multiple products sold by a large retailer in the U.S. My ongoing research in multilevel functional data analysis is developing a statistical model for estimating temporal and spatial associations of a series of time-varying variables with an intrinsic nested hierarchical structure. This work has great potential in many real applications where the data are areal data collected from different sources and over geographic regions of different spatial resolution.
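The functional-PCA backbone of the clustering methods above can be sketched with an SVD: the leading right singular vectors of the centered curve matrix are discretized eigenfunctions, and the scores give a low-dimensional representation to cluster. The two-group toy data and the one-dimensional score threshold below are our illustrative simplifications, not the thesis's multilevel model.

```python
import numpy as np

rng = np.random.default_rng(2)
t = np.linspace(0, 1, 50)

# Two groups of noisy curves with different mean shapes: a toy
# stand-in for the time-varying sales curves mentioned above.
group1 = np.sin(2 * np.pi * t) + 0.1 * rng.normal(size=(20, 50))
group2 = np.cos(2 * np.pi * t) + 0.1 * rng.normal(size=(20, 50))
curves = np.vstack([group1, group2])

# Functional PCA via SVD of the centered curve matrix.
centered = curves - curves.mean(axis=0)
U, s, Vt = np.linalg.svd(centered, full_matrices=False)
scores = U[:, :2] * s[:2]           # first two FPCA scores per curve

# Cluster by thresholding the first score (a stand-in for k-means
# or the model-based clustering compared in the thesis).
labels = (scores[:, 0] > 0).astype(int)
print(labels)
```

In the multilevel setting the decomposition would be applied at each level (within-case and between-case means) before clustering.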

Stochastic m-estimators: controlling accuracy-cost tradeoffs in machine learning

Dillon, Joshua V. 15 November 2011 (has links)
m-Estimation represents a broad class of estimators, including least-squares and maximum likelihood, and is a widely used tool for statistical inference. Its successful application, however, often requires negotiating physical resources for desired levels of accuracy. These limiting factors, which we abstractly refer to as costs, may be computational, such as time-limited cluster access for parameter learning, or financial, such as purchasing human-labeled training data under a fixed budget. This thesis explores these accuracy-cost tradeoffs by proposing a family of estimators that maximizes a stochastic variation of the traditional m-estimator. Such "stochastic m-estimators" (SMEs) are constructed by stitching together different m-estimators at random. Each such instantiation resolves the accuracy-cost tradeoff differently, and taken together they span a continuous spectrum of accuracy-cost tradeoff resolutions. We prove the consistency of the estimators and provide formulas for their asymptotic variance and statistical robustness. We also assess their cost for two concerns typical to machine learning: computational complexity and labeling expense. For the sake of concreteness, we discuss experimental results in the context of a variety of discriminative and generative Markov random fields, including Boltzmann machines, conditional random fields, and model mixtures. The theoretical and experimental studies demonstrate the effectiveness of the estimators when computational resources are insufficient or when obtaining additional labeled samples is necessary. We also demonstrate that in some cases the stochastic m-estimator is associated with robustness, thereby increasing its statistical accuracy and representing a win-win.
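The stitching idea can be caricatured for a one-dimensional location estimate: each observation is randomly scored by one of two m-estimator losses (squared or absolute), and the stitched objective is minimized as usual. This toy sketch is our illustration of the general idea only, not the estimators analyzed in the thesis.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(3)
# 95 well-behaved observations plus 5 gross outliers.
data = np.concatenate([rng.normal(0.0, 1.0, 95), [50.0] * 5])

# Randomly stitch two m-estimators together: each observation is
# scored with absolute loss or squared loss, chosen by a coin flip.
use_abs = rng.random(100) < 0.5

def sme_objective(mu):
    r = data - mu
    return float(np.where(use_abs, np.abs(r), 0.5 * r ** 2).sum())

mu_sme = minimize_scalar(sme_objective, bounds=(-10, 60),
                         method="bounded").x
mu_ls = data.mean()        # the pure least-squares m-estimator
print(mu_sme, mu_ls)
```

Each coin-flip pattern is one instantiation of the family; varying the mixing probability traces out the accuracy-cost (and here, robustness) spectrum between the two component estimators.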
