861

Machine Learning Algorithms for the Analysis of Social Media and Detection of Malicious User Generated Content

Unknown Date (has links)
One of the defining characteristics of the modern Internet is its massive connectedness, with information and human connection simply a few clicks away. Social media and online retailers have revolutionized how we communicate and purchase goods or services. User generated content on the web, through social media, plays a large role in modern society; Twitter has been at the forefront of political discourse, with politicians choosing it as their platform for disseminating information, while websites like Amazon and Yelp allow users to share their opinions on products via online reviews. The information available through these platforms can provide insight into a host of relevant topics through the process of machine learning. Specifically, this process involves text mining for sentiment analysis, which is an application domain of machine learning involving the extraction of emotion from text. Unfortunately, there are still those with malicious intent, and with the changes to how we communicate and conduct business come changes to their malicious practices. Social bots and fake reviews plague the web, providing incorrect information and swaying the opinion of unaware readers. The detection of these false users or posts from reading the text is difficult, if not impossible, for humans. Fortunately, text mining provides us with methods for the detection of harmful user generated content. This dissertation expands the current research in sentiment analysis, fake online review detection and election prediction. We examine cross-domain sentiment analysis using tweets and reviews. Novel techniques combining ensemble and feature selection methods are proposed for the domain of online spam review detection. We investigate the ability of the Twitter platform to predict the United States 2016 presidential election. In addition, we determine how social bots influence this prediction. / Includes bibliography. / Dissertation (Ph.D.)--Florida Atlantic University, 2018. / FAU Electronic Theses and Dissertations Collection
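As background for the sentiment-analysis and review-classification methods surveyed in this abstract, here is a minimal text-mining sketch using scikit-learn; the toy reviews, labels, and pipeline choices are illustrative only, not the dissertation's models or data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy labeled reviews (1 = positive, 0 = negative); real experiments would
# use large Twitter/Amazon/Yelp corpora as described in the abstract.
texts = ["great product, works perfectly", "terrible, broke after a day",
         "absolutely love it", "waste of money, do not buy"]
labels = [1, 0, 1, 0]

# TF-IDF bag-of-words features feeding a linear classifier is a common
# baseline for sentiment polarity and spam-review detection alike.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(texts, labels)
print(clf.predict(["love this product", "waste of time"]))  # typically [1 0]
```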
862

Enhancement of Deep Neural Networks and Their Application to Text Mining

Unknown Date (has links)
Many current application domains of machine learning and artificial intelligence involve knowledge discovery from text, such as sentiment analysis, document ontology, and spam detection. Humans have years of experience and training with language, enabling them to understand complicated, nuanced text passages with relative ease. A text classifier attempts to emulate or replicate this knowledge so that computers can discriminate between concepts encountered in text; however, learning high-level concepts from text, such as those found in many applications of text classification, is a challenging task due to the many difficulties associated with text mining and classification. Recently, classifiers trained using artificial neural networks have been shown to be effective for a variety of text mining tasks. Convolutional neural networks have been trained to classify text from character-level input, automatically learning high-level abstract representations and avoiding the need for human-engineered features. This dissertation proposes two new techniques for character-level learning, log(m) character embedding and convolutional window classification. Log(m) embedding is a new character-vector representation for text data that is more compact and memory-efficient than previous embedding vectors. Convolutional window classification is a technique for classifying long documents, i.e., documents with lengths exceeding the input dimension of the neural network. Additionally, we investigate the performance of convolutional neural networks combined with long short-term memory networks, explore how document length impacts classification performance, and compare the performance of neural networks against non-neural-network-based learners in text classification tasks. / Includes bibliography. / Dissertation (Ph.D.)--Florida Atlantic University, 2018. / FAU Electronic Theses and Dissertations Collection
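The abstract describes the log(m) embedding only as a more compact alternative to one-hot character vectors. One plausible reading, sketched below under that assumption, encodes each of m alphabet characters in ceil(log2 m) binary dimensions; the function name and alphabet are invented for illustration, not taken from the dissertation.

```python
import math
import numpy as np

def logm_embed(text, alphabet):
    """Encode each character as a ceil(log2(m))-bit binary vector rather
    than a length-m one-hot vector (one reading of log(m) embedding)."""
    m = len(alphabet)
    width = math.ceil(math.log2(m))
    index = {c: i for i, c in enumerate(alphabet)}
    out = np.zeros((len(text), width), dtype=np.float32)
    for row, ch in enumerate(text):
        bits = format(index.get(ch, 0), f"0{width}b")
        out[row] = [int(b) for b in bits]
    return out

alphabet = "abcdefghijklmnopqrstuvwxyz0123456789 .,!?"  # m = 41 symbols
x = logm_embed("hello world", alphabet)
print(x.shape)  # (11, 6): 6 dimensions per character instead of 41
```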
863

Parallel Distributed Deep Learning on Cluster Computers

Unknown Date (has links)
Deep Learning is an increasingly important subdomain of artificial intelligence. Deep Learning architectures, artificial neural networks characterized by having both a large breadth of neurons and a large depth of layers, benefit from training on Big Data. The size and complexity of the model combined with the size of the training data make the training procedure very computationally and temporally expensive. Accelerating the training procedure of Deep Learning using cluster computers faces many challenges, ranging from distributed optimizers to the large communication overhead specific to a system with off-the-shelf networking components. In this thesis, we present a novel synchronous data parallel distributed Deep Learning implementation on HPCC Systems, a cluster computer system. We discuss research that has been conducted on the distribution and parallelization of Deep Learning, as well as the concerns relating to cluster environments. Additionally, we provide case studies that evaluate and validate our implementation. / Includes bibliography. / Thesis (M.S.)--Florida Atlantic University, 2018. / FAU Electronic Theses and Dissertations Collection
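As a concrete illustration of the synchronous data-parallel scheme the abstract describes, here is a minimal single-process sketch; the four "workers" are simulated shards, and on a real cluster such as HPCC Systems the gradient average would be an all-reduce over the network.

```python
import numpy as np

# Each worker holds a shard of the data and computes a local gradient; the
# averaged gradient drives one shared update (linear regression, MSE loss).
rng = np.random.default_rng(0)
X, true_w = rng.normal(size=(1024, 5)), np.arange(5.0)
y = X @ true_w
w = np.zeros(5)

shards = np.array_split(np.arange(len(X)), 4)  # 4 simulated workers
for step in range(200):
    grads = [2 * X[idx].T @ (X[idx] @ w - y[idx]) / len(idx) for idx in shards]
    w -= 0.05 * np.mean(grads, axis=0)  # synchronous averaged update

print(np.round(w, 2))  # approximately [0. 1. 2. 3. 4.]
```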
864

Robot behavior learning with adaptive categorization in logical-perceptual space. / CUHK electronic theses & dissertations collection

January 2001 (has links)
Fung Wai-keung. / "February 5, 2001." / Thesis (Ph.D.)--Chinese University of Hong Kong, 2001. / Includes bibliographical references (p. 109-116). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Mode of access: World Wide Web. / Abstracts in English and Chinese.
865

Natural language understanding across application domains and languages.

January 2002 (has links)
Tsui Wai-Ching. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2002. / Includes bibliographical references (leaves 115-122). / Abstracts in English and Chinese. / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Overview --- p.1 / Chapter 1.2 --- Natural Language Understanding Using Belief Networks --- p.5 / Chapter 1.3 --- Integrating Speech Recognition with Natural Language Understanding --- p.7 / Chapter 1.4 --- Thesis Goals --- p.9 / Chapter 1.5 --- Thesis Organization --- p.10 / Chapter 2 --- Background --- p.12 / Chapter 2.1 --- Natural Language Understanding Approaches --- p.13 / Chapter 2.1.1 --- Rule-based Approaches --- p.15 / Chapter 2.1.2 --- Stochastic Approaches --- p.16 / Chapter 2.1.3 --- Mixed Approaches --- p.18 / Chapter 2.2 --- Portability of Natural Language Understanding Frameworks --- p.19 / Chapter 2.2.1 --- Portability across Domains --- p.19 / Chapter 2.2.2 --- Portability across Languages --- p.20 / Chapter 2.2.3 --- Portability across both Domains and Languages --- p.21 / Chapter 2.3 --- Spoken Language Understanding --- p.21 / Chapter 2.3.1 --- Integration of Speech Recognition Confidence into Natural Language Understanding --- p.22 / Chapter 2.3.2 --- Integration of Other Potential Confidence Features into Natural Language Understanding --- p.24 / Chapter 2.4 --- Belief Networks --- p.24 / Chapter 2.4.1 --- Overview --- p.24 / Chapter 2.4.2 --- Bayesian Inference --- p.26 / Chapter 2.5 --- Transformation-based Parsing Technique --- p.27 / Chapter 2.6 --- Chapter Summary --- p.28 / Chapter 3 --- Portability of the Natural Language Understanding Framework across Application Domains and Languages --- p.31 / Chapter 3.1 --- Natural Language Understanding Framework --- p.32 / Chapter 3.1.1 --- Semantic Tagging --- p.33 / Chapter 3.1.2 --- Informational Goal Inference with Belief Networks --- p.34 / Chapter 3.2 --- The ISIS Stocks Domain --- p.36 / Chapter 3.3 --- A Unified Framework for English and Chinese --- p.38 / Chapter 3.3.1 --- Semantic Tagging for the ISIS domain --- p.39 / Chapter 3.3.2 --- Transformation-based Parsing --- p.40 / Chapter 3.3.3 --- Informational Goal Inference with Belief Networks for the ISIS domain --- p.43 / Chapter 3.4 --- Experiments --- p.45 / Chapter 3.4.1 --- Goal Identification Experiments --- p.45 / Chapter 3.4.2 --- A Cross-language Experiment --- p.49 / Chapter 3.5 --- Chapter Summary --- p.55 / Chapter 4 --- Enhancement in the Belief Networks for Informational Goal Inference --- p.57 / Chapter 4.1 --- Semantic Concept Selection in Belief Networks --- p.58 / Chapter 4.1.1 --- Selection of Positive Evidence --- p.58 / Chapter 4.1.2 --- Selection of Negative Evidence --- p.62 / Chapter 4.2 --- Estimation of Statistical Probabilities in the Enhanced Belief Networks --- p.64 / Chapter 4.2.1 --- Estimation of Prior Probabilities --- p.65 / Chapter 4.2.2 --- Estimation of Posterior Probabilities --- p.66 / Chapter 4.3 --- Experiments --- p.73 / Chapter 4.3.1 --- Belief Networks Developed with Positive Evidence --- p.74 / Chapter 4.3.2 --- Belief Networks with the Injection of Negative Evidence --- p.76 / Chapter 4.4 --- Chapter Summary --- p.82 / Chapter 5 --- Integration between Speech Recognition and Natural Language Understanding --- p.84 / Chapter 5.1 --- The Speech Corpus for the Chinese ISIS Stocks Domain --- p.86 / Chapter 5.2 --- Our Extended Natural Language Understanding Framework for Spoken Language Understanding --- p.90 / Chapter 5.2.1 --- Integrated Scoring for Chinese Speech Recognition and Natural Language Understanding --- p.92 / Chapter 5.3 --- Experiments --- p.92 / Chapter 5.3.1 --- Training and Testing on the Perfect Reference Data Sets --- p.93 / Chapter 5.3.2 --- Mismatched Training and Testing Conditions - Perfect Reference versus Imperfect Hypotheses --- p.93 / Chapter 5.3.3 --- Comparing Goal Identification between the Use of Single-best versus N-best Recognition Hypotheses --- p.95 / Chapter 5.3.4 --- Integration of Speech Recognition Confidence Scores into Natural Language Understanding --- p.97 / Chapter 5.3.5 --- Feasibility of Our Approach for Spoken Language Understanding --- p.99 / Chapter 5.3.6 --- Justification of Using Max-of-max Classifier in Our Single Goal Identification Scheme --- p.107 / Chapter 5.4 --- Chapter Summary --- p.109 / Chapter 6 --- Conclusions and Future Work --- p.110 / Chapter 6.1 --- Conclusions --- p.110 / Chapter 6.2 --- Contributions --- p.112 / Chapter 6.3 --- Future Work --- p.113 / Bibliography --- p.115 / Chapter A --- Semantic Frames for Chinese --- p.123 / Chapter B --- Semantic Frames for English --- p.127 / Chapter C --- The Concept Set of Positive Evidence for the Nine Goals in English --- p.131 / Chapter D --- The Concept Set of Positive Evidence for the Ten Goals in Chinese --- p.133 / Chapter E --- The Complete Concept Set including Both the Positive and Negative Evidence for the Ten Goals in English --- p.135 / Chapter F --- The Complete Concept Set including Both the Positive and Negative Evidence for the Ten Goals in Chinese --- p.138 / Chapter G --- The Assignment of Statistical Probabilities for Each Selected Concept under the Corresponding Goals in Chinese --- p.141 / Chapter H --- The Assignment of Statistical Probabilities for Each Selected Concept under the Corresponding Goals in English --- p.146
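The thesis outlined above infers an informational goal from tagged semantic concepts using belief networks. A toy Bayes-rule version of that idea follows; the goal priors, concept likelihoods, and ISIS-style concept names are all invented for illustration and do not reproduce the thesis's trained networks.

```python
import math

priors = {"get_quote": 0.6, "buy_stock": 0.4}  # P(goal), made-up values
likelihood = {                                 # P(concept observed | goal)
    "get_quote": {"PRICE": 0.9, "STOCK_NAME": 0.8, "BUY": 0.05},
    "buy_stock": {"PRICE": 0.3, "STOCK_NAME": 0.9, "BUY": 0.9},
}

def infer(concepts):
    """Score each goal by log prior plus log likelihood of the observed
    concepts, then return the highest-scoring goal."""
    scores = {}
    for goal, prior in priors.items():
        log_p = math.log(prior)
        for c in concepts:
            log_p += math.log(likelihood[goal].get(c, 0.01))
        scores[goal] = log_p
    return max(scores, key=scores.get)

print(infer({"STOCK_NAME", "BUY"}))  # -> buy_stock
```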
866

Automating inference, learning, and design using probabilistic programming

Rainforth, Thomas William Gamlen January 2017 (has links)
Imagine a world where computational simulations can be inverted as easily as running them forwards, where data can be used to refine models automatically, and where the only expertise one needs to carry out powerful statistical analysis is a basic proficiency in scientific coding. Creating such a world is the ambitious long-term aim of probabilistic programming. The bottleneck for improving the probabilistic models, or simulators, used throughout the quantitative sciences, is often not an ability to devise better models conceptually, but a lack of expertise, time, or resources to realize such innovations. Probabilistic programming systems (PPSs) help alleviate this bottleneck by providing an expressive and accessible modeling framework, then automating the required computation to draw inferences from the model, for example finding the model parameters likely to give rise to a certain output. By decoupling model specification and inference, PPSs streamline the process of developing and drawing inferences from new models, while opening up powerful statistical methods to non-experts. Many systems further provide the flexibility to write new and exciting models which would be hard, or even impossible, to convey using conventional statistical frameworks. The central goal of this thesis is to improve and extend PPSs. In particular, we will make advancements to the underlying inference engines and increase the range of problems which can be tackled. For example, we will extend PPSs to a mixed inference-optimization framework, thereby providing automation of tasks such as model learning and engineering design. Meanwhile, we make inroads into constructing systems for automating adaptive sequential design problems, providing potential applications across the sciences. Furthermore, the contributions of the work reach far beyond probabilistic programming, as achieving our goal will require us to make advancements in a number of related fields such as particle Markov chain Monte Carlo methods, Bayesian optimization, and Monte Carlo fundamentals.
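To make the idea of inverting a simulator concrete, here is a minimal importance-sampling sketch in the spirit of the inference engines discussed above; the Gaussian toy model is invented for illustration and is not from the thesis.

```python
import math
import random

random.seed(0)

# A tiny "simulator": draw a latent mean from the prior, observe it noisily.
def prior():
    return random.gauss(0.0, 1.0)

def likelihood(mu, y, sigma=0.5):
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# Running the simulator "backwards": weight prior draws by how well they
# explain the data, then average (self-normalized importance sampling).
data = [0.9, 1.1, 1.0]
samples = [prior() for _ in range(10000)]
weights = [math.prod(likelihood(mu, y) for y in data) for mu in samples]
posterior_mean = sum(w * mu for w, mu in zip(weights, samples)) / sum(weights)
print(round(posterior_mean, 2))  # close to the conjugate answer, ~0.92
```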
867

Improving the performance of microscope mass spectrometry imaging

Guo, Ang January 2018 (has links)
Mass spectrometry imaging (MSI) is a powerful tool that provides mass-specific surface images with micron or sub-micron spatial resolutions. In a microscope MSI experiment, large sample surfaces are illuminated with a defocused laser or primary ion beam, enabling all surface molecules to be desorbed and ionised simultaneously before being electrostatically projected onto a position-sensitive imaging detector at the end of a time-of-flight mass analyser. Traditionally only the image of one mass-to-charge ratio can be obtained in a single acquisition, which limits its applicability. However, the development of event-triggered sensors, such as CMOS-based cameras, revives the microscope MSI method by allowing multi-mass imaging. Therefore, the challenges facing microscope MSI have shifted to improving its mass resolution, effective mass range, and mass accuracy. This thesis proposes effective solutions to each of them, and thus significantly improves the performance and applicability of microscope MSI. To increase the mass range, two modified post-extraction differential acceleration (PEDA) techniques, double-field PEDA and time-variable PEDA, were used to demonstrate mass-resolved stigmatic imaging over a broad m/z range. In double-field PEDA, a potential energy cusp was introduced into the ion acceleration region of an imaging mass spectrometer, creating two m/z foci that were tuned to overlap at the detector plane. This resulted in two focused m/z distributions that stretched the mass-resolved window with m/Δm ≥ 1000 to 165 Da without any loss in image quality; a range that doubled the 65 Da achieved under similar conditions using the original PEDA technique. In time-variable PEDA, a dynamic pulsed electric field was used to maximize the effective mass range of PEDA. By simultaneously focusing ions between 300 and 700 m/z using an exponentially rising voltage pulse, time-variable PEDA provides an effective mass range more than six times wider than the original PEDA method. Although reflectrons are widely used to improve the mass resolving power of ToF-MS, incorporating them in a microscope MSI instrument is novel. A reflectron MSI instrument was designed and implemented. Simulations demonstrated that one-stage gridless reflectrons were more compatible with the spatial imaging goal of the microscope MSI instrument than the gridded reflectrons. Preliminary experimental results showed that coupling the gridless reflectron with single-field PEDA achieved a mass resolution above 8,000 m/Δm while keeping a spatial resolution of 20 μm. In conclusion, the gridless reflectron was able to triple the mass resolving power without losing any spatial imaging power. The poor mass accuracy hurdle was overcome by machine learning algorithms, which can construct clinical diagnostic models that recognise the peak pattern of biological mass spectra and classify them accurately without knowing the actual mass of each peak. After a proof-of-concept experiment, in which the mass spectra of dye molecules were classified by various learning algorithms, three pairs of datasets (ovarian cancer, prostate cancer, chronic fatigue and their respective controls) were used to build classifiers that accurately distinguish blood samples from controls. Possible biomarkers were also discovered by evaluating the importance of each m/z feature, which may assist further studies.
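For context on the time-of-flight analysis underlying this abstract, the standard (not thesis-specific) relation between flight time and mass-to-charge ratio is worth stating: an ion of mass m and charge q = ze accelerated through potential U, then drifting over length L, satisfies

```latex
% Kinetic energy gained equals the electrostatic work done on the ion:
\[
  zeU = \tfrac{1}{2} m v^{2}
  \quad\Longrightarrow\quad
  t = \frac{L}{v} = L \sqrt{\frac{m}{2zeU}} \;\propto\; \sqrt{m/z}.
\]
```

Ions therefore separate in arrival time by m/z, and techniques such as PEDA shape the acceleration field so that a chosen m/z window comes to a time focus at the detector plane.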
868

Scalable Gaussian process inference using variational methods

Matthews, Alexander Graeme de Garis January 2017 (has links)
Gaussian processes can be used as priors on functions. The need for a flexible, principled, probabilistic model of functional relations is common in practice. Consequently, such an approach is demonstrably useful in a large variety of applications. Two challenges of Gaussian process modelling are often encountered. These are dealing with the adverse scaling with the number of data points and the lack of closed form posteriors when the likelihood is non-Gaussian. In this thesis, we study variational inference as a framework for meeting these challenges. An introductory chapter motivates the use of stochastic processes as priors, with a particular focus on Gaussian process modelling. A section on variational inference reviews the general definition of Kullback-Leibler divergence. The concept of prior conditional matching that is used throughout the thesis is contrasted to classical approaches to obtaining tractable variational approximating families. Various theoretical issues arising from the application of variational inference to the infinite dimensional Gaussian process setting are settled decisively. From this theory we are able to give a new argument for existing approaches to variational regression that settles debate about their applicability. This view on these methods justifies the principled extensions found in the rest of the work. The case of scalable Gaussian process classification is studied, both for its own merits and as a case study for non-Gaussian likelihoods in general. Using the resulting algorithms we find credible results on datasets of a scale and complexity that was not possible before our work. An extension to include Bayesian priors on model hyperparameters is studied alongside a new inference method that combines the benefits of variational sparsity and MCMC methods. The utility of such an approach is shown on a variety of example modelling tasks. We describe GPflow, a new Gaussian process software library that uses TensorFlow. Implementations of the variational algorithms discussed in the rest of the thesis are included as part of the software. We discuss the benefits of GPflow when compared to other similar software. Increased computational speed is demonstrated in relevant, timed, experimental comparisons.
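The scalable classification described above is available in GPflow; below is a minimal sketch assuming the GPflow 2.x API (gpflow.models.SVGP with a Bernoulli likelihood), with toy data and untuned settings chosen purely for illustration.

```python
import numpy as np
import tensorflow as tf
import gpflow

# Toy 1-D binary classification data.
X = np.random.rand(200, 1)
Y = (np.sin(12 * X) > 0).astype(float)
Z = X[::20].copy()  # 10 inducing inputs, the source of the sparsity

model = gpflow.models.SVGP(
    kernel=gpflow.kernels.SquaredExponential(),
    likelihood=gpflow.likelihoods.Bernoulli(),
    inducing_variable=Z,
)

# Maximize the ELBO (minimize the negative training loss) with Adam.
opt = tf.optimizers.Adam(0.01)
for _ in range(500):
    opt.minimize(lambda: model.training_loss((X, Y)), model.trainable_variables)

mean, var = model.predict_y(np.array([[0.05], [0.3]]))
print(mean.numpy().round(2))  # class-1 probabilities at the two test points
```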
869

Automatic model selection on local Gaussian structures with priors: comparative investigations and applications. / 基於帶先驗的局部高斯結构的自動模型選擇: 比較性分析及應用研究 / CUHK electronic theses & dissertations collection / Ji yu dai xian yan de ju bu Gaosi jie gou de zi dong mo xing xuan ze: bi jiao xing fen xi ji ying yong yan jiu

January 2012 (has links)
Model selection aims to determine an appropriate model scale given a small size of samples, which is an important topic in machine learning. As one type of efficient solution, an automatic model selection starts from a large enough model scale, and has an intrinsic mechanism to push redundant structures to be ineffective and thus discarded automatically during learning. Priors are usually imposed on parameters to facilitate an automatic model selection. Systematic comparisons of automatic model selection approaches with priors are still lacking, and this thesis is motivated by such a study, based on models with local Gaussian structures. / Particularly, we compare the relative strengths and weaknesses of three typical automatic model selection approaches, namely Variational Bayesian (VB), Minimum Message Length (MML) and Bayesian Ying-Yang (BYY) harmony learning, on models with local Gaussian structures. First, we consider the Gaussian Mixture Model (GMM), for which the number of Gaussian components is to be determined. Further assuming each Gaussian component has a subspace structure, we extend to consider two models, namely Mixture of Factor Analyzers (MFA) and Local Factor Analysis (LFA), for both of which the component number and local subspace dimensionalities are to be determined. / Two types of priors are imposed on parameters, namely a conjugate form prior and a Jeffreys prior. The conjugate form prior is chosen as a Dirichlet-Normal-Wishart (DNW) prior for GMM, and as a Dirichlet-Normal-Gamma (DNG) prior for both MFA and LFA. The Jeffreys prior and the MML approach are not considered on MFA/LFA due to the difficulty in deriving the corresponding Fisher information matrix.
Via extensive simulations and applications comparing the automatic model selection algorithms (six for GMM and four for MFA/LFA), we reach the following main findings: 1. Considering priors on all parameters makes each approach perform better than considering priors merely on the mixing weights. 2. For all three approaches on GMM, the performance with the DNW prior is better than with the Jeffreys prior. Moreover, the Jeffreys prior makes MML slightly better than VB, while the DNW prior makes VB better than MML. 3. As the DNW prior hyper-parameters on GMM are changed from fixed to freely optimized by each approach's own learning principle, BYY improves its performance, while VB and MML deteriorate. This observation remains the same when we compare BYY and VB on either MFA or LFA with the DNG prior. In fact, VB and MML lack a good guide for optimizing prior hyper-parameters. 4. For both GMM and MFA/LFA, BYY considerably outperforms both VB and MML, for any type of prior and whether or not hyper-parameters are optimized. Unlike VB and MML, which rely on appropriate priors, BYY does not depend strongly on the type of prior. It already performs well without priors and improves by imposing a Jeffreys or a conjugate form prior. 5. Despite the equivalence in maximum likelihood parameter learning, MFA and LFA affect the performances of VB and BYY in automatic model selection. Particularly, both BYY and VB perform better on LFA than on MFA, and the superiority of LFA is reliable and robust. / In addition to adopting the existing algorithms either directly or with some modifications, this thesis develops five new algorithms to fill the missing gap. Particularly on GMM, the VB algorithm with Jeffreys prior and the BYY algorithm with DNW prior are developed, in the latter of which a multivariate Student's T-distribution is obtained as the posterior via marginalization. On MFA and LFA, BYY algorithms with DNG priors are developed, where products of multiple Student's T-distributions are obtained in posteriors via approximated marginalization. Moreover, a VB algorithm on LFA is developed as an alternative choice to the existing VB algorithm on MFA. / Shi, Lei. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2012. / Includes bibliographical references (leaves 153-166). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstract also in Chinese.
/ Abstract --- p.i / Acknowledgement --- p.iv / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Background --- p.3 / Chapter 1.2 --- Main Contributions of the Thesis --- p.11 / Chapter 1.3 --- Outline of the Thesis --- p.14 / Chapter 2 --- Automatic Model Selection on GMM --- p.16 / Chapter 2.1 --- Introduction --- p.17 / Chapter 2.2 --- Gaussian Mixture, Model Selection, and Priors --- p.21 / Chapter 2.2.1 --- Gaussian Mixture Model and EM algorithm --- p.21 / Chapter 2.2.2 --- Three automatic model selection approaches --- p.22 / Chapter 2.2.3 --- Jeffreys prior and Dirichlet-Normal-Wishart prior --- p.24 / Chapter 2.3 --- Algorithms with Jeffreys Priors --- p.25 / Chapter 2.3.1 --- Bayesian Ying-Yang learning and BYY-Jef algorithms --- p.25 / Chapter 2.3.2 --- Variational Bayesian and VB-Jef algorithms --- p.29 / Chapter 2.3.3 --- Minimum Message Length and MML-Jef algorithms --- p.33 / Chapter 2.4 --- Algorithms with Dirichlet and DNW Priors --- p.35 / Chapter 2.4.1 --- Algorithms BYY-Dir(α), VB-Dir(α) and MML-Dir(α) --- p.35 / Chapter 2.4.2 --- Algorithms with DNW priors --- p.40 / Chapter 2.5 --- Empirical Analysis on Simulated Data --- p.44 / Chapter 2.5.1 --- With priors on mixing weights: a quick look --- p.44 / Chapter 2.5.2 --- With full priors: extensive comparisons --- p.51 / Chapter 2.6 --- Concluding Remarks --- p.55 / Chapter 3 --- Applications of GMM Algorithms --- p.57 / Chapter 3.1 --- Face and Handwritten Digit Images Clustering --- p.58 / Chapter 3.2 --- Unsupervised Image Segmentation --- p.59 / Chapter 3.3 --- Image Foreground Extraction --- p.62 / Chapter 3.4 --- Texture Classification --- p.68 / Chapter 3.5 --- Concluding Remarks --- p.71 / Chapter 4 --- Automatic Model Selection on MFA/LFA --- p.73 / Chapter 4.1 --- Introduction --- p.74 / Chapter 4.2 --- MFA/LFA Models and the Priors --- p.78 / Chapter 4.2.1 --- MFA and LFA models --- p.78 / Chapter 4.2.2 --- The Dirichlet-Normal-Gamma priors --- p.79 / Chapter 4.3 --- Algorithms on MFA/LFA with DNG Priors --- p.82 / Chapter 4.3.1 --- BYY algorithm on MFA with DNG prior --- p.83 / Chapter 4.3.2 --- BYY algorithm on LFA with DNG prior --- p.86 / Chapter 4.3.3 --- VB algorithm on MFA with DNG prior --- p.89 / Chapter 4.3.4 --- VB algorithm on LFA with DNG prior --- p.91 / Chapter 4.4 --- Empirical Analysis on Simulated Data --- p.93 / Chapter 4.4.1 --- On the "chair" data: a quick look --- p.94 / Chapter 4.4.2 --- Extensive comparisons on four series of simulations --- p.97 / Chapter 4.5 --- Concluding Remarks --- p.101 / Chapter 5 --- Applications of MFA/LFA Algorithms --- p.102 / Chapter 5.1 --- Face and Handwritten Digit Images Clustering --- p.103 / Chapter 5.2 --- Unsupervised Image Segmentation --- p.105 / Chapter 5.3 --- Radar HRRP based Airplane Recognition --- p.106 / Chapter 5.3.1 --- Background of HRRP radar target recognition --- p.106 / Chapter 5.3.2 --- Data description --- p.109 / Chapter 5.3.3 --- Experimental results --- p.111 / Chapter 5.4 --- Concluding Remarks --- p.113 / Chapter 6 --- Conclusions and Future Works --- p.114 / Chapter A --- Referred Parametric Distributions --- p.117 / Chapter B --- Derivations of GMM Algorithms --- p.119 / Chapter B.1 --- The BYY-DNW Algorithm --- p.119 / Chapter B.2 --- The MML-DNW Algorithm --- p.124 / Chapter B.3 --- The VB-DNW Algorithm --- p.127 / Chapter C --- Derivations of MFA/LFA Algorithms --- p.130 / Chapter C.1 --- The BYY Algorithms with DNG Priors --- p.130 / Chapter C.1.1 --- The BYY-DNG-MFA algorithm --- p.130 / Chapter C.1.2 --- The BYY-DNG-LFA algorithm --- p.137 / Chapter C.2 --- The VB Algorithms with DNG Priors --- p.145 / Chapter C.2.1 --- The VB-DNG-MFA algorithm --- p.145 / Chapter C.2.2 --- The VB-DNG-LFA algorithm --- p.149 / Bibliography --- p.152
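The record above compares VB, MML, and BYY for automatic model selection on GMM. A minimal sketch of the general mechanism, a Dirichlet-type prior on the mixing weights pruning redundant components, follows; scikit-learn's variational BayesianGaussianMixture is used as a stand-in and is not one of the thesis's six algorithms.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

# Three well-separated Gaussian clusters: the true component number is 3.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc, 0.3, size=(200, 2))
               for loc in ([0, 0], [3, 0], [0, 3])])

# Start deliberately over-sized; the Dirichlet-process prior on the mixing
# weights drives redundant components toward zero weight during learning.
vbgmm = BayesianGaussianMixture(
    n_components=10,
    weight_concentration_prior_type="dirichlet_process",
    max_iter=500, random_state=0,
).fit(X)

print(np.round(vbgmm.weights_, 2))    # most of the 10 weights shrink to ~0
print((vbgmm.weights_ > 0.05).sum())  # effective component count: ~3
```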
870

Learning non-Gaussian factor analysis with different structures: comparative investigations on model selection and applications. / 基於多種結構的非高斯因數分析的模型選擇學習演算法比較研究及其應用 / CUHK electronic theses & dissertations collection / Ji yu duo zhong jie gou de fei Gaosi yin shu fen xi de mo xing xuan ze xue xi yan suan fa bi jiao yan jiu ji qi ying yong

January 2012 (has links)
Mining the underlying structure from high dimensional observations is of critical importance in machine learning, pattern recognition and bioinformatics. In this thesis, we, empirically or theoretically, investigate non-Gaussian Factor Analysis (NFA) models with different underlying structures. We focus on the problem of determining the number of latent factors of NFA, from two-stage model selection to automatic model selection, with real applications in pattern recognition and bioinformatics. / We start with a degenerate case of NFA, the conventional Factor Analysis (FA) with latent Gaussian factors. Many model selection methods have been proposed and used for FA, and it is important to examine their relative strengths and weaknesses. We develop an empirical analysis tool to facilitate a systematic comparison of the model selection performances of not only classical criteria (e.g., Akaike's information criterion, or shortly AIC) but also recently developed methods (e.g., Kritchman & Nadler's hypothesis tests), as well as the Bayesian Ying-Yang (BYY) harmony learning. Also, we prove a theoretical relative order of the underestimation tendencies of four classical criteria. / Then, we investigate how parameterizations affect model selection performance, an issue that has been ignored or seldom studied since traditional model selection criteria, like AIC, perform equivalently on different parameterizations that have equivalent likelihood functions. We focus on two typical parameterizations of FA, one of which is found to be better than the other under both Variational Bayes (VB) and BYY via extensive experiments on synthetic and real data. Moreover, a family of FA parameterizations that have equivalent likelihood functions is presented, where each one is featured by an integer r, with the two known parameterizations being both ends as r varies from zero to its upper bound.
Investigations on this FA family not only confirm the significant difference between the two parameterizations in terms of model selection performance, but also provide insights into what makes a better parameterization. With a Bayesian treatment of the new FA family, alternative VB algorithms on FA are derived, and BYY algorithms on FA are extended to be equipped with prior distributions on the parameters. A systematic comparison shows that BYY generally outperforms VB under various scenarios, including varying simulation configurations and incrementally adding priors to parameters, as well as automatic model selection. / To describe binary latent features, we proceed to binary factor analysis (BFA), which considers Bernoulli factors. First, we introduce a canonical dual approach to tackling a difficult Binary Quadratic Programming (BQP) problem encountered as a computational bottleneck in BFA learning. Although it is not an exact BQP solver, it improves the learning speed and model selection accuracy, which indicates that some amount of error in solving the BQP, a problem nested in the hierarchy of the whole learning process, brings gains in both computational efficiency and model selection performance. The results also imply that optimization is important in learning, but learning is not just a simple optimization. Second, we develop BFA algorithms under VB and BYY that incorporate Bayesian priors on the parameters to improve automatic model selection performance, and also show that BYY is superior to VB under a systematic comparison. Third, for binary observations, we propose a Bayesian Binary Matrix Factorization (BMF) algorithm under the BYY framework. The performance of the BMF algorithm is guaranteed with theoretical proofs and verified by experiments. We apply it to discovering protein complexes from protein-protein interaction (PPI) networks, an important problem in bioinformatics, outperforming other related methods. / Furthermore, we investigate NFA under a semi-blind learning framework. In practice, there exist many scenarios of knowing partially either or both of the system and the input. Here, we modify Network Component Analysis (NCA) to model gene transcriptional regulation in systems biology by NFA. The previous hardcut NFA algorithm is extended here as sparse BYY-NFA by considering either or both of a priori connectivity and a priori sparse constraint. Therefore, the a priori knowledge about the connection topology of the TF-gene regulatory network required by NCA is not necessary for our NFA algorithm. The sparse BYY-NFA can be further modified to get a sparse BYY-BFA algorithm, which directly models the switching patterns of latent transcription factor (TF) activities in gene regulation, e.g., whether or not a TF is activated. Mining switching patterns provides insights into exploring the regulation mechanisms of many biological processes. / Finally, the semi-blind NFA learning is applied to identify those single nucleotide polymorphisms (SNPs) that are significantly associated with a disease or a complex trait from exome sequencing data. By encoding each exon/gene (which may contain multiple SNPs) as a vector, an NFA classifier, obtained in a supervised way on a training set, is used for prediction on a testing set. The genes are selected according to the p-values of Fisher's exact test on the confusion tables collected from prediction results.
The selected genes on a real dataset from an exome sequencing project on psoriasis are consistent in part with published results, and some of them are probably novel susceptibility genes of the disease according to the validation results. / Tu, Shikui. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2012. / Includes bibliographical references (leaves 196-212). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstract also in Chinese. / Abstract --- p.i / Acknowledgement --- p.iv / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Background --- p.1 / Chapter 1.1.1 --- Motivations --- p.1 / Chapter 1.1.2 --- Independent Factor Analysis (IFA) --- p.2 / Chapter 1.1.3 --- Learning Methods --- p.6 / Chapter 1.2 --- Related Work --- p.14 / Chapter 1.2.1 --- Learning Gaussian FA --- p.14 / Chapter 1.2.2 --- Learning NFA --- p.16 / Chapter 1.2.3 --- Learning Semi-blind NFA --- p.18 / Chapter 1.3 --- Main Contribution of the Thesis --- p.18 / Chapter 1.4 --- Thesis Organization --- p.25 / Chapter 1.5 --- Publication List --- p.27 / Chapter 2 --- FA comparative analysis --- p.31 / Chapter 2.1 --- Determining the factor number --- p.32 / Chapter 2.2 --- Model Selection Methods --- p.34 / Chapter 2.2.1 --- Two-Stage Procedure and Classical Model Selection Criteria --- p.34 / Chapter 2.2.2 --- Kritchman & Nadler's Hypothesis Test (KN) --- p.35 / Chapter 2.2.3 --- Minimax Rank Estimation (MM) --- p.37 / Chapter 2.2.4 --- Minka's Criterion (MK) for PCA --- p.38 / Chapter 2.2.5 --- Bayesian Ying-Yang (BYY) Harmony Learning --- p.39 / Chapter 2.3 --- Empirical Analysis --- p.42 / Chapter 2.3.1 --- A New Tool for Empirical Comparison --- p.42 / Chapter 2.3.2 --- Investigation On Model Selection Performance --- p.44 / Chapter 2.4 --- A Theoretic Underestimation Partial Order --- p.49 / Chapter 2.4.1 --- Events of Estimating the Hidden Dimensionality --- p.49 / Chapter 2.4.2 --- The Structural Property of the Criterion Function --- p.49 / Chapter 2.4.3 --- Experimental Justification --- p.54 / Chapter 2.5 --- Concluding Remarks --- p.58 / Chapter 3 --- FA parameterizations affect model selection --- p.70 / Chapter 3.1 --- Parameterization Issue in Model Selection --- p.71 / Chapter 3.2 --- FAr: ML-equivalent Parameterizations of FA --- p.72 / Chapter 3.3 --- Variational Bayes on FAr --- p.74 / Chapter 3.4 --- Bayesian Ying-Yang Harmony Learning on FAr --- p.77 / Chapter 3.5 --- Empirical Analysis --- p.82 / Chapter 3.5.1 --- Three levels of investigations --- p.82 / Chapter 3.5.2 --- FA-a vs FA-b: performances of BYY, VB, AIC, BIC, and DNLL --- p.84 / Chapter 3.5.3 --- FA-r: performances of VB versus BYY --- p.87 / Chapter 3.5.4 --- FA-a vs FA-b: automatic model selection performance of BYY and VB --- p.90 / Chapter 3.5.5 --- Classification Performance on Real World Data Sets --- p.92 / Chapter 3.6 --- Concluding remarks --- p.93 / Chapter 4 --- BFA learning versus optimization --- p.104 / Chapter 4.1 --- Binary Factor Analysis --- p.105 / Chapter 4.2 --- BYY Harmony Learning on BFA --- p.107 / Chapter 4.3 --- Empirical Analysis --- p.108 / Chapter 4.3.1 --- BIC and Variational Bayes (VB) on BFA --- p.108 / Chapter 4.3.2 --- Error in solving BQP affects model selection --- p.110 / Chapter 4.3.3 --- Priors over parameters affect model selection --- p.114 / Chapter 4.3.4 --- Comparisons among BYY, VB, and BIC --- p.115 / Chapter 4.3.5 --- Applications in recovering binary images --- p.116 / Chapter 4.4 --- Concluding Remarks --- p.117 / Chapter 5 --- BMF for PPI network analysis --- p.124 / Chapter 5.1 --- The problem of protein complex prediction --- p.125 / Chapter 5.2 --- A novel binary matrix factorization (BMF) algorithm --- p.126 / Chapter 5.3 --- Experimental Results --- p.130 / Chapter 5.3.1 --- Other methods in comparison --- p.130 / Chapter 5.3.2 --- Data sets --- p.131 / Chapter 5.3.3 --- Evaluation criteria --- p.131 / Chapter 5.3.4 --- On altered graphs by randomly adding and deleting edges --- p.132 / Chapter 5.3.5 --- On real PPI data sets --- p.137 / Chapter 5.3.6 --- On gene expression data for biclustering --- p.137 / Chapter 5.4 --- A Theoretical Analysis on BYY-BMF --- p.138 / Chapter 5.4.1 --- Main results --- p.138 / Chapter 5.4.2 --- Experimental justification --- p.140 / Chapter 5.4.3 --- Proofs --- p.143 / Chapter 5.5 --- Concluding Remarks --- p.147 / Chapter 6 --- Semi-blind NFA: algorithms and applications --- p.148 / Chapter 6.1 --- Determining transcription factor activity --- p.148 / Chapter 6.1.1 --- A brief review on NCA --- p.149 / Chapter 6.1.2 --- Sparse NFA --- p.150 / Chapter 6.1.3 --- Sparse BFA --- p.156 / Chapter 6.1.4 --- On Yeast cell-cycle data --- p.160 / Chapter 6.1.5 --- On E. coli carbon source transition data --- p.166 / Chapter 6.2 --- Concluding Remarks --- p.170 / Chapter 7 --- Applications on Exome Sequencing Data Analysis --- p.172 / Chapter 7.1 --- From GWAS to Exome Sequencing --- p.172 / Chapter 7.2 --- Encoding An Exon/Gene --- p.173 / Chapter 7.3 --- An NFA Classifier --- p.175 / Chapter 7.4 --- Results --- p.176 / Chapter 7.4.1 --- Simulation --- p.176 / Chapter 7.4.2 --- On a real exome sequencing data set: AHMUe --- p.177 / Chapter 7.5 --- Concluding Remarks --- p.186 / Chapter 8 --- Conclusion and Future Work --- p.187 / Chapter A --- Derivations of the learning algorithms on FA-r --- p.190 / Chapter A.1 --- The VB learning algorithm on FA-r --- p.190 / Chapter A.2 --- The BYY learning algorithm on FA-r --- p.193 / Bibliography --- p.195
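The record above centers on Bayesian binary matrix factorization for PPI networks. A generic logistic-BMF sketch of the underlying idea follows, using plain gradient descent on the Bernoulli log-likelihood; it carries none of the thesis's BYY priors or guarantees, and all sizes and rates are invented.

```python
import numpy as np

# Approximate a binary matrix A by sigmoid(U @ V.T) of low rank k.
rng = np.random.default_rng(0)
n, k = 60, 3
U_true = (rng.random((n, k)) < 0.3).astype(float)
A = ((U_true @ U_true.T) > 0).astype(float)  # block-structured binary matrix

U = rng.normal(0, 0.1, (n, k))
V = rng.normal(0, 0.1, (n, k))
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(3000):
    R = sigmoid(U @ V.T) - A  # residual = d(-loglik)/d(logits)
    U, V = U - 0.5 * (R @ V) / n, V - 0.5 * (R.T @ U) / n

recon = (sigmoid(U @ V.T) > 0.5).astype(float)
print("reconstruction accuracy:", (recon == A).mean())
```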
