  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
51

Fault Isolation By Manifold Learning

Thurén, Mårten January 1985 (has links)
This thesis investigates whether black-box fault diagnosis can be improved by a process called manifold learning, which, simply stated, is a way of finding patterns in recorded sensor data. The idea is that the data contain more information than is exploited by simple classification algorithms such as k-Nearest Neighbor and Support Vector Machines, and that this additional information can be recovered with manifold learning methods. To test the idea, data from two fault diagnosis scenarios are used: a Scania truck engine and an electrical system called Adapt. Two linear and one non-linear manifold learning methods are applied: Principal Component Analysis and Linear Discriminant Analysis (linear) and Laplacian Eigenmaps (non-linear). Some improvements are achieved given certain conditions on the diagnosis scenarios, and the methods that succeed match the systems in terms of linearity: the positive results for the relatively linear electrical system are achieved mainly by the linear methods Principal Component Analysis and Linear Discriminant Analysis, while the positive results for the non-linear Scania system are achieved by the non-linear method Laplacian Eigenmaps. The results for scenarios without these special conditions are not improved, however, and it is uncertain whether the improvements in the special-condition scenarios are due to gained information or to the nature of the cases themselves.
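The pipeline the abstract describes (project the sensor data onto a lower-dimensional manifold, then classify) can be sketched with scikit-learn. This is a minimal illustration on synthetic data, not the thesis's Scania or Adapt datasets, and it uses k-Nearest Neighbor as the downstream classifier; note that scikit-learn's Laplacian Eigenmaps implementation (`SpectralEmbedding`) cannot project unseen test points, so only the two linear methods are shown.

```python
# Sketch: kNN on raw features vs. kNN after a linear projection step
# (PCA, LDA), in the spirit of the thesis. Dataset and parameters are
# illustrative stand-ins for the Scania/Adapt fault-diagnosis data.
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=400, n_features=20, n_informative=5,
                           n_classes=3, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

def knn_accuracy(Xtr, Xte, ytr, yte):
    clf = KNeighborsClassifier(n_neighbors=5).fit(Xtr, ytr)
    return clf.score(Xte, yte)

baseline = knn_accuracy(Xtr, Xte, ytr, yte)

pca = PCA(n_components=5).fit(Xtr)
acc_pca = knn_accuracy(pca.transform(Xtr), pca.transform(Xte), ytr, yte)

lda = LinearDiscriminantAnalysis(n_components=2).fit(Xtr, ytr)
acc_lda = knn_accuracy(lda.transform(Xtr), lda.transform(Xte), ytr, yte)

print(f"raw kNN: {baseline:.2f}  PCA+kNN: {acc_pca:.2f}  LDA+kNN: {acc_lda:.2f}")
```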
52

When Does it Mean? Detecting Semantic Change in Historical Texts

Hengchen, Simon 06 December 2017 (has links)
In contrast to what has been done to date in the hybrid field of natural language processing (NLP), this doctoral thesis holds that the approach developed below makes it possible to semi-automatically detect semantic changes in digitised, OCRed, historical corpora. We define semi-automatic as “making use of an advanced tool whilst remaining in control of key decisions regarding the processing of the corpus”. While the tool utilised, topic modelling, and more precisely Latent Dirichlet Allocation (LDA), is not unknown in NLP or computational historical semantics, where it is already mobilised to follow a priori selected words and to try to detect when those words change meaning, it has never been used, to the best of our knowledge, to detect which words change in a humanistically relevant way. In other words, our method does not study a word in context to gather information on that specific word; it studies the whole context, which we consider a witness to a potential evolution of reality, to gather contextual information on one or several semantic shift candidates. To detect these semantic changes, we use the algorithm to create lexical fields: groups of words that together define a subject to which they all relate. By comparing lexical fields over different time periods of the same corpus (that is, by mobilising a diachronic approach), we try to determine whether new words appear over time. We argue that if a word starts to be used in a certain context at a certain time, it is a likely candidate for semantic change. Of course, the method developed here and illustrated by a case study applies to a specific context: that of digitised, OCRed, historical archives in Dutch. Nevertheless, this doctoral work also describes the advantages and disadvantages of the algorithm and postulates, on the basis of this evaluation, that the method is applicable to other fields, under other conditions.
By carrying out a critical evaluation of the tools available and used, this doctoral thesis invites the community to reproduce the method, whilst pointing out the obvious limitations of the approach and proposing ways to address them. / Doctorat en Information et communication / info:eu-repo/semantics/nonPublished
53

LDA Approach to Operational Risk Modelling

Kaplanová, Martina January 2016 (has links)
In this thesis we deal with operational risk as it is presented in the Basel II directives, which are mandatory for financial institutions in the European Union. The main problem is operational risk modelling: how to measure and manage it. In the first part we look at the possibility of calculating capital requirements for operational risk under Basel II, mainly via an internal model. We describe the specific procedures for developing the internal model and focus on the Loss Distribution Approach, in which the loss in each risk cell is modelled separately. In the second part we show how to include the dependence structure between risk cells in the internal model using copulas. Finally, we present an illustrative example that shows whether modelling the dependence leads to a reduction of the total capital requirement.
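The Loss Distribution Approach for a single risk cell is commonly illustrated with a frequency/severity Monte Carlo simulation: draw the number of losses per year from a frequency distribution, draw each loss from a severity distribution, and read the capital charge off a high quantile of the simulated annual aggregate loss. The Poisson/lognormal choice and all parameter values below are invented for illustration, not taken from the thesis.

```python
# Sketch of the Loss Distribution Approach for one risk cell:
# Poisson frequency, lognormal severity, capital as the 99.9%
# quantile of the simulated annual aggregate loss (Basel II style).
import numpy as np

rng = np.random.default_rng(0)
n_years = 10_000                       # simulated years
lam, mu, sigma = 25.0, 10.0, 1.5       # illustrative parameters

counts = rng.poisson(lam, size=n_years)             # losses per year
annual = np.array([rng.lognormal(mu, sigma, size=n).sum()
                   for n in counts])                # aggregate loss per year

capital = np.quantile(annual, 0.999)   # 99.9% one-year quantile
print(f"mean annual loss: {annual.mean():,.0f}")
print(f"99.9% quantile:   {capital:,.0f}")
```

Extending this to several risk cells with a copula amounts to drawing the cells' annual losses from dependent uniforms instead of independently, then summing across cells before taking the quantile.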
54

Topic Explorer Dashboard: A Visual Analytics Tool for an Innovation Management System enhanced by Machine Learning Techniques

Knoth, Stefanie January 2020 (has links)
Innovation management software contains complex data with many different variables. This data is usually presented in tabular form or with isolated graphs that each visualize a single aspect of a dataset. Displaying the data with interconnected, interactive charts, however, provides much more flexibility and more opportunities for working with and understanding it. Charts that show multiple aspects of the data at once can help uncover hidden relationships between different aspects of the data and reveal insights that are difficult to see in the traditional way of displaying data. The size and complexity of the available data also invite analysis with machine learning techniques. This thesis first explores how machine learning techniques can be used to gain additional insight from the data, and then uses the results of this investigation together with the original data to build a prototypical dashboard for exploratory visual data analysis. The dashboard is evaluated by means of the ICE-T heuristics, and the results and findings are discussed.
55

Face Recognition

Kopřiva, Adam January 2010 (has links)
This master's thesis surveys methods of face recognition, covering several different approaches: knowledge-based methods, feature-invariant approaches, template matching methods and appearance-based methods. It focuses particularly on template matching and on statistical methods such as principal component analysis (PCA) and linear discriminant analysis (LDA). Template matching methods such as active shape models (ASM) and active appearance models (AAM) are described in detail.
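The statistical route the abstract mentions is typically the eigenfaces scheme: PCA on flattened face images, then nearest-neighbour matching in the reduced space. A minimal sketch, with random arrays standing in for real face images:

```python
# Sketch of PCA-based face matching ("eigenfaces"): project all gallery
# images into a low-dimensional PCA space, then match a probe image to
# the nearest gallery code. Random data stands in for real images.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
faces = rng.normal(size=(40, 32 * 32))   # 40 "images", 32x32 px, flattened

pca = PCA(n_components=10).fit(faces)    # 10 leading "eigenfaces"
codes = pca.transform(faces)             # each face as a 10-dim code

# A noisy copy of gallery face 7 acts as the probe image.
probe = faces[7] + rng.normal(scale=0.1, size=faces.shape[1])
probe_code = pca.transform(probe.reshape(1, -1))
match = int(np.argmin(np.linalg.norm(codes - probe_code, axis=1)))
print(match)   # expected to recover index 7
```

An LDA variant ("fisherfaces") would replace the PCA projection with a class-supervised one, which requires several images per identity.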
56

Data Acquisition and Analysis for Crowdfunding

Koštial, Martin January 2019 (has links)
The thesis deals with the acquisition of crowdfunding data and its analysis. The theoretical part describes the available technologies and algorithms for data analysis. In the practical part, the data collection is implemented, and data mining and text mining algorithms are applied to the collected data.
57

Tracking Online Trend Locations using a Geo-Aware Topic Model

Schreiber, Jonah January 2016 (has links)
Various topic models have been applied with good success to automatically categorize massive corpora of text. Much work has been done on applying machine learning and NLP methods to Internet media, such as Twitter, to survey online discussion. However, less focus has been placed on studying how the geographical locations discussed in online fora evolve over time, and even less on associating such location trends with topics. Can online discussions be geographically tracked over time? This thesis attempts to answer this question by evaluating a geo-aware Streaming Latent Dirichlet Allocation (SLDA) implementation that can recognize location terms in text. We show how the model can predict time-dependent locations of the 2016 American primaries by automatically discovering election topics in various Twitter corpora and deducing locations over time.
58

Unsupervised Topic Modeling to Improve Stormwater Investigations

Arvidsson, David January 2022 (has links)
Stormwater investigations are an important part of the detail plan that companies and industries must write. The detail plan is used to show that an area is well suited for, among other things, construction. Writing these detail plans is a costly and time-consuming process, and it is not uncommon for them to be rejected, because it is difficult to find information about the criteria that need to be met and what needs to be addressed within the investigation. This thesis aims to make the problem less ambiguous by applying the topic modeling algorithm LDA (latent Dirichlet allocation) to identify the structure of stormwater investigations. Moreover, sentences that contain words from the topic modeling are extracted to show how each word can be used in the context of writing a stormwater investigation. Finally, a knowledge graph is created from the extracted topics and sentences. The results of this study indicate that topic modeling and NLP (natural language processing) can be used to identify the structure of stormwater investigations, and to extract useful information that can serve as guidance when learning and writing stormwater investigations.
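The sentence-extraction step described above, collecting context sentences for each topic word, can be sketched with plain string processing. The topic words and sentences below are invented stand-ins for the thesis's stormwater-investigation corpus.

```python
# Sketch: given top words from a fitted topic model, pull every
# sentence that contains one of them, to serve as context examples
# (the kind of word-sentence pairs a knowledge graph could link).
import re

topic_words = {"stormwater", "drainage", "runoff"}
text = ("The plan must describe stormwater handling. Drainage systems are "
        "dimensioned for a 10-year rain. The report was submitted in May. "
        "Runoff from paved surfaces shall be delayed on site.")

sentences = re.split(r"(?<=[.!?])\s+", text)
examples = {w: [s for s in sentences if w in s.lower()] for w in topic_words}

for word, ctx in sorted(examples.items()):
    print(word, "->", ctx)
```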
59

Bayes Factors for the Proposition of a Common Source of Amphetamine Seizures

Pawar, Yash January 2021 (has links)
This thesis addresses the challenge of comparing amphetamine materials to determine whether they originate from the same source or from different sources, using pairwise ratios of peak areas within each chromatogram and modeling the differences between the ratios for each comparison as a basis for evaluation. An existing method that uses these ratios to compute the sum of significant differences for each comparison is evaluated first; the outcome suggests that the distributions for comparisons of samples originating from the same source and for comparisons of samples originating from different sources overlap, leading to uncertain conclusions. In this work, the differences between the ratios of peak areas are therefore modeled using a feature-based approach. Because the feature space is quite large, discriminant analysis methods such as Linear Discriminant Analysis (LDA) and Partial Least Squares Discriminant Analysis (PLS-DA) are implemented to perform classification via dimensionality reduction. Another popular method based on the nearest centroid classifier, the nearest shrunken centroid, is also applied; it performs classification on shrunken centroids of the features. All methods are analyzed to obtain classification results for the classes +1 (samples originate from the same source) and −1 (samples originate from different sources). Likelihood ratios for each class under each method are also evaluated using the Empirical Cross-Entropy (ECE) method to determine the robustness of the classifiers. All three models perform fairly well in terms of classification, with LDA being the most robust and reliable in its predictions.
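The classification setup can be sketched with scikit-learn: represent each pairwise comparison by a vector of ratio differences, fit LDA, and form a likelihood-ratio-like score from the class posteriors. The feature distributions below are invented (the thesis uses real chromatogram peak areas), and the posterior ratio is only a stand-in for the thesis's calibrated likelihood ratios.

```python
# Sketch: LDA on synthetic "difference of peak-area ratios" features
# for same-source (+1) vs different-source (-1) comparisons.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, d = 300, 8
same = rng.normal(0.0, 0.2, size=(n, d))   # same source: small differences
diff = rng.normal(1.0, 0.5, size=(n, d))   # different source: larger shifts

X = np.vstack([same, diff])
y = np.array([1] * n + [-1] * n)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

lda = LinearDiscriminantAnalysis().fit(Xtr, ytr)
acc = lda.score(Xte, yte)

proba = lda.predict_proba(Xte[:1])         # posteriors for one comparison
classes = lda.classes_.tolist()            # sorted: [-1, 1]
lr = proba[0, classes.index(1)] / proba[0, classes.index(-1)]
print(f"accuracy: {acc:.2f}, posterior ratio for first test pair: {lr:.3g}")
```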
60

Stochastic EM for generic topic modeling using probabilistic programming

Saberi Nasseri, Robin January 2021 (has links)
Probabilistic topic models are a versatile class of models for discovering latent themes in document collections through unsupervised learning. Conventional inferential methods lack the scaling capabilities necessary for extensions to large-scale applications. In recent years, Stochastic Expectation Maximization has proven scalable for the simplest topic model, Latent Dirichlet Allocation, but analytical maximization is unfortunately not possible for many more complex topic models. With the rise of probabilistic programming languages, the ability to infer flexibly specified probabilistic models using sophisticated numerical optimization procedures has become widely available. These frameworks have, however, mainly been developed for the optimization of continuous parameters, often prohibiting direct optimization of discrete parameters. This thesis explores the potential of using probabilistic programming for generic topic modeling via Stochastic Expectation Maximization, with numerical maximization of discrete parameters reparameterized to unconstrained space. The method achieves results of similar quality to other methods for Latent Dirichlet Allocation in simulated experiments. It is further applied to infer a Dirichlet-multinomial regression model with metadata covariates on a real dataset, where it produces interpretable topics.
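The reparameterization trick at the heart of this approach can be shown in isolation: a probability vector constrained to the simplex is expressed as the softmax of unconstrained parameters, so any gradient-based optimizer can maximize the objective. The toy objective below (a multinomial log-likelihood for one topic's word distribution) is invented for illustration and is not the thesis's model.

```python
# Sketch: maximize sum(counts * log softmax(z)) by gradient ascent on
# the unconstrained z; the optimum is phi = counts / counts.sum().
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

counts = np.array([30.0, 10.0, 5.0, 5.0])  # words assigned to one topic
z = np.zeros_like(counts)                  # unconstrained parameters

for _ in range(500):
    phi = softmax(z)
    grad = counts - counts.sum() * phi     # gradient of the log-likelihood
    z += 0.01 * grad

phi = softmax(z)
print(np.round(phi, 3))   # approaches counts / counts.sum() = [0.6, 0.2, 0.1, 0.1]
```

In this toy case the maximum is available in closed form; the point of the reparameterization is that the same optimization still works when, as in the more complex topic models the thesis targets, no closed-form M-step exists.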
