Global ETD Search

121	Creating an experimental testbed for information-theoretic analysis of architectures for x-ray anomaly detection Coccarelli, David, Greenberg, Joel A., Mandava, Sagar, Gong, Qian, Huang, Liang-Chih, Ashok, Amit, Gehm, Michael E. 01 May 2017 Anomaly detection requires a system that can reliably convert measurements of an object into knowledge about that object. Previously, we have shown that an information-theoretic approach to the design and analysis of such systems provides insight into system performance as it pertains to architectural variations in source fluence, view number/angle, spectral resolution, and spatial resolution.(1) However, this work was based on simulated measurements which, in turn, relied on assumptions made in our simulation models and virtual objects. In this work, we describe our experimental testbed capable of making transmission x-ray measurements. The spatial, spectral, and temporal resolution is sufficient to validate aspects of the simulation-based framework, including the forward models, bag packing techniques, and performance analysis. In our experimental CT system, designed baggage is placed on a rotation stage located between a tungsten-anode source and a spectroscopic detector array. The setup is able to measure a full 360° rotation with 18,000 views, each of which defines a 10 ms exposure of 1,536 detector elements, each with 64 spectral channels. Measurements were made of 1,000 bags that comprise 100 clutter instantiations, each with 10 different target materials. Moreover, we develop a systematic way to generate bags representative of our desired clutter and target distributions. This gives the dataset a statistical significance valuable in future investigations. Information Theory High Dimensionality X-Ray System Geometry X-Ray System Architecture
122	Learning and recognizing texture characteristics using local binary patterns Turtinen, M. (Markus) 05 June 2007 Texture plays an important role in numerous computer vision applications. Many methods for describing and analyzing textured surfaces have been proposed. Variations in the appearance of texture caused by changing illumination and imaging conditions, for example, place high demands on analysis methods. In addition, real-world applications tend to produce a great deal of complex texture data, which must be handled effectively in order to be exploited. The local binary pattern (LBP) operator offers an efficient way of analyzing textures. It has a simple theory and combines properties of structural and statistical texture analysis methods. LBP is invariant to monotonic gray-scale variations and also has extensions for rotation-invariant texture analysis. Analysis of real-world texture data is typically very laborious and time-consuming. Often there is no ground truth or other prior knowledge of the data available, and important properties of the textures must be learned from the images. This is a very challenging task in texture analysis. In this thesis, methods for learning and recognizing texture categories using local binary pattern features are proposed. Unsupervised clustering and dimensionality reduction methods combined with visualization provide useful tools for analyzing texture data. The data structures are uncovered in an unsupervised fashion, based only on texture features, and no prior knowledge of the data, for example texture classes, is required. In this thesis, non-linear dimensionality reduction, data clustering and visualization are used for building a labeled training set for a classifier, and for studying the performance of the features.
The thesis also proposes a multi-class approach to learning and labeling part-based texture appearance models for scene texture recognition with little human interaction. A semiautomatic approach to learning texture appearance models for view-based texture classification is also proposed. The goal of texture characterization is often to classify textures into different categories. In this thesis, two texture classification systems suitable for different applications are proposed. First, a discriminative classifier that combines local and contextual texture information of the image in scene recognition is proposed. Second, a real-time capable texture classifier with an intuitive user interface for industrial texture classification is proposed. Two challenging real-world texture analysis applications are used to study the performance and usefulness of the proposed methods. The first is visual paper analysis, which aims to characterize paper quality based on texture properties. The second is outdoor scene image analysis, where texture information is used to recognize different regions in the scenes. classification computer vision dimensionality reduction learning paper characterization scene image analysis texture analysis visualization
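The abstract above describes the LBP operator only in words, so a minimal sketch of the basic 3x3 variant may help. It is not the thesis's implementation: the neighbour ordering, the `>=` threshold convention, and the names `lbp_code` and `lbp_histogram` are assumptions chosen for illustration, and the sample image is made up.

```python
from collections import Counter

# Offsets of the 8 neighbours, clockwise from the top-left pixel.
# The starting point and direction are an arbitrary (assumed) convention.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, row, col):
    """Return the 8-bit LBP code of the interior pixel at (row, col)."""
    center = image[row][col]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        # A neighbour at least as bright as the centre contributes a 1 bit.
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Histogram of LBP codes over all interior pixels (the texture feature)."""
    rows, cols = len(image), len(image[0])
    return Counter(
        lbp_code(image, r, c)
        for r in range(1, rows - 1)
        for c in range(1, cols - 1)
    )


# Hypothetical 4x4 gray-scale patch.
image = [
    [10, 20, 30, 40],
    [15, 25, 35, 45],
    [90, 80, 70, 60],
    [5, 50, 55, 65],
]

# A strictly increasing gray-scale transform keeps every comparison's outcome,
# so the histogram is unchanged.
brighter = [[2 * v + 7 for v in row] for row in image]
same = lbp_histogram(image) == lbp_histogram(brighter)
```

Because each code depends only on the sign of neighbour-minus-centre comparisons, any strictly increasing remapping of gray levels leaves the histogram unchanged, which is the invariance the abstract refers to.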
123	Nowcasting by the BSTS-U-MIDAS Model Duan, Jun 23 September 2015 When using high-frequency data for forecasting or nowcasting, we have to deal with three major problems: the mixed frequency problem, the high dimensionality (fat regression, parameter proliferation) problem, and the unbalanced data problem (missing observations, ragged edge data). We propose a BSTS-U-MIDAS model (Bayesian Structural Time Series-Unlimited-Mixed-Data Sampling model) to handle these problems. This model consists of four parts. First, a structural time series with regressors model (STM) is used to capture the dynamics of the target variable, and the regressors are chosen to boost the forecast accuracy. Second, a MIDAS model is adopted to handle the mixed frequency of the regressors in the STM. Third, spike-and-slab regression is used to implement variable selection. Fourth, Bayesian model averaging (BMA) is used for nowcasting. We use this model to nowcast quarterly GDP for Canada, and find that it outperforms the benchmark models, an ARIMA model and a Boosting model, in terms of MAE (mean absolute error) and MAPE (mean absolute percentage error). / Graduate / 0501 / 0508 / 0463 / jonduan@uvic.ca forecasting nowcasting BSTS-U-MIDAS model high frequency data mixed frequency problem high dimensionality
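The two accuracy measures named in the abstract above can be sketched briefly. The function names and the sample figures below are hypothetical and are not taken from the thesis or from Canadian GDP data. MAPE is written here as a percentage and assumes no actual value is zero.

```python
def mean_absolute_error(actual, predicted):
    """MAE: average absolute gap between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_percentage_error(actual, predicted):
    """MAPE in percent: average absolute gap relative to each actual value."""
    return 100.0 * sum(
        abs((a - p) / a) for a, p in zip(actual, predicted)
    ) / len(actual)


# Made-up quarterly values and nowcasts, for illustration only.
actual = [100.0, 200.0, 400.0]
nowcast = [110.0, 190.0, 400.0]

mae = mean_absolute_error(actual, nowcast)
mape = mean_absolute_percentage_error(actual, nowcast)
```

MAE is expressed in the units of the target variable, while MAPE is scale-free, which is presumably why the two are reported together when comparing models.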
124	Técnicas computacionais de apoio à classificação visual de imagens e outros dados / Computational techniques to support classification of images and other data José Gustavo de Souza Paiva 20 December 2012 Automatic data classification in general, and image classification in particular, are computationally intensive tasks whose precision varies and depends considerably on the classifier's configuration and the data representation used. Many of the factors that affect an adequate application of classification or categorization methods for images point to the need for more user intervention in the process. To accomplish that, a larger set of supporting tools is needed for the various stages of the classification setup, such as, but not limited to, feature extraction, parametrization of the classification algorithm and selection of adequate training instances. This doctoral thesis presents a Visual Image Classification methodology based on inserting the user into the automatic classification process through the use of visualization techniques. The idea is to allow the user to participate in all classification steps for a given collection, making adjustments and consequently improving the results according to his or her needs. A study of several candidate visualization techniques is presented, with emphasis on similarity trees, and improvements to the tree construction algorithm, in both visual scalability and processing time, are shown.
Additionally, a visual semi-supervised dimensionality reduction methodology is presented to support, through the use of visual tools, the creation of reduced spaces that improve the segregation characteristics of the original feature set. The main contribution of this work is an incremental visual classification system that incorporates all the steps of the proposed methodology and provides interactive and visual tools that permit user-controlled classification of incremental collections with an evolving class configuration. It allows human knowledge to be used in the construction of classifiers that adapt to different user needs in different scenarios, producing satisfactory results for diverse data collections. The focus of this thesis is the categorization of image collections, with examples also given for textual data sets. Visual data classification Dimensionality reduction Information visualization
125	Extração de tópicos baseado em agrupamento de regras de associação / Topic extraction based on association rule clustering Fabiano Fernandes dos Santos 29 May 2015 A structured representation of documents in a format appropriate for automatic knowledge extraction, without loss of relevant information with respect to the original unstructured format, is one of the most important steps of text mining, since the quality of the results obtained with automatic approaches to knowledge extraction from text is strongly related to the quality of the attributes selected to represent the collection of documents. The Vector Space Model (VSM) is a traditional model for obtaining a structured representation of documents. In this model, each document is represented as a vector of weights corresponding to the features of the text. The bag-of-words model is the most popular VSM approach because of its simplicity and general applicability. However, the bag-of-words model does not capture dependencies between terms and has high dimensionality. Several models for document representation have been proposed in the literature to capture the dependence among terms, notably models based on phrases or compound terms, the Generalized Vector Space Model (GVSM) and its extensions, non-probabilistic topic models such as Latent Semantic Analysis (LSA) or Non-negative Matrix Factorization (NMF), and probabilistic topic models such as Latent Dirichlet Allocation (LDA) and its extensions. The topic model representation is one of the most interesting approaches, since it provides a structure that describes the collection of documents in a way that reveals its internal structure and interrelationships.
Topic extraction approaches also provide a dimensionality reduction strategy that aims to build new dimensions representing the main topics or subjects identified in the document collection. However, the efficient extraction of information about the relations between terms for document representation is still a major research challenge. Document representation models that explore correlated terms usually struggle to keep a good balance among (i) the number of extracted dimensions, (ii) the computational effort and (iii) the interpretability of the new dimensions. This work therefore proposes the Latent Association Rule Cluster based Model (LARCM). LARCM is a non-probabilistic topic model that explores association rule clustering to build a document representation with low dimensionality, such that the new dimensions are extracted from information about the relations among the terms. In the proposed approach, association rules are extracted for each document to obtain correlated terms that form multi-word expressions. These relations among the terms form the local context of the relations. Then, a clustering process is applied to all association rules to form the general context of the relations, and each cluster of association rules obtained is a topic, that is, a dimension of the new document representation. This work also proposes an evaluation methodology for selecting topic models that maximize both the results in the text classification task and the interpretability of the obtained topics. The LARCM model was compared against both the traditional LDA model and the LDA model using a document representation that includes multi-word expressions (bag-of-related-words).
The experimental results indicate that LARCM provides a document representation that improves the results in the text classification task while retaining good interpretability of the extracted topics. The LARCM model also achieved great results as a method for extracting contextual information for context-aware recommender systems. Association rule clustering Dimensionality reduction Text mining Topic extraction
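To make the bag-of-words baseline mentioned in the abstract above concrete, here is a minimal sketch that uses raw term counts as the weights. The function name `bag_of_words`, the whitespace tokenization, and the two sample documents are assumptions for illustration. This is not LARCM or any code from the thesis.

```python
from collections import Counter


def bag_of_words(documents):
    """Return (vocabulary, vectors): one term-count vector per document."""
    # One dimension per distinct term in the whole collection.
    vocabulary = sorted(
        {word for doc in documents for word in doc.lower().split()}
    )
    vectors = []
    for doc in documents:
        counts = Counter(doc.lower().split())
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors


# Hypothetical two-document collection.
docs = ["data mining finds rules", "text mining mines text"]
vocabulary, vectors = bag_of_words(docs)

# Word order is discarded: both orderings map to the same vector.
_, reordered = bag_of_words(["mining data", "data mining"])
```

The sketch shows both limitations the abstract names: the vector length grows with the vocabulary (high dimensionality), and reordering the words leaves the vector unchanged (no dependence between terms is recorded).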
126	Projective geometry, toric algebra and tropical computations Görlach, Paul 04 December 2020 No description available. ddc:500
127	Efficient Inversion of Large-Scale Problems Exploiting Structure and Randomization January 2020 Dimensionality reduction methods are examined for large-scale discrete problems, specifically for the solution of three-dimensional geophysics problems: the inversion of gravity and magnetic data. The matrices for the associated forward problems have beneficial structure for each depth layer of the volume domain, under mild assumptions, which facilitates the use of the two-dimensional fast Fourier transform for evaluating forward and transpose matrix operations, providing considerable savings in both computational costs and storage requirements. Application of this approach to the magnetic problem is new in the geophysics literature. Further, the approach is extended to padded volume domains. Stabilized inversion is obtained efficiently by applying novel randomization techniques within each update of the iteratively reweighted scheme. For a general rectangular linear system, a randomization technique combined with preconditioning is introduced and investigated. This is shown to provide well-conditioned inversion, stabilized through truncation. Applying this approach, while implementing matrix operations using the two-dimensional fast Fourier transform, yields inversion that is efficient in both memory and computational cost. Validation is provided via synthetic data sets, and the approach is contrasted with the well-known LSRN algorithm when applied to these data sets. The results demonstrate a significant reduction in computational cost with the new algorithm. Further, this new algorithm produces results for inversion of real magnetic data consistent with those provided in the literature. Typically, the iteratively reweighted least squares algorithm depends on a standard Tikhonov formulation. Here, this is solved using both a randomized singular value decomposition and the iterative LSQR Krylov algorithm.
The results demonstrate that the new algorithm is competitive with these approaches and offers the advantage that no regularization parameter needs to be found at each outer iteration. Given its efficiency, investigating the new algorithm for the joint inversion of these data sets may be fruitful. Initial research on joint inversion using the two-dimensional fast Fourier transform has recently been submitted and provides the basis for future work. Several alternative directions for dimensionality reduction are also discussed, including iteratively applying an approximate pseudo-inverse and obtaining an approximate Kronecker product decomposition via randomization for a general matrix. These are also topics for future consideration. / Dissertation/Thesis / Doctoral Dissertation Applied Mathematics 2020 Applied mathematics dimensionality reduction inverse problem large-scale linear algebra Randomization Toeplitz
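The abstract above relies on structured matrices whose products can be evaluated with the fast Fourier transform. As a hedged illustration, the following is a one-dimensional analogue only: a Toeplitz matrix-vector product computed through circulant embedding. It is not the dissertation's two-dimensional algorithm, and the helper names `fft` and `toeplitz_matvec` and the test matrix are assumptions.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT; len(values) must be a power of two.

    The inverse transform is left unnormalised (caller divides by n).
    """
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def toeplitz_matvec(first_col, first_row, x):
    """Compute T @ x, where T[i][j] = first_col[i - j] if i >= j
    else first_row[j - i], without ever forming T."""
    n = len(x)
    size = 1
    while size < 2 * n:
        size *= 2
    # First column of the circulant matrix that embeds T in its top-left block.
    c = (list(first_col)
         + [0.0] * (size - 2 * n + 1)
         + list(reversed(first_row[1:])))
    padded_x = list(x) + [0.0] * (size - n)
    # Circular convolution becomes an elementwise product in Fourier space.
    spectrum = [a * b for a, b in zip(fft(c), fft(padded_x))]
    y = fft(spectrum, inverse=True)
    return [v.real / size for v in y[:n]]


# T = [[1, 4, 5], [2, 1, 4], [3, 2, 1]] is defined by its first column and row.
result = toeplitz_matvec([1.0, 2.0, 3.0], [1.0, 4.0, 5.0], [1.0, 0.0, -1.0])
```

Only the first column and first row are stored, so memory is linear in n rather than quadratic, and the product costs O(n log n) operations. This is the kind of saving in storage and computation that the abstract attributes to the FFT-based operations.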
128	Construction and Visualization of Semantic Spaces for Domain-Specific Text Corpora Choudhary, Rishabh R. 04 October 2021 No description available. Artificial Intelligence natural language processing text embedding dimensionality reduction cognitive map
129	Data mining / Data mining Mrázek, Michal January 2019 The aim of this master’s thesis is the analysis of multidimensional data. Three dimensionality reduction algorithms are introduced. It is shown how to manipulate text documents using basic methods of natural language processing. The goal of the practical part of the thesis is to process real-world data from an internet forum. Posted messages are transformed into a numerical representation, then projected into two-dimensional space and visualized. Later on, topics of the messages are discovered. In the last part, a few selected algorithms are compared.
130	Sample-Efficient Reinforcement Learning of Robot Control Policies in the Real World January 2019 The goal of reinforcement learning is to enable systems to autonomously solve tasks in the real world, even in the absence of prior data. To succeed in such situations, reinforcement learning algorithms collect new experience through interactions with the environment to further the learning process. The behaviour is optimized by maximizing a reward function, which assigns high numerical values to desired behaviours. Especially in robotics, such interactions with the environment are expensive in terms of the required execution time, human involvement, and mechanical degradation of the system itself. Therefore, this thesis aims to introduce sample-efficient reinforcement learning methods which are applicable to real-world settings and control tasks such as bimanual manipulation and locomotion. Sample efficiency is achieved through directed exploration, either by using dimensionality reduction or trajectory optimization methods. Finally, it is demonstrated how data-efficient reinforcement learning methods can be used to optimize the behaviour and morphology of robots at the same time. / Dissertation/Thesis / Doctoral Dissertation Computer Science 2019 Artificial intelligence Robotics Dimensionality Reduction Machine Learning Morphology Reinforcement Learning Robotics Sample-Efficient Learning

Search results