61 |
Automatic detection of significant features and event timeline construction from temporally tagged data / Erande, Abhijit January 1900
Master of Science / Department of Computing and Information Sciences / William H. Hsu / The goal of my project is to summarize large volumes of data and help users visualize how events have unfolded over time. I address the problem of extracting overview terms from a time-tagged corpus of data and discuss previous work in this area. I use a statistical approach to automatically extract key terms, form groupings of related terms, and display the resulting groups on a timeline. I use a static corpus composed of news stories, as opposed to an on-line setting in which the corpus is continually extended. Terms are extracted using a Named Entity Recognizer, and the importance of a term is determined using the χ² (chi-squared) measure. My approach does not address the problem of associating time and date stamps with data, and is restricted to corpora that have been explicitly tagged. The quality of the results is gauged both subjectively and objectively, by measuring the degree to which events known to exist in the corpus were identified by the system.
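As an illustration of how a chi-squared score can rank a term's importance for a time window, here is a minimal sketch. The abstract does not say how the contingency table is built, so the 2x2 layout (documents inside or outside a time window, with or without the term) and the function name `chi_squared` are assumptions made for this example, not the author's implementation.

```python
def chi_squared(in_with, out_with, in_without, out_without):
    """Chi-squared statistic for a 2x2 contingency table.

    Assumed layout (hypothetical, for illustration only):
      in_with     - documents inside the time window that contain the term
      out_with    - documents outside the window that contain the term
      in_without  - documents inside the window that lack the term
      out_without - documents outside the window that lack the term
    """
    n = in_with + out_with + in_without + out_without
    # Product of the four marginal totals (two row sums, two column sums).
    denom = ((in_with + out_with) * (in_without + out_without)
             * (in_with + in_without) * (out_with + out_without))
    if denom == 0:
        return 0.0  # a degenerate table carries no evidence of association
    return n * (in_with * out_without - out_with * in_without) ** 2 / denom


# A term spread evenly across windows scores zero; a term concentrated
# in one window scores high.
even = chi_squared(10, 10, 10, 10)
bursty = chi_squared(30, 10, 70, 890)
```

A larger score suggests that the term's occurrence depends more strongly on the time window, which is one plausible way to pick out terms tied to an event.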
|
62 |
Study of Single and Ensemble Machine Learning Models on Credit Data to Detect Underlying Non-performing Loans / Li, Qiongzhu January 2016
In this paper, we compare the performance of two feature dimension reduction methods, the LASSO and PCA. Both a simulation study and an empirical study show that the LASSO is superior to PCA for selecting significant variables. We apply Logistic Regression (LR), Artificial Neural Network (ANN), Support Vector Machine (SVM), and Decision Tree (DT) models, together with their corresponding ensemble machines constructed by bagging and adaptive boosting (AdaBoost). Three experiments are conducted to explore the impact of a class-unbalanced data set on all models. The empirical study indicates that when the percentage of performing loans exceeds 83.3%, the trained models should be applied with care. With a class-balanced data set, ensemble machines do perform better than single machines. The weaker the single machine, the more pronounced the improvement.
|
63 |
Face Recognition under Weak Illumination / 李黛雲, Tai-Yun Li Unknown Date
The main objective of this thesis is to develop a face recognition system that can recognize human faces even when the lighting is insufficient or the surrounding environment is totally dark. Images of objects in total darkness can be captured using a relatively low-cost camcorder with the NightShot® (near-infrared, NIR) function. By overcoming the illumination factor, a face recognition system can continue to function independently of the surrounding lighting conditions. However, the acquired images exhibit non-uniformity due to irregular illumination, so current face recognition systems cannot be used directly. In this thesis, we first investigate the characteristics of NIR images and propose an image formation model. A homomorphic processing technique built upon the image model is then developed to reduce the artifacts in the captured images. We then conduct experiments showing that existing holistic face recognition systems perform poorly on NIR images. Finally, a more robust feature-based method is proposed to achieve a better recognition rate under low illumination. A nearest neighbor classifier using the Euclidean distance function is employed to recognize familiar faces from a database. The feature-based recognition method we developed achieves a recognition rate of 75% on a database of 32 people, with one sample image for each subject.
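The final classification step, a nearest neighbor classifier with Euclidean distance, can be sketched as follows. This is only an illustration under assumptions: the feature vectors, the subject labels, and the function name `nearest_neighbor` are hypothetical, and the feature extraction described in the thesis is not reproduced here.

```python
import math


def nearest_neighbor(query, gallery):
    """Return (label, distance) of the gallery entry closest to `query`.

    `gallery` is a list of (label, feature_vector) pairs, one per enrolled
    subject, mirroring the one-sample-per-subject setting in the abstract.
    """
    best_label, best_dist = None, math.inf
    for label, features in gallery:
        dist = math.dist(query, features)  # Euclidean distance
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label, best_dist


# Hypothetical 2-D feature vectors, for illustration only.
gallery = [("subject_a", [0.0, 0.0]), ("subject_b", [3.0, 4.0])]
label, dist = nearest_neighbor([2.5, 3.5], gallery)
```

With a single enrolled image per subject, the classifier reduces to picking the closest stored vector, so recognition quality depends almost entirely on how well the features tolerate illumination changes.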
|
64 |
Investigation into the use of neural networks for visual inspection of ceramic tableware / Finney, Graham Barry January 1998
No description available.
|
65 |
Closing the loop on multiple motions / Wiles, Charles S. January 1995
No description available.
|
66 |
Comparative Proteomics in the Absence of Tandem Mass Spectra / Wielens, Bjorn 09 December 2013
Mass spectrometry plays a significant role in many proteomics experiments owing to its ability to provide high quality, detailed data on complex samples containing proteins and/or their constituent peptides. As with any technology, the capabilities of mass spectrometers are constantly increasing to provide better resolution, faster data acquisition, and more accurate mass measurements. However, the existence and widespread use of previous-generation instruments is not negligible. While these instruments may not have the capabilities of their modern counterparts they are still able to collect useful experimental data, though their limitations can result in trade-offs between certain parameters such as resolution, sample run-time, and tandem MS experiments.
This work describes an alternative method of MS data analysis, dubbed Parallel Isotopic Tag Screening (PITS), which seeks to enable higher throughput and the collection of better-quality data on such previous-generation instruments.
|
67 |
Edith / Griffin, Henry 17 May 2013
No description available.
|
68 |
Visual feature space exploration: an approach to image analysis via multidimensional data projection / Machado, Bruno Brandoli 13 December 2010
Image analysis systems rely on the premise that the dataset under investigation is correctly represented by features. However, defining a set of features that properly represents a dataset is still a challenging and, in most cases, exhausting task. Most of the available techniques, especially when a large number of features is considered, are based on purely quantitative statistical measures or on artificial intelligence approaches, and are normally black boxes to the user. The approach proposed here seeks to open this black box by means of visual representations created with the Multidimensional Classical Scaling projection technique, enabling users to gain insight into the meaning and representativeness of the features computed by different feature extraction algorithms and sets of parameters. The approach is evaluated on six image datasets that contain textures, medical images, and outdoor scenes. The results show that, as the combination of feature sets and changes in parameters improves the quality of the visual representation, the classification accuracy for the computed features also improves. To reduce the subjectivity of conclusions based purely on visual analysis, the quality of the representations is measured with the silhouette index, which was originally proposed to evaluate the results of clustering algorithms. Moreover, the visual exploration of the dataset under analysis enables users to investigate one of the greatest challenges in data classification: the presence of intra-class variation. The results strongly suggest that our approach can be successfully employed as a guide for defining and understanding a set of features that properly represents an image dataset.
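For readers unfamiliar with the silhouette index, the sketch below computes its standard definition on labelled points. The use of Euclidean distance, the toy 2-D points, and the function name `silhouette_index` are assumptions made for this example. The dissertation's own implementation is not described in the abstract.

```python
import math
from collections import defaultdict


def silhouette_index(points, labels):
    """Mean silhouette over all points, using Euclidean distance.

    For each point: a = mean distance to the other points of its own class,
    b = smallest mean distance to the points of any other class,
    s = (b - a) / max(a, b). Singleton classes contribute s = 0.
    """
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)
    if len(groups) < 2:
        raise ValueError("the silhouette index needs at least two classes")

    scores = []
    for point, label in zip(points, labels):
        own = groups[label]
        if len(own) == 1:
            scores.append(0.0)
            continue
        # The distance from the point to itself is 0, so divide by len - 1.
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        b = min(
            sum(math.dist(point, q) for q in other) / len(other)
            for other_label, other in groups.items()
            if other_label != label
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Projected 2-D points (hypothetical): well-separated vs. mixed labelling.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
separated = silhouette_index(points, [0, 0, 1, 1])
mixed = silhouette_index(points, [0, 1, 0, 1])
```

Values near 1 indicate compact, well-separated classes in the projection, while values near or below 0 indicate overlapping classes, which gives a numeric check on what the eye sees in the visual layout.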
|
69 |
Quantifying the stability of feature selection / Nogueira, Sarah January 2018
Feature selection is central to modern data science, from exploratory data analysis to predictive model-building. The "stability" of a feature selection algorithm refers to the robustness of its feature preferences with respect to data sampling and to its stochastic nature. An algorithm is "unstable" if a small change in the data leads to large changes in the chosen feature subset. Whilst the idea is simple, quantifying it has proven more challenging: we note numerous proposals in the literature, each with a different motivation and justification. We present a rigorous statistical and axiomatic treatment of this issue. In particular, with this work we consolidate the literature and provide (1) a deeper understanding of existing work based on a small set of properties, and (2) a clearly justified statistical approach with several novel benefits. This approach serves to identify a stability measure that obeys all desirable properties and (for the first time in the literature) allows confidence intervals and hypothesis tests on the stability of an approach, enabling rigorous comparison of feature selection algorithms.
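To make the notion of stability concrete, the sketch below scores a set of feature subsets, each selected on a different resample of the data, by their mean pairwise Jaccard similarity. This is a deliberately simple stand-in chosen for illustration and is not the measure proposed in the thesis. The function names and the example subsets are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two feature subsets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mean_pairwise_jaccard(subsets):
    """Average Jaccard similarity over all pairs of selected subsets.

    Returns 1.0 when every run picks the same features and 0.0 when no
    two runs share a feature.
    """
    pairs = list(combinations(subsets, 2))
    if not pairs:
        raise ValueError("need at least two feature subsets to compare")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Feature indices chosen on three hypothetical bootstrap resamples.
stable_runs = [{0, 1, 2}, {0, 1, 2}, {0, 1, 2}]
unstable_runs = [{0, 1}, {2, 3}, {4, 5}]
stable_score = mean_pairwise_jaccard(stable_runs)
unstable_score = mean_pairwise_jaccard(unstable_runs)
```

A simple similarity average like this does not come with confidence intervals or hypothesis tests, which is the kind of gap the abstract says its statistical treatment addresses.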
|
70 |
Application of prior information to discriminative feature learning / Liu, Yang January 2018
Learning discriminative feature representations has attracted a great deal of attention, since it is a critical step in facilitating subsequent classification, retrieval, and recommendation tasks. In this dissertation, besides incorporating prior knowledge about image labels into image classification, as most prevalent feature learning methods currently do, we also explore some other general-purpose priors and verify their effectiveness in discriminative feature learning. As a more powerful representation can be learned by implementing such general priors, our approaches achieve state-of-the-art results on challenging benchmarks. We elaborate on these general-purpose priors and highlight where we have made novel contributions. We apply sparsity and hierarchical priors to the explanatory factors that describe the data, in order to better discover the data structure. More specifically, in the first approach we incorporate only sparse priors into the feature learning. To this end, we present a support discrimination dictionary learning method, which finds a dictionary under which the feature representations of images from the same class have a common sparse structure, while the size of the overlapping signal support of different classes is minimised. We then incorporate sparse priors and hierarchical priors into a unified framework that is capable of controlling the sparsity of the neuron activations in deep neural networks. Our proposed approach automatically selects the most useful low-level features and effectively combines them into more powerful and discriminative features for our specific image classification problem. We also explore priors on the relationships between multiple factors.
When multiple independent factors exist in the image generation process and only some of them are of interest to us, we propose a novel multi-task adversarial network to learn a disentangled feature that is optimized with respect to the factor of interest while being agnostic to the distraction factors. When common factors exist in multiple tasks, leveraging them can not only make the learned feature representation more robust, but also enable the model to generalise from very few labelled samples. More specifically, we address the domain adaptation problem and propose the re-weighted adversarial adaptation network to reduce the feature distribution divergence and adapt the classifier from the source domain to the target domain.
|