1 |
Near Images: A Tolerance Based Approach to Image Similarity and its Robustness to Noise and Lightening. Shahfar, Shabnam, 27 September 2011 (has links)
This thesis presents a tolerance near set approach to detecting similarity between digital images. Two images are considered as sets of perceptual objects, and a tolerance relation defines the nearness between objects: two perceptual objects resemble each other if the difference between their descriptions is smaller than a tolerable level of error. Existing tolerance near set approaches to image similarity consider both images in a single tolerance space and compare the sizes of tolerance classes; this approach is shown to be sensitive to noise and distortions. In this thesis, a new tolerance-based method is proposed that considers each image in a separate tolerance space and defines similarity based on differences between histograms of tolerance class sizes. The main advantage of the proposed method is its lower sensitivity to distortions such as added noise, darkening or brightening, as demonstrated through a set of experiments.
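For illustration only, a minimal Python sketch of the histogram-of-class-sizes idea follows; it is not the thesis code, the feature descriptions and tolerance level eps are arbitrary assumptions, and tolerance neighbourhoods stand in for the tolerance classes used in the thesis.

import numpy as np

def tolerance_neighbourhoods(descriptions, eps):
    # Group objects whose feature descriptions differ by less than eps (infinity norm).
    return [np.flatnonzero(np.max(np.abs(descriptions - descriptions[i]), axis=1) < eps)
            for i in range(len(descriptions))]

def size_histogram(neighbourhoods, bin_edges):
    sizes = [len(nb) for nb in neighbourhoods]
    hist, _ = np.histogram(sizes, bins=bin_edges, density=True)
    return hist

def nearness_score(desc_a, desc_b, eps=0.1, bins=10):
    # Each image lives in its own tolerance space; a smaller score means more similar.
    edges = np.linspace(0, max(len(desc_a), len(desc_b)), bins + 1)
    h_a = size_histogram(tolerance_neighbourhoods(desc_a, eps), edges)
    h_b = size_histogram(tolerance_neighbourhoods(desc_b, eps), edges)
    return float(np.abs(h_a - h_b).sum())

rng = np.random.default_rng(0)  # toy "descriptions", e.g. average grey level per subimage in [0, 1]
print(nearness_score(rng.random((50, 2)), rng.random((50, 2))))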
|
2 |
Thermal Imaging As A Biometrics Approach To Facial Signature Authentication. Guzman Tamayo, Ana M, 07 November 2011 (has links)
This dissertation develops an image processing framework with unique feature extraction and similarity measurements for human face recognition in the mid-wave infrared portion of the electromagnetic spectrum. The goal is to design specialized algorithms that extract vasculature information, create a thermal facial signature and identify the individual. The objective is to use such findings in support of a biometrics system for human identification with a high degree of accuracy and a high degree of reliability; this reliability follows from the minimal risk of alteration of the intrinsic physiological characteristics seen through thermal imaging. Thermal facial signature authentication is fully integrated and consolidates the main and critical steps of feature extraction, registration, matching through similarity measures, and validation through principal component analysis.
Feature extraction was accomplished by first registering the images to a reference image using the functional MRI of the Brain (FMRIB) Linear Image Registration Tool (FLIRT), modified to suit thermal images. This was followed by segmentation of the facial region using an advanced localized contouring algorithm applied to anisotropically diffused thermal images. Thermal features were then extracted from the facial images by performing morphological operations such as opening and top-hat segmentation to yield thermal signatures for each subject. Four thermal images taken over a period of six months were used to generate a thermal signature template for each subject, containing only the most prevalent and consistent features. Finally, a similarity measure technique was used to match images to the signature templates, and Principal Component Analysis (PCA) was used to validate the results of the matching process.
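As a hedged illustration of the morphological feature extraction step (not the dissertation's pipeline: the synthetic image, Gaussian smoothing in place of anisotropic diffusion, and all parameters are assumptions), a short Python sketch using scikit-image:

import numpy as np
from skimage import filters, morphology

rng = np.random.default_rng(0)
thermal = filters.gaussian(rng.random((128, 128)), sigma=4)     # synthetic stand-in for a registered thermal face image
smoothed = filters.gaussian(thermal, sigma=2)                   # stand-in for anisotropic diffusion
tophat = morphology.white_tophat(smoothed, morphology.disk(5))  # enhances thin bright structures (vasculature-like)
signature = tophat > filters.threshold_otsu(tophat)             # binary thermal feature signature
skeleton = morphology.skeletonize(signature)                    # skeletonized variant of the signature
print(int(signature.sum()), int(skeleton.sum()))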
Thirteen subjects were used to test the developed technique on an in-house thermal imaging system. Matching using the similarity measures showed 88% accuracy for skeletonized feature signatures and 90% accuracy for anisotropically diffused feature signatures.
The highly accurate results obtained in the matching process along with the generalized design process clearly demonstrate the ability of the developed thermal infrared system to be used on other thermal imaging based systems and related databases.
|
3 |
Interactive System for Scientific Publication Visualization and Similarity Measurement based on Citation Network. Alfraidi, Hanadi Humoud A, January 2015 (has links)
Online scientific publications are becoming more and more popular, and the number of publications we can access almost instantaneously is rapidly increasing. This makes it more challenging for researchers to pursue a topic, review literature, track research history or follow research trends. Using online resources such as search engines and digital libraries helps in finding scientific publications; however, most of the time the user ends up with an overwhelming amount of linear results to go through.
This thesis proposes an alternative system that takes advantage of citation/reference relations between publications, providing better insight into the hierarchical distribution of publications around a given topic. We also utilize information visualization techniques to represent the publications as a network. Our system is designed to automatically retrieve publications from Google Scholar and visualize them as a 2-dimensional graph using the citation relations, where nodes represent the documents and links represent the citation/reference relations between them.
Our visualization system provides a better view of publications, making it easier to identify the research flow, connect publications, and assess similarities/differences between them. It is an interactive web-based system that allows users to get more information about any selected publication and to calculate a similarity score between two selected publications.
Traditionally, similar documents are found using Natural Language Processing (NLP), which compares documents based on matching their contents. In the proposed method, similar documents are found using the citation/reference relations, which represent relationships originally provided by the authors. We propose a new path-based metric for measuring the similarity score between any pair of publications, based on both the number of paths and the length of each path: more paths and shorter lengths increase the similarity score. We compare our similarity score results with those of Scurtu's Document Similarity [1], which uses the NLP method, and use the average of the similarity scores collected from 15 users as a ground truth to validate the efficiency of our method. The results indicate that our citation network approach yields better scores than Scurtu's approach.
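A minimal sketch of a path-based score on a citation graph follows (using networkx); the abstract does not give the exact weighting, so scoring each path as 1 divided by its length is an illustrative assumption, as are the toy publications P1-P4.

import networkx as nx

G = nx.DiGraph()  # an edge u -> v means "u cites v" (toy data)
G.add_edges_from([("P1", "P2"), ("P1", "P3"), ("P2", "P4"), ("P3", "P4")])

def path_similarity(graph, a, b, cutoff=4):
    # Citation and reference directions both count as a relation, so search the undirected graph.
    undirected = graph.to_undirected()
    paths = nx.all_simple_paths(undirected, a, b, cutoff=cutoff)
    return sum(1.0 / (len(p) - 1) for p in paths)  # len(p) - 1 = number of edges on the path

print(path_similarity(G, "P1", "P4"))  # two length-2 paths -> score 1.0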
|
4 |
Mathematical and Experimental Investigation of Ontological Similarity Measures and Their Use in Biomedical Domains. Yu, Xinran, 18 August 2010 (has links)
No description available.
|
5 |
Classification of heterogeneous data based on data type impact on similarity. Ali, N., Neagu, Daniel, Trundle, Paul R., 11 August 2018 (has links)
Real-world datasets are increasingly heterogeneous, showing a mixture of numerical, categorical and other feature types. The main challenge in mining heterogeneous datasets is how to deal with the heterogeneity present in the dataset records. Although some existing classifiers (such as decision trees) can handle heterogeneous data in specific circumstances, the performance of such models may still be improved, because heterogeneity requires specific adjustments to similarity measurements and calculations. Moreover, heterogeneous data is still treated inconsistently and in an ad-hoc manner. In this paper, we study the problem of heterogeneous data classification: our purpose is to use heterogeneity as a positive feature of the data classification effort by consistently using the similarity between data objects. We address the heterogeneity issue by studying the impact of mixing data types in the calculation of data objects' similarity. To reach our goal, we propose an algorithm that divides the initial data records, based on pairwise similarity, into classification subtasks with the aim of increasing the quality of the data subsets, and we apply specialized classifier models to them. The performance of the proposed approach is evaluated on 10 publicly available heterogeneous datasets. The results show that the models achieve better performance for heterogeneous datasets when the proposed similarity process is used.
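For illustration, a Gower-style similarity between two mixed-type records is sketched below; this is one plausible way to score heterogeneous objects, not the paper's exact calculation, and the example records and column ranges are invented.

def mixed_similarity(rec_a, rec_b, numeric_ranges):
    # rec_a, rec_b: dicts of feature -> value; numeric_ranges: feature -> (min, max) for numerical columns.
    scores = []
    for feat in rec_a:
        if feat in numeric_ranges:
            lo, hi = numeric_ranges[feat]
            scores.append(1.0 - abs(rec_a[feat] - rec_b[feat]) / (hi - lo))  # scaled numeric agreement
        else:
            scores.append(1.0 if rec_a[feat] == rec_b[feat] else 0.0)        # categorical match / mismatch
    return sum(scores) / len(scores)

a = {"age": 30, "dose": 1.2, "route": "oral"}
b = {"age": 45, "dose": 0.8, "route": "oral"}
print(mixed_similarity(a, b, {"age": (0, 100), "dose": (0.0, 5.0)}))  # about 0.92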
|
6 |
Role of Similarity Measures in Time Series Analysis / Uloga mera sličnosti u analizi vremenskih serija. Geler, Zoltan, 18 September 2015
The subject of this dissertation encompasses a comprehensive overview and analysis of the impact of the Sakoe-Chiba global constraint on the most commonly used elastic similarity measures in the field of time-series data mining, with a focus on classification accuracy. The choice of similarity measure is one of the most significant aspects of time-series analysis: it should correctly reflect the resemblance between the data presented in the form of time series. Similarity measures are a critical component of many time-series mining tasks, including classification, clustering, prediction, anomaly detection, and others.
The research covered by this dissertation is oriented towards several issues:
1. a review of the effects of global constraints on the performance of computing similarity measures,
2. a detailed analysis of the influence of constraining the elastic similarity measures on the accuracy of classical classification techniques,
3. an extensive study of the impact of different weighting schemes on the classification of time series,
4. the development of an open-source library (Framework for Analysis and Prediction, FAP) that integrates the main techniques and methods required for analyzing and mining time series and that was used to carry out these experiments.
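As an illustration of the kind of constrained elastic measure studied, a compact Python sketch of Dynamic Time Warping restricted to a Sakoe-Chiba band of width r follows; it is not code from the FAP library, and the squared-difference local cost and the toy series are assumptions.

import numpy as np

def dtw_sakoe_chiba(x, y, r):
    # DTW limited to warping-path cells with |i - j| <= r (the Sakoe-Chiba band).
    n, m = len(x), len(y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - r), min(m, i + r) + 1):
            d = (x[i - 1] - y[j - 1]) ** 2
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(np.sqrt(cost[n, m]))  # inf if the band cannot connect the two series

print(dtw_sakoe_chiba(np.array([0., 1., 2., 3.]), np.array([0., 2., 3., 3.]), r=1))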
|
7 |
Adaptive Similarity Measures for Material Identification in Hyperspectral Imagery. Bue, Brian, 16 September 2013 (has links)
Remotely-sensed hyperspectral imagery has become one of the most advanced tools for analyzing the processes that shape the Earth and other planets. Effective, rapid analysis of high-volume, high-dimensional hyperspectral image data sets demands efficient, automated techniques to identify signatures of known materials in such imagery. In this thesis, we develop a framework for automatic material identification in hyperspectral imagery using adaptive similarity measures. We frame the material identification problem as a multiclass similarity-based classification problem, where our goal is to predict material labels for unlabeled target spectra based upon their similarities to source spectra with known material labels. As differences in capture conditions affect the spectral representations of materials, we divide the material identification problem into intra-domain (i.e., source and target spectra captured under identical conditions) and inter-domain (i.e., source and target spectra captured under different conditions) settings.
The first component of this thesis develops adaptive similarity measures for intra-domain settings that measure the relevance of spectral features to the given classification task using small amounts of labeled data. We propose a technique based on multiclass Linear Discriminant Analysis (LDA) that combines several distinct similarity measures into a single hybrid measure capturing the strengths of each of the individual measures. We also provide a comparative survey of techniques for low-rank Mahalanobis metric learning, and demonstrate that regularized LDA yields results competitive with the state of the art at substantially lower computational cost.
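A hedged sketch of an LDA-derived similarity follows: spectra are projected with a supervised (regularized) LDA transform and compared by Euclidean distance in that space. The synthetic spectra, the shrinkage setting and the plain Euclidean comparison are assumptions; this is not the thesis framework.

import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 50))     # 300 synthetic "spectra" with 50 bands
y = rng.integers(0, 3, size=300)   # 3 material classes
X[y == 1] += 1.0                   # give the classes some separation
X[y == 2] -= 1.0

lda = LinearDiscriminantAnalysis(n_components=2, solver="eigen", shrinkage="auto")  # regularized LDA
lda.fit(X, y)

def lda_distance(a, b):
    # Distance between two spectra after the learned discriminative projection.
    za, zb = lda.transform([a]), lda.transform([b])
    return float(np.linalg.norm(za - zb))

print(lda_distance(X[0], X[1]))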
The second component of this thesis shifts the focus to inter-domain settings, and proposes a multiclass domain adaptation framework that reconciles systematic differences between spectra captured under similar, but not identical, conditions. Our framework computes a similarity-based mapping that captures structured, relative relationships between classes shared between source and target domains, allowing us to apply a classifier trained on labeled source spectra to classify target spectra. We demonstrate improved domain adaptation accuracy in comparison to recently-proposed multitask learning and manifold alignment techniques in several case studies involving state-of-the-art synthetic and real-world hyperspectral imagery.
|
8 |
Evaluation of Melody Similarity Measures. Kelly, Matthew, 08 September 2012 (has links)
Similarity in music is a concept with significant impact on ethnomusicology studies, music recommendation systems, and music information retrieval systems such as Shazam and SoundHound. Various computer-based melody similarity measures have been proposed, but comparison and evaluation of similarity measures is inherently difficult due to the subjective and application-dependent nature of similarity in music. In this thesis, we address the diversity of the problem by defining a set of music transformations that provide the criteria for comparing and evaluating melody similarity measures. This approach provides a flexible and extensible method for characterizing selected facets of melody similarity, because the set of music transformations can be tailored to the user and to the application.
We demonstrate this approach using three music transformations (transposition, tempo rescaling, and selected forms of ornamentation) to compare and evaluate several existing similarity measures, including String Edit Distance measures, Geometric measures, and N-Gram based measures. We also evaluate a newly implemented distance measure, the Beat and Direction Distance Measure, which is designed to have greater awareness of the beat hierarchy and better responsiveness to ornamentation. Training and test data are drawn from music incipits in the RISM A/II collection, and ground truth is taken from the MIREX 2005 Symbolic Melodic Similarity task. Our test results show that similarity measures that are responsive to music transformations generally have better agreement with human-generated ground truth. / Thesis (Master, Computing), Queen's University, 2012.
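For illustration, a short Python sketch shows one transposition-invariant melody comparison: each melody is encoded as its sequence of pitch intervals and compared with a plain string edit distance. The toy MIDI pitch lists are invented, and this is not the thesis's Beat and Direction measure or its RISM data.

def intervals(pitches):
    # Successive pitch differences; identical for a melody and any transposition of it.
    return [b - a for a, b in zip(pitches, pitches[1:])]

def edit_distance(s, t):
    d = [[0] * (len(t) + 1) for _ in range(len(s) + 1)]
    for i in range(len(s) + 1):
        d[i][0] = i
    for j in range(len(t) + 1):
        d[0][j] = j
    for i in range(1, len(s) + 1):
        for j in range(1, len(t) + 1):
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1,
                          d[i - 1][j - 1] + (s[i - 1] != t[j - 1]))
    return d[len(s)][len(t)]

melody = [60, 62, 64, 65, 67]         # C D E F G as MIDI pitches
transposed = [p + 5 for p in melody]  # the same tune a fourth higher
print(edit_distance(intervals(melody), intervals(transposed)))  # 0: transposition-invariant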
|
9 |
Effective and Efficient Similarity Search in Video Databases. Shao, Jie, Unknown Date (has links)
Searching for relevant information based on content features in video databases is an interesting and challenging research topic that has drawn much attention recently. Video similarity search has many practical applications, such as TV broadcast monitoring, copyright compliance enforcement and search result clustering. However, existing studies are limited in providing fast and accurate solutions, owing to the diverse variations among the videos in large collections. In this thesis, we introduce database support for effective and efficient video similarity search across various sources, even in the presence of transformation distortion, partial content re-ordering, insertion, deletion or replacement. Specifically, we focus on processing two different types of content-based queries: video clip retrieval in a large collection of segmented short videos, and video subsequence identification from a long unsegmented stream.
The first part of the thesis investigates how to process a number of individual kNN searches on the same database simultaneously, to reduce the computational overhead of current content-based video search systems. We propose a Dynamic Query Ordering (DQO) algorithm for efficiently processing Batch Nearest Neighbor (BNN) search in high-dimensional space, with advanced optimizations of both I/O cost and CPU cost.
The second part of the thesis addresses the previously unstudied problem of temporal localization of similar content within a long unsegmented video sequence, extended to identify occurrences whose ordering or length may differ from the query because of video content editing. A graph transformation and matching approach supported by the above BNN search is proposed as a filter-and-refine query processing strategy to effectively, yet efficiently, identify the most similar subsequence.
The third part of the thesis extends the Bounded Coordinate System (BCS) method we introduced earlier for video clip retrieval. A novel collective perspective is presented that exploits the distributional discrepancy of samples to assess the similarity between two video clips. Several ideas from non-parametric hypothesis testing in statistics are utilized to check whether two ensembles of points come from the same distribution. The proposed similarity measures provide a more comprehensive analysis that captures the essence of invariant distribution information for retrieving video clips.
For each part, we present comprehensive experimental evaluations, which show improved performance compared with state-of-the-art methods. Finally, planned extensions of this work are highlighted as future research objectives.
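As a hedged illustration of the distribution-comparison idea in the third part, the Python sketch below treats each clip as an ensemble of frame feature vectors and runs a two-sample Kolmogorov-Smirnov test per dimension; the synthetic features are invented and the KS test merely stands in for the non-parametric tests and the BCS method discussed in the thesis.

import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(2)
clip_a = rng.normal(0.0, 1.0, size=(200, 8))  # 200 frames x 8-dimensional features (synthetic)
clip_b = rng.normal(0.2, 1.0, size=(180, 8))  # a slightly shifted clip

def clip_similarity(a, b):
    # Mean per-dimension KS p-value; higher values mean "same distribution" is harder to reject.
    return float(np.mean([ks_2samp(a[:, d], b[:, d]).pvalue for d in range(a.shape[1])]))

print(clip_similarity(clip_a, clip_b))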
|