271

HUMAN FACE RECOGNITION BASED ON FRACTAL IMAGE CODING

Tan, Teewoon January 2004
Human face recognition is an important area in the field of biometrics. It has been an active area of research for several decades, but still remains a challenging problem because of the complexity of the human face. In this thesis we describe fully automatic solutions that can locate faces and then perform identification and verification. We present a solution for face localisation using eye locations. We derive an efficient representation for the decision hyperplane of linear and nonlinear Support Vector Machines (SVMs). For this we introduce the novel concept of $\rho$ and $\eta$ prototypes. The standard formulation for the decision hyperplane is reformulated and expressed in terms of the two prototypes. Different kernels are treated separately to achieve further classification efficiency and to facilitate their adaptation to operate with the fast Fourier transform for fast eye detection. Using the eye locations, we extract and normalise the face for size and in-plane rotations. Our method produces a more efficient representation of the SVM decision hyperplane than the well-known reduced set methods. As a result, our eye detection subsystem is faster and more accurate. The use of fractals and fractal image coding for object recognition has been proposed and used by others. Fractal codes have been used as features for recognition, but this requires taking into account the distance between codes and ensuring the continuity of the parameters of the code. We use a method based on fractal image coding for recognition, which we call the Fractal Neighbour Distance (FND). The FND relies on the Euclidean metric and the uniqueness of the attractor of a fractal code. An advantage of using the FND over fractal codes as features is that we do not have to worry about the uniqueness of, and distance between, codes. We only require the uniqueness of the attractor, which is already an implied property of a properly generated fractal code. Methods similar to the FND have been proposed by others, but what distinguishes our work is that we investigate the FND in greater detail and use our findings to improve the recognition rate. Our investigations reveal that the FND has some inherent invariance to translation, scale, rotation and changes in illumination. These invariances are image-dependent and are affected by the fractal encoding parameters. The parameters that have the greatest effect on recognition accuracy are the contrast scaling factor, the luminance shift factor and the type of range block partitioning. The contrast scaling factor affects the convergence, and the eventual rate of convergence, of the fractal decoding process. We propose a novel method of controlling the convergence rate by altering the contrast scaling factor in a controlled manner, which has not been possible before. This helped us improve the recognition rate because, under certain conditions, better results are achievable with a slower rate of convergence. We also investigate the effects of varying the luminance shift factor, and examine three different types of range block partitioning schemes: quad-tree, HV and uniform partitioning. We performed experiments using various face datasets, and the results show that our method indeed performs better than many accepted methods such as eigenfaces. The experiments also show that the FND-based classifier increases the separation between classes. The standard FND is further improved by incorporating the use of localised weights.
A local search algorithm is introduced to find the best-matching local feature using this locally weighted FND. The scores from a set of these locally weighted FND operations are then combined to obtain a global score, which is used as a measure of the similarity between two face images. Each local FND operation possesses the distortion-invariant properties described above. Combined with the search procedure, the method has the potential to be invariant to a larger class of non-linear distortions. We also present a set of locally weighted FNDs that concentrate on the upper part of the face, encompassing the eyes and nose. This design was motivated by the fact that the region around the eyes carries more discriminative information. Better performance is achieved by using different sets of weights for identification and verification. For facial verification, performance is further improved by using normalised scores and client-specific thresholding. In this case, our results are competitive with current state-of-the-art methods, and in some cases outperform all those to which they were compared. For facial identification, under some conditions the weighted FND performs better than the standard FND. However, the weighted FND still has its shortcomings on some datasets, where its performance is not much better than that of the standard FND. To alleviate this problem we introduce a voting scheme that operates with normalised versions of the weighted FND. Although there are no improvements at lower matching ranks using this method, there are significant improvements at larger matching ranks. Our methods offer advantages over some well-accepted approaches such as eigenfaces, neural networks and those that use statistical learning theory. Some of the advantages are: new faces can be enrolled without re-training involving the whole database; faces can be removed from the database without the need for re-training; there are inherent invariances to face distortions; it is relatively simple to implement; and it is not model-based, so there are no model parameters that need to be tweaked.
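As a rough sketch of the idea (not the thesis implementation; block sizes, the exhaustive domain search and all names are illustrative assumptions), the FND can be pictured in a few lines of Python: encode a gallery face as a set of contractive block mappings, apply one decoding iteration of that code to a probe face, and measure how little the probe moves. A probe close to the code's attractor yields a small distance.

```python
import numpy as np

def encode(img, rb=4):
    """Toy fractal encoder: for each range block, find the best matching
    (subsampled) domain block and fit contrast s and luminance o."""
    code = []
    H, W = img.shape
    db = 2 * rb  # domain blocks are twice the range-block size
    for i in range(0, H, rb):
        for j in range(0, W, rb):
            r = img[i:i+rb, j:j+rb].astype(float)
            best = None
            for y in range(0, H - db + 1, db):
                for x in range(0, W - db + 1, db):
                    d = img[y:y+db, x:x+db].astype(float)
                    d = d.reshape(rb, 2, rb, 2).mean(axis=(1, 3))  # subsample
                    dm, rm = d.mean(), r.mean()
                    # least-squares fit r ~ s*d + o
                    s = ((d - dm) * (r - rm)).sum() / (((d - dm) ** 2).sum() + 1e-9)
                    s = np.clip(s, -1.0, 1.0)  # keep the mapping contractive
                    o = rm - s * dm
                    err = ((s * d + o - r) ** 2).sum()
                    if best is None or err < best[0]:
                        best = (err, (i, j, y, x, s, o))
            code.append(best[1])
    return code

def apply_code(code, img, rb=4):
    """One decoding iteration: apply every block mapping of the code to img."""
    out = np.empty_like(img, dtype=float)
    db = 2 * rb
    for i, j, y, x, s, o in code:
        d = img[y:y+db, x:x+db].astype(float)
        d = d.reshape(rb, 2, rb, 2).mean(axis=(1, 3))
        out[i:i+rb, j:j+rb] = s * d + o
    return out

def fnd(code, probe, iters=1):
    """Fractal Neighbour Distance: how far the probe moves under the gallery
    image's fractal code (small distance = probe is near the attractor)."""
    x = probe.astype(float)
    for _ in range(iters):
        x = apply_code(code, x)
    return np.linalg.norm(x - probe)
```

A probe face would then be identified by computing its FND against the stored fractal code of every enrolled face and taking the minimum.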
272

A Note on the Generalization Performance of Kernel Classifiers with Margin

Evgeniou, Theodoros, Pontil, Massimiliano 01 May 2000
We present distribution-independent bounds on the generalization (misclassification) performance of a family of kernel classifiers with margin. Support Vector Machine (SVM) classifiers arise as a special case of this family of machines. The bounds are derived through computations of the $V_\gamma$ dimension of a family of loss functions to which the SVM loss belongs. Bounds that use functions of margin distributions (i.e. functions of the slack variables of SVM) are derived.
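For readers unfamiliar with this style of result, margin-based bounds typically have the following shape (a sketch of the general form only; the paper's exact statement, constants and definition of the $V_\gamma$ dimension differ):

```latex
% Generic shape of a margin-based generalization bound (illustrative only):
% with probability at least 1 - \delta over n i.i.d. samples,
\[
  \Pr\bigl[\, y f(x) \le 0 \,\bigr] \;\le\;
  \underbrace{\frac{1}{n}\sum_{i=1}^{n} \mathbf{1}\bigl[\, y_i f(x_i) \le \gamma \,\bigr]}_{\text{empirical margin error}}
  \;+\; O\!\left( \sqrt{\frac{h_{\gamma}\,\log^2 n + \log(1/\delta)}{n}} \right),
\]
% where h_gamma is a margin-dependent capacity term (such as the V_gamma
% dimension) that decreases as the margin gamma grows.
```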
273

Bioinformatics approaches to analysing RNA mediated regulation of gene expression

Childs, Liam January 2010
The genome can be considered the blueprint for an organism. Composed of DNA, it harbours all organism-specific instructions for the synthesis of all structural components and their associated functions. The role of carrying actual molecular structure and function was long believed to be assumed exclusively by proteins, encoded in particular segments of the genome, the genes. In the process of converting the information stored in genes into functional proteins, RNA, a third major molecule class, was discovered early on to act as a messenger by copying the genomic information and relaying it to the protein-synthesizing machinery. Furthermore, RNA molecules were identified to assist in the assembly of amino acids into native proteins. For a long time, these rather passive roles were thought to be the sole purpose of RNA. However, in recent years, new discoveries have led to a radical revision of this view. First, RNA molecules with catalytic functions, thought to be the exclusive domain of proteins, were discovered. Then, scientists realized that much more of the genomic sequence is transcribed into RNA molecules than there are proteins in cells, raising the question of what the functions of all these molecules are. Furthermore, very short and altogether new types of RNA molecules, seemingly playing a critical role in orchestrating cellular processes, were discovered. Thus, RNA has become a central research topic in molecular biology, even to the extent that some researchers dub cells "RNA machines". This thesis aims to contribute towards our understanding of RNA-related phenomena by applying bioinformatics methods. First, we performed a genome-wide screen to identify sites at which the chemical composition of DNA (the genotype) critically influences phenotypic traits (the phenotype) of the model plant Arabidopsis thaliana. Whole-genome hybridisation arrays were used, and an informatics strategy was developed to identify polymorphic sites from hybridisation to genomic DNA. Following this approach, genotype-phenotype associations were discovered not only across the entire Arabidopsis genome but also in regions not currently known to encode proteins, which thus represent candidate sites for novel functional RNA molecules. By statistically associating them with phenotypic traits, clues as to their particular functions were obtained. Furthermore, these candidate regions were subjected to a novel RNA-function classification prediction method developed as part of this thesis. While determining the chemical structure (the sequence) of candidate RNA molecules is relatively straightforward, elucidating their structure-function relationship is much more challenging. Towards this end, we devised and implemented a novel algorithmic approach to predict the structural and, thereby, functional class of RNA molecules. In this algorithm, the concept of treating RNA molecule structures as graphs was introduced. We demonstrate that this abstraction of the actual structure leads to meaningful results that may greatly assist in the characterization of novel RNA molecules. Furthermore, by using graph-theoretic properties as descriptors of structure, we identified particular structural features of RNA molecules that may determine their function, thus providing new insights into the structure-function relationships of RNA. The method (termed Grapple) has been made available to the scientific community as a web-based service.
RNA has taken centre stage in molecular biology research, and novel discoveries can be expected to further solidify the central role of RNA in the origin and support of life on earth. As illustrated by this thesis, bioinformatics methods will continue to play an essential role in these discoveries.
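The graph abstraction underlying Grapple can be illustrated with a short Python sketch (an illustration only, not the Grapple code; the feature set and function names are assumptions): an RNA secondary structure in dot-bracket notation becomes a graph whose backbone bonds and base pairs are edges, from which graph-theoretic descriptors are computed.

```python
import networkx as nx

def dotbracket_to_graph(structure):
    """Build a graph whose nodes are bases: backbone edges link sequence
    neighbours, extra edges link the paired bases of the dot-bracket string."""
    g = nx.Graph()
    g.add_nodes_from(range(len(structure)))
    g.add_edges_from((i, i + 1) for i in range(len(structure) - 1))  # backbone
    stack = []
    for i, ch in enumerate(structure):
        if ch == '(':
            stack.append(i)
        elif ch == ')':
            g.add_edge(stack.pop(), i)  # base pair
    return g

def graph_features(g):
    """A few graph-theoretic descriptors of the kind a classifier could use."""
    return {
        'n_nodes': g.number_of_nodes(),
        'n_edges': g.number_of_edges(),
        'avg_degree': 2 * g.number_of_edges() / g.number_of_nodes(),
        'diameter': nx.diameter(g),
        'clustering': nx.average_clustering(g),
    }

# Example: a small hairpin structure
print(graph_features(dotbracket_to_graph('((((....))))')))
```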
274

Efficient Kernel Methods For Large Scale Classification

Asharaf, S 07 1900
Classification algorithms have been widely used in many application domains. Most of these domains deal with massive collections of data and hence demand classification algorithms that scale well with the size of the data sets involved. A classification algorithm is said to be scalable if there is no significant increase in its time and space requirements (without compromising the generalization performance) when the training set size increases. The Support Vector Machine (SVM) is one of the most celebrated kernel-based classification methods used in Machine Learning. An SVM capable of handling large scale classification problems is therefore an ideal candidate for many real world applications. The training process of the SVM classifier is usually formulated as a Quadratic Programming (QP) problem. The existing solution strategies for this problem have an associated time and space complexity that is (at least) quadratic in the number of training points. This makes SVM training very expensive even on classification problems having only a few thousand training examples. This thesis addresses the scalability of the training algorithms involved in both two-class and multiclass Support Vector Machines. Efficient training schemes reducing the space and time requirements of the SVM training process are proposed as possible solutions. The classification schemes discussed in the thesis for handling large scale two-class classification problems are: a) two selective sampling based training schemes for scaling the non-linear SVM, and b) clustering based approaches for handling unbalanced data sets with the Core Vector Machine. To handle large scale multiclass classification problems, the thesis proposes the Multiclass Core Vector Machine (MCVM), a scalable SVM based multiclass classifier. In MCVM, the multiclass SVM problem is shown to be equivalent to a Minimum Enclosing Ball (MEB) problem and is then solved using a fast approximate MEB finding algorithm. Experimental studies were done with several large real world data sets such as the IJCNN1 and Acoustic data sets from the LIBSVM page, the Extended USPS data set from the CVM page, and the DARPA network intrusion detection data sets used in the KDD 99 contest. From the empirical results it is observed that the proposed classification schemes achieve good generalization performance at low time and space requirements. Further, the scalability experiments done with large training data sets have demonstrated that the proposed schemes scale well. A novel soft clustering scheme called Rough Support Vector Clustering (RSVC), employing the idea of the Soft Minimum Enclosing Ball (SMEB) problem, is another contribution discussed in this thesis. Experiments done with a synthetic data set and the real world IRIS data set have shown that RSVC finds meaningful soft cluster abstractions.
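Fast approximate MEB solvers of the kind used by Core Vector Machines are typically core-set schemes in the style of Badoiu and Clarkson; the numpy sketch below shows that scheme for plain Euclidean points (an illustration under that assumption, not the thesis code). The appeal is that the number of iterations depends only on the approximation factor, not on the number of points.

```python
import numpy as np

def approx_meb(points, eps=0.05, iters=None):
    """(1+eps)-approximate Minimum Enclosing Ball via the core-set scheme:
    repeatedly pull the centre towards the farthest point.
    Returns (centre, radius)."""
    points = np.asarray(points, dtype=float)
    c = points[0].copy()
    if iters is None:
        iters = int(np.ceil(1.0 / eps ** 2))  # classic core-set iteration bound
    for t in range(1, iters + 1):
        dists = np.linalg.norm(points - c, axis=1)
        far = points[np.argmax(dists)]  # farthest point joins the core set
        c += (far - c) / (t + 1)        # step size 1/(t+1)
    r = np.linalg.norm(points - c, axis=1).max()
    return c, r

# Usage: the MEB of 100k random points in 10 dimensions
pts = np.random.randn(100_000, 10)
centre, radius = approx_meb(pts, eps=0.05)
```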
275

Speech Signal Classification Using Support Vector Machines

Sood, Gaurav 07 1900
Hidden Markov Models (HMMs) are, undoubtedly, the most widely employed core technique for Automatic Speech Recognition (ASR). Nevertheless, we are still far from achieving high-performance ASR systems. Some alternative approaches, most of them based on Artificial Neural Networks (ANNs), were proposed during the late eighties and early nineties. Some of them tackled the ASR problem using predictive ANNs, while others proposed hybrid HMM/ANN systems. However, despite some achievements, the dependency on Hidden Markov Models remains a fact. During the last decade, however, a new tool appeared in the field of machine learning that has proved able to cope with hard classification problems in several fields of application: the Support Vector Machine (SVM). SVMs are effective discriminative classifiers with several outstanding characteristics, namely: their solution is the one with maximum margin; they are capable of dealing with samples of very high dimensionality; and their convergence to the minimum of the associated cost function is guaranteed. In this work a novel approach based upon probabilistic kernels in support vector machines has been attempted for speech data classification. The classification accuracy of support vector classification depends upon the kernel function used, which in turn depends upon the data set at hand, and as of now there is no way to know a priori which kernel will give the best results. The kernel used in this work normalizes the time dimension inherent to speech signals by fitting a probability distribution over the data points of each utterance; this facilitates the use of support vector machines, which act on static data only. The divergence between the probability distributions fitted over individual speech utterances is used to form the kernel matrix. Vowel classification and isolated word recognition (digit recognition) have been attempted, and the results are compared with state-of-the-art systems.
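One concrete instance of such a probabilistic kernel (an illustration with assumed choices, not necessarily the thesis recipe) fits a diagonal-covariance Gaussian to the feature frames of each utterance and exponentiates a symmetric KL divergence between the fitted models:

```python
import numpy as np

def fit_gaussian(frames):
    """Fit a diagonal-covariance Gaussian to an utterance's feature frames
    (frames: T x d array, e.g. MFCC vectors), removing the time dimension."""
    mu = frames.mean(axis=0)
    var = frames.var(axis=0) + 1e-6  # floor the variance for stability
    return mu, var

def sym_kl(p, q):
    """Symmetric KL divergence between two diagonal Gaussians."""
    (m1, v1), (m2, v2) = p, q
    kl12 = 0.5 * np.sum(np.log(v2 / v1) + (v1 + (m1 - m2) ** 2) / v2 - 1)
    kl21 = 0.5 * np.sum(np.log(v1 / v2) + (v2 + (m2 - m1) ** 2) / v1 - 1)
    return kl12 + kl21

def kernel_matrix(utterances, gamma=0.1):
    """Exponentiate the divergence to obtain a kernel matrix for an SVM."""
    models = [fit_gaussian(u) for u in utterances]
    n = len(models)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = np.exp(-gamma * sym_kl(models[i], models[j]))
    return K  # pass to e.g. sklearn.svm.SVC(kernel='precomputed')
```

Because every utterance, whatever its duration, is summarised by one distribution, variable-length speech becomes fixed-size input for the SVM.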
276

Statistical methods with application to machine learning and artificial intelligence

Lu, Yibiao 11 May 2012
This thesis consists of four chapters. Chapter 1 focuses on theoretical results on high-order Laplacian-based regularization in function estimation. We studied iterated Laplacian regularization in the context of supervised learning in order to achieve both nice theoretical properties (like thin-plate splines) and good performance over complex regions (like the soap film smoother). In Chapter 2, we propose an innovative static path-planning algorithm called m-A* for environments full of obstacles. Theoretically we show that m-A* reduces the number of vertices. In the simulation study, our approach outperforms A* armed with the standard L1 heuristic and stronger ones such as True-Distance Heuristics (TDH), yielding faster query times, adequate memory usage and reasonable preprocessing time. Chapter 3 proposes the m-LPA* algorithm, which extends m-A* to dynamic path-planning and achieves better performance than the benchmark, Lifelong Planning A* (LPA*), in terms of robustness and worst-case computational complexity. Employing the same beamlet graphical structure as m-A*, m-LPA* encodes the information of the environment in a hierarchical, multiscale fashion, and therefore yields a more robust dynamic path-planning algorithm. Chapter 4 focuses on an approach for the prediction of spot electricity price spikes via a combination of boosting and wavelet analysis. Extensive numerical experiments show that our approach improves prediction accuracy compared to support vector machines, thanks to the fact that the gradient boosting trees method inherits the good properties of decision trees, such as robustness to irrelevant covariates, fast computation and good interpretability.
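For context, the baseline that m-A* is compared against, A* with the standard L1 (Manhattan) heuristic, can be sketched briefly (a generic grid version; m-A* itself, with its beamlet structure, is not shown):

```python
import heapq

def astar_l1(grid, start, goal):
    """A* on a 4-connected grid; grid[r][c] == 1 marks an obstacle.
    The L1 (Manhattan) heuristic is admissible for 4-connected moves."""
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    rows, cols = len(grid), len(grid[0])
    open_heap = [(h(start), 0, start)]  # (f = g + h, g, node)
    g = {start: 0}
    while open_heap:
        f, cost, node = heapq.heappop(open_heap)
        if node == goal:
            return cost
        if cost > g.get(node, float('inf')):
            continue  # stale heap entry
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and not grid[nr][nc]:
                if cost + 1 < g.get((nr, nc), float('inf')):
                    g[(nr, nc)] = cost + 1
                    heapq.heappush(open_heap,
                                   (cost + 1 + h((nr, nc)), cost + 1, (nr, nc)))
    return None  # goal unreachable

# Example: shortest path length in a 3x3 grid with a wall in the middle row
print(astar_l1([[0, 0, 0], [1, 1, 0], [0, 0, 0]], (0, 0), (2, 0)))  # -> 6
```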
277

Automated Ice-Water Classification using Dual Polarization SAR Imagery

Leigh, Steve January 2013
Mapping ice and open water in ocean bodies is important for numerous purposes, including environmental analysis and ship navigation. The Canadian Ice Service (CIS) currently has several expert ice analysts manually generate ice maps on a daily basis. The CIS would like to augment its current process with an automated ice-water discrimination algorithm capable of operating on dual-pol synthetic aperture radar (SAR) images produced by RADARSAT-2. Automated methods can provide mappings in larger volumes, with more consistency, and at finer resolutions than are otherwise practical to generate. We have developed such an automated ice-water discrimination system, called MAGIC. The algorithm first classifies the HV scene using the glocal method, a hierarchical region-based classification approach. The glocal method incorporates spatial context information into the classification model using a modified watershed segmentation and a previously developed MRF classification algorithm called IRGS. Second, a pixel-based support vector machine (SVM) classification using a nonlinear RBF kernel is performed, exploiting SAR grey-level co-occurrence matrix (GLCM) texture and backscatter features. Finally, the IRGS and SVM classification results are combined using the IRGS approach, but with a modified energy function to accommodate the SVM pixel-based information. The combined classifier was tested on 61 ground-truthed dual-pol RADARSAT-2 scenes of the Beaufort Sea containing a variety of ice types and water patterns across the melt, summer, and freeze-up periods. The average leave-one-out classification accuracy with respect to these ground truths is 95.8%, and MAGIC attains an accuracy of 90% or above on 88% of the scenes. The MAGIC system is now under consideration by the CIS for operational use.
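The pixel-based SVM stage can be pictured with a scikit-image/scikit-learn sketch (illustrative parameter values and features, not the MAGIC configuration): quantise a patch, compute GLCM texture properties, append a backscatter statistic, and feed the vectors to an RBF-kernel SVM.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops
from sklearn.svm import SVC

def glcm_features(patch, levels=32):
    """GLCM texture features for one image patch (2-D array of non-negative
    intensities), plus the mean backscatter, as a single feature vector."""
    q = (patch.astype(float) / (patch.max() + 1e-9) * (levels - 1)).astype(np.uint8)
    glcm = graycomatrix(q, distances=[1, 2], angles=[0, np.pi / 2],
                        levels=levels, symmetric=True, normed=True)
    feats = [graycoprops(glcm, p).ravel()
             for p in ('contrast', 'homogeneity', 'energy', 'correlation')]
    return np.concatenate(feats + [[patch.mean()]])

def train_ice_water(patches, labels):
    """patches: list of SAR patches; labels: 1 for ice, 0 for water (assumed)."""
    X = np.array([glcm_features(p) for p in patches])
    clf = SVC(kernel='rbf', C=10.0, gamma='scale')  # nonlinear RBF kernel
    return clf.fit(X, labels)
```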
278

Development of an Innovative System for the Reconstruction of New Generation Satellite Images

LORENZI, Luca 29 November 2012
Remote sensing satellites have become essential for civil society. Satellite images have been successfully exploited for several applications, notably environmental monitoring and natural disaster prevention. In recent years, the increased availability of very high spatial resolution (VHR) remote sensing images has led to potentially relevant new applications related to land-use monitoring and environmental management. However, because optical sensors directly acquire the sunlight reflected from the ground, they can suffer from the presence of clouds in the sky and/or shadows on the ground. This is the missing data problem, which is important and critical, particularly for VHR images, where the increased geometric detail implies a greater loss of information. In this thesis, new methodologies for detecting and reconstructing regions of missing data in VHR images are proposed and applied to areas contaminated by the presence of clouds and/or shadows. In particular, the proposed methodological contributions include: i) a multiresolution inpainting strategy for reconstructing cloud-contaminated images; ii) a novel combination of radiometric information and spatial position information in two dedicated kernels to better reconstruct cloud-contaminated regions by support vector regression (SVR); iii) the exploitation of compressed sensing theory with three different strategies (orthogonal matching pursuit, basis pursuit, and a compressed sensing solution based on a genetic algorithm) for the reconstruction of cloud-contaminated images; iv) a complete processing chain that uses a support vector machine (SVM) for the classification and detection of shadow areas, followed by linear regression for the reconstruction of those areas; and finally v) several evaluation criteria for assessing the reconstruction performance on shadow areas. All these methods were developed specifically to work with very high resolution images. Experimental results on real data are presented to show and confirm the validity of all the proposed methods. They suggest that, despite the complexity of the problems, it is possible to acceptably recover the missing areas masked by clouds or corrupted by shadows.
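The orthogonal matching pursuit ingredient can be illustrated generically (a sketch with assumed dictionaries and sizes, not the thesis pipeline): a signal that is sparse in some dictionary is recovered from a reduced set of linear measurements, which stand in for the pixels left uncontaminated by the cloud.

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.default_rng(0)

# A signal that is k-sparse in an (assumed) dictionary of 256 atoms
n_atoms, k = 256, 8
dictionary = rng.standard_normal((n_atoms, n_atoms))
coefs = np.zeros(n_atoms)
coefs[rng.choice(n_atoms, k, replace=False)] = rng.standard_normal(k)
signal = dictionary @ coefs

# Only m < n linear measurements survive (the uncontaminated observations)
m = 96
phi = rng.standard_normal((m, n_atoms))  # measurement operator
y = phi @ signal

# OMP recovers the sparse code from the measurements, then the full signal
omp = OrthogonalMatchingPursuit(n_nonzero_coefs=k, fit_intercept=False)
omp.fit(phi @ dictionary, y)
reconstructed = dictionary @ omp.coef_
print(np.allclose(reconstructed, signal))  # True when the support is recovered
```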
279

Analysis And Classification Of Spelling Paradigm EEG Data And An Attempt For Optimization Of Channels Used

Yildirim, Asil 01 December 2010
Brain Computer Interfaces (BCIs) are systems developed in order to control devices using only brain signals. In BCI systems, different mental activities to be performed by the user are associated with different actions on the device to be controlled. The Spelling Paradigm is a BCI application which aims to construct words by detecting letters using P300 signals recorded via electrodes attached to various points on the scalp. Reducing the letter detection error rate and increasing the speed of letter detection are crucial for the Spelling Paradigm; in this way, disabled people can express their needs more easily using this application. In this thesis, two different methods, Support Vector Machines (SVM) and AdaBoost, are used for classification in the analysis. Classification and Regression Trees are used as the weak classifier of AdaBoost. Time-frequency domain characteristics of P300 evoked potentials are analyzed in addition to time domain characteristics, with the Wigner-Ville Distribution used to transform time domain signals into the time-frequency domain. It is observed that classification results are better in the time domain. Furthermore, the optimum subset of channels that models P300 signals with minimum error rate is sought, and a method that uses both SVM and AdaBoost is proposed to select channels; 12 channels are selected in the time domain with this method. Finally, the effect of dimensionality reduction is analyzed using Principal Component Analysis (PCA) and AdaBoost.
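The two classifiers compared can be sketched with scikit-learn (illustrative defaults; the thesis used its own features, channels and parameters). Here X is assumed to hold one feature vector per stimulus epoch and y marks whether the epoch contains a P300 response.

```python
from sklearn.svm import SVC
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

def compare_classifiers(X, y):
    """Cross-validated accuracy of an RBF SVM vs AdaBoost with CART stumps."""
    svm = SVC(kernel='rbf', C=1.0, gamma='scale')
    ada = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1),  # CART weak learner
                             n_estimators=100)
    for name, clf in (('SVM', svm), ('AdaBoost', ada)):
        acc = cross_val_score(clf, X, y, cv=5).mean()
        print(f'{name}: {acc:.3f}')
```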
280

Discovering Discussion Activity Flows in an On-line Forum Using Data Mining Techniques

Hsieh, Lu-shih 22 July 2008
In the Internet era, more and more courses are taught through a course management system (CMS) or learning management system (LMS). In an asynchronous virtual learning environment, an instructor needs to be aware of the progress of discussions in forums, and may intervene if necessary in order to facilitate students' learning. This research proposes a discussion forum activity flow tracking system, called FAFT (Forum Activity Flow Tracer), to automatically monitor the discussion activity flow of threaded forum postings in a CMS/LMS. As CMS/LMS platforms are becoming popular in facilitating learning activities, the proposed FAFT can help instructors identify students' interaction types in discussion forums. FAFT adopts modern data/text mining techniques to discover the patterns of forum discussion activity flows, which instructors can use to facilitate online learning activities. FAFT consists of two subsystems: activity classification (AC) and activity flow discovery (AFD). A posting can be perceived as a type of announcement, questioning, clarification, interpretation, conflict, or assertion. AC adopts a cascade model to classify the various activity types of posts in a discussion thread. The empirical evaluation of the classified types, on a repository of postings from earth science online courses at a senior high school, shows that AC can effectively facilitate the coding process, and that the cascade model can deal with the imbalanced distribution of discussion postings. AFD adopts a hidden Markov model (HMM) to discover the activity flows. A discussion activity flow can be presented as an HMM diagram that an instructor can use to predict which discussion activity flow type a thread may follow. The empirical results of the HMM from an online forum in an earth science subject at a senior high school show that FAFT can effectively predict the type of a discussion activity flow. Thus, the proposed FAFT can be embedded in a course management system to automatically predict the activity flow type of a discussion thread, and in turn reduce teachers' load in managing online discussion forums.
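The HMM scoring step can be illustrated with a small numpy forward algorithm (made-up probabilities purely for illustration, not FAFT's trained models): given start, transition and emission probabilities, it returns the likelihood of an observed sequence of post types, and a thread is assigned to the flow type whose HMM scores it highest.

```python
import numpy as np

# Observed post types coded as integers, following the six types above:
# 0=announcement 1=questioning 2=clarification 3=interpretation 4=conflict 5=assertion

def sequence_likelihood(obs, start, trans, emit):
    """Forward algorithm: P(obs | HMM). start[i], trans[i, j] and emit[i, k]
    are the initial, transition and emission probabilities of hidden state i."""
    alpha = start * emit[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ trans) * emit[:, o]
    return alpha.sum()

# Made-up two-state model purely for illustration
start = np.array([0.6, 0.4])
trans = np.array([[0.7, 0.3],
                  [0.4, 0.6]])
emit = np.array([[0.30, 0.30, 0.20, 0.10, 0.05, 0.05],   # state 0 emissions
                 [0.05, 0.10, 0.20, 0.25, 0.20, 0.20]])  # state 1 emissions

thread = [1, 2, 2, 3, 5]  # questioning -> clarification -> ... -> assertion
print(sequence_likelihood(thread, start, trans, emit))
# Classify a thread by the flow-type HMM yielding the highest likelihood.
```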
