
Multiple classifier systems for the classification of hyperspectral data / Systèmes de classifieurs multiples pour la classification de données hyperspectrales

Xia, Junshi 23 October 2014 (has links)
In this thesis, we propose several new techniques for the classification of hyperspectral remote sensing images based on multiple classifier systems (MCS). The proposed framework introduces significant innovations with regard to previous approaches in the same field, many of which are based on a single algorithm. First, we propose to use Rotation Forest with several linear feature-extraction techniques and compare it with traditional ensemble approaches such as Bagging, Boosting, Random Subspace, and Random Forest. Second, we investigate the integration of support vector machines (SVM) with the rotation-subspace framework for contextual classification; SVM and rotation subspaces are two powerful tools for high-dimensional data classification, so combining them can further improve classification performance. Third, we extend the Rotation Forest work by incorporating a local feature-extraction technique and spatial contextual information, via a Markov random field (MRF), to design robust spatial-spectral methods. Finally, we present a new general framework, the random subspace ensemble, to train a series of effective classifiers, including decision trees and extreme learning machines (ELM), with extended multi-attribute profiles (EMAPs) for classifying hyperspectral data.
Six random subspace (RS) ensemble methods, namely RS with decision trees (RSDT), Random Forest (RF), Rotation Forest (RoF), Rotation Random Forest (RoRF), RS with ELM (RSELM), and rotation subspace with ELM (RoELM), are constructed from the multiple base learners. The effectiveness of the proposed techniques is illustrated by comparison with state-of-the-art methods on real hyperspectral data sets from different contexts.
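The random-subspace construction that underlies several of these ensembles can be sketched in a few lines. The snippet below is a minimal illustration of the idea (train each base learner on a random subset of the spectral features, then combine by majority vote) using scikit-learn decision trees; it is not the thesis implementation, and all names and parameter values are illustrative.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def train_random_subspace(X, y, n_learners=10, subspace_frac=0.5, seed=0):
    """Train one decision tree per random subset of the spectral features."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    k = max(1, int(subspace_frac * n_features))
    ensemble = []
    for _ in range(n_learners):
        feats = rng.choice(n_features, size=k, replace=False)
        tree = DecisionTreeClassifier().fit(X[:, feats], y)
        ensemble.append((feats, tree))
    return ensemble

def predict_majority(ensemble, X):
    """Combine the base learners' outputs by majority vote."""
    votes = np.stack([tree.predict(X[:, feats]) for feats, tree in ensemble])
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)

# Toy usage: 200 "pixels", 20 spectral features, 3 classes.
rng = np.random.default_rng(42)
X, y = rng.normal(size=(200, 20)), rng.integers(0, 3, size=200)
ensemble = train_random_subspace(X, y)
print(predict_majority(ensemble, X[:5]))
```

The same skeleton covers the RSDT and RSELM variants by swapping the base learner; the rotation-based variants additionally apply a feature transformation per subspace before training.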

Detecção e diagnóstico de falhas não-supervisionados baseados em estimativa de densidade recursiva e classificador fuzzy auto-evolutivo / Unsupervised fault detection and diagnosis based on recursive density estimation and a self-evolving fuzzy classifier

Costa, Bruno Sielly Jales 13 May 2014 (has links)
In this work, we propose a two-stage algorithm for real-time fault detection and identification in industrial plants. Our proposal is based on the analysis of selected features using recursive density estimation and a new evolving classifier algorithm. More specifically, the proposed approach for the detection stage is based on the concept of density in the data space, which is not the same as a probability density function but is a very useful measure for abnormality/outlier detection. This density can be expressed by a Cauchy function and calculated recursively, which makes it efficient in memory and computational power and therefore suitable for on-line applications. The identification/diagnosis stage is based on a self-developing (evolving) fuzzy rule-based classifier proposed in this work, called AutoClass. An important property of AutoClass is that it can start learning "from scratch": neither the fuzzy rules nor the number of classes needs to be prespecified (the number of classes may grow, with new class labels being added by the on-line learning process), in a fully unsupervised manner. If an initial rule base exists, AutoClass can evolve/develop it further based on newly arriving fault-state data. To validate our proposal, we present experimental results from a didactic level-control process, where the control and error signals are used as features for the fault detection and identification stages. The approach is generic, however, and the number of features can be significantly larger thanks to the computationally lean methodology: neither covariance or more complex calculations nor the storage of old data is required. The results obtained are significantly better than those of the traditional approaches used for comparison.
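The recursive density calculation at the heart of the detection stage admits a compact sketch. The updates below follow the commonly published recursive density estimation (RDE) form of the Cauchy density (a running mean of the samples and a running mean of their squared norms); the fault threshold of 0.5 is a hypothetical choice for illustration, not a value from the thesis.

```python
import numpy as np

class RecursiveDensity:
    """Streaming Cauchy-form data density, updated recursively per sample."""
    def __init__(self, dim):
        self.k = 0                  # number of samples seen so far
        self.mean = np.zeros(dim)   # recursive mean of the samples
        self.msq = 0.0              # recursive mean of squared norms

    def update(self, x):
        """Ingest one sample and return its density in (0, 1]."""
        x = np.asarray(x, dtype=float)
        self.k += 1
        w = (self.k - 1) / self.k
        self.mean = w * self.mean + x / self.k
        self.msq = w * self.msq + (x @ x) / self.k
        # Scatter term: squared distance to the mean plus the data spread.
        spread = np.sum((x - self.mean) ** 2) + self.msq - self.mean @ self.mean
        return 1.0 / (1.0 + max(spread, 0.0))

# Toy usage: control/error samples; a density drop flags a possible fault.
rde = RecursiveDensity(dim=2)
for t, sample in enumerate([[0.1, 0.0], [0.0, 0.1], [0.05, 0.02], [3.0, -2.5]]):
    d = rde.update(sample)
    print(t, round(d, 3), "possible fault" if d < 0.5 else "normal")
```

Because only the mean, the mean squared norm, and a counter are stored, the cost per sample is constant, which is what makes the method suitable for on-line use.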

Supervised Learning of Piecewise Linear Models

Manwani, Naresh January 2012 (has links) (PDF)
Supervised learning of piecewise linear models is a well-studied problem in the machine learning community. The key idea in piecewise linear modeling is to properly partition the input space and learn a linear model for every partition. Decision trees and regression trees are classic examples of piecewise linear models for classification and regression problems. Existing approaches for learning decision/regression trees can be broadly classified into two classes: fixed-structure approaches and greedy approaches. In fixed-structure approaches, the tree structure is fixed beforehand by fixing the number of non-leaf nodes, the height of the tree, and the paths from the root node to every leaf node. Mixtures of experts and hierarchical mixtures of experts are examples of fixed-structure approaches for learning piecewise linear models. Parameters of the models are found using, e.g., maximum likelihood estimation, for which the expectation-maximization (EM) algorithm can be used. Fixed-structure piecewise linear models can also be learnt by risk minimization under an appropriate loss function. Learning an optimal decision tree with a fixed-structure approach is a hard problem: constructing an optimal binary decision tree is known to be NP-complete. Greedy approaches, on the other hand, do not assume any parametric form or fixed structure for the decision tree classifier. Most greedy approaches learn tree-structured piecewise linear models in a top-down fashion, built by binary or multi-way recursive partitioning of the input space. The main issue in top-down decision tree induction is choosing an appropriate objective function to rate the split rules; the objective function should be easy to optimize. Top-down decision trees are easy to implement and understand, but they carry no optimality guarantees due to their greedy nature. Regression trees are built in a similar way; in regression trees, every leaf node is associated with a linear regression function. All piecewise linear modeling techniques deal with two main tasks: partitioning the input space and learning a linear model for every partition. These are not independent problems: simultaneous optimal estimation of the partitions and of the linear models is a combinatorial problem and hence computationally hard. However, piecewise linear models provide better insight into the classification or regression problem by giving an explicit representation of the structure in the data. The information captured by piecewise linear models can be summarized as simple rules, so they can be used to analyze the properties of the domain from which the data originates. These properties make piecewise linear models, like decision trees and regression trees, extremely useful in many data mining applications and place them among the top data mining algorithms. In this thesis, we address the problem of supervised learning of piecewise linear models for classification and regression. We propose novel algorithms for learning piecewise linear classifiers and regression functions, and we address noise-tolerant learning of classifiers in the presence of label noise. We first propose a novel algorithm for learning polyhedral classifiers, the simplest form of piecewise linear classifiers.

Polyhedral classifiers are useful when the points of the positive class fall inside a convex region and all negative-class points lie outside it; the positive-class region can then be well approximated by a simple polyhedral set. The key challenge in optimally learning a fixed-structure polyhedral classifier is identifying the sub-problems, each of which is a linear classification problem. This is hard: deciding polyhedral separability is known to be NP-complete. The goal of any polyhedral learning algorithm is to handle the underlying combinatorial problem efficiently while achieving good classification accuracy. Existing methods for learning a fixed-structure polyhedral classifier are based on solving non-convex constrained optimization problems; they do not handle the combinatorial aspect of the problem efficiently and are computationally expensive. We propose a method of model-based estimation of the posterior class probability to learn polyhedral classifiers. We solve an unconstrained optimization problem using a simple two-step algorithm (similar to the EM algorithm) to find the model parameters. To the best of our knowledge, this is the first attempt to formulate an unconstrained optimization problem for learning polyhedral classifiers. We then modify our algorithm to also find the required number of hyperplanes automatically. We show experimentally that our approach improves on existing polyhedral learning algorithms in training time, performance, and complexity. Most often, class-conditional densities are multimodal; each class region may then be a union of polyhedral regions, and a single polyhedral classifier is not sufficient. Handling such situations requires a generic decision tree. Learning an optimal fixed-structure decision tree is computationally hard, and top-down decision trees have no optimality guarantees due to their greedy nature; nevertheless, top-down approaches are widely used because they are versatile and easy to implement. Most existing top-down decision tree algorithms (CART, OC1, C4.5, etc.) use impurity measures to assess the goodness of hyperplanes at each node of the tree. These measures do not properly capture the geometric structure in the data. We propose a novel decision tree algorithm that, at each node, selects hyperplanes based on an objective function that takes the geometric structure of the class regions into consideration. The resulting optimization problem turns out to be a generalized eigenvalue problem and is hence solved efficiently. We show through empirical studies that our approach leads to smaller trees and better performance than other top-down decision tree approaches, and we provide some theoretical justification for the proposed method. Piecewise linear regression is similar to the corresponding classification problem: in regression trees, for example, each leaf node is associated with a linear regression model, so the problem is once again the (simultaneous) estimation of optimal partitions and a linear model for each partition. Regression trees, the hinging hyperplanes method, and mixtures of experts are some of the approaches to learning continuous piecewise linear regression models; many of these algorithms are computationally intensive.

We present a method of learning piecewise linear regression models which is computationally simple and capable of learning discontinuous functions as well. The method is based on K-plane regression, which identifies a set of linear models from the training data. K-plane regression is a simple algorithm motivated by the philosophy of k-means clustering, but it has several problems: it does not give a model function with which to predict the target value for an arbitrary input, and it is very sensitive to noise. We propose a modified K-plane regression algorithm that can learn continuous as well as discontinuous functions. The proposed algorithm retains the spirit of the k-means algorithm and improves the objective function after every iteration; it learns a proper piecewise linear model that can be used for prediction and is more robust to additive noise than K-plane regression. When learning classifiers, one normally assumes that the class labels in the training set are noise-free. However, in many applications, such as spam filtering and text classification, the training data can be mislabeled due to subjective errors. In such cases, standard learning algorithms (SVM, AdaBoost, decision trees, etc.) overfit the noisy points, leading to poor test accuracy; analyzing the vulnerability of classifiers to label noise has therefore attracted growing interest from the machine learning community. Existing noise-tolerant learning approaches first try to identify the noisy points and then learn a classifier on the remaining points. In this thesis, we instead address the development of learning algorithms that are inherently noise-tolerant. An algorithm is inherently noise-tolerant if the classifier it learns from noisy samples has the same performance on test data as one learnt from noise-free samples. Algorithms with such robustness (under suitable assumptions on the noise) are attractive for learning with noisy samples. We consider non-uniform label noise, a generic noise model in which the probability of an example's class label being incorrect is a function of the example's feature vector (we assume this probability is less than 0.5 for all feature vectors); this can account for most cases of noisy data sets. There is no provably optimal algorithm for learning noise-tolerant classifiers in the presence of non-uniform label noise. We propose a novel characterization of the noise tolerance of an algorithm and analyze the noise-tolerance properties of the risk minimization framework, since risk minimization is a common strategy for classifier learning. We show that risk minimization under the 0-1 loss has the best noise-tolerance properties, and that no convex loss function has such properties. Empirical risk minimization under the 0-1 loss is a hard problem because the 0-1 loss function is not differentiable, so we propose a gradient-free stochastic optimization technique to minimize the risk under the 0-1 loss for noise-tolerant learning of linear classifiers. We show (under some conditions) that the algorithm converges asymptotically to the global minimum of the risk under the 0-1 loss, and we illustrate its noise tolerance through simulation experiments.
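As a sketch of the K-plane baseline that the regression work builds on, the snippet below alternates, k-means style, between assigning each point to the plane with the smallest squared residual and refitting each plane by least squares. This is the plain K-plane algorithm (which, as noted above, lacks a prediction model and is noise-sensitive), not the modified algorithm proposed in the thesis.

```python
import numpy as np

def k_plane_regression(X, y, k=2, iters=20, seed=0):
    """Alternate point-to-plane assignment and per-plane least squares."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    Xb = np.hstack([X, np.ones((n, 1))])   # append a constant for the bias
    W = rng.normal(size=(k, d + 1))        # one affine model per plane
    for _ in range(iters):
        residuals = (Xb @ W.T - y[:, None]) ** 2
        assign = residuals.argmin(axis=1)  # nearest plane for each point
        for j in range(k):
            idx = assign == j
            if idx.sum() > d:              # refit only with enough points
                W[j], *_ = np.linalg.lstsq(Xb[idx], y[idx], rcond=None)
    return W, assign

# Toy usage: a discontinuous piecewise linear 1-D target.
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(300, 1))
y = np.where(X[:, 0] < 0, 2 * X[:, 0] + 1, -X[:, 0] + 3)
y = y + rng.normal(0, 0.05, size=300)
W, assign = k_plane_regression(X, y, k=2)
print(np.round(W, 2))   # rows approach (2, 1) and (-1, 3)
```

Each iteration cannot increase the total squared residual, which mirrors the monotone-improvement property the thesis retains in its modified algorithm.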

Técnica experimental para quantificar a eficiência de distribuidores de líquidos industriais do tipo tubos perfurados paralelos. / Liquid aspersion efficiency quantification experiment: application to ladder-type distributors.

Marlene Silva de Moraes 07 July 2008 (has links)
This thesis describes a simple experimental method for comparing the efficiency of liquid distributors used in ore-treatment industries (in washers, classifiers, and mills) and in chemical process industries. The technique consists of analyzing the dispersion, via the standard deviation, of the liquid mass collected in vertical tubes arranged in a square grid below the distributor. As an application example, the liquid mass was collected in a pilot unit, assembled at the Chemical Engineering Laboratory of Universidade Santa Cecília in Santos, with a bank of 21 vertical tubes of 52 mm internal diameter and 800 mm length. A 50 mm thick acrylic blanket, which does not disperse the liquid, was fixed between the distributor and the tube bank to avoid splashing. Tests were carried out with nine ladder-type (herringbone) distributors of 4 parallel tubes each, for a pilot column 400 mm in diameter. The literature disagrees on the design parameters and efficiency of these distributors. The number of orifices (n) was varied (95, 127, and 159 holes/m²; 12, 16, and 20 holes per distributor), as were the orifice diameter (d) (2, 3, and 4 mm) and the inlet flow rate indicated by a rotameter (q) (1.2, 1.4, and 1.6 m³/h). The best spreading efficiency, i.e. the lowest standard deviation (0.302), was obtained with n = 159 holes/m², d = 2 mm, and q = 1.4 m³/h, indicating the limitations of the design parameters found in the literature. The pressure (p) at the distributor inlet for this condition was only 0.51 kgf/cm². The dimensionless ratio between the cross-sectional area of the feed pipe and the total area of the holes was 5.81, the total volumetric flow rate per unit of column cross-section for this best condition was 11.32 m³/(h·m²), and the mean velocity (v) in each orifice was 6.31 m/s. The proposed method therefore makes it possible to compare and quantify the efficiency of distributors, and it demonstrates that some design parameters recommended in the literature are not valid.
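The dispersion measure itself is a one-line computation once the masses are collected. The snippet below is a worked example with made-up masses for the 21 collection tubes; the 0.302 figure above is the thesis's own experimental result and is not reproduced here.

```python
import numpy as np

# Hypothetical liquid masses (g) collected in the 21 tubes during one test.
masses = np.array([118, 122, 131, 125, 119, 127, 124, 130, 121, 126, 123,
                   129, 120, 128, 125, 122, 127, 124, 126, 123, 125], float)

# Dispersion of the collected mass: lower means more uniform spreading.
std = masses.std(ddof=1)
cv = std / masses.mean()   # dimensionless coefficient of variation
print(f"std = {std:.3f} g, relative dispersion = {cv:.3f}")
```

Repeating this computation for each (n, d, q) configuration and ranking by the dispersion value is, in essence, the comparison procedure the thesis proposes.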

APLICAÇÃO DE TÉCNICAS DE APRENDIZADO DE MÁQUINA PARA CLASSIFICAÇÃO DE DEPÓSITOS MINERAIS BASEADA EM MODELO TEOR-TONELAGEM / APPLICATION OF MACHINE LEARNING TECHNIQUES FOR THE CLASSIFICATION OF MINERAL DEPOSITS BASED ON GRADE-TONNAGE MODELS

Rocha, Jocielma Jerusa Leal 01 July 2010 (has links)
Classification of mineral deposits into types is traditionally done by experts. Since there are reasons to believe that computational techniques can aid this classification process and make it less subjective, investigating different clustering and classification methods in this domain is worthwhile. Research in this domain has moved toward using information available in large public databases and applying supervised machine learning techniques. This work uses information on mineral deposits available in grade-tonnage models published in the literature to investigate the suitability of three techniques: decision trees, multilayer perceptron networks, and probabilistic neural networks. Altogether, 1,861 mineral deposits of the 18 types identified by the grade-tonnage models are used. Initially, each of the three techniques is used to classify the deposits into the 18 types. Analysis of these results suggested that some deposit types could be treated as a group and that the classification could be divided into two levels: a first level that assigns a deposit to a group, and a second level that assigns a deposit previously placed in a group to one of the specific types belonging to that group. A series of experiments was carried out to build a two-level model from combinations of the techniques, which resulted in an average accuracy of 85%. Patterns of error occurrence were identified within groups, in the deposit types least represented in the database. This represents a promising way to improve the mineral deposit classification process without increasing the number of deposits used or the number of deposit characteristics.
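A minimal sketch of the two-level arrangement described above: a first-level model assigns a deposit to a group of types, and a per-group second-level model picks the specific type. Decision trees stand in for whichever of the three techniques is used at each level, and the group mapping and data are placeholders.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def fit_two_level(X, y_type, type_to_group):
    """First level predicts the group; one second-level model per group."""
    y_group = np.array([type_to_group[t] for t in y_type])
    top = DecisionTreeClassifier().fit(X, y_group)
    per_group = {g: DecisionTreeClassifier().fit(X[y_group == g],
                                                 y_type[y_group == g])
                 for g in np.unique(y_group)}
    return top, per_group

def predict_two_level(top, per_group, X):
    """Route each sample through its predicted group's classifier."""
    groups = top.predict(X)
    return np.array([per_group[g].predict(x.reshape(1, -1))[0]
                     for g, x in zip(groups, X)])

# Toy usage: 6 deposit types collapsed into 2 hypothetical groups.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = rng.integers(0, 6, size=300)
mapping = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
top, per_group = fit_two_level(X, y, mapping)
print(predict_two_level(top, per_group, X[:5]))
```

The appeal of the split is that each second-level model only has to separate the few types within its group, a much easier problem than the full 18-way classification.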

Description morpho-syntaxique du nââ numèè (langue de l'extrême-Sud, Nouvelle-Calédonie) / A morpho-syntactic description of Nââ Numèè (a language of the far south of New Caledonia)

Wacalie, Fabrice Saiqë 08 November 2013 (has links)
This thesis is a morpho-syntactic description of Nââ Numèè, a language of the far south of New Caledonia spoken by fewer than 1,000 speakers in the tribes of Touaourou and Goro in the commune of Yaté, and on Ouen Island in the commune of Mont-Dore. In the first part, the author covers the social history that made the far-south region one of the most affected by traditional wars, colonization, evangelization, the construction of the Yaté dam and, more recently, mining. In the second part, the author identifies the phonemes of the language and then compares the consonant, vowel, and tonal system thus established with the one described by Rivierre in the 1970s; this gap analysis highlights the phonological changes the language has undergone over almost forty years. The author then uses the traditional "distributional" method to define the different morphemes of the language, before proposing a categorization of lexemes. In the third part, the author deals with the morphological processes used in derivation and composition; of particular interest here is the increased use of verbal classifiers. In the fourth part, devoted to syntax, the author describes the noun phrase with its various modes and types of determination, including the noun substitutes, the deictics, and the numeral system. The author then describes the verb phrase, with particular focus on the aspect-temporal markers and serial verb constructions. In the part dedicated to adjuncts (circonstants), the author examines the orientation system the language has developed and the various reference frames used (absolute and relative). In the last part, the author presents the different types of clauses (simple, complex, and marked).

[en] PATTERN RECOGNITION APPLIED IN FINE ART AUTHENTICATION / [pt] RECONHECIMENTO DE PADRÕES APLICADO NA AUTENTICAÇÃO DE QUADROS DE ARTE

GUILHERME NOBREGA TEIXEIRA 29 August 2002 (has links)
Signatures and handwriting have been used for decades as a distinguishing mark of each individual. The methods used to recognize these characteristics rest on the fact that every person has their own way of moving the hand while writing. It is therefore reasonable to think that each painter has a personal way of striking the canvas with the brush, leaving a personal pattern of geometric accidents that could be used to identify them. From this principle comes the idea of applying computer vision to recognize painter-specific patterns that could be used in the process of authenticating works of art. This dissertation presents the results of research whose purpose is to develop a method for establishing the authenticity of paintings. A new procedure for segmenting the brush strokes of a painting, together with a new texture-measurement technique for capturing the signatures implicit in the strokes, is proposed. In addition, the work investigates non-parametric classification methods for discriminating between potential painters. The proposed method is evaluated with a set of experiments whose objective is to discriminate between two well-known Brazilian painters: Portinari and Bianco.
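As a sketch of the non-parametric classification step, the snippet below runs a plain k-nearest-neighbour vote over stroke texture feature vectors. The feature vectors are random placeholders; the thesis's stroke segmentation and texture measure are not reproduced here.

```python
import numpy as np

def knn_predict(train_X, train_y, x, k=5):
    """Majority vote among the k nearest training feature vectors."""
    d = np.linalg.norm(train_X - x, axis=1)
    nearest = train_y[np.argsort(d)[:k]]
    return np.bincount(nearest).argmax()

# Hypothetical texture feature vectors extracted from segmented strokes.
rng = np.random.default_rng(0)
painter_a = rng.normal(0.0, 1.0, size=(40, 8))   # e.g. Portinari strokes
painter_b = rng.normal(1.5, 1.0, size=(40, 8))   # e.g. Bianco strokes
X = np.vstack([painter_a, painter_b])
y = np.array([0] * 40 + [1] * 40)
print(knn_predict(X, y, rng.normal(1.5, 1.0, size=8)))  # expected: 1
```

Being non-parametric, such a classifier imposes no distributional form on the texture features, which is the property the dissertation exploits.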

Dataset selection for aggregate model implementation in predictive data mining

Lutu, P.E.N. (Patricia Elizabeth Nalwoga) 15 November 2010 (has links)
Data mining has become a commonly used method for the analysis of organisational data, for purposes of summarizing data in useful ways and identifying non-trivial patterns and relationships in the data. Given the large volumes of data that are collected by business, government, non-government and scientific research organizations, a major challenge for data mining researchers and practitioners is how to select relevant data for analysis in sufficient quantities, in order to meet the objectives of a data mining task. This thesis addresses the problem of dataset selection for predictive data mining. Dataset selection was studied in the context of aggregate modeling for classification. The central argument of this thesis is that, for predictive data mining, it is possible to systematically select many dataset samples and employ different approaches (different from current practice) to feature selection, training dataset selection, and model construction. When a large amount of information in a large dataset is utilised in the modeling process, the resulting models will have a high level of predictive performance and should be more reliable. Aggregate classification models, also known as ensemble classifiers, have been shown to provide a high level of predictive accuracy on small datasets; such models are known to achieve a reduction in the bias and variance components of the prediction error of a model. The research for this thesis was aimed at the design of aggregate models and the selection of training datasets from large amounts of available data, with the objective of reducing the bias and variance components of the prediction error of the aggregate models. Design science research was adopted as the paradigm for the research. Large datasets obtained from the UCI KDD Archive were used in the experiments, together with two classification algorithms: See5 for classification-tree modeling and K-nearest neighbour. The two methods of aggregate modeling studied are One-Vs-All (OVA) and positive-Vs-negative (pVn) modeling. While OVA is an existing method that has been used for small datasets, pVn is a new method of aggregate modeling proposed in this thesis. Methods for feature selection and for training dataset selection from large datasets, for OVA and pVn aggregate modeling, were studied. The feature-selection experiments revealed that the use of many samples, robust measures of correlation, and validation procedures results in the reliable selection of relevant features for classification. A new algorithm for feature subset search, based on the decision rule-based approach to heuristic search, was designed, and its performance was compared to two existing algorithms for feature subset search; the experimental results revealed that the new algorithm makes better decisions for feature subset search. The information provided by a confusion matrix was used as a basis for the design of the OVA and pVn base models, which are then combined into one aggregate model. A new construct called a confusion graph was used in conjunction with new algorithms for the design of pVn base models, and a new algorithm for combining base model predictions and resolving conflicting predictions was designed and implemented. Experiments to study the performance of the OVA and pVn aggregate models revealed that the aggregate models provide a high level of predictive accuracy compared to single models.
Finally, theoretical models depicting the relationships between the factors that influence feature selection and training dataset selection for aggregate models are proposed, based on the experimental results. / Thesis (PhD)--University of Pretoria, 2010.
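A minimal sketch of the OVA side of the aggregate-modeling scheme: one binary base model per class, with conflicting "yes" answers resolved by the highest class score. The pVn construction and the confusion-graph design proposed in the thesis are not reproduced; scikit-learn decision trees stand in for See5, and all parameter values are illustrative.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def fit_ova(X, y):
    """One binary base model per class (class c vs. the rest)."""
    return {c: DecisionTreeClassifier(max_depth=5).fit(X, (y == c).astype(int))
            for c in np.unique(y)}

def predict_ova(models, X):
    """Each base model scores 'its' class; the highest score wins, which
    also resolves samples claimed by several (or no) base models."""
    classes = sorted(models)
    scores = np.stack(
        [models[c].predict_proba(X)[:, list(models[c].classes_).index(1)]
         for c in classes])
    return np.array(classes)[scores.argmax(axis=0)]

# Toy usage with three classes.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(300, 4)), rng.integers(0, 3, size=300)
models = fit_ova(X, y)
print(predict_ova(models, X[:5]))
```

The pVn variant replaces the "one class vs. all others" decomposition with base models that each separate a subset of positive classes from a subset of negative ones, chosen from the confusion structure of the problem.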

Hard Drive Failure Prediction: A Rule-Based Approach

Agrawal, Vipul 07 1900 (has links) (PDF)
The ability to accurately predict an impending hard disk failure is important for reliable storage system design. The facility provided by most hard drive manufacturers, called S.M.A.R.T. (self-monitoring, analysis and reporting technology), has been shown by current research to have poor predictive value. The problem of finding alternatives to S.M.A.R.T. for predicting disk failure is an area of active research. In this work, we present a rule discovery methodology and show that it is possible to construct decision support systems that can detect such failures using information recorded from live disks. Any such prediction methodology should be highly accurate and easy to interpret: black-box models can deliver highly accurate solutions but do not provide an understanding of the events that explain their decisions. To this end, we explore rule-based classifiers for predicting hard disk failures from various disk events and show that it is possible to learn easy-to-understand rules from them. Our evaluation shows that our system can be tuned either to have a high failure detection rate (i.e., classify a bad disk as bad) or to have a low false alarm rate (i.e., not classify a good disk as bad). We also propose a modification of the MLRules algorithm for the classification of data with imbalanced class distributions. The existing algorithm, which assumes relatively balanced class distributions and equal misclassification costs, performs poorly on such datasets; its performance can be considerably improved by introducing cost-sensitive learning to the existing framework.
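The tunability claimed above (high detection rate vs. low false-alarm rate) comes down to where the decision threshold is placed on the model's scores. The snippet below illustrates the trade-off on synthetic scores; the numbers stand in for the rule-based model's outputs and are not from the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic rule scores: 50 failing disks, 950 healthy disks.
scores = np.concatenate([rng.normal(0.7, 0.15, 50),
                         rng.normal(0.3, 0.15, 950)])
labels = np.array([1] * 50 + [0] * 950)

for thr in (0.4, 0.5, 0.6):
    pred = scores >= thr
    detection = pred[labels == 1].mean()    # true positive rate
    false_alarm = pred[labels == 0].mean()  # false positive rate
    print(f"thr={thr}: detection={detection:.2f}, "
          f"false alarms={false_alarm:.3f}")
```

Lowering the threshold raises the detection rate at the cost of more false alarms, and vice versa; cost-sensitive learning, as proposed for MLRules, bakes this asymmetry into training instead of applying it only at prediction time.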

Efficient Feature Extraction for Shape Analysis, Object Detection and Tracking

Solis Montero, Andres January 2016 (has links)
During the course of this thesis, two scenarios are considered. In the first, we contribute to feature extraction algorithms; in the second, we use features to improve object detection and localization. The two scenarios give rise to four thesis sub-goals. First, we present a new shape-skeleton pruning algorithm based on contour approximation and the integer medial axis. The algorithm effectively removes unwanted branches, conserves the connectivity of the skeleton, and respects the topological properties of the shape; it is robust to significant boundary noise and to rigid shape transformations, and it is fast and easy to implement. While shape-based solutions via boundary and skeleton analysis are viable for object detection, keypoint features are important for textured object detection. Second, therefore, we present a keypoint-feature-based planar object detection framework for vision-based localization. We demonstrate that our framework is robust against illumination changes, perspective distortion, motion blur, and occlusions; we increase the robustness of the localization scheme in cluttered environments and decrease false detections of targets; and we present an off-line target evaluation strategy and a scheme to improve pose estimates. Third, we extend planar object detection to a real-time approach for 3D object detection using a mobile, uncalibrated camera. We develop our algorithm based on two novel naive Bayes classifiers, for viewpoint and feature matching, that improve performance and decrease memory usage. Our algorithm exploits the specific structure of various binary descriptors in order to boost feature matching by conserving descriptor properties, and our novel naive classifiers require a database with a small memory footprint because we store only efficiently encoded features. We also improve the feature-indexing scheme to speed up the matching process, creating a highly efficient object database. Finally, we present a model-free long-term tracking algorithm based on the kernelized correlation filter. The proposed solution improves on the correlation tracker in precision, success, accuracy, and robustness while increasing frame rates. We integrate an adjustable Gaussian window and sparse features for robust scale estimation, creating a better separation of target and background, and we include fast descriptors and a packed Fourier-spectrum format to boost performance while decreasing the memory footprint. We compare our algorithm with state-of-the-art techniques to validate the results.
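As a sketch of the correlation-filter machinery the tracker builds on, the snippet below trains a single-frame MOSSE-style filter in the Fourier domain; the thesis's kernelized variant with adjustable Gaussian windows and packed spectra is not reproduced. The desired response is a Gaussian whose peak is shifted to the origin, so a response peak at (0, 0) means zero displacement of the target.

```python
import numpy as np

def gaussian_label(h, w, sigma=2.0):
    """Desired response: a Gaussian centred on the target, shifted so its
    peak sits at the origin (the usual correlation-filter convention)."""
    ys, xs = np.mgrid[0:h, 0:w]
    g = np.exp(-((ys - h // 2) ** 2 + (xs - w // 2) ** 2) / (2 * sigma ** 2))
    return np.fft.fft2(np.fft.ifftshift(g))

def train_filter(patch, lam=1e-2):
    """Regularized least-squares correlation filter in the Fourier domain."""
    F = np.fft.fft2(patch)
    G = gaussian_label(*patch.shape)
    return (G * np.conj(F)) / (F * np.conj(F) + lam)

def locate(filt, patch):
    """Correlate and return the response peak (the target displacement)."""
    response = np.real(np.fft.ifft2(filt * np.fft.fft2(patch)))
    return np.unravel_index(response.argmax(), response.shape)

patch = np.random.default_rng(0).normal(size=(64, 64))
H = train_filter(patch)
print(locate(H, patch))                      # (0, 0): no displacement
print(locate(H, np.roll(patch, 5, axis=0)))  # ~(5, 0): shifted target
```

Because training and correlation are element-wise products in the Fourier domain, the per-frame cost is dominated by FFTs, which is what allows correlation trackers to run at high frame rates.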
