Global ETD Search

131	Social training : aprendizado semi supervisionado utilizando funções de escolha social / Social-Training: Semi-Supervised Learning Using Social Choice Functions Alves, Matheus January 2017 (has links) Dada a grande quantidade de dados gerados atualmente, apenas uma pequena porção dos mesmos pode ser rotulada manualmente por especialistas humanos. Isso é um desafio comum para aplicações de aprendizagem de máquina. Aprendizado semi-supervisionado aborda este problema através da manipulação dos dados não rotulados juntamente aos dados rotulados. Entretanto, se apenas uma quantidade limitada de exemplos rotulados está disponível, o desempenho da tarefa de aprendizagem de máquina (e.g., classificação) pode ser não satisfatória. Diversas soluções abordam este problema através do uso de uma ensemble de classificadores, visto que essa abordagem aumenta a diversidade dos classificadores. Algoritmos como o co-training e o tri-training utilizam múltiplas partições de dados ou múltiplos algoritmos de aprendizado para melhorar a qualidade da classificação de instâncias não rotuladas através de concordância por maioria simples. Além disso, existem abordagens que estendem esta ideia e adotam processos de votação menos triviais para definir os rótulos, como eleição por maioria ponderada, por exemplo. Contudo, estas soluções requerem que os rótulos possuam um certo nível de confiança para serem utilizados no treinamento. Consequentemente, nem toda a informação disponível é utilizada. Por exemplo: informações associadas a níveis de confiança baixos são totalmente ignoradas. Este trabalho propõe uma abordagem chamada social-training, que utiliza toda a informação disponível na tarefa de aprendizado semi-supervisionado. Para isto, múltiplos classificadores heterogêneos são treinados com os dados rotulados e geram diversas classificações para as mesmas instâncias não rotuladas. 
O social-training, então, agrega estes resultados em um único rótulo por meio de funções de escolha social que trabalham com agregação de rankings sobre as instâncias. Especificamente, a solução trabalha com casos de classificação binária. Os resultados mostram que trabalhar com o ranking completo, ou seja, rotular todas as instâncias não rotuladas, é capaz de reduzir o erro de classificação para alguns conjuntos de dados da base da UCI utilizados. / Given the huge quantity of data currently being generated, just a small portion of it can be manually labeled by human experts. This is a challenge for machine learning applications. Semi-supervised learning addresses this problem by handling unlabeled data alongside labeled ones. However, if only a limited quantity of labeled examples is available, the performance of the machine learning task (e.g., classification) can be very unsatisfactory. Many solutions address this issue by using a classifier ensemble because this increases diversity. Algorithms such as co-training and tri-training use multiple views or multiple learning algorithms in order to improve the classification of unlabeled instances through simple majority agreement. Also, there are approaches that extend this idea and adopt less trivial voting processes to define the labels, like weighted majority voting. Nevertheless, these solutions require some confidence level on the label in order to use it for training. Hence, not all information is used, i.e., information associated with low confidence level is disregarded completely. An approach called social-training is proposed, which uses all information available in the semi-supervised learning task. For this, multiple heterogeneous classifiers are trained with the labeled data and generate diverse classifications for the same unlabeled instances. Social-training then aggregates these results into a single label by means of social choice functions that work with rank aggregation over the instances. 
The solution addresses binary classification cases. The results show that working with the full ranking, i.e., labeling all unlabeled instances, is able to reduce the classification error for some UCI data sets used. Aprendizado : máquina Gestão do conhecimento Semi-supervised learning Social choice functions Classifier ensembles
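As a rough illustration of the rank-aggregation idea in entry 131, the sketch below combines the rankings of several classifiers with a Borda count and then labels every unlabeled instance. The abstract does not say which social choice functions the thesis uses, so the Borda count, the function names and the `positive_fraction` cut-off are all assumptions made for this example.

```python
def borda_aggregate(score_lists):
    """Aggregate per-classifier rankings of instances with a Borda count.

    score_lists holds one list per classifier; each list gives that
    classifier's positive-class score for every unlabeled instance.
    Returns the total Borda points per instance (higher = more positive).
    """
    n = len(score_lists[0])
    points = [0] * n
    for scores in score_lists:
        # Rank instances from least to most likely positive.
        # Ties are broken by instance index because sorted() is stable.
        order = sorted(range(n), key=lambda i: scores[i])
        for rank, idx in enumerate(order):
            # An instance earns one point per instance ranked below it.
            points[idx] += rank
    return points


def label_all(points, positive_fraction=0.5):
    """Label every instance: the top share of the aggregated ranking is 1.

    positive_fraction is a hypothetical cut-off chosen for illustration.
    """
    n = len(points)
    n_pos = round(n * positive_fraction)
    ranked = sorted(range(n), key=lambda i: points[i], reverse=True)
    labels = [0] * n
    for idx in ranked[:n_pos]:
        labels[idx] = 1
    return labels


if __name__ == "__main__":
    scores = [
        [0.9, 0.2, 0.6, 0.4],  # classifier 1
        [0.8, 0.1, 0.3, 0.7],  # classifier 2
        [0.7, 0.3, 0.9, 0.2],  # classifier 3
    ]
    print(label_all(borda_aggregate(scores)))
```

Because the whole ranking is used, low-confidence predictions still contribute points instead of being discarded by a confidence threshold.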
132	Ordering Classifier Chains using filter model feature selection techniques Gustafsson, Robin January 2017 (has links) Context: Multi-label classification concerns classification with multi-dimensional output. The Classifier Chain breaks the multi-label problem into multiple binary classification problems, chaining the classifiers to exploit dependencies between labels. Consequently, its performance is influenced by the chain's order. Approaches to finding advantageous chain orders have been proposed, though they are typically costly. Objectives: This study explored the use of filter model feature selection techniques to order Classifier Chains. It examined how feature selection techniques can be adapted to evaluate label dependence, how such information can be used to select a chain order and how this affects the classifier's performance and execution time. Methods: An experiment was performed to evaluate the proposed approach. The two proposed algorithms, Forward-Oriented Chain Selection (FOCS) and Backward-Oriented Chain Selection (BOCS), were tested with three different feature evaluators. 10-fold cross-validation was performed on ten benchmark datasets. Performance was measured in accuracy, 0/1 subset accuracy and Hamming loss. Execution time was measured during chain selection, classifier training and testing. Results: Both proposed algorithms led to improved accuracy and 0/1 subset accuracy (Friedman & Hochberg, p < 0.05). FOCS also improved the Hamming loss while BOCS did not. Measured effect sizes ranged from 0.20 to 1.85 percentage points. Execution time was increased by less than 3 % in most cases. Conclusions: The results showed that the proposed approach can improve the Classifier Chain's performance at a low cost. The improvements appear similar to comparable techniques in magnitude but at a lower cost. 
It shows that feature selection techniques can be applied to chain ordering, demonstrates the viability of the approach and establishes FOCS and BOCS as alternatives worthy of further consideration. multi-label classification classifier chain label dependence Computer Sciences Datavetenskap (datalogi)
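The three performance measures named in entry 132 can be sketched as follows. This is a minimal illustration rather than the thesis's evaluation code; in particular, reading "accuracy" as the example-based Jaccard score is an assumption, and the function names are made up for the example.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of individual label positions that are predicted wrongly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return wrong / total


def subset_accuracy(y_true, y_pred):
    """0/1 subset accuracy: an example counts only if every label matches."""
    exact = sum(t == p for t, p in zip(y_true, y_pred))
    return exact / len(y_true)


def multilabel_accuracy(y_true, y_pred):
    """Example-based accuracy, taken here to be the mean Jaccard score."""
    scores = []
    for true_row, pred_row in zip(y_true, y_pred):
        inter = sum(1 for t, p in zip(true_row, pred_row) if t == 1 and p == 1)
        union = sum(1 for t, p in zip(true_row, pred_row) if t == 1 or p == 1)
        # An example with no true and no predicted labels counts as correct.
        scores.append(inter / union if union else 1.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    y_true = [[1, 0, 1], [0, 1, 0]]
    y_pred = [[1, 0, 0], [0, 1, 0]]
    print(hamming_loss(y_true, y_pred))
    print(subset_accuracy(y_true, y_pred))
    print(multilabel_accuracy(y_true, y_pred))
```

Subset accuracy is the strictest of the three, which is one reason a chain order that captures label dependencies can move it more than Hamming loss.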
133	SVM-based algorithms for aligning ontologies using literature Xu, Wei January 2008 (has links) Ontologies are one of the key technologies used in establishing the Semantic Web. Many ontologies have now been developed, and it is critical to understand the relationships between their terms, i.e. we need to align the ontologies. This thesis deals with an approach for finding relationships between ontologies using literature, by classifying documents related to terms in the ontologies. The project uses the general method from [1], but in the classifier generation part a new SVM-based classifier is implemented using LPU and SVMlight. We evaluate our approach and compare it to previous approaches. ontology alignment text classifier SVM LPU SVM-light Computer Sciences Datavetenskap (datalogi)
134	Methods for dynamic selection and fusion of ensemble of classifiers Oliveira e Cruz, Rafael Menelau 31 January 2011 (has links) Made available in DSpace on 2014-06-12T15:58:13Z (GMT). No. of bitstreams: 2 arquivo3310_1.pdf: 8155353 bytes, checksum: 2f4dcd5adb2b0b1a23c40bf343b36b34 (MD5) license.txt: 1748 bytes, checksum: 8a4605be74aa9ea9d79846c1fba20a33 (MD5) Previous issue date: 2011 / Faculdade de Amparo à Ciência e Tecnologia do Estado de Pernambuco / Ensemble of Classifiers (EoC) is a new alternative for achieving high recognition rates in pattern recognition systems. The use of ensembles is motivated by the fact that different classifiers can recognize different patterns and are therefore complementary. In this work, EoC methodologies are explored with the aim of improving the recognition rate in different problems. First, the character recognition problem is addressed. This work proposes a new methodology that uses multiple feature extraction techniques, each taking a different approach (edges, gradient, projections). Each technique is treated as a sub-problem with its own classifier. The outputs of these classifiers are used as input to a new classifier that is trained to combine (fuse) the results. Experiments show that the proposal achieved the best result in the literature for both digit recognition and letter recognition problems. The second part of the dissertation deals with dynamic classifier selection (DCS). This strategy is motivated by the fact that not every classifier in the ensemble is an expert for every test pattern. Dynamic selection tries to select only the classifiers that perform best in a given region close to the input pattern in order to classify it. 
A study of the behavior of DCS techniques is carried out, showing that they are limited by the quality of the region around the input pattern. Based on this analysis, two techniques for dynamic classifier selection are proposed. The first uses filters to reduce noise near the test pattern. The second is a new proposal that extracts different types of information from the behavior of the classifiers and uses this information to decide whether or not a classifier should be selected. Experiments conducted on several pattern recognition problems show that the proposed techniques yield a significant performance increase. Handwritten Recognition Feature Extraction Ensemble of Classifier Dynamic Ensemble Selection Regions of Competence Neural Networks
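The dynamic classifier selection idea in entry 134 (pick the classifier that performs best in the region around the input pattern) can be sketched as below. This is a generic local-accuracy selector written for illustration, not either of the two techniques proposed in the dissertation; the function names, the validation-set format and the choice of `k` are assumptions.

```python
import math


def nearest_neighbors(x, validation, k):
    """Return the k validation samples closest to x (Euclidean distance).

    validation is a list of (features, label) pairs.
    """
    return sorted(validation, key=lambda sample: math.dist(x, sample[0]))[:k]


def select_classifier(x, classifiers, validation, k=3):
    """Pick the classifier with the highest accuracy in the region around x.

    Each classifier is a callable mapping a feature tuple to a label.
    On ties, the first classifier in the list wins.
    """
    region = nearest_neighbors(x, validation, k)

    def local_accuracy(clf):
        return sum(clf(features) == label for features, label in region) / len(region)

    return max(classifiers, key=local_accuracy)


if __name__ == "__main__":
    # Two toy classifiers, each looking at a different feature.
    clf_a = lambda f: int(f[0] > 0.5)
    clf_b = lambda f: int(f[1] > 0.5)
    validation = [
        ((0.1, 0.9), 1), ((0.2, 0.8), 1), ((0.15, 0.7), 1),
        ((0.9, 0.1), 1), ((0.8, 0.2), 1), ((0.85, 0.3), 1),
    ]
    chosen = select_classifier((0.1, 0.8), [clf_a, clf_b], validation)
    print(chosen((0.1, 0.8)))
```

The sketch also makes the limitation noted in the abstract visible: if the neighbors in `region` are noisy or mislabeled, the local accuracy estimate, and therefore the selection, degrades.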
135	Comparação do desempenho do classificador de novidades com o classificador do vizinho mais próximo no reconhecimento facial Falcão, Thiago Azevedo 13 January 2014 (has links) Made available in DSpace on 2015-04-22T22:00:50Z (GMT). No. of bitstreams: 1 Thiago Falcao.pdf: 1370921 bytes, checksum: ec7b9ab219f2028eded75407403140be (MD5) Previous issue date: 2014-01-13 / This work proposes a new classifier for face recognition, the novelty classifier, which is based on the novelty filter proposed by Kohonen. To evaluate the new classifier's performance, it is compared with the nearest neighbor classifier, which uses the Euclidean distance as its distance metric. The ORL face database was chosen for this comparison. No pre-processing (photometric or geometric) was applied to the face images. The following feature extraction methods were used: PCA, 2DPCA and (2D)2PCA. Results in identification mode are reported through the rank-1 recognition rate and CMC curves. In verification mode, the results are presented through the Correct Acceptance Rate (CAR), Equal Error Rate (EER), ROC curves and Area under the ROC curve (AUC). The results show that the proposed classifier performs better than the nearest neighbor classifier and than other classifiers previously published on the same database when 10-fold cross-validation is employed as the test strategy. A recognition rate of 100% is achieved with this test methodology. / Este trabalho propõe a utilização do classificador de novidades para reconhecimento de faces, o qual é baseado no filtro de novidades, proposto por Kohonen. Para avaliar o desempenho do novo classificador é feita uma comparação com o classificador do vizinho mais próximo, usando a métrica da distância euclidiana. A base de dados utilizada para essa comparação foi a base ORL. A informação da face é extraída utilizando os métodos PCA, 2DPCA e (2D)2PCA, sem usar qualquer tipo de pré-processamento (fotométrico ou geométrico). 
Os seguintes resultados são apresentados no modo de identificação: taxa de reconhecimento rank 1 e as curvas CMC, no modo verificação: as taxas de correta aceitação (CAR), de erro equivalente (EER), as curvas ROC e área sob a curva ROC (AUC). Os resultados obtidos mostraram que o classificador proposto tem um desempenho melhor do que o desempenho do vizinho mais próximo e do que outros classificadores anteriormente publicados usando a mesma base, quando a estratégia de validação cruzada 10-fold é usada, com essa estratégia a taxa de reconhecimento obtida foi de 100% Sistemas biométricos Reconhecimento facial Classificador de novidades Biometric system Face recognition Novelty classifier ENGENHARIAS: ENGENHARIA ELÉTRICA
136	Sistemas classificadores evolutivos para problemas multirrótulo / Learning classifier system for multi-label classification Rosane Maria Maffei Vallim 27 July 2009 (has links) Classificação é, provavelmente, a tarefa mais estudada na área de Aprendizado de Máquina, possuindo aplicação em uma grande quantidade de problemas reais, como categorização de textos, diagnóstico médico, problemas de bioinformática, além de aplicações comerciais e industriais. De um modo geral, os problemas de classificação podem ser categorizados quanto ao número de rótulos de classe que podem ser associados à cada exemplo de entrada. A abordagem mais investigada pela comunidade de Aprendizado de Máquina é a de classes mutuamente exclusivas. Entretanto, existe uma grande variedade de problemas importantes em que cada exemplo de entrada pode ser associado a mais de um rótulo ou classe. Esses problemas são denominados problemas de classificação multirrótulo. Os Learning Classifier Systems(LCS) constituem uma técnica de Indução de Regras de Classificação que tem como principal mecanismo de busca um Algoritmo Genético. Essa técnica busca encontrar um conjunto de regras que tenha alta precisão de classificação, que seja compreensível e que possua regras consideradas interessantes sob o ponto de vista de classificação. Apesar de existirem na literatura diversos trabalhos sobre os LCS para problemas de classificação com classes mutuamente exclusivas, pouco se tem conhecimento sobre um LCS que seja capaz de lidar com problemas multirrótulo. Dessa maneira, o objetivo desta monografia é apresentar uma proposta de LCS para problemas multirrótulo, que pretende induzir um conjunto de regras de classificação que produza um resultado eficaz e comparável com outras técnicas de classificação. 
De acordo com esse objetivo, apresenta-se também uma revisão bibliográfica dos temas envolvidos na proposta, que são: Sistemas Classificadores Evolutivos e Classificação Multirrótulo / Classification is probably the most studied task in Machine Learning, with applications in a broad range of real problems such as text categorization, medical diagnosis and bioinformatics, as well as commercial and industrial applications. In general, classification problems can be categorized by the number of class labels that can be associated with each input instance. The approach most studied by the Machine Learning community is the one that considers mutually exclusive classes. However, there is a large variety of important problems in which each instance can be associated with more than one class label. These problems are called multi-label classification problems. Learning Classifier Systems (LCS) are a rule induction technique that uses a Genetic Algorithm as the primary search mechanism. This technique searches for sets of rules that have high classification accuracy and that are also understandable and interesting from the classification point of view. Although there are several works on LCS for classification problems with mutually exclusive classes, there is no record of an LCS that can deal with the multi-label classification problem. The objective of this work is to propose an LCS for multi-label classification that builds a set of classification rules achieving results that are effective and comparable to other multi-label methods. In accordance with this objective, this work also presents a review of the themes involved: Learning Classifier Systems and Multi-label Classification. Algoritmos genéticos Classificação multirrótulo Sistemas classificadores evolutivos Genetic algorithms Learning classifier systems Multi-label classification
137	Combinação de classificadores para detecção de fraudes em sinistros de automóveis. Rodrigues, Luis Alexandre 05 August 2014 (has links) Made available in DSpace on 2016-03-15T19:37:51Z (GMT). No. of bitstreams: 1 Luis Alexandre Rodrigues.pdf: 1364668 bytes, checksum: ac6c4273730fb6f75f7a0ceead7e4c1f (MD5) Previous issue date: 2014-08-05 / Universidade Presbiteriana Mackenzie / This work presents a process for detecting suspected cases of fraud in a dataset of automobile claims, and evaluates the financial savings it generates. Because a detection process produces misclassifications, it is necessary to evaluate the financial savings achieved by the process and not only its accuracy in detecting suspected cases of fraud. The process uses a combination of classifiers (C4.5 Decision Tree, Naive Bayes and Support Vector Machine) built from samples of the automobile claims dataset. In this way, the process defined in this work can balance classification accuracy against financial savings. / Este trabalho apresenta um processo para detectar casos suspeitos de fraude em conjunto de dados com sinistros de automóvel, em que é avaliada a economia financeira gerada por ele. Devido ao fato de um processo de detecção apresentar erros de classificação, é necessário avaliar a economia financeira apresentada pelo processo e não somente a sua precisão na detecção de casos suspeitos de fraude. Este processo utiliza a combinação de classificadores, sendo Árvore de Decisão C4.5, Naive Bayes e Support Vector Machine, construídos por amostras do conjunto de dados com sinistros de automóvel. Desta forma, o processo definido por este trabalho pode obter o equilíbrio entre a precisão da classificação e a economia financeira. detecção de fraude combinação de classificadores mineração de dados fraud detection multi classifier data mining CNPQ::ENGENHARIAS::ENGENHARIA ELETRICA
138	Instance-based ontology alignment using decision trees Boujari, Tahereh January 2012 (has links) Ontologies are a key technology in the semantic web. The semantic web helps people store their data on the web, build vocabularies and write rules for handling these data, and it also helps search engines more easily distinguish the information they want to access on the web. In order to use multiple ontologies created by different experts, we need matchers that find similar concepts in them so that the ontologies can be merged. Text-based searches use string similarity functions to find equivalent concepts in ontologies by their names. This is the method used in lexical matchers. However, a global standard for naming concepts across research areas either does not exist or has not been used. The same name may refer to different concepts, while different names may describe the same concept. To solve this problem, we can use another approach to calculating the similarity value between concepts, which is used in structural and constraint-based matchers. It uses relations between concepts, synonyms and other information stored in the ontologies. Another category of matchers is instance-based; these use additional information, such as documents related to the concepts of the ontologies (the corpus), to calculate the similarity value for the concepts. Decision trees are used in data mining for different kinds of classification and for different purposes. Using decision trees in an instance-based matcher is the main idea of this thesis. The results of the implemented matcher, which uses the C4.5 algorithm, are discussed. The matcher is also compared to other matchers, and it is combined with other matchers to get a better result. Biomedical ontologies ontology alignment decision tree classifier Computer Sciences Datavetenskap (datalogi)
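For the lexical matchers mentioned in entry 138, a name-based comparison could look roughly like the sketch below. It uses `difflib.SequenceMatcher` from the Python standard library as the string similarity function; the thesis does not say which similarity function its matchers use, so that choice, the function name `lexical_matches` and the 0.8 threshold are assumptions for illustration only.

```python
from difflib import SequenceMatcher


def lexical_matches(names_a, names_b, threshold=0.8):
    """Pair up concept names from two ontologies whose strings are similar.

    Similarity is the SequenceMatcher ratio on lower-cased names; pairs at
    or above the (hypothetical) threshold are reported as candidate matches.
    """
    matches = []
    for a in names_a:
        for b in names_b:
            score = SequenceMatcher(None, a.lower(), b.lower()).ratio()
            if score >= threshold:
                matches.append((a, b))
    return matches


if __name__ == "__main__":
    ontology_a = ["heart", "lung"]
    ontology_b = ["Heart", "lungs", "kidney"]
    print(lexical_matches(ontology_a, ontology_b))
```

A matcher like this only sees the names, so it cannot tell apart two different concepts that share a name or link two names for the same concept, which is the gap that structural and instance-based matchers are meant to cover.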
139	A Hybrid heuristic-exhaustive search approach for rule extraction Rodic, Daniel 29 May 2006 (has links) The topic of this thesis is knowledge discovery and artificial intelligence based knowledge discovery algorithms. The knowledge discovery process and associated problems are discussed, followed by an overview of three classes of artificial intelligence based knowledge discovery algorithms. Typical representatives of each of these classes are presented and discussed in greater detail. Then a new knowledge discovery algorithm, called Hybrid Classifier System (HCS), is presented. The guiding concept behind the new algorithm was simplicity. The new knowledge discovery algorithm is loosely based on schemata theory. It is evaluated against one of the discussed algorithms from each class, namely CN2, C4.5, BRAINNE and BGP. The comparison was done using a benchmark of classification problems, and the results are discussed and compared. These results show that the new knowledge discovery algorithm performs satisfactorily, yielding accurate, crisp rule sets. Probably the main strength of the HCS algorithm is its simplicity, so it can be the foundation for many possible future extensions. Some possible extensions of the newly proposed algorithm are suggested in the final part of this thesis. / Dissertation (MSc)--University of Pretoria, 2007. / Computer Science / unrestricted Hybrid classifier system Artificial intelligence Data mining algorithms Hcs Automated knowledge UCTD
140	Automatic Segmentation of Knee Cartilage Using Quantitative MRI Data Lind, Marcus January 2017 (has links) This thesis investigates whether support vector machine classification is a suitable approach for automatic segmentation of knee cartilage using quantitative magnetic resonance imaging data. The data sets used are part of a clinical project that investigates whether patients who have suffered recent knee damage will develop cartilage damage. The thesis therefore also investigates whether the segmentation results can be used to predict the clinical outcome of the patients. Two methods that perform the segmentation using support vector machine classification are implemented and evaluated. The evaluation indicates that it is a good approach for the task, but the implemented methods need to be further improved and tested on more data sets before clinical use. It was not possible to relate the cartilage properties to clinical outcome using the segmentation results. However, the investigation showed promise in how the segmentation results, if they are improved, can be used in combination with quantitative magnetic resonance imaging data to analyze how the cartilage properties change over time or vary between knees. qMRI quantitative MRI automatic segmentation knee cartilage SVM classifier Medical Image Processing Medicinsk bildbehandling
