Global ETD Search

81	Um algoritmo de vida artificial para agrupamento de dados variantes no tempo / An artificial life algorithm for clustering time-varying data
Santos, Diego Gadens dos. 14 September 2012
Funding: Fundo Mackenzie de Pesquisa
The data generation and storage capacity of current technologies has produced databases of a wide variety of types and sizes. Extracting useful, non-trivial knowledge from large databases, however, is much harder than creating them. This gap has stimulated the search for efficient techniques that extract the knowledge intrinsic to these large data sets and support strategic decision-making. One of the many knowledge-extraction tasks is clustering, which segments a database into groups whose objects are more similar to one another than to objects in other groups. Although the area is very active, little has been done to develop and investigate clustering algorithms for time-varying data such as financial transactions, climate data, and information and messages posted on social networks. Given the practical relevance of this type of analysis and the growing interest in biologically inspired algorithms, this work proposes a bioinspired algorithm, based on the Boids artificial life model, for clustering data in dynamic environments, i.e. in databases that are updated over time. The Boids algorithm was originally created only to simulate the coordinated movement observed in flocks of birds and other animals, so using it for data clustering required modifications to the classical rules of cohesion, separation and alignment so that they take into account the distance (similarity/dissimilarity) among objects. The resulting objects position themselves and move around the space so as to represent the natural groups in the data: similar objects are expected to form dynamic clusters (groups) of Boids, while dissimilar objects tend to keep a larger distance from one another. The intrinsically dynamic character of Boids makes the proposed algorithm, named dcBoids (dynamic clustering Boids), a natural candidate for clustering time-varying data. The results attest to the robustness of the method in this setting, under different performance evaluation measures and on several databases from the literature with artificially inserted dynamics.
Keywords: artificial life; natural computing; data mining; data clustering; boids; time-varying data
CNPQ::ENGENHARIAS::ENGENHARIA ELETRICA
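The idea of weighting the classical Boids rules by data similarity can be illustrated with a minimal Python sketch. It is not the dcBoids algorithm from the thesis: the exponential similarity function, the `scale` parameter and the way the three rules are combined are all assumptions chosen for illustration. Each agent has a position and a velocity in a 2-D space and carries one data object; cohesion and alignment are weighted by similarity, and separation by dissimilarity.

```python
import math

def similarity(a, b, scale=1.0):
    """Map the Euclidean distance between two data objects to (0, 1].

    Hypothetical choice: exp(-d / scale). Identical objects give 1.0.
    """
    return math.exp(-math.dist(a, b) / scale)

def steering(i, positions, velocities, data, scale=1.0):
    """Return the (x, y) steering vector for agent i.

    Assumed rule weighting (not taken from the thesis):
      cohesion   pulls agent i toward agents whose data are similar,
      separation pushes agent i away from agents whose data are dissimilar,
      alignment  matches velocity with similar agents.
    """
    px, py = positions[i]
    vx, vy = velocities[i]
    fx = fy = 0.0
    for j, (qx, qy) in enumerate(positions):
        if j == i:
            continue
        s = similarity(data[i], data[j], scale)
        dx, dy = qx - px, qy - py
        # Cohesion: move toward j in proportion to similarity.
        fx += s * dx
        fy += s * dy
        # Separation: move away from j in proportion to dissimilarity.
        fx -= (1.0 - s) * dx
        fy -= (1.0 - s) * dy
        # Alignment: adopt j's velocity in proportion to similarity.
        fx += s * (velocities[j][0] - vx)
        fy += s * (velocities[j][1] - vy)
    n = len(positions) - 1
    return (fx / n, fy / n) if n > 0 else (0.0, 0.0)

# Agent 0 sits between a similar agent (right) and a dissimilar one (left).
positions = [(0.0, 0.0), (1.0, 0.0), (-1.0, 0.0)]
velocities = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]
data = [(0.0,), (0.1,), (5.0,)]
fx, fy = steering(0, positions, velocities, data)
```

In this toy configuration the steering vector points toward the similar agent and away from the dissimilar one, which is the qualitative behaviour the abstract describes.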
82	Agrupamento nebuloso de dados baseado em enxame de partículas: seleção por métodos evolutivos e combinação via relação nebulosa do tipo-2 / Particle swarm-based fuzzy data clustering: selection by evolutionary methods and combination via a type-2 fuzzy relation
Szabo, Alexandre. 29 October 2014
Funding: Fundação de Amparo a Pesquisa do Estado de São Paulo
Clustering traditionally treats the objects in a database as belonging to mutually exclusive clusters, which is not always accurate, because an object may belong to more than one cluster simultaneously, with different membership degrees. Clustering algorithms, whether crisp or fuzzy (able to handle multiple simultaneous memberships), have a number of parameters that must be adjusted to obtain the best performance on a given database. Furthermore, it is known that no single algorithm is better than all others for all problem classes, and that combining the solutions found by various algorithms (or by the same algorithm with different parameters) may lead to a global solution that is better than every individual solution, including the best one. In this context, the present thesis proposes a new fuzzy clustering algorithm inspired by the behavior of particle swarms and then introduces a new way of combining clustering algorithms (ensembles) using concepts from Type-2 fuzzy sets.
Keywords: type-1 fuzzy sets; type-2 fuzzy sets; data clustering; cluster ensembles; particle swarm
CNPQ::ENGENHARIAS::ENGENHARIA ELETRICA
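To make the notion of membership degrees concrete, the following Python sketch computes the memberships of a single object using the standard fuzzy c-means formula. This is a textbook illustration, not the particle-swarm algorithm proposed in the thesis; the function name, the fuzzifier `m = 2` and the example centers are assumptions.

```python
import math

def fuzzy_memberships(x, centers, m=2.0):
    """Fuzzy c-means membership of object x in each cluster.

    u_i = 1 / sum_j (d_i / d_j) ** (2 / (m - 1)), where d_i is the
    distance from x to center i and m > 1 is the fuzzifier.
    """
    dists = [math.dist(x, c) for c in centers]
    # An object that coincides with a center belongs fully to it
    # (shared equally if several centers coincide with x).
    zeros = [d == 0.0 for d in dists]
    if any(zeros):
        k = sum(zeros)
        return [1.0 / k if z else 0.0 for z in zeros]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
        for d_i in dists
    ]

centers = [(0.0, 0.0), (4.0, 0.0)]
u_mid = fuzzy_memberships((2.0, 0.0), centers)   # equidistant object
u_near = fuzzy_memberships((1.0, 0.0), centers)  # closer to the first center
```

The equidistant object receives equal membership in both clusters, while the object nearer the first center belongs mostly, but not exclusively, to that cluster.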
83	[en] CLUSTERING AND VISUALIZATION OF SEISMIC DATA USING VECTOR QUANTIZATION / [pt] AGRUPAMENTO E VISUALIZAÇÃO DE DADOS SÍSMICOS ATRAVÉS DE QUANTIZAÇÃO VETORIAL
ERNESTO MARCHIONI FLECK. 28 April 2005
This thesis proposes a new method of clustering seismic data for visualization in seismic maps. Seismic data consist of signal plus noise and, because of this dual composition, have asymmetric distributions. Seismic data are currently classified by methods that lead the references of the proposed groups to their mean values. The mean, however, is sensitive to noise and outliers, so classifications based on this estimator are subject to distorted results. Although other works have suggested using the median when distributions are asymmetric, because that estimator is robust to noise and outliers, none was found that proposes a method leading the group references to their medians when treating seismic data. The proposed method includes an algorithm that does so. Iterative treatment of the seismic data, applying a suitable non-linear function to the gradient descent, yields results whose mean square errors are lower than those of methods that lead to the mean. A parameter of the algorithm, the non-linearity constant, determines how the data are led from the mean towards the median. The proposed method needs few iterations to converge. It is a tool for the sizing of petroleum reservoirs and can also be used to determine differences between the properties of similar geological structures.
Keywords: neural networks; classification; vector quantisation; seismic data; data clustering; median; seismic maps; asymmetric distributions
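The effect of a non-linearity constant that moves a group reference from the mean towards the median can be sketched in Python as follows. The abstract does not state which non-linear function the thesis uses, so the `tanh` function, the step rule and the parameter names here are assumptions: with a small constant `k` the update is nearly linear and the reference settles at the mean, while with a large `k` it approaches a sign function and the reference settles at the median.

```python
import math

def fit_reference(samples, k, lr=0.5, iterations=5000):
    """Batch gradient-style update of one group reference r.

    r <- r + (lr / k) * mean(tanh(k * (x - r)))

    Assumed illustration: tanh(k * d) ~ k * d for small k (mean-seeking),
    tanh(k * d) ~ sign(d) for large k (median-seeking).
    """
    r = sum(samples) / len(samples)  # start from the mean
    for _ in range(iterations):
        grad = sum(math.tanh(k * (x - r)) for x in samples) / len(samples)
        r += (lr / k) * grad
    return r

samples = [1.0, 2.0, 3.0, 4.0, 100.0]   # one outlier; mean 22, median 3
r_small_k = fit_reference(samples, k=1e-4)
r_large_k = fit_reference(samples, k=10.0)
```

With these five samples the small-`k` run stays near the mean (22) and the large-`k` run settles near the median (3), showing how a single outlier distorts the former but not the latter.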
84	A local network neighbourhood artificial immune system
Graaff, A.J. (Alexander Jakobus). 17 October 2011
As more information becomes available online and becomes a permanent part of any business, the true value of the large amounts of stored data lies in discovering hidden and unknown relations, connections or traits in the data. Acquiring these hidden relations can influence strategic decisions that affect the success of a business. Data clustering is one of many methods for partitioning data into groups such that data patterns within the same group share some common trait compared to patterns in different groups. This thesis proposes a new artificial immune model for the problem of data clustering. The new model is inspired by the network theory of immunology and differs from its network-based predecessors in how it forms artificial lymphocyte networks. The proposed model is first applied to data clustering problems in stationary environments. Two different techniques are then proposed that enhance the model so that it dynamically determines the number of clusters in a data set with minimal to no user interference. A technique for generating synthetic data sets for data clustering in non-stationary environments is then proposed. Lastly, the original model and the enhanced version that dynamically determines the number of clusters are applied to the generated synthetic non-stationary data clustering problems. The influence of the parameters on clustering performance is investigated for all versions of the proposed model and supported by empirical results and statistical hypothesis tests.
Thesis (PhD)--University of Pretoria, 2011. / Computer Science / unrestricted
Keywords: Clonal selection; Artificial immune networks; Immune network topologies; Clustering of dynamic data; Affinity maturation; Somatic hyper mutation; Artificial lymphocytes; Data clustering; Clustering performance measures; Dynamic clustering; UCTD
85	Analýza vlastností shlukovacích algoritmů / Analysis of Clustering Methods
Lipták, Šimon. January 2019
The aim of this master's thesis was to become acquainted with cluster analysis, clustering methods and their theoretical properties. It was necessary to select the clustering algorithms whose properties would be analyzed and to find and select the data sets on which these algorithms would be run. A further goal was to design and implement an application that evaluates and displays clustering results in an appropriate manner. The last step was to analyze the results and compare them with the theoretical assumptions.
86	Fault Detection and Identification of Vehicle Starters and Alternators Using Machine Learning Techniques
Seddik, Essam. January 2016
Artificial Intelligence in Automotive Industry
Cost reduction is one of the main concerns in industry, and companies invest considerably in end-of-line fault diagnosis systems to improve performance. A common strategy is to use data obtained from existing instrumentation. This research investigates the challenge of learning from historical data that companies have already collected. Machine learning is one of the most common and powerful techniques of artificial intelligence; it can learn from data and identify fault features with no need for human interaction. In this research, labeled sound and vibration measurements are processed into fault signatures for vehicle starter motors and alternators, and a fault detection and identification system has been developed to identify fault types in end-of-line testing of motors. However, labels are relatively difficult, expensive and time-consuming to obtain and require experienced humans, whereas unlabeled samples need less effort to collect. Learning from unlabeled data guided by a few labels would therefore be a better solution. Learning from unlabeled data with no human intervention at all is also implemented and discussed.
Thesis / Master of Applied Science (MASc)
Keywords: Machine Learning; Fault Diagnosis; Fault Detection and Identification; Fault Detection; Fault Identification; Starters; Alternators; Automotive Industry; Artificial Intelligence; Deep Learning; Data classification; Data clustering; Neural Network; Support Vector Machine; Label Propagation; Unknown Faults Detection
87	Machine learning in complex networks: modeling, analysis, and applications / Aprendizado de máquina em redes complexas: modelagem, análise e aplicações
Silva, Thiago Christiano. 13 December 2012
Machine learning is a research area whose main purpose is to develop computational methods capable of learning from previously acquired experience. Although a large number of machine learning techniques have been proposed and successfully applied in real systems, many challenging issues still need to be addressed. In recent years there has been increasing interest in techniques based on complex networks (large-scale graphs with nontrivial connection patterns). This interest is explained by the inherent advantages of the complex network representation, which can capture the spatial, topological and functional relations of the data. In this work, we investigate the new features and possible advantages offered by complex networks in the machine learning domain, and show that the network-based approach brings interesting features to supervised, semisupervised and unsupervised learning. Specifically, we reformulate a previously proposed particle competition technique for both unsupervised and semisupervised learning using a stochastic nonlinear dynamical system. An analytical treatment of the model is also supplied, which enables one to predict the behavior of the proposed technique over time. In addition, data reliability issues are explored in semisupervised learning, a matter of practical importance that has received little investigation in the literature. To validate these techniques on real problems, simulations on broadly accepted databases are conducted. We also propose a hybrid supervised classification technique that combines low-level and high-level learning. The low-level term can be implemented by any traditional classification technique, while the high-level term is realized by extracting features of the underlying network constructed from the input data. Thus, the former classifies test instances by their physical features, while the latter measures the compliance of test instances with the pattern formation of the data. Our study shows that the proposed technique not only can perform classification according to the semantic meaning of the data but also can improve the performance of traditional classification techniques. Finally, it is expected that this study will contribute in a relevant manner to the machine learning area.
Keywords: Competitive learning; Complex networks; Data classification; Data clustering; High level classification; Particle competition; Random walks; Semisupervised learning; Supervised learning; Unsupervised learning
88	Information fusion and decision-making using belief functions : application to therapeutic monitoring of cancer / Fusion de l’information et prise de décisions à l’aide des fonctions de croyance : application au suivi thérapeutique du cancer
Lian, Chunfeng. 27 January 2017
Radiation therapy is one of the principal options used in the treatment of malignant tumors. To enhance its effectiveness, two critical issues should be carefully dealt with: reliably predicting therapy outcomes, in order to adapt ongoing treatment planning for individual patients, and accurately segmenting tumor volumes, in order to maximize radiation delivery to tumor tissues while minimizing side effects in adjacent organs at risk. Positron emission tomography with the radioactive tracer fluorine-18 fluorodeoxyglucose (FDG-PET) can noninvasively provide significant information on the functional activities of tumor cells. The goal of this thesis has two parts: 1) to propose reliable therapy outcome prediction systems that primarily use features extracted from FDG-PET images; 2) to propose automatic and accurate algorithms for tumor segmentation in PET and PET-CT images. The theory of belief functions is adopted to model and reason with the uncertain and imprecise knowledge quantified from noisy and blurred PET images. In the framework of belief functions, a sparse feature selection method and a low-rank metric learning method are proposed; they make the classes well separated in the feature space and so improve the classification accuracy of the evidential K-nearest neighbor (EK-NN) classifier learnt from high-dimensional data that contain unreliable features. Based on these two theoretical studies, a robust prediction system is then proposed in which the small size and imbalanced nature of clinical data are effectively tackled. To automatically delineate tumors in PET images, an unsupervised 3-D segmentation method based on evidential clustering, using the theory of belief functions and spatial information, is proposed. This mono-modality segmentation method is then extended to co-segment tumors in PET-CT images, considering that these two distinct modalities contain complementary information that further improves accuracy. All proposed methods were tested on clinical data and gave better results than state-of-the-art methods.
Keywords: Theory of belief functions; Prediction; Feature selection; Distance metric learning; Data classification; Data clustering; Cancer therapy outcome prediction; Automatic tumor segmentation; Positron emission tomography; PET/CT imaging; Computed tomography; Emission tomography; Automatic algorithms; Algorithms
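As background on the theory of belief functions, the following Python sketch combines two mass functions with Dempster's rule, the standard way of fusing two independent pieces of evidence. It is a generic illustration rather than the thesis's EK-NN classifier or evidential clustering method; the frame of discernment and the mass values are hypothetical.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset (a subset of the frame of
    discernment) to its mass; the masses of each function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + mass_a * mass_b
            else:
                # Mass assigned to the empty set measures the conflict.
                conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    # Normalize so that the combined masses sum to 1 again.
    return {s: v / (1.0 - conflict) for s, v in combined.items()}

# Hypothetical frame {tumor, healthy}; each source leaves some mass on
# the whole frame to express ignorance.
T, H = frozenset({"tumor"}), frozenset({"healthy"})
TH = T | H
m1 = {T: 0.6, TH: 0.4}
m2 = {H: 0.5, TH: 0.5}
fused = dempster_combine(m1, m2)
```

Here the two sources partly disagree, so 0.3 of the product mass falls on the empty set and is removed by normalization; the fused masses are 3/7 for `tumor`, 2/7 for `healthy` and 2/7 for the whole frame.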
