11 |
Detecção de mudanças a partir de imagens de fração
Bittencourt, Helio Radke, January 2011
Detecting changes in land cover is a major goal of multitemporal remote sensing. Images acquired on different dates tend to be strongly affected by radiometric differences and registration problems. Using fraction images, obtained from the linear spectral mixture model (LSMM), minimizes the radiometric problems and makes the types of land cover change easier to interpret, because the fractions have a direct physical meaning. Interpretation at the subpixel level also becomes possible. This thesis proposes three algorithms (hard, soft and fuzzy) for detecting changes between a pair of fraction images, producing change maps as final products. The algorithms assume that the fraction differences are multivariate normal and require very little intervention by the analyst. The hard algorithm creates binary change maps following the methodology of a hypothesis test: the contours of constant density of the multivariate normal distribution are defined by values of the chi-square distribution, chosen according to the confidence level. The soft classifier uses a logistic regression model to estimate the probability that each pixel belongs to the change class, and these probabilities are used to create a change probability map. The fuzzy approach is the one that best fits the concept of the mixed pixel, because changes in land use and land cover can occur at the subpixel level; on this basis, maps of the degree of membership in the change class were created. Other mathematical and statistical tools were also used, such as morphological operations, ROC curves and clustering algorithms. The three algorithms were tested on synthetic and real (Landsat-TM) images and evaluated qualitatively and quantitatively. The results indicate that fraction images can be used in change detection studies by means of the proposed algorithms.
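The hard rule described in this abstract can be sketched as follows. This is a minimal illustration, not the thesis's implementation: it assumes only two fraction differences per pixel, so that the chi-square quantile has a closed form, and the function names, the no-change mean and the covariance values are all hypothetical.

```python
import math

def chi2_quantile_2dof(confidence):
    """Quantile of the chi-square distribution with 2 degrees of freedom.

    With 2 degrees of freedom the CDF is 1 - exp(-x / 2), so the quantile
    is -2 * ln(1 - confidence). Other dimensions need a numerical routine.
    """
    return -2.0 * math.log(1.0 - confidence)

def mahalanobis_sq_2d(diff, mean, cov):
    """Squared Mahalanobis distance of a 2-D vector from `mean` under `cov`."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = diff[0] - mean[0], diff[1] - mean[1]
    # The inverse of a 2x2 matrix is [[d, -b], [-c, a]] / det.
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det

def hard_change_map(diffs, mean, cov, confidence=0.95):
    """Flag a pixel as changed when it falls outside the constant-density contour."""
    threshold = chi2_quantile_2dof(confidence)
    return [mahalanobis_sq_2d(v, mean, cov) > threshold for v in diffs]

# Hypothetical fraction differences for two pixels (assumed values).
no_change_mean = (0.0, 0.0)
no_change_cov = [[0.01, 0.0], [0.0, 0.01]]
flags = hard_change_map([(0.02, -0.01), (0.4, -0.3)], no_change_mean, no_change_cov)
```

Raising the confidence level enlarges the contour, so fewer pixels are flagged as changed.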
|
12 |
Classificação do risco de infestação de regiões por plantas daninhas utilizando lógica Fuzzy e redes Bayesianas / Classification of the risk of infestation per regions of a crop by weeds using Fuzzy and Bayesian networks
Glaucia Maria Bressan, 16 July 2007
The main goal of this work is to classify, region by region, the risk of weed infestation in a crop. The regional risks are obtained by a fuzzy classification system, using kriging and image analysis. Infestation is described by four attributes: weed coverage, weed seed density, weed seed patch extent and competitiveness. They are obtained from samples of weed seed density, weed plant density, weed coverage and weed biomass. The weed coverage attribute indicates the percentage of the surface occupied by emergent weeds and is obtained from a weed coverage map built with kriging. The seed density attribute characterizes the locations of seeds that can germinate and is obtained from a weed seed production map, also built with kriging. The seed patch attribute reflects the influence of neighboring seeds on a given location and is obtained from the same seed production map. The attribute of competitiveness between weeds and crop is obtained from a neurofuzzy system, using samples of weed density and biomass. To group similar infestation risks, the risk values inferred per region by the fuzzy system are clustered by similar values and nearby locations using the k-means method with a coefficient of variation. A probabilistic approach with Bayesian network classifiers is also used to obtain a set of linguistic rules for classifying competitiveness and infestation risk, for comparison purposes. Results for an experimental area in a corn crop indicate the existence of different risk levels, which are explained by the yield loss of the crop.
|
14 |
Identification de systèmes dynamiques non-linéaires par réseaux de neurones et multimodèles / Identification of non linear dynamical system by neural networks and multiple models
Thiaw, Lamine, 28 January 2008
This work deals with the identification of nonlinear dynamical systems. It studies a multiple model architecture that overcomes certain shortcomings of MLP neural networks. The multiple model approach represents a complex system by a set of simple local models, each valid only within a well-defined zone. Instead of the affine local models generally used, this study proposes a more general polynomial structure that captures local nonlinearities better and thus reduces the number of local models. The parameters of such an architecture can be estimated by linear optimization, which costs less computation time than parameter estimation for a neural architecture. Recurrent multiple models were also implemented, with a parameter estimation algorithm more flexible than the backpropagation through time used for recurrent MLPs. This architecture makes it easier to represent recurrent nonlinear models such as NARMAX and NOE. Determining the number of local models requires decomposing (partitioning) the system's operating space into several subspaces in which the local models are defined. Fuzzy partitioning methods based on the fuzzy c-means, Gustafson-Kessel and subtractive clustering algorithms are presented. Using such methods requires a multiple model architecture in which the local models can have different structures: polynomial of different degrees, neural, or a mix of polynomial and neural. A heterogeneous multiple model architecture meeting these requirements is proposed, and structural and parametric identification algorithms are presented. A comparative study of the MLP and multiple model architectures was carried out. The main advantage of the multiple model architecture over the MLP is the simplicity of parameter estimation. In addition, a partitioning method based on fuzzy classification makes it easy to determine the number of local models, whereas determining the number of hidden neurons in an MLP architecture remains a hard task.
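Of the fuzzy partitioning methods named in this abstract, fuzzy c-means is the simplest to sketch. The version below is a minimal one-dimensional illustration using the standard membership and center update formulas; the function names, the fuzzifier `m = 2`, the starting centers and the sample data are assumptions made for the example, not details from the thesis.

```python
def fcm_memberships(points, centers, m=2.0):
    """Membership of each 1-D point in each cluster; every row sums to 1."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for x in points:
        dists = [abs(x - c) for c in centers]
        if 0.0 in dists:
            # The point coincides with a center: full membership goes there.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            memberships.append([h / sum(hits) for h in hits])
            continue
        memberships.append(
            [1.0 / sum((d_i / d_j) ** exponent for d_j in dists) for d_i in dists]
        )
    return memberships

def fcm_update_centers(points, memberships, m=2.0):
    """New centers as means of the points weighted by membership ** m."""
    centers = []
    for k in range(len(memberships[0])):
        weights = [row[k] ** m for row in memberships]
        centers.append(sum(w * x for w, x in zip(weights, points)) / sum(weights))
    return centers

def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Alternate membership and center updates for a fixed number of steps."""
    for _ in range(iterations):
        centers = fcm_update_centers(points, fcm_memberships(points, centers, m), m)
    return centers, fcm_memberships(points, centers, m)

# Two well-separated operating zones (assumed data).
data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
final_centers, final_memberships = fuzzy_c_means(data, centers=[1.0, 4.0])
```

In a multiple model setting, memberships of this kind could serve as the validity weights of the local models, which is why a soft partition is a natural fit.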
|
15 |
轉折型時間序列的認定 / Pattern Recognition for Trend Time Series
程友梅 (Cheng, Yu Mei), Unknown Date
Structure-changing time series are common in real life: war, policy changes, strikes or drastic changes in natural conditions can visibly alter the trend of a series. Traditionally, change points in such series are detected after the fact: the analyst first decides subjectively when the structural change occurred and then confirms it with a statistical test. This approach is overly subjective. Moreover, a transition does not happen overnight, so explaining it with a single change point seems inappropriate.
This paper therefore takes an ex ante view and uses a fuzzy statistical method to identify change periods in time series whose mean or variance changes. Using exchange rate and trade balance data as an empirical example, the proposed method performs univariate and bivariate fuzzy classification and derives the individual and joint change periods.
|
16 |
Fuzzy Classification Models Based On Tanaka
Ozer, Gizem, 01 July 2009
In some classification problems that involve human judgment and qualitative or imprecise data, uncertainty comes from fuzziness rather than randomness. Only a limited number of fuzzy classification approaches are available to capture the fuzzy uncertainty embedded in such data. This study has two parts: new fuzzy classification approaches based on Tanaka's Fuzzy Linear Regression (FLR) approach, and an improvement of an existing method, Improved Fuzzy Classifier Functions (IFCF). Tanaka's FLR approach is a well-known fuzzy regression technique for prediction problems that involve fuzzy uncertainty. In the first part of the study, three alternative approaches are presented that use FLR for a particular customer satisfaction classification problem; their performance and their applicability to other cases are discussed. In the second part, the improved IFCF method, Nonparametric Improved Fuzzy Classifier Functions (NIFCF), is presented; it uses a nonparametric method, Multivariate Adaptive Regression Splines (MARS), in the clustering phase of IFCF. NIFCF is applied to three data sets and compared with the Fuzzy Classifier Function (FCF) and Logistic Regression (LR) methods.
|
17 |
Efficient Medical Volume Visualization : An Approach Based on Domain Knowledge
Lundström, Claes, January 2007
Direct Volume Rendering (DVR) is a visualization technique that has proved to be a very powerful tool in many scientific visualization applications. Diagnostic medical imaging is one domain where DVR could provide clear benefits: unprecedented possibilities for analyzing complex cases and a highly efficient workflow for certain routine examinations. The full potential of DVR in the clinical environment has not been reached, however, primarily because of limitations in conventional DVR methods and tools. This thesis presents methods that address four major challenges for DVR in clinical use. The foundation of all the methods is to incorporate the domain knowledge of the medical professional into the technical solutions. The first challenge is the very large data sets routinely produced in medical imaging today. To address it, a multiresolution DVR pipeline is proposed that dynamically prioritizes data according to its actual impact on the rendered image to be reviewed. Using this prioritization, the system can reduce the data requirements throughout the pipeline and provide high performance and visual quality in any environment. The second challenge is how to achieve simple yet powerful interactive tissue classification in DVR. The methods presented define additional attributes that effectively capture readily available medical knowledge. The third challenge is tissue detection, which must be solved to improve the efficiency and consistency of diagnostic image review. Histogram-based techniques that exploit spatial relations in the data to achieve accurate and robust tissue detection are presented in this thesis. The final challenge is uncertainty visualization, which is highly pertinent in clinical work for patient safety reasons. An animation method has been developed that automatically conveys feasible alternative renderings; it is based on a probabilistic interpretation of the visualization parameters.
Several clinically relevant evaluations of the developed techniques have been performed demonstrating their usefulness. Although there is a clear focus on DVR and medical imaging, most of the methods provide similar benefits also for other visualization techniques and application domains.
|
18 |
Identification of urban surface materials using high-resolution hyperspectral aerial imagery
Paranjape, Meghana, 07 1900
Knowledge of surface cover materials is crucial for urban planning and management. With advances in remote sensing, especially in imagery of high spatial and spectral resolution, identifying and mapping surface materials in urban areas in detail from their spectral signatures is now feasible. Spectral signatures describe the interactions between ground objects and solar radiation and are assumed to be unique for each type of material.
In this research we used airborne hyperspectral images from the CASI sensor, with a 1 m² spatial resolution and 96 contiguous bands in a spectral range between 367 nm and 1044 nm (the French-language abstract gives 380 nm to 1040 nm). These images, acquired in 2016 and covering the island of Montreal (Quebec, Canada), were analyzed to identify urban surface materials.
The project had two objectives: first, to find a correspondence between the physical and chemical characteristics of typical surface materials present in the Montreal scenes and the spectral signatures within the images; second, to develop a sound methodology for identifying these surface materials in urban landscapes.
To reach these objectives, our method of analysis compares the spectral signature of each pixel with those of typical objects contained in reference spectral libraries (inert materials and vegetation). Two metrics measure the correspondence between a pixel's signature and a reference signature: the first considers the shape of the signature, and the second the difference in reflectance values between the observed and reference signatures. A fuzzy classifier using these two metrics is then applied to recognize the type of surface material on a per-pixel basis. Typical spectral signatures were extracted from two spectral libraries (ASTER and HYPERCUBE). Spectral signatures of typical objects in Montreal, measured in the field with an ASD spectroradiometer, were also used as references.
Three general categories of surface material (asphalt, concrete and vegetation) were used to ease the comparison between classifications by source of reference spectra. The classification using ASTER as the reference library had the highest success rate at 92%, followed by the ASD field spectra at 88% and HYPERCUBE at 80%. We found no significant differences among the three results, which indicates that the classification works independently of the source of the reference spectral signatures.
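The abstract does not name its two metrics, so the sketch below substitutes common stand-ins: the spectral angle for shape and the mean absolute reflectance difference for level. Combining them with a minimum of two linear memberships is also an assumption, as are the function names, the scale parameters and the four-band reference spectra, which are invented for illustration.

```python
import math

def spectral_angle(pixel, reference):
    """Angle in radians between two spectra; insensitive to overall brightness."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm = math.sqrt(sum(p * p for p in pixel)) * math.sqrt(sum(r * r for r in reference))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))

def mean_abs_difference(pixel, reference):
    """Average absolute difference in reflectance across bands."""
    return sum(abs(p - r) for p, r in zip(pixel, reference)) / len(pixel)

def match_scores(pixel, library, angle_scale=0.2, diff_scale=0.2):
    """Score in [0, 1] per material: the minimum of two linear memberships."""
    scores = {}
    for name, ref in library.items():
        shape = max(0.0, 1.0 - spectral_angle(pixel, ref) / angle_scale)
        level = max(0.0, 1.0 - mean_abs_difference(pixel, ref) / diff_scale)
        scores[name] = min(shape, level)
    return scores

# Invented four-band reflectance spectra, for illustration only.
library = {
    "asphalt": [0.08, 0.09, 0.10, 0.11],
    "concrete": [0.30, 0.33, 0.35, 0.36],
    "vegetation": [0.04, 0.08, 0.05, 0.45],
}
scores = match_scores([0.05, 0.09, 0.06, 0.42], library)
best_material = max(scores, key=scores.get)
```

Using two metrics guards against a match that has the right shape but the wrong brightness, or the reverse.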
|
19 |
截取式自迴歸條件變異數分析法 / Trimmed ARCH(1) model
廖本杰, Unknown Date
In time series analysis, the trend of a series often changes over time, and fitting a traditional linear model rarely yields good forecasts. Research on structural change in nonlinear time series has therefore received growing attention in recent years and has become a popular topic among time series and econometrics researchers. This article uses concepts from fuzzy theory to identify, by fuzzy classification, the structural-change period of a series fitted with an ARCH model. The first and last points of the change period are each taken in turn as the cut point for fitting an ARCH(1) model; we call this the Trimmed ARCH(1) model. Using the daily interbank closing exchange rate of the New Taiwan dollar against the U.S. dollar, we build univariate ARIMA, ARCH(1) and Trimmed ARCH(1) models and compare them. The comparison shows that the Trimmed ARCH(1) model that uses the end point of the change period as its cut point forecasts most accurately and substantially improves on the original ARCH(1) model.
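The trimming idea can be sketched under two stated assumptions: that trimming means discarding the observations before the cut point, and that a rough fit is acceptable. An ARCH(1) model sets the conditional variance to alpha0 + alpha1 * e[t-1]**2, so the squared residuals follow a first-order autoregression, and the sketch estimates the two coefficients by ordinary least squares rather than by the maximum likelihood a real study would more likely use. All function names are hypothetical.

```python
def fit_arch1_ols(residuals):
    """Rough ARCH(1) estimates from regressing e[t]**2 on e[t-1]**2."""
    sq = [e * e for e in residuals]
    x, y = sq[:-1], sq[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    alpha1 = sxy / sxx
    alpha0 = mean_y - alpha1 * mean_x
    return alpha0, alpha1

def fit_trimmed_arch1(residuals, cut_index):
    """Fit only on observations from `cut_index` onward (assumed trimming rule)."""
    return fit_arch1_ols(residuals[cut_index:])

def forecast_variance(alpha0, alpha1, last_residual):
    """One-step-ahead conditional variance under ARCH(1)."""
    return alpha0 + alpha1 * last_residual ** 2
```

With the cut index set to the end of the detected change period, only the post-change regime informs the coefficients, which mirrors the variant the abstract reports as most accurate.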
|
20 |
Tratamento de imprecisão na geração de árvores de decisão
Lopes, Mariana Vieira Ribeiro, 03 March 2016
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Inductive Decision Trees (DT) are a mechanism based on the symbolic paradigm of machine learning whose main characteristics are easy interpretability and low computational cost. Although widely used, DTs can only represent problems whose variables are discrete or continuous, and in some problems the variables are not well represented in either form. Fuzzy Decision Trees (FDT) were developed to address this: they add to the interpretability of Inductive Decision Trees the ability to handle fuzzy variables, which adequately represent imprecise knowledge. This text presents the work carried out during the master's degree, whose main result is a new algorithm for inducing fuzzy decision trees. Its fuzzification of continuous attributes is performed during tree induction and is inspired by the partitioning method that C4.5 uses for continuous attributes. The proposed algorithm was tested on 20 datasets from the UCI repository (LICHMAN, 2013) and compared with three other algorithms that approach the classification problem in different ways: C4.5, which induces an Inductive Decision Tree; FURIA, which induces a rule-based fuzzy system but does not follow a tree structure; and FuzzyDT, which induces a fuzzy decision tree in which the continuous attributes are fuzzified before the tree is induced. The experimental results are presented and discussed in Chapter 4.
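The abstract does not spell out the fuzzification method, so the following is only a sketch of the general idea it points to: choose a crisp threshold for a continuous attribute by information gain over candidate midpoints, in the spirit of C4.5, then soften it into two overlapping fuzzy sets. The midpoint candidates, the plain information gain criterion, the linear ramp, the `width` parameter and all function names are assumptions for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy in bits of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def best_threshold(values, labels):
    """Return (threshold, gain) maximizing information gain over midpoints."""
    pairs = sorted(zip(values, labels))
    base = entropy([lab for _, lab in pairs])
    best = (None, 0.0)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no split is possible between equal values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        gain = base - (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if gain > best[1]:
            best = ((pairs[i - 1][0] + pairs[i][0]) / 2.0, gain)
    return best

def fuzzy_memberships(x, threshold, width):
    """Degrees of membership in 'low' and 'high' around a softened threshold."""
    high = min(1.0, max(0.0, (x - (threshold - width)) / (2.0 * width)))
    return {"low": 1.0 - high, "high": high}

# Toy attribute with two cleanly separated classes (assumed data).
threshold, gain = best_threshold([1, 2, 3, 10, 11, 12], ["a", "a", "a", "b", "b", "b"])
```

A value near the threshold then belongs partly to both branches, instead of being forced into one by a crisp test.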
|