Global ETD Search

1	Self-organization map in complex network / Mapas organizativos em redes complexas. Pimenta, Mayra Mercedes Zegarra. 25 June 2018.
The Self-Organization Map (SOM) is an artificial neural network proposed as a tool for exploratory analysis of high-dimensional data sets and used efficiently for data mining. One of the main research topics in this area is data clustering. Several algorithms have been developed to cluster data sets, but their accuracy depends on the data. This thesis investigates the SOM from two approaches: (i) data mining and (ii) complex networks. From the data mining point of view, we analyzed how the performance of the algorithm is related to the distribution of data properties, and we verified how its accuracy depends on the configuration of its parameters. The thesis also presents a comparative analysis between the SOM network and other clustering methods. The results revealed that, with random parameter configurations, the SOM algorithm tends to improve its accuracy when the number of classes is small. With the default configurations of the adopted methods, the spectral approach usually outperformed the other clustering algorithms. Regarding the complex networks approach, we observed that the network structure has a fundamental influence on the accuracy of the algorithm. We evaluated the cases at short and medium learning time scales on three different datasets. Furthermore, we show how different topologies affect the self-organization of the topographic map of the SOM network. The self-organization of the network was studied by partitioning the map into groups or communities. Four topological measures were used to quantify the structure of the groups in three network models: modularity, number of elements per group, number of groups per map, and size of the largest group. In small-world (SW) networks, the groups become denser as time increases. The opposite behavior is found in assortative networks. Finally, we verified that if some perturbation is included in the system, such as rewiring in a SW network and the deactivation model, the system cannot organize itself again. Our results enable a better understanding of SOM in terms of parameters and network structure. / A Self-Organizing Map (SOM) is an artificial neural network proposed as a tool for exploratory analysis of high-dimensional data sets and used efficiently in data mining. One of the main research topics in this area concerns data clustering applications. Several algorithms have been developed for data clustering, each with a specific accuracy for particular types of data. The main goal of this thesis is to analyze the SOM network from two different approaches: data mining and complex networks. From the data mining approach, we analyzed how the performance of the algorithm is related to the distribution or characteristics of the data. The accuracy of the algorithm was verified based on the configuration of its parameters. Likewise, this thesis presents a comparative analysis between the SOM network and other clustering methods. The results revealed that using random values for the configuration parameters of the SOM algorithm tends to improve its accuracy when the number of classes is low. It was also observed that, with the default configurations of the adopted methods, the spectral approach usually outperformed the other clustering algorithms. From the complex networks approach, this thesis shows that considering a network topology other than the commonly used regular model affects the accuracy of the network. This impact on accuracy is generally observed at short and medium learning time scales, and the behavior was observed using three different datasets. In addition, this thesis shows how different topologies also affect the self-organization of the topographic map of the SOM network. The self-organization of the network was studied by partitioning the map into groups or communities. Four topological measures were used to quantify the structure of the groups in three distinct network models: modularity, number of elements per group, number of groups per map, and size of the largest group. In small-world networks, the groups become denser as time increases. The opposite behavior is found in assortative networks. Nevertheless, the modularity has a high value in both cases.
Keywords: Algoritmos de agrupamento; Clustering algorithm; Complex networks; Redes complexas; Self-organization map; SOM
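As a rough illustration of the setting described in the abstract above, the following Python sketch trains a small SOM whose neighborhood is defined by hop distance on a network instead of a regular grid. It is not the implementation used in the thesis: the function names (`ring_lattice`, `hop_distances`, `best_matching_unit`, `train_som`), the ring topology, the Gaussian neighborhood, and the linear decay of the learning rate and radius are all assumptions chosen for brevity.

```python
import math
import random
from collections import deque


def ring_lattice(n, k=1):
    """Adjacency sets for a ring in which each node links to k neighbors per side."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def hop_distances(adj, source):
    """Breadth-first search: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def best_matching_unit(weights, x):
    """Node whose weight vector is closest to x in squared Euclidean distance."""
    return min(
        weights,
        key=lambda i: sum((a - b) ** 2 for a, b in zip(weights[i], x)),
    )


def train_som(data, adj, epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a SOM whose neighborhood function uses hop distance on adj.

    Assumed schedule (illustrative only): learning rate and radius decay
    linearly with the epoch, and the radius is floored at 0.5.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {i: [rng.random() for _ in range(dim)] for i in adj}
    hops = {i: hop_distances(adj, i) for i in adj}
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)
        sigma = max(sigma0 * (1.0 - frac), 0.5)
        for x in rng.sample(data, len(data)):  # visit samples in random order
            bmu = best_matching_unit(weights, x)
            for node, d in hops[bmu].items():
                # Gaussian neighborhood in network (hop) distance from the BMU
                h = math.exp(-(d * d) / (2.0 * sigma * sigma))
                w = weights[node]
                for t in range(dim):
                    w[t] += lr * h * (x[t] - w[t])
    return weights
```

Calling `train_som(data, ring_lattice(10))` with `data` given as a list of equal-length lists of floats returns one weight vector per node. Passing a different adjacency dictionary, for example a ring with some edges rewired, is one way to compare how the topology changes the resulting map.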
3	Automatic Essential Content Extraction from Asynchronous Discussion Boards in e-Learning. Lu, Ping-Hui. 03 July 2004.
With the growing use of the Internet and multimedia, e-Learning has become an important learning method. e-Learning is easy to use and put into practice, but it also has drawbacks. One of them is the difficulty of reusing important and valuable knowledge from asynchronous discussion boards. Apart from enthusiastic teachers or assistants, nobody has the time or is willing to make the effort to manage the important discussions on asynchronous discussion boards in e-Learning. Asynchronous discussion boards are an important e-Learning tool through which teachers and students communicate and discuss class information, and reusing important knowledge from class discussions can help teachers and students teach and study efficiently and effectively. So far, however, there has been little research in this domain. In this research, we therefore create an automatic method for extracting essential content from asynchronous discussion boards in e-Learning. We first explain the usage, management, and shortcomings of asynchronous discussion boards in e-Learning, and then describe the design process of the research in detail. Finally, we describe the operation and the results of content extraction in the research system. We hope this helps teachers and students reuse valuable knowledge from past class discussions in e-Learning easily and quickly.
Keywords: Data Clustering; e-Learning; Asynchronous Discussion Board; Self Organization Map; Information Retrieval
4	Mining IT Product Life Cycle from Massive Newsgroup Articles. Chou, Cheng-Chi. 22 July 2003.
Product life cycle (PLC) may be used as a managerial tool. Marketing strategies must change as the product goes through its life cycle. If managers understand the cycle concept, they are in a better position to forecast future sales activities and plan marketing strategies. However, people often misjudge the PLC because data are difficult to access and decision-making information is lacking. Therefore, this thesis applies a customer behavior model to analyze the relationship between the frequency and the duration of product discussion, and it calculates the PLC pattern to explore the product's current position in customers' minds. Finally, the PLC curve is constructed from the information obtained in the preceding analysis. Moreover, we employ data mining and information retrieval techniques to diagnose the variance of discussion frequency and the content of discussion articles in order to extract the distinctive events that influenced the PLC curve. The main contributions of this thesis are described as follows:
Keywords: Self-Organization Map; Moving average; Information Retrieval; Product Life Cycle; Data Clustering
5	Algoritmy pro shlukování textových dat / Text data clustering algorithms. Sedláček, Josef. January 2011.
The thesis deals with text mining. It describes the theory of text document clustering and the algorithms used for clustering. This theory serves as the basis for developing an application for clustering text data. The application is developed in the Java programming language and contains three clustering methods, and the user can choose which method is applied to the collection of documents. The implemented methods are K medoids, BiSec K medoids, and SOM (self-organization maps). The application also includes a validation set, which was created specially for the diploma thesis and is used for testing the algorithms. Finally, the algorithms are compared according to the results obtained.
