1 |
An investigation into fuzzy clustering quality and speed: fuzzy C-means with effective seeding
Stetco, Adrian, January 2017
Cluster analysis, the automatic procedure by which large data sets are split into groups of similar objects (clusters), has innumerable applications in a wide range of problem domains. Improvements in clustering quality (as captured by internal validation indexes) and speed (the number of iterations until the cost function converges), the main focus of this work, have many desirable consequences. They can result, for example, in faster and more precise detection of illness onset based on symptoms, or they can provide investors with rapid detection and visualization of patterns in financial time series. Partitional clustering, one of the most popular approaches to cluster analysis, falls into two main categories: hard (where the clusters discovered are disjoint) and soft (also known as fuzzy, where clusters are non-disjoint, or overlapping). In this work we consider how the speed and solution quality of the soft partitional clustering algorithm Fuzzy C-means (FCM) can be improved through more careful and informed initialization based on data content. The resulting FCM++ approach samples the starting cluster centers during the initialization phase in a way that disperses them through the data space. Because the cluster centers are well spread in the input space, FCM++ achieves both faster convergence and higher quality solutions. Moreover, we allow the user to specify a parameter indicating how far apart the cluster centers should be picked in the data space at the beginning of the clustering procedure. We show FCM++'s superior behaviour in both convergence time and quality compared with existing methods on a wide range of artificially generated and real data sets. We consider a case study in which we propose a methodology based on FCM++ for pattern discovery on synthetic and real-world time series data.
We discuss a method that uses both Pearson correlation and Multi-Dimensional Scaling to reduce data dimensionality, remove noise and make the dataset easier to interpret and analyse. We show that FCM++ has a positive impact on quality (with the Xie-Beni index being lower in nine out of ten cases for FCM++) and speed (6.3 iterations on average, compared with 22.6 iterations) when clustering these lower-dimensional, noise-reduced representations of the time series. This methodology provides a clearer picture of the cluster analysis results and helps in detecting similarly behaving time series, which could come from any domain. Further, we investigate the use of Spherical Fuzzy C-Means (SFCM) with the seeding mechanism used for FCM++ on news text data retrieved from a popular British newspaper. The methodology allows us to visualize and group hundreds of news articles based on the topics discussed within. The positive impact made by SFCM++ translates into a faster process (12.2 iterations on average, compared with the 16.8 needed by standard SFCM) and a higher quality solution (with the Xie-Beni index being lower for SFCM++ in seven out of every ten runs).
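The seeding idea described in this abstract can be illustrated with a minimal Python sketch. It assumes a k-means++-style rule in which each new center is sampled with probability proportional to its distance from the nearest already chosen center, raised to a power `p`. The abstract does not state the exact rule, so the function names, the parameter `p` and the defaults below are illustrative assumptions, not the thesis implementation. The `xie_beni` helper uses the common definition (objective divided by n times the minimum squared separation between centers), for which lower values indicate more compact, better separated clusters.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def spread_seeding(points, k, p=2.0, rng=None):
    """Pick k initial centers; a larger p (assumed > 0) favours far-apart centers."""
    rng = rng or random.Random(0)
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Weight of each point: distance to its nearest chosen center, to the power p.
        weights = [min(sq_dist(x, c) for c in centers) ** (p / 2.0) for x in points]
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers

def memberships(points, centers, m=2.0):
    """Standard FCM membership update with fuzzifier m > 1."""
    rows = []
    for x in points:
        d = [sq_dist(x, c) for c in centers]
        if min(d) == 0.0:  # the point coincides with a center: crisp membership
            raw = [1.0 if di == 0.0 else 0.0 for di in d]
        else:
            raw = [di ** (-1.0 / (m - 1.0)) for di in d]
        total = sum(raw)
        rows.append([r / total for r in raw])
    return rows

def update_centers(points, U, k, m=2.0):
    """Each center is the mean of all points weighted by membership ** m."""
    dim = len(points[0])
    centers = []
    for j in range(k):
        w = [row[j] ** m for row in U]
        total = sum(w)
        centers.append(tuple(
            sum(wi * x[t] for wi, x in zip(w, points)) / total for t in range(dim)))
    return centers

def objective(points, centers, U, m=2.0):
    """FCM cost: membership-weighted sum of squared distances."""
    return sum(u ** m * sq_dist(x, c)
               for x, row in zip(points, U) for u, c in zip(row, centers))

def fcm(points, k, m=2.0, p=2.0, tol=1e-6, max_iter=100, rng=None):
    """Run FCM from spread-out seeds; returns (centers, memberships, iterations)."""
    centers = spread_seeding(points, k, p, rng)
    prev = float("inf")
    for it in range(1, max_iter + 1):
        U = memberships(points, centers, m)
        centers = update_centers(points, U, k, m)
        cost = objective(points, centers, U, m)
        if abs(prev - cost) < tol:  # stop once the cost function has converged
            break
        prev = cost
    return centers, memberships(points, centers, m), it

def xie_beni(points, centers, U, m=2.0):
    """Xie-Beni index: compactness over separation (lower is better)."""
    sep = min(sq_dist(a, b) for i, a in enumerate(centers) for b in centers[i + 1:])
    return objective(points, centers, U, m) / (len(points) * sep)
```

Calling `fcm` on the same data with different values of `p` and comparing the returned iteration counts and `xie_beni` values gives a rough feel for the quality and speed comparison the abstract reports.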
|
2 |
An Empirical Study On Fuzzy C-means Clustering For Turkish Banking System
Altinel, Fatih, 01 September 2012
The banking sector is very sensitive to macroeconomic and political instabilities and is prone to crises. Since banks are integrated with almost all economic agents and with other banks, these crises affect entire societies. Therefore, the classification or rating of banks with respect to their credibility becomes important. In this study we examine different models for the classification of banks. Using one of those models, fuzzy c-means clustering, we group banks into clusters based on 48 different ratios, which can be classified under capital, asset quality, liquidity, profitability, income-expenditure structure, share in sector, share in group and branch ratios. To determine the inter-dependency between these variables, the covariance and correlation between them are analyzed. Principal component analysis is used to decrease the number of factors. As a result, the representation space of the data is reduced from 48 variables to a 2-dimensional space; these two factors account for 94.54% of the total variance. Empirical results indicate that as the number of clusters is increased, the number of iterations required to minimize the objective function fluctuates and is not monotonic. Also, as the number of clusters increases, the initial non-optimized maximum objective function values and the optimized final minimum objective function values decrease monotonically together. Another observation is that the difference between the initial non-optimized and final optimized values of the objective function starts to diminish as the number of clusters increases.
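A figure such as "two factors account for 94.54% of the total variance" is the share of the covariance matrix's trace carried by its two largest eigenvalues. The sketch below shows that calculation on a toy data set using plain power iteration with deflation. It is an illustration under stated assumptions (hypothetical function names, no standardization of the variables, made-up data), not the procedure or the data used in the thesis.

```python
import math

def covariance_matrix(rows):
    """Sample covariance matrix of rows (one observation per row)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
            for a in range(d)]

def top_eigenpairs(matrix, count, iters=500):
    """Leading eigenpairs of a symmetric PSD matrix via power iteration + deflation."""
    d = len(matrix)
    A = [row[:] for row in matrix]
    pairs = []
    for _ in range(count):
        # Fixed, non-uniform start vector (an arbitrary choice for this sketch).
        v = [float(i + 1) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = [sum(A[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue for the converged direction.
        lam = sum(v[i] * sum(A[i][j] * v[j] for j in range(d)) for i in range(d))
        pairs.append((lam, v))
        # Deflate: remove the component just found before the next round.
        A = [[A[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    return pairs

def explained_variance_ratio(rows, count=2):
    """Fraction of total variance captured by the first `count` principal components."""
    cov = covariance_matrix(rows)
    total = sum(cov[i][i] for i in range(len(cov)))
    return sum(lam for lam, _ in top_eigenpairs(cov, count)) / total
```

For example, with three variables where the third is the sum of the first two, `explained_variance_ratio(rows, 2)` is 1 up to rounding, because the data lie in a two-dimensional plane.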
|
3 |
Fuzzy Entropy Based Fuzzy c-Means Clustering with Deterministic and Simulated Annealing Methods
FURUHASHI, Takeshi; YASUDA, Makoto, 01 June 2009
No description available.
|
4 |
Acoustic emission monitoring of damage progression in fiber reinforced polymer rods
Shateri, Mohammadhadi, 09 March 2017
Fiber reinforced polymer (FRP) bars have been widely used in pre-stressing applications and in the reinforcement of civil structures. Their high strength-to-weight ratio and high resistance to corrosion make FRP bars a good replacement for steel reinforcing bars in civil engineering applications. According to the CAN/CSA-S806-12 standard, the maximum recommended stress in FRP bars under service loads should not exceed 25% and 65% of the ultimate strength for glass FRP (GFRP) and carbon FRP (CFRP), respectively. These stress limits are set to prevent creep failure in FRP bars. However, for in-service applications, there are few physical indicators that these values have been reached or exceeded. In this work, analysis of acoustic emission (AE) signals is used for this purpose. Two new techniques, based on pattern recognition and on the frequency entropy of the isolated AE signal, are presented for monitoring damage progression and predicting failure in FRPs. / May 2017
|
5 |
A Recommendation System Combining Context-awareness And User Profiling In Mobile Environment
Ulucan, Serkan, 01 December 2005
To date, various recommendation systems have been proposed for web-based applications such as e-commerce and information retrieval, where a large number of products or a large amount of information is available. The task of the recommendation system in those applications, for example in e-commerce, is to find and recommend the most relevant items to users or customers. In this domain, the most prominent approaches are collaborative filtering and content-based filtering. These approaches are sometimes also called user profiling.
In this work, a context-aware recommendation system is proposed for the mobile environment, which can also be considered an extension of the recommendation systems proposed for web-based information retrieval and e-commerce applications. In those applications, for example in an online book store (e-commerce), the users' actions are independent of their instant context (location, time, etc.). In the mobile environment, however, the users' actions are strictly dependent on their instant context. These dependencies give rise to the need to filter items and actions with respect to the users' instant context.
In this thesis, an approach that couples approaches from two different domains, the mobile environment and the web, is proposed. The whole approach can therefore be separated into two phases: context-aware prediction and user profiling. In the first phase, a combination of two methods, fuzzy c-means clustering and learning automata, is used to predict the mobile user's motions in context space beforehand. This eliminates a large number of the items placed in the context space. In the second phase, hierarchical fuzzy clustering for user profiling is used to determine the best recommendation among the remaining items.
|
6 |
Fuzzy Set Theory Applied to Make Medical Prognoses for Cancer Patients
Zettervall, Hang, January 2014
As is well known, classical set theory has a deep-rooted influence on traditional mathematics. Under two-valued logic, an element either belongs to a set or does not. In the former case the element’s membership degree is one, whereas in the latter case it is zero. In other words, two-valued logic leaves no room for imprecision or fuzziness. With the rapid development of science and technology, more and more scientists have come to realize the vital importance of multi-valued logic. Thus, in 1965, Professor Lotfi A. Zadeh of the University of California, Berkeley, put forward the concept of a fuzzy set. In less than 60 years, fuzzy set theory has become widely familiar and has been applied in many fields. This study aims to apply some classical and extended methods of fuzzy set theory to life expectancy and treatment prognoses for cancer patients. The research is based on real-life problems encountered by physicians in clinical work. From the introductory elements of fuzzy set theory to the medical applications, the thesis presents a collection of detailed analyses of fuzzy set theory and its extensions. Specifically, Mamdani fuzzy control systems and the Sugeno controller have been applied to predict the survival length of gastric cancer patients. To spare gastric cancer patients who have already been examined the unnecessary suffering of a surgical operation, fuzzy c-means clustering analysis has been adopted to investigate the possibilities for operation versus non-operation. Furthermore, the approach of point set approximation has been adopted to estimate the possibilities of operation versus non-operation for an arbitrary gastric cancer patient.
In addition, in the domain of multi-expert decision-making, the probabilistic model, the model of 2-tuple linguistic representations and hesitant fuzzy linguistic term sets (HFLTS) have been used to select the most consensual treatment scheme(s) for two separate prostate cancer patients. The results have supplied the physicians with reliable and helpful information. The research can therefore be seen as a mathematical complement to the physicians’ queries.
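The contrast drawn at the start of this abstract, between two-valued membership and graded membership, can be made concrete with a small Python sketch. The triangular shape and the numeric breakpoints are illustrative assumptions only; the abstract does not say which membership functions the thesis uses.

```python
def crisp_membership(x, low, high):
    """Two-valued logic: x is either in the set [low, high] (1.0) or not (0.0)."""
    return 1.0 if low <= x <= high else 0.0

def triangular_membership(x, left, peak, right):
    """Fuzzy set: membership rises linearly from left to peak, then falls to right."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)
```

With hypothetical breakpoints 40, 60 and 80, a value of 50 belongs fully to the crisp set `[40, 80]`, but has a membership degree of only 0.5 in the triangular fuzzy set peaking at 60.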
|
7 |
Towards a Versatile System for the Visual Recognition of Surface Defects
Koprnicky, Miroslav, January 2005
Automated visual inspection is an emerging multi-disciplinary field with many challenges; it combines different aspects of computer vision, pattern recognition, automation, and control systems. There does not exist a large body of work dedicated to the design of generalized visual inspection systems; that is, those that might easily be made applicable to different product types. This is an important oversight, in that many improvements in design and implementation times, as well as costs, might be realized with a system that could easily be made to function in different production environments.
This thesis proposes a framework for generalizing and automating the design of the defect classification stage of an automated visual inspection system. It involves using an expandable set of features which are optimized along with the classifier operating on them in order to adapt to the application at hand. The particular implementation explored involves optimizing the feature set in disjoint sets logically grouped by feature type to keep search spaces reasonable. Operator input is kept to a minimum throughout this customization process, since it is limited only to those cases in which the existing feature library cannot adequately delineate the classes at hand, at which time new features (or pools) may have to be introduced by an engineer with experience in the domain.
Two novel methods are put forward which fit well within this framework: cluster-space and hybrid-space classifiers. They are compared in a series of tests against both standard benchmark classifiers, as well as mean and majority vote multi-classifiers, on feature sets comprised of just the logical feature subsets, as well as the entire feature sets formed by their union. The proposed classifiers as well as the benchmarks are optimized with both a progressive combinatorial approach and a genetic algorithm. Experimentation was performed on true colour industrial lumber defect images, as well as binary hand-written digits.
Based on the experiments conducted in this work, it was found that the sequentially optimized multi hybrid-space methods are capable of matching the performance of the benchmark classifiers on the lumber data, with the exception of the mean-rule multi-classifiers, which dominated most experiments by approximately 3% in classification accuracy. The genetic algorithm optimized hybrid-space multi-classifier achieved the best performance, however: an accuracy of 79.2%.
The numeral dataset results were less promising; the proposed methods could not equal benchmark performance. This is probably because the numeral feature sets were much more conducive to good class separation, with standard benchmark accuracies approaching 95% not uncommon. This indicates that the cluster-space transform inherent to the proposed methods appears to be most useful in highly dependent or confusing feature spaces, a hypothesis supported by the outstanding performance of the single hybrid-space classifier in the difficult texture feature subspace: 42.6% accuracy, a 6% increase over the best benchmark performance.
The generalized framework proposed appears promising, because classifier performance over feature sets formed by the union of independently optimized feature subsets regularly met and exceeded that of classifiers operating on feature sets formed by the optimization of the feature set in its entirety. This finding corroborates earlier work with similar results [3, 9], and is an aspect of pattern recognition that should be examined further.
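The mean-rule and majority-vote multi-classifiers mentioned as benchmarks can be sketched as follows. This is a generic illustration of the two combination rules, assuming each base classifier outputs a score per class; the class labels and scores are hypothetical and the abstract does not describe how the thesis implements either rule.

```python
from collections import Counter

def majority_vote(scores_per_classifier):
    """Each classifier votes for its top-scoring class; the most common vote wins."""
    votes = [max(scores, key=scores.get) for scores in scores_per_classifier]
    return Counter(votes).most_common(1)[0][0]

def mean_rule(scores_per_classifier):
    """Average the per-class scores over all classifiers and pick the highest mean."""
    classes = scores_per_classifier[0].keys()
    n = len(scores_per_classifier)
    mean = {c: sum(s[c] for s in scores_per_classifier) / n for c in classes}
    return max(mean, key=mean.get)

# Hypothetical scores from three classifiers for one lumber sample.
example = [
    {"knot": 0.90, "split": 0.10},
    {"knot": 0.40, "split": 0.60},
    {"knot": 0.45, "split": 0.55},
]
```

On `example`, the two rules disagree: two of the three classifiers vote for `"split"`, while the averaged scores favour `"knot"` because the first classifier is much more confident. Differences of this kind are one reason the two combiners can perform differently on the same feature sets.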
|
8 |
RBF-sítě s dynamickou architekturou / RBF-networks with a dynamic architecture
Jakubík, Miroslav, January 2011
In this master's thesis I reviewed several methods for clustering input data. Two well-known clustering algorithms, namely the K-means algorithm and the Fuzzy C-means (FCM) algorithm, were described in the submitted work. I presented several methods that can help estimate the optimal number of clusters. Further, I described Kohonen maps and two models of Kohonen maps with a dynamically changing structure, namely the Kohonen map with a growing grid and the growing neural gas model. Finally, I described a fairly new model of radial basis function neural networks and presented several learning algorithms for it. At the end of this work I carried out clustering experiments with real data describing international trade among the states of the world.
|
9 |
RBF-sítě s dynamickou architekturou / RBF-networks with a dynamic architecture
Jakubík, Miroslav, January 2012
In this master's thesis I reviewed several methods for data clustering. Two well-known clustering algorithms, namely the K-means algorithm and the Fuzzy C-means (FCM) algorithm, were described in the submitted work. I presented several methods that can help estimate the optimal number of clusters. Further, I described Kohonen maps and two models of Kohonen maps with a dynamically changing structure, namely the Kohonen map with a growing grid and the growing neural gas model. Finally, I described a fairly new model of radial basis function neural networks and presented several learning algorithms for it: RAN, RANKEF, MRAN, EMRAN and GAP. At the end of this work I carried out clustering experiments with real data describing international trade among the states of the world.
|