  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
41

[pt] AGRUPAMENTO FUZZY APLICADO À INTEGRAÇÃO DE DADOS MULTI-ÔMICOS / [en] FUZZY CLUSTERING APPLIED TO MULTI-OMICS DATA

SARAH HANNAH LUCIUS LACERDA DE GOES TELLES CARVALHO ALVES 05 October 2021
Advances in technologies for obtaining multi-omics data provide different levels of molecular information that progressively increase in volume and variety. This study proposes a methodology for integrating clinical and multi-omics data, with the aim of identifying cancer subtypes by fuzzy clustering and thereby representing the gradations between different molecular profiles. A better characterization of tumors into molecular subtypes can contribute to a more personalized and assertive medicine. The omics data sets to be integrated are chosen using a classifier whose target class is defined by results from the literature. Next, the data sets are pre-processed to reduce their high dimensionality. The selected data are integrated and then clustered. The fuzzy C-means algorithm was chosen for its ability to account for patients having characteristics of different groups, which is not possible with classical clustering methods. As a case study, colorectal cancer (CRC) data were used. CRC has the fourth highest incidence in the world population and the third highest in Brazil. Methylation, miRNA expression and mRNA expression data were extracted from The Cancer Genome Atlas (TCGA) project portal. Adding miRNA expression and methylation data to an mRNA expression classifier from the literature increased its accuracy by 5 percentage points. Therefore, methylation, miRNA and mRNA expression data were used in this work. The attributes of each data set were selected, yielding a significant reduction in the number of attributes.
Groups were identified using the fuzzy C-means algorithm. Varying the hyperparameters of this algorithm, namely the number of groups and the fuzzification parameter, allowed the best-performing combination to be chosen. This choice considered the effect of parameter variation on biological characteristics, especially on the overall survival of patients. The clustering showed that samples considered not grouped had biological characteristics shared between groups of different prognoses. The results obtained by combining clinical and omics data proved promising for better predicting the phenotype.
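The abstract names fuzzy C-means and its two hyperparameters, the number of groups and the fuzzification parameter. The following is a minimal sketch of the standard algorithm, not the author's pipeline. The function name `fuzzy_c_means`, its parameters, the squared Euclidean distance and the random initialisation are all assumptions made for illustration.

```python
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Illustrative fuzzy C-means (hypothetical helper, not from the thesis).

    points: list of equal-length tuples of floats
    c:      number of groups
    m:      fuzzification parameter, must be > 1
    Returns (centers, memberships); memberships[i][j] is the degree to
    which point i belongs to group j, and each row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Centers are means of the points weighted by membership ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(tuple(
                sum(w[i] * points[i][d] for i in range(n)) / total
                for d in range(dim)
            ))
        # Memberships are recomputed from squared distances to the centers.
        new_u = []
        for p in points:
            d2 = [sum((a - b) ** 2 for a, b in zip(p, ctr)) for ctr in centers]
            if any(v == 0.0 for v in d2):
                # A point sitting exactly on one or more centers belongs
                # fully (and equally) to those centers.
                hits = [v == 0.0 for v in d2]
                k = sum(hits)
                row = [1.0 / k if h else 0.0 for h in hits]
            else:
                # u_ij is proportional to d_ij ** (-2 / (m - 1)); with
                # squared distances the exponent becomes -1 / (m - 1).
                inv = [v ** (-1.0 / (m - 1.0)) for v in d2]
                total = sum(inv)
                row = [v / total for v in inv]
            new_u.append(row)
        shift = max(abs(a - b) for ra, rb in zip(u, new_u) for a, b in zip(ra, rb))
        u = new_u
        if shift < tol:
            break
    return centers, u
```

Values of `m` close to 1 give nearly crisp assignments, while larger values spread membership more evenly across groups, which is why the two hyperparameters are usually tuned together.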
42

Multitemporal mapping of burned areas in mixed landscapes in eastern Zambia

Malambo, Lonesome 08 December 2014
Fires occur extensively across Zambia every year, a problem recognized as a major threat to biodiversity. Yet, basic tools for mapping at a spatial and temporal scale that provide useful information for understanding and managing this problem are not available. The objectives of this research were: to develop a method to map the spatio-temporal seasonal fire occurrence using satellite imagery, to develop a technique for estimating missing data in the satellite imagery considering the possibility of change in land cover over time, and to demonstrate applicability of these new tools by analyzing the fine-scale seasonal patterns of landscape fires in eastern Zambia. A new approach for mapping burned areas uses multitemporal image analysis with a fuzzy clustering algorithm to automatically select spectral-temporal signatures that are then used to classify the images to produce the desired spatio-temporal burned area information. Testing with Landsat data (30m resolution) in eastern Zambia showed accuracies in predicting burned areas above 92%. The approach is simple to implement, data driven, and can be automated, which can facilitate quicker production of burned area information. A profile-based approach for filling missing data uses multitemporal imagery and exploits the similarity in land cover temporal profiles and spatial relationships to reliably estimate missing data even in areas with significant changes. Testing with simulated missing data from an 8-image spectral index sequence showed highly correlated (R2 of 0.78-0.92) and precise estimates (deviations 4-7%) compared to actual values. The profile-based approach overcomes the common requirement of gap-filling methods that there is gradual or no change in land cover, and provides accurate gap-filling under conditions of both gradual and abrupt changes. The spatio-temporal progression of landscape burning was evaluated for the 2009 and 2012 fire seasons (June-November) using Landsat data. 
Results show widespread burning (~ 60%) with most fires occurring late (August-October) in the season. Fire occurrence and burn patch sizes decreased with increasing settlement density and landscape fragmentation reflecting human influences and fuel availability. Small fires (< 5ha) are predominant and were significantly under-detected (>50%) by a global dataset (MODIS Burned Area Product (500m resolution)), underscoring the critical need of higher geometric resolution imagery such as Landsat imagery for mapping such fine-scale fire activity. / Ph. D.
43

Enhancing fuzzy associative rule mining approaches for improving prediction accuracy : integration of fuzzy clustering, apriori and multiple support approaches to develop an associative classification rule base

Sowan, Bilal Ibrahim January 2011
Building an accurate and reliable prediction model for different application domains is one of the most significant challenges in knowledge discovery and data mining. This thesis focuses on building and enhancing a generic predictive model for estimating a future value by extracting association rules (knowledge) from a quantitative database. This model is applied to several data sets obtained from different benchmark problems, and the results are evaluated through extensive experimental tests. The thesis presents an incremental development process for the prediction model with three stages. Firstly, a Knowledge Discovery (KD) model is proposed by integrating Fuzzy C-Means (FCM) with the Apriori approach to extract Fuzzy Association Rules (FARs) from a database for building a Knowledge Base (KB) to predict a future value. The KD model has been tested with two road-traffic data sets. Secondly, the initial model has been further developed by including a diversification method in order to obtain more reliable FARs and to identify the best and most representative rules. The resulting Diverse Fuzzy Rule Base (DFRB) maintains high quality and diverse FARs offering a more reliable and generic model. The model uses FCM to transform quantitative data into fuzzy ones, while a Multiple Support Apriori (MSapriori) algorithm is adapted to extract the FARs from fuzzy data. The correlation values for these FARs are calculated, and an efficient orientation for filtering FARs is performed as a post-processing method. The FARs diversity is maintained through the clustering of FARs, based on the concept of the sharing function technique used in multi-objective optimization. The best and the most diverse FARs are obtained as the DFRB to utilise within the Fuzzy Inference System (FIS) for prediction. The third stage of development proposes a hybrid prediction model called Fuzzy Associative Classification Rule Mining (FACRM) model.
This model integrates the improved Gustafson-Kessel (G-K) algorithm, the proposed Fuzzy Associative Classification Rules (FACR) algorithm and the proposed diversification method. The improved G-K algorithm transforms quantitative data into fuzzy data, while the FACR algorithm generates significant rules (Fuzzy Classification Association Rules (FCARs)) by employing the improved multiple support threshold, associative classification and vertical scanning format approaches. These FCARs are then filtered by calculating the correlation value and the distance between them. The advantage of the proposed FACRM model is that it builds a generalized prediction model able to deal with different application domains. The FACRM model is validated using different benchmark data sets from the University of California, Irvine (UCI) machine learning repository and the KEEL (Knowledge Extraction based on Evolutionary Learning) repository, and the results of the proposed FACRM are also compared with other existing prediction models. The experimental results show that the error rate and generalization performance of the proposed model are better than those of the commonly used models on the majority of data sets. A new method for feature selection entitled Weighting Feature Selection (WFS) is also proposed. The WFS method aims to improve the performance of the FACRM model. The prediction performance is improved by minimizing the prediction error and reducing the number of generated rules. The prediction results of FACRM employing WFS have been compared with those of the FACRM and Stepwise Regression (SR) models for different data sets. The performance analysis and comparative study show that the proposed prediction model provides an effective approach that can be used within a decision support system.
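To make the notion of a fuzzy association rule concrete, the following sketch computes fuzzy support and confidence from membership degrees. It assumes the minimum as the conjunction operator, which is one common choice and is not stated in the abstract. The function names and the item labels such as `speed=high` are hypothetical.

```python
def fuzzy_support(records, items):
    """Mean over all records of the minimum membership across `items`.

    records: list of dicts mapping a fuzzy item label to a degree in [0, 1]
    items:   iterable of fuzzy item labels forming the itemset
    A label missing from a record counts as membership 0.
    """
    items = list(items)
    if not records or not items:
        return 0.0
    return sum(min(r.get(i, 0.0) for i in items) for r in records) / len(records)


def fuzzy_confidence(records, antecedent, consequent):
    """Support of antecedent and consequent together, divided by the
    support of the antecedent alone (0.0 if the antecedent never holds)."""
    sup_a = fuzzy_support(records, antecedent)
    if sup_a == 0.0:
        return 0.0
    return fuzzy_support(records, list(antecedent) + list(consequent)) / sup_a


# Hypothetical fuzzified records, e.g. as produced by a clustering step.
records = [
    {"speed=high": 0.8, "flow=low": 0.6},
    {"speed=high": 0.2, "flow=low": 0.9},
    {"speed=high": 1.0, "flow=low": 1.0},
]
rule_confidence = fuzzy_confidence(records, ["speed=high"], ["flow=low"])
```

An Apriori-style miner would keep only itemsets whose fuzzy support exceeds a threshold; a multiple-support variant would let that threshold differ per item.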
44

Development of Partially Supervised Kernel-based Proximity Clustering Frameworks and Their Applications

Graves, Daniel 06 1900
The focus of this study is the development and evaluation of a new partially supervised learning framework. This framework belongs to an emerging field in machine learning that augments unsupervised learning processes with some elements of supervision. It is based on proximity fuzzy clustering, where an active learning process is designed to query for the domain knowledge required in the supervision. Furthermore, the framework is extended to the parametric optimization of the kernel function in the proximity fuzzy clustering algorithm, where the goal is to achieve interesting non-spherical cluster structures through a non-linear mapping. It is demonstrated that the performance of kernel-based clustering is sensitive to the selection of these kernel parameters. Proximity hints procured from domain knowledge are exploited in the partially supervised framework. The theoretical developments with proximity fuzzy clustering are evaluated in several interesting and practical applications. One such problem is the clustering of a set of graphs based on their structural and semantic similarity. The segmentation of music is a second problem for proximity fuzzy clustering, where the aim is to determine the points in time, i.e. boundaries, of significant structural changes in the music. Finally, a time series prediction problem using a fuzzy rule-based system is established and evaluated. The antecedents of the rules are constructed by clustering the time series using proximity information in order to localize the behavior of the rule consequents in the architecture. Evaluation of these efforts on both synthetic and real-world data demonstrates that proximity fuzzy clustering is well suited for a variety of problems. / Digital Signals and Image Processing
45

Type-2 Neuro-Fuzzy System Modeling with Hybrid Learning Algorithm

Yeh, Chi-Yuan 19 July 2011
We propose a novel approach for building a type-2 neuro-fuzzy system from a given set of input-output training data. For an input pattern, a corresponding crisp output of the system is obtained by combining the inferred results of all the rules into a type-2 fuzzy set which is then defuzzified by applying a type reduction algorithm. Karnik and Mendel proposed an algorithm, called the KM algorithm, to compute the centroid of an interval type-2 fuzzy set efficiently. Based on this algorithm, Liu developed a centroid type-reduction strategy to do type reduction for type-2 fuzzy sets. A type-2 fuzzy set is decomposed into a collection of interval type-2 fuzzy sets by α-cuts. Then the KM algorithm is called for each interval type-2 fuzzy set iteratively. However, the initialization of the switch point in each application of the KM algorithm is not a good one. In this thesis, we present an improvement to Liu's algorithm. We employ the result previously obtained to construct the starting values in the current application of the KM algorithm. Convergence in each iteration except the first one can then be sped up, and type reduction for type-2 fuzzy sets can be done faster. The efficiency of the improved algorithm is analyzed mathematically and demonstrated by experimental results. Constructing a type-2 neuro-fuzzy system involves two major phases, structure identification and parameter identification. We propose a method which incorporates a self-constructing fuzzy clustering algorithm and an SVD-based least squares estimator for structure identification of type-2 neuro-fuzzy modeling. The self-constructing fuzzy clustering method is used to partition the training data set into clusters through input-similarity and output-similarity tests. The membership function associated with each cluster is defined with the mean and deviation of the data points included in the cluster.
Then, by applying the SVD-based least squares estimator, a type-2 fuzzy TSK IF-THEN rule is derived from each cluster to form a fuzzy rule base. After that a fuzzy neural network is constructed. In the parameter identification phase, the parameters associated with the rules are then refined through learning. We propose a hybrid learning algorithm which incorporates particle swarm optimization and an SVD-based least squares estimator to refine the antecedent parameters and the consequent parameters, respectively. We demonstrate the effectiveness of our proposed approach in constructing type-2 neuro-fuzzy systems by showing the results for two nonlinear functions and two real-world benchmark datasets. In addition, we use the proposed approach to construct a type-2 neuro-fuzzy system to forecast the daily Taiwan Stock Exchange Capitalization Weighted Stock Index (TAIEX). Experimental results show that our forecasting system performs better than other methods.
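As an illustration of the Karnik-Mendel iteration mentioned in the abstract, the following is a minimal sketch for a discretised interval type-2 fuzzy set. The function name `km_endpoint`, its parameters and the optional `init` starting value are hypothetical and chosen for illustration. It is not the thesis's implementation; the `init` argument only gestures at the idea of starting one run from a previously obtained result.

```python
def km_endpoint(x, lower, upper, left=True, init=None, max_iter=100):
    """One endpoint of the centroid interval of an interval type-2 fuzzy set.

    x:      sample points, sorted in ascending order
    lower:  lower membership grades at each x[i]
    upper:  upper membership grades at each x[i] (lower[i] <= upper[i])
    left:   True for the left endpoint, False for the right endpoint
    init:   optional starting value for the endpoint (assumed to lie
            within the range of x); defaults to the centroid obtained
            from the midpoints of the membership bounds
    The grades are assumed to give a strictly positive weight sum.
    """
    if init is None:
        theta = [(lo + up) / 2.0 for lo, up in zip(lower, upper)]
        c = sum(xi * t for xi, t in zip(x, theta)) / sum(theta)
    else:
        c = init
    for _ in range(max_iter):
        # The current estimate c acts as the switch point: points at or
        # below it take one bound, the remaining points take the other.
        theta = []
        for xi, lo, up in zip(x, lower, upper):
            if left:
                theta.append(up if xi <= c else lo)
            else:
                theta.append(lo if xi <= c else up)
        c_new = sum(xi * t for xi, t in zip(x, theta)) / sum(theta)
        if abs(c_new - c) < 1e-12:
            return c_new
        c = c_new
    return c


x = [1.0, 2.0, 3.0, 4.0, 5.0]
lower = [0.1, 0.3, 0.5, 0.3, 0.1]
upper = [0.4, 0.7, 1.0, 0.7, 0.4]
c_left = km_endpoint(x, lower, upper, left=True)
c_right = km_endpoint(x, lower, upper, left=False)
```

For this symmetric example the two endpoints lie at equal distances on either side of 3. Passing an already converged value as `init` lets the loop stop after a single pass, which is the intuition behind reusing earlier results across successive calls.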
46

Fuzzy Cluster-Based Query Expansion

Tai, Chia-Hung 29 July 2004
Advances in information and network technologies have fostered the creation and availability of a vast amount of online information, typically in the form of text documents. Information retrieval (IR) pertains to determining the relevance between a user query and documents in the target collection, then returning those documents that are likely to satisfy the user's information needs. One challenging issue in IR is word mismatch, which occurs when concepts can be described by different words in the user queries and/or documents. Query expansion is a promising approach for dealing with word mismatch in IR. In this thesis, we develop a fuzzy cluster-based query expansion technique to solve the word mismatch problem. Using existing expansion techniques (i.e., global analysis and non-fuzzy cluster-based query expansion) as performance benchmarks, our empirical results suggest that the fuzzy cluster-based query expansion technique can provide a more accurate query result than the benchmark techniques can.
47

Neuro-Fuzzy System Modeling with Self-Constructed Rules and Hybrid Learning

Ouyang, Chen-Sen 09 November 2004
Neuro-fuzzy modeling is an efficient computing paradigm for system modeling problems. It mainly integrates two well-known approaches, neural networks and fuzzy systems, and therefore possesses the advantages of both, i.e., learning capability, robustness, human-like reasoning, and high understandability. Up to now, many approaches have been proposed for neuro-fuzzy modeling. However, many problems remain to be solved. We propose in this thesis two self-constructing rule generation methods, i.e., similarity-based rule generation (SRG) and similarity-and-merge-based rule generation (SMRG), and one hybrid learning algorithm (HLA) for structure identification and parameter identification, respectively, of neuro-fuzzy modeling. SRG and SMRG group the input-output training data into a set of fuzzy clusters incrementally based on similarity tests on the input and output spaces. Membership functions associated with each cluster are defined according to statistical means and deviations of the data points included in the cluster. Additionally, SMRG employs a merging mechanism to merge similar clusters dynamically. Then a zero-order or first-order TSK-type fuzzy IF-THEN rule is extracted from each cluster to form an initial fuzzy rule-base which can be directly employed for fuzzy reasoning or be further refined in the next phase of parameter identification. Compared with other methods, both our SRG and SMRG have the advantages of generating fuzzy rules quickly, matching membership functions closely with the real distribution of the training data points, and avoiding the generation of the whole set of clusters from scratch when new training data are considered. In addition, SMRG supports a more reasonable and quicker mechanism for cluster merging to alleviate the problems of data-input-order bias and redundant clusters, which are encountered in SRG and other incremental clustering approaches.
To refine the fuzzy rules obtained in the structure identification phase, a zero-order or first-order TSK-type fuzzy neural network is constructed accordingly in the parameter identification phase. Then, we develop an HLA composed of a recursive SVD-based least squares estimator and the gradient descent method to train the network. Our HLA has the advantage of alleviating the local minimum problem. In addition, it learns faster, consumes less memory, and produces lower approximation errors than other methods. To verify the practicability of our approaches, we apply them to function approximation and classification. For function approximation, we apply our approaches to model several nonlinear functions and real cases from measured input-output datasets. For classification, our approaches are applied to a problem of human object segmentation. A fuzzy self-clustering algorithm is used to divide the base frame of a video stream into a set of segments which are then categorized as foreground or background based on a combination of multiple criteria. Then, human objects in the base frame and the remaining frames of the video stream are precisely located by a fuzzy neural network which is constructed with the fuzzy rules previously obtained and is trained by our proposed HLA. Experimental results show that our approaches can improve the accuracy of human object identification in video streams and work well even when the human object presents no significant motion in an image sequence.
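The incremental grouping described in the abstract, with similarity tests on the input and output spaces and cluster statistics updated as data arrive, can be sketched roughly as follows. This is a loose one-dimensional illustration and not the SRG or SMRG method itself. The Gaussian similarity, the thresholds `rho` and `tau`, the floor `sigma0` on the deviation and all names are assumptions.

```python
import math


def incremental_clusters(samples, rho=0.5, tau=1.0, sigma0=0.5):
    """Group (x, y) pairs one at a time (hypothetical sketch).

    rho:    minimum Gaussian similarity in the input space to join a cluster
    tau:    maximum allowed distance to the cluster's mean output
    sigma0: floor on the deviation used in the similarity test
    Each cluster stores its size, input mean, sum of squared input
    deviations ("m2") and output mean.
    """
    clusters = []
    for x, y in samples:
        best, best_sim = None, 0.0
        for cl in clusters:
            sigma = max(math.sqrt(cl["m2"] / cl["n"]), sigma0)
            sim = math.exp(-((x - cl["mean_x"]) / sigma) ** 2)
            # The sample must pass both the input and the output test.
            if sim >= rho and abs(y - cl["mean_y"]) <= tau and sim > best_sim:
                best, best_sim = cl, sim
        if best is None:
            # No existing cluster is similar enough: start a new one.
            clusters.append({"n": 1, "mean_x": x, "m2": 0.0, "mean_y": y})
        else:
            # Running update of mean and squared deviation (Welford's
            # method), so earlier samples never need to be revisited.
            best["n"] += 1
            delta = x - best["mean_x"]
            best["mean_x"] += delta / best["n"]
            best["m2"] += delta * (x - best["mean_x"])
            best["mean_y"] += (y - best["mean_y"]) / best["n"]
    return clusters


samples = [(0.0, 0.0), (0.1, 0.1), (5.0, 5.0), (5.2, 5.1), (0.2, 0.0)]
found = incremental_clusters(samples)
```

Because each sample is assigned as it arrives, the result can depend on the order of the data, which is the kind of input-order bias that a merging step is meant to reduce.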
48

Customer Load Profiling and Aggregation

Chang, Rung-Fang 28 June 2002
Power industry restructuring has created many opportunities for customers to reduce their electricity bills. In order to facilitate retail choice in a competitive power market, knowledge of the hourly load shape by customer class is necessary. Requiring a meter as a prerequisite for lower voltage customers to choose a power supplier is not considered practical at the present time. For an Energy Service Provider (ESP) to assign customers to specific load profiles with certainty factors, a technique based on load research and customers' monthly energy usage data is required for a preliminary screening of customer load profiles. Distribution systems supply electricity to different mixtures of customers. Due to the lack of field measurements, load point data used in distribution network studies have various degrees of uncertainty. In order to take the expected uncertainties in the demand into account, many previous methods have used fuzzy load models in their studies. However, the issue of deriving these models has not been discussed. To address this issue, an approach for building these fuzzy load models is needed. Load aggregation allows customers to purchase electricity at a lower price. In some contracts, load factor is considered as one critical aspect of aggregation. To facilitate a better load aggregation in distribution networks, feeder reconfiguration could be used to improve the load factor in a distribution subsystem. To solve the aforementioned problems, two data mining techniques, namely, the fuzzy c-means (FCM) method and an Artificial Neural Network (ANN) based pattern recognition technique, are proposed for load profiling and customer class assignment. As a variant of the previous load profiling technique, customer hourly load distributions obtained from load research can be converted to fuzzy membership functions based on a possibility-probability consistency principle.
With the customer class fuzzy load profiles, customer monthly power consumption and feeder load measurements, hourly loads of each distribution transformer on the feeder can be estimated and used in distribution network analysis. After feeder models are established, feeder reconfiguration based on binary particle swarm optimization (BPSO) technique is used to improve feeder load factors. Test results based on several simple sample networks have shown that the proposed feeder reconfiguration method could improve customers' position for a good bargain in electricity service.
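The abstract treats load factor as a key quantity in aggregation. Load factor is conventionally the ratio of average load to peak load over a period, and the small sketch below shows why combining customers with complementary profiles can raise it. The function names and the two toy profiles are hypothetical and are not data from the thesis.

```python
def load_factor(hourly_load):
    """Average load divided by peak load over the given period."""
    peak = max(hourly_load)
    if peak <= 0:
        raise ValueError("peak load must be positive")
    return sum(hourly_load) / (len(hourly_load) * peak)


def aggregate(profiles):
    """Sum several equal-length load profiles hour by hour."""
    return [sum(hour) for hour in zip(*profiles)]


# Two toy customers whose peaks fall in different hours.
day_peaking = [1.0, 3.0]
night_peaking = [3.0, 1.0]
combined = aggregate([day_peaking, night_peaking])
```

Each toy customer alone has a load factor of about 0.67, while the combined profile is flat. A reconfiguration search would aim to group loads on a feeder so that this ratio improves.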
49

Development of Partially Supervised Kernel-based Proximity Clustering Frameworks and Their Applications

Graves, Daniel Unknown Date
No description available.
50

AIM - A Social Media Monitoring System for Quality Engineering

Bank, Mathias 27 June 2013
In the last few years the World Wide Web has dramatically changed the way people communicate with each other. The growing availability of Social Media Systems like Internet fora, weblogs and social networks ensures that the Internet is today what it was originally designed for: a technical platform in which all users are able to interact with each other. Nowadays, there are billions of user comments available discussing all aspects of life, and the data source is still growing. This thesis investigates whether it is possible to use this growing amount of freely provided user comments to extract quality related information. The concept is based on the observation that customers are not only posting marketing relevant information. They also publish product oriented content including positive and negative experiences. It is assumed that this information represents a valuable data source for quality analyses: the original voices of the customers promise to specify a more exact and more concrete definition of "quality" than the one that is available to manufacturers or market researchers today. However, the huge amount of unstructured user comments makes their evaluation very complex. It is impossible for an analysis protagonist to manually investigate the provided customer feedback. Therefore, Social Media specific algorithms have to be developed to collect, pre-process and finally analyze the data. This has been done by the Social Media monitoring system AIM (Automotive Internet Mining) that is the subject of this thesis. It investigates how manufacturers, products, product features and related opinions are discussed in order to estimate the overall product quality from the customers' point of view. AIM is able to track different types of data sources using a flexible multi-agent based crawler architecture.
In contrast to classical web crawlers, the multi-agent based crawler supports individual crawling policies to minimize the download of irrelevant web pages. In addition, an unsupervised wrapper induction algorithm is introduced to automatically generate content extraction parameters which are specific to the crawled Social Media systems. The extracted user comments are analyzed by different content analysis algorithms to gain a deeper insight into the discussed topics and opinions. Three different topic types are supported depending on the analysis needs.
* The creation of highly reliable analysis results is realized by using a special context-aware taxonomy-based classification system.
* Fast ad-hoc analyses are applied on top of classical fulltext search capabilities.
* Finally, AIM supports the detection of blind spots by using a new fuzzified hierarchical clustering algorithm. It generates topical clusters while supporting multiple topics within each user comment.
All three topic types are treated in a unified way to enable an analysis protagonist to apply all methods simultaneously and interchangeably. The systematically processed user comments are visualized within an easy and flexible interactive analysis frontend. Special abstraction techniques support the investigation of thousands of user comments with minimal time effort. Specifically created indices show the relevancy and customer satisfaction of a given topic. / In recent years the World Wide Web has changed dramatically. A few years ago it was still primarily an information source in which a small share of users could publish content; it has since developed into a communication platform in which every user can actively participate. The resulting volume of data covers every aspect of daily life, including quality topics. Analyzing this data promises to improve quality assurance measures considerably.
It makes it possible to address topics that are hard to measure with classical sensors. The systematic and reproducible analysis of user-generated data, however, requires adapting existing tools and developing new Social Media specific algorithms. To this end, this thesis creates a completely new Social Media monitoring system with which an analyst can analyze thousands of user posts with minimal time expenditure. Applying the system has revealed several advantages that make it possible to recognize the customer-driven definition of "quality".
