Global ETD Search

21	A framework for comparing heterogeneous objects: on the similarity measurements for fuzzy, numerical and categorical attributes Bashon, Yasmina M., Neagu, Daniel, Ridley, Mick J. 09 1900 (has links) Real-world data collections are often heterogeneous (represented by a set of mixed attribute data types: numerical, categorical and fuzzy). Since most available similarity measures can only be applied to one type of data, it becomes essential to construct an appropriate similarity measure for comparing such complex data. In this paper, a framework of new and unified similarity measures is proposed for comparing heterogeneous objects described by numerical, categorical and fuzzy attributes. Examples are used to illustrate, compare and discuss the applications and efficiency of the proposed approach to heterogeneous data comparison and clustering. Similarity measures ; Fuzzy objects ; Fuzzy attributes ; Numerical attributes ; Categorical attributes ; Clustering-algorithm ; Classification ; Information ; Distance ; Words ; Sets
22	Methods for Blind Separation of Co-Channel BPSK Signals Arriving at an Antenna Array and Their Performance Analysis Anand, K 07 1900 (has links) Capacity improvement of wireless communication systems is a very important area of current research. The goal is to increase the number of users supported by the system per unit of allotted bandwidth. One important way of achieving this improvement is to use multiple antennas backed by intelligent signal processing. In this thesis, we present methods for blind separation of co-channel BPSK signals arriving at an antenna array. These methods consist of two parts, Constellation Estimation and Assignment. We give two methods for constellation estimation, Smallest Distance Clustering and Maximum Likelihood Estimation. While the latter is theoretically sound, the former is computationally simple and intuitively appealing. We show that Maximum Likelihood Constellation Estimation is well approximated by the Smallest Distance Clustering Algorithm at high SNR. The Assignment Algorithm exploits the structure of the BPSK signals. We observe that both methods for estimating the constellation vectors perform very well at high SNR and nearly attain the Cramér-Rao bounds. Using this fact, and noting that the Assignment Algorithm causes negligible error at high SNR, we derive an upper bound on the probability of bit error for the above methods at high SNR. This upper bound falls very rapidly with increasing SNR, showing that our constellation estimation-assignment approach is very efficient. Simulation results are given to demonstrate the usefulness of the bounds. Electrical Communications Antenna Arrays Wireless Communication System Intelligent Signal Processing BPSK SNR Maximum Likelihood Estimation Smallest Distance Clustering Algorithm Narrowband Model
23	Hierarchical Data Structures for Pattern Recognition Choudhury, Sabyasachy 05 1900 (has links) Pattern recognition is an important area with potential applications in computer vision, speech understanding, knowledge engineering, bio-medical data classification, earth sciences, life sciences, economics, psychology, linguistics, etc. Clustering is an unsupervised classification process coming under the area of pattern recognition. There are two types of clustering approaches: 1) non-hierarchical methods and 2) hierarchical methods. Non-hierarchical algorithms are iterative in nature and perform well in the context of isotropic clusters. The time complexity of these algorithms is of order O(n) and above. Hierarchical agglomerative algorithms, on the other hand, are effective when clusters are non-isotropic. The single linkage method of the hierarchical category produces a dendrogram which corresponds to the minimal spanning tree; conventional approaches are time consuming, requiring O(n^2) computational time. In this thesis we propose an intelligent partitioning scheme for generating the minimal spanning tree in the co-ordinate space. This is computationally elegant as it avoids the computation of similarity between many pairs of samples. The minimal spanning tree generated can be used to produce C disjoint clusters by breaking the (C-1) longest edges in the tree. A systolic architecture has been proposed to increase the speed of the algorithm further. A simulation study has been conducted and the corresponding results are reported. The simulation package has been developed on a DEC-1090 in Pascal. The simulation study shows that the parallel implementation reduces the time enormously. The number of processors required for the parallel implementation is a constant, making the approach more attractive.
Texture analysis and synthesis have been extensively studied in the context of computer vision. Two important approaches which have been studied extensively by earlier researchers are the statistical and structural approaches. Texture is understood to be a periodic pattern with primitive sub-patterns repeating in a particular fashion. This has been used to characterize texture with the help of a hierarchical data structure, the tree. A tree data structure is convenient because, along with operations such as merging, splitting, deleting a node and adding a node, it is useful for handling a periodic pattern. Various functions used to characterize texture, such as angular second moment and correlation, have been translated into the new language of hierarchical data structures. Computer and Information Science Pattern Recognition Dendrogram Euclidean Space Clustering Algorithm Parallel Algorithm Systolic Arrays Haralick's work Horowitz's work Minimal Spanning Tree (MST)
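The clustering step described in entry 23 (building a minimal spanning tree and breaking the C-1 longest edges to obtain C disjoint clusters) can be illustrated with a short sketch. This is not the partitioning scheme or systolic architecture proposed in the thesis: it uses the conventional O(n^2) Prim's algorithm on Euclidean points, and the names `mst_edges` and `mst_clusters` are hypothetical, chosen only for this example.

```python
import math


def mst_edges(points):
    """Return the MST of the complete Euclidean graph as (weight, i, j) edges.

    Conventional O(n^2) Prim's algorithm; NOT the thesis's partitioning scheme.
    """
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known connection cost to the tree
    parent = [-1] * n       # tree vertex that provides that cheapest cost
    edges = []
    if n:
        best[0] = 0.0
    for _ in range(n):
        # Pick the vertex outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        # Relax the connection cost of every vertex still outside the tree.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def mst_clusters(points, c):
    """Label points by breaking the (c - 1) longest MST edges."""
    edges = sorted(mst_edges(points))
    kept = edges[:max(0, len(edges) - (c - 1))]
    root = list(range(len(points)))  # union-find forest over point indices

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]  # path halving
            x = root[x]
        return x

    for _, i, j in kept:
        root[find(i)] = find(j)
    return [find(i) for i in range(len(points))]


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (20, 0)]
labels = mst_clusters(points, 3)
```

Points that share a label belong to the same cluster. The labels are arbitrary representative indices, so only equality between them is meaningful.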
24	Measurement and comparison of clustering algorithms Javar, Shima January 2007 (has links) In this project, a number of different clustering algorithms are described and their workings explained. They are compared to each other by implementing them on a number of graphs with a known architecture. These clustering algorithms, in the order they are implemented, are as follows: Nearest neighbour hillclimbing, Nearest neighbour big step hillclimbing, Best neighbour hillclimbing, Best neighbour big step hillclimbing, Gem 3D, K-means simple, K-means Gem 3D, One cluster and One cluster per node. The graphs are Unconnected, Directed KX, Directed Cycle KX and Directed Cycle. The results of these clusterings are compared with each other according to three criteria: Time, Quality and Extremity of nodes distribution. This enables us to find out which algorithm is most suitable for which graph. These artificial graphs are then compared with the reference architecture graph to reach the conclusions. Clustering algorithm Module dependency graph Quality Extremity Precision Recall artificial graph Reference graph Cluster Node Edge Implementation time Computer science Datavetenskap
25	Modelagem fuzzy usando agrupamento condicional Nogueira, Tatiane Marques 06 August 2008 (has links) The combination of fuzzy systems with clustering algorithms is widely accepted in the scientific community, mainly because it adheres to the principle of balancing advantages in computational intelligence, in which different methodologies collaborate with each other, enhancing the usefulness and applicability of the resulting systems. Fuzzy modeling using clustering algorithms offers the transparency and comprehensibility typical of linguistic fuzzy systems while benefiting from the dimensionality reduction made possible by clustering. This work presents the Fuzzy-CCM method (Fuzzy Conditional Clustering based Modeling), a new approach to fuzzy modeling based on the Fuzzy Conditional Clustering algorithm, which aims to provide new means of addressing the interpretability of fuzzy rule bases. With the Fuzzy-CCM method, the balance between interpretability and accuracy of fuzzy rules is handled by defining contexts formed from a small number of input variables and by generating clusters induced by these contexts. The rules are generated in a different format, with linguistic variables and clusters in the antecedent. Experiments have been carried out in different knowledge domains to validate the proposed approach by comparing its results with those obtained by the Wang&Mendel and conventional Fuzzy C-Means methods. The theoretical foundations, the advantages of the method, the experiments and the results are presented and discussed.
Automatic generation of fuzzy rules Fuzzy logic Clustering method Fuzzy system Interpretability Conditional clustering algorithm Fuzzy systems Fuzzy modeling
26	Measurement and comparison of clustering algorithms Javar, Shima January 2007 (has links) In this project, a number of different clustering algorithms are described and their workings explained. They are compared to each other by implementing them on a number of graphs with a known architecture. These clustering algorithms, in the order they are implemented, are as follows: Nearest neighbour hillclimbing, Nearest neighbour big step hillclimbing, Best neighbour hillclimbing, Best neighbour big step hillclimbing, Gem 3D, K-means simple, K-means Gem 3D, One cluster and One cluster per node. The graphs are Unconnected, Directed KX, Directed Cycle KX and Directed Cycle. The results of these clusterings are compared with each other according to three criteria: Time, Quality and Extremity of nodes distribution. This enables us to find out which algorithm is most suitable for which graph. These artificial graphs are then compared with the reference architecture graph to reach the conclusions. Clustering algorithm Module dependency graph Quality Extremity Precision Recall artificial graph Reference graph Cluster Node Edge Implementation time Computer Sciences Datavetenskap (datalogi)
27	Outlier Detection with Applications in Graph Data Mining Ranga Suri, N N R January 2013 (has links) (PDF) Outlier detection is an important data mining task due to its applicability in many contemporary applications such as fraud detection and anomaly detection in networks. It assumes significance due to the general perception that outliers represent evolving novel patterns in data that are critical to many discovery tasks. Extensive use of various data mining techniques in different application domains gave rise to the rapid proliferation of research work on the outlier detection problem. This has led to the development of numerous methods for detecting outliers in various problem settings. However, most of these methods deal primarily with numeric data. Therefore, the problem of outlier detection in categorical data has been considered in this work, with novel methods developed to address various research issues. Firstly, a ranking-based algorithm for detecting a likely set of outliers in given categorical data has been developed, employing two independent ranking schemes. Subsequently, the issue of data dimensionality has been addressed by proposing a novel unsupervised feature selection algorithm for categorical data. Similarly, the uncertainty associated with the outlier detection task has been dealt with by developing a novel rough-sets-based categorical clustering algorithm. Due to the networked nature of the data pertaining to many real-life applications, such as computer communication networks, social networks of friends, citation networks of documents and hyper-linked networks of web pages, outlier detection (also known as anomaly detection) in the graph representation of network data turns out to be an important pattern discovery activity. Accordingly, a novel graph mining method has been envisaged in this thesis based on the concept of community detection in graphs.
In addition to finding anomalous nodes and anomalous edges, this method is capable of detecting various higher-level anomalies that are arbitrary sub-graphs of the input graph. Subsequently, these ideas have been further extended in this thesis to characterize the time-varying behavior of outliers (anomalies) in dynamic network data by defining various categories of temporal outliers (anomalies). Characterizing the behavior of such outliers during the evolution of the network over time is critical for discovering different anomalous connectivity patterns with potential adverse effects, such as intrusions into a computer network. In order to deal with temporal outlier detection in single-instance network/graph data, the link prediction task has been leveraged in this thesis to produce multiple instances of the input graph. Thus, various outlier detection principles have been successfully applied for mining various categories of temporal outliers (anomalies) in the graph representation of network data. Data Mining Graph Data Mining Outlier Detection Categorical Data - Outlier Detection Network/Graph Data - Outlier Detection Graph Data Mining - Outlier Detection Outliers Rough Clustering Algorithm Computer Science
28	Grid-based Energy Aware Mobility Model for FANETs Uddin, Mohammad Messbah January 2022 (has links) Drones flying in squad formation while interconnected in an ad-hoc fashion form what are called Flying Ad hoc Networks (FANETs). FANETs are attracting special interest in the networking community for their deployment in different vital missions. Such missions include rescue missions in case of disasters, monitoring and border control, animal monitoring, and crowd monitoring and management. The main problems researched with FANETs are typically inherited from earlier work on Mobile Ad-hoc Networks (MANETs) and Vehicular Ad-hoc Networks (VANETs). One of the major problems is routing and forwarding gathered data towards the member (i.e., the drone) closest to the sink or the member that acts as a gateway to the Internet to reach the sink. Clustering the FANET nodes (i.e., the drones) is found to be a good solution to this problem. The main contributions of this thesis include a novel gridding technique for the geolocation where the FANET is deployed to perform certain tasks, a grid-based mobility model for UAVs, and an extension of the EMASS algorithm so that it can adapt to our proposed grid-based system. The results demonstrate our mobility model's superiority over one of the most used mobility models, Random walk. FANET Clustering algorithm Unmanned Aerial Vehicle (UAV) Grid-based clustering Mobility model energy consumption sustainability Elektroteknik och elektronik
29	Machine Learning implementation for Stress-Detection Madjar, Nicole, Lindblom, Filip January 2020 (has links) This project applies machine learning methods to a selection of data points to see whether the current methodology for stress detection and the selection of measures could be improved for the company Linkura AB. Linkura AB is a medical technology company based in Linköping that handles, among other things, stress measurement for other companies' employees, as well as health coaching for selecting measures. In this report we experiment with different methods and algorithms under the collective name of unsupervised learning to identify visible patterns and behaviour in the data points, and we then analyze them in relation to the quantity of data received. The methods used during the project are the K-means algorithm and a dynamic hierarchical clustering algorithm. The correlation between the parameters of the data points is analyzed to optimize resource consumption, and experiments with different numbers of parameters are tested and discussed with an expert in stress coaching. The results showed that both algorithms can create clusters for the risk groups; however, the dynamic clustering method clearly indicates the optimal number of clusters that should be used. After consulting mentors and health coaches regarding the analysis of the produced clusters, we concluded that the dynamic hierarchical clustering algorithm gives more accurate clusters to represent risk groups. The conclusion of this project is that the machine learning algorithms used can categorize data points with stress-related behavioural correlations, which is useful when deciding on measures. Further research should be done with a larger data set for better results, and this project can form the basis for such implementations.
BigQuery Cloud Computing Clustering Dynamic clustering algorithm Google Cloud Platform Health coaching HRV-values K-Means algorithm Machine learning Stress Unsupervised learning. Computer and Information Sciences Data- och informationsvetenskap
30	DifFUZZY : a novel clustering algorithm for systems biology Cominetti Allende, Ornella Cecilia January 2012 (has links) Current studies of the highly complex pathobiology and molecular signatures of human disease require the analysis of large sets of high-throughput data, from clinical to genetic expression experiments, containing a wide range of information types. A number of computational techniques are used to analyse such high-dimensional bioinformatics data. In this thesis we focus on the development of a novel soft clustering technique, DifFUZZY, a fuzzy clustering algorithm applicable to a larger class of problems than other soft clustering approaches. This method is better at handling datasets that contain clusters that are curved, elongated or are of different dispersion. We show how DifFUZZY outperforms a number of frequently used clustering algorithms using a number of examples of synthetic and real datasets. Furthermore, a quality measure based on the diffusion distance developed for DifFUZZY is presented, which is employed to automate the choice of its main parameter. We later apply DifFUZZY and other techniques to data from a clinical study of children from The Gambia with different types of severe malaria. The first step was to identify the most informative features in the dataset which allowed us to separate the different groups of patients. This led to us reproducing the World Health Organisation classification for severe malaria syndromes and obtaining a reduced dataset for further analysis. In order to validate these features as relevant for malaria across the continent and not only in The Gambia, we used a larger dataset for children from different sites in Sub-Saharan Africa. With the use of a novel network visualisation algorithm, we identified pathobiological clusters from which we made and subsequently verified clinical hypotheses. 
We finish by presenting conclusions and future directions, including image segmentation and clustering time-series data. We also suggest how we could bridge data modelling with bioinformatics by embedding microarray data into cell models. Towards this end we take as a case study a multiscale model of the intestinal crypt using a cell-vertex model. 518.1
