Global ETD Search

11	Inner Ensembles: Using Ensemble Methods in Learning Step Abbasian, Houman January 2014 (has links) A pivotal moment in machine learning research was the creation of an important new research area known as Ensemble Learning. In this work, we argue that ensembles are a very general concept and that, although they have been widely used, they can be applied in more situations than they have been to date. Rather than using them only to combine the outputs of an algorithm, we can apply them to decisions made inside the algorithm itself, during the learning step. We call this approach Inner Ensembles. The motivation for developing Inner Ensembles was the opportunity to produce models with advantages similar to those of regular ensembles, such as accuracy and stability, plus additional advantages such as comprehensibility, simplicity, rapid classification and a small memory footprint. The main contribution of this work is to demonstrate how broadly this idea can be applied and to highlight its potential impact on all types of algorithms. To support our claim, we first provide a general guideline for applying Inner Ensembles to different algorithms. Then, using this framework, we apply them to two categories of learning methods: supervised and unsupervised. For the former we chose Bayesian networks, and for the latter K-Means clustering. Our results show that 1) the overall performance of Inner Ensembles is significantly better than that of the original methods, and 2) Inner Ensembles provide performance improvements similar to those of regular ensembles. Inner Ensemble Bayesian Network Inner Ensembles Bayesian Network K-Means Inner K-Means
12	APPROXIMATE N-NEAREST NEIGHBOR CLUSTERING ON DISTRIBUTED DATABASES USING ITERATIVE REFINEMENT CALENDER, CHRISTOPHER R. 06 October 2004 (has links) No description available. Computer Science nearest neighbor cluster K-means clustering approximate nearest neighbor approximate k-means
13	High Performance Text Document Clustering Li, Yanjun 13 June 2007 (has links) No description available. Document Clustering Text Mining K-means Bisecting K-means Algorithm Performance Analysis.
14	Motion tracking using feature point clusters Foster, Robert L. Jr. January 1900 (has links) Master of Science / Department of Computing and Information Sciences / David A. Gustafson William Hsu / In this study, we identify a new method of tracking motion over a sequence of images using feature point clusters. We identify and implement a system that takes as input a sequence of images and generates clusters of SIFT features using the K-Means clustering algorithm. Every time the system processes an image, it compares each new cluster to the clusters of previous images, which it stores in a local cache. When at least 25% of the SIFT features that compose a cluster match a cluster in the local cache, the system uses the centroids of both clusters to determine the direction of travel. To establish a direction of travel, we calculate the slope of the line connecting the centroids of the two clusters relative to their Cartesian coordinates in the secondary image. In an experiment using a P3-AT mobile robotic agent equipped with a digital camera, the system receives and processes a sequence of eight images. Experimental results show that the system is able to identify and track the motion of objects using SIFT feature clusters more efficiently when applying spatial outlier detection prior to generating clusters. SIFT Clustering K-Means Player Computer Science (0984)
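The matching rule and slope calculation described in this abstract can be sketched roughly as follows. This is only an illustration, not the thesis's implementation: it assumes each cluster is a dictionary from a feature identifier to an (x, y) image coordinate, and that two features "match" when they share an identifier (real SIFT matching compares descriptors). The names `centroid` and `direction_of_travel` are hypothetical.

```python
import math


def centroid(points):
    """Mean (x, y) of a non-empty list of 2-D points."""
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def direction_of_travel(new_cluster, cached_cluster, threshold=0.25):
    """Return (slope, angle in degrees) between cluster centroids, or None.

    Assumption: clusters map a feature id to (x, y); shared ids stand in
    for matched SIFT features. The 25% threshold comes from the abstract.
    """
    if not new_cluster or not cached_cluster:
        return None
    shared = new_cluster.keys() & cached_cluster.keys()
    if len(shared) / len(new_cluster) < threshold:
        return None  # too few matching features: treat as a different cluster
    x0, y0 = centroid(list(cached_cluster.values()))
    x1, y1 = centroid(list(new_cluster.values()))
    dx, dy = x1 - x0, y1 - y0
    if dx != 0:
        slope = dy / dx
    else:
        # Vertical motion has an infinite slope; no motion is reported as 0.
        slope = math.copysign(math.inf, dy) if dy else 0.0
    return slope, math.degrees(math.atan2(dy, dx))
```

Returning the angle alongside the slope is a design choice of this sketch: the slope alone cannot distinguish motion to the left from motion to the right.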
15	Evaluating Heuristics and Crowding on Center Selection in K-Means Genetic Algorithms McGarvey, William 01 January 2014 (has links) Data clustering involves partitioning data points into clusters where data points within the same cluster have high similarity, but are dissimilar to the data points in other clusters. The k-means algorithm is among the most extensively used clustering techniques. Genetic algorithms (GA) have been successfully used to evolve successive generations of cluster centers. The primary goal of this research was to develop improved GA-based methods for center selection in k-means by using heuristic methods to improve the overall fitness of the initial population of chromosomes along with crowding techniques to avoid premature convergence. Prior to this research, no rigorous systematic examination of the use of heuristics and crowding methods in this domain had been performed. The evaluation included computational experiments involving repeated runs of the genetic algorithm in which values that affect heuristics or crowding were systematically varied and the results analyzed. Genetic algorithm performance under the various configurations was analyzed based upon (1) the fitness of the partitions produced, and by (2) the overall time it took the GA to converge to good solutions. Two heuristic methods for initial center seeding were tested: Density and Separation. Two crowding techniques were evaluated on their ability to prevent premature convergence: Deterministic and Parent Favored Hybrid local tournament selection. Based on the experiment results, the Density method provides no significant advantage over random seeding either in discovering quality partitions or in more quickly evolving better partitions. The Separation method appears to result in an increased probability of the genetic algorithm finding slightly better partitions in slightly fewer generations, and to more quickly converge to quality partitions. 
Both local tournament selection techniques consistently allowed the genetic algorithm to find better quality partitions than roulette-wheel sampling. Deterministic selection consistently found better quality partitions in fewer generations than Parent Favored Hybrid. The combination of Separation center seeding and Deterministic selection performed better than any other combination, achieving the lowest mean best SSE value more than twice as often as any other combination. On all 28 benchmark problem instances, the combination identified solutions that were at least as good as any identified by extant methods. crowding genetic algorithms k-means Artificial Intelligence and Robotics Computer Sciences
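The abstract compares configurations by their best SSE (sum of squared errors) value. A minimal sketch of how the SSE of one candidate set of centers might be computed is given below; the function name `sse` and the tuple-based data layout are assumptions made for illustration, and the thesis's chromosome encoding, seeding heuristics and crowding schemes are not reproduced here.

```python
def sse(points, centers):
    """Sum of squared Euclidean distances from each point to its nearest center.

    Assumption: points and centers are equal-length tuples of numbers.
    A lower value means a tighter partition, so a genetic algorithm could
    use it (negated or inverted) as the fitness of a set of centers.
    """
    total = 0.0
    for p in points:
        total += min(
            sum((a - b) ** 2 for a, b in zip(p, c))
            for c in centers
        )
    return total
```

For example, with points `[(0, 0), (0, 2), (10, 0)]` and centers `[(0, 1), (10, 0)]`, the first two points are each at squared distance 1 from `(0, 1)` and the third coincides with its center, giving an SSE of 2.0.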
16	R-medžių analizė, taikant juos duomenų gavybos algoritmams / Analysis of R-trees for Data Mining Algorithms Judeikis, Laimonas 04 July 2014 (has links) Analysis of R-trees for Data Mining Algorithms. R-tree Data mining Clustering algorithms K-means CURE
17	Aplicación de la minería de datos distribuida usando algoritmo de clustering k-means para mejorar la calidad de servicios de las organizaciones modernas caso: Poder judicial [Application of distributed data mining using the k-means clustering algorithm to improve the quality of service of modern organizations, case study: the Judiciary] Mamani Rodríguez, Zoraida Emperatriz January 2015 (has links) Distributed data mining is a research field concerned with applying the knowledge-extraction process to large volumes of information stored in distributed databases. Modern organizations need tools that perform prediction, forecasting, classification and similar tasks online, over databases located on different nodes interconnected via the Internet, so that they can improve the quality of their services. In this context, this work reviews the literature on k-means clustering techniques, develops a concrete proposal, builds a prototype application, and concludes by substantiating the benefits that organizations would obtain from its implementation. Distributed Data Mining K-means Clustering Algorithm Pattern Detection
18	Wireless Network SNR Enhancement Using Mobile Relay Stations Ohannessian, Rostom 13 January 2011 (has links) With the proliferation of wireless technologies, wireless Internet access in public places will become a necessity in the near future. In outdoor areas, where the base stations are sparsely distributed, mobile users at the edge of the network communicate with the base station at a very low rate and thus waste network resources. To solve this problem, one of the previously taken approaches was the use of relay stations to improve the throughput of the network. In this work, we take this approach to the next level by updating the positions of the relays according to the particular distribution of the users at certain time instants. By comparing the proposed scheme to fixed relay placement strategies, we show that the former has 15-60% performance improvement over the latter, in terms of the average SNR of the network. Mobile Relays SNR Enhancement k-means Clustering 0544
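The abstract does not say exactly how relay positions are updated; the record's keywords suggest k-means clustering. Under that assumption, one plausible sketch is to cluster the users' current positions and place one relay at each cluster centroid. The function name `place_relays`, the 2-D coordinate layout and the explicit initial positions are all hypothetical.

```python
def place_relays(users, initial_relays, iterations=20):
    """Lloyd-style k-means: move each relay to the mean of its nearest users.

    Assumption: users and relays are (x, y) tuples; a relay with no
    assigned users stays where it is.
    """
    relays = [tuple(r) for r in initial_relays]
    for _ in range(iterations):
        groups = [[] for _ in relays]
        for ux, uy in users:
            # Assign the user to the closest relay (squared distance).
            nearest = min(
                range(len(relays)),
                key=lambda i: (ux - relays[i][0]) ** 2 + (uy - relays[i][1]) ** 2,
            )
            groups[nearest].append((ux, uy))
        updated = []
        for relay, group in zip(relays, groups):
            if group:
                updated.append((
                    sum(x for x, _ in group) / len(group),
                    sum(y for _, y in group) / len(group),
                ))
            else:
                updated.append(relay)
        if updated == relays:
            break  # assignments are stable, so positions have converged
        relays = updated
    return relays
```

Re-running this at each time instant with the previous relay positions as the starting point would give positions that follow the user distribution, which is the behaviour the abstract describes in general terms.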
20	Non-Line of Sight Identification with Particle Filter Optimization Algorithm in Wireless Location Chen, Tai-Yuan 29 July 2008 (has links) In wireless location systems, received signals may be influenced by non-line of sight (NLOS) propagation errors, which severely degrade location accuracy. Therefore, determining how many measurement signals are line-of-sight (LOS) and identifying them at the same time will increase location accuracy. We propose a method based on a recursive hypothesis testing algorithm that uses residual information to determine whether NLOS errors are present in the measurements. Since the probability distribution of measurements with NLOS errors is different from that of measurements without NLOS errors, a likelihood ratio test can be used to determine the LOS/NLOS status of the measurements. To search for an optimal threshold for the hypothesis testing, particle filtering optimization (PFO) is adopted. The PFO algorithm uses particle filtering to find the best threshold for determining the status of signals measured at all base stations (BSs). In the PFO algorithm, the clustering property of K-means is also used to separate particles, so that the search for the optimal threshold may be carried out in parallel. In this thesis, we focus on the hybrid TOA/AOA (time of arrival/angle of arrival) location method, in which localization uses only the LOS location measurements to calculate the location of a mobile station. Simulation results show that the proposed algorithm performs better than other algorithms, which suffer from different degrees of NLOS errors. The proposed scheme also obtains a higher identification rate of LOS-BSs in different situations by using the optimal thresholds for status detection. Wireless Location TOA K-means Particle Filter AOA NLOS
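A likelihood ratio test of the kind mentioned in this abstract can be illustrated with a small sketch. The distributions below are assumptions chosen for illustration only (zero-mean Gaussian residuals under LOS, positively biased Gaussian residuals under NLOS); the thesis's actual error models are not given in the abstract, and the threshold that it tunes with PFO is simply a parameter here. The names `gaussian_logpdf` and `is_nlos` are hypothetical.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log density of a normal distribution N(mean, std^2) at x."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def is_nlos(residual, threshold=0.0, los_std=1.0, nlos_mean=5.0, nlos_std=2.0):
    """Declare NLOS when the log-likelihood ratio exceeds the threshold.

    Assumption: LOS residuals ~ N(0, los_std^2) and NLOS residuals
    ~ N(nlos_mean, nlos_std^2); all default values are illustrative.
    """
    log_ratio = (
        gaussian_logpdf(residual, nlos_mean, nlos_std)
        - gaussian_logpdf(residual, 0.0, los_std)
    )
    return log_ratio > threshold
```

With these illustrative defaults, a residual near zero is classified as LOS and a residual near the assumed NLOS bias is classified as NLOS; raising the threshold makes the test more reluctant to declare NLOS.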
