1 |
Design and implementation of scalable hierarchical density based clustering
Dhandapani, Sankari 09 November 2010
Clustering is a useful technique that divides data points into groups, also known as clusters, such that the data points of the same cluster exhibit similar properties. Typical clustering algorithms assign each data point to at least one cluster. However, in practical datasets such as microarray gene datasets, only a subset of the genes is highly correlated, and the dataset is often polluted with a large volume of irrelevant genes. In such cases, it is important to ignore the poorly correlated genes and cluster only the highly correlated ones.
Automated Hierarchical Density Shaving (Auto-HDS) is a non-parametric density-based technique that partitions only the relevant subset of the dataset into multiple clusters while pruning the rest. Auto-HDS performs a hierarchical clustering that identifies dense clusters of different densities and finds a compact hierarchy of the clusters identified. Key features of Auto-HDS include selection and ranking of clusters using a custom stability criterion, and a topologically meaningful 2D projection and visualization of the clusters discovered in the higher-dimensional original space. However, a key limitation of Auto-HDS is that it requires O(n^2) storage and O(n^2 log n) computation, so it scales to only a few tens of thousands of points. In this thesis, two extensions to Auto-HDS are presented for lower-dimensional datasets; they generate clusterings identical to those of Auto-HDS but scale to much larger datasets. We first introduce Partitioned Auto-HDS, which significantly reduces time and space complexity and makes it possible to generate the Auto-HDS cluster hierarchy on much larger datasets with hundreds of millions of data points. Then, we describe Parallel Auto-HDS, which takes advantage of the inherent parallelism available in Partitioned Auto-HDS to scale to even larger datasets without a corresponding increase in actual run time when a group of processors is available for parallel execution. Partitioned Auto-HDS is implemented on top of GeneDIVER, a previously existing Java-based streaming implementation of Auto-HDS, and thus retains all the key features of Auto-HDS, including ranking, automatic selection of clusters, and 2D visualization of the discovered cluster topology. / text
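The idea of clustering only the dense subset while pruning the rest can be illustrated with a minimal single-level sketch. This is not the Auto-HDS algorithm or GeneDIVER code: the function name `shave_and_cluster` and the parameters `radius` and `min_neighbors` are assumptions made for illustration, and the brute-force neighbor search is itself quadratic in n, which hints at why the scaling limit described above arises.

```python
from math import dist


def shave_and_cluster(points, radius, min_neighbors):
    """Label dense points by connected group; shaved points get -1.

    Hypothetical single-level sketch: a point is "dense" if at least
    `min_neighbors` other points lie within `radius` of it.
    """
    n = len(points)
    # Brute-force neighbor lists (excluding the point itself): O(n^2) time.
    neighbors = [
        [j for j in range(n) if j != i and dist(points[i], points[j]) <= radius]
        for i in range(n)
    ]
    dense = {i for i in range(n) if len(neighbors[i]) >= min_neighbors}

    labels = [-1] * n  # -1 marks points that were shaved (pruned)
    cluster_id = 0
    for start in sorted(dense):
        if labels[start] != -1:
            continue
        # Flood-fill over dense points linked by a distance <= radius.
        labels[start] = cluster_id
        stack = [start]
        while stack:
            i = stack.pop()
            for j in neighbors[i]:
                if j in dense and labels[j] == -1:
                    labels[j] = cluster_id
                    stack.append(j)
        cluster_id += 1
    return labels
```

Running the function over a range of `radius` values and nesting the results would give a rough hierarchy, but how Auto-HDS builds, compacts, and ranks its hierarchy is not specified in the abstract and is not reproduced here.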
|
2 |
Mobile Location Estimation Using Genetic Algorithm and Clustering Technique for NLOS Environments
Hung, Chung-Ching 10 September 2007
To meet the mass demand for personalized security services such as tracking, supervision, and emergency rescue, mobile communication location technologies have drawn much attention from governments, academia, and industry around the world. However, existing location methods cannot satisfy the requirements of low cost and high accuracy. We hypothesized that a new mobile location algorithm based on the current GSM system would effectively improve user satisfaction. In this study, a prototype system is developed, implemented, and tested by integrating useful information such as the geometry of the cell layout and the related mobile positioning technologies. The intersection of the regions formed by the communication space of the base stations is explored. Furthermore, a density-based clustering algorithm (DCA) and a GA-based algorithm are designed to analyze the intersection region and estimate the most probable location of a mobile phone. Simulation results show that the location error of the GA-based algorithm is less than 0.075 km for 67% of the time and less than 0.15 km for 95% of the time. The experimental results satisfy the location accuracy demand of E-911.
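The accuracy statement above is a pair of percentile conditions on the location error, and it can be checked on a list of simulated errors with a few lines. The sketch below is only an illustration: the function names are hypothetical, and the default thresholds are simply the two figures quoted in the abstract, not an authoritative statement of the E-911 rules.

```python
def fraction_within(errors_km, threshold_km):
    """Fraction of location errors strictly below the threshold."""
    return sum(e < threshold_km for e in errors_km) / len(errors_km)


def meets_targets(errors_km, targets=((0.075, 0.67), (0.15, 0.95))):
    """True if every (threshold_km, required_fraction) pair is satisfied.

    The default pairs are the figures quoted in the abstract; treat them
    as an assumption rather than a regulatory reference.
    """
    return all(
        fraction_within(errors_km, threshold) >= required
        for threshold, required in targets
    )
```

For example, `meets_targets(simulated_errors)` would be called once per simulation run, with `simulated_errors` holding the distance in kilometres between each estimated and true position.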
|
3 |
Discovering Intrinsic Points of Interest from Spatial Trajectory Data Sources
Piekenbrock, Matthew J. 13 June 2018
No description available.
|
4 |
An Improved Density-Based Clustering Algorithm Using Gravity and Aging Approaches
Al-Azab, Fadwa Gamal Mohammed January 2015
Density-based clustering is a well-known family of algorithms that group samples according to their densities. In existing density-based clustering algorithms, samples are clustered according to the total number of points within the radius of the defined dense region. This method of determining density, however, provides little knowledge about the similarities among points. Additionally, these algorithms are not flexible enough to deal with dynamic data that changes over time. The current study addresses these challenges by proposing a new approach that incorporates new measures to evaluate attribute similarities while clustering incoming samples, rather than considering only the total number of points within a radius. The new approach is based on the notion of Gravity, where incoming samples are clustered according to the force of their neighbouring samples. The Mass (density) of a cluster is measured using various approaches, including the number of neighbouring samples and the Silhouette measure. The neighbouring sample with the highest force is then the one that pulls the new incoming sample into its cluster. Taking the attribute similarities of points into account provides more information by accurately defining the dense regions around the incoming samples. It also determines the best neighbourhood to which the new sample belongs. In addition, the proposed algorithm introduces a new approach, called Aging, for dealing with dynamic data. Aging lets the algorithm use memory efficiently by removing aged points that do not participate in clustering incoming samples, which in turn changes the shapes of the clusters incrementally over time.
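The gravity-based assignment step described above can be sketched as follows. The abstract does not give the force formula, so the inverse-square form `mass / distance^2` used here is an assumption borrowed from Newtonian gravity, and the function name, the `(point, mass, label)` tuple layout, and the `eps` guard are all hypothetical.

```python
from math import dist


def assign_by_gravity(sample, neighbors, eps=1e-9):
    """Return the cluster label of the neighbor exerting the largest force.

    `neighbors` is a list of (point, mass, label) tuples. The force is
    assumed to be mass / distance^2; `eps` avoids division by zero when
    the sample coincides with a neighbor.
    """
    best_label, best_force = None, float("-inf")
    for point, mass, label in neighbors:
        force = mass / (dist(sample, point) ** 2 + eps)
        if force > best_force:
            best_label, best_force = label, force
    return best_label
```

Under this assumed formula, a heavy but more distant neighbor can win over a light, close one, which is the behaviour the abstract contrasts with counting points inside a fixed radius.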
Four experiments are conducted in this study to evaluate the performance of the proposed algorithm. The performance and effectiveness of the proposed algorithm are validated on a synthetic dataset (to visualize the changes of the clusters’ shapes over time), as well as on real datasets. The experimental results confirm that the proposed algorithm improves on performance measures including the Dunn Index and the SD Index. The experimental results also demonstrate that the proposed algorithm uses less memory and can form clusters with arbitrary shapes that change over time.
|
5 |
Density Based Clustering using Mutual K-Nearest Neighbors
Dixit, Siddharth January 2015
No description available.
|
6 |
A Methodology Of Swarm Intelligence Application In Clustering Based On Neighborhood Construction
Inkaya, Tulin 01 May 2011
In this dissertation, we consider the clustering problem in data sets with an unknown number of clusters that have arbitrary shapes and intracluster and intercluster density variations.
We introduce a clustering methodology composed of three methods that ensure extraction of local density and connectivity properties, data set reduction, and clustering. The first method constructs a unique neighborhood for each data point using the connectivity and density relations among the points, based on graph-theoretical concepts, mainly Gabriel Graphs. Neighborhoods that are subsequently connected form subclusters (closures), which constitute the skeleton of the clusters. In the second method, the external shape concept from computational geometry is adapted for data set reduction and cluster visualization. This method extracts the external shape of a non-convex n-dimensional data set using Delaunay triangulation. In the third method, we investigate the applicability of Swarm Intelligence to clustering using Ant Colony Optimization (ACO). Ants explore the data set so that the clusters are detected using density break-offs, connectivity, and distance information. The proposed ACO-based algorithm uses the outputs of the neighborhood construction (NC) and the external shape formation. In addition, we propose a three-phase clustering algorithm that consists of NC, outlier detection, and merging phases.
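Since the neighborhood construction relies mainly on Gabriel Graphs, a brute-force construction may help make the concept concrete. Two points are joined by a Gabriel edge when no third point lies inside the ball whose diameter is the segment between them. The sketch below is a generic O(n^3) illustration of that definition, not the dissertation's NC method; the function names are hypothetical, and points lying exactly on the ball's boundary are treated as not blocking an edge, which is a convention choice.

```python
from itertools import combinations


def sq_dist(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def gabriel_edges(points):
    """Return index pairs (i, j), i < j, that form Gabriel graph edges."""
    n = len(points)
    edges = []
    for i, j in combinations(range(n), 2):
        d_ij = sq_dist(points[i], points[j])
        # Point k lies strictly inside the ball with diameter (i, j)
        # exactly when |ik|^2 + |jk|^2 < |ij|^2.
        blocked = any(
            sq_dist(points[i], points[k]) + sq_dist(points[j], points[k]) < d_ij
            for k in range(n)
            if k != i and k != j
        )
        if not blocked:
            edges.append((i, j))
    return edges
```

The Gabriel graph is a subgraph of the Delaunay triangulation, so a practical implementation would typically filter Delaunay edges instead of testing every pair.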
We test the strengths and weaknesses of the proposed approaches through extensive experimentation with data sets taken from the literature and data sets generated in a controlled manner. NC is found to be effective for arbitrarily shaped clusters and for intracluster and intercluster density variations. The external shape formation algorithm achieves significant reductions for convex clusters. The ACO-based and the three-phase clustering algorithms show promising results for data sets with well-separated clusters.
|
7 |
Energy Harvesting Potential of a Micro-Thermal Network Using a Nodal Approach to Reduce GHG Emissions in Mixed Electrical Grids
Abdalla, Ahmed January 2023
Integrating the electrical and thermal energy systems of community buildings can play an important role in harvesting wasted energy resources and reducing carbon emissions from the buildings and electricity generation sectors. It also increases demand management flexibility by minimizing curtailed electricity on the grid through electrified heating, without increasing the electricity peak demand. The current work examines Integrated Community Energy and Harvesting systems (ICE-Harvest), a new generation of distributed energy resources systems (DERs). They prioritize the harvesting of community waste energy resources (for example, heat rejected from cooling processes and from distributed peak electricity fossil-fuel-fired generators, as well as energy from curtailed clean grid electricity resources) to help satisfy the heating demands of commercial and residential buildings. As such, ICE-Harvest systems provide a solution that can minimize greenhouse gas emissions from high-energy-consumption buildings in cold-climate regions such as North America and Northern Europe.
In the current research, a thermal energy sharing model was developed to provide a dynamic characterization of the potential benefits of integrating and harvesting energy within a community of any number of buildings. The proposed model estimates the amount of rejected heat from cooling and refrigeration systems that can be simultaneously collected and used to heat other nearby buildings connected by a low-temperature micro-thermal network (MTN). It also determines the proper timing and quantity of electricity used by the heat pumps in low-temperature MTNs, as well as the reduction of both GHG emissions and the energy required from the EMC relative to conventional stand-alone systems. For an energy-balanced community cluster, the model showed that, over the course of a year, energy harvesting would reduce this node’s GHG emissions by 74% and cover approximately 82% of the heating requirements compared to the BAU system.
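The role of simultaneity in the sharing model can be shown with a deliberately simplified balance: in each time step, the heat that can be shared is at most the smaller of the heat rejected and the heat demanded in that step. The sketch below is an assumption-laden illustration, not the thesis's model. It ignores heat pump electricity, network losses, and storage, and the function names and kWh units are hypothetical.

```python
def harvested_heat(rejected_kwh, demand_kwh):
    """Per-step shareable heat: min(rejected, demanded) in each time step.

    Both inputs are cluster-wide totals per time step (assumed units: kWh).
    """
    return [min(r, d) for r, d in zip(rejected_kwh, demand_kwh)]


def coverage_fraction(rejected_kwh, demand_kwh):
    """Share of total heating demand met by simultaneously harvested heat."""
    total_demand = sum(demand_kwh)
    if total_demand == 0:
        return 0.0
    return sum(harvested_heat(rejected_kwh, demand_kwh)) / total_demand
```

With rejected heat `[5, 0, 10]` and demand `[10, 10, 5]`, only 10 of the 15 kWh rejected coincide with demand, so coverage is 10 / 25 = 0.4 even though total rejected heat is more than half of total demand.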
The results also revealed that diversity in thermal demand between the connected buildings increases the harvesting potential. This research develops two clustering methods for the ICE-Harvest system: clustering around an anchor building, and density-based (DB) clustering with post-processing that adds the closest anchor building to each cluster and focuses on the diversity of the buildings in each cluster. The energy sharing model is used to compare these techniques with plain density-based clustering, the technique commonly used in the literature, on a large database of 14000 high-energy-consumption buildings collected in Ontario, Canada. The results of this case study reveal that DB clustering with post-processing gave the largest emission reduction per unit piping network length, 360 t CO2eq/km/year. In addition, this research identified seven different cluster categories based on the total and simultaneous cooling-to-heating ratios of each cluster.
The ICE-Harvest system integrates the thermal and electrical networks to add more flexibility to the electricity grid and to schedule the electrification of heating (EoH). The current research provides a reduced model of the ICE-Harvest system to study, on a provincial scale, its impact on GHG emissions and grid electricity demand for over 1100 clusters of different categories. The use of ICE-Harvest systems at this scale can displace 11 TWh of the energy required from gas-fired heating resources, accounting for over 70% of the clusters’ total heating requirements. This results in a 1.9 Mt CO2eq reduction in total GHG emissions, which represents around 60% of the clusters’ emissions.
The operating conditions of the thermal network (TN) in integrated community energy systems affect the ability to harvest waste energy and reduce GHG emissions, as well as the electricity peak demand and consumption. In the current research, different thermal distribution network operating scenarios were modeled for the different community energy profile clusters. These operating scenarios include low-temperature (fourth generation), ultra-low-temperature (fifth generation), a binary range-controlled temperature-modulating thermal network operating between low and ultra-low temperatures (ICE-Harvest), and a newly proposed scenario with a continuous range-controlled temperature-modulating micro-thermal network. The continuous range-controlled temperature scenario shows the most benefits in large-scale implementation on the identified clusters. It adds more flexibility to balance the electricity grid and yields large GHG emission savings while controlling the increase in site electricity peak demand.
The load profile of the cluster affects the selection of the most beneficial integrated energy system. This research shows that, for most of the heating-dominated clusters, it is better to employ the continuous range-controlled temperature TN with peak control and on-site CHP to serve the high heating demands, along with short-term and seasonal thermal storage. For the majority of balanced and/or cooling-dominated clusters, it is better to add more carbon-free resources, either to the electricity grid or on site, that produce electricity without associated heat, such as wind, hydro, and solar PV panels. Parametric studies were performed in this research, including changing the CHP size, the CHP utilization efficiency, and the usage conditions of the grid gas-fired generators, to show their impact on the GHG emission reduction from the clustered buildings.
The analysis was implemented on a fleet of 1139 sites in Ontario, and the results showed that the CHP size and operating hours have a measurable impact on GHG emission savings. With larger CHP sizes, the system can reach up to 58% and 66.5% savings of the total sites’ emissions with 93% and 39% operating hours, respectively, following the Ontario grid natural gas peaking power plants for the years 2016 and 2017. In 2016 the CHP accounts for the largest share of the GHG emission savings (61%), as opposed to 30% in 2017.
The reduced models introduced in this research for thermal energy sharing, ICE-Harvest system operation and sizing, and MTN operation aid the investigation of the impact of large-scale implementation of ICE-Harvest systems on GHG emissions and the electricity grid. / Thesis / Doctor of Philosophy (PhD)
|