131 |
Motion segmentation by adaptive mode seeking and clustering consensus
Pan, Guodong (潘国栋). January 2012
The task of multi-body motion segmentation refers to segmenting feature trajectories
in a sequence of images according to their 3D motion affinity without
knowing the number of motions in advance. It is critical for understanding and
reconstructing a dynamic scene. This problem essentially consists of two subproblems,
segmenting features and detecting the number of motions. While the
state-of-the-art LBF algorithm achieves segmentation accuracy as high as 96.5%,
it is still affected by a phenomenon called over-locality. A novel mode seeking
algorithm with an adaptive distance measure is proposed to avoid this problem;
it improves the accuracy to 98.1%. The LBF algorithm cannot itself detect
the number of motions. A randomized version of the mode seeking algorithm
is presented, which can detect the number while preserving satisfactory
segmentation accuracy. A kernel optimization method detects the number of
motions via kernel alignment. However, it suffers from over-locality and
over-detects the number of motions. An intersection measure and two mutual
information measures are presented to solve this problem. Using these measures,
the proposed clustering consensus framework recasts the motion number detection
problem as a clustering consensus problem. It extends the kernel optimization
method from two-clustering consensus to multiple-clustering consensus. Extensive
experiments and comparisons were carried out, and convincing results
were obtained. / published_or_final_version / Computer Science / Doctoral / Doctor of Philosophy
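The abstract does not define its two mutual information measures, so the following is only a generic sketch of how agreement between two clusterings of the same features can be scored with normalized mutual information. The function names and the geometric-mean normalization are assumptions made for illustration, not the thesis's method.

```python
from collections import Counter
from math import log, sqrt


def entropy(labels):
    """Shannon entropy (natural log) of a cluster labeling."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two labelings of the same items."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts.
    return sum(
        (c / n) * log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def normalized_mutual_information(a, b):
    """NMI in [0, 1]; assumed normalization: geometric mean of entropies."""
    ha, hb = entropy(a), entropy(b)
    if ha == 0.0 or hb == 0.0:
        # A single-cluster labeling carries no information.
        return 1.0 if ha == hb else 0.0
    return mutual_information(a, b) / sqrt(ha * hb)
```

Two labelings that differ only in the names of the clusters score close to 1, while statistically independent labelings score 0, which is what makes such a measure usable for comparing candidate segmentations.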
|
132 |
Relationship-based clustering and cluster ensembles for high-dimensional data mining
Strehl, Alexander. 28 August 2008
Not available / text
|
133 |
Robust methods for locating multiple dense regions in complex datasets
Gupta, Gunjan Kumar. 28 August 2008
Not available / text
|
134 |
Ab initio calculations: an extension of Sankey's method
Au, Yat-yin (區逸賢). January 1999
published_or_final_version / Physics / Master / Master of Philosophy
|
135 |
Linear clustering with application to single nucleotide polymorphism genotyping
Yan, Guohua. 11 1900
Single nucleotide polymorphisms (SNPs) have become increasingly popular for
a wide range of genetic studies. A high-throughput genotyping technology
usually involves a statistical genotype calling algorithm. Most calling
algorithms in the literature, using methods such as k-means and mixture models,
rely on elliptical structures of the genotyping data; they may fail
when the minor allele homozygous cluster is small or absent, or when the
data have extreme tails or linear patterns.
We propose an automatic genotype calling algorithm by further developing
a linear grouping algorithm (Van Aelst et al., 2006). The proposed
algorithm clusters unnormalized data points around lines rather than around
centroids. In addition, we associate a quality value, silhouette width, with
each DNA sample and with each whole plate. This algorithm shows promise
for genotyping data generated from TaqMan technology (Applied Biosystems).
A key feature of the proposed algorithm is that it applies to unnormalized
fluorescent signals when the TaqMan SNP assay is used. The
algorithm could also be potentially adapted to other fluorescence-based SNP
genotyping technologies such as Invader Assay.
Motivated by the SNP genotyping problem, we propose a partial likelihood
approach to linear clustering which explores potential linear clusters
in a data set. Instead of fully modelling the data, we assume only the signed
orthogonal distance from each data point to a hyperplane is normally distributed.
Its relationships with several existing clustering methods are discussed.
Some existing methods to determine the number of components in a
data set are adapted to this linear clustering setting. Several simulated and
real data sets are analyzed for comparison and illustration purposes. We also
investigate some asymptotic properties of the partial likelihood approach.
A Bayesian version of this methodology is helpful if some clusters are
sparse but there is strong prior information about their approximate locations
or properties. We propose a Bayesian hierarchical approach which is
particularly appropriate for identifying sparse linear clusters. We show that
the sparse cluster in SNP genotyping datasets can be successfully identified
after a careful specification of the prior distributions.
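The quantity the partial likelihood approach models, the signed orthogonal distance from a data point to a hyperplane, can be sketched as follows. The hard nearest-hyperplane assignment in `assign_to_nearest` is an illustrative simplification, not the likelihood-based procedure of the thesis; the function names and the `(normal, offset)` parameterization are assumptions.

```python
from math import sqrt


def signed_orthogonal_distance(point, normal, offset):
    """Signed distance from `point` to the hyperplane {x : normal . x = offset}."""
    norm = sqrt(sum(a * a for a in normal))
    return (sum(a * x for a, x in zip(normal, point)) - offset) / norm


def assign_to_nearest(points, hyperplanes):
    """Label each point with the index of the closest hyperplane.

    `hyperplanes` is a list of (normal, offset) pairs. This hard assignment
    is a stand-in for illustration only.
    """
    labels = []
    for p in points:
        dists = [abs(signed_orthogonal_distance(p, n, b)) for n, b in hyperplanes]
        labels.append(dists.index(min(dists)))
    return labels
```

For two-dimensional fluorescent signals a hyperplane is simply a line, so points spread far along a line still have a small orthogonal distance to it, which is the behaviour that centroid-based methods such as k-means do not capture.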
|
136 |
Food web structure of a Pantanal shallow lake revealed by stable isotopes
Love-Raoul, Nteziryayo. January 2013
Food webs are good ecological macro-descriptors, and their study is important in ecology for understanding nutrient cycles, tracing and quantifying energy flow, and describing trophic interactions within an ecosystem. Knowledge of food webs finds applications in various natural science disciplines and also in many productive sectors. This study investigated the structure of the food web of a shallow lake in the Pantanal floodplain. The food web included two macrophytes, six aquatic insects, four crustaceans and 24 fish species. Sources of carbon for the various organisms living in the lake were identified through the δ13C values exhibited by the organisms. The δ15N signature was used to estimate the trophic position of each organism. A cluster analysis based on the two isotopic signatures revealed six different feeding guilds and emphasized the broad occurrence of omnivory among animals living in the lake. This study revealed that the use of food carbon was the most important factor structuring the lake community. Very low values of δ13C in zooplankton, benthic dwellers and bottom-feeding organisms, as well as similarities between the gradient of δ13C and that of the use of methane-oxidizing bacteria, point to the possible use of biogenic methane as a source of carbon and energy for the lake biota.
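The abstract does not say how trophic position was computed from δ15N. A convention commonly used in stable isotope ecology adds, to the trophic level of a baseline organism, the consumer's δ15N enrichment divided by a per-level enrichment factor. The sketch below assumes that convention; the function name, the default baseline level of 1 (primary producers) and the enrichment of 3.4‰ per level are illustrative assumptions, not values taken from the study.

```python
def trophic_position(d15n_consumer, d15n_baseline,
                     baseline_level=1.0, enrichment=3.4):
    """Estimate trophic position from delta-15N values (in permil).

    baseline_level: assumed trophic level of the baseline organism.
    enrichment: assumed delta-15N increase per trophic level.
    """
    return baseline_level + (d15n_consumer - d15n_baseline) / enrichment
```

Under these assumptions, a fish whose δ15N is 6.8‰ above a macrophyte baseline would be placed about two levels above the producers.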
|
137 |
Aggregate programming in large scale linear systems
Taylor, Richard Winthrop. 08 1900
No description available.
|
138 |
Using Cluster Analysis, Cluster Validation, and Consensus Clustering to Identify Subtypes
Shen, Jess Jiangsheng. 26 November 2007
Pervasive Developmental Disorders (PDDs) are neurodevelopmental disorders characterized by impairments in social interaction, communication and behaviour [Str04]. Given the diversity and varying severity of PDDs, diagnostic tools attempt to identify homogeneous subtypes within PDDs.
The diagnostic system Diagnostic and Statistical Manual of Mental Disorders - Fourth Edition (DSM-IV) divides PDDs into five subtypes. Several limitations have been identified with the categorical diagnostic criteria of the DSM-IV. The goal of this study is to identify putative subtypes in the multidimensional data collected from a group of patients with PDDs, by using cluster analysis.
Cluster analysis is an unsupervised machine learning method. It offers a way to partition a dataset into subsets that share common patterns. We apply cluster analysis to data collected from 358 children with PDDs, and validate the resulting clusters. Notably, there are many cluster analysis algorithms to choose from, each making certain assumptions about the data and about how clusters should be formed. One way to arrive at a meaningful solution is consensus clustering, which integrates the results of several clustering attempts (a cluster ensemble) into a unified consensus answer and can provide robust and accurate results [TJPA05].
In this study, using cluster analysis, cluster validation, and consensus clustering, we identify four clusters that are similar to, and further refine, three of the five subtypes defined in the DSM-IV. This study thus confirms the existence of these three subtypes among patients with PDDs. / Thesis (Master, Computing) -- Queen's University, 2007-11-15 23:34:36.62 / OGS, QGA
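One widely used way to combine a cluster ensemble, shown here only as a sketch and not necessarily the procedure used in this thesis, is a co-association matrix: for every pair of items, record the fraction of clusterings that put them together, then group items that co-occur more often than a threshold. The function names and the default threshold of 0.5 are assumptions.

```python
def co_association(labelings):
    """Fraction of labelings in which each pair of items shares a cluster."""
    n = len(labelings[0])
    m = len(labelings)
    return [
        [sum(lab[i] == lab[j] for lab in labelings) / m for j in range(n)]
        for i in range(n)
    ]


def consensus_labels(labelings, threshold=0.5):
    """Consensus clusters = connected components of the graph linking
    items whose co-association exceeds `threshold` (assumed value)."""
    matrix = co_association(labelings)
    n = len(matrix)
    labels = [-1] * n
    next_label = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:  # depth-first search over strongly co-associated items
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and matrix[i][j] > threshold:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels
```

Because only co-membership is counted, the individual clusterings may use different label names and even different numbers of clusters.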
|
140 |
Clustering with genetic algorithms
Cole, Rowena Marie. January 1998
Clustering is the search for those partitions that reflect the structure of an object set. Traditional clustering algorithms search only a small sub-set of all possible clusterings (the solution space) and consequently, there is no guarantee that the solution found will be optimal. We report here on the application of Genetic Algorithms (GAs) -- stochastic search algorithms touted as effective search methods for large and complex spaces -- to the problem of clustering. GAs which have been made applicable to the problem of clustering (by adapting the representation and fitness function, and by developing suitable evolutionary operators) are known as Genetic Clustering Algorithms (GCAs). There are two parts to our investigation of GCAs. First, we look at clustering into a given number of clusters. The performance of GCAs on three generated data sets, analysed using 4320 differing combinations of adaptions, establishes their efficacy. Choice of adaptions and parameter settings is data set dependent, but comparison between results using generated and real data sets indicates that performance is consistent for similar data sets with the same number of objects, clusters, attributes, and a similar distribution of objects. Generally, group-number representations are better suited to the clustering problem, as are dynamic scaling, elite selection and high mutation rates. Independent generalised models fitted to the correctness and timing results for each of the generated data sets produced accurate predictions of the performance of GCAs on similar real data sets. While GCAs can be successfully adapted to clustering, and the method produces results as accurate and correct as traditional methods, our findings indicate that, given a criterion based on simple distance metrics, GCAs provide no advantages over traditional methods. Second, we investigate the potential of genetic algorithms for the more general clustering problem, where the number of clusters is unknown.
We show that only simple modifications to the adapted GCAs are needed. We have developed a merging operator which, together with elite selection, is employed to evolve an initial population with a large number of clusters toward better clusterings. With regard to accuracy and correctness, these GCAs are more successful than optimisation methods such as simulated annealing. However, such GCAs can become trapped in local minima in the same manner as traditional hierarchical methods. Such trapping is characterised by the situation where good (k-1)-clusterings do not result from our merge operator acting on good k-clusterings. A marked improvement in the algorithm is observed with the addition of a local heuristic.
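As a rough illustration of a group-number representation, the sketch below encodes a clustering as a list whose i-th gene is the cluster number of object i, scores it with a simple distance-based criterion, and applies point mutation. The fitness criterion (negative within-cluster sum of squared distances), the mutation scheme and all names are assumptions chosen for illustration; the abstract does not specify the thesis's own fitness functions or operators.

```python
import random


def fitness(chromosome, points, k):
    """Negative within-cluster sum of squared distances (higher is better)."""
    total = 0.0
    for g in range(k):
        members = [p for p, c in zip(points, chromosome) if c == g]
        if not members:
            continue  # empty clusters contribute nothing
        dim = len(members[0])
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum(
            sum((p[d] - centroid[d]) ** 2 for d in range(dim)) for p in members
        )
    return -total


def mutate(chromosome, k, rate, rng):
    """Reassign each gene to a random cluster number with probability `rate`."""
    return [rng.randrange(k) if rng.random() < rate else g for g in chromosome]


# Hypothetical usage: four 2-D objects, two clusters.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
rng = random.Random(0)
child = mutate([0, 0, 1, 1], k=2, rate=0.25, rng=rng)
```

A full GCA would wrap these two pieces in a loop with selection and crossover; elite selection, which the abstract reports as well suited to clustering, would carry the fittest chromosomes unchanged into the next generation.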
|