1 |
A Probabilistic Classification Algorithm With Soft Classification Output
Phillips, Rhonda D. 23 April 2009
This thesis presents four contributions: a shared memory parallel version of the hybrid classification algorithm IGSCR (iterative guided spectral class rejection), called PIGSCR (parallel IGSCR); a novel data reduction technique that can be used in conjunction with PIGSCR; a noise removal method based on the maximum noise fraction (MNF); and a continuous version of IGSCR (CIGSCR) that outputs soft classifications. All of these are either classification algorithms or preprocessing algorithms needed before classifying high dimensional, noisy images. PIGSCR was developed to produce fast and portable code using Fortran 95, OpenMP, and the Hierarchical Data Format version 5 (HDF5) with its accompanying data access library. The feature reduction method introduced in this thesis is based on the singular value decomposition (SVD). Results with this technique show that SVD-based feature reduction can lead to more accurate IGSCR classifications than feature reduction based on principal component analysis (PCA).
This thesis describes a new algorithm used to adaptively filter a remote sensing dataset based on signal-to-noise ratios (SNRs) once the maximum noise fraction (MNF) has been applied.
The adaptive filtering scheme improves image quality as shown by estimated SNRs and classification accuracy improvements greater than 10%. The continuous iterative guided spectral class rejection (CIGSCR) classification method is based on the iterative guided spectral class rejection (IGSCR) classification method for remotely sensed data. Both CIGSCR and IGSCR use semisupervised clustering to locate clusters that are associated with classes in a classification scheme. This type of semisupervised classification method is particularly useful in remote sensing where datasets are large, training data are difficult to acquire, and clustering makes the identification of subclasses adequate for training purposes less difficult. Experimental results indicate that the soft classification output by CIGSCR is reasonably accurate (when compared to IGSCR), and the fundamental algorithmic changes in CIGSCR (from IGSCR) result in CIGSCR being less sensitive to input parameters that influence iterations. / Ph. D.
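The idea of turning cluster membership into a soft classification can be pictured with a small Python sketch. This is not the CIGSCR algorithm from the thesis: the inverse-distance membership formula, the `fuzziness` parameter, the convention of marking clusters without an associated class as `None`, and the function names `soft_memberships` and `soft_classify` are all assumptions made for illustration.

```python
import math


def soft_memberships(point, centers, fuzziness=2.0):
    """Inverse-distance memberships of one point in each cluster (assumed formula)."""
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on a center belongs fully to that cluster.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (fuzziness - 1.0)
    weights = [d ** -power for d in dists]
    total = sum(weights)
    return [w / total for w in weights]


def soft_classify(point, centers, cluster_class, n_classes):
    """Sum memberships over the clusters associated with each class.

    cluster_class[i] is the class index tied to cluster i, or None when the
    cluster is not associated with any class (hypothetical convention).
    """
    scores = [0.0] * n_classes
    for membership, cls in zip(soft_memberships(point, centers), cluster_class):
        if cls is not None:
            scores[cls] += membership
    total = sum(scores)
    # Renormalize so the class scores sum to 1 when any cluster is associated.
    return [s / total for s in scores] if total > 0 else scores


centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 5.0)]
cluster_class = [0, 1, None]  # third cluster has no associated class
probs = soft_classify((1.0, 0.0), centers, cluster_class, n_classes=2)
```

In this toy setup `probs` holds one score per class rather than a single hard label, which is the sense in which the output is "soft".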
|
2 |
La mise en œuvre des politiques de cluster : dilemmes organisationnels, pathologies et évaluation. Le cas d’un pôle de compétitivité français / Implementing cluster policies: organizational dilemmas, pathologies and evaluation. The case of a French policy-driven cluster
Glaser, Anna 16 December 2014
As “cluster” has become a buzzword, governments around the world are increasingly implementing cluster policies. However, a relevance gap is growing between cluster research and practice. Scholars build theories about the roles of clusters as powerful entities fostering innovation and competitiveness. Meanwhile, cluster practitioners (governments, policy-driven cluster organizations, and member firms) struggle to manage these highly entangled objects. This thesis addresses that relevance gap. A systematic literature review (SLR) of cluster policy research demonstrates that governments and policy-driven cluster managers constantly face a multitude of organizational dilemmas, i.e. sets of decisions and choices for which there is no “one best choice”. These dilemmas concern how to implement cluster policy (political dilemmas), how to manage policy-driven cluster members (managerial dilemmas), and how to adapt the policy-driven cluster to the local reality (structural dilemmas). Answering these dilemmas is constitutive of the management of policy-driven clusters, but it generates side-effect pathologies that need to be monitored and evaluated. In this thesis, we conduct a qualitative empirical investigation of a French policy-driven cluster located in the Paris Region and analyse in detail the organizational dilemmas and their related side-effect pathologies. Four pathologies are identified: inefficiency (driven by leadership dilemmas), distrust (driven by subsidy dilemmas), non-conformity (driven by structural dilemmas), and pragmatism (driven by dilemmas in managing innovation and collaboration). A deeper knowledge of these pathologies can help improve cluster policy implementation and cluster evaluation. Finally, this thesis argues that academics need to shift from studying the “anatomy of clusters” to studying the “pathology of clusters”.
|
3 |
Evaluating clustering techniques in financial time series
Millberg, Johan January 2023
This degree project aims to investigate different evaluation strategies for clustering methods used to cluster multivariate financial time series. Clustering is a type of data mining technique with the purpose of partitioning a data set based on similarity to data points in the same cluster, and dissimilarity to data points in other clusters. By clustering the time series of mutual fund returns, it is possible to help individuals select funds matching their current goals and portfolio. It is also possible to identify outliers. These outliers could be mutual funds that have not been classified accurately by the fund manager, or potentially fraudulent practices. To determine which clustering method is the most appropriate for the current data set it is important to be able to evaluate different techniques. Using robust evaluation methods can assist in choosing the parameters to ensure optimal performance. The evaluation techniques investigated are conventional internal validation measures, stability measures, visualization methods, and evaluation using domain knowledge about the data. The conventional internal validation methods and stability measures were used to perform model selection to find viable clustering method candidates. These results were then evaluated using visualization techniques as well as qualitative analysis of the result. Conventional internal validation measures tested might not be appropriate for model selection of the clustering methods, distance metrics, or data sets tested. The results often contradicted one another or suggested trivial clustering solutions, where the number of clusters is either 1 or equal to the number of data points in the data sets. Similarly, a stability validation metric called the stability index typically favored clustering results containing as few clusters as possible. The only method used for model selection that consistently suggested clustering algorithms producing nontrivial solutions was the CLOSE score.
The CLOSE score was specifically developed to evaluate clusters of time series by taking both stability in time and the quality of the clusters into account. We use cluster visualizations to show the clusters. Scatter plots were produced by applying different methods of dimension reduction to the data, Principal Component Analysis (PCA) and t-Distributed Stochastic Neighbor Embedding (t-SNE). Additionally, we use cluster evolution plots to display how the clusters evolve as different parts of the time series are used to perform the clustering, thus emphasizing the temporal aspect of time series clustering. Finally, the results indicate that a manual qualitative analysis of the clustering results is necessary to finely tune the candidate clustering methods. Performing this analysis highlights flaws of the other validation methods, as well as allows the user to select the best method out of a few candidates based on the use case and the reason for performing the clustering.
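The remark about trivial clustering solutions can be illustrated with a small Python sketch. It uses within-cluster sum of squares as a stand-in compactness measure; this is an assumption for illustration and not necessarily one of the validation measures tested in the project. The function name `within_cluster_ss` and the toy data are hypothetical.

```python
def within_cluster_ss(points, labels):
    """Sum of squared distances from each point to its cluster centroid."""
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    total = 0.0
    for members in groups.values():
        dim = len(members[0])
        centroid = [sum(m[i] for m in members) / len(members) for i in range(dim)]
        total += sum(
            sum((m[i] - centroid[i]) ** 2 for i in range(dim)) for m in members
        )
    return total


# Toy data: two visually obvious groups of two points each.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]

one_cluster = within_cluster_ss(points, [0, 0, 0, 0])   # everything together
two_clusters = within_cluster_ss(points, [0, 0, 1, 1])  # the natural grouping
singletons = within_cluster_ss(points, [0, 1, 2, 3])    # one cluster per point
```

Here the score keeps falling as clusters are added and reaches zero when every point is its own cluster, so a measure that rewards compactness alone would pick the trivial solution with as many clusters as data points.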
|