Global ETD Search

11	On the uses of generalised linear mixed models for the simultaneous investigation of multiple performance indicators Hewson, Paul James January 2005 No description available. 519.53
12	Bayesian model-based clustering Fuentes Garcia, Ruth S. January 2004 No description available. 519.53
13	Supervised and unsupervised model-based clustering with variable selection Cozzini, Alberto Maria January 2012 The thesis tackles the problem of uncovering hidden structures in high-dimensional data in the presence of noise and non-informative variables. It proposes a supervised and an unsupervised mixture model that select the relevant variables and are robust to measurement errors and outliers. Within the class of unsupervised clustering models we extend variable selection to the family of Student's t mixture models. While t distributions are naturally robust to noise and extreme events, sparsity is achieved by imposing regularization on the location and dispersion parameters. An EM algorithm is implemented to return the maximum likelihood estimate of the model parameters given the added penalty term. To further assess the contribution of each variable we propose a resampling procedure that ranks the variables according to their selection probability. Supervised clustering is implemented in a Bayesian framework. The model assumes a mixture of Lasso-type regressions with t-distributed errors. While the Lasso representation of the normal linear model imposes regularization on the regression coefficients, variable selection is explicitly modelled by a latent binary indicator variable. The model relies on a particle Markov chain Monte Carlo algorithm to approximate the posterior distribution of the parameters of interest. To highlight the properties and advantages of the proposed models, two real-life problems are considered. The first requires us to identify subtypes of breast cancer tumors by grouping patients based only on their gene expression levels when only a few of the thousands of genes are informative. In the second, our aim is to cluster different financial markets spanning several macro sectors and explain their trading performance solely on the basis of the observed statistical features of their price dynamics. 519.53
14	Appariement de descripteurs évoluant dans le temps : application à la comparaison d'assurance / Matching data descriptor over time : application to insurance comparison Bedenel, Anne-Lise 03 April 2019 Most standard learning methods require the data descriptors to be identical for the training and test samples. In online insurance comparison, however, the forms and variables from which the data come are regularly modified, which leaves only a small amount of data to work with and complicates the analysis. The goal is therefore to use the data collected before a variable was modified to increase the size of the samples observed after the modification. We propose to transfer knowledge between the data observed before and after the modification. A model of the joint distribution of the variable before and after the modification is proposed. The problem then becomes an estimation problem in a graph, where the identifiability of the model is ensured by business and technical constraints, leading us to work with a reduced set of very parsimonious models. The links between the descriptors before and after the modification are entirely unknown, which results in missing data. Two parameter estimation methods based on EM algorithms are proposed. A model selection step is then carried out using an asymptotic criterion and a non-asymptotic criterion based on Bayesian analysis, which includes an importance sampling strategy combined with a Gibbs algorithm. An exhaustive search and a non-exhaustive search, based on a genetic algorithm and combining model estimation and selection, are compared to obtain the best trade-off between quality of results and computation time. The thesis ends with an application to real data. Model selection 519.53
15	Μη-συμβατικά θεμέλια της ασαφούς πιθανοθεωρίας και της στατιστικής ασαφών δεδομένων / Non-conventional foundations of fuzzy probability theory and of the statistics of fuzzy data Θεοδωρόπουλος, Παναγιώτης 08 October 2009 519.53 Statistics Probabilities
16	Ανάλυση διασποράς & εφαρμογές αυτής / Analysis of variance and its applications Ρήγα, Βασιλική 27 August 2008 519.53 Dispersion Model ANOVA
17	An approach to estimating the variance components to unbalanced cluster sampled survey data and simulated data Ramroop, Shaun 30 November 2002 Statistics / M. Sc. (Statistics) 519.53 Multilevel models (Statistics) Linear models (Statistics) Cluster set theory
18	Η χρήση της ανάλυσης συστάδων στις οικονομικές επιστήμες / The use of cluster analysis in the economic sciences Καϊμακά, Ελένη 17 September 2012 This diploma thesis reviews applications of cluster analysis in economics. The first chapter describes the general methodology of cluster analysis and the steps followed to carry it out. In particular, it presents the two main clustering methods, hierarchical and divisive, as well as the distance metrics and linkage algorithms commonly used in cluster analysis. The second chapter describes the fields of economics in which, according to the literature, cluster analysis is applied, and briefly reviews the clustering methodology preferred in each field. 519.53 Cluster analysis Economics
19	Ανάλυση συστάδων (cluster analysis) Καράγεωργα, Ισμήνη 12 April 2013 This diploma thesis examines the problem of cluster analysis. The purpose of cluster analysis is to group items into clusters so that items belonging to the same cluster are more similar to one another than to items belonging to different clusters. 519.53 Cluster analysis Clustering
