141

Modélisation statistique de l’érosion de cavitation d’une turbine hydraulique selon les paramètres d’opération / Statistical modeling of cavitation erosion in a hydraulic turbine as a function of operating parameters

Bodson-Clermont, Paule-Marjolaine 03 1900
In a hydraulic turbine, the rotation of the blades through the water creates a low-pressure zone that causes the water to pass from the liquid to the gaseous state. This phase change, similar to boiling, is called cavitation. When the resulting vapor cavities implode near solid surfaces, they cause severe erosion of the material, significantly accelerating the degradation of the runner and forcing costly repairs and loss of revenue. A vibration-based system for detecting cavitation erosion on operating turbines was installed about ten years ago for continuous monitoring of four generating units at one power station; a new hardware version, installed in 2010, is more reliable and allows the erosion rate of the runners to be estimated accurately in kg/10,000 h. This project has two main objectives. The first is to study cavitation behavior on a target generating unit and to build a statistical model predicting cavitation as a function of operating variables such as gate opening, water flow, and headwater and tailwater levels. The second is to develop a methodology that makes the study reproducible at other sites. A retrospective study is conducted, focusing on data available since the 2010 system update. Preliminary results highlighted the heterogeneity of cavitation behavior and changes in the relationship between cavitation and various operating variables, possibly due to seasonal behavior or different operating conditions. Using hierarchical clustering and multiple linear regression models, we formalize this heterogeneity in a probabilistic model that includes operating variables such as active power, tailwater level, and gate opening.
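As a purely illustrative sketch of the two-stage approach the abstract describes (hierarchical clustering of operating regimes, then a multiple linear regression per regime), with hypothetical variable names and synthetic data rather than the thesis's actual measurements:

```python
# Sketch: cluster operating conditions hierarchically, then fit a separate
# linear regression of the cavitation rate within each regime.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
# Hypothetical operating variables: gate opening, flow, tailwater level, active power
X = rng.normal(size=(500, 4))
# Hypothetical cavitation-erosion response
y = 2.0 + X @ np.array([0.8, -0.3, 0.5, 1.2]) + rng.normal(scale=0.2, size=500)

# Step 1: hierarchical clustering (Ward linkage) on the operating variables
Z = linkage(X, method="ward")
labels = fcluster(Z, t=3, criterion="maxclust")  # e.g. 3 operating regimes

# Step 2: one multiple linear regression per regime
for k in np.unique(labels):
    mask = labels == k
    model = LinearRegression().fit(X[mask], y[mask])
    print(f"regime {k}: n={mask.sum()}, R^2={model.score(X[mask], y[mask]):.3f}")
```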
142

Inégalités d'oracle et mélanges / Oracle inequalities and mixtures

Montuelle, Lucie 04 December 2014
This manuscript focuses on two function estimation problems. For each, a non-asymptotic guarantee of the proposed estimator's performance is provided through an oracle inequality. For conditional density estimation, mixtures of Gaussian regressions with exponential weights depending on the covariate are used. Model selection by penalized maximum likelihood is applied and a condition on the penalty is derived; the condition is satisfied by a penalty proportional to the model dimension. The procedure is accompanied by an algorithm combining EM and Newton steps, tested on synthetic and real data sets. In the regression framework with sub-Gaussian noise, aggregating linear estimators with exponential weights yields an oracle inequality in deviation, by means of PAC-Bayesian techniques. The main advantage of the proposed estimator is that it is easy to compute. Furthermore, taking the sup-norm of the regression function into account establishes a continuum between sharp and weak oracle inequalities.
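For readers unfamiliar with the aggregation step, this is the generic, schematic form of exponential-weight aggregation and of the resulting oracle inequality; the exact constants and remainder terms established in the thesis differ:

```latex
% Exponential-weight aggregation of estimators \hat f_1, ..., \hat f_M, where
% r_k is a risk estimate for \hat f_k, \beta > 0 a temperature parameter and
% \pi = (\pi_k) a prior on the family:
\hat{f}^{\mathrm{EWA}} = \sum_{k=1}^{M} \hat{\theta}_k \hat{f}_k,
\qquad
\hat{\theta}_k = \frac{\pi_k \exp\{-\beta\, r_k\}}{\sum_{j=1}^{M} \pi_j \exp\{-\beta\, r_j\}} .
% A (sharp) oracle inequality then bounds the risk R of the aggregate by that
% of the best estimator in the family plus a remainder term:
R\big(\hat{f}^{\mathrm{EWA}}\big) \le \inf_{k} \Big\{ R(\hat{f}_k) + \frac{\log(1/\pi_k)}{\beta} \Big\} + \text{remainder}.
```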
143

Modelos de mistura para dados com distribuições Poisson truncadas no zero / Mixture models for data with zero truncated Poisson distributions

Gigante, Andressa do Carmo 22 September 2017
Mixture models have been used for a long time, but have recently attracted renewed attention thanks to the development of more efficient estimation methods. In this dissertation, mixture models are used to cluster or segment data under Poisson and zero-truncated Poisson distributions. Two approaches to the truncation problem were studied. In the first, truncation is applied within each mixture component, i.e., the components are zero-truncated Poisson distributions. In the second, alternatively, truncation is applied to the resulting mixture of ordinary Poisson distributions. The parameters of interest were estimated by maximum likelihood, which requires an iterative method; accordingly, we implemented the EM algorithm for both approaches. A simulation study of the resulting algorithms produced estimates close to the true parameter values. We applied the algorithms to a real data set from an electronic store and used the AIC and BIC model-selection criteria to choose the best model. Zero truncation appears to affect the component-wise approach more strongly, producing estimates with substantial bias for the zero-truncated Poisson components, whereas truncating the mixture directly yields estimates with less bias.
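A minimal sketch of an EM algorithm for a mixture of zero-truncated Poisson components (the first approach above) might look as follows; this is an illustration under stated assumptions, not the dissertation's implementation:

```python
# EM for a mixture of zero-truncated Poisson (ZTP) components. The ZTP pmf is
# P(X=k) = exp(-lam) lam^k / (k! (1 - exp(-lam))), k >= 1, with mean
# lam / (1 - exp(-lam)); the M-step inverts that mean equation numerically.
import numpy as np
from scipy.special import gammaln

def ztp_logpmf(x, lam):
    return -lam + x * np.log(lam) - gammaln(x + 1) - np.log1p(-np.exp(-lam))

def ztp_lambda_from_mean(m, iters=50):
    # Fixed-point solve of lam = m * (1 - exp(-lam)), valid for m > 1
    lam = m
    for _ in range(iters):
        lam = m * (1.0 - np.exp(-lam))
    return lam

def em_ztp_mixture(x, n_comp=2, n_iter=200):
    lam = np.quantile(x, np.linspace(0.25, 0.75, n_comp))  # crude init
    pi = np.full(n_comp, 1.0 / n_comp)
    for _ in range(n_iter):
        # E-step: posterior responsibilities of each component for each point
        logp = np.log(pi) + np.stack([ztp_logpmf(x, l) for l in lam], axis=1)
        logp -= logp.max(axis=1, keepdims=True)
        resp = np.exp(logp)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: mixing weights, then rates via the ZTP mean equation
        pi = resp.mean(axis=0)
        means = (resp * x[:, None]).sum(axis=0) / resp.sum(axis=0)
        lam = np.array([ztp_lambda_from_mean(m) for m in means])
    return pi, lam

x = np.concatenate([np.random.poisson(1.5, 2000), np.random.poisson(8.0, 1000)])
x = x[x > 0]  # the observed data are zero-truncated
print(em_ztp_mixture(x))
```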
144

Alternative regression models to Beta distribution under Bayesian approach / Modelos de regressão alternativos à distribuição Beta sob abordagem bayesiana

Paz, Rosineide Fernando da 25 August 2017
The Beta distribution is a bounded-support distribution that has dominated the modeling of random variables taking values between 0 and 1. Bounded-domain data arise in many situations, such as rates, proportions, and indices. Motivated by an analysis of electoral vote percentages, in which a distribution supported on the positive real numbers was used although a distribution with bounded support would have been more suitable, we focus on alternatives to the Beta distribution, with emphasis on regression models. We first present the Simplex mixture model as a flexible model for bounded random variables, and then extend it to the regression setting by including covariates. Parameter estimation is discussed for both models under Bayesian inference. We apply the models to simulated data sets to investigate the performance of the estimators; the results were satisfactory in all cases investigated. Finally, we introduce a parameterization of the L-Logistic distribution for use in regression models and extend it to a mixture of mixed regression models.
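For reference, the density of the Simplex distribution that underlies the proposed mixture, stated here in Jørgensen's dispersion-model parameterization from the standard literature rather than from the thesis itself:

```latex
% Simplex density with mean parameter 0 < \mu < 1 and dispersion \sigma^2 > 0:
f(y; \mu, \sigma^2)
  = \left[ 2\pi\sigma^{2}\, \{y(1-y)\}^{3} \right]^{-1/2}
    \exp\!\left\{ -\frac{1}{2\sigma^{2}}\,
      \frac{(y-\mu)^{2}}{y(1-y)\,\mu^{2}(1-\mu)^{2}} \right\},
  \qquad 0 < y < 1 .
% A K-component mixture then takes the form
f(y) = \sum_{k=1}^{K} w_k\, f(y; \mu_k, \sigma_k^2), \qquad \sum_k w_k = 1 .
```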
145

New statistical modeling of multi-sensor images with application to change detection / Nouvelle modélisation statistique des images multi-capteurs et son application à la détection des changements

Prendes, Jorge 22 October 2015
Remote sensing images are images of the Earth's surface acquired from satellites or airborne equipment. These images are becoming widely available and their sensor technology is evolving fast: classical sensors are improving in resolution and noise level, while new kinds of sensors have proven useful, with multispectral sensors now standard and synthetic aperture radar (SAR) images very popular. The availability of different kinds of sensors is very advantageous, since it allows us to capture a wide variety of properties of the objects in a scene; these properties can be exploited to extract richer information about the objects. One of the main applications of remote sensing is the detection of changes in multitemporal datasets (images of the same area acquired at different times). Change detection for images acquired by homogeneous sensors has been of interest for a long time, but the wide range of sensors found in remote sensing makes detecting changes between images acquired by heterogeneous sensors an interesting and much harder challenge. Accurate change detectors adapted to heterogeneous sensors are needed for the management of natural disasters. Databases of optical images are readily available for an extensive catalog of locations, but good weather conditions and daylight are required to capture them; SAR images, on the other hand, can be captured quickly regardless of the weather conditions or the time of day. For these reasons, optical and SAR images are of specific interest for tracking natural disasters by detecting the changes before and after the event. The main interest of this thesis is to study statistical approaches to detect changes in images acquired by heterogeneous sensors. Chapter 1 presents an introduction to remote sensing images, briefly reviews the change detection methods proposed in the literature, and presents the motivation for detecting changes between heterogeneous sensors and its difficulties. Chapter 2 studies the statistical properties of co-registered images in the absence of change, in particular optical and SAR images; a finite mixture model is proposed to describe the statistics of these images, and the performance of classical statistical change detection methods is studied under the proposed model, showing several situations in which these methods fail. Chapter 3 studies the properties of the parameters associated with the proposed mixture model: assuming the model parameters belong to a manifold in the absence of change, a new similarity measure is constructed that overcomes the limitations of classical statistical approaches; an approach to estimate this similarity measure is described, and the resulting change detection strategy is validated on synthetic images and compared with previous strategies. Chapter 4 studies a Bayesian nonparametric algorithm that improves the estimation of the proposed similarity measure, based on a Chinese restaurant process (CRP) and a Markov random field taking advantage of the spatial correlation between adjacent pixels of the image; a new Jeffreys prior for the concentration parameter of the CRP is defined, the model parameters are estimated with a collapsed Gibbs sampler, and the strategy is validated on synthetic images and compared with the previously proposed strategy. Finally, Chapter 5 is dedicated to the validation of the proposed change detection framework on real datasets, where encouraging results are obtained in all cases. Including the Bayesian nonparametric model in the change detection strategy improves performance, at the expense of an increased computational cost.
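For context, the Chinese restaurant process mentioned in Chapter 4 assigns a pixel to an existing class or to a new one with the following predictive probabilities (standard form; the thesis further combines this with a spatial Markov random field and a Jeffreys prior on the concentration parameter):

```latex
% CRP predictive rule with concentration parameter \alpha: the (n+1)-th pixel
% joins an existing class c with n_c current members, or opens a new class.
P(z_{n+1} = c \mid z_{1:n}) = \frac{n_c}{n + \alpha}, \qquad
P(z_{n+1} = \text{new} \mid z_{1:n}) = \frac{\alpha}{n + \alpha} .
```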
146

Analyse statistique d'IRM quantitatives par modèles de mélange : Application à la localisation et la caractérisation de tumeurs cérébrales / Statistical analysis of quantitative MRI based on mixture models : Application to the localization and characterization of brain tumors

Arnaud, Alexis 24 October 2018
We present a generic and automatic method for the localization and characterization of brain lesions, such as primary tumors, from multi-contrast MRI. Thanks to a recent generalization of probability distributions by scale mixtures of Gaussians, we can model a large variety of interactions between the measured MRI parameters, with the aim of capturing the heterogeneity present in both healthy and damaged brain tissue. Building on these distributions, we propose an all-in-one protocol for analyzing multi-contrast MRI data: starting from quantitative MRI, the protocol determines whether a lesion is present and, if so, provides its localization and type by means of probabilistic models. We also develop two extensions of this protocol. The first concerns the automatic selection of the number of mixture components, carried out in a Bayesian framework. The second takes the spatial structure of MRI data into account by adding a latent Markov random field to the protocol.
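The scale-mixture-of-Gaussians family referred to above has the following generic form, where the mixing distribution H on a positive scale factor w generates heavier-tailed alternatives to the Gaussian (e.g., a multivariate Student-t when H is a Gamma distribution); this is the textbook form, stated here for orientation:

```latex
% Multivariate scale mixture of Gaussians: the covariance is divided by a
% positive latent weight w distributed according to H.
f(\mathbf{y}; \boldsymbol{\mu}, \boldsymbol{\Sigma})
  = \int_{0}^{\infty}
    \mathcal{N}\!\big(\mathbf{y};\, \boldsymbol{\mu},\, \boldsymbol{\Sigma}/w\big)\,
    \mathrm{d}H(w) .
```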
147

台灣地區公共電視使用之願付價格分析 / Assessing willingness to pay for maintaining the operation of Public Television Service in Taiwan.

黃慧甄, Huang, Huei Jhen Unknown Date
This study explores people's willingness to pay (WTP) for maintaining the operation and development of the Public Television Service (PTS) in Taiwan. The survey, based on the contingent valuation method, was conducted by the Center for Survey Research, Academia Sinica. The sample was split into two groups, A and B, each presented with the same scenario but a different scope of benefits: Group A considered the benefits PTS might bring to one's family, while Group B considered the benefits PTS might bring to society as a whole. The model used was a one/two-component hybrid model, which can separate respondents who are willing to pay a reasonable price from those who are not, and obtain a mean WTP estimate for the former. The multinomial logistic part of the model indicated that, in Group A, about 24.48% of respondents aged 50 or older were unwilling to pay any price or were nay-sayers; in Group B, about 12.91% of respondents who seldom watched cultural or educational programs were unwilling to pay any price or were nay-sayers. Applying an accelerated failure time model to the respondents willing to pay a reasonable price enabled us to evaluate WTP; we introduced explanatory variables not only in the location parameter but also in the scale parameter. The estimated mean WTP was NT$1,477 per year for Group A and NT$1,663 per year for Group B, indicating that respondents were willing to pay more when considering the benefits to society as a whole.
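A schematic of an accelerated failure time model with explanatory variables entering both the location and the scale parameter, as described above; the covariate vectors and coefficients are generic placeholders, not the study's actual specification:

```latex
% W_i is respondent i's WTP, \varepsilon_i an error term with a fixed
% distribution (e.g. logistic or extreme value); covariates x_i shift the
% location while covariates z_i modulate the scale.
\log W_i = \mathbf{x}_i^{\top}\boldsymbol{\beta} + \sigma_i\,\varepsilon_i,
\qquad
\log \sigma_i = \mathbf{z}_i^{\top}\boldsymbol{\gamma} .
```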
148

Bayesian Cluster Analysis: Some Extensions to Non-standard Situations

Franzén, Jessica January 2008
The Bayesian approach to cluster analysis is presented. We assume that all data stem from a finite mixture model, where each component corresponds to one cluster and is given by a multivariate normal distribution with unknown mean and variance. The method produces posterior distributions of all cluster parameters and proportions, as well as the associated cluster probabilities for all objects. We extend this method in several directions to common but non-standard situations. The first extension covers the case of a few deviant observations that do not belong to any of the normal clusters: an extra component/cluster is created for them, with a larger variance or a different distribution, e.g., uniform over the whole range. The second extension is the clustering of longitudinal data: all units are clustered separately at each time point, and the movements between time points are modeled by Markov transition matrices, so that the clustering at one time point is affected by what happens at the neighbouring time points. The third extension handles datasets with missing data, e.g., item non-response: the missing values are imputed iteratively in an extra step of the Gibbs sampler estimation algorithm. Bayesian inference for mixture models has many advantages over the classical approach, but it is not without computational difficulties. A software package written in Matlab for Bayesian inference of mixture models is introduced; its programs handle the basic case of clustering data assumed to arise from mixtures of multivariate normal distributions, as well as the non-standard situations above.
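The core model underlying all three extensions can be written as a finite mixture of multivariate normals with latent cluster allocations; this is the standard formulation, restated here for clarity:

```latex
% Finite mixture of K multivariate normal clusters in p dimensions, with
% latent allocation z_i for each object i, estimated by Gibbs sampling:
y_i \mid z_i = k \;\sim\; \mathcal{N}_p(\mu_k, \Sigma_k),
\qquad
P(z_i = k) = \pi_k, \qquad \sum_{k=1}^{K} \pi_k = 1 .
% The extensions add (i) an extra wide-variance or uniform component for
% deviant observations, (ii) Markov transition matrices linking cluster
% memberships at consecutive time points, and (iii) an imputation step for
% missing values inside the Gibbs sweep.
```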
149

Conditioning of unobserved period-specific abundances to improve estimation of dynamic populations

Dail, David (David Andrew) 28 February 2012
Obtaining accurate estimates of animal abundance is made difficult by the fact that most animal species are detected imperfectly. However, early attempts at building likelihood models that account for unknown detection probability impose a simplifying assumption that is unrealistic for many populations: no births, deaths, immigration, or emigration can occur in the population throughout the study (i.e., population closure). In this dissertation, I develop likelihood models that account for unknown detection and do not require assuming population closure; in fact, the proposed models yield a statistical test for population closure. The basic idea is a procedure in three steps: (1) condition the probability of the observed data on the (unobserved) period-specific abundances; (2) multiply this conditional probability by the (prior) likelihood for the period abundances; and (3) remove (via summation) the period-specific abundances from the joint likelihood, leaving the marginal likelihood of the observed data. The utility of this procedure is two-fold: step (1) allows detection probability to be estimated more accurately, and step (2) allows population dynamics such as the entering migration rate and survival probability to be modeled. The main difficulty of this procedure arises in the summation in step (3), although it is greatly simplified by assuming that abundances in one period depend only on the most recent previous period (i.e., abundances have the Markov property). I apply this procedure to form abundance and site occupancy rate estimators both in the setting where observed point counts are available and in the setting where only the presence or absence of an animal species is observed. Although the two settings yield very different likelihood models and estimators, the basic procedure forming these estimators is the same in both. / Graduation date: 2012
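The three-step construction can be written schematically as follows; the notation is assumed here for illustration (y_t the observed counts in period t, N_t the latent abundances, p_det the detection probability, and Markov transitions encoding recruitment and survival):

```latex
% Marginal likelihood of the observed data: condition on the latent
% abundances, multiply by their Markovian prior, and sum them out.
L(\theta \mid y) \;=\;
  \sum_{N_1} \cdots \sum_{N_T}
  \left[ \prod_{t=1}^{T} p\big(y_t \mid N_t,\, p_{\mathrm{det}}\big) \right]
  p(N_1 \mid \lambda)\, \prod_{t=2}^{T} p\big(N_t \mid N_{t-1}\big) .
```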
150

Modèle de mélange de lois multinormales appliqué à l'analyse de comportements et d'habiletés cognitives d'enfants / Multivariate normal mixture models applied to the analysis of children's behaviors and cognitive abilities

Giguère, Charles-Édouard 11 1900
This study examines the use of mixture models to analyze behavioral and cognitive-ability data measured repeatedly across children's development. The estimation of multivariate normal mixture models using the EM algorithm is explained in detail. This algorithm greatly simplifies the computations because it estimates the parameters of each group separately, which makes it easier to model the covariance of the observations across time, a point often disregarded when estimating mixture models. The study focuses on the consequences of a misspecified covariance structure for estimating the number of groups in a mixture. The main consequence is an overestimation of the number of groups, i.e., estimating groups that do not exist. In particular, assuming independence of the observations across time when they were in fact correlated resulted in the estimation of several non-existent groups. This overestimation of the number of groups also leads to overparameterization, i.e., using more parameters than necessary to model the data. Finally, mixture models were fitted to behavioral and cognitive-ability data, first assuming a covariance structure and then assuming independence. In most cases, adding a covariance structure resulted in fewer estimated groups, and the results were simpler and clearer to interpret.
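A small synthetic experiment, not from the thesis, that reproduces the qualitative finding above: with serially correlated repeated measures, a mixture that wrongly assumes independence across time (diagonal covariance) tends to select more groups by BIC than one with a full covariance matrix:

```python
# One true group of 4 correlated repeated measures; compare the BIC-selected
# number of mixture components under full vs. diagonal covariance.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
T = 4
# AR(1)-style correlation across the 4 time points
cov = 0.9 ** np.abs(np.subtract.outer(np.arange(T), np.arange(T)))
X = rng.multivariate_normal(mean=np.zeros(T), cov=cov, size=600)

for cov_type in ("full", "diag"):
    bics = [GaussianMixture(n_components=k, covariance_type=cov_type,
                            random_state=0).fit(X).bic(X)
            for k in range(1, 6)]
    print(cov_type, "-> BIC-selected number of groups:", int(np.argmin(bics)) + 1)
```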
