131 |
Parallel Tomographic Image Reconstruction On Hierarchical Bus-Based And Extended Hypercube Architectures
Rajan, K 07 1900
No description available.
|
132 |
Apprentissage supervisé à partir des multiples annotateurs incertains / Supervised Learning from Multiple Uncertain Annotators
Wolley, Chirine 01 December 2014
In supervised learning tasks, obtaining the ground-truth label for each instance of the training dataset can be difficult, time-consuming, and expensive. With the advent of the Internet, a growing number of web services offer crowdsourcing as a way to collect a large enough set of labels from internet users. These services make it easy to collect labels from anonymous annotators and so considerably simplify the process of building labeled datasets. Their main drawback is the lack of control over the annotators: the accuracy of the labels and the level of expertise of each labeler cannot be verified, so the resulting data are not necessarily reliable. Managing the annotators' uncertainty is therefore key to learning from multiple non-expert annotators. This thesis proposes three probabilistic algorithms that handle annotator uncertainty and data quality during the learning phase. IGNORE generates a classifier that predicts the label of a new instance and evaluates the performance of each annotator according to their level of uncertainty. X-IGNORE assumes that annotator performance depends both on their uncertainty and on the quality of the data they annotate. Finally, ExpertS addresses annotator selection during learning: it eliminates the least reliable annotators and learns the classifier only from the labels of the good annotators (experts). Extensive experiments on synthetic and real-world medical data show the performance and stability of our models compared with state-of-the-art algorithms.
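The idea of scoring annotators by their reliability while inferring the true labels can be illustrated with a much simpler model than those in the thesis. The sketch below is not IGNORE, X-IGNORE, or ExpertS; it is a hypothetical "one-coin" EM in which each annotator has a single unknown accuracy and no instance features are used. The function names, the simulated accuracies, and the majority-vote initialisation are all assumptions made for illustration.

```python
import random


def simulate_labels(n_items, accuracies, seed=0):
    """Simulate binary ground truth and noisy labels (hypothetical data)."""
    rng = random.Random(seed)
    truth = [rng.randint(0, 1) for _ in range(n_items)]
    labels = [[t if rng.random() < a else 1 - t for a in accuracies]
              for t in truth]
    return truth, labels


def one_coin_em(labels, n_iter=50):
    """EM for a one-coin annotator model.

    labels[i][j] in {0, 1} is the label annotator j gave to item i.
    Returns the posterior P(true label = 1) per item and the estimated
    accuracy of each annotator.
    """
    n_items, n_annot = len(labels), len(labels[0])
    # Initialise the posteriors with the fraction of positive votes.
    q = [sum(row) / n_annot for row in labels]
    acc = [0.5] * n_annot
    for _ in range(n_iter):
        # M-step: class prior and per-annotator accuracy.
        prior = sum(q) / n_items
        acc = []
        for j in range(n_annot):
            agree = sum(q[i] if labels[i][j] == 1 else 1.0 - q[i]
                        for i in range(n_items))
            # Clamp away from 0 and 1 so the E-step never divides by zero.
            acc.append(min(max(agree / n_items, 1e-6), 1.0 - 1e-6))
        # E-step: posterior probability that each true label is 1.
        for i in range(n_items):
            p1, p0 = prior, 1.0 - prior
            for j in range(n_annot):
                if labels[i][j] == 1:
                    p1 *= acc[j]
                    p0 *= 1.0 - acc[j]
                else:
                    p1 *= 1.0 - acc[j]
                    p0 *= acc[j]
            q[i] = p1 / (p1 + p0)
    return q, acc
```

Calling `one_coin_em(labels)` on data from `simulate_labels` returns one estimated accuracy per annotator; dropping annotators whose estimate falls below a threshold would be one crude form of annotator selection, though the thesis may well proceed differently.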
|
133 |
Approche EM pour modèles multi-blocs à facteurs à une équation structurelle / EM estimation of a structural equation model
Tami, Myriam 12 July 2016
Structural equation models with latent variables make it possible to model relationships between observed and unobserved variables. The two current paradigms for estimating these models are component-based partial least squares and covariance-structure analysis. In this work, after describing the two main estimation methods, PLS and LISREL, we propose an estimation approach that uses the EM algorithm to maximize the overall likelihood of a latent-factor model with one structural equation. We study its performance on simulated data and show, through an application to real environmental data, how to build a model in practice and assess its quality. Finally, we apply the approach in the context of an oncology clinical trial to study longitudinal quality-of-life data. We show that, by efficiently reducing the dimension of the data, the EM approach simplifies the longitudinal analysis of quality of life by avoiding multiple testing, and thus helps evaluate the clinical benefit of a treatment.
|
134 |
"Uma aplicação industrial de regressão binária com erros na variável explicativa" / "An industrial application of binary regression with errors-in-variable explanatory"
Daniel Fernando de Favari 22 June 2006
In this work, we apply a binary regression model with measurement error in the explanatory variable to analyze attribute measurement systems. We use the logistic model with errors in variables, for which we obtain maximum likelihood estimates via the EM algorithm, along with the observed Fisher information matrix. We also carry out a simulation study to compare the analytic method, the logistic model without errors in variables (naive), and the logistic model with errors in variables. Finally, we apply our methodology to evaluate a go/no-go measurement system at the largest Diesel engine manufacturer in the world (MWM International).
|
135 |
Análise dos resultados de ensaios de proficiência via modelos de regressão com variável explicativa aleatória / Analysis of proficiency tests results via regression models with random explanatory variable
Aline Othon Montanari 21 June 2004
In a proficiency testing (PT) program conducted by the Grupo de Motores, a group of eleven laboratories in the temperature field took measurements at five points on the scale of a thermocouple. In this work, we propose a regression model with a random explanatory variable X representing the standard thermocouple, which we call the artifact, and a dependent variable Y representing the laboratories' measurements. The comparison procedure is simple: the artifact and the laboratory's thermocouple under test are both placed in the furnace, and the differences between their measurements are recorded. For the data analysis, the response variable is the difference between that difference (laboratory equipment versus artifact) and the reference value, which is determined by two laboratories belonging to the Brazilian Calibration Network (RBC). The measurement error has a variance determined by calibration, that is, a known variance. We then find approximations to the maximum likelihood estimates of the model parameters via the EM algorithm. In addition, we propose a strategy to assess the consistency of the laboratories participating in the PT program.
|
136 |
Contributions statistiques aux prévisions hydrométéorologiques par méthodes d’ensemble / Statistical contributions to hydrometeorological forecasting from ensemble methods
Courbariaux, Marie 27 January 2017
In this thesis, we are interested in representing and accounting for uncertainties in medium-term probabilistic hydrological forecasting systems. These uncertainties come mainly from two sources: (1) the imperfection of the meteorological forecasts used as inputs to these systems, and (2) the imperfect representation of the hydrological process by the rainfall-runoff simulator (RRS) at the heart of these systems. The performance of a probabilistic forecasting system is assessed by the sharpness of its predictions conditional on its reliability. The statistical approach we follow guarantees reliability provided that its underlying assumptions are realistic. We also seek to gain sharpness by incorporating auxiliary information. For each source of uncertainty, we propose a method that enables this incorporation: (1) a meteorological post-processor based on the statistical property of exchangeability, which can take several forecast sources into account, whether ensemble or deterministic; and (2) a hydrological post-processor that uses the RRS state variables through a Probit model arbitrating between two interpretable hydrological regimes, thereby representing an uncertainty with heterogeneous variance. Both methods adapt well to the varied application cases provided by EDF and Hydro-Québec, partners and funders of the project. They are also simpler and more formal than the operational methods while showing similar performance.
|
137 |
Efficacité de l’algorithme EM en ligne pour des modèles statistiques complexes dans le contexte des données massives / Efficiency of the online EM algorithm for complex statistical models in the context of massive data
Martel, Yannick 11 1900
The EM algorithm (Dempster et al., 1977) yields a sequence of estimators that converges to the maximum likelihood estimator for missing-data models in which the maximum likelihood estimator cannot be computed directly. The algorithm is remarkable for its many applications in statistical learning, but it can carry a heavy computational cost. Cappé and Moulines (2009) proposed an online version of the algorithm for models belonging to the exponential family, which yields substantial gains in computational efficiency on large data sets. However, the conditional expectation of the sufficient statistic, which their version requires, is rarely tractable for complex models or when the missing data are high-dimensional. It must then be replaced by an estimator. Several questions arise naturally: do the convergence results for the original algorithm still hold when the expectation is replaced by an estimator? In particular, what can be said about the asymptotic normality of the resulting sequence of estimators, its asymptotic variance, and its rate of convergence? How does the variance of the estimator of the expectation affect the asymptotic variance of the EM estimator? Can Monte Carlo or MCMC estimators be used? Can popular variance reduction tools such as control variates help? These questions are studied through examples of latent variable models. The main contributions of this master's thesis are a unified presentation of stochastic approximation EM algorithms, an illustration of the impact on the variance when the conditional expectation is estimated in online EM algorithms, and the introduction of online EM algorithms that reduce the additional variance caused by estimating the conditional expectation.
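The online EM recursion referred to in this abstract replaces the batch E-step with a running average of expected sufficient statistics, updated one observation at a time, followed by the usual M-step map. A minimal sketch, assuming a two-component Gaussian mixture with known unit variances (a case where the conditional expectation is available in closed form, unlike the complex models studied in the thesis), is given below. The step size `n ** -alpha`, the `burn_in` period, and all function names are illustrative assumptions, not taken from the thesis.

```python
import math
import random


def simulate(n, seed=0):
    """Draw n points from 0.3*N(-2, 1) + 0.7*N(2, 1) (hypothetical data)."""
    rng = random.Random(seed)
    return [rng.gauss(-2.0, 1.0) if rng.random() < 0.3 else rng.gauss(2.0, 1.0)
            for _ in range(n)]


def online_em_gmm(stream, mu_init=(-1.0, 1.0), alpha=0.6, burn_in=20):
    """Online EM for a 2-component Gaussian mixture with unit variances.

    Returns the estimated weights and means after one pass over the stream.
    """
    w = [0.5, 0.5]
    mu = list(mu_init)
    # Running averages of the complete-data sufficient statistics:
    # s0[k] tracks E[1{z=k}] and s1[k] tracks E[1{z=k} * x].
    s0 = w[:]
    s1 = [w[k] * mu[k] for k in range(2)]
    for n, x in enumerate(stream, start=1):
        gamma = n ** (-alpha)  # decreasing step size
        # E-step for a single observation: posterior responsibilities.
        dens = [w[k] * math.exp(-0.5 * (x - mu[k]) ** 2) for k in range(2)]
        total = sum(dens)
        resp = [d / total for d in dens]
        # Stochastic-approximation update of the sufficient statistics.
        for k in range(2):
            s0[k] += gamma * (resp[k] - s0[k])
            s1[k] += gamma * (resp[k] * x - s1[k])
        # M-step: map the statistics back to parameters (after a short
        # burn-in, so early noisy statistics do not drive the parameters).
        if n > burn_in:
            w = s0[:]
            mu = [s1[k] / s0[k] for k in range(2)]
    return w, mu
```

For example, `online_em_gmm(simulate(20000))` processes the sample in a single pass. When the responsibilities cannot be computed exactly, the line computing `resp` is where a Monte Carlo or MCMC estimate would be substituted, which is the situation whose variance the abstract discusses.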
|
138 |
Statistická analýza výběrů ze zobecněného exponenciálního rozdělení / Statistical analysis of samples from the generalized exponential distribution
Votavová, Helena January 2014
This master's thesis deals with the generalized exponential distribution as an alternative to the Weibull and log-normal distributions. The basic characteristics of this distribution and methods of parameter estimation are described. A separate chapter is devoted to goodness-of-fit tests. The second part of the thesis deals with censored samples. Illustrative examples are given for the exponential distribution. The case of type I left censoring, which has not previously been published, is then studied. For this special case, simulations are carried out with a detailed description of properties and behavior. The EM algorithm is also derived for this distribution, and its efficiency is compared with the maximum likelihood method. The theory developed is applied to the analysis of environmental data.
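To make the combination of type I left censoring and EM concrete, here is a minimal sketch for the plain exponential distribution with rate λ, not the generalized exponential derivation from the thesis. Values below a known limit d are assumed to be recorded only as "below d". The E-step replaces each censored value by E[X | X < d] = 1/λ - d/(e^{λd} - 1), and the M-step is the complete-data estimate n / Σx. All function and parameter names are hypothetical.

```python
import math
import random


def simulate_censored(n, rate, limit, seed=0):
    """Simulate exponential data and left-censor values below `limit`."""
    rng = random.Random(seed)
    sample = [rng.expovariate(rate) for _ in range(n)]
    observed = [x for x in sample if x >= limit]
    return observed, n - len(observed)


def em_left_censored_exponential(observed, n_censored, limit,
                                 rate=1.0, tol=1e-10, max_iter=1000):
    """EM estimate of the exponential rate under type I left censoring."""
    n = len(observed) + n_censored
    total_observed = sum(observed)
    for _ in range(max_iter):
        # E-step: expected value of a censored observation, E[X | X < limit].
        cond_mean = 1.0 / rate - limit / math.expm1(rate * limit)
        # M-step: complete-data MLE with censored values imputed by their
        # conditional expectation.
        new_rate = n / (total_observed + n_censored * cond_mean)
        if abs(new_rate - rate) < tol:
            return new_rate
        rate = new_rate
    return rate
```

A call such as `em_left_censored_exponential(*simulate_censored(5000, 0.5, 1.0), limit=1.0)` should return a rate close to the simulated one. Because the exponential likelihood is simple, the same estimate could be obtained by maximizing the censored likelihood directly, which is the kind of comparison the abstract mentions.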
|
139 |
[pt] COMPARAÇÃO DE MÉTODOS DE MICRO-DADOS E DE TRIÂNGULO RUN-OFF PARA PREVISÃO DA QUANTIDADE IBNR / [en] COMPARISON OF METHODS OF MICRO-DATA AND RUN-OFF TRIANGLE FOR PREDICTION AMOUNT OF IBNR
19 May 2014
The IBNR reserve is of paramount importance for insurers. It has mostly been calculated with deterministic methods, traditionally applied to claims information grouped in a particular format called the run-off triangle. This approach was widely used for decades because of its simplicity and the limited computational processing capacity then available. Today, with the great advance in that capacity, there is no longer any need to forgo relevant information that may be lost when the data are grouped. The literature points out many deficiencies of the traditional methods, and some articles identify the use of detailed information as the way to overcome them. Another constant aim of the methodologies proposed for calculating IBNR is to obtain good measures of the precision of their estimates; here, detailed data are expected to yield fairer precision measures, since more data are available. Inspired by published proposals for modeling such ungrouped data, this dissertation proposes a new model and evaluates its predictive ability and the knowledge it provides about the process of claim occurrence and reporting, compared with what can be obtained from the traditional methods applied to claim count data to obtain the number of IBNR claims and its distribution.
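The abstract does not say which traditional deterministic method serves as the benchmark. As an illustration of how a run-off triangle is used, the sketch below implements the chain-ladder method, a common choice, on a small hypothetical triangle of cumulative claim counts. The function name and the numbers are assumptions made for this example.

```python
def chain_ladder(triangle):
    """Chain-ladder projection on a cumulative run-off triangle.

    triangle[i][j] is the cumulative claim count for origin period i at
    development period j; later origin periods have shorter rows.
    Returns the development factors and the IBNR estimate per origin period.
    """
    n_dev = len(triangle[0])
    factors = []
    for j in range(n_dev - 1):
        # Use only origin periods observed at both development j and j + 1.
        rows = [row for row in triangle if len(row) > j + 1]
        factors.append(sum(r[j + 1] for r in rows) / sum(r[j] for r in rows))
    ibnr = []
    for row in triangle:
        ultimate = row[-1]
        # Roll the latest observed value forward to the final development.
        for j in range(len(row) - 1, n_dev - 1):
            ultimate *= factors[j]
        ibnr.append(ultimate - row[-1])
    return factors, ibnr


# Hypothetical cumulative counts for three origin periods.
example_triangle = [
    [100, 150, 165],
    [110, 160],
    [120],
]
```

Calling `chain_ladder(example_triangle)` gives one development factor per column transition and one IBNR figure per origin period. The grouping into a triangle is what discards the claim-level detail that the micro-data approach of the dissertation aims to exploit.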
|
140 |
Distribution-based Approach to Take Advantage of Automatic Passenger Counter Data in Estimating Period Route-level Transit Passenger Origin-Destination Flows: Methodology Development, Numerical Analyses and Empirical Investigations
Ji, Yuxiong 21 March 2011
No description available.
|